Archon::Utilities::Regex Class Reference

Regular expression.

#include <archon/util/regex.H>

Public Member Functions

 Regex (ustring s, Logger *l=0)
 Construct a regular expression from a string representation.
 Regex (ustring s, const Environment &e, Logger *l=0)
 Construct a regular expression from a string representation.
 Regex (string s, Logger *l=0)
 Construct a regular expression from a UTF8 encoded string representation.
 Regex (string s, const Environment &e, Logger *l=0)
 Construct a regular expression from a UTF8 encoded string representation.
string print () const
 Return a UTF8 encoded string representation of this regular expression.

Static Public Member Functions

static Regex altern (Regex r1, Regex r2)
 Match either 'r1' or 'r2'.
static Regex juxta (Regex r1, Regex r2)
 Match the juxtaposition of 'r1' and 'r2'.
static Regex repeat (Regex r, int min, int max)
 Match 'n' repetitions of 'r' where 0 <= 'min' <= 'n' <= 'max'.
static Regex repeat (Regex r, int n, bool orMore)
 If 'orMore' is false, match exactly 'n' repetitions of 'r' where 0 <= 'n'.
static Regex star (Regex r)
static Regex plus (Regex r)
static Regex option (Regex r)
static Regex str (ustring s)
static Regex str (string s)
 Match the string 's', which must be UTF8 encoded.
static Regex empty ()
 Match the empty string.
static Regex bracket (const vector< pair< uchar, uchar > > &ranges, const vector< string > &namedClasses, bool invert=false)
static Regex range (uchar from, uchar to, bool invert=false)
 Match one character in the range 'from' - 'to' (both inclusive).
static Regex namedClass (string name, bool invert=false)
 Match one character from the named class.
static Regex anyChar ()
 Match one arbitrary character.
static Regex lineBegin ()
static Regex lineEnd ()
static Regex wordBegin ()
static Regex wordEnd ()

Classes

struct  Altern
struct  Class
class  Environment
struct  Exp
struct  Juxta
struct  Lexer
struct  LineBegin
 Match the beginning of a line.
struct  LineEnd
 Match the end of a line.
struct  Parser
struct  ParserContext
struct  Repeat
struct  String
struct  WordBegin
 Match the beginning of a word.
struct  WordEnd
 Match the end of a word.

Detailed Description

Regular expression.

Operator precedence (a higher number binds more tightly):

0  alternation (|)
1  juxtaposition
2  repetition (*, +, ?, {})

Todo:
Consider using a bitset<N> instead of vector<bool> for representing named character classes.

Todo:
Prevent users from using characters in the range 0xE000 - 0xF8FF or, even better, think of a way to represent the anchor edges without using symbol values.

Todo:
Exclude newline characters from ".". See http://www.unicode.org/unicode/reports/tr18/tr18-5.1.html#End%20Of%20Line

Definition at line 70 of file regex.H.


Constructor & Destructor Documentation

Archon::Utilities::Regex::Regex (ustring s, Logger *l = 0) [inline]

Construct a regular expression from a string representation.

Parameters:
l  If null is passed for the logger, no errors are accepted in the string representation; otherwise non-fatal errors are logged and only a fatal error results in an ArgumentException.

Definition at line 304 of file regex.H.

Referenced by altern(), anyChar(), bracket(), empty(), juxta(), lineBegin(), lineEnd(), option(), plus(), repeat(), star(), str(), wordBegin(), and wordEnd().

Archon::Utilities::Regex::Regex (ustring s, const Environment &e, Logger *l = 0) [inline]

Construct a regular expression from a string representation.

Accept the special syntax extension where {name} stands for a previously defined expression.

Parameters:
l See Regex(ustring, Logger *)

Definition at line 316 of file regex.H.

Archon::Utilities::Regex::Regex (string s, Logger *l = 0) [inline]

Construct a regular expression from a UTF8 encoded string representation.

Parameters:
l See Regex(ustring, Logger *)

Definition at line 327 of file regex.H.

References Archon::Utilities::Unicode::decodeUtf8().

Archon::Utilities::Regex::Regex (string s, const Environment &e, Logger *l = 0) [inline]

Construct a regular expression from a UTF8 encoded string representation.

Accept the special syntax extension where {name} stands for a previously defined expression.

Parameters:
l See Regex(ustring, Logger *)

Definition at line 339 of file regex.H.

References Archon::Utilities::Unicode::decodeUtf8().


Member Function Documentation

static Regex Archon::Utilities::Regex::namedClass (string name, bool invert = false) [inline, static]

Match one character from the named class.

Parameters:
name  See bracket().

Definition at line 265 of file regex.H.

References bracket(), and n.

Regex Archon::Utilities::Regex::repeat (Regex r, int n, bool orMore) [static]

If 'orMore' is false, match exactly 'n' repetitions of 'r' where 0 <= 'n'.

If 'orMore' is true, match at least 'n' repetitions of 'r' where 0 <= 'n'.

Definition at line 879 of file regex.C.

References exp, and Regex().


The documentation for this class was generated from the following files:
Generated on Sun Jul 30 22:57:55 2006 for Archon by  doxygen 1.4.4