Archon::Utilities::Lexer Class Reference

A lexer controlled by regular expressions.

#include <archon/util/lexer.H>


Public Member Functions

 Lexer (const Engine &e, Ref< Stream::UReader > &r, Context *c=0)
 Configure a lexer engine with an input character stream and an optional context.
void getNext (Lexeme &)
 Extract the next lexeme from the input.
ustring getText () const
int getType () const
int getLineNumber () const

Classes

class  Actor
 An object in which context methods may be registered.
class  ActorBase
 A common base for the Actor class template.
struct  Context
 An abstract base class for the context of the lexer.
class  Engine
 The engine that knows how to match prefixes of the input character stream against the regular expressions of the lexer rules.
class  RuleSet
 A set of lexer rules.

Detailed Description

A lexer controlled by regular expressions.

Lexemes are defined by rules. Each rule has at least a regular expression that defines which strings correspond to that rule. A rule may also specify a terminal type, in which case the rule yields lexeme objects of that type. By default, the value associated with each lexeme object is the string that was matched by the corresponding regular expression. Alternatively, an action may be associated with a rule. In that case the action chooses the value of the generated lexeme objects.

A rule may also have an action associated with it but still not generate lexeme objects. This may be used to strip comments, for example.

Finally a rule may have no action associated with it and generate no lexeme objects. This could be used for eating whitespace for example.
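The three kinds of rule described above can be illustrated with a small stand-alone sketch. It does not use the Archon API: ToyLexeme, ToyRule, toyLex and toyDemo are hypothetical names, matching is done with std::regex rather than the DFA used by Lexer::Engine, and unmatched characters are silently skipped instead of being reported. A type of -1 is used here to mean "generate no lexeme", which is an assumption of the sketch only.

```cpp
#include <functional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not part of Archon.
struct ToyLexeme { int type; std::string value; };

struct ToyRule {
    std::regex pattern;   // strings that correspond to this rule
    int type;             // terminal type, or -1 to generate no lexeme
    std::function<std::string(const std::string&)> action;  // empty: no action
};

enum { NUMBER = 0, IDENT = 1 };

// At each position, pick the rule with the longest match (earlier rule wins ties).
std::vector<ToyLexeme> toyLex(const std::string& input,
                              const std::vector<ToyRule>& rules) {
    std::vector<ToyLexeme> out;
    auto pos = input.cbegin();
    while (pos != input.cend()) {
        const ToyRule* best = nullptr;
        std::string bestText;
        for (const ToyRule& r : rules) {
            std::smatch m;
            // match_continuous forces the match to start exactly at pos.
            if (std::regex_search(pos, input.cend(), m, r.pattern,
                                  std::regex_constants::match_continuous) &&
                m.str(0).size() > bestText.size()) {
                best = &r;
                bestText = m.str(0);
            }
        }
        if (!best) { ++pos; continue; }  // toy error handling: skip one character
        // With an action, the action chooses the value; otherwise the matched text is used.
        std::string value = best->action ? best->action(bestText) : bestText;
        if (best->type >= 0) out.push_back({best->type, value});
        pos += bestText.size();
    }
    return out;
}

// One rule of each kind; `comments` counts how often the comment action ran.
std::vector<ToyLexeme> toyDemo(int& comments) {
    std::vector<ToyRule> rules = {
        // Type, no action: the value is the matched string.
        {std::regex("[0-9]+"), NUMBER, nullptr},
        // Type and action: the action chooses the value.
        {std::regex("[a-z]+"), IDENT,
         [](const std::string& s) { return "<" + s + ">"; }},
        // Action but no lexeme: strips comments.
        {std::regex("#[^\n]*"), -1,
         [&comments](const std::string&) { ++comments; return std::string(); }},
        // Neither action nor lexeme: eats whitespace.
        {std::regex("[ \n]+"), -1, nullptr},
    };
    return toyLex("ab 12 # note\ncd", rules);
}
```

Calling toyDemo yields three lexemes (an identifier, a number, an identifier) and runs the comment action once; the whitespace and the comment produce no lexemes.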

Todo:
Verify that rules have no actions when no Actor is passed to the Engine constructor and when no Context is passed to the Lexer constructor.
Also verify Context against Actor.

Todo:
Detect word boundaries, and maybe only on demand, i.e.
when any word boundary anchor ([[:<:]] or [[:>:]]) is in use.

Definition at line 120 of file lexer.H.


Constructor & Destructor Documentation

Archon::Utilities::Lexer::Lexer (const Engine &e, Ref< Stream::UReader > &r, Context *c=0) [inline]
 

Configure a lexer engine with an input character stream and an optional context.

The context may be left out only if none of the rules have methods associated with them.

The type of the context object is verified against the actor object given at the construction of the lexer engine to ensure that they are compatible. This in general means that the template argument to the Actor template must be the type of the context object passed to this constructor or at least a derivative of it.

Definition at line 377 of file lexer.H.


Member Function Documentation

int Archon::Utilities::Lexer::getLineNumber () const [inline]
 

Returns:
The line number where the last lexeme extracted by getNext() ended.

Definition at line 412 of file lexer.H.

Referenced by Archon::X3D::VRML::Parser::Context::warning().

void Archon::Utilities::Lexer::getNext (Lexeme &l)
 

Extract the next lexeme from the input.

Parameters:
l The extracted lexeme is stored herein. If l.type is -1 upon return, this indicates EOI (end of input).
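The EOI convention suggests a simple extraction loop: call getNext() repeatedly and stop when the type is -1. The sketch below shows that loop against a stand-in class, not the real Lexer. SketchLexeme, SketchLexer and drain are hypothetical names; the stand-in simply splits on whitespace, gives every word type 0, and counts lines from 1, all of which are assumptions made for the illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins that mimic only the documented getNext()/getLineNumber() contract.
struct SketchLexeme { int type = -1; std::string text; };

class SketchLexer {
public:
    explicit SketchLexer(std::string input) : input_(std::move(input)) {}

    // Store the next whitespace-separated word in l; l.type == -1 signals EOI.
    void getNext(SketchLexeme& l) {
        while (pos_ < input_.size() &&
               std::isspace(static_cast<unsigned char>(input_[pos_]))) {
            if (input_[pos_] == '\n') ++line_;
            ++pos_;
        }
        if (pos_ == input_.size()) { l.type = -1; l.text.clear(); return; }
        std::size_t start = pos_;
        while (pos_ < input_.size() &&
               !std::isspace(static_cast<unsigned char>(input_[pos_]))) {
            ++pos_;
        }
        l.type = 0;
        l.text = input_.substr(start, pos_ - start);
    }

    // Line on which the last extracted lexeme ended (1-based in this sketch).
    int getLineNumber() const { return line_; }

private:
    std::string input_;
    std::size_t pos_ = 0;
    int line_ = 1;
};

// Drain the lexer until EOI, recording "line:text" for each lexeme.
std::vector<std::string> drain(SketchLexer& lexer) {
    std::vector<std::string> seen;
    SketchLexeme l;
    for (lexer.getNext(l); l.type != -1; lexer.getNext(l)) {
        seen.push_back(std::to_string(lexer.getLineNumber()) + ":" + l.text);
    }
    return seen;
}
```

In this sketch, calling getNext() again after EOI keeps reporting -1; whether the real Lexer behaves the same way is not stated in this reference.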

Definition at line 77 of file lexer.C.

References Archon::Utilities::Lexer::Engine::actor, Archon::Utilities::Lexer::Engine::anchorMasks, Archon::Utilities::Lexer::ActorBase::call(), Archon::Utilities::Lexer::Engine::dfa, Archon::Utilities::DFA::State::getFinalValue(), getText(), Archon::Utilities::Lexer::Context::lexerError(), Archon::Utilities::Lexer::Engine::rules, and Archon::Utilities::DFA::State::step().

ustring Archon::Utilities::Lexer::getText () const [inline, virtual]
 

Returns:
The text corresponding to the last lexeme extracted by getNext(). This method may also be used within Context::lexerError to fetch the faulty character.

Implements Archon::Utilities::LexerBase.

Definition at line 394 of file lexer.H.

Referenced by getNext(), Archon::X3D::VRML::Parser::Context::lexer_decInt(), Archon::X3D::VRML::Parser::Context::lexer_float(), Archon::X3D::VRML::Parser::Context::lexer_hexInt(), Archon::X3D::VRML::Parser::Context::lexer_id(), Archon::X3D::VRML::Parser::Context::lexer_string(), Archon::X3D::VRML::Parser::Context::lexerError(), and Archon::X3D::VRML::Parser::Context::parserError().

int Archon::Utilities::Lexer::getType () const [inline, virtual]
 

Returns:
The type of the last lexeme extracted by getNext(). -1 indicates EOI.

Implements Archon::Utilities::LexerBase.

Definition at line 403 of file lexer.H.

Referenced by Archon::X3D::VRML::Parser::Context::parserError().


The documentation for this class was generated from the following files:
Generated on Sun Jul 30 22:57:31 2006 for Archon by  doxygen 1.4.4