reglibcpp
1.7.0
(Naïve) C++ implementation of models for regular languages
Represents formal regular expressions.
#include <expression.h>
Classes | |
| struct | literals |
| struct | parser |
| Parses regular expressions. | |
Public Types | |
| enum | operation { empty, symbol, kleene, concatenation, alternation } |
| The different purposes an RE may fulfill. | |
| typedef std::shared_ptr< expression const > | exptr |
| This is the type used to handle regular expressions. | |
Public Member Functions | |
| size_t | size () const |
| Reports the size of this RE's tree representation. | |
| operation | getOperation () const |
| Reports this RE's function. | |
| operator nfa const & () const | |
| Returns an NFA accepting the language that this RE describes. | |
| bool | operator== (nfa const &other) const |
| Checks whether this RE describes the same regular language as another object. | |
| bool | operator!= (nfa const &other) const |
| Checks whether this RE describes a different regular language than another object. | |
| bool | operator== (expression const &r) const |
| bool | operator!= (expression const &r) const |
| char32_t | extractSymbol () const |
| Reports this symbol expression's UTF-32-encoded symbol. | |
| std::string | extractUtf8Symbol () const |
| Reports this symbol expression's UTF-8-encoded symbol. | |
| std::u32string | to_u32string () const |
| Describes this RE in UTF-32-encoded human-readable form. | |
| std::string | to_string () const |
| Describes this RE in UTF-8-encoded human-readable form. | |
| std::vector< exptr >::const_iterator | begin () const |
| Returns an iterator pointing to this RE's first subexpression. | |
| std::vector< exptr >::const_iterator | end () const |
| Returns an iterator pointing behind this RE's last subexpression. | |
Static Public Member Functions | |
| static void | reset () |
| Resets the symbols used for RE operators to their defaults. | |
| static exptr const & | spawnEmptySet () |
| Gives an RE representing the empty set ∅. | |
| static exptr const & | spawnEmptyString () |
| Gives an RE representing the empty string ε. | |
| static exptr const & | spawnSymbol (char32_t symbol) |
| Gives an RE representing the given UTF-32-encoded symbol. | |
| static exptr const & | spawnSymbol (std::string const &utf8Symbol) |
| Same as above for a UTF-8-encoded symbol. | |
| static exptr | spawnKleene (exptr const &b, bool optimized=true, bool aggressive=false) |
| Gives an RE representing the Kleene closure of a given RE. | |
| static exptr | spawnConcatenation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false) |
| Gives an RE representing the concatenation of two given REs. | |
| static exptr | spawnAlternation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false) |
| Gives an RE representing the alternation of two given REs. | |
| static exptr | spawnFromString (std::u32string const &re, literals lits, bool optimized=false, bool aggressive=false) |
| static exptr | spawnFromString (std::string const &utf8Re, literals lits, bool optimized=false, bool aggressive=false) |
| static exptr | spawnFromString (std::u32string const &re, bool optimized=false, bool aggressive=false) |
| Gives an RE encoded in a given string. | |
| static exptr | spawnFromString (std::string const &utf8Re, bool optimized=false, bool aggressive=false) |
| Same as above for a UTF-8-encoded string. | |
Static Public Attributes | |
| static char32_t | L = U'(' |
| The symbol used to represent the Left parenthesis in a regular expression. | |
| static char32_t | R = U')' |
| The symbol used to represent the Right parenthesis in a regular expression. | |
| static char32_t | K = U'*' |
| The symbol used to represent the Kleene star in a regular expression. | |
| static char32_t | A = U'+' |
| The symbol used to represent the Alternation in a regular expression. | |
| static char32_t | E = U'ε' |
| The symbol used to represent the Empty string in a regular expression. | |
| static char32_t | N = U'∅' |
| The symbol used to represent the Null/empty set in a regular expression. | |
Represents formal regular expressions.
One should never need to handle such an object directly, much less copy or move it; the copy and move constructors are therefore deleted.
To work with regular expressions, one should use expression::exptr, which aliases a std::shared_ptr to an actual (const) object and can be copied and moved freely. To access member functions, one can dereference an exptr temporarily or, better yet, use the arrow operator ->.
Definition at line 28 of file expression.h.
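The handle-only design described above can be illustrated without the library. The following is a minimal sketch, assuming a hypothetical class Node in place of reg::expression; Node, make and payload are illustrative names and do not come from reglibcpp.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for reg::expression; not the library's class.
class Node {
 public:
  using ptr = std::shared_ptr<Node const>;  // plays the role of exptr

  // Objects are only ever created behind a shared_ptr.
  static ptr make(int payload) { return ptr(new Node(payload)); }

  Node(Node const&) = delete;  // the object itself cannot be copied
  Node(Node&&) = delete;       // nor moved

  int payload() const { return payload_; }

 private:
  explicit Node(int payload) : payload_(payload) {}
  int payload_;
};

static_assert(!std::is_copy_constructible<Node>::value, "Node is not copyable");
static_assert(!std::is_move_constructible<Node>::value, "Node is not movable");

inline void handleExample() {
  Node::ptr a = Node::make(42);
  Node::ptr b = a;                // copies the handle, not the Node
  assert(a == b);                 // both handles refer to the same object
  assert(b->payload() == 42);     // arrow operator, the preferred access
  assert((*a).payload() == 42);   // temporary dereference also works
}
```

Because the pointee is const, every handle sees the same immutable object, which is what makes sharing safe.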
| typedef std::shared_ptr<expression const> reg::expression::exptr |
This is the type used to handle regular expressions.
Every method works on shared_ptrs to the actual regular expressions, to help with basic comparisons and to save memory.
For example, every symbol's (and the empty string's and the empty set's) regular expression is only instantiated once and then pointed to by as many exptrs as one likes.
Definition at line 40 of file expression.h.
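The "instantiated once, pointed to many times" idea can be sketched with a small cache. This is an illustration only: Sym, SymPtr and internSymbol are hypothetical names, and the use of a std::map is an assumption, not a statement about how reglibcpp stores its symbols.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical miniature of a symbol expression.
struct Sym {
  char32_t value;
};
using SymPtr = std::shared_ptr<Sym const>;

// Returns the one shared object for a symbol, creating it on first request.
inline SymPtr const& internSymbol(char32_t c) {
  static std::map<char32_t, SymPtr> cache;
  auto it = cache.find(c);
  if (it == cache.end()) {
    it = cache.emplace(c, SymPtr(new Sym{c})).first;
  }
  return it->second;  // references into a std::map survive later insertions
}

inline void internExample() {
  SymPtr a = internSymbol(U'a');
  SymPtr again = internSymbol(U'a');
  SymPtr b = internSymbol(U'b');
  assert(a == again);  // same object, so comparing pointers is enough
  assert(a != b);      // different symbols get different objects
}
```

With such a scheme, comparing two symbol expressions reduces to comparing two pointers, and memory use does not grow with the number of handles.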
| enum reg::expression::operation | strong |
The different purposes an RE may fulfill.
Definition at line 84 of file expression.h.
| vector< expression::exptr >::const_iterator reg::expression::begin | ( | ) | const |
Returns an iterator pointing to this RE's first subexpression.
Definition at line 315 of file expression.cpp.
| vector< expression::exptr >::const_iterator reg::expression::end | ( | ) | const |
Returns an iterator pointing behind this RE's last subexpression.
Definition at line 320 of file expression.cpp.
| char32_t reg::expression::extractSymbol | ( | ) | const |
Reports this symbol expression's UTF-32-encoded symbol.
Returns: the char32_t encoded within this symbol expression, U'\0' for an empty string
Exceptions:
| std::logic_error | if this expression's purpose is not that of a symbol |
Definition at line 249 of file expression.cpp.
| string reg::expression::extractUtf8Symbol | ( | ) | const |
Reports this symbol expression's UTF-8-encoded symbol.
Returns: the std::string encoded within this symbol expression, "" for an empty string
Exceptions:
| std::logic_error | if this expression's purpose is not that of a symbol |
Definition at line 266 of file expression.cpp.
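The relationship between the UTF-32 and UTF-8 views of a symbol can be made concrete with the standard UTF-8 encoding rules. The following hand-rolled encoder is a sketch under assumptions: toUtf8 is a hypothetical helper, it performs no validation of surrogates or out-of-range values, and it is not claimed to be what extractUtf8Symbol does internally. It does follow the convention stated above that the empty string maps to "".

```cpp
#include <cassert>
#include <string>

// Encodes one code point as UTF-8 (illustration only, no validation).
inline std::string toUtf8(char32_t c) {
  std::string out;
  if (c == U'\0') return out;  // convention from the text: empty string -> ""
  if (c < 0x80) {              // 1 byte: 0xxxxxxx
    out += static_cast<char>(c);
  } else if (c < 0x800) {      // 2 bytes: 110xxxxx 10xxxxxx
    out += static_cast<char>(0xC0 | (c >> 6));
    out += static_cast<char>(0x80 | (c & 0x3F));
  } else if (c < 0x10000) {    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xE0 | (c >> 12));
    out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (c & 0x3F));
  } else {                     // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out += static_cast<char>(0xF0 | (c >> 18));
    out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (c & 0x3F));
  }
  return out;
}

inline void utf8Example() {
  assert(toUtf8(U'a') == "a");
  assert(toUtf8(0x03B5) == "\xCE\xB5");      // ε, the default empty-string symbol
  assert(toUtf8(0x2205) == "\xE2\x88\x85");  // ∅, the default empty-set symbol
}
```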
| expression::operation reg::expression::getOperation | ( | ) | const |
Reports this RE's function.
Note that the empty string's function is technically that of a symbol.
Definition at line 205 of file expression.cpp.
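The note that the empty string counts as a symbol, together with the std::logic_error contract of extractSymbol, can be modelled in a few lines. This is a toy model: the enum repeats the documented enumerators, while ToyExpr and isEmptyString are hypothetical and say nothing about the library's internals.

```cpp
#include <cassert>
#include <stdexcept>

// Same enumerators as the documented reg::expression::operation.
enum class operation { empty, symbol, kleene, concatenation, alternation };

// Hypothetical miniature expression.
struct ToyExpr {
  operation op;
  char32_t sym;  // meaningful only when op == operation::symbol

  // Mirrors the documented contract: throws unless this is a symbol.
  char32_t extractSymbol() const {
    if (op != operation::symbol) {
      throw std::logic_error("not a symbol expression");
    }
    return sym;  // U'\0' stands for the empty string
  }
};

// The empty string is a symbol expression whose symbol is U'\0'.
inline bool isEmptyString(ToyExpr const& e) {
  return e.op == operation::symbol && e.sym == U'\0';
}

inline void operationExample() {
  ToyExpr eps{operation::symbol, U'\0'};
  assert(isEmptyString(eps));
  assert(eps.extractSymbol() == U'\0');
}
```

Callers that cannot be sure of an expression's purpose would check getOperation() first or be prepared to catch the exception.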
| reg::expression::operator nfa const & | ( | ) | const |
Returns an NFA accepting the language that this RE describes.
| bool reg::expression::operator!= | ( | nfa const & | other | ) | const |
Checks whether this RE describes a different regular language than another object.
Returns: false if this RE's language is exactly the same as the other object's, true otherwise
Definition at line 230 of file expression.cpp.
| bool reg::expression::operator!= | ( | expression const & | other | ) | const |
Definition at line 240 of file expression.cpp.
| bool reg::expression::operator== | ( | nfa const & | other | ) | const |
Checks whether this RE describes the same regular language as another object.
Returns: true if this RE's language is exactly the same as the other object's, false otherwise
Definition at line 222 of file expression.cpp.
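Deciding whether two descriptions denote exactly the same language is usually done on automata. The following is a minimal sketch of one textbook method, a breadth-first search over pairs of states of two complete DFAs that share an alphabet. Dfa and sameLanguage are hypothetical names, and nothing here claims to be how reglibcpp implements operator==.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Complete DFA over the alphabet {0, ..., k-1}; state 0 is the start state.
struct Dfa {
  std::vector<std::vector<std::size_t>> next;  // next[state][symbol]
  std::vector<bool> accepting;                 // accepting[state]
};

// Two DFAs accept the same language iff no reachable pair of states
// disagrees on acceptance. Assumes both use the same alphabet size.
inline bool sameLanguage(Dfa const& a, Dfa const& b) {
  using StatePair = std::pair<std::size_t, std::size_t>;
  StatePair start(0, 0);
  std::set<StatePair> seen;
  std::queue<StatePair> todo;
  seen.insert(start);
  todo.push(start);
  while (!todo.empty()) {
    StatePair cur = todo.front();
    todo.pop();
    if (a.accepting[cur.first] != b.accepting[cur.second]) return false;
    for (std::size_t s = 0; s < a.next[cur.first].size(); ++s) {
      StatePair succ(a.next[cur.first][s], b.next[cur.second][s]);
      if (seen.insert(succ).second) todo.push(succ);  // visit each pair once
    }
  }
  return true;
}

inline void equivalenceExample() {
  // Both accept words of even length over a one-letter alphabet.
  Dfa two;
  two.next = {{1}, {0}};
  two.accepting = {true, false};
  Dfa four;
  four.next = {{1}, {2}, {3}, {0}};
  four.accepting = {true, false, true, false};
  assert(sameLanguage(two, four));
}
```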
| bool reg::expression::operator== | ( | expression const & | other | ) | const |
Definition at line 235 of file expression.cpp.
| static void reg::expression::reset | ( | ) |
Resets the symbols used for RE operators to their defaults.
Definition at line 36 of file expression.cpp.
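Since the operator symbols are plain non-const static members, they can be reassigned and later restored. The sketch below models that with a hypothetical struct Syntax carrying the same six names and default values as the documented attributes; it is not the library's code.

```cpp
#include <cassert>

// Hypothetical stand-in for the static attributes of reg::expression.
struct Syntax {
  static char32_t L, R, K, A, E, N;

  // Restores every operator symbol to its documented default.
  static void reset() {
    L = U'(';
    R = U')';
    K = U'*';
    A = U'+';
    E = U'\u03B5';  // ε
    N = U'\u2205';  // ∅
  }
};

char32_t Syntax::L = U'(';
char32_t Syntax::R = U')';
char32_t Syntax::K = U'*';
char32_t Syntax::A = U'+';
char32_t Syntax::E = U'\u03B5';
char32_t Syntax::N = U'\u2205';

inline void resetExample() {
  Syntax::A = U'|';            // switch to a different alternation symbol
  assert(Syntax::A == U'|');
  Syntax::reset();             // back to the defaults
  assert(Syntax::A == U'+');
}
```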
| size_t reg::expression::size | ( | ) | const |
Reports the size of this RE's tree representation.
In this context, an RE's size will be defined recursively as follows:
∅.size() = 1
ε.size() = 1
<symbol>.size() = 1
(l+r).size() = 1 + l.size() + r.size()
(lr).size() = 1 + l.size() + r.size()
(b*).size() = 1 + b.size()
Definition at line 192 of file expression.cpp.
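The recursive size definition amounts to counting the nodes of the expression tree. A minimal sketch, assuming a hypothetical Tree type whose leaves stand for ∅, ε and symbols; leaf, node and treeSize are illustrative helpers, not library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical tree node; leaves (∅, ε, symbols) have no children.
struct Tree {
  std::vector<std::shared_ptr<Tree const>> children;
};
using TreePtr = std::shared_ptr<Tree const>;

inline TreePtr leaf() { return std::make_shared<Tree>(); }

inline TreePtr node(std::vector<TreePtr> kids) {
  auto t = std::make_shared<Tree>();
  t->children = std::move(kids);
  return t;
}

// 1 for the node itself plus the sizes of all of its subexpressions.
inline std::size_t treeSize(TreePtr const& t) {
  std::size_t total = 1;
  for (TreePtr const& child : t->children) total += treeSize(child);
  return total;
}

inline void sizeExample() {
  TreePtr alt = node({leaf(), leaf()});  // shape of (a+b)
  TreePtr star = node({alt});            // shape of (a+b)*
  assert(treeSize(alt) == 3);            // 1 + 1 + 1
  assert(treeSize(star) == 4);           // 1 + 3
}
```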
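| static expression::exptr reg::expression::spawnAlternation | ( | exptr const & | l, | exptr const & | r, | bool | optimized = true, | bool | aggressive = false | ) |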
Gives an RE representing the alternation of two given REs.
More formally, the RE's language will be L(l+r) = L(l) ∪ L(r).
| l | exptr to one of the REs |
| r | exptr to the other RE |
| optimized | whether simplifications on the syntax level should be applied |
| aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE representing the alternation of l and r
Definition at line 123 of file expression.cpp.
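| static expression::exptr reg::expression::spawnConcatenation | ( | exptr const & | l, | exptr const & | r, | bool | optimized = true, | bool | aggressive = false | ) |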
Gives an RE representing the concatenation of two given REs.
More formally, the RE's language will be L(lr) = L(l) • L(r).
| l | exptr to the first RE |
| r | exptr to the second RE |
| optimized | whether simplifications on the syntax level should be applied |
| aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE representing the concatenation of l and r
Definition at line 88 of file expression.cpp.
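The language equations L(l+r) = L(l) ∪ L(r) and L(lr) = L(l) • L(r) can be tried out on finite languages represented as sets of strings. This is a sketch of the set operations only; Language, alternate and concatenate are hypothetical names and work on word sets, not on REs.

```cpp
#include <cassert>
#include <set>
#include <string>

using Language = std::set<std::string>;  // finite languages only

// L(l+r) = L(l) ∪ L(r)
inline Language alternate(Language const& l, Language const& r) {
  Language out = l;
  out.insert(r.begin(), r.end());
  return out;
}

// L(lr) = { uv | u in L(l), v in L(r) }
inline Language concatenate(Language const& l, Language const& r) {
  Language out;
  for (auto const& u : l) {
    for (auto const& v : r) out.insert(u + v);
  }
  return out;
}

inline void languageExample() {
  Language l{"a", "ab"};
  Language r{"b", ""};  // "" plays the role of ε
  assert((alternate(l, r) == Language{"", "a", "ab", "b"}));
  assert((concatenate(l, r) == Language{"a", "ab", "abb"}));
}
```

Note that concatenating with the empty language yields the empty language, whereas concatenating with {ε} leaves a language unchanged.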
| static expression::exptr const & reg::expression::spawnEmptySet | ( | ) |
Gives an RE representing the empty set ∅.
More formally, the RE's language will be {}.
Returns: exptr to the RE representing the empty set ∅
Definition at line 50 of file expression.cpp.
| static expression::exptr const & reg::expression::spawnEmptyString | ( | ) |
Gives an RE representing the empty string ε.
More formally, the RE's language will be {ε}.
Returns: exptr to the RE representing the empty string ε
Definition at line 59 of file expression.cpp.
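| static expression::exptr reg::expression::spawnFromString | ( | std::u32string const & | re, | literals | lits, | bool | optimized = false, | bool | aggressive = false | ) |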
Definition at line 609 of file expression.cpp.
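| static expression::exptr reg::expression::spawnFromString | ( | std::string const & | utf8Re, | literals | lits, | bool | optimized = false, | bool | aggressive = false | ) |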
Definition at line 645 of file expression.cpp.
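| static expression::exptr reg::expression::spawnFromString | ( | std::u32string const & | re, | bool | optimized = false, | bool | aggressive = false | ) |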
Gives an RE encoded in a given string.
| re | the RE in text form |
| optimized | whether simplifications on the syntax level should be applied |
| aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE represented by the given string
Exceptions:
| std::invalid_argument | if the re string is malformed |
Definition at line 657 of file expression.cpp.
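What "malformed" can mean for an RE string is easiest to see with a small recursive-descent recognizer. The grammar below is an assumption chosen for illustration, using the default operator symbols ( ) * +; ToyParser and wellFormed are hypothetical, and the library's own parser may accept or reject different inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Assumed toy grammar (not the library's parser):
//   alt  := cat ('+' cat)*
//   cat  := star star*
//   star := atom '*'*
//   atom := '(' alt ')' | any character other than ( ) * +
class ToyParser {
 public:
  explicit ToyParser(std::u32string text) : text_(std::move(text)) {}

  // Throws std::invalid_argument if the text does not match the grammar.
  void check() {
    pos_ = 0;
    alt();
    if (!atEnd()) fail();  // leftover input, e.g. an unmatched ')'
  }

 private:
  bool atEnd() const { return pos_ == text_.size(); }
  char32_t peek() const { return text_[pos_]; }
  [[noreturn]] void fail() const {
    throw std::invalid_argument("malformed RE near position " +
                                std::to_string(pos_));
  }
  void alt() {
    cat();
    while (!atEnd() && peek() == U'+') { ++pos_; cat(); }
  }
  void cat() {
    star();
    while (!atEnd() && peek() != U'+' && peek() != U')') star();
  }
  void star() {
    atom();
    while (!atEnd() && peek() == U'*') ++pos_;
  }
  void atom() {
    if (atEnd()) fail();
    char32_t c = text_[pos_++];
    if (c == U'(') {
      alt();
      if (atEnd() || text_[pos_++] != U')') fail();
    } else if (c == U')' || c == U'*' || c == U'+') {
      fail();  // an operator where an operand was expected
    }
  }

  std::u32string text_;
  std::size_t pos_ = 0;
};

inline bool wellFormed(std::u32string const& re) {
  try {
    ToyParser(re).check();
    return true;
  } catch (std::invalid_argument const&) {
    return false;
  }
}

inline void parserExample() {
  assert(wellFormed(U"(a+b)*c"));
  assert(!wellFormed(U"(a+b"));  // missing closing parenthesis
}
```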
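| static expression::exptr reg::expression::spawnFromString | ( | std::string const & | utf8Re, | bool | optimized = false, | bool | aggressive = false | ) |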
Same as above for a UTF-8-encoded string.
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 667 of file expression.cpp.
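| static expression::exptr reg::expression::spawnKleene | ( | exptr const & | b, | bool | optimized = true, | bool | aggressive = false | ) |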
Gives an RE representing the Kleene closure of a given RE.
More formally, the RE's language will be L(b*) = L(b)*.
| b | exptr to the RE |
| optimized | whether simplifications on the syntax level should be applied |
| aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE representing the Kleene closure of b
Definition at line 164 of file expression.cpp.
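The closure L(b)* is infinite whenever L(b) contains a non-empty word, so it can only be enumerated up to a length bound. The following sketch does that for a finite L(b) given as a set of strings; WordSet and kleeneUpTo are hypothetical names and operate on word sets rather than on REs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>

using WordSet = std::set<std::string>;

// All words of L(b)* with length <= maxLen, for a finite L(b).
inline WordSet kleeneUpTo(WordSet const& b, std::size_t maxLen) {
  WordSet result{""};    // ε is always in the closure
  WordSet frontier{""};  // words found in the previous round
  while (!frontier.empty()) {
    WordSet next;
    for (auto const& w : frontier) {
      for (auto const& v : b) {
        std::string wv = w + v;
        // Keep only new words within the bound; this guarantees termination.
        if (wv.size() <= maxLen && result.insert(wv).second) next.insert(wv);
      }
    }
    frontier = std::move(next);
  }
  return result;
}

inline void kleeneExample() {
  assert((kleeneUpTo(WordSet{"ab"}, 4) == WordSet{"", "ab", "abab"}));
  assert((kleeneUpTo(WordSet{}, 3) == WordSet{""}));  // ∅* = {ε}
}
```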
| static expression::exptr const & reg::expression::spawnSymbol | ( | char32_t | symbol | ) |
Gives an RE representing the given UTF-32-encoded symbol.
More formally, the RE's language will be {<symbol>}.
| symbol | the symbol the RE should represent or U'\0' for the empty string ε |
Returns: exptr to the RE representing the symbol
Definition at line 69 of file expression.cpp.
| static expression::exptr const & reg::expression::spawnSymbol | ( | std::string const & | utf8Symbol | ) |
Same as above for a UTF-8-encoded symbol.
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 74 of file expression.cpp.
| string reg::expression::to_string | ( | ) | const |
Describes this RE in UTF-8-encoded human-readable form.
Definition at line 310 of file expression.cpp.
| u32string reg::expression::to_u32string | ( | ) | const |
Describes this RE in UTF-32-encoded human-readable form.
Definition at line 276 of file expression.cpp.
| static char32_t reg::expression::A = U'+' |
The symbol used to represent the Alternation in a regular expression.
Definition at line 41 of file expression.h.
| static char32_t reg::expression::E = U'ε' |
The symbol used to represent the Empty string in a regular expression.
Definition at line 41 of file expression.h.
| static char32_t reg::expression::K = U'*' |
The symbol used to represent the Kleene star in a regular expression.
Definition at line 41 of file expression.h.
| static char32_t reg::expression::L = U'(' |
The symbol used to represent the Left parenthesis in a regular expression.
Definition at line 41 of file expression.h.
| static char32_t reg::expression::N = U'∅' |
The symbol used to represent the Null/empty set in a regular expression.
Definition at line 41 of file expression.h.
| static char32_t reg::expression::R = U')' |
The symbol used to represent the Right parenthesis in a regular expression.
Definition at line 41 of file expression.h.