reglibcpp  1.3.0
(Naïve) C++ implementation of models for regular languages
Classes | Public Member Functions | Static Public Member Functions | List of all members
reg::dfa Class Reference

Represents deterministic finite automata. More...

#include <dfa.h>

Classes

class  builder
 Constructs DFAs step by step. More...
 
struct  pImpl
 Private implementation details of DFAs. More...
 

Public Member Functions

 dfa ()
 Constructs a DFA accepting the empty language ∅. More...
 
 dfa (dfa const &d)
 Copy-constructs a DFA by copying another one's private implementation object. More...
 
 dfa (dfa &&d)
 Move-constructs a DFA by stealing another one's private implementation object. More...
 
 dfa (nfa const &n)
 Constructs a DFA from an NFA via powerset construction. More...
 
 dfa (builder &b)
 Constructs a DFA from a builder. More...
 
dfaoperator= (const dfa &d)
 Copy-assigns this DFA by copying another one's private implementation object. More...
 
dfaoperator= (dfa &&d)
 Move-assigns this DFA by stealing another one's private implementation object. More...
 
bool operator== (const dfa &d) const
 Tests whether this DFA accepts exactly the same language as another one. More...
 
bool operator!= (const dfa &d) const
 Tests whether this DFA doesn't accept the same language as another one. More...
 
size_t delta (size_t q, char32_t symbol) const
 Computes this DFA's transition function for a state and a symbol. More...
 
size_t delta (size_t q, std::string const &utf8Symbol) const
 Same as above for a UTF-8-encoded symbol. More...
 
size_t deltaHat (size_t q, std::u32string const &word) const
 Computes this DFA's transition function recursively for a string of symbols, starting in a specified state. More...
 
size_t deltaHat (size_t q, std::string const &utf8Word) const
 Same as above for a string of UTF-8-encoded symbols. More...
 
std::u32string getShortestWord () const
 Searches the shortest UTF-32-encoded word accepted by this DFA. More...
 
std::string getShortestUtf8Word () const
 Same as above for a UTF-8-encoded word, INCLUDING POTENTIAL INFINITE LOOP. More...
 
std::string const & getLabelOf (size_t q) const
 Puts a name to an index. More...
 
std::vector< char32_t > const & getAlphabet () const
 Fetches this DFA's set of processable symbols. More...
 
std::vector< std::string > const & getUtf8Alphabet () const
 Fetches this DFA's set of processable symbols as UTF-8-encoded strings. More...
 
size_t getNumberOfStates () const
 Counts this DFA's states. More...
 
bool isAccepting (size_t q) const
 Tests whether a state is an accept state within this DFA. More...
 

Static Public Member Functions

static dfa::builder unite (dfa const &d1, dfa const &d2)
 Creates a builder for a DFA accepting the union of the languages of two DFAs. More...
 
static dfa::builder intersect (dfa const &d1, dfa const &d2)
 Creates a builder for a DFA accepting the intersection of the languages of two DFAs. More...
 
static dfa::builder subtract (dfa const &d1, dfa const &d2)
 Creates a builder for a DFA accepting the set difference of the languages of two DFAs. More...
 
static dfa::builder complement (dfa const &d)
 Creates a builder for a DFA accepting the complement of the language of a DFA. More...
 

Detailed Description

Represents deterministic finite automata.

Instances of this class are completely immutable. The builder class is the preferred way of constructing DFAs.

By convention, the state with index 0 is the initial state.

Definition at line 26 of file dfa.h.

Constructor & Destructor Documentation

◆ dfa() [1/5]

reg::dfa::dfa ( )

Constructs a DFA accepting the empty language ∅.

Definition at line 104 of file dfa.cpp.

104 : p(new pImpl) {}

◆ dfa() [2/5]

reg::dfa::dfa ( dfa const &  d)

Copy-constructs a DFA by copying another one's private implementation object.

Definition at line 114 of file dfa.cpp.

114 : p(new pImpl(*(d.p))) {}

◆ dfa() [3/5]

reg::dfa::dfa ( dfa &&  d)

Move-constructs a DFA by stealing another one's private implementation object.

The other DFA will be accepting the empty language ∅ afterwards.

Definition at line 118 of file dfa.cpp.

118 : p(new pImpl) {p.swap(d.p);}

◆ dfa() [4/5]

reg::dfa::dfa ( nfa const &  n)

Constructs a DFA from an NFA via powerset construction.

Definition at line 121 of file dfa.cpp.

121 : dfa(builder(n).build()) {}
dfa()
Constructs a DFA accepting the empty language ∅.
Definition: dfa.cpp:104

◆ dfa() [5/5]

reg::dfa::dfa ( dfa::builder b)

Constructs a DFA from a builder.

Definition at line 124 of file dfa.cpp.

124 : dfa(b.build()) {}
dfa()
Constructs a DFA accepting the empty language ∅.
Definition: dfa.cpp:104

Member Function Documentation

◆ complement()

dfa::builder reg::dfa::complement ( dfa const &  d)
static

Creates a builder for a DFA accepting the complement of the language of a DFA.

The input DFAs' state names will be retained in the created DFA.

Parameters
nthe DFA
Returns
builder fo a DFA accepting all words not accepted by the input DFA (provided they can be built from symbols of that DFA's alphabet)

Definition at line 376 of file dfa.cpp.

376  {
377  return dfa::builder(d).complement();
378 }

◆ delta() [1/2]

size_t reg::dfa::delta ( size_t  q,
char32_t  symbol 
) const

Computes this DFA's transition function for a state and a symbol.

Parameters
qthe state's index (< getNumberOfStates)
symbolthe symbol to process (∈ getAlphabet)
Returns
the reached state's index

Definition at line 203 of file dfa.cpp.

203  {
204  size_t i = index_of(p->alphabet, symbol);
205  if (i >= p->alphabet.size()) {
206  throw std::domain_error(
207  string(u8"δ is not defined for symbol ") + converter.to_bytes(symbol)
208  );
209  }
210  if (q >= p->labels.size()) {
211  throw std::out_of_range(
212  string("There is no state with index ").append(std::to_string(q))
213  );
214  }
215  return p->transitions[q][i];
216 }
size_t index_of(vector< T > const &vec, T const &element)
Basically Java&#39;s List interface&#39;s indexOf, but as a non-member function and returning the container&#39;s...
Definition: dfa.cpp:995
std::wstring_convert< std::codecvt_utf8< char32_t >, char32_t > converter
Converts between UTF-8-encoded and UTF-32-encoded strings.
Definition: dfa.cpp:1001

◆ delta() [2/2]

size_t reg::dfa::delta ( size_t  q,
std::string const &  utf8Symbol 
) const

Same as above for a UTF-8-encoded symbol.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 219 of file dfa.cpp.

219  {
220  return delta(q, converter.from_bytes(utf8Symbol)[0]);
221 }
size_t delta(size_t q, char32_t symbol) const
Computes this DFA's transition function for a state and a symbol.
Definition: dfa.cpp:203
std::wstring_convert< std::codecvt_utf8< char32_t >, char32_t > converter
Converts between UTF-8-encoded and UTF-32-encoded strings.
Definition: dfa.cpp:1001

◆ deltaHat() [1/2]

size_t reg::dfa::deltaHat ( size_t  q,
std::u32string const &  word 
) const

Computes this DFA's transition function recursively for a string of symbols, starting in a specified state.

Parameters
qthe starting state's index (< getNumberOfStates)
wordthe string of symbols to process (all of which ∈ getAlphabet)
Returns
the reached state's index

Definition at line 229 of file dfa.cpp.

229  {
230  for (char32_t symbol : word) {
231  q = delta(q, symbol);
232  }
233  return q;
234 }
size_t delta(size_t q, char32_t symbol) const
Computes this DFA's transition function for a state and a symbol.
Definition: dfa.cpp:203

◆ deltaHat() [2/2]

size_t reg::dfa::deltaHat ( size_t  q,
std::string const &  utf8Word 
) const

Same as above for a string of UTF-8-encoded symbols.

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Definition at line 237 of file dfa.cpp.

237  {
238  return deltaHat(q, converter.from_bytes(utf8Word));
239 }
size_t deltaHat(size_t q, std::u32string const &word) const
Computes this DFA's transition function recursively for a string of symbols, starting in a specified ...
Definition: dfa.cpp:229
std::wstring_convert< std::codecvt_utf8< char32_t >, char32_t > converter
Converts between UTF-8-encoded and UTF-32-encoded strings.
Definition: dfa.cpp:1001

◆ getAlphabet()

vector< char32_t > const & reg::dfa::getAlphabet ( ) const

Fetches this DFA's set of processable symbols.

Returns
a vector containing all the valid symbol inputs for delta and deltaHat

Definition at line 294 of file dfa.cpp.

294  {
295  return p->alphabet;
296 }

◆ getLabelOf()

string const & reg::dfa::getLabelOf ( size_t  q) const

Puts a name to an index.

Parameters
qthe state's index (< getNumberOfStates)
Returns
the state's name

Definition at line 286 of file dfa.cpp.

286  {
287  return p->labels.at(q);
288 }

◆ getNumberOfStates()

size_t reg::dfa::getNumberOfStates ( ) const

Counts this DFA's states.

Returns
the number of states this DFA has

Definition at line 310 of file dfa.cpp.

310  {
311  return p->labels.size();
312 }

◆ getShortestUtf8Word()

string reg::dfa::getShortestUtf8Word ( ) const

Same as above for a UTF-8-encoded word, INCLUDING POTENTIAL INFINITE LOOP.

Definition at line 277 of file dfa.cpp.

277  {
278  return converter.to_bytes(getShortestWord());
279 }
std::u32string getShortestWord() const
Searches the shortest UTF-32-encoded word accepted by this DFA.
Definition: dfa.cpp:254
std::wstring_convert< std::codecvt_utf8< char32_t >, char32_t > converter
Converts between UTF-8-encoded and UTF-32-encoded strings.
Definition: dfa.cpp:1001

◆ getShortestWord()

u32string reg::dfa::getShortestWord ( ) const

Searches the shortest UTF-32-encoded word accepted by this DFA.

LOOPS INFINITELY FOR DFA WITH AN EMPTY LANGUAGE!

Hint: Compare with the empty-language dfa() before using this method:

dfa empty();
if (myDfa != empty) {
  myDfa.getShortestWord();
}
Returns
the shortest word leading to one of this DFA's accept states

Definition at line 254 of file dfa.cpp.

254  {
255  if (p->accepting[0]) {
256  return U"";
257  }
258  unordered_map<size_t, u32string> shortestWords(p->labels.size());
259  shortestWords.emplace(0, U"");
260  while (true) {
261  for (auto const& stateWord : shortestWords) {
262  for (auto symbol : p->alphabet) {
263  size_t reached = delta(stateWord.first, symbol);
264  u32string shWord = stateWord.second + symbol;
265  if (p->accepting[reached]) {
266  return shWord;
267  }
268  if (shortestWords.find(reached) == shortestWords.end()) {
269  shortestWords.emplace(reached, shWord);
270  }
271  }
272  }
273  }
274 }
size_t delta(size_t q, char32_t symbol) const
Computes this DFA's transition function for a state and a symbol.
Definition: dfa.cpp:203

◆ getUtf8Alphabet()

vector< string > const & reg::dfa::getUtf8Alphabet ( ) const

Fetches this DFA's set of processable symbols as UTF-8-encoded strings.

Returns
a vector containing all the valid symbol inputs for delta and deltaHat

Definition at line 302 of file dfa.cpp.

302  {
303  return p->utf8Alphabet;
304 }

◆ intersect()

dfa::builder reg::dfa::intersect ( dfa const &  d1,
dfa const &  d2 
)
static

Creates a builder for a DFA accepting the intersection of the languages of two DFAs.

The input DFAs' state names will be concatenated and collisions resolved by appending _ in the created DFA.

Parameters
d1the first DFA
d2the other DFA
Returns
builder for a DFA accepting all the words accepted by both of the input DFAs

Definition at line 348 of file dfa.cpp.

348  {
349  return builder(d1).intersect(d2);
350 }

◆ isAccepting()

bool reg::dfa::isAccepting ( size_t  q) const

Tests whether a state is an accept state within this DFA.

Parameters
qthe state's index (< getNumberOfStates)
Returns
true if the state is in the set of accept states, false else

Definition at line 319 of file dfa.cpp.

319  {
320  if (q >= p->labels.size()) {
321  throw std::out_of_range(
322  string("There is no state with index ").append(std::to_string(q))
323  );
324  }
325  return p->accepting[q];
326 }

◆ operator!=()

bool reg::dfa::operator!= ( const dfa d) const

Tests whether this DFA doesn't accept the same language as another one.

Definition at line 195 of file dfa.cpp.

195 {return !operator==(d);}
bool operator==(const dfa &d) const
Tests whether this DFA accepts exactly the same language as another one.
Definition: dfa.cpp:151

◆ operator=() [1/2]

dfa & reg::dfa::operator= ( const dfa d)

Copy-assigns this DFA by copying another one's private implementation object.

Definition at line 127 of file dfa.cpp.

127  {
128  if (this != &d) {
129  p.reset(new pImpl(*(d.p)));
130  }
131  return *this;
132 }

◆ operator=() [2/2]

dfa & reg::dfa::operator= ( dfa &&  d)

Move-assigns this DFA by stealing another one's private implementation object.

The other DFA will be accepting the empty language ∅ afterwards.

Definition at line 136 of file dfa.cpp.

136  {
137  if (this != &d) {
138  p.reset(new pImpl);
139  p.swap(d.p);
140  }
141  return *this;
142 }

◆ operator==()

bool reg::dfa::operator== ( const dfa d) const

Tests whether this DFA accepts exactly the same language as another one.

More specifically, checks whether this DFA's initial state is indistinguishable from the other one's.

Definition at line 151 of file dfa.cpp.

151  {
152  forward_list<char32_t> unitedAlphabet(p->alphabet.begin(), p->alphabet.end());
153  size_t unitedAlphabetSize(p->alphabet.size());
154  for (char32_t symbol : d.getAlphabet()) {
155  if (find(p->alphabet.begin(), p->alphabet.end(), symbol) == p->alphabet.end()) {
156  unitedAlphabet.push_front(symbol);
157  unitedAlphabetSize++;
158  }
159  }
160  vector<vector<size_t>> tTable(getNumberOfStates()+d.getNumberOfStates()+1, vector<size_t>(unitedAlphabetSize));
161  valarray<bool> accepting(false, getNumberOfStates()+d.getNumberOfStates()+1);
162  for (size_t q(0); q < getNumberOfStates(); q++) {
163  size_t s(0);
164  vector<size_t>& row = tTable[q];
165  for (char32_t symbol : unitedAlphabet) {
166  try {
167  row[s] = delta(q, symbol);
168  } catch (std::domain_error e) {
169  row[s] = getNumberOfStates()+d.getNumberOfStates();
170  }
171  s++;
172  }
173  accepting[q] = isAccepting(q);
174  }
175  for (size_t q(0); q < d.getNumberOfStates(); q++) {
176  size_t s(0);
177  vector<size_t>& row = tTable[q+getNumberOfStates()];
178  for (char32_t symbol : unitedAlphabet) {
179  try {
180  row[s] = getNumberOfStates()+d.delta(q, symbol);
181  } catch (std::domain_error e) {
182  row[s] = getNumberOfStates()+d.getNumberOfStates();
183  }
184  s++;
185  }
186  accepting[q+getNumberOfStates()] = d.isAccepting(q);
187  }
188  for (size_t s(0); s < unitedAlphabetSize; s++) {
189  tTable[getNumberOfStates()+d.getNumberOfStates()][s] = getNumberOfStates()+d.getNumberOfStates();
190  }
191  return pImpl::indistinguishableStates(tTable, accepting)[0][getNumberOfStates()];
192 }
bool isAccepting(size_t q) const
Tests whether a state is an accept state within this DFA.
Definition: dfa.cpp:319
size_t delta(size_t q, char32_t symbol) const
Computes this DFA's transition function for a state and a symbol.
Definition: dfa.cpp:203
static vector< valarray< bool > > indistinguishableStates(vector< vector< size_t >> const &transitions, valarray< bool > const &accepting)
Builds the table of indistinguishable states w.r.t. a transition function.
Definition: dfa.cpp:61
size_t getNumberOfStates() const
Counts this DFA's states.
Definition: dfa.cpp:310

◆ subtract()

dfa::builder reg::dfa::subtract ( dfa const &  d1,
dfa const &  d2 
)
static

Creates a builder for a DFA accepting the set difference of the languages of two DFAs.

The input DFAs' state names will be concatenated and collisions resolved by appending _ in the created DFA.

Parameters
d1the first DFA
d2the other DFA
Returns
builder for a DFA accepting all the words accepted by the first but not the other input DFA

Definition at line 360 of file dfa.cpp.

360  {
361  builder b1(d1);
362  for (auto symbol : d2.getAlphabet()) {
363  b1.addSymbol(symbol);
364  }
365  return b1.complement().unite(d2).complement();
366 }

◆ unite()

dfa::builder reg::dfa::unite ( dfa const &  d1,
dfa const &  d2 
)
static

Creates a builder for a DFA accepting the union of the languages of two DFAs.

The input DFAs' state names will be concatenated and collisions resolved by appending _ in the created DFA.

Parameters
d1the first DFA
d2the other DFA
Returns
builder for a DFA accepting all the words accepted by any of the input DFAs

Definition at line 336 of file dfa.cpp.

336  {
337  return builder(d1).unite(d2);
338 }

The documentation for this class was generated from the following files: