Simulating a regular expression
| For other ways to use regular expressions in Grafstate, see the sections on∶ |
A regular expression is a string of finite length that encodes a regular language. The expression in a sequence of symbols from the input alphabet Σ together with the 3 regular operators. The regular operators are described below∶
Symbol | Grafstate code | Description |
---|
⋃ | \u | The union of two regular languages L1 ⋃ L2 is a regular language. |
· | \. | The concatenation of two regular languages L1 ∘ L2 is a regular language. |
* | * | The Kleene closure of a regular language L* is a regular language. |
 | The concatenation operator \. is optional. Adjacency may be used instead. |
 | The following regular expressions are considered to be equal∶ |
(a ⋃ b)* · c
(a ⋃ b)*c
Below is the Grafstate code for the regular expressions above∶
(a \u b)* \. c
(a \u b)*c
The regular expression above describes the language of strings consisting of any combinations of a's and b's followd by a single c.
A regular expression is checked against the following constraints. If a regular expression violates a constraint, then you will receive an error message.
1 | The regular expression can only consist of the symbols in the input alphabet and \e in addition to the parentheses ( and ) , and the regular operators \u , \. and * . |
|
2 | Parentheses must be balanced. |
|
3 | There cannot be two adjacent operators. |
|
4 | No operator can follow a left parenthesis. |
|
5 | \u and \. cannot precede a right parenthesis. |
|
 | Formally, ∅ is a valid regular expression. However, ∅ is not recognized as a Grafstate regular expression. |
Simulating a regular expression
You can simulate a regular expression with a string using the :sim
command with 2 arguments. The first argument is the regular expression with no spaces and the second argument is an input string. The input alphabet Σ is given as an argument introduced using the &
symbol.
 | The following Grafstate code simulates the regular expression (a ⋃ b)*c with the string babc∶ |
Code |
---|
:sim (a\ub)*c babc & S={a,b,c} |
Simulation |
---|
Input babc |
Result MATCHED |
Σ Input alphabet {a,b,c} |
Regular expression∶ (a⋃b)*c
If an input string is not matched by a regular expression, then the Simulator will indicate how far the string can be read by the regular expression before matching fails.
 | The following Grafstate code simulates the regular expression (ab)* with the string aba∶ |
Code |
---|
:sim (ab)* aba & S={a,b} |
Simulation |
---|
Input aba |
Result NOT MATCHED |
Σ Input alphabet {a,b} |
Regular expression∶ (ab)*
Understanding regular expressions
The regular expression can be defined formally using the following recursive definition∶
1 | "∅" is a regular expression, and L("∅")=∅. |
|
| The regular expression ∅ represents the empty language. |
|
2 | "ε" is a regular expression, and L("ε")={ε}. |
|
| The regular expression ε represents the language containing only the empty string. |
|
3 | ∀c∈Σ c is a regular expression, and L(c)={c}. |
|
| For each alphabet symbol c, the regular expression c represents the language containing only c. |
|
1 | Let R1 and R2 be regular expressions. Then R1 "⋃" R2 is a regular expression, and L(R1 "⋃" R2)=L(R1) ⋃ L(R2). |
|
2 | Let R1 and R2 be regular expressions. Then R1 "∘" R2 is a regular expression, and L(R1 "@" R2) = L(R1) ∘ L(R2). |
|
3 | Let R be a regular expression. Then R"*" is a regular expression, and L(R"*") = L(R)*. |
|
 | Let Σ = {a,b}. The regular expression (a ⋃ b)*a ⋃ ε represents the language |
(({a} ⋃ {b})* ∘ {a}) ⋃ {ε}.