Converting the Regular Expression into an NFA
The regular expression denotes the language of all strings over the alphabet that end with the suffix .2 Here, means union between and , the operator is the Kleene star, and juxtaposition means concatenation.2 Therefore, can be read as “any sequence of ’s and ’s, followed by , then .”
A standard way to convert a regular expression into an NFA is Thompson's construction.2 This method recursively builds small NFAs for symbols and combines them using rules for union, concatenation, and star, while preserving the language of the expression.2
For this expression, there are two useful viewpoints:
- Formal Thompson construction: build an -NFA step by step from .2
- Compact direct NFA: design a smaller NFA that accepts exactly the strings ending in .
This section presents both, with emphasis on the construction logic and acceptance behavior.2
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩ ↩2 ↩3 ↩4
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩ ↩2 ↩3 ↩4 ↩5 ↩6
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩ ↩2 ↩3
-
From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions. ↩
Regex to NFA Conversion Isn't Hard! (Sipser 1.28a)
Key Language Insight
The expression accepts exactly those strings over {a, b} whose last two symbols are 'a' followed by 'b'. Examples include ab, aab, bab, and bbab.
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩
Parsing the expression structurally
Before constructing the automaton, parse the expression as:
This matters because operator precedence in regular expressions treats star as applying to the immediately preceding subexpression, and concatenation then chains the resulting components.2
So the construction proceeds through these subexpressions:
| Subexpression | Meaning | Thompson rule used |
|---|---|---|
| single symbol | base rule | |
| single symbol | base rule | |
| either or | union | |
| any string over | Kleene star | |
| strings over followed by | concatenation | |
| strings over followed by | concatenation |
In Thompson construction, each symbol NFA has one start state and one accept state, union introduces a new start and accept state with -transitions, concatenation links the first NFA to the second using , and star adds new start/accept states plus looping -transitions.3
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩ ↩2
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩ ↩2
-
From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions. ↩
Thompson Construction for $(a+b)^*ab$
- 1Step 1
Create two states with one transition labeled 'a' from the start state to the accept state. This is the base case in Thompson construction.2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
-
- 2Step 2
Create two states with one transition labeled 'b' from the start state to the accept state. This is the second base component needed for the union and final suffix.2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
-
- 3Step 3
Introduce a new start state and a new accept state. Add ε-transitions from the new start to the start states of the 'a' and 'b' NFAs, then add ε-transitions from their accept states to the new accept state. This gives an NFA for .2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
-
- 4Step 4
Add a new start state and a new accept state. Add ε-transitions from the new start to the old start and directly to the new accept. Add ε-transitions from the old accept back to the old start and also to the new accept. This produces an NFA for , including the empty string.2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
-
- 5Step 5
Take the accept state of the NFA for and connect it by ε-transition to the start state of a new symbol-NFA for 'a'. The old accept state becomes non-final. This yields .2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions. ↩
-
- 6Step 6
Connect the accept state of the previous component by ε-transition to the start state of a new symbol-NFA for 'b'. The final accept state of this last component becomes the only accepting state. The resulting ε-NFA accepts exactly .2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
-
Thompson -NFA for
One valid state labeling is shown below. The labels are arbitrary; the structure is what matters.2
Let the states be:
- Start state:
- Accept state:
Transitions:
This follows exactly from the recursive rules of Thompson construction: base NFAs for and , then union for , then star for , then concatenation with and finally with .2
This NFA is an -NFA because several transitions consume no input symbol. Such -moves are standard in Thompson's method and do not change the recognized language.2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩ ↩2 ↩3
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩ ↩2
-
Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. ↩
How to Read the Star Component
In the star construction, the ε-path from the new start directly to the new accept allows zero repetitions, while the loop from the old accept back to the old start allows repeated occurrences of the subexpression.2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
A smaller direct NFA for the same language
Although Thompson construction is systematic, it often produces more states than necessary. Since means “all strings ending in ,” we can write a much smaller NFA directly.
Let:
- States:
- Start state:
- Accept state:
Transitions:
- From on : to and
- From on : to
- From on : to
No other transitions are needed.
Intuition:
- keeps scanning any prefix in .
- On reading an , the automaton may nondeterministically guess that this is the beginning of the final suffix .
- If the next symbol is , it reaches and accepts.
Formally, this NFA accepts exactly the strings whose last two symbols are .2
This direct NFA is often preferred in exams when the question simply asks to “convert the regular expression into NFA,” unless the instructor explicitly requires Thompson construction.2
Footnotes
-
From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions. ↩
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩ ↩2 ↩3 ↩4
-
Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. ↩ ↩2
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
Uses recursive construction rules for symbols, union, concatenation, and Kleene star. It is systematic and guaranteed to work for any regular expression, but may use many states and ε-transitions.2
Footnotes
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
State Complexity Comparison
A practical comparison between the systematic Thompson construction and a compact direct NFA for .
Transition table for the compact NFA
A transition table makes the direct NFA easier to simulate.2
| State | Input | Input |
|---|---|---|
Because the automaton is nondeterministic, the entry for contains two possible next states. Acceptance occurs if at least one computation path ends in the accepting state after all input is consumed.
A useful characterization is:
which directly explains why strings like , , and are accepted, while , , and are rejected.
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩ ↩2
-
Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. ↩ ↩2 ↩3
Simulating the Direct NFA on Input bbab
- 1Step 1
The current state set is {q0}. This is the initial configuration.
Footnotes
-
Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. ↩
-
- 2Step 2
From q0 on input b, the automaton stays in q0. The state set remains {q0}.
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩
-
- 3Step 3
Again, from q0 on input b, the state set remains {q0}. The machine is still scanning the arbitrary prefix from .
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩
-
- 4Step 4
From q0 on input a, the automaton can stay in q0 or move to q1. The new state set is {q0, q1}. This reflects the guess that the current a may start the final suffix ab.
Footnotes
-
Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. ↩
-
- 5Step 5
From q0 on b, one branch stays in q0, and from q1 on b another branch moves to q2. The resulting state set is {q0, q2}. Since q2 is accepting after the full input is consumed, the string is accepted.2
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩
-
Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. ↩
-
Common Questions and Exam Pitfalls
Common Mistake
Do not confuse with ; the first means any string over {a,b} ending in ab, while the second means zero or more repetitions of the exact block 'ab'. These languages are different.
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩
Final answer
If the task is to convert into an NFA, a concise correct NFA is:
- Start state
- Final state
Transition function :
and all other transitions go to .2
This NFA is equivalent to the regular expression because it accepts exactly those strings whose last two characters are . If the instructor requires Thompson construction specifically, the larger -NFA shown earlier is the formal algorithmic construction.2
Footnotes
-
Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like as strings ending in ab. ↩ ↩2
-
Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. ↩
-
THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. ↩
-
Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. ↩
Knowledge Check
What language is denoted by the regular expression ?
Explore Related Topics
Solving the Recurrence $T(n)=T(n-1)+n$ by Substitution Method
The course shows how to solve the decrease‑by‑one recurrence (T(n)=T(n-1)+n) (with (T(1)=1)) using the substitution method.
- Repeatedly substitute (T(n-i)=T(n-i-1)+(n-i)) until reaching the base case, yielding (T(n)=T(1)+2+3+\dots+n).
- The resulting sum is the triangular number (\frac{n(n+1)}{2}).
- The dominant term (\frac{1}{2}n^{2}) gives a tight asymptotic bound (\Theta(n^{2})).
- For recurrences of the form (T(n)=T(n-1)+f(n)), expanding to a summation quickly reveals the closed form.
Banker's Algorithm Safe-State Analysis for Processes $P_0$ Through $P_4$
The course walks through a full Banker's‑algorithm safety analysis for processes –, computing the Need matrix and using the work‑vector test to verify a safe state and a valid completion order.
- Need matrix: , , , , .
- Starting work ; each step selects a process with , releases its allocation, and updates .
- All processes finish, yielding a safe sequence .
- A safe state requires only one such sequence; it does not mean every immediate allocation is safe.
- Exam tip: test processes with minimal or zero Need first to unlock the rest.
Relational Algebra Equivalence: Why $\pi_A(R) - \pi_A((\pi_A(R) \times S) - R)$ Represents Division
The expression
[ \pi_A(R)-\pi_A\big((\pi_A(R)\times S)-R\big) ]
is a derived form of the relational‑algebra division operator, returning all (A) values that pair with every tuple in (S).
- Division is defined as (R\div S={a\mid\forall b\in S,;(a,b)\in R}).
- The formula works by (1) projecting candidate (A) values, (2) forming all required ((A,B)) pairs with (S), (3) subtracting existing pairs to find missing ones, (4) projecting the missing (A) values, and (5) removing them from the candidates.
- In the example, (R(A,B)={(1,x),(1,y),(2,x),(2,y),(3,x)}) and (S(B)={x,y}) yield (R\div S={1,2}).
- This construction captures the universal (“for all”) query pattern, unlike selection, join, or simple projection.
