CoursifyCoursify

Converting the Regular Expression (a+b)ab(a+b)^*ab into an NFA

Converting the Regular Expression (a+b)ab(a+b)^*ab into an NFA

Verified Sources
May 20, 2026

The regular expression (a+b)ab(a+b)^*ab denotes the language of all strings over the alphabet Σ={a,b}\Sigma=\{a,b\} that end with the suffix abab.2 Here, (a+b)(a+b) means union between aa and bb, the operator ^* is the Kleene star, and juxtaposition means concatenation.2 Therefore, (a+b)ab(a+b)^*ab can be read as “any sequence of aa’s and bb’s, followed by aa, then bb.”

A standard way to convert a regular expression into an NFA is Thompson's construction.2 This method recursively builds small NFAs for symbols and combines them using rules for union, concatenation, and star, while preserving the language of the expression.2

For this expression, there are two useful viewpoints:

  1. Formal Thompson construction: build an ε\varepsilon-NFA step by step from (a+b)ab(a+b)^*ab.2
  2. Compact direct NFA: design a smaller NFA that accepts exactly the strings ending in abab.

This section presents both, with emphasis on the construction logic and acceptance behavior.2

Footnotes

  1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab. 2 3 4

  2. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. 2 3 4 5 6

  3. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. 2 3

  4. From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions.

Regex to NFA Conversion Isn't Hard! (Sipser 1.28a)

Key Language Insight

The expression (a+b)ab(a+b)^*ab accepts exactly those strings over {a, b} whose last two symbols are 'a' followed by 'b'. Examples include ab, aab, bab, and bbab.

Footnotes

  1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab.

Parsing the expression structurally

Before constructing the automaton, parse the expression as:

((a+b))ab((a+b)^*)\cdot a \cdot b

This matters because operator precedence in regular expressions treats star as applying to the immediately preceding subexpression, and concatenation then chains the resulting components.2

So the construction proceeds through these subexpressions:

SubexpressionMeaningThompson rule used
aasingle symbolbase rule
bbsingle symbolbase rule
a+ba+beither aa or bbunion
(a+b)(a+b)^*any string over {a,b}\{a,b\}Kleene star
(a+b)a(a+b)^*astrings over {a,b}\{a,b\} followed by aaconcatenation
(a+b)ab(a+b)^*abstrings over {a,b}\{a,b\} followed by ababconcatenation

In Thompson construction, each symbol NFA has one start state and one accept state, union introduces a new start and accept state with ε\varepsilon-transitions, concatenation links the first NFA to the second using ε\varepsilon, and star adds new start/accept states plus looping ε\varepsilon-transitions.3

Footnotes

  1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. 2

  2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. 2

  3. From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions.

Thompson Construction for $(a+b)^*ab$

  1. 1
    Step 1

    Create two states with one transition labeled 'a' from the start state to the accept state. This is the base case in Thompson construction.2

    Footnotes

    1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

    2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

  2. 2
    Step 2

    Create two states with one transition labeled 'b' from the start state to the accept state. This is the second base component needed for the union and final suffix.2

    Footnotes

    1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

    2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

  3. 3
    Step 3

    Introduce a new start state and a new accept state. Add ε-transitions from the new start to the start states of the 'a' and 'b' NFAs, then add ε-transitions from their accept states to the new accept state. This gives an NFA for (a+b)(a+b).2

    Footnotes

    1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

    2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

  4. 4
    Step 4

    Add a new start state and a new accept state. Add ε-transitions from the new start to the old start and directly to the new accept. Add ε-transitions from the old accept back to the old start and also to the new accept. This produces an NFA for (a+b)(a+b)^*, including the empty string.2

    Footnotes

    1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

    2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

  5. 5
    Step 5

    Take the accept state of the NFA for (a+b)(a+b)^* and connect it by ε-transition to the start state of a new symbol-NFA for 'a'. The old accept state becomes non-final. This yields (a+b)a(a+b)^*a.2

    Footnotes

    1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

    2. From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions.

  6. 6
    Step 6

    Connect the accept state of the previous component by ε-transition to the start state of a new symbol-NFA for 'b'. The final accept state of this last component becomes the only accepting state. The resulting ε-NFA accepts exactly (a+b)ab(a+b)^*ab.2

    Footnotes

    1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

    2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

Thompson ε\varepsilon-NFA for (a+b)ab(a+b)^*ab

One valid state labeling is shown below. The labels are arbitrary; the structure is what matters.2

Let the states be:

q0,q1,,q11q_0,q_1,\dots,q_{11}
  • Start state: q6q_6
  • Accept state: q11q_{11}

Transitions:

  • q0aq1q_0 \xrightarrow{a} q_1
  • q2bq3q_2 \xrightarrow{b} q_3
  • q4εq0q_4 \xrightarrow{\varepsilon} q_0
  • q4εq2q_4 \xrightarrow{\varepsilon} q_2
  • q1εq5q_1 \xrightarrow{\varepsilon} q_5
  • q3εq5q_3 \xrightarrow{\varepsilon} q_5
  • q6εq4q_6 \xrightarrow{\varepsilon} q_4
  • q6εq7q_6 \xrightarrow{\varepsilon} q_7
  • q5εq4q_5 \xrightarrow{\varepsilon} q_4
  • q5εq7q_5 \xrightarrow{\varepsilon} q_7
  • q7εq8q_7 \xrightarrow{\varepsilon} q_8
  • q8aq9q_8 \xrightarrow{a} q_9
  • q9εq10q_9 \xrightarrow{\varepsilon} q_{10}
  • q10bq11q_{10} \xrightarrow{b} q_{11}

This follows exactly from the recursive rules of Thompson construction: base NFAs for aa and bb, then union for (a+b)(a+b), then star for (a+b)(a+b)^*, then concatenation with aa and finally with bb.2

This NFA is an ε\varepsilon-NFA because several transitions consume no input symbol. Such ε\varepsilon-moves are standard in Thompson's method and do not change the recognized language.2

Footnotes

  1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction. 2 3

  2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state. 2

  3. Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions.

How to Read the Star Component

In the star construction, the ε-path from the new start directly to the new accept allows zero repetitions, while the loop from the old accept back to the old start allows repeated occurrences of the subexpression.2

Footnotes

  1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

  2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

A smaller direct NFA for the same language

Although Thompson construction is systematic, it often produces more states than necessary. Since (a+b)ab(a+b)^*ab means “all strings ending in abab,” we can write a much smaller NFA directly.

Let:

  • States: {q0,q1,q2}\{q_0,q_1,q_2\}
  • Start state: q0q_0
  • Accept state: q2q_2

Transitions:

  • From q0q_0 on aa: to q0q_0 and q1q_1
  • From q0q_0 on bb: to q0q_0
  • From q1q_1 on bb: to q2q_2

No other transitions are needed.

Intuition:

  • q0q_0 keeps scanning any prefix in (a+b)(a+b)^*.
  • On reading an aa, the automaton may nondeterministically guess that this aa is the beginning of the final suffix abab.
  • If the next symbol is bb, it reaches q2q_2 and accepts.

Formally, this NFA accepts exactly the strings whose last two symbols are abab.2

This direct NFA is often preferred in exams when the question simply asks to “convert the regular expression into NFA,” unless the instructor explicitly requires Thompson construction.2

Footnotes

  1. From Regular Languages to Finite Automata - University of Waterloo - Slides covering NFA construction for union, concatenation, and Kleene star using ε-transitions.

  2. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab. 2 3 4

  3. Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. 2

  4. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

Uses recursive construction rules for symbols, union, concatenation, and Kleene star. It is systematic and guaranteed to work for any regular expression, but may use many states and ε-transitions.2

Footnotes

  1. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

  2. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

State Complexity Comparison

A practical comparison between the systematic Thompson construction and a compact direct NFA for (a+b)ab(a+b)^*ab.

Transition table for the compact NFA

A transition table makes the direct NFA easier to simulate.2

StateInput aaInput bb
q0q_0{q0,q1}\{q_0,q_1\}{q0}\{q_0\}
q1q_1\varnothing{q2}\{q_2\}
q2q_2\varnothing\varnothing

Because the automaton is nondeterministic, the entry for (q0,a)(q_0,a) contains two possible next states. Acceptance occurs if at least one computation path ends in the accepting state after all input is consumed.

A useful characterization is:

L((a+b)ab)={w{a,b}w ends with ab}L((a+b)^*ab)=\{\,w\in\{a,b\}^* \mid w \text{ ends with } ab\,\}

which directly explains why strings like abab, aabaab, and bbabbbab are accepted, while aa, abaaba, and bbbbbb are rejected.

Footnotes

  1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab. 2

  2. Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions. 2 3

Simulating the Direct NFA on Input bbab

  1. 1
    Step 1

    The current state set is {q0}. This is the initial configuration.

    Footnotes

    1. Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions.

  2. 2
    Step 2

    From q0 on input b, the automaton stays in q0. The state set remains {q0}.

    Footnotes

    1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab.

  3. 3
    Step 3

    Again, from q0 on input b, the state set remains {q0}. The machine is still scanning the arbitrary prefix from (a+b)(a+b)^*.

    Footnotes

    1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab.

  4. 4
    Step 4

    From q0 on input a, the automaton can stay in q0 or move to q1. The new state set is {q0, q1}. This reflects the guess that the current a may start the final suffix ab.

    Footnotes

    1. Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions.

  5. 5
    Step 5

    From q0 on b, one branch stays in q0, and from q1 on b another branch moves to q2. The resulting state set is {q0, q2}. Since q2 is accepting after the full input is consumed, the string is accepted.2

    Footnotes

    1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab.

    2. Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions.

Common Questions and Exam Pitfalls

Common Mistake

Do not confuse (a+b)ab(a+b)^*ab with (ab)(ab)^*; the first means any string over {a,b} ending in ab, while the second means zero or more repetitions of the exact block 'ab'. These languages are different.

Footnotes

  1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab.

Final answer

If the task is to convert (a+b)ab(a+b)^*ab into an NFA, a concise correct NFA is:

  • Q={q0,q1,q2}Q=\{q_0,q_1,q_2\}
  • Σ={a,b}\Sigma=\{a,b\}
  • Start state q0q_0
  • Final state F={q2}F=\{q_2\}

Transition function δ\delta:

δ(q0,a)={q0,q1},δ(q0,b)={q0},δ(q1,b)={q2}\delta(q_0,a)=\{q_0,q_1\},\quad \delta(q_0,b)=\{q_0\},\quad \delta(q_1,b)=\{q_2\}

and all other transitions go to \varnothing.2

This NFA is equivalent to the regular expression because it accepts exactly those strings whose last two characters are abab. If the instructor requires Thompson construction specifically, the larger ε\varepsilon-NFA shown earlier is the formal algorithmic construction.2

Footnotes

  1. Regular expression to NFA union concatenation Kleene star example (a+b)*ab - Tavily results - Discussion illustrating the language behavior of expressions like (ab)ab(a|b)^*ab as strings ending in ab. 2

  2. Thompson's construction - Wikipedia - Overview of Thompson's construction and epsilon-transition-based NFAs for regular expressions.

  3. THOMPSON CONSTRUCTION - IITKgp CSE - Lecture notes describing the recursive rules for building NFAs from regular expressions using Thompson construction.

  4. Regular Expressions - Stanford CS103 Lecture Notes - Notes explaining Thompson's algorithm and its structural properties, including one start and one accept state.

Knowledge Check

Question 1 of 4
Q1Single choice

What language is denoted by the regular expression (a+b)ab(a+b)^*ab?

Explore Related Topics

1

Solving the Recurrence $T(n)=T(n-1)+n$ by Substitution Method

The course shows how to solve the decrease‑by‑one recurrence (T(n)=T(n-1)+n) (with (T(1)=1)) using the substitution method.

  • Repeatedly substitute (T(n-i)=T(n-i-1)+(n-i)) until reaching the base case, yielding (T(n)=T(1)+2+3+\dots+n).
  • The resulting sum is the triangular number (\frac{n(n+1)}{2}).
  • The dominant term (\frac{1}{2}n^{2}) gives a tight asymptotic bound (\Theta(n^{2})).
  • For recurrences of the form (T(n)=T(n-1)+f(n)), expanding to a summation quickly reveals the closed form.
2

Banker's Algorithm Safe-State Analysis for Processes $P_0$ Through $P_4$

The course walks through a full Banker's‑algorithm safety analysis for processes P0P_0P4P_4, computing the Need matrix and using the work‑vector test to verify a safe state and a valid completion order.

  • Need matrix: P0  (0,0,0,0)P_0\;(0,0,0,0), P1  (0,7,5,0)P_1\;(0,7,5,0), P2  (1,0,0,2)P_2\;(1,0,0,2), P3  (0,0,2,0)P_3\;(0,0,2,0), P4  (0,6,4,2)P_4\;(0,6,4,2).
  • Starting work Work=Available=(1,5,2,0)Work = Available = (1,5,2,0); each step selects a process with NeedWorkNeed \le Work, releases its allocation, and updates WorkWork.
  • All processes finish, yielding a safe sequence P0,P2,P1,P3,P4\langle P_0, P_2, P_1, P_3, P_4\rangle.
  • A safe state requires only one such sequence; it does not mean every immediate allocation is safe.
  • Exam tip: test processes with minimal or zero Need first to unlock the rest.
3

Relational Algebra Equivalence: Why $\pi_A(R) - \pi_A((\pi_A(R) \times S) - R)$ Represents Division

The expression

[ \pi_A(R)-\pi_A\big((\pi_A(R)\times S)-R\big) ]

is a derived form of the relational‑algebra division operator, returning all (A) values that pair with every tuple in (S).

  • Division is defined as (R\div S={a\mid\forall b\in S,;(a,b)\in R}).
  • The formula works by (1) projecting candidate (A) values, (2) forming all required ((A,B)) pairs with (S), (3) subtracting existing pairs to find missing ones, (4) projecting the missing (A) values, and (5) removing them from the candidates.
  • In the example, (R(A,B)={(1,x),(1,y),(2,x),(2,y),(3,x)}) and (S(B)={x,y}) yield (R\div S={1,2}).
  • This construction captures the universal (“for all”) query pattern, unlike selection, join, or simple projection.
Chat with Kiro