Ambiguity in the Grammar (S \rightarrow ABA,; A \rightarrow aA \mid \epsilon,; B \rightarrow bB \mid \epsilon)
Ambiguity in the Grammar (S \rightarrow ABA,; A \rightarrow aA \mid \epsilon,; B \rightarrow bB \mid \epsilon)
A context-free grammar is called ambiguous if there exists at least one string in its language that admits two distinct parse trees; equivalently, the same string has two different leftmost derivations or two different rightmost derivations.2 Ambiguity matters because a parser may assign multiple syntactic structures to the same input, which makes interpretation non-unique.2
For the grammar
the nonterminal generates any number of 's, including none, so , and generates any number of 's, including none, so .2 Therefore, the start symbol generates strings of the form
This can also be written as .2 The key source of ambiguity is that the two occurrences of in can both generate -strings, so when a derived string contains only 's, or when the middle vanishes to , there may be multiple ways to distribute the same 's between the left and right .2
A compact structural view is:
Because both and can independently derive and can derive , the boundary between the left and right blocks of 's is not unique for some strings.2
Footnotes
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩ ↩2 ↩3 ↩4 ↩5
-
CS 373: Theory of Computation - University of Illinois - Explains ambiguity of CFGs in terms of different parse trees and derivations. ↩ ↩2 ↩3
-
Context-Free Grammars and Languages - Discusses ambiguity, parse trees, and why multiple parses create non-unique structure. ↩ ↩2
-
Unambiguous Grammar - Includes this exact grammar as an ambiguity exercise and supports the analysis of how the productions generate strings. ↩ ↩2
Ambiguous Grammar
Core Definition
To prove a grammar ambiguous, it is enough to find one string in the language that has two distinct parse trees or two distinct leftmost derivations.2
Footnotes
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩
-
CS 373: Theory of Computation - University of Illinois - Explains ambiguity of CFGs in terms of different parse trees and derivations. ↩
Why this grammar is ambiguous
Let us choose the string
This string belongs to because we may let one generate , let , and let the other . However, there is also another derivation in which the first , , and the second generates . Since these correspond to different syntactic structures, the grammar is ambiguous.2
More generally, every string with is ambiguous in this grammar. Since , we have
and for any fixed , there are multiple choices of such that . For example, can be split as , , or . At least two such splits yield distinct parse trees, which is sufficient for ambiguity.2
An abstract view of the ambiguity for is:
| Left (A) contributes | (B) contributes | Right (A) contributes | Result |
|---|---|---|---|
| (a^n) | (\epsilon) | (\epsilon) | (a^n) |
| (a^{n-1}) | (\epsilon) | (a) | (a^n) |
| (\cdots) | (\epsilon) | (\cdots) | (a^n) |
| (\epsilon) | (\epsilon) | (a^n) | (a^n) |
Thus the grammar does not assign a unique structure to such strings.2
Footnotes
-
Unambiguous Grammar - Includes this exact grammar as an ambiguity exercise and supports the analysis of how the productions generate strings. ↩ ↩2
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩ ↩2 ↩3
-
CS 373: Theory of Computation - University of Illinois - Explains ambiguity of CFGs in terms of different parse trees and derivations. ↩ ↩2
-
Context-Free Grammars and Languages - Discusses ambiguity, parse trees, and why multiple parses create non-unique structure. ↩
Proof That the Grammar Is Ambiguous
- 1Step 1
From , the nonterminal generates . From , the nonterminal generates . Hence generates .2
Footnotes
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩
-
Unambiguous Grammar - Includes this exact grammar as an ambiguity exercise and supports the analysis of how the productions generate strings. ↩
-
- 2Step 2
Choose the shortest nontrivial string . It is in the language because one occurrence of can produce , while and the other can both produce .
Footnotes
-
Unambiguous Grammar - Includes this exact grammar as an ambiguity exercise and supports the analysis of how the productions generate strings. ↩
-
- 3Step 3
Here the left contributes the terminal , while and the right both derive .
- 4Step 4
Here the left derives , derives , and the right contributes the terminal .
- 5Step 5
These are two distinct leftmost derivations of the same terminal string . Therefore the grammar is ambiguous.2
Footnotes
-
CS 373: Theory of Computation - University of Illinois - Explains ambiguity of CFGs in terms of different parse trees and derivations. ↩
-
Context-Free Grammars and Languages - Discusses ambiguity, parse trees, and why multiple parses create non-unique structure. ↩
-
Two distinct parse trees for the string (a)
Below are two different parse trees for the same string .
Parse Tree 1: the left produces , while and the right produce .
Parse Tree 2: the left produces , produces , and the right produces .
Since these parse trees are structurally different but yield the same terminal string , the grammar is ambiguous by definition.2
Footnotes
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩
-
CS 373: Theory of Computation - University of Illinois - Explains ambiguity of CFGs in terms of different parse trees and derivations. ↩
Common Mistake
Different derivation orders do not by themselves prove ambiguity. The important point is that the same string must have different parse trees, or equivalently different leftmost derivations.2
Footnotes
-
CS 373: Theory of Computation - University of Illinois - Explains ambiguity of CFGs in terms of different parse trees and derivations. ↩
-
Context-Free Grammars and Languages - Discusses ambiguity, parse trees, and why multiple parses create non-unique structure. ↩
The grammar generates . The source of ambiguity is that both copies of generate the same language , and may disappear by deriving . Therefore some strings, especially , can be split between the two symbols in multiple ways.2
Footnotes
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩
-
Unambiguous Grammar - Includes this exact grammar as an ambiguity exercise and supports the analysis of how the productions generate strings. ↩
Number of Possible Splits of the String (a^n)
If (B \Rightarrow \epsilon), the (n) copies of (a) can be divided between the two occurrences of (A) in (n+1) ways.
Formal interpretation of the chart
For a string , if , then the left occurrence of may generate and the right occurrence may generate for any integer with . Hence the number of possible splits is
So for , there are splits; for , there are splits; and in general the grammar offers multiple structural decompositions of the same terminal string. The existence of more than one valid decomposition for even one value of already proves ambiguity.2
This also shows that the ambiguity is not accidental or isolated: it is a systematic effect caused by overlapping generative roles of the two nonterminals combined with the epsilon production .2
Footnotes
-
Unambiguous Grammar - Includes this exact grammar as an ambiguity exercise and supports the analysis of how the productions generate strings. ↩ ↩2
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩ ↩2
-
CS 373: Theory of Computation - University of Illinois - Explains ambiguity of CFGs in terms of different parse trees and derivations. ↩
Further Clarifications
Exam Strategy
When asked to prove a grammar ambiguous, first compute what each nonterminal generates, then look for overlap caused by repeated nonterminals or epsilon-productions. Short strings often expose ambiguity fastest.2
Footnotes
-
Parse trees, ambiguity, and Chomsky normal form - Defines ambiguity via existence of a string with at least two parse trees. ↩
-
Unambiguous Grammar - Includes this exact grammar as an ambiguity exercise and supports the analysis of how the productions generate strings. ↩
Knowledge Check
When is a context-free grammar called ambiguous?
Explore Related Topics
Lexical Analysis and the Main Structure Used: Finite Automata
Lexical analysis relies on finite automata—typically deterministic finite automata (DFA)—to recognize token patterns defined by regular expressions.
- Regular expressions for identifiers, numbers, etc., are converted to NFAs then to a DFA for fast scanning.
- The DFA processes the source character by character, tracking a single current state and emitting a token at each accepting state.
- Queues, stacks, and trees support other compiler phases (parsing, AST construction) but are not the primary model for token recognition.
- Lexers output a stream of tokens that the parser consumes for syntax analysis.
Which Grammar Type Is the Most Powerful? Understanding Type-0, Type-1, Type-2, and Type-3 Grammars
The most expressive grammar in the Chomsky hierarchy is the unrestricted Type‑0 grammar, which generates all recursively enumerable languages and matches the power of a Turing machine.
- The hierarchy is strict: , so each higher type can generate everything the lower types can, plus more.
- Type‑3 (regular) → finite automaton; Type‑2 (context‑free) → pushdown automaton; Type‑1 (context‑sensitive) → linear‑bounded automaton; Type‑0 (unrestricted) → Turing machine.
- Example languages: is context‑free but not regular; is context‑sensitive but not context‑free.
- “More powerful” refers to expressive capacity (ability to generate a larger class of languages), not ease of parsing or practical use.
Syntax-Directed Translation: Infix to Prefix Notation
The module shows how a syntax‑directed translation scheme using only synthesized attributes can convert infix arithmetic expressions into prefix (Polish) notation while preserving operator precedence and left‑associativity.
- Grammar: E → E + T | E - T | T; T → T * F | F; F → digit, enforcing precedence ( * > + / - ).
- Semantic actions compute a
valstring for each non‑terminal, concatenating the operator before its operand strings. - Example results:
9 - 5 + 2→+ - 9 5 2;9 - 5 * 2→- 9 * 5 2. - Synthesized (S‑attributed) attributes allow immediate bottom‑up evaluation during LR‑style parsing.
- Left‑recursive rules enable left‑associativity; to use LL parsers the grammar must be transformed and inherited attributes introduced.
