Syntax-Directed Translation: Infix to Prefix Notation

Verified Sources

May 18, 2026

In compiler design, translating high-level program expressions into machine-friendly intermediate representations is a core task of the front-end parser. One common task is converting infix notation into prefix notation. This conversion is governed by rules of operator precedence and associativity .

To perform this translation systematically, we construct a Syntax-Directed Translation Scheme (SDTS) based on a context-free grammar (CFG) . By defining structural attributes for grammar symbols, we can perform translation in a single pass.

In this section, we will define a robust context-free grammar that handles basic arithmetic operators ( $+$ , $-$ , $*$ ) while respecting operator precedence. We will establish a Syntax-Directed Definition (SDD) that uses synthesized attributes to construct the prefix string.

Aho, A. V., Lam, M. S., Sethi, R., & Ullman, J. D. (2006). Compilers: Principles, Techniques, and Tools (2nd Edition). Addison-Wesley. (Sections on Syntax-Directed Translation and Parser design). ↩
Cooper, K. D., & Torczon, L. (2011). Engineering a Compiler. Morgan Kaufmann. (Analysis of expression grammars, operator precedence, and syntax trees). ↩

Syntax Directed Translation (SDT) | Scheme & Implementation

The Grammar & Semantic Rules

To handle addition ( $+$ ), subtraction ( $-$ ), and multiplication ( $*$ ), we must design a grammar that naturally implements operator precedence (multiplication has higher precedence than addition and subtraction) and left-associativity (operations of equal precedence are evaluated from left to right) .

We use the standard unambiguous expression grammar:

\begin{aligned} E &\to E_1 + T \mid E_1 - T \mid T \\ T &\to T_1 * F \mid F \\ F &\to \text{digit} \end{aligned}

To translate this grammar into prefix notation, we associate a synthesized string attribute, val, with each non-terminal ( $E$ , $T$ , and $F$ ). The token $\text{digit}$ has a lexical attribute $\text{lexval}$ which represents its character value.

The Syntax-Directed Translation Scheme (SDTS) for this conversion is defined below:

Production	Semantic Rule / Action
$E \to E_1 + T$	$E.\text{val} = \text{strcat}("+ ", E_1.\text{val}, " ", T.\text{val})$
$E \to E_1 - T$	$E.\text{val} = \text{strcat}("- ", E_1.\text{val}, " ", T.\text{val})$
$E \to T$	$E.\text{val} = T.\text{val}$
$T \to T_1 * F$	$T.\text{val} = \text{strcat}("* ", T_1.\text{val}, " ", F.\text{val})$
$T \to F$	$T.\text{val} = F.\text{val}$
$F \to \text{digit}$	$F.\text{val} = \text{digit}.\text{lexval}$

Parse Tree Construction and Evaluation

When a parser processes an input string, it builds a parse tree. In an S-attributed definition like ours, the synthesized attributes are evaluated bottom-up, starting from the leaf nodes ( $\text{digit}$ ) and propagating up to the root ( $E$ ) .

Cooper, K. D., & Torczon, L. (2011). Engineering a Compiler. Morgan Kaufmann. (Analysis of expression grammars, operator precedence, and syntax trees). ↩
Grune, D., van Reeuwijk, K., Bal, H. E., Jacobs, C. J., & Langendoen, K. (2012). Modern Compiler Design. Springer. (Detailed explanation of S-attributed and L-attributed Definitions). ↩

Why Use Synthesized Attributes?

Synthesized attributes are perfect for bottom-up parsing algorithms (like LR parsers). Since the attributes of a parent node depend entirely on the attributes of its children, the parser can execute the semantic translation action immediately upon reducing a production rule.

Parsing & Translating 9 - 5 + 2

1
Step 1
The input string 9 - 5 + 2 is tokenized into three digits and two operators: $\text{digit}_1 (9)$ , $-$ , $\text{digit}_2 (5)$ , $+$ , and $\text{digit}_3 (2)$ .
2
Step 2
First, reduce individual digits to factors ( $F$ ) and then terms ( $T$ ):

$F_1 \to \text{digit}_1 (9) \implies F_1.\text{val} = '9'$

$T_1 \to F_1 \implies T_1.\text{val} = '9'$

$E_1 \to T_1 \implies E_1.\text{val} = '9'$
3
Step 3
Next, parse the digit $5$ :

$F_2 \to \text{digit}_2 (5) \implies F_2.\text{val} = '5'$

$T_2 \to F_2 \implies T_2.\text{val} = '5'$

Apply the reduction $E_2 \to E_1 - T_2$ :

$E_2.\text{val} = \text{strcat}('- ', '9', ' ', '5') = '- 9 5'$
4
Step 4
Parse the final digit $2$ :

$F_3 \to \text{digit}_3 (2) \implies F_3.\text{val} = '2'$

$T_3 \to F_3 \implies T_3.\text{val} = '2'$

Apply the final reduction at the root $E \to E_2 + T_3$ :

$E.\text{val} = \text{strcat}('+ ', '- 9 5', ' ', '2') = '+ - 9 5 2'$

Parsing & Translating 9 - 5 * 2

1
Step 1
The input is 9 - 5 * 2. Because multiplication ( $*$ ) has a higher precedence than subtraction ( $-$ ), the grammar forces $5 * 2$ to be grouped together under a term ( $T$ ) before resolving the subtraction under an expression ( $E$ ).
2
Step 2
Reduce the first digit:

$F_1 \to \text{digit}_1 (9) \implies F_1.\text{val} = '9'$

$T_1 \to F_1 \implies T_1.\text{val} = '9'$

$E_1 \to T_1 \implies E_1.\text{val} = '9'$
3
Step 3
Reduce the remaining terms:

$F_2 \to \text{digit}_2 (5) \implies F_2.\text{val} = '5'$

$T_2 \to F_2 \implies T_2.\text{val} = '5'$

$F_3 \to \text{digit}_3 (2) \implies F_3.\text{val} = '2'$

Apply the reduction $T_3 \to T_2 * F_3$ :

$T_3.\text{val} = \text{strcat}('* ', '5', ' ', '2') = '* 5 2'$
4
Step 4
Apply the subtraction reduction at the root $E \to E_1 - T_3$ :

$E.\text{val} = \text{strcat}('- ', '9', ' ', '* 5 2') = '- 9 * 5 2'$

Annotated Parse Trees

Below are the visual parse trees with annotated synthesized attributes showing exactly how the final prefix string is constructed from bottom-up attribute evaluation .

Annotated Parse Tree for $9 - 5 + 2$

Annotated Parse Tree for $9 - 5 * 2$

Grune, D., van Reeuwijk, K., Bal, H. E., Jacobs, C. J., & Langendoen, K. (2012). Modern Compiler Design. Springer. (Detailed explanation of S-attributed and L-attributed Definitions). ↩

Left-Recursion and Infix-to-Prefix Translation

While left-recursive grammars are perfectly suited for bottom-up parsers (like LALR/Yacc), they cannot be parsed directly by top-down LL(1) parsers. If you write a top-down translator, you must eliminate left recursion first, which changes the grammar structure and requires using inherited attributes to pass operands down the parse tree.

This tab displays the output generated by our synthesized SDTS:

Input: 9 - 5 + 2
Prefix Output: + - 9 5 2
Input: 9 - 5 * 2
Prefix Output: - 9 * 5 2

In synthesized attributes, strings are accumulated and concatenated as operations are completed up the parse tree.

Parse Tree Nodes and Traversal Step Count

Comparison of node allocations and evaluation overhead between addition/subtraction and multiplication precedence levels.

Deep Dive into Grammar Mechanics

Knowledge Check

Question 1 of 3

Q1Single choice

Which of the following is the correct prefix notation for the infix expression 7 * (3 + 2)?

7 + 3 2

- 7 3 2

- 7 3 2

7 3 + 2

Explore Related Topics

Lexical Analysis and the Main Structure Used: Finite Automata

Lexical analysis relies on finite automata—typically deterministic finite automata (DFA)—to recognize token patterns defined by regular expressions.

Regular expressions for identifiers, numbers, etc., are converted to NFAs then to a DFA for fast scanning.
The DFA processes the source character by character, tracking a single current state and emitting a token at each accepting state.
Queues, stacks, and trees support other compiler phases (parsing, AST construction) but are not the primary model for token recognition.
Lexers output a stream of tokens that the parser consumes for syntax analysis.

Comparing Imperative and Non-Imperative Programming Languages

Imperative programming describes computation as explicit step‑by‑step commands that change mutable state, while non‑imperative (declarative) styles describe what result is desired, often minimizing direct state manipulation.

Imperative languages use assignments, loops, and explicit control flow (e.g., C, Java, Python).
Non‑imperative approaches include functional (Haskell), logic (Prolog), and query languages (SQL), focusing on expressions, relations, or rules.
Hybrid languages are common; developers mix imperative orchestration with functional cores or declarative queries.
Choose imperative when precise sequencing or low‑level control is needed; choose non‑imperative for abstraction, predictability, easier reasoning, and parallelism.
Functional code emphasizes immutability and pure functions, while logic code relies on unification and backtracking to infer answers.

Converting the Regular Expression $(a+b)^*ab$ into an NFA

The course shows how to turn the regular expression  $(a+b)^*ab$ —the set of all strings over  $\{a,b\}$ that end with “ab”—into an NFA, first via Thompson’s systematic ε‑construction and then with a minimal three‑state NFA.

Thompson’s construction builds ε‑transitions for symbols  $a$ ,  $b$ , their union, the Kleene star, and the final concatenations, yielding a 12‑state ε‑NFA.
A compact direct NFA uses only three states: $q_0$ loops on $a,b$ , branches on $a$ to $q_1$ , and $q_1$ moves on $b$ to accepting $q_2$ .
The compact NFA’s transition table illustrates nondeterministic moves (e.g., $δ(q_0,a)=\{q_0,q_1\}$ ) and accepts exactly the strings ending in “ab”.
State‑complexity comparison: Thompson’s method needs many states and ε‑moves, while the direct NFA is far smaller and easier to simulate.

Research more with Coursify

Syntax-Directed Translation: Infix to Prefix Notation

AI Summary

Footnotes

Syntax Directed Translation (SDT) | Scheme & Implementation

The Grammar & Semantic Rules

Parse Tree Construction and Evaluation

Footnotes

Why Use Synthesized Attributes?

Parsing & Translating 9 - 5 + 2

Parsing & Translating 9 - 5 * 2

Annotated Parse Trees

Annotated Parse Tree for 9−5+29 - 5 + 29−5+2

Annotated Parse Tree for 9−5∗29 - 5 * 29−5∗2

Footnotes

Left-Recursion and Infix-to-Prefix Translation

Parse Tree Nodes and Traversal Step Count

Deep Dive into Grammar Mechanics

Knowledge Check

Explore Related Topics

Annotated Parse Tree for $9 - 5 + 2$

Annotated Parse Tree for $9 - 5 * 2$