What is Error Recovery in Compiler Construction?
When a compiler finds a syntax error (such as a missing semicolon or a mismatched bracket), it should not
stop immediately. Instead, it should handle the error gracefully and try to continue compiling
the rest of the program.
This process is called Error Recovery.
Why is Error Recovery Important?
Without error recovery:
The compiler would stop at the first error, which is frustrating for programmers.
Programmers would have to fix one error, recompile, and repeat the cycle many times.
With good error recovery:
The compiler shows multiple errors at once, saving time.
It helps in debugging and fixing the program more quickly.
Types of Error Recovery Techniques
1. Panic Mode Recovery
Idea:
When an error is found, the compiler skips tokens (words of code) until it finds a
synchronizing symbol (like ;, }, etc.) that marks the end of a statement or block.
Example:
int a = 10
int b = 20;
The semicolon is missing after 10, causing an error.
In panic mode, the compiler skips everything after the error until it sees the next
semicolon ;.
Advantages:
Simple to implement
Never gets stuck in infinite loops
Disadvantage:
It may skip important parts of the code
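Here is a minimal Python sketch of panic-mode recovery applied to the example above. The token list, the helper name panic_mode_skip, and the choice of synchronizing symbols are illustrative assumptions, not taken from any particular compiler.

SYNC_TOKENS = {";", "}"}

def panic_mode_skip(tokens, pos):
    # Discard tokens until a synchronizing symbol is found,
    # then resume parsing just after it.
    while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
        pos += 1
    return pos + 1 if pos < len(tokens) else pos

# Tokens for:  int a = 10  int b = 20;
tokens = ["int", "a", "=", "10", "int", "b", "=", "20", ";"]
# The parser detects the error at index 4 (it expected ';' but saw 'int').
resume_at = panic_mode_skip(tokens, 4)
print(resume_at)  # 9: everything up to and including the next ';' was skipped

Note that the whole second declaration is discarded, which is exactly the disadvantage mentioned above.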
2. Phrase-Level Recovery
In this strategy, on discovering an error, the parser performs a local correction on the remaining
input: it may replace a prefix of the remaining input with some string that lets parsing continue.
Typical local corrections are replacing a comma with a semicolon, deleting an extra semicolon, or
inserting a missing semicolon. The set of corrections is decided by the compiler developer.
Examples:
int a,b
// AFTER RECOVERY:
int a,b; // Semicolon is added by the compiler
Advantages: This method is used in many error-repairing compilers.
Disadvantages: Care must be taken that the replacements do not send the parser into an
infinite loop.
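Here is a tiny Python sketch of phrase-level recovery for the example above. The helper name expect_semicolon and the way repairs are recorded are assumptions made for illustration.

def expect_semicolon(tokens, pos, errors):
    # Phrase-level (local) correction: if the expected ';' is missing,
    # pretend it was there, record a repair message, and keep parsing.
    if pos < len(tokens) and tokens[pos] == ";":
        return pos + 1
    nxt = tokens[pos] if pos < len(tokens) else "end of input"
    errors.append(f"missing ';' inserted before {nxt}")
    return pos

errors = []
tokens = ["int", "a", ",", "b"]          # ';' missing at the end
pos = expect_semicolon(tokens, 4, errors)
print(errors)  # ["missing ';' inserted before end of input"]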
3. Error Productions
This technique requires good knowledge of the common errors that are likely to be encountered;
the grammar of the language is then augmented with error productions that generate the
erroneous constructs. If an error production is used during parsing, the compiler can emit an
appropriate error message describing the faulty construct that was recognized in the input. The
method is difficult to maintain, because whenever the grammar changes, the corresponding error
productions must be changed as well.
Example: Suppose the input string is abcd.
Grammar: S -> A
A -> aA | bA | a | b
B -> cd
The string abcd cannot be derived from this grammar, so we augment the grammar with an
error production.
Grammar: E -> S B // AUGMENTED GRAMMAR
S -> A
A -> aA | bA | a | b
B -> cd
Now the string abcd can be derived, and when the error production E -> S B is used the parser
can report a specific error message.
Advantages:
1. Syntactic phase errors are generally recovered by error productions.
Disadvantages:
1. The method is very difficult to maintain: if the grammar changes, the corresponding error
productions must be changed as well, which places an extra burden on the compiler developers.
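To make the example above concrete, here is a toy Python sketch showing how the augmented grammar lets the parser both accept the faulty input abcd and report a targeted message. The function is a simplification written only for this example.

def parse_with_error_production(s):
    # S -> A,  A -> aA | bA | a | b : one or more a's and b's
    i = 0
    while i < len(s) and s[i] in "ab":
        i += 1
    if i == 0:
        return "reject: expected at least one 'a' or 'b'"
    if i == len(s):
        return "accept"
    # Error production E -> S B, B -> cd: recognize the erroneous suffix
    # and emit a specific message instead of just failing.
    if s[i:] == "cd":
        return "error: unexpected 'cd' after the a/b sequence"
    return "reject"

print(parse_with_error_production("ab"))    # accept
print(parse_with_error_production("abcd"))  # error message via the error production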
What is an LL(1) Grammar?
An LL(1) Grammar is a special type of grammar used in top-down parsing (predictive parsers).
Let's break it down:
L → Scan input Left to right
L → Produce the Leftmost derivation
1 → Use 1 token of lookahead (i.e., look only one symbol ahead in the input to decide
what rule to apply)
So, LL(1) means: Left-to-right scanning, Leftmost derivation, and 1 lookahead token
Why are LL(1) Grammars Important?
They are:
Easy to parse
Fast to implement using Predictive Parsers
Help avoid backtracking (rechecking again and again)
When is a Grammar LL(1)?
A grammar is LL(1) if, for every non-terminal:
1. No two of its productions have overlapping FIRST sets (in particular, no two alternatives
begin with the same terminal).
2. If one of its alternatives can derive ε (the empty string), then the FIRST sets of the other
alternatives must be disjoint from the FOLLOW set of that non-terminal.
FIRST and FOLLOW Sets (brief idea)
FIRST(X): Set of terminals that can appear first when X is derived.
FOLLOW(X): Set of terminals that can appear after X in some derivation.
Example of LL(1) Grammar
E → T E'
E' → + T E' | ε
T → id
Let’s understand:
We can decide which production to apply by looking at just one symbol ahead (+ selects
E' → + T E', id selects T → id)
For E', FIRST(+ T E') = { + } does not intersect FOLLOW(E'), so choosing the ε production
causes no conflict
So, this is a valid LL(1) grammar
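As a quick check, here is a small Python sketch that computes the FIRST sets of this grammar by iterating until nothing changes. The grammar representation, the name compute_first, and the use of "eps" for ε are assumptions made for this example.

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["eps"]],
    "T":  [["id"]],
}

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                derives_eps = True          # does the whole production derive eps?
                for sym in prod:
                    if sym == "eps":
                        continue
                    if sym not in grammar:  # terminal: add it and stop
                        if sym not in first[nt]:
                            first[nt].add(sym)
                            changed = True
                        derives_eps = False
                        break
                    new = first[sym] - {"eps"}
                    if not new <= first[nt]:
                        first[nt] |= new
                        changed = True
                    if "eps" not in first[sym]:
                        derives_eps = False
                        break
                if derives_eps and "eps" not in first[nt]:
                    first[nt].add("eps")
                    changed = True
    return first

print(compute_first(GRAMMAR))
# FIRST(E) = {id}, FIRST(E') = {+, eps}, FIRST(T) = {id}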
Example of Non-LL(1) Grammar
S → aA | aB
This is not LL(1) because:
Both productions of S start with 'a'
So, the parser can’t decide which one to use just by looking at 'a'
To fix this, we can apply Left Factoring (for example, rewrite it as S → aS', S' → A | B).
How to Make a Grammar LL(1)?
1. Remove Left Recursion
Replace rules like: A → Aα | β
With:
A → β A'
A' → α A' | ε
2. Left Factor the Grammar
If two productions start with the same symbol:
A → ab | ac
Turn them into:
A → aB
B → b | c
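Both transformations can be sketched in a few lines of Python. The grammar representation (productions as tuples of symbols, "eps" for ε) and the helper names are assumptions made for illustration, and the left-factoring helper only handles the simple single-symbol-prefix case shown above.

def remove_immediate_left_recursion(nt, productions):
    # A -> A alpha | beta   becomes   A -> beta A' ;  A' -> alpha A' | eps
    alphas = [p[1:] for p in productions if p and p[0] == nt]
    betas  = [p for p in productions if not p or p[0] != nt]
    if not alphas:
        return {nt: productions}
    new_nt = nt + "'"
    return {
        nt:     [beta + (new_nt,) for beta in betas],
        new_nt: [alpha + (new_nt,) for alpha in alphas] + [("eps",)],
    }

def left_factor(nt, productions, new_nt):
    # A -> a b | a c   becomes   A -> a B ;  B -> b | c
    # (simplified: assumes every production starts with the same single symbol)
    prefix = productions[0][0]
    return {
        nt:     [(prefix, new_nt)],
        new_nt: [p[1:] if len(p) > 1 else ("eps",) for p in productions],
    }

print(remove_immediate_left_recursion("A", [("A", "x"), ("y",)]))
# {'A': [('y', "A'")], "A'": [('x', "A'"), ('eps',)]}
print(left_factor("A", [("a", "b"), ("a", "c")], "B"))
# {'A': [('a', 'B')], 'B': [('b',), ('c',)]}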
Bonus: Uses of LL(1) Grammar
Syntax checking in compilers
Building simple parsers manually
Grammar design in language development
What is Predictive Parsing?
Predictive parsing is a kind of top-down parsing technique that tries to predict the production
rule to use based on the current input and grammar rules — without backtracking.
It is called "predictive" because it uses lookahead (usually 1 symbol) to make parsing decisions.
Variants of Predictive Parsing
There are two main types:
1. Recursive Descent Parsing
Uses a set of recursive functions for each non-terminal.
Manually written by programmers.
Easy to implement but can be inefficient for complex grammars.
Example behavior:
E → T E'
T → id
E' → + T E' | ε
Each non-terminal (E, T, E') becomes a separate function.
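For example, a hand-written recursive-descent parser for this grammar could look like the following Python sketch; the token representation and the class and method names are choices made for illustration.

class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens          # e.g. ["id", "+", "id"]
        self.pos = 0

    def lookahead(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "$"

    def match(self, expected):
        if self.lookahead() == expected:
            self.pos += 1
        else:
            raise ParseError(f"expected {expected!r}, got {self.lookahead()!r}")

    def parse_E(self):                # E -> T E'
        self.parse_T()
        self.parse_E_prime()

    def parse_E_prime(self):          # E' -> + T E' | eps
        if self.lookahead() == "+":
            self.match("+")
            self.parse_T()
            self.parse_E_prime()
        # otherwise choose the eps production: do nothing and return

    def parse_T(self):                # T -> id
        self.match("id")

    def parse(self):
        self.parse_E()
        if self.lookahead() != "$":
            raise ParseError("unexpected input after expression")

Parser(["id", "+", "id"]).parse()     # accepted (no exception raised)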
2. Non-Recursive Predictive Parsing (Table-Driven Parsing)
Uses a parsing table (called the LL(1) parsing table) and a stack.
This is a more automated and efficient method.
No recursion is used; instead, a stack guides the process.
Example components:
Parsing table (the LL(1) table)
Stack (initially contains the start symbol on top of $)
Input buffer
At each step, the rule to apply is looked up in the table using the top of the stack and the
lookahead symbol
Components Needed for Predictive Parser
To build a predictive parser (non-recursive):
1. Grammar in LL(1) form
2. FIRST and FOLLOW sets
3. Parsing Table
4. Stack + Input pointer
5. Parser logic to match terminals and expand non-terminals
Sample Program – Table-Driven Predictive Parser (in Python)
Let’s implement a simple LL(1) predictive parser for the following grammar:
Grammar:
E → T E'
E' → + T E' | ε
T → id
FIRST and FOLLOW (for simplicity):
FIRST(E) = { id }
FIRST(E') = { +, ε }
FIRST(T) = { id }
FOLLOW(E) = { $ }
FOLLOW(E') = { $ }
FOLLOW(T) = { +, $ }
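One possible way to fill in the sample program announced above is the following sketch. The table entries follow from the FIRST/FOLLOW sets just listed, and the names parse, TABLE, and NON_TERMINALS are choices made for illustration.

# Parsing table: (non-terminal, lookahead) -> right-hand side (list of symbols)
TABLE = {
    ("E",  "id"): ["T", "E'"],
    ("E'", "+"):  ["+", "T", "E'"],
    ("E'", "$"):  [],                 # E' -> eps
    ("T",  "id"): ["id"],
}
NON_TERMINALS = {"E", "E'", "T"}

def parse(tokens):
    # tokens: list of terminal names, e.g. ["id", "+", "id"]
    tokens = tokens + ["$"]
    stack = ["$", "E"]                # start symbol on top of the end marker
    i = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[i]
        if top == "$":
            return lookahead == "$"
        if top not in NON_TERMINALS:  # terminal on the stack: must match input
            if top == lookahead:
                i += 1
                continue
            print(f"error: expected {top!r}, got {lookahead!r}")
            return False
        rhs = TABLE.get((top, lookahead))
        if rhs is None:
            print(f"error: no rule for ({top}, {lookahead!r})")
            return False
        stack.extend(reversed(rhs))   # push the right-hand side, rightmost first
    return False

print(parse(["id", "+", "id"]))       # True
print(parse(["id", "+"]))             # False, with an error message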