Parsing
Short Notes
Parsing (Syntax Analysis) checks if the tokens follow the grammar rules and builds a Parse Tree.
Types of Parsers
- Top-Down: Starts from root (Start symbol) and reaches leaves. (LL parsers, Recursive Descent).
- Bottom-Up: Starts from leaves (Tokens) and reduces to root. (LR parsers, Operator Precedence).
Key Theories & Formulas
1. LL(1) Parsing
- Left-to-right scan, Leftmost derivation, 1 token lookahead.
- Condition: For \(A \to \alpha \mid \beta\), \(FIRST(\alpha) \cap FIRST(\beta) = \emptyset\).
- Requirement: No Left Recursion, No Left Factoring required.
2. LR Parsing
- Left-to-right scan, Reverse of Rightmost derivation.
- Power hierarchy: \(LR(0) < SLR < LALR < CLR\).
- Number of States: \(LR(0), SLR, LALR\) have the same number of states. \(CLR\) has many more.
3. Operator Precedence Parsing
- Only works for "Operator Grammars". No \(\epsilon\) or adjacent non-terminals.
Example Problems
Problem: Find FIRST and FOLLOW for: \(S \to aAB\) \(A \to b \mid \epsilon\) \(B \to c\)
- \(FIRST(A) = \{b, \epsilon\}\)
- \(FIRST(S) = \{a\}\)
- \(FOLLOW(A) = FIRST(B) = \{c\}\)
Hardest GATE Questions
Topic: Conflict Analysis in Parsing Tables Tricky Question (GATE 2011/2013/2018): Given an LR item set, is there a shift-reduce (S-R) or reduce-reduce (R-R) conflict?
- Analysis:
- S-R Conflict: A state has \(A \to \alpha \cdot a \beta\) and \(B \to \gamma \cdot\) where \(a \in FOLLOW(B)\) in SLR.
- R-R Conflict: Two reductions in the same state for the same lookahead.
- The "Trap": "LALR vs CLR".
- Merging two CLR states with same core but different lookaheads into one LALR state can create R-R conflicts but never S-R conflicts.
- This is a very common theoretical question.
- Hard Aspect: Identifying if a grammar is LR(1) by manually building the item sets.
- Complexity: Parsing complexity \(O(n^3)\) for any CFG (CYK algorithm), but \(O(n)\) for LR/LL grammars