Left recursion
In computer science, left recursion is a special case of recursion.
In terms of context-free grammars, a non-terminal r is left-recursive if the left-most symbol in any of r's alternatives rewrites to r again, either immediately (direct left recursion) or through some other non-terminal definitions (indirect or hidden left recursion).
Definition
"A grammar is left-recursive if we can find some non-terminal A which will eventually derive a sentential form with itself as the left-symbol."[1]
Immediate left recursion
Immediate left recursion occurs in rules of the form
A → Aα | β
where α and β are sequences of nonterminals and terminals, and β does not start with A.
Example: the rule
Expr → Expr + Term
is immediately left-recursive. A recursive descent parser for this rule might look like:
    function Expr() {
        Expr();  match('+');  Term();
    }
Because Expr() calls itself before consuming any input, a recursive descent parser falls into infinite recursion when trying to parse with a grammar that contains this rule.
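The infinite recursion can be observed directly. The following is a minimal Python sketch of the parser above; the `Parser` class, the `parse_overflows` helper, and the assumption that a Term is just the token 'a' are all made up for this illustration.

```python
class Parser:
    """Naive recursive descent parser for the rule  Expr -> Expr '+' Term."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def match(self, expected):
        # Consume one token if it equals `expected`, otherwise fail.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == expected:
            self.pos += 1
        else:
            raise SyntaxError(f"expected {expected!r} at position {self.pos}")

    def term(self):
        # Hypothetical simplification: a Term is just the token 'a'.
        self.match("a")

    def expr(self):
        self.expr()      # re-enters expr() with self.pos unchanged
        self.match("+")  # never reached
        self.term()


def parse_overflows(tokens):
    """Return True if parsing exhausts Python's call stack."""
    try:
        Parser(tokens).expr()
    except RecursionError:
        return True
    return False
```

Calling `parse_overflows(["a", "+", "a"])` reports the overflow regardless of the input, since no token is ever examined before the recursive call.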
Indirect left recursion
Indirect left recursion in its simplest form can be defined as:
A → Bα | C
B → Aβ | D
possibly giving the derivation
A ⇒ Bα ⇒ Aβα ⇒ ...
More generally, for the nonterminals A0, A1, ..., An, indirect left recursion can be defined as being of the form:
A0 → A1 α1 | ...
A1 → A2 α2 | ...
...
An → A0 α(n+1) | ...
where α1, α2, ..., α(n+1) are sequences of nonterminals and terminals.
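Both kinds of left recursion can be detected by following the leftmost symbols of each alternative. The sketch below is one way to do this in Python; the function name and the dictionary encoding of the grammar are choices made for this illustration, and it assumes the grammar has no ε-productions, so that only the first symbol of an alternative can be leftmost.

```python
def left_recursive_nonterminals(grammar):
    """Return the nonterminals A for which A derives a form starting with A.

    `grammar` maps each nonterminal to a list of alternatives, and each
    alternative is a list of symbols. Assumes no ε-productions.
    """
    # Edge A -> B whenever some alternative of A starts with nonterminal B.
    first = {
        a: {alt[0] for alt in alts if alt and alt[0] in grammar}
        for a, alts in grammar.items()
    }

    def reaches(start):
        # Nonterminals reachable from `start` along one or more edges.
        seen, stack = set(), list(first[start])
        while stack:
            sym = stack.pop()
            if sym not in seen:
                seen.add(sym)
                stack.extend(first[sym])
        return seen

    return {a for a in grammar if a in reaches(a)}
```

For the simplest indirect case, `left_recursive_nonterminals({"A": [["B", "x"], ["c"]], "B": [["A", "y"], ["d"]]})` reports both A and B, because each reaches itself through the other.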
Accommodating Left Recursion in Top-down Parsing
A formal grammar that contains left recursion cannot be parsed by a naive recursive descent parser unless it is converted to a weakly equivalent right-recursive form. (In contrast, left recursion is preferred for LALR parsers because it results in lower stack usage than right recursion.) However, a more sophisticated top-down parser can accommodate left-recursive grammars (along with all other forms of general CFGs) by use of curtailment. Frost and Hafiz (2006) describe a recognition algorithm that accommodates ambiguous grammars with direct left-recursive production rules [2]. Frost, Hafiz and Callaghan (2007) extended that algorithm to a complete parsing algorithm that accommodates indirect as well as direct left recursion in polynomial time, and that generates compact polynomial-size representations of the potentially exponential number of parse trees for highly ambiguous grammars [3]. The algorithm has since been implemented as a set of parser combinators written in the Haskell programming language. The implementation details of this set of combinators can be found in a paper [4] by the same authors, presented at PADL'08. The X-SAIGA site has more about the algorithms and implementation details.
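The idea of curtailment can be illustrated with a much-simplified recognizer. The Python sketch below is not the algorithm of the cited papers (it has no memoization and so makes no polynomial-time claim); it only shows a top-down recognizer that stops re-entering a nonterminal at the same input position once the nesting depth exceeds a bound. The bound used here, remaining tokens plus one, and all names are assumptions of this sketch.

```python
def recognize(grammar, start, tokens):
    """Top-down recognizer that tolerates left recursion by curtailment.

    `grammar` maps nonterminals to lists of alternatives (lists of symbols);
    any symbol that is not a key of `grammar` is treated as a terminal.
    """
    n = len(tokens)
    depth = {}  # (nonterminal, position) -> active calls on the current stack

    def parse_symbol(sym, pos):
        # Returns the set of positions at which `sym` can end when started at pos.
        if sym not in grammar:  # terminal
            return {pos + 1} if pos < n and tokens[pos] == sym else set()
        key = (sym, pos)
        if depth.get(key, 0) > n - pos + 1:
            return set()  # curtail: deeper nesting cannot match new input
        depth[key] = depth.get(key, 0) + 1
        ends = set()
        for alt in grammar[sym]:
            positions = {pos}
            for s in alt:
                positions = {e for p in positions for e in parse_symbol(s, p)}
                if not positions:
                    break
            ends |= positions
        depth[key] -= 1
        return ends

    return n in parse_symbol(start, 0)
```

With the grammar `{"Expr": [["Expr", "+", "Term"], ["Term"]], "Term": [["a"]]}`, the call `recognize(grammar, "Expr", "a + a + a".split())` terminates and accepts the input, whereas a naive recursive descent parser would not terminate on the same grammar.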
Removing left recursion
Removing immediate left recursion
The general algorithm to remove immediate left recursion follows. Several improvements to this method have been made, including the ones described in "Removing Left Recursion from Context-Free Grammars" [5], written by Robert C. Moore.
For each rule of the form
A → Aα1 | ... | Aαn | β1 | ... | βm
where:
- A is a left-recursive nonterminal
- each α is a sequence of nonterminals and terminals that is not null (α ≠ ε)
- each β is a sequence of nonterminals and terminals that does not start with A.
Replace the A-production by the production:
A → β1 A' | ... | βm A'
and create a new nonterminal
A' → ε | α1 A' | ... | αn A'
This newly created symbol is often called the "tail", or the "rest".
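The transformation above can be sketched in Python as follows. The function name, the list-of-symbols encoding (with the empty list standing for ε), and the convention of naming the tail by appending an apostrophe are assumptions of this sketch.

```python
def remove_immediate_left_recursion(nt, alternatives):
    """Rewrite A -> A α1 | ... | A αn | β1 | ... | βm without left recursion.

    Alternatives are lists of symbols; the empty list stands for ε.
    Returns a dict with the rewritten rules for `nt` and, if needed,
    for the new tail nonterminal nt + "'".
    """
    # α parts: what follows the leading A in each left-recursive alternative.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # β parts: alternatives that do not start with A.
    betas = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not alphas:
        return {nt: alternatives}  # no immediate left recursion to remove
    if any(not alpha for alpha in alphas):
        raise ValueError("rule of the form A -> A is not handled (α must not be ε)")
    tail = nt + "'"
    return {
        nt: [beta + [tail] for beta in betas],                  # A  -> β A'
        tail: [[]] + [alpha + [tail] for alpha in alphas],      # A' -> ε | α A'
    }
```

For example, `remove_immediate_left_recursion("Expr", [["Expr", "+", "Term"], ["Term"]])` yields the rules Expr → Term Expr' and Expr' → ε | + Term Expr'.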
Removing indirect left recursion
If the grammar has no ε-productions (no productions of the form A → ... | ε | ...) and is not cyclic (no derivations of the form A ⇒ ... ⇒ A for any nonterminal A), this general algorithm may be applied to remove indirect left recursion:
Arrange the nonterminals in some (any) fixed order A1, ..., An.
    for i = 1 to n {
        for j = 1 to i – 1 {
            let the current Aj productions be
                Aj → δ1 | ... | δk
            replace each production Ai → Aj γ by
                Ai → δ1 γ | ... | δk γ
        }
        remove direct left recursion for Ai
    }
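The loop above can be sketched in Python as follows. The function name, the grammar encoding (a dictionary from nonterminals to lists of alternatives, with the empty list standing for ε), and the apostrophe naming of tail nonterminals are assumptions of this sketch; as in the text, the input grammar is assumed to have no ε-productions and no cycles.

```python
def remove_left_recursion(grammar, order):
    """Remove direct and indirect left recursion from `grammar`.

    `order` lists the nonterminals A1, ..., An in the chosen fixed order.
    Returns a new grammar; the input is not modified.
    """
    g = {a: [list(alt) for alt in alts] for a, alts in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            # Replace each Ai -> Aj γ by Ai -> δ1 γ | ... | δk γ,
            # where Aj -> δ1 | ... | δk are the current Aj productions.
            expanded = []
            for alt in g[ai]:
                if alt and alt[0] == aj:
                    expanded.extend(delta + alt[1:] for delta in g[aj])
                else:
                    expanded.append(alt)
            g[ai] = expanded
        # Remove direct left recursion for Ai by introducing a tail Ai'.
        alphas = [alt[1:] for alt in g[ai] if alt and alt[0] == ai]
        if alphas:
            betas = [alt for alt in g[ai] if not alt or alt[0] != ai]
            tail = ai + "'"
            g[ai] = [beta + [tail] for beta in betas]
            g[tail] = [[]] + [alpha + [tail] for alpha in alphas]
    return g
```

Applied to A → B x | c and B → A y | d with the order A, B, the sketch leaves A unchanged and rewrites B as B → c y B' | d B' with B' → ε | x y B'.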
Pitfalls
The above transformations remove left recursion by creating a right-recursive grammar, but this changes the associativity of the rules: left recursion yields left associativity, and right recursion yields right associativity. Example: we start out with the grammar
Expr → Expr + Term | Term
Term → Factor
Factor → Int
After applying the standard transformation to remove left recursion, we have the following grammar:
Expr → Term Expr'
Expr' → + Term Expr' | ε
Term → Factor
Factor → Int
Parsing the string 'a + a + a' with the first grammar in an LALR parser (which can recognize left-recursive grammars) would have resulted in the parse tree :
                  Expr
                /  |  \
            Expr   +   Term
          /  |  \        |
      Expr   +   Term   Factor
       |          |      |
      Term      Factor  Int
       |          |
     Factor      Int
       |
      Int
This parse tree grows to the left, indicating that the '+' operator is left associative, representing (a + a) + a.
But now that we've changed the grammar, our parse tree looks like this :
           Expr
          /    \
       Term     Expr'
        |      /  |  \
     Factor   +  Term  Expr'
        |         |   /  |  \
       Int    Factor +  Term  Expr'
                  |       |     |
                 Int   Factor   ε
                          |
                         Int
We can see that the tree grows to the right, representing a + (a + a). The associativity of the '+' operator has changed: it is now right-associative. This does not affect the value of a sum, because addition is associative, but the result would differ significantly if the operator were subtraction.
The problem is that normal arithmetic requires left associativity. Possible solutions are: (a) keep the grammar left-recursive and use a parser that accepts left recursion; (b) rewrite the grammar with more nonterminals to force the correct precedence and associativity; or (c) if using YACC or Bison, use the operator declarations %left, %right and %nonassoc, which tell the parser generator which associativity to force.
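The effect on subtraction can be checked numerically. The Python sketch below uses a hypothetical grammar Expr → Int Expr', Expr' → - Int Expr' | ε (the example grammar with '-' in place of '+'); both function names are invented here. The first function combines values along the right-leaning parse tree. The second shows one common workaround in hand-written parsers, which is not among the options listed above: consume the tail in a loop and fold each operand into an accumulator, so the result is left-associative.

```python
def eval_right_recursive(tokens):
    """Evaluate along the right-leaning tree: a - (b - c)."""
    head, rest = int(tokens[0]), tokens[1:]
    if not rest:                 # Expr' -> ε
        return head
    assert rest[0] == "-"
    return head - eval_right_recursive(rest[1:])


def eval_left_fold(tokens):
    """Consume the tail iteratively, folding to the left: (a - b) - c."""
    acc = int(tokens[0])
    i = 1
    while i < len(tokens):       # each iteration handles one "- Int" of the tail
        assert tokens[i] == "-"
        acc -= int(tokens[i + 1])
        i += 2
    return acc
```

For the input `"8 - 3 - 2".split()`, the first function computes 8 - (3 - 2) = 7, while the second computes (8 - 3) - 2 = 3, the value that ordinary arithmetic expects.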
References
1. Power, J. "Notes on Formal Language Theory and Parsing." Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland.
2. Frost, R. and Hafiz, R. (2006) "A New Top-Down Parsing Algorithm to Accommodate Ambiguity and Left Recursion in Polynomial Time." ACM SIGPLAN Notices, Volume 41, Issue 5, pages 46-54.
3. Frost, R., Hafiz, R. and Callaghan, P. (2007) "Modular and Efficient Top-Down Parsing for Ambiguous Left-Recursive Grammars." 10th International Workshop on Parsing Technologies (IWPT), ACL-SIGPARSE, pages 109-120, June 2007, Prague.
4. Frost, R., Hafiz, R. and Callaghan, P. (2008) "Parser Combinators for Ambiguous Left-Recursive Grammars." 10th International Symposium on Practical Aspects of Declarative Languages (PADL), ACM-SIGPLAN, Volume 4902/2008, pages 167-181, January 2008, San Francisco.
5. Moore, R. C. "Removing Left Recursion from Context-Free Grammars." Microsoft Research, Redmond, WA, USA.