LL(*) Parsing

Provided by: terenc2
Learn more at: https://www.antlr.org

1
LL(*) Parsing
  • Terence Parr
  • University of San Francisco

2
Topics
  • Research goals
  • Background
  • Problem definition
  • Solution overview
  • What is LL(*)?
  • How much more powerful is it?
  • Limitations
  • Nondeterminism detection
  • LL(*) Algorithm
  • Generated code

3
Research goals
  • Make top-down LL-based parsers as powerful as
    possible
  • allows more natural grammars
  • makes language tools more accessible
  • My research is constrained by what programmers
    can/will use
  • recursive-descent parsers must be the base
  • k>1 fixed lookahead
  • semantic predicates
  • syntactic predicates: controlled backtracking and
    a means of specifying ambiguity resolution
  • And for my next trick: LL(*)

4
Background: parsers
  • Building a parser generator is easy except for
    the lookahead analysis
  • rule ref → rule()
  • token ref → match(t)
  • rule def → void rule() {
        if ( lookahead-expr-alt 1 ) { match alt 1 }
        else if ( lookahead-expr-alt 2 ) { match alt 2 }
        else error;
    }
  • The nature of the lookahead expressions dictates
    the strength of your parser generator

5
LL(2) parser example
void a() {
    if ( LA(1)==A && LA(2)==X ) {
        match(A); match(X); match(R);
    }
    else if ( LA(1)==A && LA(2)==Y ) {
        match(A); match(Y); match(S);
    }
    else error;
}

a : A X R | A Y S ;

Lookahead is a set of 2-sequences that indicate
which alternative will ultimately succeed
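The LL(2) decision above can be made runnable with a few lines of scaffolding. This is a minimal sketch, not generated ANTLR output: the class name `LL2Parser`, the string token names, the `"EOF"` padding, and the `parse` helper are all assumptions made for illustration.

```java
import java.util.List;

// Minimal sketch of the LL(2) decision for:  a : A X R | A Y S ;
// Tokens are plain strings; "EOF" pads lookahead past the end (an assumption).
final class LL2Parser {
    private final List<String> tokens;
    private int p = 0; // index of the next token to match

    LL2Parser(List<String> tokens) { this.tokens = tokens; }

    // LA(i): the i-th token of lookahead, 1-based
    String LA(int i) {
        int idx = p + i - 1;
        return idx < tokens.size() ? tokens.get(idx) : "EOF";
    }

    void match(String t) {
        if (!LA(1).equals(t)) {
            throw new IllegalStateException("expected " + t + " but found " + LA(1));
        }
        p++;
    }

    // Rule a: returns the number of the alternative that was matched
    int a() {
        if (LA(1).equals("A") && LA(2).equals("X")) {
            match("A"); match("X"); match("R");
            return 1;
        } else if (LA(1).equals("A") && LA(2).equals("Y")) {
            match("A"); match("Y"); match("S");
            return 2;
        }
        throw new IllegalStateException("no viable alternative at " + LA(1));
    }

    // Hypothetical convenience entry point
    static int parse(String... input) {
        return new LL2Parser(List.of(input)).a();
    }

    public static void main(String[] args) {
        System.out.println(parse("A", "X", "R")); // alternative 1
        System.out.println(parse("A", "Y", "S")); // alternative 2
    }
}
```

Both tests look at the same first token, so only the second token separates the alternatives; that is the observation the following slides build on.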
6
Lookahead as DFA
void a() {
    int alt=0;
    if ( LA(1)==A ) {
        if ( LA(2)==X ) alt=1;
        if ( LA(2)==Y ) alt=2;
    }
    switch (alt) {
        case 1 : match(A); match(X); match(R); break;
        case 2 : match(A); match(Y); match(S); break;
        default : error;
    }
}

7
Linear approximate lookahead
  • Note that LA(1) doesn't help distinguish the alternatives
  • Often it's the depth, not the sequence of tokens, that
    matters
  • Reduces O(|T|^k) to O(|T| x k) space for
    lookahead sequences
  • Collapse all tokens at depth d
  • Only slightly weaker than LL(k)

void a() {
    int alt=0;
    if ( LA(2)==X ) alt=1;
    if ( LA(2)==Y ) alt=2;
    switch (alt) {
        case 1 : ...
        case 2 : ...
        default : error;
    }
}
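To see what "collapse all tokens at depth d" costs, here is a small sketch that turns a set of k-sequences into one token set per depth and compares the exact test with the approximate one. The class and method names are hypothetical, and the example sequences {A X, B Y} are chosen only to expose the difference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class LinearApprox {
    // Collapse a set of k-sequences into one token set per depth.
    static List<Set<String>> collapse(List<List<String>> sequences, int k) {
        List<Set<String>> perDepth = new ArrayList<>();
        for (int d = 0; d < k; d++) perDepth.add(new TreeSet<>());
        for (List<String> seq : sequences) {
            for (int d = 0; d < k; d++) perDepth.get(d).add(seq.get(d));
        }
        return perDepth;
    }

    // Full LL(k) test: the lookahead must be one of the exact sequences.
    static boolean matchesExact(List<List<String>> sequences, List<String> lookahead) {
        return sequences.contains(lookahead);
    }

    // Approximate test: each token only has to be in the set for its depth.
    static boolean matchesApprox(List<Set<String>> perDepth, List<String> lookahead) {
        for (int d = 0; d < perDepth.size(); d++) {
            if (!perDepth.get(d).contains(lookahead.get(d))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> alt = List.of(List.of("A", "X"), List.of("B", "Y"));
        List<Set<String>> sets = collapse(alt, 2);
        System.out.println(sets);                                    // [[A, B], [X, Y]]
        System.out.println(matchesExact(alt, List.of("A", "Y")));   // false
        System.out.println(matchesApprox(sets, List.of("A", "Y"))); // true
    }
}
```

The approximate test accepts A Y even though it is not one of the original sequences. That is the sense in which the scheme is weaker than full LL(k), in exchange for storing k sets instead of every sequence.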
8
Problem: what can't LL(k) do?
  • Can't see past arbitrarily long constructs from
    the left edge
  • For example, can't see past A+ here
  • Could left-factor, but that's not always possible and
    it's unnatural!

a : A+ X R | A+ Y S ;
a : A+ (X R | Y S) ;   // left-factored

9
Solution overview
  • Natural extension to the LL(k) lookahead DFA: allow
    a cyclic DFA that can skip ahead past the A's to X
    or Y
  • Don't approximate the entire CFG with a regex, i.e.,
    don't include R or S
  • Just predict and proceed normally with the LL parse
  • DFA yields the predicted alt number
  • Grammar actions are not sucked into DFAs and
    aren't executed during prediction

10
LL(*) code
  • Arbitrary cyclic graphs can't be encoded w/o
    gotos in Java, but here a simple while is ok

void a() {
    int alt=0;
    if ( LA(1)==A ) consume(); else error;
    while ( LA(1)==A ) consume();
    if ( LA(1)==X ) alt=1;
    if ( LA(1)==Y ) alt=2;
    switch (alt) {
        case 1 : ...
        case 2 : ...
        default : error;
    }
}
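The slide's loop consumes tokens while it scans, which a real parser would have to undo before matching the chosen alternative. As a minimal sketch under that assumption, the version below scans by index so the parser position is never touched. The class name `CyclicPredict` and the list-of-strings token stream are hypothetical; the decision is the same one: at least one A, then X or Y.

```java
import java.util.List;

// Sketch of the cyclic lookahead DFA for a decision of the form
// "one or more A, then X (alt 1) or Y (alt 2)".
final class CyclicPredict {
    // Returns the predicted alternative (1 or 2), or 0 if none is viable.
    // Only reads tokens; nothing is consumed.
    static int predict(List<String> tokens, int start) {
        int i = start;
        // First transition: there must be at least one A
        if (i >= tokens.size() || !tokens.get(i).equals("A")) return 0;
        // The DFA cycle: stay in the same state while we keep seeing A
        while (i < tokens.size() && tokens.get(i).equals("A")) i++;
        if (i >= tokens.size()) return 0;
        // Exit edges: the first non-A token decides the alternative
        if (tokens.get(i).equals("X")) return 1;
        if (tokens.get(i).equals("Y")) return 2;
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(predict(List.of("A", "A", "A", "X", "R"), 0)); // 1
        System.out.println(predict(List.of("A", "Y", "S"), 0));           // 2
    }
}
```

Note that the prediction never looks at R or S: once X or Y is seen the alternative is known and the normal LL parse takes over.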
11
Isn't that just backtracking?
  • No. For example, if I can guarantee you will
    never look ahead more than 10 symbols, it's just
    LL(10), right?
  • Not backtracking with the parser! The DFA is smaller
    and faster, e.g., a DFA predicting expr does not
    follow the deep call chain the parser does
  • Don't have to avoid or unroll actions in the grammar!
  • The DFAs are efficiently coded and automatically
    throttle down when less lookahead is needed

12
Do we need LL(*) in practice?
  • Natural grammars are sometimes not LL(k), e.g., C
    function decl vs def
  • From the left edge, the lookahead needed to see
    the ';' vs '{' is not fixed. We need arbitrary lookahead
    because of the arg*
  • If you have actions at ID, can't easily refactor
  • Lookahead will be 5<=k<=10 usually for this decision

func : type ID '(' arg* ')' ';'
     | type ID '(' arg* ')' body
     ;
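A rough sketch of the declaration-versus-definition decision, assuming the distinguishing tokens are `;` and `{`, that each token is a plain string, and that the argument list contains no nested parentheses. `FuncDecision` and its constants are hypothetical names.

```java
import java.util.List;

final class FuncDecision {
    static final int NONE = 0, DECL = 1, DEF = 2;

    // Predict from the left edge of: type ID '(' args ')' followed by ';' or '{'
    static int predict(List<String> tokens) {
        // type ID '(' : positions 0, 1, 2
        if (tokens.size() < 3 || !tokens.get(2).equals("(")) return NONE;
        int i = 3;
        // Loop over the argument tokens; their number is not known in advance
        while (i < tokens.size() && !tokens.get(i).equals(")")) i++;
        if (i + 1 >= tokens.size()) return NONE;
        String next = tokens.get(i + 1);
        if (next.equals(";")) return DECL;
        if (next.equals("{")) return DEF;
        return NONE;
    }

    public static void main(String[] args) {
        System.out.println(predict(List.of("int", "f", "(", "int", "x", ")", ";")));      // 1
        System.out.println(predict(List.of("int", "f", "(", "int", "x", ")", "{", "}"))); // 2
    }
}
```

For `int f ( int x ) ;` the deciding token is the seventh one, and every extra argument pushes it further out, so no fixed k covers all inputs.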
13
Can we classify LL(*) strength?
  • Obviously stronger than LL(k) for fixed k
  • Weaker than syntactic predicates + LL(k), but
    it's automatic and faster
  • ANTLR v3 will have LL(*) + syntactic predicates
    :)
  • What about LL(k)'s traditional foe LR(k) and its
    nefarious minion LALR(1) (yacc)?
  • No strict ordering! (see next slide)
  • Weaker than GLR or any other system that handles
    all context-free grammars

14
LL(*) vs LR(k)
  • LR(k) even with k=1 is generally more powerful
    than LL(*), or at least more efficient for the same
    grammar, but there is no strict ordering: add
    epsilon rule refs to the left edge of our grammar and
    it's not LR(k) for any fixed k (derived from adding
    actions)

a : b A+ X R | c A+ Y S ;
b : ;
c : ;

LL(*) but not LR(k) due to reduce-reduce conflict

15
LL(*) Strength Limitations
  • Limited to regular approximation
  • Creates a regular covering approximation to the
    lookahead language of a context-free grammar
    fragment
  • Can't distinguish between context-free fragments
  • Can't see past recursive structures
  • Still deterministic: can't deal with ambiguous
    grammars; must pick one interpretation

16
Can't see past recursion
  • LL(*) DFA construction takes the LL stack into
    consideration, but the resulting DFA will not have
    a stack; it uses sequence instead
  • Example weakness (same language, different grammar)

// works
a : b X | b Y ;
b : A+ ;

// doesn't work
a : b X | b Y ;
b : A | A b ; // tail recursion

t.g:2:5: Alternative 1: after matching input such
as A A A A decision cannot predict what comes
next due to recursion overflow to b from b
t.g:2:5: Alternative 2

17
LL(*) Static Analysis Problems
  • Sometimes LL(*) creates a giant DFA looking for
    more lookahead to distinguish alternatives
  • most often due to true ambiguity
  • won't ever succeed, but it keeps trying
  • w/o a throttle would be hideous in time/space
  • Workarounds
  • can manually set fixed k lookahead
  • refactor grammar if ambiguous or to reduce
    lookahead requirements
  • Algorithm O() constant is critical: got java.g
    processing to drop from 20 minutes to 10s

18
LL(*) Analysis Benefits
  • LL(*) analysis and the resulting prediction DFAs are,
    paradoxically, sometimes simpler
  • LL(k) must compute all possible sequences of
    fixed length k using an acyclic DFA
  • LL(3) lookahead of (A|B)* is AAA, AAB, ABA, ABB, BAA,
    BAB, BBA, BBB
  • LL(*) lookahead of (A|B)* is simply a cyclic DFA
    (diagram not preserved)
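The growth in fixed-k lookahead is easy to reproduce. The sketch below enumerates every length-k sequence over a loop's token set; with tokens A and B and k=3 it yields the eight sequences listed on the slide. `SequenceCount` and `sequences` are hypothetical names, and tokens are single-character strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class SequenceCount {
    // All sequences of length k over the given tokens, in lexical order of construction.
    static List<String> sequences(List<String> tokens, int k) {
        List<String> result = new ArrayList<>();
        result.add(""); // the single sequence of length 0
        for (int depth = 0; depth < k; depth++) {
            List<String> next = new ArrayList<>();
            for (String prefix : result) {
                for (String t : tokens) next.add(prefix + t);
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // [AAA, AAB, ABA, ABB, BAA, BAB, BBA, BBB]
        System.out.println(sequences(List.of("A", "B"), 3));
        // 2^10 = 1024 sequences at k=10
        System.out.println(sequences(List.of("A", "B"), 10).size());
    }
}
```

A cyclic DFA for the same loop needs only a single state with an edge back to itself on A and on B, regardless of how far ahead the parser must look.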

19
LL(*) Algorithm Outline
  • build RTN-like NFA from grammar (similar to LR
    machine construction, actually)
  • modified classical NFA-to-DFA conversion (subset
    construction algorithm)
  • a DFA state encodes the configurations the NFA could be in
    after having seen an input sequence, including the call
    invocation stack
  • an NFA configuration (s|alt|context) tracks state,
    predicted alt, and the rule invocation stack to get
    to that state
  • terminate the algorithm when a state uniquely predicts
    an alternative or a nondeterminism is found: (s|i|ctx)
    and (s|j|ctx) for the same state s but different alts
    i,j and same/similar context
  • verify the DFA is reduced and all alternatives have
    a predict state
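The outline can be sketched in miniature. The code below is a heavily simplified illustration, not ANTLR's analysis: configurations are only (state, alt) pairs with no rule invocation stack, the NFA has no epsilon edges, and every name (`PredictionDfaSketch`, `edge`, `start`, `build`, `predict`) is hypothetical. It keeps the two termination rules from the outline: stop expanding a DFA state once it predicts a single alternative, and report a nondeterminism when the same NFA state appears for two different alternatives.

```java
import java.util.*;

final class PredictionDfaSketch {
    /** NFA configuration (s|alt); the context part is omitted in this sketch. */
    static final class Config {
        final int state, alt;
        Config(int state, int alt) { this.state = state; this.alt = alt; }
        @Override public boolean equals(Object o) {
            return o instanceof Config && ((Config) o).state == state && ((Config) o).alt == alt;
        }
        @Override public int hashCode() { return Objects.hash(state, alt); }
    }

    // NFA: state -> token -> target states
    private final Map<Integer, Map<String, Set<Integer>>> nfa = new HashMap<>();
    // DFA: config set -> token -> config set
    final Map<Set<Config>, Map<String, Set<Config>>> dfa = new LinkedHashMap<>();
    final Map<Set<Config>, Integer> predicts = new HashMap<>();
    final List<Set<Config>> conflicts = new ArrayList<>();

    void edge(int from, String token, int to) {
        nfa.computeIfAbsent(from, k -> new TreeMap<>())
           .computeIfAbsent(token, k -> new TreeSet<>()).add(to);
    }

    // Start DFA state: the i-th argument is the NFA start state of alternative i+1.
    static Set<Config> start(int... altStartStates) {
        Set<Config> d0 = new HashSet<>();
        for (int i = 0; i < altStartStates.length; i++) d0.add(new Config(altStartStates[i], i + 1));
        return d0;
    }

    void build(Set<Config> d0) {
        Deque<Set<Config>> work = new ArrayDeque<>();
        dfa.put(d0, new TreeMap<>());
        work.add(d0);
        while (!work.isEmpty()) {
            Set<Config> d = work.remove();
            Set<Integer> alts = new TreeSet<>();
            Map<Integer, Integer> altOfState = new HashMap<>();
            boolean conflict = false;
            for (Config c : d) {
                alts.add(c.alt);
                Integer prev = altOfState.putIfAbsent(c.state, c.alt);
                if (prev != null && prev != c.alt) conflict = true; // same state, different alts
            }
            if (alts.size() == 1) { predicts.put(d, alts.iterator().next()); continue; } // unique prediction
            if (conflict) { conflicts.add(d); continue; } // more lookahead cannot help
            Map<String, Set<Config>> moves = dfa.get(d);
            for (Config c : d)
                for (Map.Entry<String, Set<Integer>> e : nfa.getOrDefault(c.state, Map.of()).entrySet())
                    for (int target : e.getValue())
                        moves.computeIfAbsent(e.getKey(), k -> new HashSet<>()).add(new Config(target, c.alt));
            for (Set<Config> next : moves.values())
                if (dfa.putIfAbsent(next, new TreeMap<>()) == null) work.add(next);
        }
    }

    /** Walk the DFA over the input; returns the predicted alt, or 0 if none. */
    int predict(Set<Config> d0, List<String> input) {
        Set<Config> d = d0;
        for (String t : input) {
            if (predicts.containsKey(d)) break;
            d = dfa.get(d).get(t);
            if (d == null) return 0;
        }
        return predicts.getOrDefault(d, 0);
    }

    public static void main(String[] args) {
        PredictionDfaSketch b = new PredictionDfaSketch();
        // Alt 1: one or more A, then X R (NFA states 1..4); alt 2: one or more A, then Y S (5..8)
        b.edge(1, "A", 2); b.edge(2, "A", 2); b.edge(2, "X", 3); b.edge(3, "R", 4);
        b.edge(5, "A", 6); b.edge(6, "A", 6); b.edge(6, "Y", 7); b.edge(7, "S", 8);
        Set<Config> d0 = start(1, 5);
        b.build(d0);
        System.out.println(b.dfa.size() + " DFA states");
        System.out.println(b.predict(d0, List.of("A", "A", "A", "Y", "S")));
    }
}
```

For this input the construction produces four DFA states and never creates states for R or S, because the states reached on X and on Y already predict a single alternative. A classical subset construction would keep going to the end of each alternative.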

20
Example difference from classical conversion
a : A+ X R | A+ Y S ;
a : (A|A) B ;
(Diagrams not preserved: for each grammar, the classical DFA alongside the LL(*) DFA.)
LL(*) conversion stops at nondeterminism or unique prediction

21
Generated Code
  • acyclic DFA generated inline as above
  • cyclic DFA dumped as state objects and walked at
    parse-time with int predict(IntStream input,
    State start)

class DFA3 extends DFA {
    DFA.State s2 = new DFA.State() {{alt=1;}};
    DFA.State s1 = new DFA.State() {
        public DFA.State transition(IntStream input) {
            switch ( input.LA(1) ) {
                case X : return s2;
                ...
            }
        }
    };
    ...
}
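A self-contained sketch of the state-object approach follows. The `IntStream` and `State` types here are hypothetical stand-ins written for this example, not the ANTLR runtime classes, and the DFA is hand-built for a decision of the form "one or more A, then X (alt 1) or Y (alt 2)".

```java
final class DfaWalkDemo {
    static final int EOF = -1, A = 1, X = 2, Y = 3;

    /** Minimal token stream (hypothetical; only what the sketch needs). */
    static final class IntStream {
        private final int[] tokens;
        private int p = 0;
        IntStream(int... tokens) { this.tokens = tokens; }
        int LA(int i) { int idx = p + i - 1; return idx < tokens.length ? tokens[idx] : EOF; }
        void consume() { p++; }
        int index() { return p; }
        void seek(int index) { p = index; }
    }

    /** A DFA state: either predicts an alt (alt > 0) or transitions on LA(1). */
    static class State {
        int alt = 0;
        State transition(IntStream input) { return null; }
    }

    /** Walk from start until an accept state; rewind so prediction consumes nothing. */
    static int predict(IntStream input, State start) {
        int begin = input.index();
        try {
            State s = start;
            while (s != null) {
                if (s.alt > 0) return s.alt;
                State next = s.transition(input);
                input.consume();
                s = next;
            }
            return 0; // no viable alternative
        } finally {
            input.seek(begin);
        }
    }

    static State buildDfa() {
        State s2 = new State(); s2.alt = 1; // accept: predict alt 1
        State s3 = new State(); s3.alt = 2; // accept: predict alt 2
        State s1 = new State() {
            @Override State transition(IntStream input) {
                switch (input.LA(1)) {
                    case A: return this; // the cycle
                    case X: return s2;
                    case Y: return s3;
                    default: return null;
                }
            }
        };
        State s0 = new State() {
            @Override State transition(IntStream input) {
                return input.LA(1) == A ? s1 : null;
            }
        };
        return s0;
    }

    public static void main(String[] args) {
        System.out.println(predict(new IntStream(A, A, A, X), buildDfa())); // 1
        System.out.println(predict(new IntStream(A, Y), buildDfa()));       // 2
    }
}
```

Because the cycle is just a state returning itself, no goto is needed; the generic `predict` loop handles any graph shape.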

22
Summary and Conclusions
  • LL(*) + syntactic predicates is the most powerful
    parsing strategy accessible and attractive to
    the average programmer
  • LL(*) has all the benefits of LL but is much
    stronger; results in natural grammars
  • Doesn't alter the recursive-descent parser itself at
    all; just enhances the predictive capabilities
  • Basic algorithm is not that complicated, but
    making it real and useful is interesting; it
    has taken 2.5 years to fully understand
  • Pre-release: http://www.antlr.org/download/