COMBINING COMPATIBLE STATES DURING LR(1) PARSER CONSTRUCTION - PowerPoint PPT Presentation

Provided by: DavidP363
Transcript and Presenter's Notes

1
COMBINING COMPATIBLE STATES DURING LR(1) PARSER
CONSTRUCTION
2
  • The LR(0) algorithm for constructing parsers is
    one in which contexts are not evaluated, and
    states are considered identical if they consist
    of the same set of marked productions.

3
  • But this algorithm is insufficient for actual
    programming languages, producing parsers with
    numerous conflicts

4
  • The LR(1) algorithm, when applied to creating
    compilers for real computer languages such as
    Java or C, results in a parsing machine that is
    an order of magnitude or more larger than the
    one produced by the LR(0) algorithm for the same
    grammar.

5
  • On the other hand, the LR(1) algorithm, which
    you made use of in your last assignment, produces
    parsers which, for the large grammars employed
    for actual computer languages, are a few orders
    of magnitude larger than those produced by the
    LR(0) algorithm.

6
  • As a compromise, various methods, including
    the one employed by Yacc, have been devised for
    subsets of the LR(1) languages, using a hybrid
    approach.

7
  • This works well for most programming languages,
    but imposes a greater responsibility on the
    compiler writer, to come up with a grammar that
    does not lead to conflicts (i.e. to cases where
    more than one action is defined at a parsing
    machine state for the same next input symbol).
  • These methods only work for a subset of the
    LR(1) grammars, and there are applications,
    including ones involving natural language
    processing, for which they are inadequate.

8
  • However, one can employ a definition of
    compatibility between states that works for all
    LR(1) grammars, and that produces parsers of the
    same size as those produced by the compromise
    methods above.

9
  • DEFINITION. The nucleus of a state consists of
    the configurations in the state in which the
    marker is in a position greater than zero.
  • Example
  • A configuration in a state of the form
  • A → bc.d   {x, y}
  • would be a member of its nucleus, but a
    configuration such as
  • A → .bcd   {x, y}
  • would not be a member.
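A minimal sketch of the nucleus definition, assuming a configuration is stored as a marked production plus a context set. The names `Config` and `nucleus`, and the `dot` field holding the marker position, are illustrative and do not come from the slides.

```python
from typing import NamedTuple


class Config(NamedTuple):
    lhs: str              # left-hand side of the production
    rhs: tuple            # symbols on the right-hand side
    dot: int              # marker position (symbols to the left of the marker)
    contexts: frozenset   # context (lookahead) symbols


def nucleus(state):
    """Return the configurations whose marker position is greater than zero."""
    return [c for c in state if c.dot > 0]


state = [
    Config("A", ("b", "c", "d"), 2, frozenset({"x", "y"})),  # A → bc.d  {x, y}
    Config("A", ("b", "c", "d"), 0, frozenset({"x", "y"})),  # A → .bcd  {x, y}
]
kernel = nucleus(state)  # keeps only A → bc.d
```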

10
  • DEFINITION OF COMPATIBILITY BETWEEN LR(1)
    STATES
  • Let S and S′ be two states in an LR(1) parsing
    machine whose nuclei consist of the same marked
    productions, which we will denote as P1, …, Pn.
  • For 1 ≤ t ≤ n, let Ut denote the set of
    contexts associated with marked production Pt in
    state S, and let Ut′ denote the set of contexts
    associated with that marked production in state
    S′.
  • Then states S and S′ are compatible if, for
    all 1 ≤ i < j ≤ n, at least one of the following
    conditions holds
  • (a) Ui ∩ Uj′ = ∅ and Ui′ ∩ Uj = ∅
  • (∅ is the empty set, i.e. the
    intersections involved are both empty)
  • (b) Ui ∩ Uj ≠ ∅
  • (c) Ui′ ∩ Uj′ ≠ ∅
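The definition can be sketched as a direct pairwise check. In this sketch the function name `compatible` and the list-of-sets representation are assumptions, with `U[t]` and `U_prime[t]` standing for the context sets of the same marked production in S and S′. The sample context sets are made up for illustration.

```python
from itertools import combinations


def compatible(U, U_prime):
    """Return True if every pair i < j satisfies condition (a), (b) or (c)."""
    if len(U) != len(U_prime):
        raise ValueError("nuclei must consist of the same marked productions")
    for i, j in combinations(range(len(U)), 2):
        cond_a = not (U[i] & U_prime[j]) and not (U_prime[i] & U[j])
        cond_b = bool(U[i] & U[j])
        cond_c = bool(U_prime[i] & U_prime[j])
        if not (cond_a or cond_b or cond_c):
            return False
    return True


# Hypothetical context sets, not taken from the slides.
ok = compatible([{"a"}, {"b"}], [{"c"}, {"d"}])    # (a) holds for the only pair
bad = compatible([{"a"}, {"b"}], [{"b"}, {"c"}])   # b is shared across the states
```

With a single configuration in the nucleus there are no pairs to test, so the loop body never runs and the function returns True.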

11
  • Note
  • If states S and S′ are as described above, and
    their nuclei consist of only a single
    configuration, then according to the above
    definition they are compatible, since there is
    no pair i < j to test.

12
  • In the case where S and S′ as described above are
    compatible, one can combine the states into a
    single state whose nucleus consists of the same
    marked productions listed above, while for
    1 ≤ t ≤ n, the set of contexts associated with
    marked production Pt is Ut ∪ Ut′.
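Combining the nuclei of two compatible states amounts to a union of the context sets for each marked production. The following is only a sketch: `combine` is a hypothetical helper, it assumes the caller has already checked compatibility, and the sample context sets are made up.

```python
def combine(U, U_prime):
    """Union the context sets of S and S′, one marked production at a time."""
    if len(U) != len(U_prime):
        raise ValueError("nuclei must consist of the same marked productions")
    return [u | u_p for u, u_p in zip(U, U_prime)]


# Hypothetical contexts for two marked productions P1 and P2.
merged = combine([{"a", "b"}, {"c"}], [{"a"}, {"d"}])
```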

13
  • One way of looking at the definition is to say
    that every pair of configurations in the nuclei
    must pass a test, and that two states are
    compatible only if they all in fact pass.

14
  • Fortunately, in grammars for actual
    programming languages such as Java, C, etc.,
    there are at most 6 configurations in the nucleus
    of any state.
  • The states may be large, with many immediate
    successors, but the nuclei are all quite small.
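As a rough count under the figure of at most 6 nucleus configurations quoted above, the test examines pairs (i, j) with i < j, so the work per candidate pair of states is bounded as follows (n denotes the nucleus size, a symbol introduced here):

```latex
\binom{n}{2} = \frac{n(n-1)}{2} \le \frac{6 \cdot 5}{2} = 15 \qquad (n \le 6)
```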

15
  • EXAMPLES
  • We show only the nucleus of the states in
    these examples, since, according to the
    definition, states are compatible if and only if
    their nuclei are.

16
  • State S                   State S′
    A → ab.c   {x, y}         A → ab.c   {d}
    B → b.n    {s, t}         B → b.n    {s}
    C → rb.ed  {u, v}         C → rb.ed  {x, v}
  • The above two states are not compatible,
    because the pair consisting of the first and
    last configurations fails the test.
  • For this pair condition (a) of the defn. is
    not true, since the context of the first
    configuration of S contains an x, and so does the
    context of the third configuration of S′.
  • In addition, neither condition (b) nor (c)
    is true.
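The example can be checked mechanically. This sketch lists the pairs that fail the test, using the context sets read off the two nuclei above. `failing_pairs` is an illustrative name, and configurations are numbered from 1 as on the slide.

```python
from itertools import combinations


def failing_pairs(U, U_prime):
    """Return the pairs (i, j), numbered from 1, that satisfy none of (a), (b), (c)."""
    bad = []
    for i, j in combinations(range(len(U)), 2):
        cond_a = not (U[i] & U_prime[j]) and not (U_prime[i] & U[j])
        cond_b = bool(U[i] & U[j])
        cond_c = bool(U_prime[i] & U_prime[j])
        if not (cond_a or cond_b or cond_c):
            bad.append((i + 1, j + 1))
    return bad


S = [{"x", "y"}, {"s", "t"}, {"u", "v"}]      # contexts of A, B, C in S
S_prime = [{"d"}, {"s"}, {"x", "v"}]          # contexts of A, B, C in S′
print(failing_pairs(S, S_prime))              # [(1, 3)]
```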

17
  • State S                   State S′
    A → ab.c   {x, y}         A → ab.c   {x, y, d}
    B → b.n    {s, t}         B → b.n    {s}
    C → rb.ed  {x, v}         C → rb.ed  {x, v}
  • The first and third configurations in this
    case pass the test because condition (b) of the
    defn. applies to the first and third
    configurations of S. Both of these
    configurations contain x in their set of
    contexts. The states in this case are
    compatible.
  • Remember that while every pair of
    configurations in the nucleus must pass the test,
    it only requires that one of conditions (a), (b)
    or (c) be true for a given pair for it to pass.
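To see which condition lets each pair pass in this example, one can record the satisfied conditions for every pair. This is a sketch only. `passing_conditions` is an illustrative name and the context sets are those read off the two nuclei above.

```python
from itertools import combinations


def passing_conditions(U, U_prime):
    """Map each pair (i, j), numbered from 1, to the conditions it satisfies."""
    result = {}
    for i, j in combinations(range(len(U)), 2):
        held = []
        if not (U[i] & U_prime[j]) and not (U_prime[i] & U[j]):
            held.append("a")
        if U[i] & U[j]:
            held.append("b")
        if U_prime[i] & U_prime[j]:
            held.append("c")
        result[(i + 1, j + 1)] = held
    return result


S = [{"x", "y"}, {"s", "t"}, {"x", "v"}]           # contexts of A, B, C in S
S_prime = [{"x", "y", "d"}, {"s"}, {"x", "v"}]     # contexts of A, B, C in S′
report = passing_conditions(S, S_prime)
print(report)  # {(1, 2): ['a'], (1, 3): ['b', 'c'], (2, 3): ['a']}
```

Every pair has at least one satisfied condition, so the two states are compatible.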

18
  • Since the states are compatible, they can be
    combined to form one whose nucleus is
    A → ab.c   {x, y, d}
    B → b.n    {s, t}
    C → rb.ed  {x, v}
19
  • Note.
  • In the figure on the next slide, where we omit
    the context set of various configurations (i.e.
    only show the marked production involved), the
    implication is that those contexts are irrelevant
    to the assertions being made about the figure.

20
States 2 and 8 are not compatible, since the first
configuration of state 2 has d as a context in
common with the second configuration of state 8.
In fact, if we were to combine states 2 and 8, the
result would have a combination of states 3 and 9
as its u-successor. This state would have a
conflict, in that it would have reduce actions,
when the next input symbol is d, for both Z → tu
and V → ε.
21
  • Now consider the altered machine obtained if the
    production X → aYd were replaced by (say) X →
    aYa. In this case the first configuration of
    state 2 would be Y → t.W {a}. It would then
    follow that states 2 and 8 are compatible and
    could safely be combined to form
  • Y → t.W   {a, e}
  • Z → t.u   {c, d}
  • W → .uV

22
  • The journal paper describing this method of
    combining states contains a formal proof of its
    correctness. But since ours is a practically
    oriented course, we will just consider an
    informal justification, based on a few examples,
    to supply a flavor of the reasoning involved.

23
  • The main argument is that if the parsing machine
    containing the states S and S′, as described in
    the defn. of compatibility, has no conflicts, and
    S and S′ are compatible, then the parsing machine
    obtained by combining them will also have no
    conflicts.

24
  • The argument is by contradiction. Let's consider
    examples of the various ways that two
    configurations in the combination of S and S′
    could have conflicts or lead to conflicts between
    other pairs of configurations in states reachable
    from S. In each case we hope to show that either
    the parsing machine as it was before S and S′
    were combined contained conflicts in the first
    place, or that S and S′ could not in fact have
    been compatible.

25
  • Case 1. Let configs 1 and 2 of the combined
    state formed from states S and S′ be
  • A → r B.uv   {a, b}
  • C → t B.uv   {a, c}
  • Seeing that the machine as it was before the
    combination contained no conflicts, and
    specifically did not contain a conflict in the
    uv-successor of these states, either state S
    must have contained the a in its version of
    config 1, while state S′ contained the a in its
    version of config 2, or vice versa.

26
  • Case 1 contd.
  • A → r B.uv   {a, b}
  • C → t B.uv   {a, c}
  • In either case neither condition (a) nor (b) of
    the defn. would then be true for the two
    configs, and since condition (c) is also not
    true, states S and S′ could not have been
    compatible in the first place.
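The Case 1 argument can be confirmed by brute force. The sketch below enumerates every way the combined contexts {a, b} and {a, c} could have been split between S and S′, keeps only splits in which neither original state has overlapping contexts for the two configs (the conflict-free assumption), and checks that none passes the compatibility test. The helper names are illustrative.

```python
from itertools import product


def pair_compatible(U1, U2, V1, V2):
    """Test one pair of configs; U1, U2 belong to S and V1, V2 to S′."""
    cond_a = not (U1 & V2) and not (V1 & U2)
    cond_b = bool(U1 & U2)
    cond_c = bool(V1 & V2)
    return cond_a or cond_b or cond_c


def splits(combined):
    """Yield every (U, V) with U | V == combined."""
    symbols = sorted(combined)
    for choice in product(("S", "S'", "both"), repeat=len(symbols)):
        U = {s for s, c in zip(symbols, choice) if c != "S'"}
        V = {s for s, c in zip(symbols, choice) if c != "S"}
        yield U, V


C1, C2 = {"a", "b"}, {"a", "c"}   # combined contexts of configs 1 and 2
survivors = [
    (U1, U2, V1, V2)
    for U1, V1 in splits(C1)
    for U2, V2 in splits(C2)
    if not (U1 & U2) and not (V1 & V2)       # original machine conflict-free
    and pair_compatible(U1, U2, V1, V2)      # and S, S′ compatible
]
print(survivors)  # []
```

No split survives both requirements, which matches the conclusion on the slide.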

27
  • Case 2. Let configs 1 and 2 of the combined
    state be
  • A → r B.uv   {a, b}
  • D → t B.Ca
  • C → .uv   {a}
  • Either S or S′ must contain A → r B.uv with a
    in its context set, in which case the original
    parsing machine would have had a conflict at its
    uv-successor. This contradicts our assumption
    that the original parsing machine was
    conflict-free.

28
  • Case 3. Let configs 1 and 2 be
  • A → s B.Ea
  • E → .uv   {a}
  • D → t B.Ca
  • C → .uv   {a}
  • Here again the original parsing machine would
    have had conflicts in the uv-successors of both S
    and S′.
29
  • Case 4. Let configs 1 and 2 be
  • A → r B.uv
  • D → t B.uvr
  • Here too the original parsing machine would have
    had conflicts in the uv-successors of both S and
    S′. In this case the conflict would have been
    between a reduction and a transition.
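The conflicts referred to in Cases 2 to 4 can be illustrated with a small detector that inspects a single state. This is a sketch under assumptions: configurations are tuples (lhs, rhs, marker position, contexts), u and v are treated as two separate symbols, and the context {r} given to A → r B uv. in the Case 4 state is assumed here because the slide omits it.

```python
def conflicts(state):
    """Return the (kind, symbol) conflicts among the configurations of one state."""
    found = []
    # Completed configurations call for a reduction on each context symbol.
    reduces = [(lhs, rhs, ctx) for lhs, rhs, dot, ctx in state if dot == len(rhs)]
    # Symbols right after the marker call for a transition.
    shifts = {rhs[dot] for lhs, rhs, dot, ctx in state if dot < len(rhs)}
    for idx, (lhs1, rhs1, ctx1) in enumerate(reduces):
        for lhs2, rhs2, ctx2 in reduces[idx + 1:]:
            for sym in sorted(ctx1 & ctx2):
                found.append(("reduce/reduce", sym))
        for sym in sorted(ctx1 & shifts):
            found.append(("shift/reduce", sym))
    return found


# Case 2: uv-successor holding A → r B uv. {a, b} and C → uv. {a}
case2 = [("A", ("r", "B", "u", "v"), 4, {"a", "b"}),
         ("C", ("u", "v"), 2, {"a"})]
# Case 4: uv-successor holding A → r B uv. {r} (context assumed) and D → t B uv.r
case4 = [("A", ("r", "B", "u", "v"), 4, {"r"}),
         ("D", ("t", "B", "u", "v", "r"), 4, set())]

print(conflicts(case2))  # [('reduce/reduce', 'a')]
print(conflicts(case4))  # [('shift/reduce', 'r')]
```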

30
  • EXERCISE
  • Construct an LR(1) parsing machine for the
    grammar on the next slide, combining
    compatible states as you encounter them.

31
  • program → main statement_list end main
  • statement_list → statement_list statement
                   | statement
  • statement → assign_statement
              | while_statement
              | do_statement
  • assign_statement → identifier identifier
  • while_statement → while ( condition )
                      statement_list wend
  • condition → identifier identifier
  • do_statement → do identifier number to number
                   statement_list end do