Bottom-Up Parsing

home1 home2
 Bib
 Algorithms
 Bioinfo
 FP
 Logic
 MML
 Prog.Lang
and the
 Book

Prog-Langs
 glossary
 Grammar
  Arith-Exp
  Abstract-syn
  Top-Down
  Bottom-Up

Reductions

sentence: w * x + y * z <end>
reductions
 w*x+y*z<end>
1Factor
2Term
3Factor
4Term
5Exp
6Factor
7Term
8Factor
9Term
10Exp

The reductions are equivalent to performing the following right-most (rm) derivations
 
<Exp> rm=> <Exp> + <Term> rm=> <Exp> + <Term>*<Factor> rm=> <Exp> + <Term>*z rm=> <Exp> + <Factor>*z rm=> <Exp> + y*z rm=> <Term> + y*z rm=> <Term>*<Factor> + y*z rm=> <Term>*x + y*z rm=> <Factor>*x + y*z rm=> w*x+y*z
 
but performing them in reverse.

Parse tree

  +
 
*
  *
w   x   y   z

Table-driven, bottom-up parsing

Grammar (E ~ <Exp>, T ~ <Term>, F ~ <Factor>, id ~ x | y | ... , for brevity)
E -> E + T
E -> E - T
E -> T
T -> T * F
T -> T / F
T -> F
F -> id
F -> ( E )
F -> - F
Parsing, w*x+y*z,   i.e., id*id+id*id,
general bottom-up shift-reduce scheme
>stack> <input< action
  w * x + y * z $ shift
w * x + y * z $ reduce
F * x + y * z $ reduce
T * x + y * z $ shift
T * x + y * z $ shift
T * x + y * z $ reduce
T * F + y * z $ reduce
T + y * z $ reduce
E + y * z $ shift
E + y * z $ shift
E + y * z $ reduce
E + F * z $ reduce
E + T * z $ shift
E + T * z $ shift
E + T * z $ reduce
E + T * F $ reduce
E + T $ reduce
E $ accept
handles for reduction are underlined
(Handle: a substring of a right sentential form that (i) is the right hand side of a production rule and (ii) is created by a step in a right-most derivation.)

SLR Parsing

To produce an SLR (simple LR) parsing table, first augment the grammar with new start symbol E' and production...
E' -> E
 
then use this algorithm
C := { closure({E' -> . E}) };
for each set, I, in C do
  for each symbol, X, s.t. goto(I, X) is not empty and is not in C do
    C := C + goto(I, X)
  end_for
end_for
 
where
function goto(I, X)
begin --X is a terminal or a non-terminal symbol
  I' := { };
  for each member of I of the form A -> α . X β do
    I' := I' + { A -> α X . β }   --jump the X
  end_for;
  return closure(I')
end --goto
 
and
function closure(I)
begin
  I' := I;
  for each member of I' of the form C -> γ . D φ do
    for each production D -> ψ do
      I' := I' + {D -> . ψ}
    end_for
  end_for
end --closure
 
This produces the canonical collection of LR(0) items, e.g.,
 
I0 = { E' -> . E,
closure: E -> . E + T, E -> . E - T, E -> . T,
T -> . T * F, T -> . T / F, T -> . F,
F -> . id, F -> . ( E ), F -> . - F }
 
goto(I0, E) =
I1 = { E' -> E . , E -> E . + T, E -> E . - T }
goto(I0, T) =
I2 = { E -> T . , T -> T . * F, T -> T . / F }
goto(I0, F) =
I3 = { T -> F . }
goto(I0, id) =
I4 = { F -> id . }
goto(I0, ( ) =
I5 = { F -> ( . E ),
closure: E -> . E + T, E -> . E - T, E -> . T,
T -> . T * F, T -> . T / F, T -> . F,
F -> . id, F -> . ( E ), F -> . - F }
goto(I0, - ) =
I6 = { F -> - . F,
closure: F -> . id, F -> . ( E ) F -> . - F }
 
goto(I1, +) =
I7 = { E -> E + . T,
closure: T -> . T * F, T -> . T / F, T -> . F,
F -> . id, F -> . ( E ), F -> . - F }
goto(I1, - ) =
I8 = { E -> E - . T,
closure: T -> . T * F, T -> . T / F, T -> . F,
F -> . id, F -> . ( E ), F -> . - F }
 
goto(I2, *) =
I9 = { T -> T * . F,
closure: F -> . id, F -> . ( E ), F -> . - F }
goto (I2, /) =
I10 = { T -> T / . F,
closure: F -> .id, F -> . ( E ), F -> . - F }
 
goto(I5, E) =
I11 = { F -> ( E . ), E -> E . + T, E -> E . - T }
goto(I5, T) = I2
goto(I5, F) = I3
goto(I5, id) = I4
goto(I5, ( ) = I5 (!)
 
goto(I6, F) =
I12 = { F -> - F . }
goto(I6, id) = I4
goto(I6, - ) = I6
goto(I6, ( ) = I5
 
goto(I7, T) =
I13 = { E -> E + T . , T -> T . * F, T -> T . / F }
goto(I7, F) = I3
goto(I7, id) = I4
goto(I7, ( ) = I5
goto(I7, -) = I6
 
goto(I8, T) =
I14 = { E -> E - T . , T -> T . * F, T -> T . / F }
goto(I8, F) = I3
goto(I8, id) = I4
goto(I8, ( ) = I5
goto(I8, -) = I6
 
goto(I9, F) =
I15 = { T -> T * F . }
goto(I9, id) = I4
goto(I9, ( ) = I5
goto(I9, -) = I6
 
goto(I10, F) =
I16 = { T -> T / F . }
goto(I10, id) = I4
goto(I10, ( ) = I5
goto(I10, -) = I6
 
goto(I11, +) = I7
goto(I11, -) = I8
goto(I11, ) ) =
I17 = { F -> ( E ) . }
 
goto(I13, *) = I9
goto(I13, /) = I10
 
goto(I14, *) = I9
goto(I14, /) = I10
 
To fill in the entries of the SLR parsing table where statei corresponds to Ii
 
initialise all action[,] and goto[,] entries to blank (indicating parse error);
 
for each state, i, do
 
  for each terminal symbol, a, do
    if A -> α . a β is in Ii, and goto(Ii,a)=Ij then
      action[i, a] := shift j   --[*]
    end_if
  end_for;
 
  if S' -> S . is in Ii then
    action[i, $] := accept   --[*]
  else
  end_if;
 
  for each non-terminal, A, other than S' do
    if A -> α . is in Ii then
      for each a in FOLLOW(A) do
        action[i, a] := reduce by A -> α   --[*]
      end_for
    end_if
  end_for;
  end_if;
 
  for each non-terminal, A, do
    if goto(Ii, A) = Ij then
      goto[i, A] := j
    end_if
  end_for
 
end_for --state i
 
[*] If any conflicting actions are produced here then fail, the grammar is not SLR(1).
Blank entries indicate parsing errors.
SLR(1) table
state Action Goto
id + - * / ( ) $ E T F
0s4 s6 s5 123
1 s7s8 accept
2 r(E->T) r(E->T) s9s10 r(E->T) r(E->T)
3 r(T->F) r(T->F) r(T->F) r(T->F) r(T->F) r(T->F)
4 r(F->id) r(F->id) r(F->id) r(F->id) r(F->id) r(F->id)
5s4 s5 11 2 3
6s4 s6 s5 12
7s4 s6 s5 13 3
8s4 s6 s5 14 3
9s4 s6 s5 15
10s4 s6 s5 16
11 s7 s8 s17
12 r(F->-F) r(F->-F) r(F->-F) r(F->-F) r(F->-F) r(F->-F)
13 r(E->E+T) r(E->E+T) s9 s10 r(E->E+T) r(E->E+T)
14 r(E->E-T) r(E->E-T) s9 s10 r(E->E-T) r(E->E-T)
15 r(T->T*F) r(T->T*F) r(T->T*F) r(T->T*F) r(T->T*F) r(T->T*F)
16 r(T->T/F) r(T->T/F) r(T->T/F) r(T->T/F) r(T->T/F) r(T->T/F)
17 r(F->(E)) r(F->(E)) r(F->(E)) r(F->(E)) r(F->(E)) r(F->(E))

To parse using an SLR(1) parsing table
 
initialise stack;
push start_state;
do
  s := top of stack;  --a state
  a := next input symbol;
 
  case action[s, a] of
    shift s' =>
      push a; push s';
      advance to next input symbol;
 
    | reduce by A -> β =>
      pop 2 × |β| items from stack;
      output A -> β;
      s' := top of stack;
      push A; push(goto[s', A]);
 
    | accept => ... success ... ;
 
    | _ => ... error ...
  end_case
end_do
Parsing w*x+y*z, i.e., id*id+id*id, using the SLR(1) table
>stack> <input< action
0 w * x + y * z $ s4
0 w 4 * x + y * z $ r
0 F 3 (=goto(0,F)) * x + y * z $ r
0 T 2 * x + y * z $ s9
0 T 2 * 9 x + y * z $ s4
0 T 2 * 9 x 4 + y * z $ r
0 T 2 * 9 F 15 + y * z $ r
0 T 2 + y * z $ r
0 E 1 + y * z $ s7
0 E 1 + 7 y * z $ s4
0 E 1 + 7 y 4 * z $ r
0 E 1 + 7 F 3 * z $ r
0 E 1 + 7 T 13 * z $ s9
0 E 1 + 7 T 13 * 9 z $ s4
0 E 1 + 7 T 13 * 9 z 4 $ r
0 E 1 + 7 T 13 * 9 F 15 $ r
0 E 1 + 7 T 13 $ r
0 E 1 $ accept

References

The dragon book,
also search for [parsing] in the [Bib].
-- 2002 LA, CSSE, Monash
Coding Ockham's Razor, L. Allison, Springer

A Practical Introduction to Denotational Semantics, L. Allison, CUP

Linux
 Ubuntu
free op. sys.
OpenOffice
free office suite
The GIMP
~ free photoshop
Firefox
web browser

key
::= "can be replaced by"
->
| "or by"
=> derivation
lm left most
rm right most

© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)
Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Thursday, 18-Apr-2024 14:58:07 AEST.