Bottom-Up Parsing of Arithmetic Expressions

Bottom-Up Parsing

home₁ home₂
Bib
Algorithms
Bioinfo
FP
Logic
MML
Prog.Lang
and the
Book

Prog-Langs
glossary
Grammar
  Arith-Exp
  Abstract-syn
  Top-Down
  Bottom-Up

Reductions

sentence:	w	*	x	+	y	*	z	<end>
reductions	w	*	x	+	y	*	z	<end>
	w	*	x	+	y	*	z	<end>
1	Factor
2	Term
3	Term		Factor
4	Term
5	Exp
6					Factor
7					Term
8					Term		Factor
9					Term
10	Exp

The reductions are equivalent to performing the following right-most (rm) derivations

<Exp> _rm=> <Exp> + <Term> _rm=> <Exp> + <Term>*<Factor> _rm=> <Exp> + <Term>*z _rm=> <Exp> + <Factor>*z _rm=> <Exp> + y*z _rm=> <Term> + y*z _rm=> <Term>*<Factor> + y*z _rm=> <Term>*x + y*z _rm=> <Factor>*x + y*z _rm=> w*x+y*z

but performing them in reverse.

Parse tree

	+
*			*
w		x	y	z

Table-driven, bottom-up parsing

Grammar (E ~ <Exp>, T ~ <Term>, F ~ <Factor>, id ~ x | y | ... , for brevity): E -> E + T; E -> E - T; E -> T; T -> T * F; T -> T / F; T -> F; F -> id; F -> ( E ); F -> - F

Parsing, wx+yz, i.e., idid+idid, general bottom-up shift-reduce scheme
>stack>	<input<	action
	w * x + y * z $	shift
w	* x + y * z $	reduce
F	* x + y * z $	reduce
T	* x + y * z $	shift
T *	x + y * z $	shift
T * x	+ y * z $	reduce
T * F	+ y * z $	reduce
T	+ y * z $	reduce
E	+ y * z $	shift
E +	y * z $	shift
E + y	* z $	reduce
E + F	* z $	reduce
E + T	* z $	shift
E + T *	z $	shift
E + T * z	$	reduce
E + T * F	$	reduce
E + T	$	reduce
E	$	accept
handles for reduction are underlined

(Handle: a substring of a right sentential form that (i) is the right hand side of a production rule and (ii) is created by a step in a right-most derivation.)

SLR Parsing

To produce an SLR (simple LR) parsing table, first augment the grammar with new start symbol E' and production...: E' -> E
then use this algorithm: C := { closure({E' -> . E}) };; for each set, I, in C do; for each symbol, X, s.t. goto(I, X) is not empty and is not in C do; C := C + goto(I, X); end_for; end_for
where: function goto(I, X); begin --X is a terminal or a non-terminal symbol; I' := { };; for each member of I of the form A -> α . X β do; I' := I' + { A -> α X . β } --jump the X; end_for;; return closure(I'); end --goto
and: function closure(I); begin; I' := I;; for each member of I' of the form C -> γ . D φ do; for each production D -> ψ do; I' := I' + {D -> . ψ}; end_for; end_for; end --closure
This produces the canonical collection of LR(0) items, e.g.,
I0 = { E' -> . E,: closure: E -> . E + T, E -> . E - T, E -> . T,; T -> . T * F, T -> . T / F, T -> . F,; F -> . id, F -> . ( E ), F -> . - F }; goto(I0, E) =
I1 = { E' -> E . , E -> E . + T, E -> E . - T }: goto(I0, T) =
I2 = { E -> T . , T -> T . * F, T -> T . / F }: goto(I0, F) =
I3 = { T -> F . }: goto(I0, id) =
I4 = { F -> id . }: goto(I0, ( ) =
I5 = { F -> ( . E ),: closure: E -> . E + T, E -> . E - T, E -> . T,; T -> . T * F, T -> . T / F, T -> . F,; F -> . id, F -> . ( E ), F -> . - F }; goto(I0, - ) =
I6 = { F -> - . F,: closure: F -> . id, F -> . ( E ) F -> . - F }; goto(I1, +) =
I7 = { E -> E + . T,: closure: T -> . T * F, T -> . T / F, T -> . F,; F -> . id, F -> . ( E ), F -> . - F }; goto(I1, - ) =
I8 = { E -> E - . T,: closure: T -> . T * F, T -> . T / F, T -> . F,; F -> . id, F -> . ( E ), F -> . - F }; goto(I2, *) =
I9 = { T -> T * . F,: closure: F -> . id, F -> . ( E ), F -> . - F }; goto (I2, /) =
I10 = { T -> T / . F,: closure: F -> .id, F -> . ( E ), F -> . - F }; goto(I5, E) =
I11 = { F -> ( E . ), E -> E . + T, E -> E . - T }: goto(I5, T) = I2; goto(I5, F) = I3; goto(I5, id) = I4; goto(I5, ( ) = I5 (!); goto(I6, F) =
I12 = { F -> - F . }: goto(I6, id) = I4; goto(I6, - ) = I6; goto(I6, ( ) = I5; goto(I7, T) =
I13 = { E -> E + T . , T -> T . * F, T -> T . / F }: goto(I7, F) = I3; goto(I7, id) = I4; goto(I7, ( ) = I5; goto(I7, -) = I6; goto(I8, T) =
I14 = { E -> E - T . , T -> T . * F, T -> T . / F }: goto(I8, F) = I3; goto(I8, id) = I4; goto(I8, ( ) = I5; goto(I8, -) = I6; goto(I9, F) =
I15 = { T -> T * F . }: goto(I9, id) = I4; goto(I9, ( ) = I5; goto(I9, -) = I6; goto(I10, F) =
I16 = { T -> T / F . }: goto(I10, id) = I4; goto(I10, ( ) = I5; goto(I10, -) = I6; goto(I11, +) = I7; goto(I11, -) = I8; goto(I11, ) ) =
I17 = { F -> ( E ) . }: goto(I13, *) = I9; goto(I13, /) = I10; goto(I14, *) = I9; goto(I14, /) = I10
To fill in the entries of the SLR parsing table where state_i corresponds to I_i: initialise all action[,] and goto[,] entries to blank (indicating parse error);; for each state, i, do; for each terminal symbol, a, do; if A -> α . a β is in I_i, and goto(I_i,a)=I_j then; action[i, a] := shift j --[*]; end_if; end_for;; if S' -> S . is in I_i then; action[i, $] := accept --[*]; ~~else~~; end_if;; for each non-terminal, A, other than S' do; if A -> α . is in I_i then; for each a in FOLLOW(A) do; action[i, a] := reduce by A -> α --[*]; end_for; end_if; end_for;; ~~end_if;~~; for each non-terminal, A, do; if goto(I_i, A) = I_j then; goto[i, A] := j; end_if; end_for; end_for --state i
[*] If any conflicting actions are produced here then fail, the grammar is not SLR(1).
Blank entries indicate parsing errors.

SLR(1) table
state	Action								Goto
state	id	+	-	*	/	(	)	$	E	T	F
0	s4		s6			s5			1	2	3
1		s7	s8					accept
2		r(E->T)	r(E->T)	s9	s10		r(E->T)	r(E->T)
3		r(T->F)	r(T->F)	r(T->F)	r(T->F)		r(T->F)	r(T->F)
4		r(F->id)	r(F->id)	r(F->id)	r(F->id)		r(F->id)	r(F->id)
5	s4					s5			11	2	3
6	s4		s6			s5					12
7	s4		s6			s5				13	3
8	s4		s6			s5				14	3
9	s4		s6			s5					15
10	s4		s6			s5					16
11		s7	s8				s17
12		r(F->-F)	r(F->-F)	r(F->-F)	r(F->-F)		r(F->-F)	r(F->-F)
13		r(E->E+T)	r(E->E+T)	s9	s10		r(E->E+T)	r(E->E+T)
14		r(E->E-T)	r(E->E-T)	s9	s10		r(E->E-T)	r(E->E-T)
15		r(T->T*F)	r(T->T*F)	r(T->T*F)	r(T->T*F)		r(T->T*F)	r(T->T*F)
16		r(T->T/F)	r(T->T/F)	r(T->T/F)	r(T->T/F)		r(T->T/F)	r(T->T/F)
17		r(F->(E))	r(F->(E))	r(F->(E))	r(F->(E))		r(F->(E))	r(F->(E))

To parse using an SLR(1) parsing table: initialise stack;; push start_state;; do; s := top of stack; --a state; a := next input symbol;; case action[s, a] of; shift s' =>; push a; push s';; advance to next input symbol;; | reduce by A -> β =>; pop 2 × |β| items from stack;; output A -> β;; s' := top of stack;; push A; push(goto[s', A]);; | accept => ... success ... ;; | _ => ... error ...; end_case; end_do

Parsing wx+yz, i.e., idid+idid, using the SLR(1) table
>stack>	<input<	action
0	w * x + y * z $	s4
0 w 4	* x + y * z $	r
0 F 3 (=goto(0,F))	* x + y * z $	r
0 T 2	* x + y * z $	s9
0 T 2 * 9	x + y * z $	s4
0 T 2 * 9 x 4	+ y * z $	r
0 T 2 * 9 F 15	+ y * z $	r
0 T 2	+ y * z $	r
0 E 1	+ y * z $	s7
0 E 1 + 7	y * z $	s4
0 E 1 + 7 y 4	* z $	r
0 E 1 + 7 F 3	* z $	r
0 E 1 + 7 T 13	* z $	s9
0 E 1 + 7 T 13 * 9	z $	s4
0 E 1 + 7 T 13 * 9 z 4	$	r
0 E 1 + 7 T 13 * 9 F 15	$	r
0 E 1 + 7 T 13	$	r
0 E 1	$	accept

References

The dragon book,
also search for [parsing] in the [Bib].

-- 2002 LA, CSSE, Monash

Coding Ockham's Razor, L. Allison, Springer

A Practical Introduction to Denotational Semantics, L. Allison, CUP

Linux
Ubuntu
free op. sys.
OpenOffice
free office suite
The GIMP
~ free photoshop
Firefox
web browser

key
::=	"can be replaced by"
->	"can be replaced by"
\|	"or by"
=>	derivation
lm	left most
rm	right most

© L. Allison http://www.allisons.org/ll/ (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University, was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)

Created with "vi (Linux + Solaris)", charset=iso-8859-1, fetched Wednesday, 22-May-2024 01:34:42 AEST.