Arithmetic Expressions

LA home
Computing
 Algorithms
 Bioinformatics
 FP,  λ
 Logic,  π
 MML
 Prog.Langs

Prog-Langs
 glossary
 Grammar
  Arith-Exp
  Abstract-syn
  Top-Down
  Bottom-Up

Also see:
 Parser.c
 *.java
 exp.sml
<Exp> ::= <Exp> + <Term> |
<Exp> - <Term> |
<Term>

<Term> ::= <Term> * <Factor> |
<Term> / <Factor> |
<Factor>

<Factor> ::= x | y | ... |
( <Exp> ) |
- <Factor>

A grammar for the concrete syntax of simple arithmetic expressions

Non-terminal symbols:
<Exp>, <Term>, <Factor>
Terminal symbols:
+, -, *, /, (, ), x, y, z, ...
Start symbol:
<Exp>
Production rules as above.
 
Note the following are meta symbols:
::= (read it as "can be replaced by")
-> is alternative notation for ::=,
|   (read it as "or by")

e.g. An <Exp> can be replaced by   <Exp> + <Term>   or by <Exp> - <Term>   or by   <Term>.

Note that the production (replacement) rules for <Exp>, <Term> and <Factor> are recursive. The rule for <Exp> states that an Expression is a list of Terms separated by `+' or `-' symbols. A Term is a list of Factors separated by `*' or `/' symbols. A Factor is a variable (x, y, ...), or a parenthesized sub-expression, or a unary `-' followed by a Factor.

The operators `*' and '/' bind more tightly to their operands than `+' and binary `-' because `*' and '/' appear in the rule for <Term>, closer to the operands (<Factor>). Unary `-' binds most tightly of all operators.

The rules for <Exp> and <Term> are left recursive. This is because `+', `-', `*' and `/' are left associative: `a-b-c' gives the same parse tree as `(a-b)-c', not `a-(b-c)'.

A grammar like this can be turned into a "recursive descent parser" (a program) by writing a routine for each non-terminal in the grammar. The routines call each other as specified by the replacement rules. Left recursion, that is a replacement that begins with a recursive non-terminal, must be removed so that the parser does not loop. A simple parser of this type can be found in [Parser.c (click)] or [*.java] or [exp.sml].

1. x+y*z:

<Exp> => <Exp> + <Term> => <Term> + <Term> => <Factor> + <Term> => x + <Term> => x + <Term> * <Factor> => x + <Factor> * <Factor> => x + y * <Factor> => x + y * z
tree of nonterminals and terminals
x+y*z: Full derivation-tree of non-terminals and replacements.

The result of parsing is often returned as a parse-tree based on a simpler abstract syntax. The abstract syntax corresponds to the datatype of parse trees returned by the parser.

parse tree
x+y*z: Parse tree.

2. (x+y)*z:

Parentheses '(' and ')' can be used to force a different parsing:

tree of nonterminals and terminals
(x+y)*z: Full derivation-tree of non-terminals and replacements.

parse tree
(x+y)*z: Parse tree.

Note that parentheses do not appear in any parse tree. They directed the parser to build a certain tree and after that they are unnecessary. The structure of the tree shows the binding of operators to operands without the need for (). Parentheses are only needed in strings.

-- 2002 LA, CSSE, Monash
window on the wide world:

Computer Science Education Week

Linux
 Ubuntu
free op. sys.
OpenOffice
free office suite,
ver 3.4+

The GIMP
~ free photoshop
Firefox
web browser
FlashBlock
like it says!

key
::= "can be replaced by"
->
| "or by"
=> derivation
lm left most
rm right most

© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)
Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Sunday, 21-Sep-2014 20:07:03 EST.