Arithmetic Expressions



Also see:
<Exp> ::= <Exp> + <Term> |
<Exp> - <Term> |

<Term> ::= <Term> * <Factor> |
<Term> / <Factor> |

<Factor> ::= x | y | ... |
( <Exp> ) |
- <Factor>

A grammar for the concrete syntax of simple arithmetic expressions

Non-terminal symbols:
<Exp>, <Term>, <Factor>
Terminal symbols:
+, -, *, /, (, ), x, y, z, ...
Start symbol:
Production rules as above.
Note the following are meta symbols:
::= (read it as "can be replaced by")
-> is alternative notation for ::=,
|   (read it as "or by")

e.g. An <Exp> can be replaced by   <Exp> + <Term>   or by <Exp> - <Term>   or by   <Term>.

Note that the production (replacement) rules for <Exp>, <Term> and <Factor> are recursive. The rule for <Exp> states that an Expression is a list of Terms separated by `+' or `-' symbols. A Term is a list of Factors separated by `*' or `/' symbols. A Factor is a variable (x, y, ...), or a parenthesized sub-expression, or a unary `-' followed by a Factor.

The operators `*' and '/' bind more tightly to their operands than `+' and binary `-' because `*' and '/' appear in the rule for <Term>, closer to the operands (<Factor>). Unary `-' binds most tightly of all operators.

The rules for <Exp> and <Term> are left recursive. This is because `+', `-', `*' and `/' are left associative: `a-b-c' gives the same parse tree as `(a-b)-c', not `a-(b-c)'.

A grammar like this can be turned into a "recursive descent parser" (a program) by writing a routine for each non-terminal in the grammar. The routines call each other as specified by the replacement rules. Left recursion, that is a replacement that begins with a recursive non-terminal, must be removed so that the parser does not loop. A simple parser of this type can be found in [Parser.c (click)] or [*.java] or [exp.sml].

1. x+y*z:

<Exp> => <Exp> + <Term> => <Term> + <Term> => <Factor> + <Term> => x + <Term> => x + <Term> * <Factor> => x + <Factor> * <Factor> => x + y * <Factor> => x + y * z
tree of nonterminals and terminals
x+y*z: Full derivation-tree of non-terminals and replacements.

The result of parsing is often returned as a parse-tree based on a simpler abstract syntax. The abstract syntax corresponds to the datatype of parse trees returned by the parser.

parse tree
x+y*z: Parse tree.

2. (x+y)*z:

Parentheses '(' and ')' can be used to force a different parsing:

tree of nonterminals and terminals
(x+y)*z: Full derivation-tree of non-terminals and replacements.

parse tree
(x+y)*z: Parse tree.

Note that parentheses do not appear in any parse tree. They directed the parser to build a certain tree and after that they are unnecessary. The structure of the tree shows the binding of operators to operands without the need for (). Parentheses are only needed in strings.

-- 2002 LA, CSSE, Monash
window on the wide world:

Computer Science Education Week

free op. sys.
free office suite,
ver 3.4+

~ free photoshop
web browser
like it says!

::= "can be replaced by"
| "or by"
=> derivation
lm left most
rm right most

© L. Allison   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)
Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Wednesday, 06-May-2015 15:26:30 EST.