Syntax Analysis - PowerPoint PPT Presentation

About This Presentation
Title:

Syntax Analysis


Slides: 25
Provided by: ps184
Tags: analysis | syntax


Transcript and Presenter's Notes

Title: Syntax Analysis


1
Syntax Analysis
  • Sources
  • 1. Chapter 4, Compilers: Principles, Techniques,
    and Tools
  • 2. Compiler Construction Lecture Notes,
    Prof. Trevor Mudge and Prof. Mark Hodges,
    University of Michigan

2
Agenda
  • In this lecture we expand on the introduction to
    syntax analysis
  • With particular attention to
  • Parsing and
  • Context-Free Grammars

3
Syntax Analysis
  • The syntax of programming language constructs can
    be described by context-free grammars or BNF
    notation
  • Grammars offer significant advantages for compiler
    design and construction
  • Give a precise, easy to understand, syntactic
    specification of a programming language.
  • Help to automatically construct efficient
    parsers, which in turn can reveal syntactic
    ambiguities.

4
Syntax Analysis: Advantages of Grammars
  • Give precise, easy to understand, syntactic
    specification of a programming language.
  • Help to automatically construct efficient parsers,
    which in turn can reveal syntactic ambiguities.
  • Give structure to programming language that is
    useful for the translation of source code into
    correct object code and error checking.
  • New constructs can be added to a language when
    there is an existing grammatical description of
    the language.

5
Syntax Analysis: Parsers
  • Most of this lecture will deal with parsing
    methods that are typically used in compilers.
  • Recall the position of the parser in the compiler
    model

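[Figure: position of the parser in the compiler model. Source program → Lexical Analyser → Parser → Rest of Front End → Intermediate Representation. The parser asks the lexical analyser to "get next token" and receives a token; it passes a parse tree to the rest of the front end. The Symbol Table is attached to these phases.]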
6
Where is Syntax Analysis Performed?
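Source text: if (b == 0) a = b;
Lexical Analysis or Scanner
Token stream: if ( b == 0 ) a = b ;
Syntax Analysis or Parsing
[Figure: abstract syntax tree or parse tree with root "if", whose children are the comparison "==" (over b and 0) and the assignment "=" (over a and b).]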
7
Parsing Analogy
  • Syntax analysis for natural languages
  • Recognize whether a sentence is grammatically
    correct
  • Identify the function of each word

[Figure: parse tree for the sentence "I gave him the book". sentence → subject (I), verb (gave), indirect object (him), object (noun phrase → article (the), noun (book)).]
8
Syntax Analysis Overview
  • Goal: determine if the input token stream
    satisfies the syntax of the language
  • What do we need to do this?
  • An expressive way to describe the syntax
  • A mechanism that determines if the input token
    stream satisfies the syntax description
  • For lexical analysis
  • Regular expressions describe tokens
  • Finite automata are the mechanism that generates
    tokens from the input stream

9
Can we just use Regular Expressions?
  • REs can expressively describe tokens
  • Easy to implement via DFAs
  • So can we just use them to describe the syntax of
    a programming language?
  • NO! They don't have enough power to express any
    non-trivial syntax
  • Example: nested constructs (blocks, expressions,
    statements), e.g. detecting balanced braces
  • We need unbounded counting!
  • FSAs cannot count except in a strictly modulo
    fashion
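
The counting argument above can be made concrete with a small sketch. The function name braces_balanced is hypothetical (it does not come from the lecture). The point is that the check needs an integer counter that can grow without bound, which is exactly the state a finite automaton lacks.

```python
def braces_balanced(text: str) -> bool:
    """Check that every '{' has a matching '}' in the right order.

    The depth counter can grow as large as the nesting in the input,
    so no fixed, finite set of states can stand in for it.
    """
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a closing brace with nothing open
                return False
    return depth == 0      # everything opened was closed


if __name__ == "__main__":
    for sample in ["{}", "{{}{}}", "{{}", "}{"]:
        print(repr(sample), braces_balanced(sample))
```

A DFA with k states can track the depth only up to a bound tied to k (or modulo some number), so it must confuse two different nesting depths once the input nests deeply enough.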
10
Something more is needed!
  • Most programming language constructs have an
    inherently recursive structure that can be
    defined by context-free grammars (CFG).
  • For example, a conditional statement defined by a
    rule such as
  • if E then S1 else S2 cannot be specified by
    REs.
  • REs can specify the lexical structure of tokens
  • We will find CFGs handy!

11
Context-Free Grammars
  • Consist of 4 components
  • Terminal symbols: tokens or ε, e.g.
  • if, then, and else
  • Non-terminal symbols: syntactic variables that
    denote sets of strings.
  • Define sets of strings that help define the
    language generated by the grammar, e.g.
  • in "if expr then stmt else stmt", expr and stmt
    are non-terminals
  • Start symbol S: a special non-terminal
  • Productions of a grammar, of the form LHS → RHS
  • LHS: a single non-terminal
  • RHS: a string of terminals and non-terminals
  • Specify how non-terminals and terminals can be
    combined to form strings.
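
As a minimal sketch, the four components can be written down as a plain data structure. The class name Grammar, the method is_well_formed, and the two extra productions that rewrite stmt and expr to id are assumptions added for illustration. Only the if-then-else rule comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # token names
    nonterminals: frozenset   # syntactic variables
    start: str                # the special start non-terminal
    productions: tuple        # pairs (lhs, rhs); rhs is a tuple of symbols

    def is_well_formed(self) -> bool:
        """Check the structural rules stated in the list above."""
        symbols = self.terminals | self.nonterminals
        return (
            not (self.terminals & self.nonterminals)
            and self.start in self.nonterminals
            # each LHS is a single non-terminal
            and all(lhs in self.nonterminals for lhs, _ in self.productions)
            # each RHS is a string of terminals and non-terminals
            and all(sym in symbols for _, rhs in self.productions for sym in rhs)
        )


if_grammar = Grammar(
    terminals=frozenset({"if", "then", "else", "id"}),
    nonterminals=frozenset({"stmt", "expr"}),
    start="stmt",
    productions=(
        ("stmt", ("if", "expr", "then", "stmt", "else", "stmt")),
        ("stmt", ("id",)),  # assumed placeholder for a simple statement
        ("expr", ("id",)),  # assumed placeholder for a simple expression
    ),
)

if __name__ == "__main__":
    print(if_grammar.is_well_formed())
```

An empty tuple as the right-hand side would stand for an ε-production in this representation.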

12
Context-Free Grammars
  • Each production consists of a non-terminal
    followed by an arrow (→, or ::= in BNF), followed
    by a string of non-terminals and terminals
  • Language generated by a grammar is the set of
    strings of terminals derived from the start
    symbol by repeatedly applying the productions
  • L(G): the language generated by grammar G

S → a S a
S → T
T → b T b
T → ε
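
To illustrate "repeatedly applying the productions", here is a small sketch that performs a leftmost derivation in the grammar above. The names PRODUCTIONS and derive, and the idea of selecting productions by index, are assumptions made for this example.

```python
# Productions of the example grammar; an empty list stands for ε.
PRODUCTIONS = {
    "S": [["a", "S", "a"], ["T"]],
    "T": [["b", "T", "b"], []],
}


def derive(start, choices):
    """Apply one production per entry in `choices`, always rewriting
    the leftmost non-terminal, and return every sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # index of the leftmost non-terminal (raises StopIteration if none is left)
        idx = next(i for i, sym in enumerate(form) if sym in PRODUCTIONS)
        form = form[:idx] + PRODUCTIONS[form[idx]][choice] + form[idx + 1:]
        steps.append(" ".join(form) or "ε")
    return steps


if __name__ == "__main__":
    # S => a S a => a T a => a b T b a => a b b a
    print(" => ".join(derive("S", [0, 1, 0, 1])))
```

The final form "a b b a" contains only terminals, so the string abba belongs to L(G).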
13
CFG Example 1
  • A grammar that defines simple arithmetic
    expressions can be defined by these productions
  • expr → expr op expr
  • expr → ( expr )
  • expr → - expr
  • expr → id
  • op → +
  • op → -
  • op → *
  • op → /
  • op → ↑
  • The terminal symbols are id + - * / ↑ ( )

14
CFG Shorthand: Some Notational Conventions
  • Terminal symbols
  • Lower-case letters early in the alphabet: a, b, c
  • Operator symbols (+, -, etc.)
  • Punctuation symbols: comma, (, ), etc.
  • The digits 0, 1, ..., 9
  • Boldface strings: id, if
  • Nonterminals
  • Upper-case letters early in the alphabet: A, B, C
  • S: the start symbol
  • Lower-case italic names: expr, etc.
  • Grammar symbols (terminals or non-terminals)
  • Upper-case letters late in the alphabet: X, Y, Z
  • Strings of grammar symbols
  • Lower-case Greek letters α, β, γ, as in A → α
  • Strings of terminals
  • Lower-case letters late in the alphabet:
    u, v, ..., z
  • Vertical bar for multiple productions
  • S → a S a | T
  • T → b T b | ε
15
CFG - Example 2
  • Using CFG shorthand we can rewrite the grammar
    for the previous example as
  • E → E A E | ( E ) | - E | id
  • A → + | - | * | / | ↑
  • Original productions, for comparison
  • expr → expr op expr
  • expr → ( expr )
  • expr → - expr
  • expr → id
  • op → +
  • op → -
  • op → *
  • op → /
  • op → ↑
16
More on CFGs
  • Shorthand notation: a vertical bar is used for
    multiple productions
  • S → a S a | T
  • T → b T b | ε
  • Definitions
  • Derivation: successive application of
    productions starting from S
  • Acceptance: determine if there is a derivation
    for an input token stream

17
CFG Example 3
  • Grammar for balanced-parentheses language
  • S → ( S ) S
  • S → ε
  • ε stands for the empty string
  • 1 non-terminal: S
  • 2 terminals: (, )
  • Start symbol: S
  • 2 productions
  • If the grammar accepts a string, there is a
    derivation of that string using the productions
  • Example: (())
  • S ⇒ (S) S ⇒ ((S) S) S ⇒* ((ε) ε) ε = (())
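
Acceptance for this grammar can be sketched as a tiny recursive-descent recognizer with one function for the single non-terminal S. The names parse_S and accepts are hypothetical, and the input is assumed to be a plain string of parenthesis characters rather than a real token stream.

```python
def parse_S(s: str, pos: int = 0) -> int:
    """Consume the longest prefix of s[pos:] derivable from S and
    return the position just after it."""
    if pos < len(s) and s[pos] == "(":
        # S -> ( S ) S
        pos = parse_S(s, pos + 1)              # inner S
        if pos >= len(s) or s[pos] != ")":
            raise SyntaxError(f"expected ')' at position {pos}")
        return parse_S(s, pos + 1)             # trailing S
    # S -> ε : consume nothing
    return pos


def accepts(s: str) -> bool:
    """True if the whole string can be derived from S."""
    try:
        return parse_S(s) == len(s)
    except SyntaxError:
        return False


if __name__ == "__main__":
    for sample in ["(())", "()()", "", "(()", ")("]:
        print(repr(sample), accepts(sample))
```

Each recursive call corresponds to one application of a production, so the call tree for an accepted string mirrors its derivation.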

18
Parsers
[Figure: a Parser takes a context-free grammar G and a token stream s (from the lexer). It answers "yes" if s is in L(G) and "no" otherwise, and emits error messages.]
  • Syntax analyzers (parsers): CFG acceptors which
    also output the corresponding derivation when the
    token stream is accepted
  • Various kinds: LL, LR
19
Parsers
  • Popular parsing methods are classified as either
    top-down or bottom-up
  • Top-down
  • Build the parse tree from the top (root) down to
    the bottom (leaves)
  • Bottom-up
  • Start from the bottom (leaves) and work up to the
    root.
  • In both cases the input to the parser is scanned
    from left to right, one symbol at a time.

20
LL Parsers
  • www.Wikipedia.org
  • An LL parser is a table-based top-down parser for
    a subset of the context-free grammars.
  • It parses the input from Left to right, and
    constructs a Leftmost derivation of the sentence.
  • The class of grammars which are parsable in this
    way is known as the LL grammars.
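
A table-based top-down parser can be sketched for the balanced-parentheses grammar S → ( S ) S | ε. The parse table below was worked out by hand for this example, and the names TABLE, END and ll1_parse are assumptions. The input is assumed to contain only parenthesis characters.

```python
END = "$"                      # end-of-input marker (assumed convention)
NONTERMINALS = {"S"}

# Parse table: (non-terminal, lookahead) -> right-hand side to expand.
TABLE = {
    ("S", "("): ["(", "S", ")", "S"],
    ("S", ")"): [],            # S -> ε
    ("S", END): [],            # S -> ε
}


def ll1_parse(text):
    """Return the productions of a leftmost derivation of `text`,
    or raise SyntaxError if the input is rejected."""
    tokens = list(text) + [END]
    stack = [END, "S"]         # top of the stack is the end of the list
    pos = 0
    derivation = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"unexpected {look!r} at position {pos}")
            derivation.append((top, tuple(rhs)))
            stack.extend(reversed(rhs))   # leftmost symbol ends up on top
        elif top == look:
            pos += 1                      # match a terminal
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r} at position {pos}")
    if pos != len(tokens):
        raise SyntaxError("input continues after the end marker")
    return derivation


if __name__ == "__main__":
    for lhs, rhs in ll1_parse("(())"):
        print(lhs, "->", " ".join(rhs) or "ε")
```

The parser never backtracks: one symbol of lookahead is enough to pick the production, which is what makes this grammar suitable for the table-driven approach.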

21
LR parser
  • www.Wikipedia.org
  • A type of bottom-up parser for context-free
    grammars that is very commonly used by computer
    programming language compilers (and other
    associated tools).
  • LR parsers read their input from Left to right
    and produce a Rightmost derivation in reverse
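
The bottom-up idea can be illustrated with a naive shift-reduce loop. This is not a table-driven LR parser: the handles are detected by hand-written checks, and the small grammar E → E + id | id as well as the name shift_reduce are assumptions chosen for this sketch.

```python
def shift_reduce(tokens):
    """Recognize E -> E + id | id by shifting tokens onto a stack and
    reducing whenever a handle appears on top of the stack.
    Returns the reductions in the order they were performed."""
    stack, reductions = [], []
    rest = list(tokens)
    while True:
        if stack[-3:] == ["E", "+", "id"]:
            stack[-3:] = ["E"]                 # reduce by E -> E + id
            reductions.append("E -> E + id")
        elif stack == ["id"]:
            stack[:] = ["E"]                   # reduce by E -> id
            reductions.append("E -> id")
        elif rest:
            stack.append(rest.pop(0))          # shift the next token
        else:
            break
    if stack != ["E"]:
        raise SyntaxError(f"could not reduce input, stack is {stack}")
    return reductions


if __name__ == "__main__":
    print(shift_reduce(["id", "+", "id"]))
    # ['E -> id', 'E -> E + id']
```

Read backwards, the reductions give the rightmost derivation E ⇒ E + id ⇒ id + id, which is why a bottom-up parser is said to trace a rightmost derivation in reverse.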

22
Parsers: Other Tasks
  • A number of tasks might be carried out during
    parsing
  • Collecting information about various tokens into
    the symbol table
  • Performing type checking
  • Semantic analysis
  • Generating intermediate code

23
Syntax Analysis: Syntax Error Handling
  • Much of the error handling (detection and
    recovery) is done during the syntax analysis phase.
  • Programs can have errors at different levels. For
    example, the errors can be
  • Lexical
  • Misspelling of keywords, identifiers or operators
  • Syntactic
  • Arithmetic expression with unbalanced parentheses
  • Semantic
  • Operator applied to an incompatible operand
  • Logical
  • Infinitely recursive call

24
What Next?
  • With the background on grammars so far we shall
    next look at
  • How we can construct parse trees.
  • Ambiguous grammars
  • A grammar that produces more than one parse tree
  • Some derivation examples