CSCI 435 Compiler Design - PowerPoint PPT Presentation

Provided by: OwenAst9
1
CSCI 435 Compiler Design
  • Week 6 Class 1
  • Section 3.2.2 to 3.2.2.3
  • (245-253)
  • Ray Schneider

2
Topics of the Day
  • Symbolic Interpretation
  • Simple Symbolic Interpretation
  • Full Symbolic Interpretation

3
Symbolic interpretation
  • Upon execution, control follows one possible path
    through the control flow graph
  • Code executed at nodes represents the RUN-TIME
    semantics of the node, NOT the compile-time
    context relations.
  • Ex. attribute evaluation code is concerned with
    the AST and the distribution of information about
    the syntax; AT RUN TIME an if-statement is only
    about a JMP to the THEN-part or the ELSE-part,
    depending on the state of the condition just
    computed

4
Compile-Time versus Run-Time interests
If_statement (INH successor, SYN first) →
    'IF' Condition 'THEN' Then_part 'ELSE' Else_part 'END' 'IF'

ATTRIBUTE RULES:
    SET If_statement .first TO Condition .first;
    SET Condition .true successor TO Then_part .first;
    SET Condition .false successor TO Else_part .first;
    SET Then_part .successor TO If_statement .successor;
    SET Else_part .successor TO If_statement .successor;

[Figure: AST of the if-statement and the run-time path through it]
5
Run-time behavior depends on variables
  • Much contextual information about variables can
    be deduced statically at compile time by
    simulating the run-time process, using a technique
    called SYMBOLIC INTERPRETATION or SIMULATION ON
    THE STACK
  • We attach a Stack Representation to each arrow in
    the Control Flow Graph
  • This compile-time representation of the run-time
    stack holds an entry for each identifier visible
    at that point in the program
  • Mostly interested in Variables and Constants

6
Stack representations in CFG of an if-statement
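[Figure: control flow graph of an if-statement with condition y > 0 and a "true" branch label; stack representation annotation: x is initialized and y has the value 5]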
7
Symbolically interpret an if statement
  • Two parameters are provided: a Stack Representation
    and an If node
  • 1) the routine symbolically interprets the condition
    (yielding a new stack with the condition on top)
  • 2) the condition is unstacked, and the result is
    used to get the stack representations at the end
    of the then and else clauses
  • 3) the routine merges these two stack
    representations

FUNCTION Symbolically interpret an if statement (
        Stack representation, If node
    ) RETURNING a stack representation:
    SET New stack representation TO
        Symbolically interpret a condition (
            Stack representation, If node .condition
        );
    Discard top entry from New stack representation;
    RETURN Merge stack representations (
        Symbolically interpret a statement sequence (
            New stack representation, If node .then part
        ),
        Symbolically interpret a statement sequence (
            New stack representation, If node .else part
        )
    );
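
The outline routine above can be tried out in a few lines of Python. This is only a sketch under stated assumptions: the names `Assign`, `If`, `merge`, `interpret_condition`, `interpret_sequence` and `interpret_if` are hypothetical, the stack representation is modeled as a dict from name to initialization status, and a missing ELSE part is modeled as an empty statement list.

```python
from dataclasses import dataclass, field

# Hypothetical status values for the "is it initialized?" property.
UNINIT, MAYBE, INIT = "Uninitialized", "May be initialized", "Initialized"
CONDITION = "<condition>"          # entry for the condition value on top

@dataclass
class Assign:                      # assignment to the variable `target`
    target: str

@dataclass
class If:
    then_part: list = field(default_factory=list)
    else_part: list = field(default_factory=list)   # empty = no ELSE part

def merge(a, b):
    """Merge two stack representations where the two branches join."""
    return {name: a[name] if a[name] == b[name] else MAYBE for name in a}

def interpret_condition(stack):
    new_stack = dict(stack)        # copy, the caller's stack stays untouched
    new_stack[CONDITION] = INIT    # new stack with the condition on top
    return new_stack

def interpret_sequence(stack, statements):
    stack = dict(stack)            # each branch works on its own copy
    for stmt in statements:
        if isinstance(stmt, If):
            stack = interpret_if(stack, stmt)
        else:
            stack[stmt.target] = INIT
    return stack

def interpret_if(stack, node):
    new_stack = interpret_condition(stack)
    del new_stack[CONDITION]       # discard top entry
    return merge(interpret_sequence(new_stack, node.then_part),
                 interpret_sequence(new_stack, node.else_part))

start = {"x": UNINIT, "y": INIT}
result = interpret_if(start, If(then_part=[Assign("x")]))
```

Here `x` is assigned in the THEN part only, so after the merge `result["x"]` is "May be initialized" while `y` stays "Initialized".
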
8
reality
  • the example is oversimplified; more details are
    needed in actual code
  • ex. a check for the presence of an ELSE part,
    since the statement could be an IF...THEN only
  • might need to copy the stack representation to
    pass a copy to each branch
  • Symbolic Interpretation was already in use in the
    1960s for type checking in Algol 60 (Naur 1965)
  • Next we'll look at checking for uninitialized
    variables using two variants of symbolic
    interpretation
  • SIMPLE Symbolic Interpretation (works for
    structured programs and simple properties)
    L-attributes
  • FULL Symbolic Interpretation (more general)
    works for general attribute grammars and
    L-attributed grammars.

9
Simple Symbolic Interpretation 1
  • Checking for the use of uninitialized variables
  • make a compile time representation of the local
    stack of a routine (possibly including its
    parameter stack)
  • follow representation through the entire routine
  • (representation can be a linked list of (name,
    property) pairs called a PROPERTY LIST)
  • List starts empty or is initialized with the
    parameters and their properties:
    • IN parameters: Initialized
    • INOUT parameters: Initialized
    • OUT parameters: Uninitialized
  • ALSO A RETURN LIST, which combines the stack
    representations as found at return statements and
    at routine exit

(247)
10
Simple Symbolic Interpretation 2
  • Then we follow the arrows of the control flow
    graph updating our list as we go
  • ex. At each node we do the appropriate thing:
  • DECLARATION: add the declared name to the list
    with its status (initialized or uninitialized)
  • IF FLOW OF CONTROL SPLITS: a copy of the list
    follows each path; when the paths combine, the
    lists are merged (generally the merger is obvious,
    except when a variable gets a value in one path
    and not in the other; then label it May be
    initialized)
  • ASSIGNMENT: set the left variable to initialized
    and check the variables in the expression on the
    right side, issuing an error if they are
    uninitialized or a warning if they are May be
    initialized
  • ROUTINE CALL: different run-time stack (not our
    problem), but we ought to check the parameters
  • FOR: pass through the bounds and the
    initialization of the control variable, then make
    a copy of the list, called the LOOP-EXIT LIST
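
The per-node actions for declarations, assignments and a split in the flow of control might look as follows. This is a minimal sketch, assuming the property list is a dict from name to status; the helper names `declare`, `assign` and `merge` and the message texts are made up for illustration.

```python
UNINIT, MAYBE, INIT = "Uninitialized", "May be initialized", "Initialized"

def declare(props, name, initialized=False):
    """DECLARATION: add the declared name with its status."""
    props[name] = INIT if initialized else UNINIT

def assign(props, target, used, messages):
    """ASSIGNMENT: check the right-hand side, then mark the left variable."""
    for name in used:
        if props[name] == UNINIT:
            messages.append(f"error: {name} is uninitialized")
        elif props[name] == MAYBE:
            messages.append(f"warning: {name} may be uninitialized")
    props[target] = INIT

def merge(a, b):
    """Paths combine: differing statuses become May be initialized."""
    return {n: a[n] if a[n] == b[n] else MAYBE for n in a}

props, messages = {}, []
declare(props, "x")
declare(props, "y", initialized=True)
then_props, else_props = dict(props), dict(props)  # control splits: copies
assign(then_props, "x", ["y"], messages)           # x := y in THEN part only
props = merge(then_props, else_props)              # paths combine
declare(props, "z")
assign(props, "z", ["x"], messages)                # z := x after the if
```

After this run `messages` holds a single warning about `x`, because `x` received a value on only one of the two paths.
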

11
Execution of a For_Loop
[Figure: stack representations along the control flow graph of a for-loop (For_statement, Body, End_for), with entries for v (the control variable), from and to; one path covers the case where the body is executed 0 times]
12
Gettin' Outta Dodge
  • When we find an exit_loop inside a loop we merge
    the list we've collected with the loop-exit list
    and continue with the empty list. (same with
    return statements and end of routine)
  • If we find an exit_loop outside a loop we give an
    error.
  • When all stack representations have been
    computed, check the return list to see if all OUT
    parameters have been defined. Error if not!
  • Check all variables at the end of the routine and
    give a warning for any that are not initialized

13
Extending the system
  • once we have a system of symbolic interpretation
    in place it is relatively easy to extend
  • extend the tracking to variables, constants, field
    selectors, etc.
  • extend beyond status (ex. initialized) to value,
    range, or set of possible values, a technique
    called CONSTANT PROPAGATION
  • Two purposes: 1) identify variables that are used
    as constants, 2) get a tighter grip on the tests
    in for- and while-loops, or last-def analysis
    (3.2.2.3)

    int i = 0;
    while (some condition) {
        if (i > 0) printf("Loop reentered: i = %d\n", i);
        i++;
    }
    // fig 3.46

If we try to eliminate the printf( ) because we
know i = 0, so that the first test fails, we have
rather obviously concluded too much.
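
One way to see why the conclusion is too strong is to merge the value of i at the loop entry with the value that arrives over the jump back. The sketch below assumes a very small value domain (a known constant or "any value"); the names `ANY`, `merge_value` and `add_const` are hypothetical.

```python
ANY = "any value"   # hypothetical marker: nothing is known at compile time

def merge_value(a, b):
    """Merge two compile-time values where control flow joins."""
    return a if a == b else ANY

def add_const(value, k):
    """Effect of adding the constant k to a compile-time value."""
    return ANY if value == ANY else value + k

i_at_entry = 0                             # int i = 0;
i_after_body = add_const(i_at_entry, 1)    # i++ at the end of the body

# Looking only at the loop entry suggests the test i > 0 always fails.
naive_drop_printf = not (i_at_entry > 0)

# The test is reached from the loop entry AND from the jump back.
i_at_test = merge_value(i_at_entry, i_after_body)
can_drop_printf = i_at_test != ANY and not (i_at_test > 0)
```

With the jump back taken into account, `i_at_test` becomes "any value" and `can_drop_printf` is False, whereas the naive single look at the entry value would have removed the printf.
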
14
What kind of properties allow simple symbolic
interpretation to WORK?
  • Four Requirements for it to work
  • 1. Program must consist of flow-of-control
    structures with one entry point and one exit
    point only (Structured Programming)
  • 2. Values of the property must form a lattice,
    i.e. values can be ordered in a sequence v1..vn
    such that no operation can transform vj to vi
    where i < j; we will write vi < vj for all i < j.
  • 3. Result of merging two values must be at least
    as large as the smaller of the two values
  • 4. Action taken on vi in a given situation must
    make any action taken on vj in that same
    situation superfluous, for vi < vj

15
Why? 1
  • 1. Program must consist of flow-of-control
    structures with one entry point and one exit
    point only (Structured Programming)
  • this allows each control structure to be treated
    in isolation with the property being analyzed
    well-defined at both the entry point and the exit
    point of the structure
  • The other three requirements allow us to ignore
    the jump back to the beginning of looping control
    structures

16
Why? 2
  • v_in is the property at the entrance to the loop
    and v_out is the property at loop exit
  • (2) guarantees that v_in ≤ v_out
  • (3) guarantees that when we merge v_out after the
    first pass through the loop with v_in to obtain
    the value v_new at the start of the second pass,
    v_new ≥ v_in
  • from (4) it follows that we don't need to perform
    a second scan or to consider the jump back
  • Consider the initialization example: the values
    v1 = Uninitialized, v2 = May be initialized, and
    v3 = Initialized fulfill the requirements, since
    the initialization status can only progress from
    left to right
  • If the four requirements are not satisfied then
    FULL SYMBOLIC INTERPRETATION is necessary.
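
For the initialization example, requirement 3 and the resulting loop argument can be checked mechanically. This is a small sketch, assuming the ordering v1 < v2 < v3 given above and a merge rule in which differing statuses become "May be initialized"; `ORDER`, `rank` and `merge` are illustrative names.

```python
from itertools import product

# v1 < v2 < v3, as in the initialization example.
ORDER = ["Uninitialized", "May be initialized", "Initialized"]
rank = {value: index for index, value in enumerate(ORDER)}

def merge(a, b):
    """Assumed merge rule: equal values stay, different ones become v2."""
    return a if a == b else "May be initialized"

# Requirement 3: a merge result is at least as large as the smaller input.
requirement_3_holds = all(
    rank[merge(a, b)] >= min(rank[a], rank[b])
    for a, b in product(ORDER, repeat=2)
)

# Loop argument: the body can only raise the status (requirement 2), so
# merging v_out back into v_in cannot produce a value below v_in.
v_in = "Uninitialized"
v_out = "Initialized"            # the body assigned the variable
v_new = merge(v_in, v_out)
loop_argument_holds = rank[v_in] <= rank[v_out] and rank[v_new] >= rank[v_in]
```

Both checks come out True for this property, which is why a single scan is enough here.
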

17
Full Symbolic Interpretation
  • Full Symbolic Interpretation consists of
    performing Simple Symbolic Interpretation
    repeatedly until no more changes in the values of
    the properties occur (a closure algorithm)
  • GOTO statements cannot be handled by Simple
    Symbolic Interpretation (Requirement 1 is
    violated)
  • For each label in the routine we need a separate
    list. The lists start off empty, but every time we
    jump to the label we merge the present list with
    the label list and continue with an empty list.
    We are not done after one pass, since the next
    pass might modify the lists, so we repeat until
    nothing changes. Then we can be certain we have
    found all paths by which a variable can be
    uninitialized at a particular label
  • This is a Closure Algorithm, and we have to
    postpone actions on initialization status until
    everything is known
  • Simple => single pass; Full => multipass

18
Full Symbolic Interpretation as a Closure
Algorithm
  • Data definitions: Stack representations, with
    entries for every item we are interested in.
  • Initializations:
  • 1. Empty stack representations are attached to
    all arrows in the control flow graph residing in
    the threaded AST.
  • 2. Some stack representations at strategic points
    are initialized in accordance with properties of
    the source language; ex. the stack representations
    of input parameters are initialized to
    Initialized
  • Inference rules: For each node type, source
    language dependent rules allow inferences to be
    made, adding information to the stack
    representations on the outgoing arrows based on
    those on the incoming arrows and the node itself,
    and vice versa.
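
The closure algorithm can be sketched for the initialization property on a tiny control flow graph with a backward jump. The sketch assumes that `None` stands for an empty stack representation, that each node either assigns one variable or does nothing, and that information only flows forward; `full_symbolic_interpretation` and the graph encoding are hypothetical.

```python
UNINIT, MAYBE, INIT = "Uninitialized", "May be initialized", "Initialized"

def merge(a, b):
    """Merge two representations; None is an empty one (no path seen yet)."""
    if a is None:
        return b
    if b is None:
        return a
    return {n: a[n] if a[n] == b[n] else MAYBE for n in a}

def full_symbolic_interpretation(nodes, successors, entry, entry_state):
    """nodes maps a node name to the variable assigned there, or None."""
    incoming = {n: None for n in nodes}    # 1. empty representations
    incoming[entry] = dict(entry_state)    # 2. initialize a strategic point
    changed = True
    while changed:                         # repeat until nothing changes
        changed = False
        for n, assigned in nodes.items():
            if incoming[n] is None:
                continue                   # no information has arrived yet
            out = dict(incoming[n])
            if assigned is not None:
                out[assigned] = INIT       # inference rule for an assignment
            for s in successors[n]:
                merged = merge(incoming[s], out)
                if merged != incoming[s]:
                    incoming[s] = merged
                    changed = True
    return incoming

# entry -> L -> A (x := ...) -> exit, plus a jump from A back to label L.
nodes = {"entry": None, "L": None, "A": "x", "exit": None}
successors = {"entry": ["L"], "L": ["A"], "A": ["L", "exit"], "exit": []}
result = full_symbolic_interpretation(nodes, successors, "entry", {"x": UNINIT})
```

At label `L` the variable `x` ends up as "May be initialized", because the label is reached both from the entry and over the jump back; at `exit` it is "Initialized". The loop terminates here because a status can only move towards "May be initialized" and the set of statuses is finite.
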

19
So where are we? ...
  • Full symbolic interpretation removes almost all
    the requirements listed in the previous section
    but IT DOESN'T SOLVE ALL THE PROBLEMS
  • Ex. there is no guarantee that the algorithm
    terminates
  • Ex. open-ended algorithms like fig 3.46
  • The complete set of possible values of a variable
    cannot be determined at compile time in all cases
  • Approximation: 1) a set of at most two values, or
    2) any value
  • a little like the counting of primitive tribes: 1,
    2, 3, many
  • Read section 3.2.2.3 on Last-Def analysis
  • involves collecting all the places where a
    variable is defined (used for code generation and
    in particular register allocation) also called
    REACHING-DEFINITIONS ANALYSIS

20
Homework for Week 8
  • Bison Familiarization
  • Read the entire 39 pages of "A Compact Guide To
    Lex and Yacc" // you can skim through it the
    first time
  • THEN concentrate first on getting the lex example
    on page 10 running
  • THEN after you have that running go on to
    Practice, Part 1 and strive to get the primitive
    calculator running (pages 14 through 17)

21
References
  • Text: Modern Compiler Design