Title: Chapters 6 and 8: type checking
1Chapters 6 and 8: type checking and intermediate
code generation
2Type checking: static checking
- A compiler's ultimate job is to translate the input
program into a form that can be executed on the
target machine
- It needs to check that the source program adheres to the
syntax (structure) and semantics (meaning) of the language
- It needs to maintain a large knowledge base
- Needs to know the representation of values
- Needs to know how the values flow between
variables
- Needs to know the structure of the computations
- Needs to understand how the program interacts with
external files/devices
3A simple scenario
- Consider a simple variable x and its
corresponding executable code.
- To emit the code, the compiler needs to answer these
questions
- What kind of value must be kept in x?
- How big is x?
- Is x an ID?
- Is x a function?
- Is x local?
- Is x global?
- Compiler uses
- declaration (e.g., C)
4Checking?
- Type checking?
- The process of identifying errors in a program
based on explicitly or implicitly stated type
information
- Static (compile time) vs. dynamic (run time)
- Strongly typed language vs. weakly typed language
- Kind checking
- The process of identifying errors in a program
based on stated kind information
- Variables
- Functions/procedures
5Examples of static checking
- Type checks
- Control flow checking
- Uniqueness checking
- Name-related checking
6Type systems
- Type
- A set of common properties associated with each
value of that type
- Type systems?
- Specify the semantics of a valid program
- Create a knowledge base for both structure and
behavior
- Ensure run-time safety (preventing run-time errors by
type checking)
- Improve expressiveness (operator
overloading/polymorphism)
- Improve run-time efficiency by generating better
code (e.g., addition when one of the operands is
zero)
- Type checker
- Verifies that the actual (found) type matches
the expected type
- e.g.
- The mod operation in Pascal expects the found types
of its operands to be integers
7Example of Type checking1
- Type compatibility checks
- compatibility checking between operands
- (e.g., ai ai1)
- Flow-of-control checks
- if a statement transfers the flow of control, then
there must be a place to which the flow can be
transferred
- e.g.,
- break statements in C
8Type checking2
- uniqueness checks
- the situation in which an object must be defined
exactly once
- e.g.,
- an identifier in Pascal
- name-related checks
- the situation in which the same name must appear
more than once
- e.g.,
- a loop or program name in Modula-2
9Type checker
- A type checker?
- Verifies that the type of a construct matches the
expected type in any given context
- E.g.,
- the operator mod in Pascal requires integer operands
- dereferencing (i.e., getting the value of a data
structure through a pointer) is applied only to a
pointer
- indexing is applied only to an array
10Type checker Position
- Some compilers combine type checking (the type
checker) and intermediate code generation (ICG) with
parsing
- E.g., Pascal
[Figure: tokens -> Parser -> AST -> Type checker -> AST -> IC generator -> IC]
11Why do we need type information?
- Type information is needed by the code generator
- Operator overloading (different implementations
/ad hoc polymorphism)
- A symbol that represents different operators in
different contexts
- e.g., the same operator symbol can be applied to integers,
reals, and strings
- Polymorphism (multiform)
- Differs from overloading
- Refers to the situation in which the body of a
function is executed with arguments of several
types
12The Idea behind Type Systems
- Design of type checkers depends on
- information about the syntactic constructs
- the notion of types and type compatibility
- the rules for assigning types to language
constructs (e.g., expressions)
- Mapping between operand types and the result type
- e.g., assignment statement: RV-type → LV-type
- Where does this information come from?
- reference manual
13Example of using manual
- Example of information that comes from the manual (C
or Pascal)
- If both operands of the arithmetic operators
+, -, * are of type integer, then the result
is of type integer
- The result of the unary & operator is a pointer
to the object referred to by the operand. If the
type of the operand is integer/real, then the
type of the result is pointer to integer/real
- Each expression has a type associated with it
- Each expression has a structure
- E.g.,
- the type pointer to integer is constructed from
the type integer that it refers to
14Type systems 1
- A type system
- A set of rules for constructing new types from
the existing types
- A method for determining if two types are
equivalent or compatible
- Set of base types, or built-in types
- Char, integer, void (or null), real
- Specification using AG or SDT
15Type Systems 2
- Two types
- Basic types
- Constructed types
- Basic types
- Atomic types with no internal structure
- e.g.,
- Boolean (AND/OR/XOR operations)
- Characters (string operations)
- Numbers (+, /, -, * operators)
- Void (null value)
- Constructed types (user-defined types)
- types that are constructed from basic types and other
constructed types
- Var A: Array [1..10] of integer
16Type expressions?
- The type of a language construct will be denoted
by a type expression
- A type expression can be
- Basic (e.g., integer)
- Formed by applying a type constructor to other type
expressions
- Type expressions?
- All basic types are considered type expressions
- A type name is a type expression (e.g., record)
17Type Constructors
- Type constructors applied to type expressions are
- Array: if T is a type expression, then array(I,
T) is a type expression (where I is the index set and T
is the element type)
- Var A: Array [1..10] of integer // associates the
type expression array(1..10, integer) with A
- Products: if T1 and T2 are type expressions, then
T1 × T2 is a type expression
- Pointer: if T is a type expression, then pointer(T)
is a type expression denoting the type
pointer to an object of type T
- E.g., var p: ↑row (i.e., declares var p to have
type pointer(row))
- Functions: map elements of one set (domain) to
another set (range)
- E.g., mod: int × int → int
- Function f(a, b: char): ↑integer
- Type of f: char × char → pointer(integer)
18Type checker
- Type checker
- implements a type system
- different type systems can be used by different
compilers
- Static and dynamic checking of types
- Checking done by a compiler is said to be static
- Checking done when the target program runs is
dynamic
- A good type system attempts to eliminate dynamic
checking as much as possible
- A language is strongly typed?
- if the compiler can guarantee that the programs
it accepts will execute without type errors
19Dynamic type checking
- Some checks can be done only dynamically
- E.g.,
- Table: array [1..255] of char
- i: integer
- Compute Table[i]
- The compiler cannot guarantee that, during execution,
the value of i will be in the range 1..255
- Error recovery
- What if errors are caught during type checking?
- It is a reasonable expectation that a type
checker does something about them
- send type checking error messages
- report the nature and the location of errors
- It is desirable for a type checker to recover
from errors
20Run-time Safety
- Strongly typed languages (e.g., Pascal, Java)
- Safety is a strong reason for using typed
languages
- A language implementation that guarantees to catch
most type-related errors before they execute can
simplify the design/implementation
- Weakly typed languages
- E.g., C
- Untyped languages
- E.g., Assembler
21More on typing
- Weak typing
- Type errors can lead to erroneous calculations
- E.g., Ruby
- Strong typing
- Type errors cannot cause erroneous calculations
- The type checking is done at compile time or run
time
- E.g., Pascal
- Static typing (strong typing)
- The types of all expressions are determined at
compile time before the program is executed
- The type check is typically carried out in an
early phase of the compilation
- Comes in two flavors
- explicit type declaration and
- implicit type inference
- Static typing implies strong typing
22Ensuring Run-time safety
To eliminate run-time errors, the compiler must
infer a type for each expression, e.g., the addition
operation in FORTRAN 77 (a + b)
23Generating Better Code (Implementation of + in
FORTRAN 77)
The code on the right shows the IC operations for
addition together with conversion for mixed-type
expressions
24Run-Time Checking and Conversion for + in FORTRAN
77
Example of code for a language in which type
checking is deferred until run time
25Compile-time vs. Run-time
- Strongly typed languages
- All inference and all checking are done at
compile time
- Statically typed and statically checked refer to
an implementation that performs all this work at
compile time
- Dynamically typed and dynamically checked
- Strongly typed, statically typed language with
dynamic checking (Java)
- Performs some of the checking at run time (e.g.,
execution model)
26type checking type of declaration
- Types are introduced in two ways
- Type declaration
- Anonymously
- Example of type declaration
- Type MyArray = Array [1..10] of Integer
27anonymously
- Example of an anonymous type
- Var a: Array [1..10] of real
- the compiler will expand the above to
- TYPE type01_in_line_77 = Array [1..10] of real
- Var a: type01_in_line_77
28forward references
- Type declarations may refer to identifiers that
have not been declared (forward references)
- E.g.,
- TYPE Ptr_List_Entry = POINTER TO List_Entry
- TYPE List_Entry =
- RECORD
- Elm: integer
- Nxt: Ptr_List_Entry
- END RECORD
29THE TYPE TABLES
- The information documented for each type
includes
- Its type constructor (basic, record, array, pointer,
etc.)
- The size and alignment requirements of a variable
of the type
- Types of the components, if applicable
- for a basic type: its type (e.g., integer, real,
etc.)
- for a record type: its list of fields
- for an array type: number of dimensions, the
index types, element type
- for a pointer type: the referenced type
- others: the appropriate info
30Type equivalence1
- Types MUST be checked for compatibility and
equivalence
- Examples include
- Assignment statements
- Formal and actual parameters of
procedure/function calls
- Compilers must detect and report any situation
that violates type compatibility
- e.g., found integer but the compiler expected real
31Type equivalence2
- An important element of any type system is the
mechanism to decide if two different type
declarations are equivalent
- To compare two types, we need to understand the
notion of type equivalence
- two types are equivalent iff values of the two
types have the same representation
- (i.e., one can be used where the other is
required, and vice versa)
32Type equivalence3
- There are two kinds of type equivalence
- Name equivalence (used for almost all languages)
- Structural equivalence (difficult to implement)
33Name equivalence
- Two types are name equivalent if they have the
SAME NAME
- e.g.
- Type t1 = Array [1..10] of integer
- Type t2 = Array [1..10] of integer
- Are they name equivalent? No, because they have
distinct type definitions (names)
34More on Name equivalence
- TYPE t3 = Array [1..10] of Integer
- TYPE t4 = t3
- Are they equivalent? Yes
- The name equivalence check is easy for the compiler:
it just compares the pointers
35Structural equivalence
- Two types are structurally equivalent iff they
have the same structure
- same set of fields
- same order
- corresponding fields having equivalent types
- Difficult to implement (a parallel traversal of the two
type descriptors is needed)
- e.g.,
- TYPE t5 = RECORD c: integer; p: ↑t5 END
- TYPE t6 = RECORD c: integer; p: ↑t6 END
- Examples of languages that support structural type
equivalence are
- C, C++, and Algol 68
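The parallel traversal of two type descriptors can be sketched in Python as follows. This is only an illustration under assumptions: the classes Basic, Pointer and Record and the function struct_equiv are hypothetical names, not part of any compiler described in these slides. Pairs already under comparison are assumed equivalent, which is one common way to keep recursive types such as t5 and t6 from looping forever.

```python
from dataclasses import dataclass, field

# Hypothetical type descriptors (assumed for illustration only).
@dataclass(eq=False)
class Basic:
    name: str

@dataclass(eq=False)
class Pointer:
    target: object = None

@dataclass(eq=False)
class Record:
    fields: list = field(default_factory=list)  # list of (field name, type)

def struct_equiv(a, b, assumed=None):
    """Walk both descriptors in parallel and compare their structure."""
    if assumed is None:
        assumed = set()
    if a is b:
        return True                      # same descriptor: name equivalent too
    key = (id(a), id(b))
    if key in assumed:
        return True                      # pair already being compared (cycle)
    assumed.add(key)
    if isinstance(a, Basic) and isinstance(b, Basic):
        return a.name == b.name
    if isinstance(a, Pointer) and isinstance(b, Pointer):
        return struct_equiv(a.target, b.target, assumed)
    if isinstance(a, Record) and isinstance(b, Record):
        if len(a.fields) != len(b.fields):
            return False
        return all(na == nb and struct_equiv(ta, tb, assumed)
                   for (na, ta), (nb, tb) in zip(a.fields, b.fields))
    return False

def make_list_record():
    # RECORD c: integer; p: pointer to this same record END
    rec = Record()
    rec.fields = [("c", Basic("integer")), ("p", Pointer(rec))]
    return rec

t5 = make_list_record()
t6 = make_list_record()
t7 = Record([("c", Basic("real")), ("p", Pointer(None))])
```

Here t5 and t6 are distinct descriptors (so they are not name equivalent), yet struct_equiv(t5, t6) holds, while t7 differs in the type of field c.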
36Type checking (revisited)
- To do type checking a compiler needs
- to assign a type expression to each element of
the source program
- to determine that these type expressions conform
to a collection of rules known as the type system for
the source language
- A sound type system eliminates the need for
dynamic checking for type errors
- Ideas from type checking have been used to
improve the security of systems that allow
software components to be imported
- E.g., imported Java code is checked before it
can be executed, to prevent both malicious
behavior and unwanted errors
37Rules for type checking
- Type checking can be done in two ways
- Synthesis
- Inference
- Synthesis computes the type of an expression
using its sub-expressions
- Requires define/use semantics
- E.g. x + K
- Type inference computes the type of a language
construct from the way it is used
- E.g., function null(x) that tests if the list is
empty
- Needed for languages that do not require
define/use
- E.g. ML
38Inference Rule
- For each operator (e.g., +), type inference rules
specify
- The mapping from the operand type(s) to the
result type
- Simple mapping
- E.g., assignment statement
- It requires one operand and one result
- The L.H.S. (result) must have a type that is
compatible with the R.H.S.
39Relationship between operands and result type and
type error
- The relationship can be defined by functions
- the relationship between operands and result can
be defined as functions using a table to compute
the result type
- (e.g., Fortran 77 uses a table)
- The relationship can be defined by rules
- Java: adding two integer types of different
precision produces a result of the more precise
(longer) type
- The inference also specifies type errors
- E.g.,
- The FORTRAN table explicitly forbids some combinations
such as double with complex
- E.g., Java forbids assigning a number to a
character
40Example: result types for operator + in Fortran 77
          Integer   Real      Double    Complex
Integer   Integer   Real      Double    Complex
Real      Real      Real      Double    Complex
Double    Double    Double    Double    Illegal
Complex   Complex   Complex   Illegal   Complex
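A table-driven inference rule like this one can be sketched in Python as a dictionary lookup. The names TYPES, ROWS, RESULT_TYPE and add_result_type are assumptions made for this illustration; the table contents are copied from the slide, with None standing for the illegal combinations.

```python
# Row and column order of the table: operand types of +.
TYPES = ["integer", "real", "double", "complex"]

# Result types copied from the table above; None marks an illegal combination.
ROWS = [
    ["integer", "real",    "double", "complex"],
    ["real",    "real",    "double", "complex"],
    ["double",  "double",  "double", None],
    ["complex", "complex", None,     "complex"],
]

RESULT_TYPE = {
    (left, right): ROWS[i][j]
    for i, left in enumerate(TYPES)
    for j, right in enumerate(TYPES)
}

def add_result_type(left, right):
    """Return the result type of left + right, or raise on a type error."""
    result = RESULT_TYPE[(left, right)]
    if result is None:
        raise TypeError(f"illegal operand types for +: {left}, {right}")
    return result
```

For example, add_result_type("integer", "real") gives "real", while the pair ("double", "complex") is reported as a type error.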
41Compiler and type error
- Any type error must result in
- an error message being reported
- or the compiler fixing the error by inserting a
conversion operation
- E.g.,
- In FORTRAN 77, addition of integer and
floating-point requires conversion of the integer to
floating-point before the addition
- E.g.,
- Java's rule for integer addition of values with
different precision coerces the less precise value to
the form of the more precise value
- e.g., an R.H.S. with less precision is coerced to
the precision of the L.H.S.
42Declaration and Inference
- Remember: many programming languages require
define/use semantics (i.e., mandatory
declarations) for each variable with a well-defined type
- The compiler can assign types to any expression over
variables and constants using
- Define/use for variables
- Implied types for constants
- A complete set of type-inference rules
- Type information about functions
- Assigning types to constants
- The type of a constant can be inferred from usage and
its context
- E.g., X2 (implies that 2 is integer)
- E.g., Sin(2) (implies that 2 is real)
- Type inference becomes complicated when define/use
is not mandated
- Remember: the goal of type inference is to assign
a type to each expression that occurs in a
program
43Inferring types for expression
- The simplest form of type inference occurs when
the compiler can assign a type to each element in
an expression (i.e., each leaf in the parse tree for
an expression)
- It requires building the parse tree and assigning a type
to each value in the expression during a simple
postorder tree walk
- The process should let the compiler detect
every violation of an inference rule and report
it at compile time
44Type conversions
- What happens when we expect a value of type T1 but
we find a value of type T2? Is this OK?
- Type expected vs. type found
- If T1 = T2, then no problem
- Else the rules are language-dependent
- Casting (Explicit type conversion)
- Coercion (Implicit type conversion)
45Coercions: implicit type conversion
- The language definition specifies what
conversions are necessary
- In an assignment statement
- the conversion is always to the type of the left-hand
side (LHS)
- In an expression (x + k), where x is real and k is
integer
- the compiler has to convert one of the operands
of + to the type of the other to ensure both have the same
type
- E.g., double d; long l; int i;
- if (d > i) d = i;
- if (i > l) l = i;
- if (d == l) d *= 2;
46Casting
- When programmers are required to convert types
explicitly
- Explicit conversions look like function
applications to the type checker
- double da = 3.3;
- double db = 3.3;
- double dc = 3.4;
- int result = (int)da + (int)db + (int)dc;
// result == 9
- // if implicit conversion were used (as with
"result = da + db + dc"), result would be equal
to 10
47Type checker for a simple language
- A simple type checker for a simple language
with define/use semantics
- The type checker is specified as a translation
scheme
- it synthesizes the type of each expression using the
types of its sub-expressions
- P → D ; E
- D → D ; D | id : T
- T → char | integer | array [num] of T | ↑T
- E → literal | num | id | E mod E | E ( E ) | E↑
48Example of SAMPLE program
- Programs generated by the grammar include
- Key: integer;
- Key mod 2011
- Array [256] of char
- Key integer?
49Simple language: partial translation scheme
- P → D ; E
- D → D ; D
- D → id : T { addtype(id.entry, T.type) }
- T → char { T.type := char }
- T → integer { T.type := integer }
- T → array [num] of T1 { T.type := array(1..num.val,
T1.type) }
- T → ↑T1 { T.type := pointer(T1.type) }
50Type checking of expressions (E)
- E → literal { E.type := char }
- (i.e., the type of a constant literal is char)
- E → num { E.type := integer }
- (i.e., the type of a constant number is integer)
- E → id { E.type := lookup(id.entry) }
- (i.e., when an id appears in an expression, its
declared type is fetched and assigned to the attribute
type)
- E → E1 mod E2 { E.type := if E1.type = integer
and E2.type = integer then integer
else type_error }
- E → E1↑ { E.type := if E1.type = pointer(t) then t
else type_error }
- (i.e., the type of E is the type t of the object
pointed to by the pointer E1)
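These synthesis rules can be sketched as a small recursive function in Python. The tuple encoding of expressions, the dictionary used as symbol table and the name check_expr are assumptions for this illustration only; the rules themselves follow the slide.

```python
# Type expressions: "char", "integer", ("pointer", T), or "type_error".
# Expressions (assumed encoding): ("literal", c), ("num", n), ("id", name),
# ("mod", e1, e2), ("deref", e1).

def check_expr(expr, symtab):
    """Synthesize the type of expr from the types of its sub-expressions."""
    kind = expr[0]
    if kind == "literal":
        return "char"
    if kind == "num":
        return "integer"
    if kind == "id":
        # lookup(id.entry): fetch the declared type, if any
        return symtab.get(expr[1], "type_error")
    if kind == "mod":
        left = check_expr(expr[1], symtab)
        right = check_expr(expr[2], symtab)
        if left == "integer" and right == "integer":
            return "integer"
        return "type_error"
    if kind == "deref":
        operand = check_expr(expr[1], symtab)
        if isinstance(operand, tuple) and operand[0] == "pointer":
            return operand[1]          # the type t in pointer(t)
        return "type_error"
    return "type_error"

# Declarations: key: integer; p: pointer to char
symtab = {"key": "integer", "p": ("pointer", "char")}
key_mod = check_expr(("mod", ("id", "key"), ("num", 2011)), symtab)
p_deref = check_expr(("deref", ("id", "p")), symtab)
bad_mod = check_expr(("mod", ("id", "p"), ("num", 1)), symtab)
```

With these declarations, key mod 2011 has type integer, dereferencing p gives char, and p mod 1 is a type error.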
51Type checking of functions
- E → E1 ( E2 )
- (i.e., the expression is formed by applying E1
to E2)
- E.type := if E2.type = s and E1.type = s → t
then t else type_error
- where f: S → T for types S and T
- Example
- root: (real → real) × real → real
- A function that takes as input a function from
real to real AND a real as arguments
- Function root in Pascal: function root(function f(real):
real; x: real): real
52Type checking of Statements (S)
- Note: language constructs such as statements
typically do not have values. The special basic
type void is assigned to statements
- S → id := E { S.type := if id.type = E.type then
void else type_error }
- // checks that the l-value and r-value of the assignment
have the same type
- S → if E then S1 { S.type := if E.type = boolean
then S1.type else type_error }
- // checks that the expression in a conditional
statement is boolean
- S → while E do S1 { S.type := if E.type = boolean
then S1.type else type_error }
- // checks that the expression in a while statement is
boolean
- S → S1 ; S2 { S.type := if S1.type = void and
S2.type = void then void else type_error }
- // a sequence of statements has type void
iff each sub-statement has type void; any type
mismatch should generate type_error
53Type-checking rules for coercion from integer to
real
PRODUCTION      SEMANTIC RULE
E → num         E.type := integer
E → num.num     E.type := real
E → id          E.type := lookup(id.entry)
E → E1 op E2    E.type := if E1.type = integer and E2.type = integer then integer
                          else if E1.type = integer and E2.type = real then real
                          else if E1.type = real and E2.type = integer then real
                          else if E1.type = real and E2.type = real then real
                          else type_error
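The rule for E → E1 op E2 can be sketched in Python together with the conversion a compiler would insert. The tuple encoding and the names check_with_coercion and inttoreal are assumptions made for this illustration; the four type cases mirror the table.

```python
# Expressions (assumed encoding): ("num", value), ("id", name),
# ("op", operator, left, right). The checker returns (type, rewritten_expr),
# wrapping integer operands in ("inttoreal", e) when a coercion is needed.

def check_with_coercion(node, symtab):
    kind = node[0]
    if kind == "num":
        # E -> num is integer, E -> num.num is real
        return ("integer" if isinstance(node[1], int) else "real"), node
    if kind == "id":
        return symtab.get(node[1], "type_error"), node
    if kind == "op":
        _, op, left, right = node
        lt, left = check_with_coercion(left, symtab)
        rt, right = check_with_coercion(right, symtab)
        if lt == "integer" and rt == "integer":
            return "integer", ("op", op, left, right)
        if lt == "integer" and rt == "real":
            return "real", ("op", op, ("inttoreal", left), right)
        if lt == "real" and rt == "integer":
            return "real", ("op", op, left, ("inttoreal", right))
        if lt == "real" and rt == "real":
            return "real", ("op", op, left, right)
    return "type_error", node

# x is declared real; 2 is an integer constant
result_type, rewritten = check_with_coercion(
    ("op", "+", ("id", "x"), ("num", 2)), {"x": "real"})
```

For x + 2 with x real, the result type is real and the integer constant is wrapped in an inttoreal conversion.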
54What are the type specifications for Boolean
expressions?
- Suppose you add
- T → boolean to the grammar,
- T → char | integer | array [num] of T | boolean
- You need to add productions and semantic rules to
permit comparison operators like < and logical
connectives like AND or OR in the productions
for E
- E → E1 < E2
55Advanced Topics Harder problems in type inference
- The absence of declarations makes type checking
harder
- Some programming languages either omit
declarations or treat them as optional
information
- E.g., Scheme lacks declarations for variables
- Dynamic changes in type
- E.g., APL
- Type-consistent uses and unknown function types
- If the type of a function varies with the function's
arguments, then type inference becomes very
difficult
56Intermediate code generation
- Analysis-synthesis model of compiling
- Front end
- Parser
- Static checker
- Intermediate Code generation (IC)
- Back end
- Code generator
- Combining front ends (N languages) and back ends (M
machines)
- N × M compilers
- Advantages of using IC
- Allows combining front and back ends into N × M
compilers (i.e., retargeting)
- A machine-independent code optimizer can be applied
to the IC to produce more efficient code
57Translating a program
- The process of translation from source language
to target machine may include a sequence of
intermediate representations
- High level
- Machine-independent
- E.g., C language
- Low level
- Machine-dependent
- E.g., register allocation and instruction
selection
- Target code/machine code
58Compiler and intermediate representations
[Figure: Source (Pascal) -> high-level intermediate representation (syntax tree) -> intermediate representation -> low-level intermediate representation (three-address code) -> target (ASM)]
- In the process of translating a program from
source language to target language, a compiler
may build a sequence of intermediate
representations
- Examples of intermediate representations
(languages) are
- syntax tree,
- postfix,
- three-address code,
- C
59Intermediate representation
- Intermediate representation
- Three-address code
- E.g., x = y op z
- Where
- y and z are operands
- x is the result
- Used for low-level machine-dependent tasks
(register allocation, etc.)
- Syntax tree
- Represents the hierarchical structure of the source
program
- Used for static type checking
60Variants of Syntax trees
- Syntax tree
- Nodes represent constructs
- Leaves represent concrete/meaningful constructs
- Directed Acyclic Graph (DAG)
- A directed graph having no directed cycles
- Used to represent expressions by identifying the
common sub-expressions (i.e., sub-expressions
that occur more than once)
- Can be generated using the same technique as
syntax trees
- Read algorithm 6.3 on page 361
61Syntax tree
a = b * -c + b * -c
62DAG
a = b * -c + b * -c
63Three-address code for the expression
a = b * -c + b * -c
64Two representations of the syntax tree and DAG
65DAG (directed Acyclic Graph) for expressions
- A DAG consists of
- Leaves representing the atomic operands
- Interior nodes representing operators
- A node N may have more than one parent iff
- N represents a common sub-expression
- DAG
- Used for generating more efficient code when
evaluating expressions
66Example of DAG
a + a * (b - c) + (b - c) * d
[Figure: DAG for the expression; a single - node represents b - c, which occurs twice]
67SDD to generate syntax trees or DAGs
PRODUCTION      SEMANTIC RULE
E → E1 + T      E.node = new Node('+', E1.node, T.node)
E → E1 - T      E.node = new Node('-', E1.node, T.node)
E → T           E.node = T.node
T → ( E )       T.node = E.node
T → id          T.node = new Leaf(id, id.entry)
T → num         T.node = new Leaf(num, num.val)
- Where
- Leaf and Node are functions creating new nodes
68STEPS FOR CONSTRUCTING THE DAG FOR EXAMPLE
a + a * (b - c) + (b - c) * d
- P1 = Leaf(id, entry-a) // creates leaf node a
- P2 = Leaf(id, entry-a) = P1 // reuses P1
- P3 = Leaf(id, entry-b) // creates b
- P4 = Leaf(id, entry-c) // creates c
- P5 = Node('-', P3, P4) // creates (b - c)
- P6 = Node('*', P1, P5) // creates a * (b - c)
- P7 = Node('+', P1, P6) // creates a + a * (b - c)
- P8 = Leaf(id, entry-b) = P3 // reuses existing node
- P9 = Leaf(id, entry-c) = P4 // reuses existing node
- P10 = Node('-', P3, P4) = P5 // reuses existing
expression (b - c)
- P11 = Leaf(id, entry-d) // creates d
- P12 = Node('*', P5, P11) // creates (b - c) * d
using P5 and P11
- P13 = Node('+', P7, P12) // creates a + a * (b - c)
+ (b - c) * d
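The reuse of existing nodes in these steps can be sketched in Python by keeping a table from node signatures to node indices, so that Leaf and Node return an existing node whenever one with the same signature was already built. The class DagBuilder and its method names are assumptions for this illustration; nodes are stored in an array and referred to by integer index.

```python
class DagBuilder:
    """Builds DAG nodes, returning an existing node when one already matches."""

    def __init__(self):
        self.nodes = []   # array of records; the index identifies the node
        self.index = {}   # signature -> index of the node with that signature

    def _intern(self, signature):
        if signature not in self.index:
            self.index[signature] = len(self.nodes)
            self.nodes.append(signature)
        return self.index[signature]

    def leaf(self, kind, value):
        return self._intern((kind, value))

    def node(self, op, left, right):
        return self._intern((op, left, right))

# The 13 steps for a + a * (b - c) + (b - c) * d
b = DagBuilder()
p1 = b.leaf("id", "a")
p2 = b.leaf("id", "a")        # same index as p1
p3 = b.leaf("id", "b")
p4 = b.leaf("id", "c")
p5 = b.node("-", p3, p4)
p6 = b.node("*", p1, p5)
p7 = b.node("+", p1, p6)
p8 = b.leaf("id", "b")        # same index as p3
p9 = b.leaf("id", "c")        # same index as p4
p10 = b.node("-", p8, p9)     # same index as p5
p11 = b.leaf("id", "d")
p12 = b.node("*", p5, p11)
p13 = b.node("+", p7, p12)
```

Thirteen construction steps produce only nine distinct nodes, because a, b, c and b - c are each created once and then shared.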
69Nodes of A DAG
- The nodes of a syntax tree (DAG) can be stored in
an array of records
- Each row is a record
- In each record, the first field is an operation
code
- Operators have two additional fields (left and
right children)
- Leaves have one additional field, which holds the
lexical value
- Integer indexing is used to refer to the nodes
71Three-address code
- In 3-address code
- Linearized representation of a syntax tree or DAG
- At most one operator on the R.H.S. of an
instruction
- No built-up arithmetic expressions are permitted
- An expression like x + y * z is represented as
- t1 = y * z
- t2 = x + t1
- Where t1 and t2 are compiler-generated temporary
names
- 3-address code is suitable for
- Target-code generation (code can be reordered
easily)
- Optimization
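Linearizing an expression tree into three-address code can be sketched in Python with a postorder walk that allocates a fresh temporary for every operator. The tuple encoding of the tree and the name gen_tac are assumptions for this illustration.

```python
import itertools

def gen_tac(expr):
    """Return (instructions, address) for an expression tree.

    Leaves are strings (names or constants); interior nodes are
    (op, operand) for unary and (op, left, right) for binary operators.
    """
    code = []
    counter = itertools.count(1)

    def walk(node):
        if isinstance(node, str):
            return node                      # a name is its own address
        if len(node) == 2:                   # unary operator
            op, operand = node
            addr = walk(operand)
            temp = f"t{next(counter)}"
            code.append(f"{temp} = {op} {addr}")
            return temp
        op, left, right = node               # binary operator
        left_addr = walk(left)
        right_addr = walk(right)
        temp = f"t{next(counter)}"
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    return code, walk(expr)

# x + y * z
tac, result = gen_tac(("+", "x", ("*", "y", "z")))
```

For x + y * z this yields the two instructions t1 = y * z and t2 = x + t1, with the value of the whole expression held in t2.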
72Example 6.1 DAG
a + a * (b - c) + (b - c) * d
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
[Figure: DAG for the expression; the leaf a is used twice and the node for b - c is used twice]
73Three-address code: addresses and instructions
- Three-address code is built from two concepts
- Addresses
- Instructions
- Can be implemented by records with fields
- e.g., quadruples
- op, arg1, arg2, result
- An address can be
- A name
- Source program names can be used as addresses
- In an implementation, a name is replaced by a
pointer to its symbol-table entry
- A constant
- Different types of constants (e.g., integers and
real numbers)
- A compiler-generated temporary
- A unique name is created each time a temporary is
needed
74Three-address code: instruction set (1)
- Here is a list of the common 3-address
instruction forms
- Assignment x = y op z // where op is a binary
operator
- Assignment x = op y // where op is a unary
operator
- Copy instructions x = y // where x gets the value
of y
- An unconditional jump goto L
- the 3-address instruction with label L is the next
instruction to be executed
- Conditional jumps such as if x relop y goto L,
which use a relational operator
75Three-address code: instruction set (2)
- Procedure calls: param x1, param x2, ..., param xn,
call p, n
- y = call p, n for a function call
- Indexed copy instructions
- x = y[i] // sets x to the value in the location
i memory units beyond location y
- x[i] = y // sets the contents of the location i
memory units beyond x to the value of y
- Address and pointer assignments
- x = &y // sets the value of x to the location of y
- x = *y // copies the contents of the address pointed to by
y to x
- *x = y // sets the r-value of the object pointed to by
x to the r-value of y
76Translation scheme for 3-AC generation of
assignment
Checks whether an entry for id.name
exists in the symbol table
Emits three-address code to the output file
77SDD to produce 3-AC for assignment
78Semantic rules generating code for a while
statement
79SDD for while statement using numerical values for
Boolean
E.code is a synthesized attribute that represents the
3-address code for E
Encoding true or false using the numerical value 0
or 1
S.code is a synthesized attribute that represents the
3-address code for S
E.place is the temporary name holding the value of E
80Translating a high-level statement into 3-address
code (3-AC)
- Given the statement do j = j + 1; while (a[j] <
v);
- It can be translated using
- a symbolic label L or
- position numbers
- Symbolic label version
- L: t1 = j + 1
- j = t1
- t2 = j * 8 // if each array element takes
8 bytes
- t3 = a[t2]
- if t3 < v goto L
81Using position numbers
- 100: t1 = j + 1
- 101: j = t1
- 102: t2 = j * 8 // if each array element takes
8 bytes
- 103: t3 = a[t2]
- 104: if t3 < v goto 100
82Implementation of 3-AC Quadruples
- The three-address code
- specifies the components of each type of
instruction but not their representation
- These instructions can be implemented as objects
or as records with fields for
- Operators
- Operands
- The representations can be
- Quadruples (op, arg1, arg2, result)
- E.g., x = y + z is (+, y, z, x)
- Triples (op, arg1, arg2)
- Refer to results by position
- E.g., a triple refers to the result of x op y by its
position instead of by a temporary name
83Two implementations of three-address instructions
Instead of t1, a triple uses the position (0)
e.g., a = b * -c + b * -c
Notes about quadruples:
1. Unary instructions like x = minus y or x = y do not use arg2
2. Operators like param use neither arg2 nor result
3. Conditional and unconditional jumps put the target label in
result
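The difference between the two representations can be sketched in Python for a = b * -c + b * -c. The quadruple list below is written by hand, and quads_to_triples is a hypothetical helper for this illustration; it assumes the result of every non-copy instruction is a compiler temporary, so each temporary can be replaced by the position of the triple that computes it.

```python
# Quadruples (op, arg1, arg2, result) for a = b * -c + b * -c
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]

def quads_to_triples(quads):
    """Convert quadruples to triples (op, arg1, arg2).

    Assumption: only temporaries appear as results of non-copy
    instructions, so a use of a temporary becomes ("pos", index).
    """
    position = {}     # temporary name -> index of the triple computing it
    triples = []

    def ref(arg):
        return ("pos", position[arg]) if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            # copy: the target name goes in arg1, the value in arg2
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples

triples = quads_to_triples(quads)
```

In the resulting triples, the second instruction is ("*", "b", ("pos", 0)): it names the result of the first triple by position 0 instead of by the temporary t1.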
84Boolean expressions
- In programming languages, Boolean expressions
serve two important purposes
- used to compute logical values (assignments)
- used as conditional expressions in statements
that change the flow of control
- Examples of statements include
- IF/THEN,
- IF/THEN/ELSE,
- WHILE/DO
- Repeat/Until
- Etc
- Boolean expressions are combined using Boolean
operators (AND/OR, NOT)
85Boolean expression 2
- Boolean operators are applied to Boolean operands
such as Boolean values (True/False) or relational
expressions
- Relational expressions are of the form
- E1 relop E2
- Where
- E1 and E2 are arithmetic expressions
- relop ∈ { <, <=, <>, >, =, >= }
- E.g., x + 2 > y - 3
86Boolean expression 3
- E → E OR E | E AND E | NOT E | ( E ) | E rel E | true |
false | id
- Given E1 OR E2
- if E1 is true, then E is true
- Given E1 AND E2
- if E1 is false, then E is false
87Methods of Translating Boolean Expressions
- Two methods of representing the value of a Boolean
expression
- Encode true and false numerically and treat them
like numerical expressions
- E.g.,
- let any nonzero value = true
- zero (or negative) = false
- Use flow of control and positions in the
code to implicitly represent the Boolean value
- A natural fit for control-oriented statements such
as if-then and while-do
- E.g.,
- given E1 OR E2, if E1 is true, then using the
properties of OR we can conclude that the entire
expression is true; there is no need to evaluate E2
88Numerical Representation
- Consider an implementation of Boolean expressions
using
- 1 = true
- 0 = false
- Evaluating from left to right and using the
precedence rules for NOT, AND, OR
- E.g.,
- a OR b AND NOT c is (a OR (b AND (NOT c)))
- Three-address code
- t1 = NOT c
- t2 = b AND t1
- t3 = a OR t2
89Numerical Representation Relational operators
- For a relational expression such as a < b, the
method treats it like a conditional statement
- if a < b then 1 else 0
- Three-address code (3-AC)
- 100: if a < b goto 103
- 101: t = 0
- 102: goto 104
- 103: t = 1
- 104:
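The numerical encoding of a relational expression can be sketched in Python as a function that emits the four instructions with position numbers. The name relop_numeric and its parameters are assumptions for this illustration; the instruction pattern is the one shown on the slide.

```python
def relop_numeric(left, relop, right, temp, start=100):
    """Emit (position, instruction) pairs that set temp to 1 or 0.

    The instruction following the sequence is at position start + 4.
    """
    return [
        (start,     f"if {left} {relop} {right} goto {start + 3}"),
        (start + 1, f"{temp} = 0"),
        (start + 2, f"goto {start + 4}"),
        (start + 3, f"{temp} = 1"),
    ]

code = relop_numeric("a", "<", "b", "t")
```

Calling relop_numeric("a", "<", "b", "t") reproduces positions 100 to 103 of the slide; position 104 is where the following code continues.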
90Translation scheme to implement a numerical
representation of Boolean
91Example of evaluating 3-AC using the translation
scheme
92Jumping code (short circuit code)
- Short-circuit code
- Refers to the translation of a Boolean expression into
three-address code without generating code for
any of the Boolean operators and without necessarily
evaluating the entire expression
- For example, it is possible to evaluate a Boolean
expression without explicitly generating code for
the Boolean operators AND, OR, NOT if we represent the
value by a position in the code sequence
- No Boolean operators appear in the code
- Only Boolean values (i.e., True/False) are used in
specific positions
93Control Flow Statements and 3-address code
- Control flow statements can be
- S → if (E) S1
- S → if (E) S1 else S2
- S → while (E) S1
- Where
- E is a boolean expression (predicate)
- S is a statement (e.g. assignment, etc.)
94Code for if, if/else, and while loop
[Figure: code layouts.
(a) if: E.code (jumps to E.true or E.false); E.true: S1.code; E.false: ...
(b) if-then-else: E.code (jumps to E.true or E.false); E.true: S1.code; goto S.next; E.false: S2.code; S.next: ...
(c) while: S.begin: E.code (jumps to E.true or E.false); E.true: S1.code; goto S.begin; E.false: ...]
95SDD for flow control statements1
PRODUCTION              SEMANTIC RULES
P → S                   S.next = newlabel()
                        P.code = S.code || label(S.next)
S → assign              S.code = assign.code
S → if (E) S1           E.true = newlabel()
                        E.false = S1.next = S.next
                        S.code = E.code || label(E.true) || S1.code
S → if (E) S1 else S2   E.true = newlabel()
                        E.false = newlabel()
                        S1.next = S2.next = S.next
                        S.code = E.code || label(E.true) || S1.code
                                 || gen('goto' S.next) || label(E.false) || S2.code
(|| denotes concatenation of code)
96SDD for flow control statements2
PRODUCTION              SEMANTIC RULES
S → while (E) S1        S.begin = newlabel()
                        E.true = newlabel()
                        E.false = S.next
                        S1.next = S.begin
                        S.code = label(S.begin) || E.code || label(E.true)
                                 || S1.code || gen('goto' S.begin)
97Generating 3-address code for Boolean expressions
using AND, OR, NOT, Relop
PRODUCTION        SEMANTIC RULES
E → E1 OR E2      E1.true = E.true
                  E1.false = newlabel()
                  E2.true = E.true
                  E2.false = E.false
                  E.code = E1.code || label(E1.false) || E2.code
E → E1 AND E2     E1.true = newlabel()
                  E1.false = E.false
                  E2.true = E.true
                  E2.false = E.false
                  E.code = E1.code || label(E1.true) || E2.code
(|| denotes concatenation of code)
98Generating 3-address code for Boolean expressions
(NOT)
PRODUCTION           SEMANTIC RULES
E → !E1              E1.true = E.false
                     E1.false = E.true
                     E.code = E1.code
E → E1 relop E2      E.code = E1.code || E2.code
                              || gen('if' E1.addr relop.op E2.addr 'goto' E.true)
                              || gen('goto' E.false)
E → true             E.code = gen('goto' E.true)
E → false            E.code = gen('goto' E.false)
e.g., if E is of the form a < b, it is translated
into: if a < b goto E.true; goto E.false
99Example 6.22: Control-flow translation of an
if-statement using the SDD
- Consider again
- if (x < 100 || x > 200 && x != y) x = 0;
- Using the SDD
- if x < 100 goto L2
- goto L3
- L3: if x > 200 goto L4
- goto L1
- L4: if x != y goto L2
- goto L1
- L2: x = 0
- L1:
(L2 represents B1.true; L3 represents B1.false)
Note: the semantic rule for P → S generates L1
(i.e., the next statement); statement S is if (B) S1, where S1 is
x = 0
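The jumping-code rules for OR, AND, NOT and relop can be sketched in Python as a recursive generator that passes the true and false labels down as inherited attributes. The class JumpCodeGen, its method names and the tuple encoding of conditions are assumptions for this illustration; labels are emitted on their own lines rather than attached to the following instruction.

```python
import itertools

class JumpCodeGen:
    """Generates short-circuit (jumping) code for Boolean conditions."""

    def __init__(self):
        self._labels = itertools.count(1)

    def newlabel(self):
        return f"L{next(self._labels)}"

    def bool_code(self, e, true, false):
        kind = e[0]
        if kind == "or":
            e1_false = self.newlabel()           # E1.false = newlabel()
            return (self.bool_code(e[1], true, e1_false)
                    + [f"{e1_false}:"]
                    + self.bool_code(e[2], true, false))
        if kind == "and":
            e1_true = self.newlabel()            # E1.true = newlabel()
            return (self.bool_code(e[1], e1_true, false)
                    + [f"{e1_true}:"]
                    + self.bool_code(e[2], true, false))
        if kind == "not":
            return self.bool_code(e[1], false, true)   # swap the targets
        if kind == "rel":
            _, op, left, right = e
            return [f"if {left} {op} {right} goto {true}", f"goto {false}"]
        raise ValueError(f"unknown condition kind: {kind}")

    def if_stmt(self, cond, body):
        s_next = self.newlabel()                 # from P -> S
        e_true = self.newlabel()                 # E.true; E.false = S.next
        return (self.bool_code(cond, e_true, s_next)
                + [f"{e_true}:"] + body + [f"{s_next}:"])

# if (x < 100 || x > 200 && x != y) x = 0;
cond = ("or", ("rel", "<", "x", "100"),
        ("and", ("rel", ">", "x", "200"), ("rel", "!=", "x", "y")))
code = JumpCodeGen().if_stmt(cond, ["x = 0"])
```

For the statement of Example 6.22 this allocates L1 to L4 in the same order as the slide and produces the same jumps, including the redundant goto L3 that immediately precedes label L3.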
100Avoiding redundant gotos
- L3: if x > 200 goto L4
- goto L1
- L4:
- More efficient code using fall-through
- ifFalse x > 200 goto L1 // go to the next S
- L4:
- L1:
101Intermediate code for procedure
- Given
- a is an array of integers
- f: integer → integer
- Translate n = f(a[i]) into intermediate code
- 1. t1 = i * 4 // assuming each array element
takes 4 bytes
- 2. t2 = a[t1]
- 3. param t2 // a[i]
- 4. t3 = call f, 1 // call f with one parameter,
t2
- 5. n = t3
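The five instructions can be sketched in Python as the output of a small generator for a call with one indexed argument. The function translate_indexed_call and its parameters are hypothetical names for this illustration, and the default element size of 4 bytes is the assumption stated on the slide.

```python
import itertools

def translate_indexed_call(target, func, array, index, elem_size=4):
    """Emit three-address code for: target = func(array[index])."""
    temps = (f"t{k}" for k in itertools.count(1))
    t1, t2, t3 = next(temps), next(temps), next(temps)
    return [
        f"{t1} = {index} * {elem_size}",   # byte offset of the element
        f"{t2} = {array}[{t1}]",           # indexed copy: fetch array[index]
        f"param {t2}",                     # pass the single argument
        f"{t3} = call {func}, 1",          # call with one parameter
        f"{target} = {t3}",                # store the returned value
    ]

code = translate_indexed_call("n", "f", "a", "i")
```

With the arguments n, f, a and i this reproduces instructions 1 to 5 of the slide.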
102Design Considerations for Intermediate code
generation
- The choice of allowable operators is an important
design consideration
- The operator set should be rich enough to implement the
operations in the source language
- Having operators close to machine instructions
makes it easier to implement the intermediate
form on a target machine
- Otherwise,
- the optimizer and code generator may need to do
additional work to interpret the structure in
order to generate efficient code
103Note about Exam 3
- Exam 3 is a take-home exam and is due on December
11.
104In class Quiz
- Translate the arithmetic expression a (bc) d
(b-c)
- Into a syntax tree
- Into a DAG