Title: CSCI 3136 Principles of Programming Languages Part III: Semantic Analysis Names, Scopes, and Binding
1CSCI 3136 Principles of Programming Languages
Part III: Semantic Analysis; Names, Scopes, and Bindings
(PPL Chapter 4, Chapter 3)
Faculty of Computer Science, Dalhousie University
2Syntax and Semantics
- Syntax
- describes form of a valid program
- can be described with a CFG
- Semantics (Chapter 4 in textbook)
- describes meaning of a program
- cannot be described with a CFG
- some constraints that may appear syntactic are
actually enforced by semantic analysis
3Semantic Analysis
- Role of semantic analysis
- enforce semantic rules
- build intermediate representation
- (e.g., abstract syntax tree)
- fill symbol table
- pass results to intermediate code generator
- Two approaches to semantic analysis
- interleaved with syntactic processing, or
- a separate phase
- Formal mechanism: attribute grammars
4Enforcing Semantic Rules
- Static semantic rules
- enforced by compiler at compile-time
- Dynamic semantic rules
- compiler generates code for run-time enforcement
- Examples: division by zero, out-of-bounds array index
- some compilers provide an option to disable dynamic checking
5Attribute Grammars
- an augmented CFG
- attributes added to each symbol
- CF productions are augmented with semantic rules for
- copying attribute values
- evaluating attribute values using semantic functions, and
- enforcing constraints on attribute values
6Example of Attribute Grammar
E1 → E2 + T        E1.val := sum(E2.val, T.val)
E1 → E2 - T        E1.val := difference(E2.val, T.val)
E → T              E.val := T.val
T1 → T2 * F        T1.val := product(T2.val, F.val)
T1 → T2 / F        T1.val := quotient(T2.val, F.val)
T → F              T.val := F.val
F1 → - F2          F1.val := add_inv(F2.val)
F → ( E )          F.val := E.val
F → const          F.val := const.val

Underlying CFG:
E → E + T
E → E - T
E → T
T → T * F
T → T / F
T → F
F → - F
F → ( E )
F → const
7An Example
- Let us consider the following language
- a^n b^n c^n, n ≥ 1
- i.e., a language that includes the following words
- abc, aabbcc, aaabbbccc, aaaabbbbcccc, ...
- This is not a CFL, but it can be recognized by an attribute grammar
8Synthesized attributes (bottom-up approach)
S → A B C     Condition: A.size = B.size = C.size
A → a         A.size := 1
A → A2 a      A.size := A2.size + 1
B → b         B.size := 1
B → B2 b      B.size := B2.size + 1
C → c         C.size := 1
C → C2 c      C.size := C2.size + 1
9Example of a Parse Tree Decoration
S → A B C     Condition: A.size = B.size = C.size
A → a         A.size := 1
A → A2 a      A.size := A2.size + 1
B → b         B.size := 1
B → B2 b      B.size := B2.size + 1
C → c         C.size := 1
C → C2 c      C.size := C2.size + 1
Draw and decorate a parse tree for aaabbbccc.
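The synthesized-attribute grammar can be sketched in Python. This is only an illustration under stated assumptions: the function names `block_size` and `recognize` are hypothetical, and the context-free part (S → A B C) is checked with a regular expression instead of a real parser.

```python
import re


def block_size(block: str) -> int:
    """Synthesized attribute 'size' of a block such as 'aaa'.

    X -> x      gives size 1
    X -> X2 x   gives X2.size + 1
    so the attribute is computed bottom-up, one letter at a time.
    """
    size = 0
    for _ in block:
        size += 1
    return size


def recognize(word: str) -> bool:
    """Return True if word is in { a^n b^n c^n | n >= 1 }."""
    # Context-free skeleton S -> A B C: one or more a's, then b's, then c's.
    match = re.fullmatch(r"(a+)(b+)(c+)", word)
    if match is None:
        return False
    a_size, b_size, c_size = (block_size(g) for g in match.groups())
    # Condition attached to S -> A B C.
    return a_size == b_size == c_size


if __name__ == "__main__":
    for w in ["abc", "aaabbbccc", "aabbbcc"]:
        print(w, recognize(w))
```

The point of the sketch is that the condition on S uses only values passed up from the children, which is what makes the attributes synthesized.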
10Inherited attributes (top-down approach)
S → A B C     B.inhSize := A.size; C.inhSize := A.size
A → a         A.size := 1
A → A2 a      A.size := A2.size + 1
B → b         Condition: B.inhSize = 1
B → B2 b      B2.inhSize := B.inhSize - 1
C → c         Condition: C.inhSize = 1
C → C2 c      C2.inhSize := C.inhSize - 1
11Attribute Flow
- Annotation or decoration of a parse tree: the process of evaluating attributes
- Two kinds of attributes
- Synthesized attributes
- attributes evaluated using the RHS of a production, with values stored in the LHS
- bottom-up flow
- Inherited Attributes
- attributes calculated for a symbol in RHS
- top-down flow
12S-attributed and L-attributed Grammars
- S-attributed grammar
- all attributes are synthesized
- bottom-up attribute flow
- L-attributed grammar
- X → Y1 Y2 ... Yn
- X.syn depends on X.inh and Yj.all (for any j)
- Yi.inh depends on X.inh or Yj.all (j < i)
- S-attributed grammars are a subset of
L-attributed grammars
13Example of an L-attributed Grammar
- attributes v (value) and st (sub-total)
- add semantic rules
- annotate a parse tree for 5 - 4 + (3 - 2)
S → T Tt
Tt → + T Tt
Tt → - T Tt
Tt → ε
T → n
T → ( S )
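One way to see how v and st flow through this grammar is a recursive-descent sketch in Python. The semantic rules used here are an assumption (the slide leaves them as an exercise): the inherited sub-total st is passed down as a parameter, and the synthesized value v is the return value. The names `Parser`, `tokenize`, and `evaluate` are hypothetical.

```python
import re


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    return re.findall(r"\d+|[-+()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected token {tok!r}")
        self.pos += 1
        return tok

    def S(self):
        # S -> T Tt      Tt.st := T.v;  S.v := Tt.v
        return self.Tt(st=self.T())

    def Tt(self, st):
        tok = self.peek()
        if tok == "+":
            # Tt1 -> + T Tt2     Tt2.st := Tt1.st + T.v;  Tt1.v := Tt2.v
            self.take()
            return self.Tt(st + self.T())
        if tok == "-":
            # Tt1 -> - T Tt2     Tt2.st := Tt1.st - T.v;  Tt1.v := Tt2.v
            self.take()
            return self.Tt(st - self.T())
        # Tt -> epsilon          Tt.v := Tt.st
        return st

    def T(self):
        tok = self.take()
        if tok == "(":
            v = self.S()          # T -> ( S )    T.v := S.v
            self.take(")")
            return v
        if tok.isdigit():
            return int(tok)       # T -> n        T.v := n.v
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.S()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value


if __name__ == "__main__":
    print(evaluate("5 - 4 + (3 - 2)"))
```

Because st always carries the value of everything to the left, subtraction comes out left-associative even though the grammar is right-recursive.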
14Action Routines
- Ad-hoc translation interleaved with parsing.
- Parser generators (e.g., yacc) allow the programmer to specify action routines.
- Action routines can appear anywhere in a rule, e.g., rule Tt → + T Tt becomes
- Tt1 → + T { Tt2.st := Tt1.st + T.v } Tt2 { Tt1.v := Tt2.v }
- used in yacc and bison
15XML and DTDs
- Example of application of concept of CFGs
- Semantic annotation of text
- example
- DTD definition of elements
- regular expression-like syntax
16Names, Scopes, and Bindings
- Reading: Chapter 3 of PPL
- a name is a mnemonic character string representing something else
- e.g., 1, 2, 3, "test" are not names
- x, sin, f, Prog_1, null? are names
- +, <, ... may be names, if they are not built-in operators
17Binding
- A binding is an association between two entities, e.g., a name and a memory location, or a name and a function
- Typically a binding is between a name and an object
- A referencing environment is the complete set of bindings active at a certain point in a program
18Scope
- Scope of a binding
- the region of a program, or time interval(s) in program execution, in which a binding is active
- Scope
- a program region of maximal size where no bindings are destroyed
19Binding Times
- Compile time
- Mapping of high-level language constructs to machine code
- Layout of static data in memory
- Link time
- Resolve references between separately compiled modules
- Load time
- Machine addresses assigned to static data
- Run time
- Bindings of values to variables, locations of dynamic data
20Importance of Binding Time
- Early binding times
- Efficiency
- Compiled languages
- Later binding times
- Flexibility
- Interpreted languages
21Object and Binding Lifetime
- Object lifetime
- period between creation and destruction of an object
- e.g., pushing and popping a stack frame
- Binding lifetime
- time between creating and destroying a name-to-object association
- two common mistakes
- dangling reference (no object for a binding)
- memory leak (no binding for an object)
22Storage Allocation
- An object's lifetime corresponds to the mechanism used to manage the space where the object resides
- Static
- object at a fixed absolute address
- Stack
- object allocated on the stack in connection with a subroutine call
- Heap
- object allocated/deallocated at arbitrary times
- explicitly by the programmer
- implicitly by the garbage collector
23Case Example: Object creation and destruction in C++
- automatic objects (local to functions/blocks)
- free-store objects (new, delete)
- non-static member objects (depend on parent)
- array elements (depend on array)
- local static objects (from when the thread of control first reaches them to the end of the program)
- global, namespace, class static objects
- temporary objects (in an expression, to the end of the full expression)
- user supplied allocation function (new)
- union members (no members with constructors/destructors)
24Static Objects
- Global variables
- Variables local to a subroutine that retain their value between invocations
- Constant literals
- Tables for run-time support, e.g., debugging, type checking, etc.
- Space for subroutines, including local variables, in a language with no recursion
25Stack-based Allocation
- Space for subroutines in a language that permits recursion
- stack frame (activation record) contains
- arguments, local variables
- return values
- return address, etc.
- Subroutine calling sequence maintains stack
- Caller code, before and after call
- Subroutine (callee) code: prologue and epilogue
26Stack Frame (Activation Record)
- Compiler determines
- Frame pointer: a register pointing to a known location in the current stack frame
- Offsets from the frame pointer of objects in the frame
- Absolute size of stack frame may not be known
- Stack pointer
- register pointing to the first unused location on the stack
- Specified at run time
- The absolute location of stack frame in memory
27Stack Frame before subroutine call
[Figure: stack layout before the subroutine call; the address axis runs from smaller addresses to larger addresses]
28Stack Frame after subroutine call
[Figure: stack layout after the subroutine call; the address axis runs from smaller addresses to larger addresses]
29Example: New features in C in Standard 99
- introduction of automatic variables at arbitrary positions
- automatic variable-length arrays, e.g.
- scanf("%d", &n);
- int a[n];
30Heap-based Allocation
- Heap
- region of storage in which blocks can be allocated and deallocated at arbitrary times, in arbitrary order
- Storage management
- Free list: linked list of free blocks
- In each allocation, search for a block of adequate size
[Figure: free list with an arrow showing the search direction]
31First-fit and Best-fit Allocation
- First fit: grab the first block that is large enough
- Best fit: grab the smallest block that is large enough
[Figure: free list with the search direction and the location reserved for a new block]
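The two policies differ only in which free block they pick. A minimal Python sketch, assuming the free list is a plain list of hypothetical (address, size) pairs kept in address order:

```python
def first_fit(free_list, request):
    """Index of the first free block that is large enough, or None."""
    for i, (_addr, size) in enumerate(free_list):
        if size >= request:
            return i
    return None


def best_fit(free_list, request):
    """Index of the smallest free block that is large enough, or None."""
    best = None
    for i, (_addr, size) in enumerate(free_list):
        if size >= request and (best is None or size < free_list[best][1]):
            best = i
    return best


def allocate(free_list, request, choose):
    """Carve `request` units out of the block picked by `choose`.

    Returns the start address of the allocated space, or None on failure.
    The unused remainder of the chosen block stays on the free list.
    """
    i = choose(free_list, request)
    if i is None:
        return None
    addr, size = free_list[i]
    if size == request:
        del free_list[i]
    else:
        free_list[i] = (addr + request, size - request)
    return addr


if __name__ == "__main__":
    blocks = [(0, 50), (100, 20), (200, 30)]
    print(allocate(list(blocks), 20, first_fit))   # uses the block at 0
    print(allocate(list(blocks), 20, best_fit))    # uses the block at 100
```

First fit stops at the first match; best fit always scans the whole list, which is the price it pays for leaving less unused space in the chosen block.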
32Heap Fragmentation Problem
- Internal fragmentation
- part of a block is unused
- External fragmentation
- unused space consists of many small blocks
- although total free space may exceed the allocation request, individual free blocks may be too small
- Is there less external fragmentation with best fit or first fit?
- depends on the size distribution of allocation requests
33Cost of Allocation in a Heap
- Single free list
- linear cost in the number of free blocks.
- Separate free lists for blocks of different sizes
- Buddy system
- Block sizes are powers of 2
- Addresses of buddies of size 2^k differ only at the kth bit
- For each k, maintain a list of free blocks of size 2^k
- If a block of size 2^k is unavailable, split a block of size 2^(k+1)
- If a block of size 2^k is deallocated, it may be merged with its buddy block
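The buddy rules can be sketched in Python as follows. This is an illustration under assumptions, not a production allocator: addresses are offsets into a heap of size 2^max_k, bits are numbered from 0, and the class name `BuddyAllocator` is hypothetical.

```python
def buddy_of(addr, k):
    """Address of the buddy of the size-2**k block at addr: flip bit k."""
    return addr ^ (1 << k)


class BuddyAllocator:
    def __init__(self, max_k):
        self.max_k = max_k
        # One free list (here a set of addresses) per block size 2**k.
        self.free = {k: set() for k in range(max_k + 1)}
        self.free[max_k].add(0)          # initially one block: the whole heap

    def allocate(self, k):
        """Return the address of a free block of size 2**k, or None."""
        j = k
        while j <= self.max_k and not self.free[j]:
            j += 1                       # look for a larger block to split
        if j > self.max_k:
            return None
        addr = min(self.free[j])
        self.free[j].remove(addr)
        while j > k:                     # split; the upper half stays free
            j -= 1
            self.free[j].add(addr + (1 << j))
        return addr

    def release(self, addr, k):
        """Free a block of size 2**k and merge with free buddies."""
        while k < self.max_k:
            buddy = buddy_of(addr, k)
            if buddy not in self.free[k]:
                break
            self.free[k].remove(buddy)
            addr = min(addr, buddy)      # merged block starts at the lower one
            k += 1
        self.free[k].add(addr)


if __name__ == "__main__":
    heap = BuddyAllocator(4)             # heap of 16 units
    a = heap.allocate(2)
    b = heap.allocate(2)
    print(a, b, buddy_of(a, 2))
```

Finding a buddy costs a single XOR, which is why merging on deallocation is cheap in this scheme.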
34Fibonacci Heap
- Similar to the buddy system, but uses Fibonacci numbers as standard block sizes
- 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
35Deallocation in a Heap
- Explicit deallocation by the programmer
- Efficient
- e.g. Pascal, C
- May lead to bugs that are very difficult to find
- Dangling pointers/references from deallocating too soon
- Memory leaks from not deallocating
- Automatic deallocation by a garbage collector
- Can add significantly to runtime overhead
- e.g. Java, functional and logic programming
languages
36Scopes
- Scope of a binding
- the region of a program in which a binding is active
- Scope
- a program region of maximal size where no bindings are destroyed
- Scoping can be
- Lexical (static)
- bindings known at compile time
- Dynamic
- bindings depend on flow of execution at run time
37Static (Lexical) Scope
- The current binding for a name is the one encountered most recently in a top-to-bottom scan of the program text.
- more common: Scheme, C, Java, Prolog
- Program text units
- packages, modules, source files
- classes
- nested subroutines
- blocks
- records, structures
38Dynamic Scope
- The current binding for a given name is
- the one encountered most recently during execution,
- not hidden by another binding for the same name, and
- not yet destroyed by exiting its scope.
39Example: Lexical vs. Dynamic Scoping
a : integer                -- global declaration
procedure first
    a := 1
procedure second
    a : integer            -- local declaration
    first()
a := 2
if read_integer() > 0
    second()
else
    first()
write_integer(a)
What does the program print? Under lexical scoping? Under dynamic scoping?
Dynamic scoping is usually a bad idea
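The difference can be tried out with a small Python simulation. Python itself is lexically scoped, so the dynamic version below is emulated with an explicit stack of binding frames; that stack and the names `run_lexical`, `run_dynamic`, and `innermost` are assumptions made for this sketch, and `read_value` stands in for read_integer().

```python
def run_lexical(read_value):
    """Lexical scoping: the a in first() is always the global a."""
    a = None                      # a : integer (global declaration)

    def first():
        nonlocal a                # resolved from the program text
        a = 1

    def second():
        a = None                  # local a; invisible to first()
        first()

    a = 2
    if read_value > 0:
        second()
    else:
        first()
    return a                      # value passed to write_integer


def run_dynamic(read_value):
    """Dynamic scoping: a name refers to the most recent active binding."""
    frames = [{"a": None}]        # bottom frame holds the global a

    def innermost(name):
        for frame in reversed(frames):
            if name in frame:
                return frame
        raise NameError(name)

    def first():
        innermost("a")["a"] = 1   # whichever a was bound most recently

    def second():
        frames.append({"a": None})    # local declaration of a
        try:
            first()
        finally:
            frames.pop()              # binding destroyed on exit

    innermost("a")["a"] = 2
    if read_value > 0:
        second()
    else:
        first()
    return innermost("a")["a"]


if __name__ == "__main__":
    for value in (1, 0):
        print(value, run_lexical(value), run_dynamic(value))
```

Under the emulated dynamic rule the result depends on the run-time input, because first() may update the local a of second() instead of the global one.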
40Example: Lexical vs. Dynamic Scope in Perl
# static scoping
sub f { my $a = 1; print "f:$a\n"; printa(); }
sub printa { print "p:$a\n"; }
$a = 2; f();

# dynamic scoping
sub g { local $a = 1; print "g:$a\n"; printa(); }
sub printa { print "p:$a\n"; }
$a = 2; g();

Output?
41Static Scoping in Scheme
- rules for special forms
- Examples: in each, the scope of the variable in red is shown in green
(... (define x ...) F1 ... Fn)
(lambda (x1 ...) F1 ... Fn)
(let ((x1 E1) (x2 E2) (x3 E3)) F1 ... Fn)
(let* ((x1 E1) (x2 E2) (x3 E3)) F1 ... Fn)
(letrec ((x1 E1) (x2 E2) (x3 E3)) F1 ... Fn)
42Static Scoping in Pascal Example
procedure P1(A1 : T1);
var X : real;
    procedure P2(A2 : T2);
        procedure P3(A3 : T3);
            ... (* declarations of P3 *)
        begin
            ... (* body of P3 *)
        end;
    begin
        ... (* body of P2 *)
    end;
    procedure P4(A4 : T4);
        function F1(A5 : T5) : T6;
        var X : integer;
        begin
            ... (* body of F1 *)
        end;
    begin
        ... (* body of P4 *)
    end;
begin
    ... (* body of P1 *)
end
What's visible inside P1? What's visible inside P2?
In F1, the new X will hide the previous X.
43Static Chains
- How is static scoping implemented?
- Each stack frame has a static link: a pointer to the frame of the most recent invocation of the lexically surrounding subroutine.
- To reference a variable in some outer scope, dereference the static chain pointer a number of times and add an offset.
- e.g., to reference x declared in A in code for C: deref(deref(fp)) + offset
[Figure: stack frames of C, D, B, E, and A connected by static links; fp points to the current frame (C), and x is stored at an offset in the frame of A]
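The lookup deref(deref(fp)) + offset can be mimicked in Python. The sketch assumes a nesting of A containing B containing C, which is what two dereferences from C to reach A imply; the `Frame` class, `load`, and the slot layout are hypothetical.

```python
class Frame:
    """A stack frame with a static link and numbered local slots."""

    def __init__(self, name, static_link, slots):
        self.name = name
        self.static_link = static_link   # frame of the enclosing subroutine
        self.slots = slots               # locals, addressed by offset


def deref(frame):
    """Follow one static link."""
    return frame.static_link


def load(fp, hops, offset):
    """Read the variable `hops` lexical levels out, at `offset` in its frame.

    `hops` and `offset` are fixed at compile time; only fp is known at
    run time.
    """
    frame = fp
    for _ in range(hops):
        frame = deref(frame)
    return frame.slots[offset]


if __name__ == "__main__":
    frame_a = Frame("A", None, [42])         # x lives at offset 0 in A
    frame_b = Frame("B", frame_a, [])
    frame_c = Frame("C", frame_b, [])        # fp while C is running
    print(load(frame_c, hops=2, offset=0))   # deref(deref(fp)) + offset
```

The number of hops equals the difference in nesting depth between the referencing code and the declaring scope, so the compiler can emit it as a constant.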
44Shallow and Deep binding
- If a subroutine is passed as a parameter, when are the free variables bound?
- Shallow binding: when the routine is called
- Deep binding: when the routine is first passed as a parameter
- Important in both dynamic and static scoping.
- Known as the fun-arg problem.
45Example of the fun-arg problem
(define x 1)
(define increase_x
  (lambda () (set! x (+ x 1))))
(define execute
  (lambda (f)
    (let ((x 20))
      (display (list "inner x before" x))
      (f)
      (display (list "inner x after" x)))))
(display (list "outer x before" x))
(execute increase_x)
(display (list "outer x after" x))
46Example
type person = record
    ...
    age : integer
    ...
threshold : integer        (* age threshold *)
people : database

function older_than(p : person) : boolean
    return p.age >= threshold

procedure print_person(p : person)
    (* use line_length to format data *)

procedure print_selected_records(db : database;
        predicate, print_routine : procedure)
    line_length : integer
    if device_type(stdout) = terminal
        line_length := 80
    else
        line_length := 132
    for each record r in db
        if predicate(r)
            print_routine(r)

threshold := 35
print_selected_records(people, older_than, print_person)

Appropriate binding (dynamic scoping assumed):
- older_than: deep binding (to get the global threshold)
- print_person: shallow binding (to get the locally set line_length)
47Example Shallow and Deep binding
- What is the output of the following code?
    int x = 10;
    function f(int a) {
        x = a + 1;
    }
    function g(function h) {
        int x = 30;
        h(100); print(x);
    }
    function main() {
        g(f); print(x);
    }
- In case of deep binding?
- In case of shallow binding?
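The two binding rules can be compared with a Python simulation. It assumes dynamic scoping, emulated with an explicit stack of binding frames; the helper names (`run`, `find`, `captured`) are hypothetical, and the returned list holds the printed values in order.

```python
def run(binding):
    """Simulate the program with binding = 'deep' or 'shallow'."""
    env = [{"x": 10}]              # binding stack; env[0] is the global frame
    printed = []

    def find(name, frames):
        """Most recent active binding of name among the given frames."""
        for frame in reversed(frames):
            if name in frame:
                return frame
        raise NameError(name)

    def f(a, frames):
        find("x", frames)["x"] = a + 1

    def g(h):
        env.append({"x": 30})      # int x = 30, local to g
        h(100)
        printed.append(find("x", env)["x"])
        env.pop()

    def main():
        if binding == "deep":
            captured = list(env)   # environment fixed when f is passed
            g(lambda a: f(a, captured))
        else:
            g(lambda a: f(a, env)) # environment looked up when f is called
        printed.append(find("x", env)["x"])

    main()
    return printed


if __name__ == "__main__":
    print("deep:   ", run("deep"))
    print("shallow:", run("shallow"))
```

With deep binding f updates the global x that was visible in main; with shallow binding it updates the x local to g, so the two printed values swap roles.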
48Subroutine closures
- In deep binding, a closure is a bundle of
- A referencing environment
- Reference to the subroutine
- Deep binding is
- an option in dynamically scoped languages
- the default in statically scoped languages
49Modules
- Motivation: dividing work between programmers on a team.
- Make objects and algorithms invisible to portions of the software system that do not need them.
- Information hiding
- Static variables in C: provide for single-subroutine abstractions
- Module (effectively a single instance of a class)
- Module type (effectively a class with no inheritance)
- Class (module type + inheritance)
50Static variables in C
- Subroutine to generate a series of distinct names.
#include <stdio.h>

/* Place into *s a new name beginning with the letter 'l' and
   continuing with the ASCII representation of an integer guaranteed
   to be distinct in each separate call. s is assumed to point to
   space large enough to hold any such name; for the short ints used
   here, seven characters suffice. l is assumed to be an upper- or
   lower-case letter. sprintf 'prints' formatted output to a string. */
void gen_new_name(char *s, char l)
{
    /* C guarantees that static local variables are initialized to zeros */
    static short int name_nums[52];
    int index = (l >= 'a' && l <= 'z') ? l - 'a' : 26 + l - 'A';
    name_nums[index]++;
    sprintf(s, "%c%d", l, name_nums[index]);
}
51Static variables in C Example 2
- Save time on compiling a regular expression
- (e.g., using the regex library by Henry Spencer)
void a_function(char *s)
{
    int i;
    static regexp *date = NULL;   /* compiled once, kept between calls */
    if (date == NULL)
        date = regcomp("[0-9][0-9]? (Jan|Feb) 200[4-9]");
    if (regexec(date, &s[0])) {
        char *p;
        for (p = date->startp[0]; p < date->endp[0]; p++)
            ... /* and so on */
    }
}
52Modules in Modula-2
CONST stack_size = ...
TYPE element = ...
...
MODULE stack;
IMPORT element, stack_size;
EXPORT push, pop;
TYPE
    stack_index = [1..stack_size];
VAR
    s   : ARRAY stack_index OF element;
    top : stack_index;

    PROCEDURE error; ...

    PROCEDURE push(elem : element);
    BEGIN
        ...
    END push;

    PROCEDURE pop() : element;
    BEGIN
        ...
    END pop;

BEGIN
    top := 1;
END stack;

VAR x, y : element;
...
push(x);
...
y := pop;
53Modules in Modula-2 (2)
- visibility specified by explicit IMPORT and EXPORT
- example of closed scopes (vs. open scopes, where bindings from outside are freely passed into the scope)
- forces the programmer to clearly document the interface
- selectively open scopes: Java, C, Perl, Python, Ada
54Constructs similar to Modules
- equivalent constructs to module in other languages
- C: separate compilation units
- C++: namespaces
- Java, Perl, Ada, Turing: packages
- Clu: clusters
55Perl Packages
- lexical (my) and package variables in Perl
- main the default package
- package as a symbol table
- run-time manipulation of names
- special names $_, $/, etc.
- typeglobs
56A Module Deficiency
- Drawback
- the stack module can't be used in an application that requires more than one stack
- Solutions
- Use several copies of the module code with different names
- serious anti-reuse!
- a module that manages several stacks
- inelegant, ad hoc
- Module types
- e.g., in Simula, Euclid
57Classes
public class stack {
    private int stack_size;
    private element[] s;
    private int top = 0;
    public void push(element x) { ... }
    public element pop() { ... }
}
...
stack A, B;
element x, y;
A.push(x);
y = B.pop();
- Every instance of a module type or class (e.g., A) has a separate copy of the module type's or class's variables.
- class = module-as-a-type + inheritance
58Lexical Scoping in Scheme
- rules for special forms
- Examples: in each, the scope of the variable in red is shown in green
(... (define x ...) F1 ... Fn)
(lambda (x1 ...) F1 ... Fn)
(let ((x1 E1) (x2 E2) (x3 E3)) F1 ... Fn)
(let* ((x1 E1) (x2 E2) (x3 E3)) F1 ... Fn)
(letrec ((x1 E1) (x2 E2) (x3 E3)) F1 ... Fn)
59Environments in Scheme
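- simple referencing environment structure
- in A
- value of x?
- value of y?
- in B?
[Figure: environment diagram with frames I, II, and III; labelled bindings x = 3, y = 5; m = 1, y = 2; z = 6, x = 7; A and B are pointers to environments]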
60Closures in Scheme
- Closure: pointer to referencing environment + function code
- e.g. (define (square x) (* x x))
[Figure: square is bound to a procedure object with parameters: x and body: (* x x)]
(Section 3.2 of SICP)
61Example
(define (square x) (* x x))
(define (sum-of-squares x y) (+ (square x) (square y)))
(define (f a) (sum-of-squares (+ a 1) (* a 2)))
62Environment Procedures
[Figure: the global environment binds square, sum-of-squares, and f to procedure objects]
- f: parameters: a; body: (sum-of-squares (+ a 1) (* a 2))
- square: parameters: x; body: (* x x)
- sum-of-squares: parameters: x, y; body: (+ (square x) (square y))
63Environment Evaluation
Evaluating (f 5)
- global env: (f 5) is evaluated here
- E1: a = 5, evaluates (sum-of-squares (+ a 1) (* a 2))
- E2: x = 6, y = 10, evaluates (+ (square x) (square y))
- E3: x = 6, evaluates (* x x)
- E4: x = 10, evaluates (* x x)
64Example with Counters
(define new-counter
  (lambda ()
    (define c 0)
    (define counter
      (lambda ()
        (set! c (+ c 1))
        c))
    counter))
(define cnt (new-counter))
(cnt)   ; produces 1
(cnt)   ; produces 2, etc.
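The same idea can be written in Python as a rough analogue, not a translation of Scheme semantics: the inner function keeps its defining environment alive, so c survives between calls. The name `new_counter` mirrors the Scheme code; `nonlocal` is Python's way of assigning to a variable in the enclosing function.

```python
def new_counter():
    c = 0                      # lives in the environment created by this call

    def counter():
        nonlocal c             # refer to the c of the enclosing call
        c += 1
        return c

    return counter             # the closure: code plus its environment


if __name__ == "__main__":
    cnt = new_counter()
    print(cnt())               # 1
    print(cnt())               # 2
    other = new_counter()      # a separate environment with its own c
    print(other())             # 1
```

Each call to new_counter creates a fresh environment, so independent counters do not interfere with each other.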
65Data abstraction in Scheme (1)
Representing a data structure as a list.
(define (element-of-set? x set)
  (cond ((null? set) false)
        ((equal? x (car set)) true)
        (else (element-of-set? x (cdr set)))))

(define (adjoin-set x set)
  (if (element-of-set? x set)
      set
      (cons x set)))

(define (intersection-set set1 set2)
  (cond ((or (null? set1) (null? set2)) '())
        ((element-of-set? (car set1) set2)
         (cons (car set1)
               (intersection-set (cdr set1) set2)))
        (else (intersection-set (cdr set1) set2))))
66Data abstraction in Scheme (2)
- Not a true data abstraction because
- cannot change a set
- could be fixed by defining a mutation operation
- cannot hide information
- a solution using closure
67Data abstraction in Scheme (3)
(define (make-account balance)
  (define (withdraw amount)
    (if (>= balance amount)
        (begin (set! balance (- balance amount))
               balance)
        "Insufficient funds"))
  (define (deposit amount)
    (set! balance (+ balance amount))
    balance)
  (define (dispatch m)
    (cond ((eq? m 'withdraw) withdraw)
          ((eq? m 'deposit) deposit)
          (else (error "Unknown request - MAKE-ACCOUNT" m))))
  dispatch)

(define acc (make-account 50))
((acc 'deposit) 40)    => ???
((acc 'withdraw) 60)   => ???
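For readers who want to experiment, here is a Python sketch of the same closure-based design. It is an analogue written for illustration: message symbols become strings, the Scheme error call becomes a ValueError, and the sufficiency test is assumed to be "balance is at least the amount".

```python
def make_account(balance):
    """Return a dispatcher whose closures share one hidden balance."""

    def withdraw(amount):
        nonlocal balance
        if balance >= amount:
            balance -= amount
            return balance
        return "Insufficient funds"

    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance

    def dispatch(message):
        if message == "withdraw":
            return withdraw
        if message == "deposit":
            return deposit
        raise ValueError(f"Unknown request - MAKE-ACCOUNT: {message}")

    return dispatch


if __name__ == "__main__":
    acc = make_account(50)
    print(acc("deposit")(40))
    print(acc("withdraw")(60))
```

The balance cannot be reached except through the two operations handed out by dispatch, which gives the information hiding that the list-based set representation lacked.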
68Aliasing and Overloading
- Aliasing
- two or more names bound to one object
- e.g. pointers, references
- makes compiler optimization more difficult
- Overloading
- one name bound to two or more objects
69Aliasing examples
- common block in FORTRAN
- variant records in Pascal, union in C
- pointer-based data structures, e.g.
    int a, b, *p, *q;
    a = *p; *q = 3; b = *p;
- use of references, e.g., C++
    double sum, sum_of_squares;
    void accumulate(double &x) {
        sum += x; sum_of_squares += x * x;
    }
    accumulate(sum); // !?
70Issues with Aliasing
- aliases make code more confusing
- optimization is more difficult or impossible
- the keyword restrict was introduced in C99 to address this issue
- used with a pointer declaration
- asserts that the pointer has no aliases
- programmer's responsibility
71C++ Aliasing Example
- non-parameter reference example
    int val = 1024;
    int &a = val;
    int &b;              // error
    int &c = 10;         // error
    const int &d = 10;   // OK
    int &e = some.complexstr.i;
72Perl Aliasing
- References in Perl are used as pointers, e.g.
    $a = 10; print "a=$a\n";
    $b = \$a; $$b = 5; print "a=$a\n";
- we can also make explicit aliases
    *c = *a; $c = 100;
    print "a=$a\n";
    sub a { print "proc a\n"; }
    &a; &c;
    *d = \$a; $d++; print "d=$d\n";
73Overloading
- Overloaded name
- a name that refers to more than one object in a given scope (e.g., overloaded arithmetic operators)
- most languages have some form of overloading (e.g., +, -, and other operators)
- elaborate overloading: C++, Java, C#, Ada
- operator overloading
- C++, C#: A + B is A.operator+(B)
- Ada: "+"(A, B)
- Fortran 90: interface construct to associate an operator with some binary function
74Similar Mechanisms to Overloading
- Coercion (similar mechanism)
- compiler automatically converts an object into another type when required
- Java example: string concatenation with an object o forces o.toString()
- Subroutine with polymorphic parameters (unconverted).
- Single body of code
- Behaviour is customized
- Generic subroutines (templates).
- separate, similar, not identical, copies of code