PL/0 and the 655 Project - PowerPoint PPT Presentation

Slides: 191
Provided by: bobma5

Title: PL/0 and the 655 Project


1
  • PL/0 and the 655 Project

2
Specification of Syntax
  • PL/0
  • How the nesting of expression, term and factor in
    PL/0 work together and generate code
  • How the nesting of recognition routines has the
    effect of static scoping
  • Project questions and answers
  • UML Use Case Modeling
  • General Problem of Describing Syntax
  • Recursive Descent Parsing
  • Attribute Grammars
  • Describing the Meanings of Programs:
    Dynamic Semantics

3
Simple Syntax Processing Prerequisites
  • 321, 560, 625 cover basics of processing simple
    programs
  • 655 to advance and unify that understanding
  • Wirth approach to describing syntax: graphs as
    flow charts for programming a parser (recursive
    descent)
  • Textbook Chapters
  • Following slides should be a review

4
Syntax, semantics, language
  • Syntax - the form or structure of the
    expressions, statements, and program units
  • Semantics - the meaning of the expressions,
    statements, and program units
  • Sentence - string of characters over some
    alphabet (maybe what are usually words)
  • Language - set of sentences
  • Lexeme - lowest level syntactic unit of a
    language (e.g., *, sum, begin)
  • Token - category of lexemes (e.g., identifier)

5
Language (following Wirth)
  • L = L(T, N, P, S)
  • Vocabulary T of terminal symbols
  • Set N of non-terminal symbols (grammatical
    categories)
  • Set P of productions (syntactical rules)
  • Symbol S (from N) called the start symbol
  • Language is set of sequences of terminal symbols
    that can be generated (directly or indirectly);
    that's his points 3 and 4

6
Backus Normal Form (1959)
  • Invented by John Backus to describe Algol 58
  • BNF is equivalent to context-free grammars
  • A metalanguage is a language used to describe
    another language.
  • In BNF, abstractions are used to represent
    classes of syntactic structures--they act like
    syntactic variables (also called nonterminal
    symbols)
  • e.g. <while_stmt> -> while <logic_expr> do
    <stmt>
  • This is a rule; it describes the structure of a
    while statement

7
Syntax rules
  • A rule has a left-hand side (LHS) and a
    right-hand side (RHS), and consists of terminal
    and non-terminal symbols
  • A grammar is a finite nonempty set of rules
  • An abstraction (or non-terminal symbol) can have
    more than one RHS
  • <stmt> -> <single_stmt> | begin <stmt_list>
    end
  • Syntactic lists are described in BNF using
    recursion
  • <ident_list> -> ident | ident, <ident_list>
  • A derivation is a repeated application of rules,
    starting with the start symbol and ending with a
    sentence (all terminal symbols)

8
An example grammar
  • <program> -> <stmts>
  • <stmts> -> <stmt> | <stmt> ; <stmts>
  • <stmt> -> <var> = <expr>
  • <var> -> a | b | c | d
  • <expr> -> <term> + <term> | <term> - <term>
  • <term> -> <var> | const

9
An example derivation
  • <program> => <stmts>
  • => <stmt>
  • => <var> = <expr>
  • => a = <expr>
  • => a = <term> + <term>
  • => a = <var> + <term>
  • => a = b + <term>
  • => a = b + const
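
A leftmost derivation can be replayed mechanically. The following is a minimal Python sketch, assuming the usual assignment-statement grammar with `=`, `+`, `-`, and `;` as terminals; the dictionary layout and the function name `leftmost_derivation` are illustrative choices, not part of the slides.

```python
# Grammar of the example; each non-terminal maps to its list of alternatives.
# The dictionary layout is an assumption made for this sketch.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Expand the leftmost non-terminal once per entry in `choices`.

    choices[i] is the index of the alternative applied at step i.
    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    forms = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost non-terminal in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form[pos:pos + 1] = GRAMMAR[form[pos]][choice]
        forms.append(" ".join(form))
    return forms


steps = leftmost_derivation([0, 0, 0, 0, 0, 0, 1, 1])
for step in steps:
    print("=>", step)
```

Each printed line is a sentential form; only the last one, which contains terminals alone, is a sentence.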

10
Derivation explanation
  • Every string of symbols in the derivation is a
    sentential form
  • A sentence is a sentential form that has only
    terminal symbols
  • A leftmost derivation is one in which the
    leftmost non-terminal in each sentential form is
    the one that is expanded
  • A derivation may be neither leftmost nor
    rightmost
  • Parse tree is a hierarchical representation of a
    derivation

11
Parsing another view

12
Static Semantics
  • Other information about the language not
    specified with the BNF
  • Identifier length
  • Maximum integer value
  • Other restrictions on your compiler
  • Symbol table size
  • Code array size
  • Specify these in your description of your
    language processor
  • Recognize the restrictions you've implied

13
Unstated Assumptions
  • Input program read top-to-bottom, left-to-right,
    with no backtracking
  • Things declared before they are used
  • No redefining at same level
  • Inner declarations hidden by nesting
  • Inner can locally hide outer declarations

14
Ambiguity Right Recursive
  • A grammar is ambiguous iff (if and only if) it
    generates a sentential form that has two or more
    distinct parse trees
  • If we use the parse tree to indicate precedence
    levels of the operators, we cannot have ambiguity
  • Operator associativity can also be indicated by a
    grammar
  • <expr> -> <expr> + <expr> | const (ambiguous)
  • <expr> -> <expr> + const | const (unambiguous)
  • Left recursive (left associative) (recursive
    descent will require right recursive)

15
Extended BNF (abbreviations)
  • Optional parts are placed in brackets ([ ])
  • <proc_call> -> ident [ ( <expr_list> ) ]
  • Put alternative parts of RHSs in parentheses and
    separate them with vertical bars
  • <term> -> <term> (+ | -) const
  • Put repetitions (0 or more) in braces ({ })
  • <ident> -> letter {letter | digit}

16
BNF / EBNF
  • BNF
  • <expr> -> <expr> + <term>
  • | <expr> - <term>
  • | <term>
  • <term> -> <term> * <factor>
  • | <term> / <factor>
  • | <factor>
  • EBNF
  • <expr> -> <term> {(+ | -) <term>}
  • <term> -> <factor> {(* | /) <factor>}

17
Syntax Graphs
  • Put the terminals in circles or ellipses and put
    the non-terminals in rectangles
  • Connect with lines with arrowheads
  • e.g., Pascal type declarations

18
Wirth's Rules
  • B1: Reduce system of syntax graphs to a few of
    reasonable size
  • B2: Translate each graph to a procedure according
    to subsequent rules
  • B3: Sequence of elements translates to
  • begin T(S1); T(S2); ...; T(Sn) end, or
    { T(S1); T(S2); ...; T(Sn) }
  • procedure TSx(); begin TS1();
    getsym(); TS2(); getsym() end

19
<term> -> <factor> {(* | /) <factor>}
  • { Pascal comment }
    begin
      factor;
      while sym in [times, slash] do
      begin
        mulop := sym; getsym; factor; gen_proper_op
      end
    end

20
<term> -> <factor> {(* | /) <factor>}
  • void term() {
      factor();                /* parse the first factor */
      while (next_token == aster_code ||
             next_token == slash_code) {
        lexical();             /* get next token */
        factor();              /* parse the next factor */
      }
    }

21
Recursive Descent Parsing
  • Parsing - constructing a parse / derivation tree
    for a given input string
  • Lexical analyzer is called by the parser
  • A recursive descent parser traces out a parse
    tree in top-down order; it is a top-down parser
  • Each non-terminal in the grammar has a subprogram
    associated with it; the subprogram parses all
    sentential forms that the nonterminal can
    generate
  • The recursive descent parsing subprograms are
    built directly from the grammar rules
  • Recursive descent parsers cannot be built from
    left-recursive grammars
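
The points above can be sketched as a small recursive descent evaluator in Python. It is a minimal sketch under stated assumptions: the EBNF rules for expression and term with the loop forms `{(+ | -) <term>}` and `{(* | /) <factor>}`, plus an assumed factor rule (integer literal or parenthesized expression). The class name `Parser`, the helper `tokenize`, and the use of integer division are choices made for this example only.

```python
import re


def tokenize(text):
    """Split input into integer, operator, and parenthesis tokens.

    Characters that match nothing are silently skipped in this sketch.
    """
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # <expr> -> <term> {(+ | -) <term>}
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        # <term> -> <factor> {(* | /) <factor>}
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            right = self.factor()
            value = value * right if op == "*" else value // right
        return value

    def factor(self):
        # <factor> -> number | ( <expr> )   (assumed rule for this sketch)
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("2+3*4"), evaluate("10-4-3"))
```

The `while` loops replace the left recursion of the BNF rules, yet the running `value` still combines operands left to right, so subtraction and division stay left associative.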

22
PL/0 Program Structure
  • Initialize keyword arrays, operator symbols,
    mnemonics, and so forth
  • Initialize variables controlling scanning
    (getting the individual characters), lexical
    analysis (forming tokens), and parsing
  • Call the <block> recognizing routine
  • Note that block ends with a call to listcode
  • Call the virtual machine interpreter
  • Machine code kept in an array between phases
  • Need to add to the output capabilities of PL/0

23
Blocks and Static Scoping
  • Blocks are different than sequences of statements
    or compound statements
  • Blocks can include declarations
  • Sort of like a single-use subprogram, used and
    defined here
  • Where can blocks appear?
  • Ada: almost anywhere a statement could be
  • Pascal: only as bodies of procedures
  • Java: inner classes

24
Data Specific to a Procedure
  • To be able to return from call
  • Program address of its call (return address)
  • Address of data segment of caller
  • Keep in data segment of procedure as
  • RA (return address), DL (dynamic link)
  • Location of variables
  • Relative address only (since memory dynamic)
  • Displacement off base address of appropriate data
    segment (locally B register or by descending
    chain of static links)
  • What does static scoping mean here?

25
Example of Static Scoping
  • void a
        local variable one
        void b
            local variable two
            void c
                local variable three
                // beginning of code for c
                reference one, two
                call b
                // end of c
            // beginning of code for b
            reference one, two
            call c
            // end of b
        // beginning of code for a
        call b
        // end of a
  • a → b → c → b

26
Example of Static Scoping
  • In a, one is local
  • In b, two is local
  • In b, one is a single static level out
  • In c, three is local
  • In c, two is a single static level out
  • In c, one is double static levels out
  • Then c calls b
  • In b, one is still a single static level out
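
The level differences listed above depend only on the textual nesting, not on who called whom. A minimal Python sketch, assuming the nesting a contains b contains c and the variables one, two, three declared in a, b, c respectively; the table names `STATIC_PARENT` and `DECLARED_IN` are hypothetical.

```python
# Static (textual) parent of each procedure; None marks the outermost one.
STATIC_PARENT = {"a": None, "b": "a", "c": "b"}
# Procedure in which each variable is declared.
DECLARED_IN = {"one": "a", "two": "b", "three": "c"}


def level_difference(proc, var):
    """Count the static links to follow from `proc` to the block declaring `var`."""
    target = DECLARED_IN[var]
    current, hops = proc, 0
    while current is not None:
        if current == target:
            return hops
        current = STATIC_PARENT[current]
        hops += 1
    raise NameError(f"{var} is not visible in {proc}")


for proc, var in [("a", "one"), ("b", "one"), ("c", "two"), ("c", "one")]:
    print(proc, var, level_difference(proc, var))
```

Nothing in the function looks at the call chain, which is why the answer for one inside b is the same when b is called from c as when it is called from a.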

27
Block Recognition Processing
  • Block(level, symbolTableStartingIndex)
  • Page 13, left
  • <block> -> <const_decl> <var_decl>
    <proc_decl> <statement_body>
  • <proc_decl> -> procedure <name> <block>
  • Recognize inner block
  • Block(currentLevel+1, currentSymbolTableIndex)
  • Jump around declarations
  • tx0 := tx; table[tx0].adr := cx; gen(jmp,0,0);
    ...
    code[table[tx0].adr].a := cx;
    table[tx0].adr := cx; statement(); gen(opr,0,0)
    {return}
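
The "jump around declarations" step is a backpatch: the jump is emitted before its target is known and fixed up once the nested procedures have been compiled. A minimal Python sketch of that idea, assuming instructions are stored as mutable `[f, l, a]` lists; `compile_block` and its arguments are hypothetical stand-ins for the block routine and are not Wirth's code.

```python
code = []  # generated instructions, each a mutable [f, l, a] list


def gen(f, l, a):
    """Append one instruction and return its index in the code array."""
    code.append([f, l, a])
    return len(code) - 1


def compile_block(inner_blocks, body):
    """Emit a jump, the nested procedures, then this block's own statements.

    inner_blocks: list of (inner_blocks, body) pairs for nested procedures
    body: list of (f, l, a) tuples standing in for the output of statement()
    Returns the address at which this block's own code starts.
    """
    jump_at = gen("jmp", 0, 0)      # target not known yet
    for inner in inner_blocks:
        compile_block(*inner)       # nested procedure code lands here
    start = len(code)
    code[jump_at][2] = start        # backpatch the jump target
    for instr in body:
        gen(*instr)
    gen("opr", 0, 0)                # return
    return start


main_start = compile_block([([], [("lit", 0, 1)])], [("lit", 0, 2)])
for address, instr in enumerate(code):
    print(address, *instr)
```

In the listing, instruction 0 jumps over the nested procedure's code to address 4, where the outer block's own statements begin.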

28
Symbol Table and Static Scope
  • Variable declaration: storage allocated by
    incrementing DX (data index) by 1
  • Initially DX is 3 to allocate space for the
    block mark (RA, DL, and SL)
  • Symbol table (table)
  • enter - enter object into table
  • Nested in block which determines static scoping
  • Recursive calls make table act like a stack
  • position - find identifier id in table
  • Linear search backward
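
A backward linear search over a table that is truncated on block exit gives static scoping almost for free. The following Python sketch shows the idea; the tuple layout `(name, kind, level, adr)` and the return value -1 for "not found" are assumptions made for this example.

```python
table = []  # entries are (name, kind, level, adr); the list grows like a stack


def enter(name, kind, level, adr):
    """Append a new object to the symbol table."""
    table.append((name, kind, level, adr))


def position(name):
    """Search backward so the innermost declaration is found first."""
    for i in range(len(table) - 1, -1, -1):
        if table[i][0] == name:
            return i
    return -1  # marks "not found" in this sketch


# Outer block: two variables placed after the 3-cell block mark.
enter("x", "variable", 0, 3)
enter("y", "variable", 0, 4)

mark = len(table)              # remember the table top on entering a block
enter("x", "variable", 1, 3)   # inner x hides outer x
inner_x = table[position("x")]

del table[mark:]               # leaving the block restores the table top
outer_x = table[position("x")]

print(inner_x, outer_x)
```

Restoring `mark` on exit is what makes inner definitions disappear from outer contexts.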

29
Blocks and Scoping
  • Nesting of blocks implements scope
  • Restoring symbol table pointers makes symbol
    table work like stack
  • Inner definitions lost to outer contexts
  • Idea: make symbol table work like a tree (one
    branch along a tree looks like a stack)

30
PL/0 Virtual Machine
  • Section 5.10 (page 6 of handout)
  • Stack machine: primary data store is stack
  • push, pop, insert or retrieve from within
  • Operations on top of stack (add, test, etc.)
  • Program store: array named code
  • Unchanged during interpretation
  • I: instruction register
  • P: program address register
  • Data store: array named S (stack)

31
Example of Static Scoping (Repeat)
  • void a
        local variable one
        void b
            local variable two
            void c
                local variable three
                // beginning of code for c
                reference one, two
                call b
                // end of c
            // beginning of code for b
            reference one, two
            call c
            // end of b
        // beginning of code for a
        call b
        // end of a
  • a → b → c → b

32
Example of Static Scoping (Repeat)
  • In a, one is local
  • In b, two is local
  • In b, one is a single static level out
  • In c, three is local
  • In c, two is a single static level out
  • In c, one is double static levels out
  • Then c calls b
  • In b, one is still a single static level out

33
Stack of PL/0 Machine (Fig. 5.7)
[Figure 5.7: stack of the PL/0 machine. Labels: DL, RA, SL; DynamicLink; StaticLink; A, B, C, B local vars; B; T]

34
Data Specific to a Procedure (again)
  • To be able to return from call
  • Program address of its call (return address)
  • Address of data segment of caller
  • Keep in data segment of procedure as
  • RA (return address), DL (dynamic link)
  • Location of variables
  • Relative address only (since memory dynamic)
  • Displacement off base address of appropriate data
    segment (locally B register or by descending
    chain of static links)
  • What does static scoping mean here?

35
PL/0 Code Generation
  • (page 7)
  • Addresses are generated as pairs of numbers
    indicating the static level difference and the
    relative displacement within a data segment.
  • But how does the compiler figure this out?
  • PL/0 code
  • Other question: how does PL/0 handle forward
    references?

36
PL/0 Machine Commands
  • LIT load numbers (literals) onto the stack
  • LOD fetch variable values to top of stack
  • STO store values at variable locations
  • CAL call a subprogram
  • INT allocate storage by incrementing stack
    pointer (T)
  • JMP - transfer of control
  • (new program address - P)
  • JPC - conditional transfer of control
  • OPR - arithmetic and relational operators

37
More on PL/0 Code Generation
  • fct = (lit, opr, lod, sto, cal, int, jmp, jpc)
  • instruction = packed record
        f: fct;          {function code}
        l: 0 .. levmax;  {level}
        a: 0 .. amax     {displacement address}
      end
  • procedure gen(x: fct; y, z: integer);
  • begin
      code[cx].f := x; code[cx].l := y;
      code[cx].a := z; cx := cx + 1
  • end
  • procedure listcode; var i: integer;
  • begin {list code generated for this block}
      for i := cx0 to cx-1 do
        writeln(i, mnemonic[code[i].f] : 5,
                code[i].l : 3, code[i].a : 5)
  • end

38
PL/0 Interpreter
  • t := 0; b := 1; p := 0;  {initialize registers}
    s[1] := 0; s[2] := 0; s[3] := 0;  {initialize memory}
    repeat  {instruction fetch loop}
      i := code[p]; p := p+1;
      with i do
      case f of  {decode instruction}
        lit: begin t := t+1; s[t] := a end;
        opr: case a of
               1: s[t] := -s[t];
               2: begin t := t-1; s[t] := s[t] + s[t+1] end
             end;
        jmp: p := a;
        sto: begin s[base(l)+a] := s[t]; writeln(s[t]);
               t := t-1 end;
        cal: begin  {generate new block mark}
               s[t+1] := base(l); s[t+2] := b; s[t+3] := p;
               b := t+1; p := a
             end
      end
    until p = 0  {not a good way to end}
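
The fetch/decode loop above translates almost line for line into Python. This is a minimal sketch, not the handout's interpreter: it covers the instructions shown on the slide plus `lod`, `int`, `jpc`, and `opr 0` (return), whose behavior here is an assumption based on the command descriptions in this deck. Collecting the values that `sto` would print in a list named `output` is a choice made so the example can be checked.

```python
def base(s, b, l):
    """Find the base address l static levels out by following static links."""
    while l > 0:
        b = s[b]
        l -= 1
    return b


def interpret(code, stack_size=64):
    s = [0] * stack_size      # data store; index 0 is unused (1-based as on the slide)
    t, b, p = 0, 1, 0         # top of stack, base address, program address
    output = []
    while True:
        f, l, a = code[p]     # instruction fetch
        p += 1
        if f == "lit":
            t += 1
            s[t] = a
        elif f == "opr":
            if a == 0:        # return: drop the frame, restore p and b (assumed)
                t = b - 1
                p = s[t + 3]
                b = s[t + 2]
            elif a == 1:      # negate
                s[t] = -s[t]
            elif a == 2:      # add
                t -= 1
                s[t] = s[t] + s[t + 1]
            else:
                raise ValueError(f"opr {a} not implemented in this sketch")
        elif f == "lod":
            t += 1
            s[t] = s[base(s, b, l) + a]
        elif f == "sto":
            s[base(s, b, l) + a] = s[t]
            output.append(s[t])   # the slide's version writes the value
            t -= 1
        elif f == "cal":          # build the new block mark
            s[t + 1] = base(s, b, l)
            s[t + 2] = b
            s[t + 3] = p
            b = t + 1
            p = a
        elif f == "int":
            t += a
        elif f == "jmp":
            p = a
        elif f == "jpc":          # jump if top of stack is 0 (assumed)
            if s[t] == 0:
                p = a
            t -= 1
        else:
            raise ValueError(f"unknown instruction {f}")
        if p == 0:                # same termination test as the slide
            return output


program = [
    ("jmp", 0, 1),   # jump around (empty) declarations
    ("int", 0, 4),   # block mark (3 cells) plus one variable at offset 3
    ("lit", 0, 2),
    ("lit", 0, 3),
    ("opr", 0, 2),   # add
    ("sto", 0, 3),
    ("lod", 0, 3),
    ("opr", 0, 1),   # negate
    ("sto", 0, 3),
    ("opr", 0, 0),   # return from the main block; p becomes 0
]
result = interpret(program)
print(result)
```

The main block's return restores `p` from the initial block mark, which was zeroed, and that is what ends the loop.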

39
Project Virtual Machine
  • Can use the design of the PL/0 one
  • Operations in PL/0 are integer oriented; you
    probably want to add to this
  • Can also use other machine designs
  • Hybrid approach: compiles to intermediate form,
    then interprets that
  • Direct interpretation possible if clearly
    proposed
  • Idea: add output whenever computation done
  • Idea: build some messages in that could be output
    with new opr instructions

40
Programs Writing Programs
  • Compilers basically take programs in one language
    and write programs in another
  • Compiler-compilers take grammars and write
    compilers (second level program writers)
  • Translation of EBNF straightforward
  • terminal → specific recognizer
  • non-terminal → call to recognizing routine
  • alternatives → explicit initial characters
  • loops → graph following algorithms
  • LEX, YACC, other compiler tools
  • Add in other stuff to the language processor

41
Adding Predefined (Variable) Names
  • procedure block has 2 parameters
  • lev (the nesting level for the block)
  • tx (starting index for the symbol table)
  • The nested procedure enter is what puts symbols
    (variable names) into the symbol table
  • Right-side page 14 of handout
  • Initialize symbol table
  • Make initial call of block with a non-zero table index
  • Can initialize or do other things not normal in
    the user visible input language

42
Adding New Operator
  • Add to getsym to recognize new symbol
  • Look in condition, expression, term, factor
  • Is new operator parallel to one of those
    operators?
  • Basically another option in code generation
  • If not like existing operators, add new syntactic
    construct.
  • New action: add to PL/0 machine
  • Generate new instruction: gen(opr,0,14)
    (square)
  • Implement new instruction functionality (page 14,
    left-side): 14: begin s[t] := s[t]*s[t] end
  • Add it into list of mnemonics
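
One way to picture the machine-side change is a dispatch table keyed by the `a` field of an `opr` instruction, where the new operator is a single extra entry. This Python sketch is only an illustration of that design; the table `OPR` and the helper names are hypothetical, and code 14 for square follows the slide.

```python
def opr_negate(s, t):
    s[t] = -s[t]
    return t


def opr_add(s, t):
    t -= 1
    s[t] = s[t] + s[t + 1]
    return t


# Existing operators, keyed by the a field of an opr instruction.
OPR = {1: opr_negate, 2: opr_add}


def opr_square(s, t):
    """New action: replace the top of stack with its square."""
    s[t] = s[t] * s[t]
    return t


OPR[14] = opr_square  # the extension is one new table entry


def run_opr(a, s, t):
    """Execute operator `a` on stack `s` with top index `t`; return the new top."""
    return OPR[a](s, t)


stack = [0, 0, 0, 0, 7]
top = run_opr(14, stack, 4)
print(stack[top])
```

The compiler side is then a matter of recognizing the new symbol and emitting `gen(opr, 0, 14)` where the operator applies.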

43
Adding Built-in Function
  • Design new indicator for symbol table
  • Put function name in symbol table
  • Parser will recognize as defined name (there will
    be no way for user to put in)
  • In term:
    if sym = ident then
      i := position(id);
      case kind of
        constant: ...
        variable: ...
        procedure: ...
        built-in: begin
          getsym;  {left paren}
          expression;
          getsym;  {right paren}
          gen(opr, 0, new-thing)
        end

44
Adding Pre-defined Function
  • Another approach
  • Put entry into symbol table
  • Make it a regular procedure
  • Initialize the code array to represent the code
    that might have been generated
  • Adding - New statement type
  • Add new syntax into body of statement (page 12)
  • Look at call as an example
  • Syntactic sugar

45
How To Start on the Project
  • Get your tokenizer working
  • This is the getsym procedure of the Pascal
    version of PL/0 distributed in class
  • Can also be done with classes in C++ and Java
  • Read in sample programs in the language you're
    trying to compile and output the tokens (with
    some other information)
  • Benefits
  • Written some programs in your language
  • Can leave the output statements for debugging
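
A first tokenizer can be very small. The Python sketch below plays the role that getsym plays in the Pascal version, under stated assumptions: the keyword list is the PL/0 one, while the operator set (including two-character `<=` and `>=`) and the category names `number`, `ident`, `op`, `keyword` are choices made for this example.

```python
import re

KEYWORDS = {"begin", "call", "const", "do", "end", "if",
            "odd", "procedure", "then", "var", "while"}

TOKEN_RE = re.compile(r"""
    (?P<number>\d+)
  | (?P<ident>[A-Za-z][A-Za-z0-9]*)
  | (?P<op>:=|<=|>=|[-+*/=\#<>(),;.])
  | (?P<space>\s+)
  | (?P<error>.)
""", re.VERBOSE)


def tokens(source):
    """Yield (token_category, lexeme) pairs for the whole input."""
    for match in TOKEN_RE.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "space":
            continue
        if kind == "error":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "ident" and lexeme in KEYWORDS:
            kind = "keyword"
        yield kind, lexeme


for kind, lexeme in tokens("begin x := y + 12 end."):
    print(kind, lexeme)
```

Printing the category next to each lexeme is the kind of debugging output that can stay in place while the parser is being built.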

46
UML Use Case Modeling
  • program actions from the user viewpoint, e.g.,
    directions for the grader of how to execute your
    program
  • begin developing different aspects of the program
    and planning its eventual actions as soon as
    possible

[Figure: use case diagram. Labels: user; command line interpreter; compilation; execution]

49
655 Project
  • Unified Software Development Process (Rational)
  • Unified Modeling Language (UML)
  • Only to begin understanding, not required to use
  • Unified Process
  • Inception Phase: begin understanding the problem
    and what you might do
  • Spiral Approach: try to have some partial version
    at each stage
  • Project Report - proposal - introductory part
  • Risk analysis: small steps rather than being
    overwhelmed, some small test programs

50
Project Activities
  • Language and program processing
  • Input (lexical) scanning
  • Grammars and recursive descent parsing
  • Compiling to a virtual machine
  • Virtual machine interpreter
  • Some other approaches to compiling

51
Project Options
  • Proposals required as a first step; you may want
    an alternative language or alternative
    techniques.
  • Individual proposal or work as a group of two.
  • Many resources on the Internet; if you want to
    use them, propose to do something that goes
    beyond them in a significant way.

52
Project Stages
  • Proposal (preliminary write-up) for project
  • By e-mail to grader
  • Simple parser for simple imperative language
  • To exercise submit process
  • Simple interpreter
  • (step that doesn't have to be turned in)
  • Final complete project
  • significant write-up electronic submission
  • No Other program in Lisp/Scheme

53
Test Input
  • Your language, your implementation, you know the
    features and restrictions, therefore
  • You supply the test input
  • Tell the grader what she should expect when
    running the tests and why you chose what you did
    (show off this or that feature, exercise an error
    message, clever program in your language)

54
655 Project Options
  • Encourage you to make this into something you'll
    enjoy and be proud of
  • Flexibility probably unusual
  • Available resources (books, Internet, etc.)
  • Acknowledge their use
  • Do significant work of your own
  • Many different backgrounds and interests
  • My experience has been at detail levels after
    others have started and thought they were close
    to finished (the very large "last 10%")

67
CIS 655 Project PL/0
  • Niklaus Wirth, Algorithms + Data Structures =
    Programs, 1976, Prentice-Hall (ISBN
    0-13-022418-9)
  • PL/0: subset of Pascal
  • Illustrated the way the Pascal P-code compiler
    was built
  • http://www.cs.rochester.edu/u/www/courses/254/PLzero/guide/guide.html
  • 655 Project (100 pts, 40% of total grade)
  • Project proposal (10 pts for turning in,
    revisable)
  • Parser
  • Intermediate step (non-graded) (but 10 pts for
    turning in)
  • Input in syntax of programming language you're
    building a compiler/interpreter for
  • Some kind of output, maybe with XML markup
  • Develop your own test cases
  • Freedom to make the project into something you'll
    enjoy and be proud of

73
Other Project Questions
  • An answer is probably in the PL/0 handout if you
    only knew where to look and what you were looking
    for
  • OK to use LEX and YACC if you build a significant
    project on top of their use
  • JLex
  • http://www.cs.princeton.edu/~appel/modern/java/JLex/
  • More Class Questions?

75
655 Su03 Project Grading
  • Pick between possible project options
  • Write proposals
  • Revisable
  • First part of final report
  • Work singly or in pairs
  • Communicate with grader
  • Written reports required
  • Demonstration (after grader has time to read
    report)
  • Submit code to be run directly by grader
  • Project approximately 40% of overall grade

80
Language definitions
  • Who must use language definitions?
  • Other language designers
  • Implementors
  • Programmers (the users of the language)
  • Formal approaches to describing syntax
  • Recognizers - used in compilers
  • Generators - approach we'll study

81
Backus Normal Form (1959)
  • Invented by John Backus to describe Algol 58
  • BNF is equivalent to context-free grammars
  • A metalanguage is a language used to describe
    another language.
  • In BNF, abstractions are used to represent
    classes of syntactic structures--they act like
    syntactic variables (also called nonterminal
    symbols)
  • e.g. ltwhile_stmtgt -gt while ltlogic_exprgt do
    ltstmtgt
  • This is a rule it describes the structure of a
    while statement

82
Syntax rules
  • A rule has a left-hand side (LHS) and a
    right-hand side (RHS), and consists of terminal
    and non-terminal symbols
  • A grammar is a finite nonempty set of rules
  • An abstraction (or non-terminal symbol) can have
    more than one RHS
  • ltstmtgt -gt ltsingle_stmtgt begin ltstmt_listgt
    end
  • Syntactic lists are described in BNF using
    recursion
  • ltident_listgt -gt ident ident, ltident_listgt
  • A derivation is a repeated application of rules,
    starting with the start symbol and ending with a
    sentence (all terminal symbols)

83
An example grammar
  • ltprogramgt -gt ltstmtsgt
  • ltstmtsgt -gt ltstmtgt ltstmtgt ltstmtsgt
  • ltstmtgt -gt ltvargt ltexprgt
  • ltvargt -gt a b c d
  • ltexprgt -gt lttermgt lttermgt lttermgt - lttermgt
  • lttermgt -gt ltvargt const

84
An example derivation
  • ltprogramgt gt ltstmtsgt
  • gt ltstmtgt
  • gt ltvargt ltexprgt
  • gt a ltexprgt
  • gt a lttermgt lttermgt
  • gt a ltvargt lttermgt
  • gt a b lttermgt
  • gt a b const

85
Derivation explanation
  • Every string of symbols in the derivation is a
    sentential form
  • A sentence is a sentential form that has only
    terminal symbols
  • A leftmost derivation is one in which the
    leftmost non-terminal in each sentential form is
    the one that is expanded
  • A derivation may be neither leftmost nor
    rightmost
  • Parse tree is a hierarchical representation of a
    derivation

86
Ambiguity Right Recursive
  • A grammar is ambiguous iff if and only if it
    generates a sentential form that has two or more
    distinct parse trees
  • If we use the parse tree to indicate precedence
    levels of the operators, we cannot have ambiguity
  • Operator associativity can also be indicated by a
    grammar
  • ltexprgt -gt ltexprgt ltexprgt const (ambiguous)
  • ltexprgt -gt ltexprgt const const (unambiguous)
  • Left recursive (left associative)(recursive
    descent will require right recursive)

87
Extended BNF (abbreviations)
  • Optional parts are placed in brackets ([ ])
  • <proc_call> -> ident [ ( <expr_list> ) ]
  • Put alternative parts of RHSs in parentheses and
    separate them with vertical bars
  • <term> -> <term> (+ | -) const
  • Put repetitions (0 or more) in braces ({ })
  • <ident> -> letter {letter | digit}

88
BNF / EBNF
  • BNF
  • <expr> -> <expr> + <term>
  •         | <expr> - <term>
  •         | <term>
  • <term> -> <term> * <factor>
  •         | <term> / <factor>
  •         | <factor>
  • EBNF
  • <expr> -> <term> {(+ | -) <term>}
  • <term> -> <factor> {(* | /) <factor>}

89
Recursive Descent Parsing
  • Parsing - constructing a parse / derivation tree
    for a given input string
  • Lexical analyzer is called by the parser
  • A recursive descent parser traces out a parse
    tree in top-down order; it is a top-down parser
  • Each non-terminal in the grammar has a subprogram
    associated with it; the subprogram parses all
    sentential forms that the non-terminal can
    generate
  • The recursive descent parsing subprograms are
    built directly from the grammar rules
  • Recursive descent parsers cannot be built from
    left-recursive grammars

90
<term> -> <factor> {(* | /) <factor>}
  • void term() {
  •   factor(); /* parse the first factor */
  •   while (next_token == ast_code ||
  •          next_token == slash_code) {
  •     lexical(); /* get next token */
  •     factor(); /* parse the next factor */
  •   }
  • }
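
The same pattern extends to the whole expression grammar. Below is a minimal Python sketch of a recursive descent recognizer, one method per non-terminal. It is an illustration rather than the course's code: the `Parser` class, the `parse` helper, and the rule `<factor> -> number | ( <expr> )` are assumptions, and the routines return the computed value only so that the result can be checked.

```python
import re


class Parser:
    """Recursive descent for
         <expr>   -> <term> {(+ | -) <term>}
         <term>   -> <factor> {(* | /) <factor>}
         <factor> -> number | ( <expr> )     (assumed for this sketch)
    """

    def __init__(self, text):
        # Lexical analysis up front: integers and single-character symbols.
        self.tokens = re.findall(r"\d+|[-+*/()]", text)
        self.pos = 0
        self.next_token = None
        self.lexical()

    def lexical(self):
        """Advance to the next token; None marks end of input."""
        if self.pos < len(self.tokens):
            self.next_token = self.tokens[self.pos]
            self.pos += 1
        else:
            self.next_token = None

    def expr(self):
        value = self.term()                      # parse the first term
        while self.next_token in ("+", "-"):
            op = self.next_token
            self.lexical()
            right = self.term()                  # parse the next term
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()                    # parse the first factor
        while self.next_token in ("*", "/"):
            op = self.next_token
            self.lexical()
            right = self.factor()                # parse the next factor
            value = value * right if op == "*" else value // right
        return value

    def factor(self):
        if self.next_token == "(":
            self.lexical()
            value = self.expr()
            if self.next_token != ")":
                raise SyntaxError("expected ')'")
            self.lexical()
            return value
        if self.next_token is not None and self.next_token.isdigit():
            value = int(self.next_token)
            self.lexical()
            return value
        raise SyntaxError(f"unexpected token {self.next_token!r}")


def parse(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.next_token is not None:
        raise SyntaxError("trailing input")
    return value


print(parse("2+3*4"), parse("(2+3)*4"), parse("8-3-2"))
```

The while loops replace the left recursion of the BNF rules, yet operands are still combined left to right, so 8-3-2 is treated as (8-3)-2.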

91
Wirth's Rules
  • B1: Reduce system of syntax graphs to a few of
    reasonable size
  • B2: Translate each graph to a procedure according
    to subsequent rules
  • B3: Sequence of elements translates to
  • begin T(S1); T(S2); ...; T(Sn) end
    or { T(S1); T(S2); ...; T(Sn) }
  • procedure TSx(); begin TS1();
    getsym(); TS2(); getsym(); end

92
<term> -> <factor> {(* | /) <factor>}
  • { Pascal comment }
    begin
      factor;
      while sym in [times, slash] do
      begin
        mulop := sym; getsym;
        factor; gen_proper_op
      end
    end
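
The Pascal routine above recognizes a term and generates code in the same pass. A minimal Python sketch of that idea follows. It assumes factors are integer literals only, and it assumes operator codes 4 (multiply) and 5 (divide) as in Wirth's published PL/0; the slides do not list those codes, and `compile_term` is a hypothetical name.

```python
def compile_term(tokens):
    """Translate a term such as ["3", "*", "4", "/", "2"] into stack-machine code.

    Returns a list of (f, l, a) instruction tuples.
    """
    code = []
    pos = 0

    def gen(f, l, a):
        code.append((f, l, a))

    def factor():
        nonlocal pos
        gen("lit", 0, int(tokens[pos]))   # push the literal onto the stack
        pos += 1                          # corresponds to getsym

    factor()
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        mulop = tokens[pos]               # remember the operator
        pos += 1                          # corresponds to getsym
        factor()                          # operand code is emitted before the operator
        gen("opr", 0, 4 if mulop == "*" else 5)   # gen_proper_op (codes assumed)
    return code


for instruction in compile_term(["3", "*", "4", "/", "2"]):
    print(instruction)
```

Because the operator is saved in mulop and emitted only after the second factor, the generated code comes out in postfix order, which is what a stack machine needs.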

93
Blocks and static scoping
  • Blocks are different from sequences of statements
    or compound statements
  • Blocks can include declarations
  • Sort of like a single-use subprogram, used and
    defined here
  • Where can blocks appear?
  • Ada: almost anywhere a statement could be
  • Pascal: only as bodies of procedures
  • Java: inner classes

94
Data specific to a procedure
  • To be able to return from call
  • Program address of its call (return address)
  • Address of data segment of caller
  • Keep in data segment of procedure as
  • RA (return address), DL (dynamic link)
  • Location of variables
  • Relative address only (since memory is allocated
    dynamically)
  • Displacement off base address of appropriate data
    segment (locally via the B register, or by
    descending the chain of static links)
  • What does static scoping mean here?

95
Example of Static Scoping
  • void a {
      local variable one
      void b {
        local variable two
        void c {
          local variable three
          // beginning of code for c
          reference one, two
          call b
          // end of c
        }
        // beginning of code for b
        reference one, two
        call c
        // end of b
      }
      // beginning of code for a
      call b
      // end of a
    }
  • a -> b -> c -> b

96
Example of static scoping
  • In a, one is local
  • In b, two is local
  • In b, one is a single static level out
  • In c, three is local
  • In c, two is a single static level out
  • In c, one is two static levels out
  • Then c calls b
  • In b, one is still a single static level out
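
The static level differences listed above depend only on where things are declared, not on who called whom. A small Python sketch of that computation follows; the two dictionaries and the `level_difference` function are hypothetical names introduced for illustration, with a at level 0, b at level 1 and c at level 2.

```python
# Static nesting level of each procedure body, and of the block declaring each variable.
LEVEL_OF = {"a": 0, "b": 1, "c": 2}
DECLARED_AT = {"one": 0, "two": 1, "three": 2}


def level_difference(procedure, variable):
    """Number of static links to follow from `procedure` to reach `variable`."""
    difference = LEVEL_OF[procedure] - DECLARED_AT[variable]
    if difference < 0:
        # Inner declarations are hidden from outer contexts.
        raise NameError(f"{variable} is not visible in {procedure}")
    return difference


for procedure, variable in [("a", "one"), ("b", "one"), ("c", "two"), ("c", "one")]:
    print(procedure, variable, level_difference(procedure, variable))
```

Since the call chain a -> b -> c -> b never enters the calculation, one is a single level out in b on both activations of b.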

97
Block recognition processing
  • Block(level, symbolTableStartingIndex)
  • Page 13, left
  • <block> -> <const_decl> <var_decl>
    <proc_decl> <statement_body>
  • <proc_decl> -> procedure <name> <block>
  • Recognize inner block
  • Block(currentLevel+1, currentSymbolTableIndex)
  • Jump around declarations
  • tx0 := tx; table[tx0].adr := cx; gen(jmp,0,0);
    ...
    code[table[tx0].adr].a := cx;
    table[tx0].adr := cx;
    statement(); gen(opr,0,0) {return}
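
The "jump around declarations" step is a backpatch: the jump is emitted before its target is known and fixed up once the nested procedures have been compiled. Here is a minimal Python sketch under simplifying assumptions: `compile_block` is a hypothetical function, statements are passed in already translated, and the storage-allocating `int` instruction is left out.

```python
def compile_block(code, nested_blocks, statements):
    """Append the code for one block to `code`.

    nested_blocks: list of (nested_blocks, statements) pairs for inner procedures.
    statements:    already-translated instructions for this block's body.
    Returns the address at which this block's own statements start.
    """
    jump_at = len(code)
    code.append(["jmp", 0, 0])            # target not known yet
    for inner_nested, inner_statements in nested_blocks:
        compile_block(code, inner_nested, inner_statements)
    start = len(code)
    code[jump_at][2] = start              # backpatch: skip over the inner procedures
    code.extend(list(instruction) for instruction in statements)
    code.append(["opr", 0, 0])            # return
    return start


program = []
main_start = compile_block(
    program,
    nested_blocks=[([], [("lit", 0, 7)])],   # one inner procedure
    statements=[("lit", 0, 1)],
)
for address, instruction in enumerate(program):
    print(address, instruction)
```

The first instruction ends up jumping past the inner procedure's code to the main block's statements.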

98
Symbol table and static scope
  • Variable declaration: storage allocated by
    incrementing DX (data index) by 1
  • Initially DX is 3 to allocate space for the
    block mark (RA, DL, and SL)
  • Symbol table (table)
  • enter - enter object into table
  • Nested in block, which determines static scoping
  • Recursive calls make table act like a stack
  • position - find identifier id in table
  • Linear search backward
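
A minimal Python sketch of a table with this behaviour follows. The `SymbolTable` class and its `restore` method are assumptions made for illustration: in the Pascal original the table index is simply passed by value to block, so the inner entries are abandoned when block returns.

```python
class SymbolTable:
    """Symbol table searched backward, so inner declarations hide outer ones."""

    def __init__(self):
        # Index 0 is reserved so that position() can return 0 for "not found".
        self.table = [None]

    @property
    def tx(self):
        """Index of the most recently entered symbol."""
        return len(self.table) - 1

    def enter(self, name, kind, level):
        self.table.append((name, kind, level))

    def position(self, name):
        """Linear search backward; returns 0 if the name is absent."""
        for index in range(self.tx, 0, -1):
            if self.table[index][0] == name:
                return index
        return 0

    def restore(self, tx0):
        """Drop everything entered after index tx0 (leaving a nested block)."""
        del self.table[tx0 + 1:]


table = SymbolTable()
table.enter("x", "variable", 0)
tx0 = table.tx                      # remember the table index before the inner block
table.enter("x", "variable", 1)     # inner x hides outer x
table.enter("y", "variable", 1)
inside = table.position("x")
table.restore(tx0)                  # leave the inner block
outside = table.position("x")
print(inside, outside, table.position("y"))
```

Searching from the newest entry gives the innermost declaration, and restoring the index on exit is what makes the table act like a stack.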

99
Blocks and Scoping
  • Nesting of blocks implements scoping
  • Restoring symbol table pointers makes symbol
    table work like a stack
  • Inner definitions lost to outer contexts
  • Idea: make symbol table work like a tree (one
    branch along a tree looks like a stack)

100
PL/0 Virtual Machine
  • Section 5.10 (page 6 of handout)
  • Stack machine: primary data store is stack
  • push, pop, insert or retrieve from within
  • Operations on top of stack (add, test, etc.)
  • Program store: array named code
  • Unchanged during interpretation
  • I: instruction register
  • P: program address register
  • Data store: array named S (stack)

101
Stack of PL/0 machine (Fig. 5.7)
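  • [Figure 5.7: stack of the PL/0 machine, showing
    data segments for A, B, C, and B again, each with
    a block mark (DL, RA, SL) and local vars, the
    dynamic link and static link chains, and the B
    and T registers]
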
102
Data specific to a procedure (again)
  • To be able to return from call
  • Program address of its call (return address)
  • Address of data segment of caller
  • Keep in data segment of procedure as
  • RA (return address), DL (dynamic link)
  • Location of variables
  • Relative address only (since memory is allocated
    dynamically)
  • Displacement off base address of appropriate data
    segment (locally via the B register, or by
    descending the chain of static links)
  • What does static scoping mean here?

103
PL/0 code generation
  • (page 7)
  • Addresses are generated as pairs of numbers
    indicating the static level difference and the
    relative displacement within a data segment.
  • But how does the compiler figure this out?
  • PL/0 code

104
PL/0 machine commands
  • LIT - load numbers (literals) onto the stack
  • LOD - fetch variable values to top of stack
  • STO - store values at variable locations
  • CAL - call a subprogram
  • INT - allocate storage by incrementing stack
    pointer (T)
  • JMP - transfer of control
  • (new program address - P)
  • JPC - conditional transfer of control
  • OPR - arithmetic and relational operators

105
More on PL/0 code generation
  • fct = (lit, opr, lod, sto, cal, int, jmp, jpc);
  • instruction = packed record
      f: fct;          {function code}
      l: 0 .. levmax;  {level}
      a: 0 .. amax     {displacement address}
    end;
  • procedure gen(x: fct; y, z: integer);
  • begin
      code[cx].f := x; code[cx].l := y;
      code[cx].a := z; cx := cx + 1
  • end;
  • procedure listcode; var i: integer;
  • begin {list code generated for this block}
      for i := cx0 to cx-1 do
        writeln(i, mnemonic[code[i].f]:5,
                code[i].l:3, code[i].a:5)
  • end;

106
PL/0 Interpreter
  • t := 0; b := 1; p := 0; {initialize registers}
    s[1] := 0; s[2] := 0; s[3] := 0; {initialize memory}
    repeat {instruction fetch loop}
      i := code[p]; p := p + 1;
      with i do case f of {decode instruction}
        lit: begin t := t + 1; s[t] := a end;
        opr: case a of
               1: s[t] := -s[t];
               2: begin t := t - 1;
                    s[t] := s[t] + s[t+1] end
             end;
        jmp: p := a;
        sto: begin s[base(l)+a] := s[t];
               writeln(s[t]); t := t - 1 end;
        cal: begin {generate new block mark}
               s[t+1] := base(l); s[t+2] := b;
               s[t+3] := p; b := t + 1; p := a
             end
      end
    until p = 0 {not a good way to end}
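
The loop above can be tried out in a few lines of Python. The sketch below is an illustration, not the handout's code: instructions are (f, l, a) tuples, only operators 0 (return), 1 (negate) and 2 (add) are handled, and the behaviour of return, lod, int and jpc follows Wirth's published PL/0 rather than anything shown on the slide. As on the slide, sto also reports the stored value, here by appending it to a list.

```python
def interpret(code, stack_size=100):
    """Run PL/0-style stack code; returns the list of values stored by sto."""
    s = [0] * stack_size        # data store; s[1..3] is the main block mark
    t, b, p = 0, 1, 0           # top of stack, base of current segment, program address
    output = []

    def base(level):
        """Find the base `level` static levels out by following static links."""
        b1 = b
        while level > 0:
            b1 = s[b1]
            level -= 1
        return b1

    while True:
        f, l, a = code[p]       # instruction fetch
        p += 1
        if f == "lit":
            t += 1
            s[t] = a
        elif f == "opr":
            if a == 0:          # return: discard segment, restore p and b
                t = b - 1
                p = s[t + 3]
                b = s[t + 2]
            elif a == 1:        # negate
                s[t] = -s[t]
            elif a == 2:        # add
                t -= 1
                s[t] = s[t] + s[t + 1]
            else:
                raise ValueError(f"operator {a} not implemented in this sketch")
        elif f == "lod":
            t += 1
            s[t] = s[base(l) + a]
        elif f == "sto":
            s[base(l) + a] = s[t]
            output.append(s[t])
            t -= 1
        elif f == "cal":        # generate new block mark: SL, DL, RA
            s[t + 1] = base(l)
            s[t + 2] = b
            s[t + 3] = p
            b = t + 1
            p = a
        elif f == "int":
            t += a
        elif f == "jmp":
            p = a
        elif f == "jpc":
            if s[t] == 0:
                p = a
            t -= 1
        else:
            raise ValueError(f"unknown instruction {f}")
        if p == 0:              # same termination test as the slide
            break
    return output


demo = [
    ("jmp", 0, 5),   # 0: jump around the procedure
    ("int", 0, 3),   # 1: procedure: allocate its block mark
    ("lit", 0, 21),  # 2
    ("sto", 1, 3),   # 3: store into x, one static level out
    ("opr", 0, 0),   # 4: return
    ("int", 0, 4),   # 5: main: block mark plus variable x
    ("cal", 0, 1),   # 6: call the procedure
    ("lod", 0, 3),   # 7: load x
    ("lod", 0, 3),   # 8: load x again
    ("opr", 0, 2),   # 9: add
    ("sto", 0, 3),   # 10: x := x + x
    ("opr", 0, 0),   # 11: return from main, which sets p to 0
]
print(interpret(demo))
```

The main block mark is all zeros, so the final return loads 0 into p and the loop stops, which is the termination condition the slide calls "not a good way to end".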

107
Project Virtual Machine
  • Can use the design of the PL/0 one
  • Operations in PL/0 are integer oriented; you
    probably want to add to this
  • Can also use other machine designs
  • Hybrid approach: compiles to intermediate form,
    then interprets that
  • Direct interpretation possible if clearly
    proposed
  • Idea: add output whenever computation done
  • Idea: build some messages in that could be output
    with new opr instructions

108
Programs writing programs
  • Compilers basically take programs in one language
    and write programs in another
  • Compiler-compilers take grammars and write
    compilers (second level program writers)
  • Translation of EBNF straightforward
  • terminal -> specific recognizer
  • non-terminal -> call to recognizing routine
  • alternatives -> explicit initial characters
  • loops -> graph following algorithms
  • Lex, Yacc, other compiler tools

109
Static Semantics
  • Other information about the language not
    specified with the BNF
  • Identifier length
  • Maximum integer value
  • Other restrictions on your compiler
  • Symbol table size
  • Code array size
  • Specify these in your description of your
    language processor
  • Recognize the restrictions you've implied

110
Test input
  • Your language, your implementation, you know the
    features and restrictions, therefore
  • You supply the test input
  • Tell the grader what she should expect when
    running the tests and why you chose what you did
    (show off this or that feature, exercise an error
    message, clever program in your language)

111
Submit assignment goals
  • keeping you on track
  • checking out the submit process
  • This is not a separately graded assignment, but
    do do it within the next week or so.
  • The second part of the assignment is about
    checking out the submit process. I have set
    things up like I've done before, so I expect the
    following will work for most of you very easily
    but there will be a few problems, so get them out
    of the way as soon as possible.

112
Adding to PL/0
  • Predefined variable names
  • New operator
  • Built-in function
  • Pre-defined function
  • New statement type

113
Unstated Assumptions
  • Input program read top-to-bottom, left-to-right,
    with no backtracking
  • Things declared before they are used
  • No redefining at same level
  • Inner declarations hidden by nesting
  • Inner can locally hide outer declarations

114
Predefined (Variable) Names
  • procedure block has 2 parameters
  • lev (the nesting level for the block)
  • tx (starting index for the symbol table)
  • The nested procedure enter is what puts symbols
    (variable names) into the symbol table
  • Right-side page 14 of handout
  • Initialize symbol table
  • Make initial call of block with a non-zero table
    index
  • Can initialize or do other things not normal in
    the user visible input language

115
New Operator
  • Add to getsym to recognize new symbol
  • Look in condition, expression, term, factor
  • Is new operator parallel to one of those
    operators?
  • Basically another option in code generation
  • If not like existing operators, add new syntactic
    construct.
  • New action: add to PL/0 machine
  • Generate new instruction: gen(opr,0,14)
    (square)
  • Implement new instruction functionality (page 14,
    left-side): 14: begin s[t] := s[t]*s[t] end
  • Add it into list of mnemonics
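
As a rough Python illustration of the machine-side change, the sketch below adds operator 14 next to two existing ones. The `execute_opr` function and the list-based stack are assumptions for this sketch. Only the number 14 and its squaring behaviour come from the slide.

```python
def execute_opr(s, t, a):
    """Apply operator `a` to stack `s` whose top is at index `t`; return the new top."""
    if a == 1:                  # negate
        s[t] = -s[t]
    elif a == 2:                # add
        t -= 1
        s[t] = s[t] + s[t + 1]
    elif a == 14:               # new operator: square the top of stack
        s[t] = s[t] * s[t]
    else:
        raise ValueError(f"operator {a} not implemented in this sketch")
    return t


stack = [0, 3, 4]               # index 0 unused, as in the PL/0 machine
top = execute_opr(stack, 2, 14) # square the 4
top = execute_opr(stack, top, 2)  # then add it to the 3
print(stack[top])
```

Squaring is a unary operator, so unlike add it leaves the stack pointer where it was.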

116
Built-in Function
  • Design new indicator for symbol table
  • Put function name in symbol table
  • Parser will recognize as defined name (there will
    be no way for user to put in)
  • In term:
    if sym = ident then
      i := position(id);
      case kind of
        constant:
        variable:
        procedure:
        built-in:
          begin
            getsym; {left paren}
            expression;
            getsym; {right paren}
            gen(opr, 0, new-thing)
          end

117
Pre-defined function
  • Another approach
  • Put entry into symbol table
  • Make it a regular procedure
  • Initialize the code array to represent the code
    that might have been generated

118
New statement type
  • Add new syntax into body of statement (page 12)
  • Look at call as an example

121
What is Computer Science?
  • Lots of parts and specialties
  • Core of computer science
  • How programs developed
  • How programs execute
  • Programming and software engineering
  • 655 (programming languages) is central
  • Programming
  • How would you illustrate basic programming?
  • Really, how would you illustrate basic
    programming?
  • HTML for formatting text
  • JavaScript for beginning programming

122
What is 655 Prog. Languages About?
  • Compare programming languages, but how?
  • Vertical: Fortran, Cobol, Lisp, C, etc.
  • Horizontal: loops, selection, subprograms
  • Topics: object-oriented, event handling,
    concurrency
  • Processing: syntax, semantics, compiler basics
  • Note: textbook and course have some of all of
    these
  • Fundamental programming language concepts (my
    idea)
  • Divide a program into pieces (e.g., subprograms,
    types, threads, tasks, classes, packages,
    modules, components)
  • Controlled modification of the pieces (e.g.,
    compilers, templates)
  • Have pieces communicate (during development,
    building, and execution e.g., names, linkages,
    parameters)
  • Combine pieces into program (e.g., loaders,
    building tools)
  • Know the pieces will work together (e.g., design,
    reviews, proofs)
  • Decisions, decisions, decisions, . . .

123
Goal of CIS 655
  • New insight into the programming process and the
    languages and concepts we use.
  • Knowledge of where we are and why.
  • Anticipate and prepare for future programming.

124
CIS 655 Warning
  • Adult Content Warning: F-word of programming
    languages: FORTRAN; A-word: ALGOL
  • LISP: Lots of Irritating Silly Parentheses
  • Sesame Street's Cookie Monster: C is for
    programming, that's good enough for me.
  • COBOL: Committee Originated Boloney
  • America Demanded Ada
  • Java: Just Another Version of Ada
  • C (C)

125
Traditional Content of 655
  • Languages you didn't see elsewhere (1 hr courses)
  • Compiler basics (used to be separate course)
  • History of computing (still some of that)
  • Classic languages
  • Lisp (Scheme), Algol (Pascal, Ada), Snobol, etc.
  • Classic computing approaches
  • Recursion
  • Concurrency
  • New topics
  • Programming with interfaces
  • Distributed computing, web applications, Web
    Services

126
Now 2005
  • Programming languages have changed mainly because
    of capabilities in execution environment
  • Graphical User Interfaces
  • Networking inside the execution environment
  • Operating system concurrency in prog. language
  • Classes (objects) as an extension of types
  • Exception handling
  • Security (probably coming soon)
  • Some changes for program structuring
  • Packages
  • Generics and templates
  • Proof conditions
  • Components ?

127
Programming Languages
  • Teaching at Old Dominion University
  • Usual course
  • Lisp
  • Snobol
  • APL
  • Algol / Pascal
  • Summer 1979
  • LISP
  • MUMPS
  • JOVIAL
  • CMS-2
  • Ada
  • How I got started with Ada (then for 16 years)

128
655 Summary Su05
  • PL/0: compiler for traditional languages
  • Lisp and its interpreter: similarities
  • XML/XSLT: string matching (declarative)
  • Patterns in programming
  • High-level: constructor, factory, façade,
    singleton
  • Low-level: loop, selection, exit, exception
  • Parameterized types: generics, templates
  • Event-oriented programming, concurrency,
    distributed
  • Environments: Eclipse and its extensions
  • Next generation programs (Oracle HTML DB)

129
Plan and Expectations for Su05
  • Classical languages and topics
  • History, comparisons, Algol (and its influence)
  • Basic control structures, data structures, types
  • Binding, parameter passing
  • Lisp and functional programming
  • Compilation basics
  • Instead of a separate course
  • Object-oriented programming, UML
  • Java, C, J2EE, .NET
  • Concurrency, event driven programming
  • Distributed Computing
  • RMI, CORBA, RPC, SOAP (XML Protocol)
  • XML, XSLT, XML Schemas, etc.

130
Summer Quarter 2005 Plan
  • Trying new things in a different order
  • Implementation of programming languages
  • Syntax of programming languages prerequisites
  • Possible Project - PL/0 handout (understanding
    large program)
  • New style project assignment (presentations):
    start early
  • Traditional topics: middle part of quarter
  • New stuff, OO, concurrency, event driven,
    distributed computing platforms, Java, C
  • Course Expectations
  • Broad perspective on computing
  • How programs are developed and executed
  • Knowledge of alternatives in programming
  • Information about traditional language processing
    concepts
  • Some specifics about programming languages
    considered in the common knowledge domain of
    computer scientists
  • Prepare for 50 more years of programming

131
Traditional 655 Programming Project(s)
  • Hybrid compiler / interpreter for small language
    with C/Java-style syntax
  • Transform a high-level language to low-level form
  • Reasonable use of tools and software encouraged
  • Primarily individual, some 2-person groups
  • Build on what you already know (e.g., 560)
  • Project done in stages
  • Proposals in class, demonstrations, documentation
  • Develop your own appropriate tests
  • Software to CIS computers using submit command
  • Possible alternatives may be proposed
  • No Perl implementations (use C or Java)

132
How to Use the Textbook
  • The textbook and web resources are very
    informative and cost effective
  • Read, question, summarize
  • Learn from reading
  • Work on it in this class
  • Bring book to class on days suggested
  • Relate the author's comments to class
  • Formulate your own conclusions
  • Test against discussions, experience, and
    experiments
  • Learning from reading is essential to being a
    successful computer scientist and leader

133
How 655 with Mathis Class Works
  • Mathis usually talks a lot
  • This term I want more student interaction
  • Students should learn a lot
  • I've tried to structure topics of interest
  • Students investigate and share with class
  • Goal: enable students to learn without the
    teacher
  • I want to help you in this direction, not force
    it on you by default
  • There will be a lot of change in the next 40
    years
  • Information on web site is frequently
    updated (primary method of class communication)

134
Do I have to come to class?
  • I don't usually take attendance, but . . . I
    notice who is here.
  • Class attendance is more important than students
    seem to realize.
  • I think the topics I talk about are
    important. Lots of new material not covered in
    text.
  • Class attendance is the most efficient way to
    learn the information in the text and course; it
    is your opportunity to ask questions.
  • The projects and tests are discussed in class.
  • Office hours and e-mail are important, but not a
    substitute for class attendance.
  • I'll miss you.

135
Relationship of 655 to Other Courses
  • 560 (prerequisite)
  • Classical system software and group projects
  • 625 (prerequisite)
  • 660 Operating Systems (when I have tau