Title: Pattern Matching and Recognition 689 - Special Topics on Advanced Compiler Technologies
1Pattern Matching and Recognition689 - Special
Topics on Advanced Compiler Technologies
- Alin Jula
- March,19th 2003
2Motivation - Why do we care ?
- Abstract concepts are simpler to deal with and
closer to the human language - Enables implementation substitution
- e.g. recurrences can be replaced
- Eases maintenance process
- e.g. Polaris - 600,000 lines of code
3General Issues
- Pattern matching and recognition Semantic
compiler
Highest Level
Human languages English, Romanian, Spanish, etc
Higher - Level codes
STAPL
CASE Instruments
C, Java
Low - Level codes
Fortran,C
assembly
4Example
- Given a set of uranium isotopes with random
energies and velocities, calculate their
coordinates (energy, velocities, space
coordinates,etc) after 3 seconds of interaction - Equivalent with thousands or maybe hundred of
thousands of lines of code
5Pattern Matching and Recognition
Program code
6Pattern Matching and Recognition
- Program Understanding
- (Low to High Level)
- Code Generation and Replacement
- (High to Low Level)
7 Quilici - Overview
- Translates C into C
- Input C code in Abstract Syntax Tree form
- Output Programming plans ( then C )
- Analysis
- Bottom-Up on the Code
- Top-Down on the Programming Plans
8Quilici - Example
Equivalent C code
C code
9 Quilici - Plan Library
- A Library of patterns (programming plans)
- Extends an existing Library (from Andersen
Consulting for understanding Cobol).
10 Quilici - Programming Plan
- Programming Plan
- definition list of attributes
- recognition rule(s)
- components of the plan
- constraints on the components
11 Quilici -Programming Plan
Read Values within a while loop
12Quilici - Library Organization
- Plan indices
- Connects plans hierarchically
- Objective narrows the search
- Specialization Constraints
- Stores only the ? constraints
- Objective eliminates commonalties
- Implied Plans
- Objects recognized form a particular plan
- Objective avoids matching them with the code,
their existence is implied from the recognized
plans
13 Quilici - The Algorithm
14 Quilici Example
15Parallel Code
16 BH - Overview
- Parallelizes FORTRAN code using pattern matching
- Input (FORTRAN code - abstract syntax tree)
- Output ( parallel FORTRAN code )
- Analysis
- Patterns are found in the code
- User replaces these patterns with an already
parallelized implementation
17 BH - Overview
18 BH- Pattern Language
- Describes
- programming concepts
- domain concepts
- The patterns are written in the pattern language
before pattern recognition process
19BH - Pattern Language Syntax
20 BH - Patterns Examples
21 BH - Pattern Classification
- Hierarchical approach
- Classification of patterns
- base-concept (atomic) - e.g. swap
- intermediate-concept - e.g. pivoting
- domain-concept - e.g. Gaussian elimination
- Dependencies between the patterns are stored in
the library
22 BH - Pattern Classification
23 BH - Pattern Recognition
- The user is asked to specify the domain concepts
as narrow as possible - The AST is then compared against the Pattern
Library
24 BH - Domain Patterns
25 BH - Algorithm
- Prune the space of patterns to search
- Base concepts are identified
- Intermediate patterns are built up from the base
ones - Domain concepts are built up from intermediate
ones
- One Pass Search !
- Fast
- not Effective
26 BH - Algorithm
27 BH - Algorithm (cont.)
28(No Transcript)
29 K - Overview
- Parallelizes FORTRAN 77 and C code using pattern
matching - Input (FORTRAN 77 and C code - abstract syntax
tree) - Output ( parallel C code )
- Analysis
- Graph based
- Hierarchical approach
30 K-Library
- Collected around of 150 patterns for scientific
numeric codes - Hierarchical approach
- e.g. matrix-multiplication loop over a dot
product - Patterns characterized based on the depth of the
loop nest - e.g. scalar 0. Dot product1,matrix
multiplication2, etc
31 K - library
32 K- Library Organization
- Organized as a Pattern Hierarchy Graph(PHG)
- G(V,E) - V is a set of patterns, E set of
dependencies between patterns - m dependent of the trigger patterns mi, there is
an edge between (m,mi) - Pattern recognition becomes a path finding
program in PHG
33K - Library Organization
34 K - Algorithm
35 K - Paramat
- ParamatPARallelize Automatically by pattern
MATching - Tool for pattern recognition and algorithm
substitution - Automatic Parallelization Tool
36K - Paramat
37 K -Paramat
- Algorithm replacement (recurrences)
- Data distribution
- Run time prediction
38(No Transcript)
39 W Y - Overview
- Program Understanding as Constraint Satisfaction
Problem - a set of variables Xi
- a set of values for each Xi, Dom (Xi)
- set of constraints - permissible subsets to
variable - Constraints implemented in Prolog
- Searching algorithm - backtracking
40 W Y - Overview
41 W Y - Algorithm
42 W Y - Results
43(No Transcript)
44 PAP -Overview
- Parallelizable Algorithmic Patterns
- Graphical tool - permits vizualization of the
recognized concepts, together with the
implementation within the program - Integrated into Vienna Fortran Compilation System
45 PAP - Recognizer
- Performs a hierarchical parsing driven by concept
recognition rules (implemented in Prolog) - Concept
- its compositional hierarchy (set of composing
subcomponents) - Relationships and constraints among composing
components
46 PAP - Recognizer
- PAP builds an Abstract Program Representation(APR)
from a Basic Program Representation (BPR)
(initial code ) - Base level of representation - Program Dependence
Graph (PDG) - nodes - statements
- edges
- control flow ( labeled with T/F)
- data dependence (labeled with variable
identifier) - The APR is a hierarchical PDG
47 PAP - APR
48 PAP - Traits
- PAP deals with
- Program variation - various implementations for
the same algorithm - Delocalization - implementation of a concept
throughout the code - Overlapping implementations - implementations of
two or more concepts are merged
49 PAP - Output example
50(No Transcript)
51BFV - Overview
- Code Generator
- Graphical Tool
- Input
- Design Patterns
- Output
- C code.
52BFV
53BFV
54BFV
55Parallel Code
56M Y - Algorithmic equivalence
- Let P and P be two subprograms. Define P and P
to be algorithmically equivalent iff one can be
obtained from the other one through following
transformations - Rename variables (bijective function-mapping)
- Reorder the statements without violating the
semantics - Permute the operands of any commutative operator
57References
- Quilici "An Opportunistic, Memory-Based
Approach to Recognizing Programmings Plans" ,
Alex Quilici - BH "A Pattern Matching Approach for reusing
Software Libraries in Parallel Systems" ,S.
Bhansali, J.R. Hagemeister - K "Pattern-Driven Automatic Parallelization" ,
Christoph Kessler - WY "Program Understanding as Constraint
Satisfaction Representation and Reasoning
Techniques" , S. Woods, Q. Yang - PAP"PAP Recognizer a Tool for Automatic
Recognition of Parallelizable Patterns" , B. Di
Martino, G. Iannello - BFV "Automatic Code Generation from Design
Patterns, IBM Systems Journal, Vol. 35, No. 2,
Frank Budinsky, Marilyn Finnie, Patsy Yu,John
Vlissides - MY "Automatic Algorithm Recognition and
Replacement , Robert Metzger and Zhaofang Wen