Title: Synthesis of Embedded Software for Reactive Systems
 1Synthesis of Embedded Software for Reactive Systems
- Jordi Cortadella
 - Universitat Politècnica de Catalunya, Barcelona
 - Joint work with Robert Clarisó, Alex Kondratyev, Luciano Lavagno, Claudio Passerone and Yosinori Watanabe (UPC, Cadence Berkeley Labs, Politecnico di Torino)
  2System Design
Platform provider (e.g. semiconductor company)
Methodology for Platform-based System Design
External IP provider (e.g. software modem)
Internal IP provider (e.g. MPEG2 engine)
 Requirements specification, testbench
 Functional and performance models (with agreed interfaces and abstraction levels)
 3Metropolis Project
- Goal: develop a formal design environment
 - Design methodologies: abstraction levels, design problem formulations
 - EDA: formal methods for automatic synthesis and verification
 - A modeling mechanism: heterogeneous semantics, concurrency
- Participants
 - UC Berkeley (USA): methodologies, modeling, formal methods
 - CMU (USA): formal methods
 - Politecnico di Torino (Italy): modeling, formal methods
 - Universitat Politècnica de Catalunya (Spain): modeling, formal methods
 - Cadence Berkeley Labs (USA): methodologies, modeling, formal methods
 - Philips (Netherlands): methodologies (multi-media)
 - Nokia (USA, Finland): methodologies (wireless communication)
 - BWRC (USA): methodologies (wireless communication)
 - BMW (USA): methodologies (fault-tolerant automotive controls)
 - Intel (USA): methodologies (microprocessors)
 
  4Metropolis Framework
Architecture Specification
Design Constraints
- Metropolis Infrastructure
 - Design methodology
 - Meta model of computation
 - Base tools
  - Design imports
  - Meta model compiler
  - Simulation
 
  5Outline
- The problem
 - Synthesis of concurrent specifications for sequential processors
 - Compiler optimizations across processes
- Previous work: Dataflow networks
 - Static scheduling of SDF networks
 - Code and data size optimization
- Quasi-Static Scheduling of process networks
 - Petri net representation of process networks
 - Scheduling and code generation
- Open problems
 
  6Embedded Software Synthesis
- Specification: concurrent functional netlist (Kahn processes, dataflow actors, SDL processes, ...)
- Software implementation: (smaller) set of concurrent software tasks
- Two sub-problems
 - Generate code for each task
 - Schedule tasks dynamically
- Goals
 - minimize real-time scheduling overhead
 - maximize effectiveness of compilation
 
  7Environmental controller
Temperature
Humidity
ENVIRONMENTAL CONTROLLER
AC
Dehumidifier
Alarm 
 8Environmental controller
TEMP-FILTER
float sample, last;
last = 0;
forever {
  sample = READ(TSENSOR);
  if (sample - last > DIF) {
    last = sample;
    WRITE(TDATA, sample);
  }
}
TSENSOR
HSENSOR
TEMP FILTER
HUMIDITY FILTER
HDATA
TDATA
CONTROLLER
AC-on
DRYER-on
ALARM-on 
 9Environmental controller
TEMP-FILTER
float sample, last;
last = 0;
forever {
  sample = READ(TSENSOR);
  if (sample - last > DIF) {
    last = sample;
    WRITE(TDATA, sample);
  }
}
TSENSOR
HSENSOR
TEMP FILTER
HUMIDITY FILTER
HDATA
TDATA
HUMIDITY-FILTER
float h, max;
forever {
  h = READ(HSENSOR);
  if (h > MAX) WRITE(HDATA, h);
}
CONTROLLER
AC-on
DRYER-on
ALARM-on 
 10Environmental controller
CONTROLLER
float tdata, hdata;
forever {
  select(TDATA, HDATA) {
    case TDATA:
      tdata = READ(TDATA);
      if (tdata > TFIRE) WRITE(ALARM-on, 10);
      else if (tdata > TMAX) WRITE(AC-on, tdata-TMAX);
    case HDATA:
      hdata = READ(HDATA);
      if (hdata > HMAX) WRITE(DRYER-on, 5);
  }
}
TSENSOR
HSENSOR
TEMP FILTER
HUMIDITY FILTER
HDATA
TDATA
CONTROLLER
AC-on
DRYER-on
ALARM-on 
 11Environ.
[Figure: execution timeline of the processes under the operating system]
Tsensor: T-FILTER wakes up, executes, sleeps
Hsensor: H-FILTER wakes up, executes (sends data to HDATA), sleeps
CONTROLLER wakes up, executes (reads data from HDATA)
. . .
 12Compiler optimizations
- Instruction level: a = b/16 → a = b >> 4
- Basic blocks: common subexpressions, copy propagation
- Intra-procedural (across basic blocks): loop invariants, induction variables
- Inter-procedural: inline expansion, parameter propagation
- Inter-process (?): channel optimizations, OS overhead reduction

Each optimization enables further optimizations at lower levels
 13Partial evaluation (example)
Specification: subsets(n,k) = n! / (k! * (n-k)!)
________________________________________________
int subsets (int n, int k) { return fact(n) / (fact(k) * fact(n-k)); }
int pairs (int n) { return subsets (n,2); }
...
print (pairs(x+1))
...
print (pairs(5))
...
Partial evaluation (compiler optimizations) 
 14Partial evaluation (example)
Specification: subsets(n,k) = n! / (k! * (n-k)!)
________________________________________________
int subsets (int n, int k) { return fact(n) / (fact(k) * fact(n-k)); }
int pairs (int n) { return subsets (n,2); }
...
print ((x+1)*x / 2)
...
print (pairs(5))
...
Partial evaluation (compiler optimizations) 
 15Partial evaluation (example)
Specification: subsets(n,k) = n! / (k! * (n-k)!)
________________________________________________
int subsets (int n, int k) { return fact(n) / (fact(k) * fact(n-k)); }
int pairs (int n) { return subsets (n,2); }
...
print ((x+1)*x / 2)
...
print (10)
...
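The specialization on these slides can be checked with a small sketch. It is written in Python rather than the C-like notation of the slides, and `pairs_specialized` is a hypothetical name for the residual code a partial evaluator could produce once k = 2 is known.

```python
from math import factorial


def subsets(n: int, k: int) -> int:
    """Number of k-element subsets of an n-element set: n! / (k! * (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


def pairs(n: int) -> int:
    """General version: calls subsets with the constant k = 2."""
    return subsets(n, 2)


def pairs_specialized(n: int) -> int:
    """Residual code after specializing subsets for k = 2: n * (n-1) / 2."""
    return n * (n - 1) // 2


# The specialized code agrees with the general code for every tested n >= 2.
assert all(pairs(n) == pairs_specialized(n) for n in range(2, 50))

# With a constant argument the whole call folds to a constant.
PAIRS_OF_5 = pairs_specialized(5)
```

Substituting n = x+1 into `pairs_specialized` gives (x+1)*x / 2, the expression printed on the slide.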
 16Inter-process partial evaluation
forever { n = read (A); write (B, n); write (C, n-2); write (D, 2); }
A
forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
n
x!
x!
x!
H
pairs (n) 
 17Inter-process partial evaluation
forever { n = read (A); write (B, n); write (C, n-2); write (D, 2); }
A
forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
x!
x!
x!
H
No chances for optimization 
 18Inter-process partial evaluation
forever { n = read (A); write (B, n); write (C, n-2); write (D, 2); }
A
forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
x!
x!
x!
H
2...2
2...2 
 19Inter-process partial evaluation
forever { n = read (A); write (B, n); write (C, n-2); write (G, 2); }
A
forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
x!
x!
H
2...2
2...2 
 21Inter-process partial evaluation
forever { n = read (A); write (B, n); write (C, n-2); write (G, ); }
A
forever { x = read (E); y = read (F); read (G); write (H, x/(y*2)); }
x!
x!
H
-  Copy propagation across processes 
 -  Channel G only synchronizes (token available)
 
  22Inter-process partial evaluation
forever { n = read (A); write (B, n); write (C, n-2); write (G, ); }
A
forever { x = read (E); y = read (F); read (G); write (H, x/(y*2)); }
x!
x!
H
By scheduling operations properly, FIFOs may 
become variables (one element per FIFO, at most) 
 23Inter-process partial evaluation
forever { n = read (A); v1 = n; v3 = n-2;
A
  x = v2; y = v4; write (H, x/(y*2)); }
x!
v1
v2
x!
v3
v4
H 
 24Inter-process partial evaluation
A
forever {
  n = read (A);
  v1 = n; v2 = fact (v1); x = v2;
  v3 = n-2; v4 = fact (v3); y = v4;
  write (H, x/(y*2));
}
H
And now we can apply conventional compiler 
optimizations 
 25Inter-process partial evaluation
A
forever { n = read (A); x = fact (n); y = fact (n-2); write (H, x/(y*2)); }
H
If some clever theorem prover could realize that fact(n) = n*(n-1)*fact(n-2), the following code could be derived ...
 26Inter-process partial evaluation
forever { n = read (A); write (H, n*(n-1)/2); }
A
H 
 27Inter-process partial evaluation
forever { n = read (A); write (B, n); write (C, n-2); write (D, 2); }
A
forever { x = read (E); y = read (F); z = read (G); write (H, x/(y*z)); }
x!
x!
x!
H
This was the original specification of the system!
 28Inter-process partial evaluation
forever { n = read (A); write (H, n*(n-1)/2); }
A
H
- This is the final implementation after inter-process optimization
 - Only one process (no context switching overhead)
 - Channels substituted by variables (no communication overhead)
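As a rough check of this derivation, the sketch below simulates the original five-process network with explicit FIFOs and compares it with the single-process result. It is written in Python and makes assumptions not in the slides: `run_network` and `run_optimized` are hypothetical names, the processes fire in one fixed interleaving (by determinacy any valid interleaving yields the same output), and every input satisfies n >= 2.

```python
from collections import deque
from math import factorial


def run_network(inputs):
    """Original specification: five processes connected by FIFO channels."""
    ch = {name: deque() for name in "ABCDEFGH"}
    ch["A"].extend(inputs)
    while ch["A"]:
        # Producer: n = read(A); write(B, n); write(C, n-2); write(D, 2)
        n = ch["A"].popleft()
        ch["B"].append(n)
        ch["C"].append(n - 2)
        ch["D"].append(2)
        # Three factorial processes: B -> E, C -> F, D -> G
        ch["E"].append(factorial(ch["B"].popleft()))
        ch["F"].append(factorial(ch["C"].popleft()))
        ch["G"].append(factorial(ch["D"].popleft()))
        # Consumer: x = read(E); y = read(F); z = read(G); write(H, x/(y*z))
        x, y, z = ch["E"].popleft(), ch["F"].popleft(), ch["G"].popleft()
        ch["H"].append(x // (y * z))
    return list(ch["H"])


def run_optimized(inputs):
    """Final implementation: one process, no channels, no factorials."""
    return [n * (n - 1) // 2 for n in inputs]


SAMPLE = [2, 3, 5, 10]
assert run_network(SAMPLE) == run_optimized(SAMPLE)
```

In this interleaving no channel ever holds more than one token, which is the property that lets the FIFOs be replaced by plain variables.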
 29
- Goal: improve performance, code size, power consumption, ...
 - Reduce operating system overhead
 - Reduce communication overhead
- How? Do as much as possible statically and automatically
 - Scheduling
 - Compiler optimizations
 
Operating system
TSENSOR
HSENSOR
TEMP FILTER
HUMIDITY FILTER
HDATA
TDATA
CONTROLLER
AC-on
DRYER-on
ALARM-on 
 30Outline
- The problem 
 - Synthesis of concurrent specifications 
 - Compiler optimizations across processes 
 - Previous work Dataflow networks 
 - Static scheduling of SDF networks 
 - Code and data size optimization 
 - Quasi-Static Scheduling of process networks 
 - Petri net representation of process networks 
 - Scheduling and code generation 
 - Open problems
 
  31Dataflow networks
- Powerful mechanism for data-dominated systems
- (Often stateless) actors perform computation
- Unbounded FIFOs perform communication via sequences of tokens carrying values
 - (matrix of) integer, float, fixed point
 - image of pixels, ...
- Determinacy
 - unique output sequences given unique input sequences
 - Sufficient condition: blocking read (process cannot test input queues for emptiness)
 
  32A bit of history
- Kahn process networks (74): formal model
- Karp computation graphs (66): seminal work
- Dennis Dataflow networks (75): programming language for MIT DF machine
- Lee's Static Data Flow networks (86): efficient static scheduling
- Several recent implementations (Ptolemy, Khoros, Grape, SPW, COSSAP, SystemStudio, DSPStation, Simulink, ...)
  33Intuitive semantics
- Example FIR filter 
 - single input sequence i(n) 
 - single output sequence o(n) 
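 - o(n) = c1 * i(n) + c2 * i(n-1)

[Figure: dataflow graph of the filter with input i, initial token i(-1), multipliers × c1 and × c2, and output o]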
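The filter equation can be read as a tiny dataflow actor with one token of state. The following Python sketch assumes the initial token i(-1) is 0 unless stated otherwise; the function name `fir` and its parameters are illustrative only.

```python
def fir(inputs, c1, c2, initial=0):
    """Two-tap FIR filter: o(n) = c1 * i(n) + c2 * i(n-1).

    `initial` plays the role of the initial token i(-1) on the delayed edge.
    """
    outputs = []
    previous = initial
    for sample in inputs:
        outputs.append(c1 * sample + c2 * previous)
        previous = sample  # the delayed edge now carries the current sample
    return outputs


RESULT = fir([1, 2, 3], c1=2, c2=10)
```

Each input token produces exactly one output token, so the actor has fixed rates in the sense used by the static dataflow slides.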
 34Examples of Dataflow actors
- SDF (Static Dataflow): fixed number of input and output tokens
- BDF (Boolean Dataflow): control token determines number of consumed and produced tokens
1
1
1
T
F
select
merge
F
T 
 35Static scheduling of DF
- Key property of DF networks: output sequences do not depend on firing sequence of actors (marked graphs)
- SDF networks can be statically scheduled at compile-time
 - execute an actor when it is known to be fireable
 - no overhead due to sequencing of concurrency
 - static buffer sizing
- Different schedules yield different
 - code size
 - buffer size
 - pipeline utilization
 
  36Balance equations
- Number of produced tokens must equal number of consumed tokens on every edge (channel)
- Repetitions (or firing) vector v of schedule S: number of firings of each actor in S
- v(A) * np = v(B) * nc must be satisfied for each edge

[Figure: edge from actor A, producing np tokens per firing, to actor B, consuming nc tokens per firing]
 37Balance equations
A
2
3
2
1
1
1
B
C
1
1
- Balance for each edge
 - 3 v(A) - v(B) = 0
 - v(B) - v(C) = 0
 - 2 v(A) - v(C) = 0
 - 2 v(A) - v(C) = 0
 
  38Balance equations
- M v = 0 iff S is periodic
- Full rank (as in this case)
 - no non-zero solution
 - no periodic schedule (too many tokens accumulate on A→B or B→C)
 
  39Balance equations
- Non-full rank
 - infinite solutions exist (linear space of dimension 1)
- Any multiple of v = [1 2 2]^T satisfies the balance equations
- ABCBC and ABBCC are minimal valid schedules
- ABABBCBCCC is a non-minimal valid schedule
 
  40Static SDF scheduling
- Main SDF scheduling theorem (Lee 86)
 - A connected SDF graph with n actors has a periodic schedule iff its topology matrix M has rank n-1
 - If M has rank n-1 then there exists a unique smallest integer solution v to M v = 0
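The balance equations can be solved without a linear-algebra package by propagating relative firing rates along the edges. The Python sketch below is one way to do this, not the algorithm of the cited work. The first graph uses the edge rates from the balance-equation slide; the rates of the second graph are an assumption, chosen so that the solution is the v = [1 2 2]^T quoted earlier.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Smallest positive integer v with v[src]*produced == v[dst]*consumed
    on every edge, or None if the balance equations only admit v = 0.

    Assumes a connected graph. Each edge is (src, produced, dst, consumed).
    """
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:  # propagate relative firing rates along the edges
        actor = pending.pop()
        for src, produced, dst, consumed in edges:
            if src == actor and dst not in rate:
                rate[dst] = rate[actor] * produced / consumed
                pending.append(dst)
            elif dst == actor and src not in rate:
                rate[src] = rate[actor] * consumed / produced
                pending.append(src)
    # Every edge must balance, including edges not used for propagation.
    if any(rate[s] * p != rate[d] * c for s, p, d, c in edges):
        return None
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (r.denominator for r in rate.values()))
    ints = {a: int(r * scale) for a, r in rate.items()}
    common = reduce(gcd, ints.values())
    return {a: n // common for a, n in ints.items()}


# Rates from the balance-equation slide: 3v(A) = v(B), v(B) = v(C), 2v(A) = v(C).
INCONSISTENT = [("A", 3, "B", 1), ("B", 1, "C", 1), ("A", 2, "C", 1)]
# Assumed rates for the schedulable variant: 2v(A) = v(B), v(B) = v(C), 2v(A) = v(C).
CONSISTENT = [("A", 2, "B", 1), ("B", 1, "C", 1), ("A", 2, "C", 1)]
```

Returning `None` corresponds to the full-rank case (no periodic schedule). A non-`None` result is the unique smallest integer solution named in the theorem, but it does not by itself rule out deadlock.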
 
  41Deadlock
- If no actor is firable in a state before reaching 
the initial state, no valid schedule exists 
(Lee86)  
A
1
1
2
2
B
C
1
1
Schedule (2A) B C 
 47Compilation optimization
- Assumption: code stitching (chaining custom code for each actor)
 - More efficient than C compiler for DSP
 - Comparable to hand-coding in some cases
- Explicit parallelism, no artificial control dependencies
- Main problem: memory and processor/FU allocation depends on scheduling, and vice-versa
  48Code size minimization
- Assumptions (based on DSP architecture)
 - subroutine calls expensive
 - fixed iteration loops are cheap (zero-overhead loops)
- Global optimum: single appearance schedule
 - e.g. ABCBC → A (2BC), ABBCC → A (2B) (2C)
 - may or may not exist for an SDF graph
 - buffer minimization relative to single appearance schedules (Bhattacharyya 94, Lauwereins 96, Murthy 97)
 
  49Buffer size minimization
- Assumption: no buffer sharing
- Example
 - v = [100 100 10 1]^T
- Valid SAS: (100 A) (100 B) (10 C) D
 - requires 210 units of buffer area
- Better (factored) SAS: (10 (10 A) (10 B) C) D
 - requires 30 units of buffer area, but
 - requires 21 loop initiations per period (instead of 3)
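The buffer and loop counts on this slide can be reproduced by simulating both schedules. The slide's graph is not preserved in the text, so the Python sketch below assumes a chain A→B→C→D with production/consumption rates 1/1, 1/10 and 1/10, which is consistent with v = [100 100 10 1]^T. The schedule encoding (nested `(count, body)` tuples) and the function name are also illustrative.

```python
def buffer_and_loops(schedule, edges):
    """Run a looped schedule once; return (total peak buffer size, loop initiations).

    schedule: list of actor names and (count, body) loops, possibly nested.
    edges: {(src, dst): (produced, consumed)}; no buffer sharing is assumed.
    """
    tokens = {edge: 0 for edge in edges}
    peak = dict(tokens)
    loops = 0

    def fire(actor):
        # Consume from every input edge, then produce on every output edge.
        for (src, dst), (_, consumed) in edges.items():
            if dst == actor:
                if tokens[(src, dst)] < consumed:
                    raise ValueError(f"{actor} is not fireable")
                tokens[(src, dst)] -= consumed
        for (src, dst), (produced, _) in edges.items():
            if src == actor:
                tokens[(src, dst)] += produced
                peak[(src, dst)] = max(peak[(src, dst)], tokens[(src, dst)])

    def execute(items):
        nonlocal loops
        for item in items:
            if isinstance(item, str):
                fire(item)
            else:
                count, body = item
                loops += 1  # one loop initiation each time a loop is entered
                for _ in range(count):
                    execute(body)

    execute(schedule)
    return sum(peak.values()), loops


# Assumed chain graph, consistent with v = [100 100 10 1]^T.
EDGES = {("A", "B"): (1, 1), ("B", "C"): (1, 10), ("C", "D"): (1, 10)}
FLAT = [(100, ["A"]), (100, ["B"]), (10, ["C"]), "D"]
FACTORED = [(10, [(10, ["A"]), (10, ["B"]), "C"]), "D"]
```

Under these assumptions the flat schedule needs 100 + 100 + 10 buffer units, while the factored one needs 10 + 10 + 10 at the cost of re-entering the two inner loops ten times each.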
  50Scheduling more powerful DF
- SDF is limited in modeling power
- More general DF is too powerful
 - non-Static DF is Turing-complete (Buck 93)
 - bounded-memory scheduling is not always possible
- Boolean Data Flow: Quasi-Static Scheduling of special patterns
 - if-then-else, repeat-until, do-while
- Dynamic Data Flow: run-time scheduling
 - may run out of memory or deadlock at run time
- Kahn Process Networks: quasi-static scheduling using Petri nets
 - conservative: schedulable network may be declared unschedulable
  51Outline
- The problem 
 - Synthesis of concurrent specifications 
 - Compiler optimizations across processes 
 - Previous work Dataflow networks 
 - Static scheduling of SDF networks 
 - Code and data size optimization 
 - Quasi-Static Scheduling of process networks 
 - Petri net representation of process networks 
 - Scheduling and code generation 
 - Open problems
 
  52Quasi-Static Scheduling
- Sequentialize concurrent operations as much as possible
 - less communication overhead (run-time task generation)
 - better starting point for compilation (straight-line code from function blocks)
- Must handle
 - data-dependent control
 - multi-rate communication
 
  53The problem
- Given: a network of Kahn processes
 - Kahn process: sequential function + ports
 - communication: port-based, point-to-point, uni-directional, multi-rate
- Find: a single sequential task
 - functionally equivalent to the original network (modulo concurrency)
 - threads driven by input stimuli (no OS intervention)
TSENSOR
HSENSOR
TEMP FILTER
HUMIDITY FILTER
HDATA
TDATA
CONTROLLER
AC-on
DRYER-on
ALARM-on 
 54Event-driven threads
Init() { last = 0; }
Reset
Tsensor() {
  sample = READ(TSENSOR);
  if (sample - last > DIF) {
    last = sample;
    if (sample > TFIRE) WRITE(ALARM-on, 10);
    else if (sample > TMAX) WRITE(AC-on, sample-TMAX);
  }
}
Hsensor() {
  h = READ(HSENSOR);
  if (h > MAX) WRITE(DRYER-on, 5);
}
 55The scheduling procedure
- 1. Specify a network of processes
 - process: C + communication operations
 - netlist: connection between ports
- 2. Translate to the computational model: Petri nets
- 3. Find a schedule on the Petri net
- 4. Translate the schedule to a task
 
  56TSENSOR
TSENSOR
TEMP FILTER
last = 0
TDATA
sample = READ(TSENSOR)
TEMP-FILTER
float sample, last;
last = 0;
while (1) {
  sample = READ(TSENSOR);
  if (sample - last > DIF) {
    last = sample;
    WRITE(TDATA, sample);
  }
}
F
T
last = sample; WRITE(TDATA,sample)
TDATA 
 57HSENSOR
HSENSOR
HUMIDITY FILTER
HDATA
h = READ(HSENSOR)
HUMIDITY-FILTER
float h, max;
while (1) {
  h = READ(HSENSOR);
  if (h > MAX) WRITE(HDATA, h);
}
F
h > MAX ?
T
WRITE(HDATA,h)
HDATA 
 58CONTROLLER
while (1) {
  select(TDATA, HDATA) {
    case TDATA:
      tdata = READ(TDATA);
      if (tdata > TFIRE) WRITE(ALARM-on, 10);
      else if (tdata > TMAX) WRITE(AC-on, tdata-TMAX);
    case HDATA:
      hdata = READ(HDATA);
      if (hdata > HMAX) WRITE(DRYER-on, 5);
  }
}
[Figure: Petri net of CONTROLLER with input places TDATA and HDATA. Transitions: tdata = READ(TDATA); tdata > TFIRE ? (T/F); WRITE(ALARM-on,10); tdata > TMAX ? (T/F); WRITE(AC-on,tdata-TMAX); hdata = READ(HDATA); hdata > HMAX ? (T/F); WRITE(DRYER-on,5)]
 59TSENSOR
HSENSOR
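[Figure: Petri net of the complete network, with the three process nets connected through the places TDATA and HDATA]
TEMP-FILTER transitions: last = 0; sample = READ(TSENSOR); sample - last > DIF ? (T/F); last = sample, WRITE(TDATA,sample)
HUMIDITY-FILTER transitions: h = READ(HSENSOR); h > MAX ? (T/F); WRITE(HDATA,h)
CONTROLLER transitions: tdata = READ(TDATA); tdata > TFIRE ? (T/F); WRITE(ALARM-on,10); tdata > TMAX ? (T/F); WRITE(AC-on,tdata-TMAX); hdata = READ(HDATA); hdata > HMAX ? (T/F); WRITE(DRYER-on,5)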
 60Petri nets for Kahn process networks
Sequential processes (1 token per process)
Input/Output ports (communication with the 
environment)
Channels (point-to-point communication between 
processes) 
 61Petri nets for Kahn process networks
True
True
False
False
- Data-dependent choices 
 -  Conservative assumption (any outcome is possible)
 
  62Schedule
- Infinite state space
- Schedule properties
 - Finite (no infinite resources)
 - Inputs served infinitely often
 - All choice outcomes covered
 
  63Schedule
- Finding the optimal schedule is computationally expensive
- Heuristics are required
 - token count minimization
 - guidance by T-invariants (cycles)
 
  64Code generation
Initialization
I1
system
Await state
I1
I2
I2
- Generated code
 - ISRs driven by input stimuli (I1 and I2)
 - Each task contains threads from one await state to another await state
Choice
I1
I2
T
F
F
T
I1
I2 
 65Code generation
I1
system
I1
I2
I2
- Generated code
 - ISRs driven by input stimuli (I1 and I2)
 - Each task contains threads from one await state to another await state
I1
I2
T
F
F
T
I1
I2 
 66Code generation
C0
I1
system
I1
I2
I2
C9
C1
C4
- Generated code
 - ISRs driven by input stimuli (I1 and I2)
 - Each task contains threads from one await state to another await state
C5
C2
C3
C11
F
I2
I1
I1
I2
C8
C6
C10
C7
T 
 67Code generation
enum state {S1, S2, S3} S;
 68Code generation
enum state {S1, S2, S3} S;
Init () { C0(); S = S1; return; }
C0
I1
I2
C9
C1
C4
C5
C2
C3
C11
F
I2
I1
I1
I2
C8
C6
C10
C7
T 
 69Code generation
enum state {S1, S2, S3} S;
ISR1 () {
  switch (S) {
    case S1: C1(); C2(); S = S2; return;
    case S2: C3(); C2(); return;
    case S3: C6(); C7(); C11(); C5(); return;
  }
}
C0
I1
I2
C9
C1
C4
C5
C2
C3
C11
F
I2
I1
I1
I2
C8
C6
C10
C7
T 
 70Code generation
enum state {S1, S2, S3} S;
ISR2 () {
  switch (S) {
    case S1: C4(); C5(); S = S3; break;
    case S2: C10(); C11(); C5(); S = S3; return;
    case S3: if (C8()) { C7(); C11(); C5(); return; }
             else { C9(); S = S1; return; }
  }
}
C0
I1
I2
C9
C9
C1
C4
C4
C5
C5
C2
C3
C11
C11
F
I2
I2
I1
I1
I2
I2
C8
C8
C6
C10
C10
C7
T
C7 
 71Code generation
enum state {S1, S2, S3} S;
Init () { C0(); S = S1; return; }
ISR1 () {
  switch (S) {
    case S1: C1(); C2(); S = S2; return;
    case S2: C3(); C2(); return;
    case S3: C6(); C7(); C11(); C5(); return;
  }
}
ISR2 () {
  switch (S) {
    case S1: C4(); C5(); S = S3; break;
    case S2: C10(); C11(); C5(); S = S3; return;
    case S3: if (C8()) { C7(); C11(); C5(); return; }
             else { C9(); S = S1; return; }
  }
}
C0
I1
I2
C9
C1
C4
C5
C2
C3
C11
F
I2
I1
I1
I2
C8
C6
C10
C7
T 
 72Code generation
enum state {S1, S2, S3} S;
Init () { C0(); S = S1; return; }
ISR1 () {
  switch (S) {
    case S1: C1(); C2(); S = S2; return;
    case S2: C3(); C2(); return;
    case S3: C6(); C7(); C11(); C5(); return;
  }
}
ISR2 () {
  switch (S) {
    case S1: C4(); C5(); S = S3; break;
    case S2: C10(); C11(); C5(); S = S3; return;
    case S3: if (C8()) { C7(); C11(); C5(); return; }
             else { C9(); S = S1; return; }
  }
}
Reset
Init ()
S
I1
ISR1 ()
I2
ISR2 () 
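To see how the generated ISRs walk between await states, the state machine can be mimicked in Python. This is only an illustration of the control structure. The code fragments C0..C11 are replaced by entries in a trace, the outcome of the data-dependent choice C8 is supplied by the caller, and the class name `GeneratedTask` is hypothetical.

```python
class GeneratedTask:
    """Mimics Init/ISR1/ISR2: each call runs code fragments between await states."""

    def __init__(self, c8_outcomes):
        self.trace = []                  # fragments executed so far, in order
        self.c8 = iter(c8_outcomes)      # stand-in for the data-dependent test C8()
        self.state = None

    def _run(self, *fragments):
        self.trace.extend(fragments)

    def init(self):
        self._run("C0")
        self.state = "S1"

    def isr1(self):
        if self.state == "S1":
            self._run("C1", "C2")
            self.state = "S2"
        elif self.state == "S2":
            self._run("C3", "C2")
        elif self.state == "S3":
            self._run("C6", "C7", "C11", "C5")

    def isr2(self):
        if self.state == "S1":
            self._run("C4", "C5")
            self.state = "S3"
        elif self.state == "S2":
            self._run("C10", "C11", "C5")
            self.state = "S3"
        elif self.state == "S3":
            self._run("C8")
            if next(self.c8):
                self._run("C7", "C11", "C5")
            else:
                self._run("C9")
                self.state = "S1"


task = GeneratedTask(c8_outcomes=[True, False])
task.init()
for stimulus in ("I1", "I1", "I2", "I2", "I2"):
    task.isr1() if stimulus == "I1" else task.isr2()
```

The only variable shared between the two handlers is the current await state, which is why no operating system scheduler is needed on a single processor.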
 73Environmental controller
Temperature
Humidity
ENVIRONMENTAL CONTROLLER
AC
Dehumidifier
Alarm 
 75TSENSOR
p0
HSENSOR
p3
 A
p6
p8
p1
 B
 G
p2
p7
Cf
 Ct 
 Hf
p9
TDATA
HDATA
 I
 D
p4
p10
Ef
Jf
 Et
 Jt
h gt MAX ?
p5
 F 
 76(p0 p8 p9)
await state
A
(p1 p8 p9)
TSENSOR
HSENSOR
(p1 p3 p8 p9)
(p1 p6 p8 p9)
B
G
Cf
Hf
(p2 p8 p9)
(p2 p7 p9)
Ct
Ht
(p1 p8 p9 TDATA)
(p1 p8 p9 HDATA)
D
I
Jf
Et
Jt
(p1 p4 p8)
(p1 p8 p10)
Ef
F
(p1 p5 p8) 
 77(p0 p8 p9)
A
(p1 p8 p9)
TSENSOR
HSENSOR
(p1 p3 p8 p9)
(p1 p6 p8 p9)
B
G
Cf
Hf
(p2 p8 p9)
(p2 p7 p9)
Ct
Ht
(p1 p8 p9 TDATA)
(p1 p8 p9 HDATA)
D
I
Jf
Et
Jt
(p1 p4 p8)
(p1 p8 p10)
Ef
F
(p1 p5 p8) 
 78(p0 p8 p9)
TEMP-FILTER
HUMIDITY-FILTER
A
(p1 p8 p9)
TSENSOR
HSENSOR
(p1 p3 p8 p9)
(p1 p6 p8 p9)
B
G
Cf
Hf
(p2 p8 p9)
(p2 p7 p9)
Ct
Ht
(p1 p8 p9 TDATA)
(p1 p8 p9 HDATA)
D
I
Jf
Et
Jt
(p1 p4 p8)
(p1 p8 p10)
Ef
F
(p1 p5 p8)
CONTROLLER 
 79Code generation and optimization
Tsensor() {
  sample = READ(TSENSOR);
  if (sample - last > DIF) {
    last = sample;
    WRITE (TDATA, sample);
    tdata = READ (TDATA);
    if (tdata > TFIRE) WRITE(ALARM-on, 10);
    else if (tdata > TMAX) WRITE(AC-on, tdata-TMAX);
  }
}
Channel elimination 
 80Code generation and optimization
Tsensor() {
  READ(TSENSOR, sample, 1);
  if (sample - last > DIF) {
    last = sample;
    WRITE (TDATA, sample, 1); READ (TDATA, tdata, 1);   →   tdata = sample;
    if (tdata > TFIRE) WRITE(ALARM-on, 10);
    else if (tdata > TMAX) WRITE(AC-on, tdata-TMAX);
  }
}
Copy propagation
 81Code generation and optimization
Tsensor() {
  READ(TSENSOR, sample, 1);
  if (sample - last > DIF) {
    last = sample;
    WRITE (TDATA, sample); tdata = READ (TDATA);
    if (sample > TFIRE) WRITE(ALARM-on, 10);
    else if (sample > TMAX) WRITE(AC-on, sample-TMAX);
  }
}
 82Event-driven threads
Init() { last = 0; }
Reset
Tsensor() {
  sample = READ(TSENSOR);
  if (sample - last > DIF) {
    last = sample;
    if (sample > TFIRE) WRITE(ALARM-on, 10);
    else if (sample > TMAX) WRITE(AC-on, sample-TMAX);
  }
}
Hsensor() {
  h = READ(HSENSOR);
  if (h > MAX) WRITE(DRYER-on, 5);
}
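A quick way to gain confidence in channel elimination and copy propagation is to run the temperature path both ways and compare the outputs. The Python sketch below does that for the TEMP-FILTER → TDATA → CONTROLLER path only. The threshold values and the sample sequence are made up for illustration, and `network` and `make_tsensor` are hypothetical names.

```python
from collections import deque

# Illustrative thresholds; the slides do not give numeric values.
DIF, TMAX, TFIRE = 1.0, 25.0, 60.0


def network(samples):
    """Two processes (TEMP-FILTER, CONTROLLER) communicating through FIFO TDATA."""
    tdata_fifo = deque()
    outputs = []
    last = 0.0
    for sample in samples:
        # TEMP-FILTER
        if sample - last > DIF:
            last = sample
            tdata_fifo.append(sample)
        # CONTROLLER, temperature case of the select
        while tdata_fifo:
            tdata = tdata_fifo.popleft()
            if tdata > TFIRE:
                outputs.append(("ALARM-on", 10))
            elif tdata > TMAX:
                outputs.append(("AC-on", tdata - TMAX))
    return outputs


def make_tsensor():
    """Single event-driven thread after channel elimination and copy propagation."""
    last = 0.0

    def tsensor(sample):
        nonlocal last
        outputs = []
        if sample - last > DIF:
            last = sample
            if sample > TFIRE:
                outputs.append(("ALARM-on", 10))
            elif sample > TMAX:
                outputs.append(("AC-on", sample - TMAX))
        return outputs

    return tsensor


SAMPLES = [20.0, 30.0, 30.5, 70.0, 26.0]
tsensor = make_tsensor()
MERGED = [event for s in SAMPLES for event in tsensor(s)]
```

Both versions produce the same event sequence for this input; the merged thread simply never materializes the TDATA queue.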
 83Application example ATM Switch
- No static schedule due to
 - Inputs with independent rates (need Real-Time dynamic scheduling)
 - Data-dependent control (can use Quasi-Static Scheduling)
  84Functional Decomposition
Accept/discard cell
Output time selector
4 Tasks (+ 1 arbiter)
Clock divider
Output cell enabler 
 85 Minimal (QSS) Decomposition
Input cell processing
2 Tasks
Output cell processing 
 86Real-time scheduling of tasks
Task 1
 RTOS
Task 2
Shared Processor 
 87ATM experimental results
Functional partitioning
QSS
4+1 Tasks
 88Producer-Filter-Consumer Example
init
controller
Ack
Coeff
Req
Pixels
Pixels
pixels
producer
consumer
filter 
 89Experimental Results
[Plot: number of clock cycles versus size of channels, for the 4-task implementation and the 1-task implementation]
 90Open problems
- Is a system schedulable? (decidability)
- False paths in concurrent systems (data dependencies)
- Synthesis for multi-processors
- Abstraction / partitioning
- and many others ...
 
  91Schedulability
- A finite complete cycle is a finite sequence of transition firings that returns the net to its initial state
 - infinite execution
 - bounded memory
- To find a finite complete cycle we must solve the balance (or characteristic) equation of the Petri net: f * D = 0

[Figure: two example nets with transitions t1, t2, t3 and arc weights 2]
First net, incidence matrix D (one row per transition):
  t1:   1   0
  t2:  -2   1
  t3:   0  -2
f * D = 0 is solved by f = (4,2,1)
Second net: f * D = 0 has no solution
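The balance equation can be checked numerically. The Python sketch below reads the matrix fragments on this slide as rows (1, 0), (-2, 1), (0, -2) for t1, t2, t3, which is an interpretation of the extracted text. It also replays one firing order with those counts to confirm that the marking returns to its initial value. The helper names are illustrative.

```python
def satisfies_balance(f, D):
    """True if the row vector f satisfies f * D = 0 (one column per place)."""
    places = range(len(D[0]))
    return all(sum(f[t] * D[t][p] for t in range(len(D))) == 0 for p in places)


def replay(sequence, D, marking):
    """Fire transitions in order; fail if a place would go negative."""
    marking = list(marking)
    for t in sequence:
        marking = [m + d for m, d in zip(marking, D[t])]
        if min(marking) < 0:
            raise ValueError(f"transition t{t + 1} was not enabled")
    return marking


# Rows: t1, t2, t3. Columns: the two places between them.
D = [[1, 0],
     [-2, 1],
     [0, -2]]
F = (4, 2, 1)

# One firing order with 4 x t1, 2 x t2, 1 x t3 (indices are zero-based).
CYCLE = [0, 0, 1, 0, 0, 1, 2]
FINAL = replay(CYCLE, D, marking=[0, 0])
```

A solution of f * D = 0 is necessary for a finite complete cycle but not sufficient: some firing order with those counts must also keep every place non-negative, which is what `replay` checks.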
  92Schedulability 
- Can the adversary ever force token overflow?
 
t6
t3
t5
t2
t1
t7
t4
t8
t6 
 93Schedulability 
- Can the adversary ever force token overflow?
 
t6
t3
t5
t2
t1
t7
t4
t8 
 94Schedulability
- Can the adversary ever force token overflow?
 
t6
t3
t5
t2
t1
t7
t4
t8
t8 
 95Schedulability
- Can the adversary ever force token overflow?
 
t3
t5
t2
t7
t1
t4
t6 
 98Schedulability
- Schedulability of Free-choice PNs is decidable
 - Algorithm is exponential
- What if the resulting PN is non-free choice? (synchronization-dependent control)
- What if the PN is not schedulable for all choice resolutions? (correlation between choices)
  99(Quasi) Static Scheduling approaches
- Lee et al. 86, Static Data Flow: cannot specify data-dependent control
- Buck et al. 94, Boolean Data Flow: undecidable schedulability check, heuristic pattern-based algorithm
- Thoen et al. 99, Event graph: no schedulability check, no task minimization
- Lin 97, Safe Petri Net: no schedulability check, single-rate, reachability-based algorithm
- Thiele et al. 99, Bounded Petri Net: partial schedulability check, reachability-based algorithm
- Cortadella et al. 00, General Petri Net: maybe undecidable schedulability check, balance equation-based algorithm
  100False paths
Choices are correlated
WRITES  READS ? i  j 
 101Multi-processor allocation
enum state {S1, S2, S3} S;
Init () { C0(); S = S1; return; }
ISR1 () {
  switch (S) {
    case S1: C1(); C2(); S = S2; return;
    case S2: C3(); C2(); return;
    case S3: C6(); C7(); C11(); C5(); return;
  }
}
ISR2 () {
  switch (S) {
    case S1: C4(); C5(); S = S3; break;
    case S2: C10(); C11(); C5(); S = S3; return;
    case S3: if (C8()) { C7(); C11(); C5(); return; }
             else { C9(); S = S1; return; }
  }
}
Reset
Init ()
S
I1
ISR1 ()
I2
ISR2 ()
- State and data are shared 
 -  Mutual exclusion required
 
  102Conclusions
- Reactive systems
 - OS required to control concurrency
 - Processes are often reused in different environments
- Static and Quasi-Static Scheduling minimize run-time overhead by automatically partitioning the system functions into input-driven threads
 - No context switch required (OS overhead is reduced)
 - Compiler optimizations across processes
- Much more research is needed
 - strategies to find schedules (decidability?)
 - false paths in concurrent systems
 - what about multiple processors?
 - ...