Title: What simplifications could a compiler, or you, make without sacrificing fast execution?
Slide 2: What simplifications could a compiler, or you, make without sacrificing fast execution?
Slide 3: 5-7 Code optimization
Two functions, f and g:

    #define MAX 10
    int a[MAX], b[MAX], c[MAX], x[MAX], y[MAX];
    int i, j, r, s;
    . . .
    /* The operators in f are not visible in the transcript; "*" and "+" are assumed. */
    int f(int a, int b) { int z; z = 2 * a + b; return z; }
    int g(int a, int b, int c) { int z; z = a * c - c * b; return z; }

What code optimization can the compiler do? -O, -O0, -O1, -O2, -O3, -Os?
With -O or -O0 you have to do all optimizations yourself.
Slide 4: Optimization flags
-O, -O0   No optimization (default setting!)
-O1       Optimize for size
-O2       Optimize for speed and enable some optimizations
-O3       Enable all optimizations as in -O2, plus intensive loop optimizations
-Os       Optimize for size
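Slide 5: Two for loops

    . . .
    /* The operators in "s = 2 * r" and "s + g(...)" are assumed. */
    for (i = 0; i < MAX - 1; i++) {
        x[i] = f(a[i], b[i]);
        s = 2 * r;
    }
    for (j = 0; j < MAX - 1; j++) {
        y[j] = s + g(a[j], b[j], c[j]);
    }

What can be done?
We want shorter execution time without increasing the code size!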
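Slide 6: Loop integration
The two loops have the same range (0, MAX-1) and no data dependency (x only in loop 1, y only in loop 2). The loops can be integrated, which saves loop overhead (only i is needed)! The assignment to s does not depend on the loop index, so it is placed in front of the loop.

    /* The operators in "s = 2 * r" and "s + g(...)" are assumed. */
    s = 2 * r;
    for (i = 0; i < MAX - 1; i++) {
        x[i] = f(a[i], b[i]);
        y[i] = s + g(a[i], b[i], c[i]);
    }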
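The integration step can be checked with a small sketch that keeps both versions side by side. It assumes the operators `*` and `+` in f and in the loop bodies (the slide text does not show them), and the function names `two_loops` and `one_loop` are made up for this illustration.

```c
#define MAX 10

/* The operators inside f and in the loop bodies are assumptions for this sketch. */
static int f(int a, int b) { return 2 * a + b; }
static int g(int a, int b, int c) { return a * c - c * b; }

/* Original form: two loops, s recomputed on every pass of the first loop. */
void two_loops(const int *a, const int *b, const int *c, int r, int *x, int *y)
{
    int i, j, s = 0;
    for (i = 0; i < MAX - 1; i++) {
        x[i] = f(a[i], b[i]);
        s = 2 * r;
    }
    for (j = 0; j < MAX - 1; j++) {
        y[j] = s + g(a[j], b[j], c[j]);
    }
}

/* Integrated form: one loop, one index, s computed once before the loop. */
void one_loop(const int *a, const int *b, const int *c, int r, int *x, int *y)
{
    int i;
    int s = 2 * r;
    for (i = 0; i < MAX - 1; i++) {
        x[i] = f(a[i], b[i]);
        y[i] = s + g(a[i], b[i], c[i]);
    }
}
```

Both functions fill x and y with the same values for the same inputs. The integrated version runs one loop test and one increment per element instead of two.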
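Slide 7: Precalculation at compile time
The defined constant MAX is used as MAX - 1 in the loop. MAX - 1 could be precalculated as 10 - 1 = 9 at compile time!

    /* The operators in "s = 2 * r" and "s + g(...)" are assumed. */
    s = 2 * r;
    for (i = 0; i < 9; i++) {
        x[i] = f(a[i], b[i]);
        y[i] = s + g(a[i], b[i], c[i]);
    }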
Slide 8: Algebraic simplification
Rewriting function g can save one multiplication operation:

    a * c - c * b   (mul, sub, mul)   becomes   c * (a - b)   (mul, sub)

    int g(int a, int b, int c) { int z; z = c * (a - b); return z; }
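As a quick check of the rewrite, the two forms can be written as separate functions and compared. The names `g_original` and `g_simplified` are hypothetical. The two agree for all inputs whose intermediate products stay within the range of int.

```c
/* Form before the rewrite: two multiplications and one subtraction. */
int g_original(int a, int b, int c) { return a * c - c * b; }

/* Form after the rewrite: one multiplication and one subtraction. */
int g_simplified(int a, int b, int c) { return c * (a - b); }
```

For example, with a = 7, b = 3, c = 5 both return 20.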
Slide 9: Inlining of functions
Both functions f and g are short, and their code could be inserted directly in the loop.

    /* The operators in "2 * r", "2 * a[i] + b[i]" and "s + (...)" are assumed. */
    int a[10], b[10], c[10], x[10], y[10];
    int i, r, s;
    s = 2 * r;
    for (i = 0; i < 9; i++) {
        x[i] = 2 * a[i] + b[i];
        y[i] = s + ((a[i] - b[i]) * c[i]);
    }

Loop unrolling would give shorter execution time, but it would also increase the code size, so it can't be used in this case.
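To make the code size trade-off concrete, here is a sketch of the inlined loop next to a version unrolled by a factor of three (9 iterations divide evenly by 3). The function names and the operators `*` and `+` are assumptions, as is the choice of unrolling factor.

```c
/* Inlined loop (operators "*" and "+" are assumed). */
void rolled(const int *a, const int *b, const int *c, int r, int *x, int *y)
{
    int i, s = 2 * r;
    for (i = 0; i < 9; i++) {
        x[i] = 2 * a[i] + b[i];
        y[i] = s + (a[i] - b[i]) * c[i];
    }
}

/* Unrolled by 3: a third of the loop tests and increments,
   but the body is written out three times. */
void unrolled(const int *a, const int *b, const int *c, int r, int *x, int *y)
{
    int i, s = 2 * r;
    for (i = 0; i < 9; i += 3) {
        x[i]     = 2 * a[i]     + b[i];
        y[i]     = s + (a[i]     - b[i])     * c[i];
        x[i + 1] = 2 * a[i + 1] + b[i + 1];
        y[i + 1] = s + (a[i + 1] - b[i + 1]) * c[i + 1];
        x[i + 2] = 2 * a[i + 2] + b[i + 2];
        y[i + 2] = s + (a[i + 2] - b[i + 2]) * c[i + 2];
    }
}
```

Both versions compute the same x and y. The unrolled one executes 3 loop tests and increments instead of 9, at the cost of roughly three times as much body code.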
Slide 11: 5-2 Register lifetime
A processor has this instruction type: op R1, R2, R3, where all three registers must be different. Code to run (op stands for any two-operand operation):

    u = c op d    (1)
    v = a op b    (2)
    w = a op u    (3)
    x = v op e    (4)

How many registers are needed?
Slide 12: Register Life Time Graph

    u = c op d    (1)
    v = a op b    (2)
    w = a op u    (3)
    x = v op e    (4)

Four registers are needed!
Slide 13: Data Flow Graph
A Data Flow Graph can detect data dependencies.

    u = c op d    (1)
    v = a op b    (2)
    w = a op u    (3)
    x = v op e    (4)

- (1) must be before (3), because (3) uses u
- (2) must be before (4), because (4) uses v
(2) and (3) can change execution order!
Slide 14: New Register Life Time Graph
New instruction order:

    u = c op d    (1)
    w = a op u    (2)
    v = a op b    (3)
    x = v op e    (4)

Now only 3 registers are needed, a saving of 25 %.
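The register counts for the two instruction orders can be reproduced with a short sketch. It assumes a simple lifetime model that is not spelled out on the slides: a variable occupies a register from the first statement that mentions it to the last statement that mentions it. The type `Stmt`, the function `max_live` and the two arrays are names made up for this illustration.

```c
/* One statement "dst = src1 op src2"; variables are single lowercase letters. */
typedef struct { char dst, src1, src2; } Stmt;

/* The two instruction orders from the slides. */
static const Stmt original_order[4] = {
    {'u', 'c', 'd'}, {'v', 'a', 'b'}, {'w', 'a', 'u'}, {'x', 'v', 'e'}
};
static const Stmt new_order[4] = {
    {'u', 'c', 'd'}, {'w', 'a', 'u'}, {'v', 'a', 'b'}, {'x', 'v', 'e'}
};

/* Largest number of variables that are live during any one statement.
   Assumed model: a variable is live from its first mention to its last. */
int max_live(const Stmt *code, int n)
{
    int first[26], last[26];
    int i, k, v, t, peak = 0;

    for (v = 0; v < 26; v++) { first[v] = -1; last[v] = -1; }

    /* Record the first and last statement index that mentions each variable. */
    for (i = 0; i < n; i++) {
        char names[3];
        names[0] = code[i].dst;
        names[1] = code[i].src1;
        names[2] = code[i].src2;
        for (k = 0; k < 3; k++) {
            v = names[k] - 'a';
            if (first[v] < 0) first[v] = i;
            last[v] = i;
        }
    }

    /* Count the live variables at every statement and keep the maximum. */
    for (t = 0; t < n; t++) {
        int live = 0;
        for (v = 0; v < 26; v++)
            if (first[v] >= 0 && first[v] <= t && t <= last[v]) live++;
        if (live > peak) peak = live;
    }
    return peak;
}
```

Under this model `max_live(original_order, 4)` gives 4 and `max_live(new_order, 4)` gives 3, matching the two life time graphs.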
Slide 16: 5-8 CDFG
- Control and Data Flow Graph (CDFG)
- Multiplication takes 3 cycles, all other instructions take 1 cycle. Best/worst execution time?

    y = 0;
    if (mode == 1)
        for (i = 0; i < 5; i++)
            y += a[i] * b[i];

Loop body: T = 3 + 1 = 4
mode = 0: TBest = 1 + 1 = 2
mode = 1: TWorst = 1 + 1 + 1 + (5 + 1) + 5·4 + 5 = 34
Slide 17: Multiply Accumulate operation
c) MAC unit: R1 = R1 + R2 * R3 in one cycle!

    y += a[i] * b[i];   /* one cycle */

Loop body: T = 1
TWorst = 1 + 1 + 1 + (5 + 1) + 5·1 + 5 = 19
19/34 = 0.56. With a MAC unit, execution takes 56 % of the ordinary processor's execution time.
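The two worst-case sums can be reproduced with a small cycle counter. The mapping of each term to a statement (initialisation, mode test, loop start, loop tests, body, increments) is an interpretation of the slide's sums, and the function name `cycles` is made up for this sketch.

```c
/* Cycle count for:
       y = 0; if (mode == 1) for (i = 0; i < n; i++) y += a[i] * b[i];
   body_cycles is the cost of one pass through the loop body:
   3 + 1 = 4 with a separate multiply and add, 1 with a MAC unit. */
int cycles(int mode, int n, int body_cycles)
{
    int t = 0;
    t += 1;                     /* y = 0                            */
    t += 1;                     /* test of mode                     */
    if (mode == 1) {
        t += 1;                 /* i = 0                            */
        t += n + 1;             /* loop test: n passes plus the exit */
        t += n * body_cycles;   /* loop body                        */
        t += n;                 /* i++                              */
    }
    return t;
}
```

With n = 5, `cycles(0, 5, 4)` is 2 (best case), `cycles(1, 5, 4)` is 34 and `cycles(1, 5, 1)` is 19.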
Slide 19: Processes on a CPU
Slide 20: Scheduling states of a process
Slide 21: Priority Driven Scheduling
- Each process has a fixed priority
- The ready process with the highest priority executes
- A process executes until completion or preemption by a higher-priority process
Slide 22: Examples of sampling frequencies and execution periods
Devices handled by the RTOS, with their process periods:
  Actuator servo   2000 Hz   1/2000 = 0.5 ms
  Speed sensor     1 kHz     1/1000 = 1 ms
  Joystick         500 Hz    1/500 = 2 ms
  GPS sensor       20 Hz     1/20 = 50 ms
Tasks will often run periodically with different process periods.
Slide 23: Task Triplet
P(max execution time, period, deadline), with deadline < period
RMS: deadline = period (simplification)
Slide 24: 6-2 Processor utilization and feasible scheduling
Task triplet: P(execution time, period, deadline), deadline = period
P1(3, 9, 9)  P2(1, 2, 2)  P3(1, 6, 6)
Timeline: least common multiple of the process periods 9, 2, 6:
  9 = 3·3, 2 = 2, 6 = 2·3, so LCM = 3·3·2 = 18
CPU utilization:
  U = 3/9 + 1/2 + 1/6 = 1 = 100 %
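The timeline length and the utilization can be computed in integer arithmetic, as in the sketch below. The type `Task` and the functions `hyperperiod` and `busy_time` are names chosen for this illustration. Utilization is the ratio busy_time / hyperperiod.

```c
typedef struct { int c, t; } Task;   /* execution time, period (deadline = period) */

/* Greatest common divisor by Euclid's algorithm. */
static int gcd(int a, int b)
{
    while (b != 0) { int r = a % b; a = b; b = r; }
    return a;
}

/* Least common multiple of all periods: the length of the timeline. */
int hyperperiod(const Task *tasks, int n)
{
    int h = 1, i;
    for (i = 0; i < n; i++)
        h = h / gcd(h, tasks[i].t) * tasks[i].t;
    return h;
}

/* CPU time units requested by all tasks within one hyperperiod. */
int busy_time(const Task *tasks, int n)
{
    int h = hyperperiod(tasks, n), sum = 0, i;
    for (i = 0; i < n; i++)
        sum += tasks[i].c * (h / tasks[i].t);
    return sum;
}
```

For the task set (3, 9), (1, 2), (1, 6) the hyperperiod is 18 and the busy time is 6 + 9 + 3 = 18, so the utilization is 18/18 = 1.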
Slide 25: Rate Monotonic Scheduling
RMS: the shortest period is assigned the highest priority, and so on.
RMS guarantee: a feasible schedule exists if U < n(2^(1/n) - 1).
n = 3: U < 0.78
(Limit n → ∞: U < 69 %)
In this case U = 1, so there is no guarantee!
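The bound n(2^(1/n) - 1) is the standard Liu and Layland utilization bound for rate monotonic scheduling. A minimal sketch for evaluating it follows. The function names are made up, and the n-th root is found by bisection only so that the example does not depend on the math library.

```c
/* x raised to a small non-negative integer power n. */
static double ipow(double x, int n)
{
    double p = 1.0;
    while (n-- > 0) p *= x;
    return p;
}

/* Rate monotonic utilization bound n * (2^(1/n) - 1) for n >= 1.
   2^(1/n) lies in [1, 2], so it is located by bisection on that interval. */
double rms_bound(int n)
{
    double lo = 1.0, hi = 2.0;
    int k;
    for (k = 0; k < 60; k++) {
        double mid = 0.5 * (lo + hi);
        if (ipow(mid, n) < 2.0) lo = mid; else hi = mid;
    }
    return n * (0.5 * (lo + hi) - 1.0);
}
```

`rms_bound(3)` is about 0.78, and for large n the value approaches ln 2, about 0.693. A task set with U = 1 is above the bound for every n greater than 1, so the test gives no guarantee.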
Slide 26: RMS figure
Priorities: P2 > P3 > P1 (2 < 6 < 9)
P1 misses the deadline! No feasible schedule with RMS!
Slide 27: Earliest Deadline First Scheduling
EDF guarantee: a feasible schedule exists if U ≤ 1.
In this case U = 1, so EDF shall produce a feasible schedule.
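The difference between the two policies can be checked with a small unit-step simulation of the task set P1(3, 9, 9), P2(1, 2, 2), P3(1, 6, 6). This is a sketch under stated assumptions, not the method used for the figures: all tasks are released at time 0, preemption is possible at every time unit, ties go to the task listed first, and a job that is unfinished at its deadline is counted as a miss and dropped. All names are hypothetical.

```c
typedef struct { int c, t; } Task;   /* execution time, period (deadline = period) */

enum Policy { RMS, EDF };

#define MAX_TASKS 8

/* Simulate preemptive scheduling in unit time steps from 0 to horizon and
   return the number of missed deadlines. horizon should be a multiple of
   every period (for example the hyperperiod); n must not exceed MAX_TASKS. */
int count_misses(const Task *tasks, int n, int horizon, enum Policy policy)
{
    int remaining[MAX_TASKS] = {0};   /* work left in the current job       */
    int deadline[MAX_TASKS] = {0};    /* absolute deadline of current job   */
    int misses = 0, now, k, pick, best = 0;

    for (now = 0; now <= horizon; now++) {
        /* Period boundary: check the old job, then release a new one. */
        for (k = 0; k < n; k++) {
            if (now % tasks[k].t == 0) {
                if (remaining[k] > 0) misses++;
                remaining[k] = tasks[k].c;
                deadline[k] = now + tasks[k].t;
            }
        }
        if (now == horizon) break;    /* last pass only checks deadlines */

        /* Pick the ready job with the smallest key:
           its period for RMS, its absolute deadline for EDF. */
        pick = -1;
        for (k = 0; k < n; k++) {
            int key;
            if (remaining[k] == 0) continue;
            key = (policy == RMS) ? tasks[k].t : deadline[k];
            if (pick < 0 || key < best) { pick = k; best = key; }
        }
        if (pick >= 0) remaining[pick]--;   /* run it for one time unit */
    }
    return misses;
}
```

Over the 18-unit timeline, RMS gives one miss (P1 at time 9) and EDF gives none.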
Slide 29: 6.3 Scheduling and semaphores
P(execution time, period, deadline)
P1(1, 3, 3)  P2(1, 4, 4)  P3(2, 6, 6)
Timeline: 3 = 3, 4 = 2·2, 6 = 2·3, so LCM = 3·2·2 = 12
RMS priorities: P1 > P2 > P3 (3 < 4 < 6)
Sem1 is a binary semaphore. accessSem1() and releaseSem1() take 0 time.
Slide 30: RMS with no critical sections
Slide 31: RMS with critical sections