Title: Low-Complexity Algorithms for Static Cache Locking in Multitasking Hard Real-Time Systems

1. Low-Complexity Algorithms for Static Cache Locking in Multitasking Hard Real-Time Systems
- Isabelle Puaut, David Decotigny
- IRISA, Campus universitaire de Beaulieu
- IEEE RTSS 2002
2. Introduction
- Caches in hard real-time systems: a source of unpredictability
  - Intra-task interferences
  - Inter-task interferences
- How to cope with caches in hard real-time systems?
  - Cache analysis methods
  - Task partitioning and cache locking
3. Approaches To Deal With Caches
- Cache analysis methods
  - use caches without any restriction
  - resort to static analysis techniques to predict WCET
  - should not be overly pessimistic
- Task partitioning
  - reserve some cache portions for certain tasks
  - not a solution for intra-task interferences
- Cache locking
  - load the cache contents and lock it
  - static / dynamic
  - makes the memory access time predictable
4. Contributions
- Static cache locking of I-caches in multitasking real-time systems
  - Improves system performance
  - Gives more predictability
  - Alleviates the need for complex static analysis techniques
  - Addresses both intra- and inter-task interferences
- Proposes two pseudo-polynomial algorithms to select the contents of the I-cache
  - Minimizing the worst-case CPU utilization
  - Minimizing the interferences between tasks
5. Assumptions and Notations
- Architecture model
  - W-way set-associative lockable I-cache with an LRU replacement policy, with a prefetch buffer of size Sb
- Task model
  - Periods Pi (equal to deadlines) and worst-case execution times Ci
  - Li,j: program lines of size Sb of task Ti
  - WSeqi: the sequence of the Li,j along the worst-case execution path
  - si: WSeqi with consecutively repeated elements discarded
  - Number of accesses to the memory hierarchy (I-cache or main memory), with and without a prefetch buffer (formulas in the paper)
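The task-model notation above can be sketched as a small data structure; the names are illustrative, and the rule for deriving si (discarding consecutive repeats of a program line, which the prefetch buffer absorbs) follows the slide's definition:

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Task model from the slides (illustrative names, not the paper's API)."""
    period: int                 # Pi, also the deadline
    wcet: int                   # Ci, worst-case execution time
    wseq: list = field(default_factory=list)  # WSeqi: program lines on the worst path

    def access_sequence(self):
        """si: WSeqi with consecutively repeated lines discarded
        (consecutive accesses to the same line hit the prefetch buffer)."""
        out = []
        for line in self.wseq:
            if not out or out[-1] != line:
                out.append(line)
        return out
```

For example, a worst path revisiting the same line back-to-back collapses those repeats in si.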
6. Schedulability Analysis
- CUA (Cache-aware Utilization-based Analysis)
  - Utilization ratio U (Eq. 1) includes an upper bound on the cache-related delay imposed on the tasks preempted by Ti
  - For dynamic-priority systems, U <= 1 is a necessary and sufficient feasibility condition
  - For static-priority systems (RM): sufficient condition only, too pessimistic
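A minimal sketch of the CUA feasibility test, assuming the cache-related delay bound is simply added to each task's WCET before forming the utilization ratio (the exact placement of the delay term is given by Eq. 1 in the paper):

```python
def cua_feasible(tasks, gamma):
    """Cache-aware Utilization-based Analysis (sketch).

    tasks: list of (C, P) pairs -- WCET and period of each task.
    gamma: per-task upper bound on the cache-related preemption delay
           the task induces on the tasks it preempts.
    Returns True if the task set passes the dynamic-priority test U <= 1.
    """
    U = sum((C + g) / P for (C, P), g in zip(tasks, gamma))
    return U <= 1.0
```

With a locked cache, gamma is a small constant per task (prefetch-buffer refill), which keeps U close to the cache-free utilization.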
7. Schedulability Analysis (cont.)
- CRTA (Cache-aware Response Time Analysis)
  - Necessary and sufficient condition for static-priority systems
  - Considers the interferences of higher-priority tasks on Ti over a time window win (Eq. 3)
  - Converges to Ri when the window stops growing (fixed point of the iteration)
8. Algorithms
- Algorithms based on two metrics
  - Minimizing the worst-case CPU utilization
  - Minimizing the interferences between tasks
- Greedy: no reconsideration of the assignment of a cache block once decided
- Complexity: pseudo-polynomial
- Assumption: worst-case execution paths (sequences si) are known
9. Algorithm Minimize Utilization (Lock-MU)
- Description
  - Ls: set of program lines that can be mapped into the W blocks of set s
  - Locks into the cache the W program lines Li,j of Ls having the highest ratio
- Complexity: pseudo-polynomial (formula in the paper)
- Optimality
  - Optimal with respect to the minimization of the CPU consumption when si is known
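A sketch of the per-set greedy selection, assuming the "ratio" is the number of accesses to a line along si divided by the owning task's period (the tuple encoding below is hypothetical):

```python
def lock_mu(lines, W):
    """Lock-MU sketch: per cache set, lock the W program lines with
    the highest access-frequency ratio (accesses along si / period).

    lines: list of (set_index, line_id, n_accesses, period) tuples.
    W: degree of associativity (blocks per set).
    Returns {set_index: [line_id, ...]} -- lines to lock in each set.
    """
    by_set = {}
    for s, lid, n, P in lines:
        by_set.setdefault(s, []).append((n / P, lid))
    locked = {}
    for s, cand in by_set.items():
        cand.sort(reverse=True)                   # highest ratio first
        locked[s] = [lid for _, lid in cand[:W]]  # at most W lines per set
    return locked
```

Each set is filled independently, which is what makes the greedy choice optimal for the utilization metric once si is fixed.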
10. Algorithm Minimize Interferences (Lock-MI)
- Outline (code comments from the slide):
  /* Cache initialization */
  /* Compute WCET and determine Ci */
  /* Select program lines to be locked */
  /* Decrease of response times for all tasks Tt induced by locking Li,j */
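The comments above suggest a greedy outer loop; a hedged sketch, where `gain` is a hypothetical helper returning the total decrease in response times induced by locking a given line:

```python
def lock_mi(candidates, cache_slots, gain):
    """Lock-MI sketch: greedily lock the line whose locking most
    decreases the response times of all tasks (interference metric).

    candidates: iterable of (task, line) pairs eligible for locking.
    cache_slots: number of cache blocks still free.
    gain(task, line): assumed helper -- total response-time decrease
        induced by locking `line` of `task`.
    Returns the (task, line) pairs selected, in selection order.
    """
    remaining = set(candidates)
    selected = []
    while cache_slots > 0 and remaining:
        best = max(remaining, key=lambda tl: gain(*tl))
        if gain(*best) <= 0:
            break                      # no further improvement possible
        selected.append(best)
        remaining.discard(best)
        cache_slots -= 1
    return selected
```

As with Lock-MU, a block assignment is never reconsidered once made, which keeps the complexity pseudo-polynomial.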
11. Identification of the Worst-Case Execution Sequences
- The worst execution path si depends on the contents of the statically locked cache
- Being optimal would require solving a mutually recursive optimization problem: not possible in practice
- Consider only a single execution path
  - makes the algorithms non-optimal
  - lowers the complexity to pseudo-polynomial
12. Experimental Setup
- Target architecture and simulator
  - CPU: simplified MIPS processor (emulated)
  - OS: Nachos
  - Sb = 16 bytes, prefetch buffer of 16 bytes
  - thit = 1, tmiss = 10 clock cycles
  - si is obtained with the I-cache disabled
- Static WCET analyzer
  - Heptane (Hades Embedded Processor Timing ANalyzEr), a static WCET analysis tool
13. Experimental Setup (cont.)
- Task sets
  - Small / Medium
  - Periods selected to make the CPU utilization 1.3 (not feasible) without an I-cache
14. Worst-Case Performance Analysis of Locked Cache and Dynamic Caches
- Cache-related preemption delay
  - Static cache locking: delay to refill the prefetch buffer (constant)
  - Dynamic cache and static cache analysis: pessimistically, all cache blocks of a task are considered reloaded after a context switch; the delay can increase linearly with cache size
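The two delay models above can be contrasted in a few lines; the constant prefetch-buffer refill and the linear full-reload model are direct readings of the slide, not the paper's exact formulas:

```python
def preemption_delay(cache_blocks, t_miss, locked, prefetch_lines=1):
    """Worst-case cache-related preemption delay (sketch).

    Static cache locking: only the prefetch buffer must be refilled,
    so the delay is a constant, independent of cache size.
    Dynamic cache / static cache analysis: pessimistically, every
    cache block of the preempted task may need reloading, so the
    delay grows linearly with the number of cache blocks.
    """
    if locked:
        return prefetch_lines * t_miss
    return cache_blocks * t_miss
```

This is why larger caches favour static cache locking in the results that follow: one delay stays flat while the other scales with the cache.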
15. Compared Worst-Case Performance of Static Cache Locking and Cache Analysis
- (comparison charts in the original slides)
16. Results Analysis (1)
- Performance for small cache sizes and low degrees of associativity
  - Static cache analysis outperforms static cache locking
  - Small cache size: high degree of intra-/inter-task conflicts
  - Cache-related preemption delay is negligible
  - Static cache analysis takes advantage of spatial locality: lower WCET than static cache locking
17. Results Analysis (2)
- Impact of cache size
  - The performance increase of static cache locking is higher than that of static cache analysis
  - Due to the cache-related preemption delay:
    - Static cache locking: constant
    - Static cache analysis: increases linearly with cache size
18. Results Analysis (3)
- Impact of the degree of associativity
  - Static cache locking works better as the degree of associativity increases
  - Static cache locking exploits the increasing degree of associativity to eliminate intra- and inter-task interferences
  - Static cache analysis suffers from the pessimistic way instructions are classified (hit or miss)
19. Results Analysis (4)
- Compared performance of Lock-MU and Lock-MI
  - CPU utilization with Lock-MI is generally worse (larger) than with Lock-MU
  - However, Lock-MI accepts as many task sets as Lock-MU, and in some situations accepts task sets not feasible under Lock-MU (e.g. task set Medium for a 4-way set-associative cache of 1 Kbyte)
20. Conclusions
- Key benefits of static cache locking
  - Makes memory access time predictable
  - Considers both intra- and inter-task interferences in a unified way
  - Alleviates the need for complex static analysis techniques
- Proposed two pseudo-polynomial algorithms
  - Outperform static cache analysis for larger cache sizes and higher degrees of associativity
- Future work
  - Analysis of the sensitivity of the algorithms to the chosen paths
  - Extension to data caches, unified caches and multi-level caches
  - Dynamic cache locking strategies
21. Sketch of Proof (Optimality of Lock-MU)
- Ci = sum over j of ni,j * ti,j, where ni,j is the number of accesses to program line Li,j along si and ti,j is either thit or tmiss
- Minimizing U = sum over i of Ci/Pi is therefore equivalent to minimizing sum over i,j of (ni,j/Pi) * ti,j (the Pi are constants)
- The contribution of each cache set is independent, so minimizing the utilization is equivalent to minimizing, for every set s, the contribution of the lines of Ls
- Minimizing the utilization is then equivalent to locking in the cache (i.e. setting ti,j = thit) the W program lines of each set with the highest ratio ni,j/Pi