Hardware/Software Codesign of Embedded Systems Power/Voltage Management Voicu Groza School of Information Technology and Engineering Groza_at_SITE.uOttawa.ca – PowerPoint PPT presentation

1
Hardware/Software Codesign of Embedded Systems
Power/Voltage Management
Voicu Groza, School of Information Technology and Engineering, Groza_at_SITE.uOttawa.ca
2
Embedded Systems
  • Power/Energy Aware Embedded Systems
  • Dynamic Voltage Scheduling
  • Dynamic Power Management

http://www.phys.ncku.edu.tw/htsu/humor/fry_egg.html
Surpassed hot (kitchen) plate? Why not use it?
3
Processing units
  • Need for efficiency (power + energy)

"Power is considered as the most important constraint in embedded systems" [in: L. Eggermont (ed.): Embedded Systems Roadmap 2002, STW]
"Current smart phones can hardly be operated for more than an hour, if data is being transmitted." [from a report of the Financial Times, Germany, on an analysis by Credit Suisse First Boston; http://www.ftd.de/tm/tk/9580232.html?nvse]
4
The energy/flexibility conflict - Intrinsic Power Efficiency -
[Figure: operations per watt (MOPS/mW), from 0.01 to 10, plotted against technology (1.0µ, 0.5µ, 0.25µ, 0.13µ, 0.07µ); labels: Ambient Intelligence, DSP-ASIPs, hardwired muxed ASIC, Reconfigurable Computing, Processors (µPs)]
Necessary to optimize HW/SW; otherwise the price for software flexibility cannot be paid!
[H. de Man, Keynote, DATE02; T. Claasen, ISSCC99]
5
Power and energy are related to each other: $E = \int P \, dt$
[Figure: power P plotted over time t; the energy E is the area under the curve]
In many cases, faster execution also means less
energy, but the opposite may be true if power has
to be increased to allow faster execution.
6
Low Power vs. Low Energy Consumption
  • Minimizing the power consumption is important for
      • the design of the power supply
      • the design of voltage regulators
      • the dimensioning of interconnect
      • short term cooling
  • Minimizing the energy consumption is important due to
      • restricted availability of energy (mobile systems)
      • limited battery capacities (only slowly improving)
      • very high costs of energy (solar panels, in space)
      • cooling
          • high costs
          • limited space
      • dependability
      • long lifetimes, low temperatures

7
Application Specific Circuits (ASICs) or Full Custom Circuits
  • Custom-designed circuits necessary
      • if ultimate speed or
      • energy efficiency is the goal and
      • large numbers can be sold.
  • Approach suffers from
      • long design times,
      • lack of flexibility (changing standards) and
      • high costs (e.g. mask costs in the millions).

8
Mask cost for specialized HW becomes very expensive
⇒ Trend towards implementation in software
http://www.molecularimprints.com/Technology/tech_articles/MII_COO_NIST_2001.PDF
9
Power Consumption of a Gate
10
Fundamentals of dynamic voltage scaling (DVS)
Power consumption of CMOS circuits (ignoring leakage)
Delay for CMOS circuits
⇒ Decreasing Vdd reduces P quadratically, while the run-time of algorithms is only linearly increased (ignoring the effects of the memory system).
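The two formulas on this slide did not survive in the transcript. The standard first-order CMOS expressions, which match the statements above, are usually written as follows. The symbol names are assumptions where the slides do not define them: α is the switching activity, C_L the load capacitance, f the clock frequency, V_t the threshold voltage and k a technology-dependent constant.

```latex
P = \alpha \, C_L \, V_{dd}^{2} \, f
\qquad
\tau = k \, C_L \, \frac{V_{dd}}{(V_{dd} - V_t)^{2}}
```

For V_dd well above V_t the delay grows roughly as 1/V_dd, which is the "only linearly increased" run-time, while P falls with the square of V_dd.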
11
Potential for Energy Optimization
  • Saving energy under given time constraints
      • Reduce the supply voltage Vdd
      • Reduce switching activity α
      • Reduce the load capacitance CL
      • Reduce the number of cycles #Cycles
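The four levers in this list can be read off a simple energy model. The sketch below assumes the usual first-order relation E = α · C_L · Vdd² · #Cycles (dynamic switching energy only, leakage ignored). The function name and all numeric values are hypothetical and chosen only for illustration.

```python
def dynamic_energy(alpha, c_load, vdd, cycles):
    """Dynamic switching energy in joules: alpha * C_L * Vdd^2 per cycle.

    First-order model that ignores leakage (an assumption of this sketch).
    """
    return alpha * c_load * vdd ** 2 * cycles


# Hypothetical values, not taken from the slides.
base = dynamic_energy(alpha=0.5, c_load=1e-9, vdd=5.0, cycles=1_000_000)
low_v = dynamic_energy(alpha=0.5, c_load=1e-9, vdd=2.5, cycles=1_000_000)
half_cycles = dynamic_energy(alpha=0.5, c_load=1e-9, vdd=5.0, cycles=500_000)

print(base / low_v)        # halving Vdd: energy drops by about a factor of 4
print(base / half_cycles)  # halving the cycle count: about a factor of 2
```

Energy is linear in α, C_L and the cycle count but quadratic in Vdd, which is why the supply voltage is the most effective lever.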

12
Processors
At the chip level, embedded chips include
micro-controllers and microprocessors.
Micro-controllers are the true workhorses of the
embedded family. They are the original embedded
chips and include those first employed as
controllers in elevators and thermostats [Ryan, 1995].
13
Voltage Scaling and Power Management: Dynamic Voltage Scaling
[Figure: energy per cycle (nJ) as a function of Vdd]
14
Power density continues to get worse
Nuclear reactor
15
Need to consider CPU & System Power
[Courtesy: N. Dutt; Source: V. Tiwari]
16
New ideas can actually reduce energy consumption
[Figure: Pentium vs. Crusoe, running the same multimedia application; as published by Transmeta, www.transmeta.com]
17
Dynamic power management (DPM)
Example: StrongARM SA1100
  • RUN: operational (400 mW)
  • IDLE: a software routine may stop the CPU when not in use, while monitoring interrupts (50 mW)
  • SLEEP: shutdown of on-chip activity (160 µW)

[Figure: state diagram. RUN ↔ IDLE: 10 µs in each direction; RUN → SLEEP and IDLE → SLEEP: 90 µs (power fault signal); SLEEP → RUN: 160 ms]
18
Variable-voltage/frequency example: INTEL Xscale
OS should schedule distribution of the energy budget.
[From Intel's Web Site]
19
Key requirement 2: Code-size efficiency
  • CISC machines: RISC machines designed for run-time efficiency, not for code-size efficiency
  • Compression techniques: key idea

20
Code-size efficiency
  • Compression techniques (continued)
      • 2nd instruction set, e.g. ARM Thumb instruction set

Dynamically decoded at run-time.
Same approach for LSI TinyRisc. Requires support by compiler, assembler etc.
21
Dictionary approach, two level control store (indirect addressing of instructions)
"Dictionary-based coding schemes cover a wide range of various coders and compressors. Their common feature is that the methods use some kind of a dictionary that contains parts of the input sequence which frequently appear. The encoded sequence in turn contains references to the dictionary elements rather than containing these over and over." [Á. Beszédes et al.: Survey of Code-Size Reduction Methods, ACM Computing Surveys, Vol. 35, Sept. 2003, pp. 223-267]
22
Key idea (for d-bit instructions)
Uncompressed storage of a d-bit-wide instructions requires a × d bits. In compressed code, each instruction pattern is stored only once. Hopefully, a × b + c × d < a × d. Called nanoprogramming in the Motorola 68000.
For each instruction address, S contains the table address of the instruction.
[Figure: the instruction address selects one of the a entries of S, each b bits wide (b ≪ d); the entry addresses the table of used instructions (dictionary), which holds c ≤ 2^b entries of d bits each (c small) and feeds the CPU]
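A minimal sketch of the dictionary idea, assuming each instruction is an opaque pattern and that the index width b is the smallest number of bits with c ≤ 2^b. The names `compress` and `sizes` and the sample program are hypothetical; the sketch only illustrates the a × b + c × d versus a × d comparison.

```python
from math import ceil, log2


def compress(program):
    """Return (s, dictionary): s holds one dictionary index per instruction address."""
    dictionary = []
    index_of = {}
    s = []
    for instr in program:
        if instr not in index_of:
            index_of[instr] = len(dictionary)   # store each pattern only once
            dictionary.append(instr)
        s.append(index_of[instr])
    return s, dictionary


def sizes(program, d):
    """Bits needed uncompressed (a*d) and compressed (a*b + c*d)."""
    s, dictionary = compress(program)
    a, c = len(program), len(dictionary)
    b = max(1, ceil(log2(c)))                   # index width, so that c <= 2**b
    return a * d, a * b + c * d


# Hypothetical program with many repeated instruction patterns.
program = ["ld", "add", "st", "ld", "add", "st", "ld", "add"]
s, dictionary = compress(program)
uncompressed, compressed = sizes(program, d=32)
print(uncompressed, compressed)
```

The saving depends entirely on repetition: if every instruction is distinct, c equals a and the compressed form is larger than the original.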
23
Key requirement 3: Run-time efficiency - Domain-oriented architectures -
Application: $y_j = \sum_{i=0}^{n-1} x_{j-i} \cdot a_i$
$\forall i,\ 0 \le i \le n-1:\ y_i(j) = y_{i-1}(j) + x_{j-i} \cdot a_i$
Architecture: Example: Data path ADSP210x
Application maps nicely onto architecture:
MR:=0; A1:=1; A2:=n-2; MX:=x[n-1]; MY:=a[0];
for (j:=1 to n) {MR:=MR+MX*MY; MY:=a[A1]; MX:=x[A2]; A1++; A2--}
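The sum on this slide is a chain of multiply-accumulate steps, which is what the MR/MX/MY data path executes. A plain Python sketch of the same computation follows; the function name and the sample data are hypothetical, and the loop mirrors only the arithmetic, not the ADSP210x address generators.

```python
def fir_output(x, a, j):
    """Compute y_j = sum over i of x[j-i] * a[i], one multiply-accumulate per tap.

    Requires j >= len(a) - 1 so that every index j - i is valid.
    """
    n = len(a)
    if j < n - 1 or j >= len(x):
        raise ValueError("j must satisfy len(a) - 1 <= j < len(x)")
    mr = 0                      # accumulator, plays the role of MR
    for i in range(n):
        mr += x[j - i] * a[i]   # MR := MR + MX * MY
    return mr


x = [1, 2, 3, 4, 5]   # hypothetical input samples
a = [1, 0, -1]        # hypothetical filter coefficients
print(fir_output(x, a, 4))
```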
24
Modulo addressing
Modulo addressing: Am++ ≡ Am := (Am+1) mod n (implements ring or circular buffer in memory)
[Figure: a sliding window over the signal x(t) holds the n most recent values.
Memory at t = t1: .. x[t1-1], x[t1], x[t1-n+1], x[t1-n+2] ..
Memory at t2 = t1+1: .. x[t1-1], x[t1], x[t1+1], x[t1-n+2] ..]
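In software, the same sliding window can be sketched as a ring buffer whose write position advances modulo n. The class below is a hypothetical illustration, not a model of any particular DSP address unit.

```python
class RingBuffer:
    """Keeps the n most recent samples using modulo addressing."""

    def __init__(self, n):
        self.n = n
        self.data = [0] * n
        self.pos = 0                          # plays the role of address register Am

    def push(self, sample):
        self.data[self.pos] = sample          # overwrite the oldest value
        self.pos = (self.pos + 1) % self.n    # Am := (Am + 1) mod n

    def newest_first(self):
        """Return the stored samples from most recent to oldest."""
        return [self.data[(self.pos - 1 - k) % self.n] for k in range(self.n)]


buf = RingBuffer(4)
for sample in [10, 11, 12, 13, 14]:
    buf.push(sample)
print(buf.newest_first())
```

Only one memory cell changes per new sample, which matches the two memory snapshots on the slide: the newest value replaces the oldest one in place.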
25
Saturating arithmetic
  • Returns largest/smallest number in case of over/underflows
  • Example:
      a                                   0111
      b                                 + 1001
      standard wrap-around arithmetic  (1)0000
      saturating arithmetic               1111
      (a+b)/2: correct                    1000
      wrap-around arithmetic              0000
      saturating arithmetic + shifted     0111 (almost correct)
  • Appropriate for DSP/multimedia applications
      • No timeliness of results if interrupts are generated for overflows
      • Precise values less important
      • Wrap-around arithmetic would be worse.
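The 4-bit example can be checked with a few lines of Python. This is a sketch for unsigned values only; the helper names are hypothetical.

```python
BITS = 4
MAX = (1 << BITS) - 1            # 1111, the largest unsigned 4-bit value


def add_wrap(a, b):
    """Wrap-around addition: the carry out of the top bit is dropped."""
    return (a + b) & MAX


def add_sat(a, b):
    """Saturating addition: clamp to the largest representable value."""
    return min(a + b, MAX)


a, b = 0b0111, 0b1001
exact_avg = (a + b) // 2          # needs 5 bits for the intermediate sum
wrap_avg = add_wrap(a, b) >> 1
sat_avg = add_sat(a, b) >> 1

for name, value in [("exact", exact_avg), ("wrap", wrap_avg), ("sat", sat_avg)]:
    print(name, format(value, "04b"))
```

The saturated result is off by one, while the wrapped result is off by the full value, which is why saturation is the safer default for signal data.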
26
Fixed-point arithmetic
Shifting required after multiplications and
divisions in order to maintain binary point.
27
Properties of fixed-point arithmetic
  • Automatic scaling is a key advantage for multiplications.
  • Example: x = 0.5 × 0.125 + 0.25 × 0.125 = 0.0625 + 0.03125 = 0.09375. For iwl=1 and fwl=3 decimal digits, the less significant digits are automatically chopped off: x = 0.093. Like a floating point system with numbers ∈ [0..1), with no stored exponent (bits used to increase precision).
  • Appropriate for DSP/multimedia applications (well-known value ranges).
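The decimal example can be reproduced with scaled integers. The sketch below assumes non-negative values and fwl = 3 decimal digits, and it chops after every multiplication rather than once at the end; for these particular numbers both orders give 0.093. The helper names are hypothetical.

```python
FWL = 3                  # fractional word length in decimal digits
SCALE = 10 ** FWL


def to_fixed(text):
    """Convert a non-negative decimal string such as '0.125' to a scaled integer."""
    whole, _, frac = text.partition(".")
    return int(whole) * SCALE + int(frac.ljust(FWL, "0")[:FWL])


def fx_mul(p, q):
    """Multiply two scaled integers and rescale; excess digits are chopped off."""
    return (p * q) // SCALE


x = fx_mul(to_fixed("0.5"), to_fixed("0.125")) + fx_mul(to_fixed("0.25"), to_fixed("0.125"))
print(x, "/", SCALE)     # scaled integer and its scale factor
```

The integer division in `fx_mul` is the decimal counterpart of the shift that the previous slide requires after multiplications.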

28
Spatial vs. Dynamic Supply Voltage Management
  • Analogy of biological blood systems
      • Different supply to different regions
      • High pressure: high pulse count and high activity
      • Low pressure: low pulse count and low activity

Not all components require same performance.
Required performance may change over time.
29
(No Transcript)
30
Example: Processor with 3 voltages. Case a): Complete task ASAP
Task that needs to execute $10^9$ cycles within 25 seconds.
$E_a = 10^9 \times 40 \times 10^{-9} = 40$ J
31
Case b): Two voltages
$E_b = 750 \cdot 10^6 \times 40 \times 10^{-9} + 250 \cdot 10^6 \times 10 \times 10^{-9} = 32.5$ J
32
Case c): Optimal voltage
$E_c = 10^9 \times 25 \times 10^{-9} = 25$ J
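The three cases can be recomputed from the energy-per-cycle figures that appear in the formulas (40 nJ, 25 nJ and 10 nJ). The sketch below assumes these belong to 5.0 V, 4.0 V and 2.5 V respectively; that mapping is not stated in the transcript and is an assumption here, as are the names used.

```python
# Energy per cycle in joules. The voltage-to-energy mapping is an assumption.
ENERGY_PER_CYCLE = {5.0: 40e-9, 4.0: 25e-9, 2.5: 10e-9}


def schedule_energy(schedule):
    """schedule: list of (vdd, cycles) pairs; returns the total energy in joules."""
    return sum(ENERGY_PER_CYCLE[vdd] * cycles for vdd, cycles in schedule)


e_a = schedule_energy([(5.0, 1_000_000_000)])                        # case a
e_b = schedule_energy([(5.0, 750_000_000), (2.5, 250_000_000)])      # case b
e_c = schedule_energy([(4.0, 1_000_000_000)])                        # case c

for label, energy in [("a", e_a), ("b", e_b), ("c", e_c)]:
    print(f"case {label}: {energy:.1f} J")
```

Running all cycles at the single intermediate voltage uses the least energy, even though case b spends a quarter of its cycles at the lowest voltage.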
33
Observations
⇒ A minimum energy consumption is achieved for the ideal supply voltage of 4 Volts.
In the following: variable voltage processor = processor that allows any supply voltage up to a certain maximum.
It is expensive to support truly variable voltages, and therefore, actual processors support only a few fixed voltages.
[Ishihara, Yasuura: Voltage scheduling problem for dynamically variable voltage processors, Proc. of the 1998 International Symposium on Low Power Electronics and Design (ISLPED98)]
34
Generalization
  • Lemma [Ishihara, Yasuura]:
      • If a variable voltage processor completes a task before the deadline, then the energy consumption can be reduced.
      • If a processor uses a single supply voltage V and completes a task T just at its deadline, then V is the unique supply voltage which minimizes the energy consumption of T.
      • If a processor can only use a number of discrete voltage levels, then a voltage schedule with at most two voltages minimizes the energy consumption under any time constraint.
      • If a processor can only use a number of discrete voltage levels, then the two voltages which minimize the energy consumption are the two immediate neighbors of the ideal voltage V_ideal possible for a variable voltage processor.
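When only two neighboring levels are available, the remaining question is how many cycles to run at each so that the task ends exactly at its deadline. The sketch below solves that split. The clock frequencies (25 MHz and 50 MHz) are assumptions that do not appear in the transcript; with them, the result matches the 750 · 10^6 / 250 · 10^6 split of case b.

```python
def split_cycles(total_cycles, deadline, f_low, f_high):
    """Cycles to run at f_high so that the task finishes exactly at the deadline.

    Solves x / f_high + (total_cycles - x) / f_low = deadline for x.
    """
    ideal = total_cycles / deadline
    if not f_low <= ideal <= f_high:
        raise ValueError("ideal frequency must lie between the two levels")
    if f_low == f_high:
        return float(total_cycles)
    return (total_cycles / f_low - deadline) / (1 / f_low - 1 / f_high)


# Assumed frequencies: 25 MHz at the low level, 50 MHz at the high level.
x_high = split_cycles(1e9, 25.0, f_low=25e6, f_high=50e6)
x_low = 1e9 - x_high
print(round(x_high), round(x_low))
```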

35
The case of multiple tasks: Assigning optimum voltages to a set of tasks
  • N: the number of tasks
  • EC_j: the number of execution cycles of task j
  • L: the number of voltages of the target processor
  • V_i: the i-th voltage, with 1 ≤ i ≤ L
  • F_i: the clock frequency for supply voltage V_i
  • T: the global deadline at which all tasks must have been completed
  • SC_j: the average switching capacitance during the execution of task j (SC_j comprises the actual capacitance CL and the switching activity α)
  • X_{i,j}: the number of clock cycles task j is executed at voltage V_i
36
Designing an IP model
  • Simplifying assumptions of the IP model include the following:
      • There is one target processor that can be operated at a limited number of discrete voltages.
      • The time for voltage and frequency switches is negligible.
      • The worst case number of cycles for each task is known.
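Using the symbols defined for the multi-task case (N, L, EC_j, V_i, F_i, T, SC_j, X_{i,j}), an integer programming model consistent with these assumptions can be sketched as follows. This is a reconstruction in the spirit of the cited Ishihara/Yasuura formulation, not a copy of the slide, and it assumes that the energy per cycle of task j at voltage V_i is SC_j · V_i².

```latex
\begin{aligned}
\text{minimize}\quad & E = \sum_{j=1}^{N} \sum_{i=1}^{L} SC_j \cdot X_{i,j} \cdot V_i^{2} \\
\text{subject to}\quad & \sum_{i=1}^{L} X_{i,j} = EC_j \qquad \forall j \in \{1, \dots, N\} \\
& \sum_{j=1}^{N} \sum_{i=1}^{L} \frac{X_{i,j}}{F_i} \le T \\
& X_{i,j} \in \mathbb{N}_0
\end{aligned}
```

The first constraint makes every task execute all of its cycles, and the second keeps the total execution time within the global deadline.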

37
Experimental Results
38
Voltage Scheduling Techniques
  • Static Voltage Scheduling
      • Extension: deadline for each task
      • Formulation as IP problem (SS)
      • Decisions taken at compile time
  • Dynamic Voltage Scheduling
      • Decisions taken at run time
      • 2 variants:
          • arrival times of tasks are known (SD)
          • arrival times of tasks are unknown (DD)

39
Dynamic Voltage Control by Operating Systems
Voltage control and task scheduling by the operating system to minimize energy consumption
[Okuma, Ishihara, and Yasuura: Real-Time Task Scheduling for a Variable Voltage Processor, Proc. of the 1999 International Symposium on System Synthesis (ISSS'99)]
  • Target:
      • single processor system
      • only the OS can issue voltage control instructions
      • voltage can be changed anytime
      • only one supply voltage is used at any time
      • overhead for switching is negligible
      • static determination of worst case execution cycles

40
Problem for Operating Systems
[Figure: Task1, Task2 and Task3, each with an arrival time and a deadline; supply voltages 2.5 V, 4.0 V and 5.0 V]
What is the optimum supply voltage assignment for each task in order to obtain minimum energy consumption?
41
The proposed Policy
Consider a time slot the task can use without violating real-time constraints of other tasks executed in the future.
  • Once the time slot is determined:
      • The task is executed at a frequency of WCEC / T Hz, where WCEC is the task's worst case execution cycles and T is the length of the time slot
      • The scheduler assigns start and end times of the time slot
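The policy reduces to one division followed by a level lookup. The sketch below assumes that T is the length of the time slot in seconds and that the processor offers a small table of (voltage, maximum frequency) levels; the table values and function names are hypothetical.

```python
def slot_frequency(wcec, slot_start, slot_end):
    """Lowest clock frequency in Hz that finishes wcec cycles within the slot."""
    length = slot_end - slot_start
    if length <= 0:
        raise ValueError("time slot must have positive length")
    return wcec / length


def pick_level(required_hz, levels):
    """levels: list of (vdd, max_hz). Returns the lowest voltage that is fast enough."""
    feasible = [level for level in levels if level[1] >= required_hz]
    if not feasible:
        raise ValueError("no level meets the required frequency")
    return min(feasible)


# Hypothetical voltage/frequency table.
levels = [(2.5, 25e6), (4.0, 40e6), (5.0, 50e6)]
f = slot_frequency(wcec=60_000_000, slot_start=0.0, slot_end=2.0)
level = pick_level(f, levels)
print(f, level)
```

With discrete levels the chosen frequency is usually higher than WCEC / T, so the task finishes before the slot ends and the remaining time is left unused in this simple sketch.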

42
Two Algorithms
  • Two possible situations
      • The arrival time of tasks is known
          • SD Algorithm
          • Static ordering and Dynamic voltage assignment
      • The arrival time of tasks is unknown
          • DD Algorithm
          • Dynamic ordering and Dynamic voltage assignment

43
SD Algorithm (CPU Time Allocation)
  • Arrival time of all tasks is known
  • Deadline of all tasks is known
  • WCEC of all tasks is known
  • CPU time can be allocated statically
  • CPU time is assigned to each task
  • assuming maximum supply voltage
  • assuming WCEC

44
SD Algorithm (Start Time Assignment)
  • In SD, it is possible to assign a lower supply voltage to Task2 using the free time
  • In SS, the scheduler can't use the free time because it has statically assigned voltage

45
DD Algorithm
When a task's arrival time is unknown, its end time can't be predicted statically using the SD algorithm ⇒ no predetermined CPU time, start or end times
  • Start Time Assignment
      • New task arrives: it either
          • preempts the currently executing task, or
          • starts right after the currently executing task
      • Starting time is determined

46
DD Algorithm (cont.)
End Time Prediction: based on the currently executing task's end time prediction, add the new task's WCEC time at maximum voltage.
47
DD Algorithm (cont.)
⇒ If the currently executing task finishes earlier, then the new task can start sooner and run slower at a lower voltage.
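A rough sketch of the end time prediction and the resulting slowdown, assuming the predicted end time stays fixed once it has been computed and that the frequency can be chosen freely. The function names and numbers are hypothetical; this is not the authors' implementation.

```python
def predict_end(current_end_prediction, wcec, f_max):
    """Predicted end: predecessor's predicted end plus the WCEC time at maximum voltage."""
    return current_end_prediction + wcec / f_max


def run_frequency(actual_start, predicted_end, wcec):
    """Frequency needed to spread wcec cycles from the actual start to the predicted end."""
    return wcec / (predicted_end - actual_start)


WCEC = 50_000_000
end = predict_end(current_end_prediction=4.0, wcec=WCEC, f_max=50e6)
f_on_time = run_frequency(4.0, end, WCEC)   # predecessor ended as predicted
f_early = run_frequency(3.0, end, WCEC)     # predecessor ended one second early
print(end, f_on_time, f_early)
```

The earlier the predecessor finishes, the longer the slot and the lower the frequency and voltage that still meet the predicted end time.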
48
Comparison: SD vs. DD
[Figure: start time and end time of each task under the SD algorithm and under the DD algorithm]
49
Experimental Results: Energy
  • Normal: Processor runs at maximum supply voltage
  • SS: Static Scheduling
  • SD: Scheduling done by SD Algorithm
  • DD: Scheduling done by DD Algorithm
50
Dynamic power management (DPM)
Dynamic power management tries to assign optimal power saving states. Requires hardware support.
Example: StrongARM SA1100
  • RUN: operational (400 mW)
  • IDLE: a software routine may stop the CPU when not in use, while monitoring interrupts (50 mW)
  • SLEEP: shutdown of on-chip activity (160 µW)
51
The opportunity: Reduce power according to workload
Desired: shutdown only during long idle times ⇒ tradeoff between savings and overhead
52
The challenge
  • Questions
  • When to go to a power-saving state?
  • Is an idle period long enough for shutdown?
  • Predicting the future
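Whether an idle period is "long enough" can be estimated with a break-even calculation. The sketch below uses the StrongARM SA1100 figures quoted earlier (400 mW RUN, 50 mW IDLE, 160 µW SLEEP, 90 µs to enter SLEEP, 160 ms to wake up). It assumes that the processor draws RUN power during both transitions; that is an assumption of this sketch, not a datasheet value.

```python
P_RUN, P_IDLE, P_SLEEP = 0.400, 0.050, 160e-6   # watts
T_TO_SLEEP, T_WAKE = 90e-6, 160e-3              # seconds


def break_even_time(p_transition=P_RUN):
    """Shortest idle period for which SLEEP costs less energy than staying IDLE.

    IDLE energy:  P_IDLE * t
    SLEEP energy: p_transition * t_tr + P_SLEEP * (t - t_tr)
    Setting both equal and solving for t gives the break-even time.
    """
    t_tr = T_TO_SLEEP + T_WAKE
    return t_tr * (p_transition - P_SLEEP) / (P_IDLE - P_SLEEP)


def should_sleep(predicted_idle):
    """Shut down only if the predicted idle period exceeds the break-even time."""
    return predicted_idle > break_even_time()


print(break_even_time())
print(should_sleep(2.0), should_sleep(0.5))
```

The decision still depends on a prediction of the idle period, which is the "predicting the future" problem named in the list above.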

53
Adaptive Stochastic Models
Sliding Window (SW) [Chung, DATE 99]
  • Interpolating pre-computed optimization tables to
    determine power states
  • Using sliding windows to adapt to non-stationarity

54
Comparison of different approaches
P: average power; Nsd: number of shutdowns; Nwd: wrong shutdowns (actually waste energy)
55
What about multitasking?
⇒ Coordinate multiple workload sources
[Figure: the user and several programs act as requesters; their requests pass through the operating system and the power manager to the device]
56
Requesters
  • Concurrent processes
  • Created, executed, and terminated
  • Have different device utilization
  • Generate requests only when running (occupy CPU)
  • Power manager is notified when processes change state

We use processes to represent requesters: requester = process
57
Task Scheduling
Rearrange task execution to cluster similar
utilization and idle periods
[Figure: timeline of tasks 1, 2 and 3 executing in time quanta of length T, with idle periods between them]
T: time quantum
58
Power-aware OS implementations
  • Windows APM and ACPI
      • Device-centric, shutdown based
  • Power-aware Linux
      • Good research platform (several partial implementations, e.g. U. Delft, Compaq, etc.)
      • Quite high overhead for low-end embedded systems
  • Power-aware ECOS
      • Good research platform (HP-Unibo implementation)
      • Lower overhead than Linux, modular
  • Micro OSes

59
Application Aware DPM. Example: Communication Power
NICs powered by portables reduce battery life
8 hours
  • In general
      • Higher bit rates lead to higher power consumption
      • 90% of power for listening to a radio channel
  • ⇒ Proper use of PHY layer services by MAC is critical!

60
Off mode power savings
[Figure: server, access point (buffering, beacons) and client (requests, refill). Power over time in doze mode and in off mode, showing the energy saving. Playback buffer level over time: playing from "buffer full" until the low water mark (LWM) is reached]
61
LWM / Buffer characteristics
Where to put the LWM?
  • Low LWM
      • Higher error probability
      • Exploits NIC off-state
      • Min. value: to allow data acquisition
  • High LWM
      • Lower error probability
      • Incurs NIC off-state overhead
      • Max. value: Buffer_length - 1 block

How long should the buffer be?
  • Depends on memory availability
  • The longer the buffer, the higher the NIC off-state benefits

Buffering strategies should be power aware!
62
Comparison
  • Low length buffers incur off mode power overhead
  • Good power saving for high length buffers

63
Exploiting application knowledge
  • Approximate processing [Chandrakasan98-01]
      • Tradeoff quality for energy (e.g. lossy compression)
      • Design algorithms for graceful degradation
  • Enforce power-efficiency in programming
      • Avoid repetitive polling [Intel98]
      • Use event-based activation (interrupts)
      • Localize computation whenever possible
          • Helps shutdown of peripherals
          • Helps shutdown of memories