Title: S2DB: A Novel Simulation-Based Debugger for Sensor Network Applications
1. S2DB: A Novel Simulation-Based Debugger for Sensor Network Applications
- Ye Wen, Rich Wolski, Selim Gürün
- Department of Computer Science
- UC Santa Barbara
- EMSOFT 2006
2. Sensor Networks
- Sensor networks: an ad-hoc community of thousands of heterogeneous, resource-constrained, tiny devices
- Deployed in remote locations under extreme conditions
- (Image sources: Matt Welsh; Culler '01)
- How can we debug an application on hundreds of sensor nodes concurrently and realistically without installing them in the field?
3. Outline
- Overview
- Other debugging methods
- Our approach: S2DB
- Building blocks
- Debugging points
- Virtual instrumentation
- Coordinated break
- Time traveling
- Evaluation
- Summary and conclusion
4. Other Debugging Approaches
- JTAG
- Set breakpoints, step-execute the program, and query hardware
- Not possible to synchronize I/O and program execution
- Visualization tools
- Sympathy, SpyGlass, Surge Network Viewer, MoteView
- Display the network topology and analyze the data flow
- Require a data collection agent on each sensor node
- Simulation-based debuggers
- TOSSIM: debugs the emulated code, event-based
- ATEMU, Avrora, and EmStar have similar concepts that we build on and extend in various ways
5. Simulation-based Sensor network Debugger (S2DB)
- Base: a scalable distributed sensor network simulator
- Fidelity: cycle-accurate full-system simulation of sensor network applications
- Performance: real-time speed for hundreds of sensor nodes
- Scalability: simulates thousands of sensor nodes using a cluster computer
- Novel debugging facilities for sensor networks:
- For a single device: debugging points for device state inspection; virtual debugging hardware for software-controlled debugging
- For multiple devices: coordinated break conditions for parallel debugging
- For the network: time traveling for trace analysis
6. Internal Distributed Simulator Design
- Correct, faithful execution of AVR binaries
- Rich, complete hardware simulation
- Simple, effective radio synchronization protocol
- Automatic node partitioning
7. Debugging Points
- Conventional debuggers expose registers, the PC, and memory
- S2DB operates on debugging points: access points to the internal states of the simulated machine
- The conventional debug points are included among S2DB's debugging points
8. New Debugging Points
- Further debugging points that expose the full system state
- Software-defined events based on virtual debugging hardware
- Synthetic high-level system events, derived from combinations of simple events/states
13. Debug Points: Setting and Executing
- Print a variable X:
  > print mem(X)
- Break execution on erasing the first page of flash:
  > break when flash_access(erase, 0x1)
- Break execution when the PC matches foo and a program variable Y is larger than 1:
  > break when pc() == foo && mem(Y) > 1
- In a complex expression, we evaluate lower-overhead debug functions first to optimize condition evaluation
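The cost-ordered evaluation described above can be sketched in Python. This is a minimal illustration, not S2DB's actual API: the predicate names, cost values, and state layout are assumptions made for the example.

```python
# Sketch: evaluate a conjunction of debug-point predicates cheapest-first,
# short-circuiting so costlier checks (e.g. memory reads) run only when needed.
# Names, costs, and the state dict are illustrative, not from the paper.

def make_predicate(name, cost, fn):
    return {"name": name, "cost": cost, "fn": fn}

def evaluate_conjunction(predicates, state, trace=None):
    """Evaluate predicates in ascending cost order with short-circuiting."""
    for p in sorted(predicates, key=lambda q: q["cost"]):
        if trace is not None:
            trace.append(p["name"])     # record evaluation order
        if not p["fn"](state):
            return False                # short-circuit: skip costlier checks
    return True

state = {"pc": 0x1234, "mem": {"Y": 2}}
preds = [
    make_predicate("mem(Y) > 1", cost=10, fn=lambda s: s["mem"]["Y"] > 1),
    make_predicate("pc() == foo", cost=1, fn=lambda s: s["pc"] == 0x1234),
]
trace = []
hit = evaluate_conjunction(preds, state, trace)
```

The cheap PC comparison runs before the memory lookup, so when the PC does not match, the memory read is never performed.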
14. Virtual Hardware Based Code Instrumentation
- Three virtual registers in reserved AVR address space: command, input, output
- Useful for injecting a print command into source code without going through the serial port
- Allows custom debugging points, e.g. monitoring the nth execution of a function:
- The user sets the debugger to monitor (ID, VALUE)
- A user command writes <id, value> to the output register
- The debugger interrupts execution if id == ID and value == VALUE
- 3 register accesses in total
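The monitoring scheme above can be sketched as follows. The register addresses and class names are hypothetical placeholders for illustration, not the reserved addresses S2DB actually uses.

```python
# Sketch of virtual-register instrumentation: the simulator intercepts
# stores to reserved addresses and compares them against a user-set watch.
# The addresses and class below are illustrative assumptions.

CMD_REG, IN_REG, OUT_REG = 0xFF00, 0xFF01, 0xFF02  # hypothetical reserved addresses

class VirtualDebugHardware:
    def __init__(self):
        self.watch = None          # (ID, VALUE) set by the user
        self.break_hit = False

    def set_watch(self, ident, value):
        self.watch = (ident, value)

    def on_write(self, addr, ident, value):
        # Called by the simulator on each store to a reserved address.
        if addr == OUT_REG and self.watch == (ident, value):
            self.break_hit = True  # debugger interrupts execution

hw = VirtualDebugHardware()
hw.set_watch(ident=7, value=3)     # e.g. break on the 3rd execution of function 7
hw.on_write(OUT_REG, 7, 2)         # value does not match yet
no_break_yet = hw.break_hit
hw.on_write(OUT_REG, 7, 3)         # match: execution breaks
```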
15. Parallel Debugging
- Debugging single nodes is useful, but not enough for distributed applications
- Many bugs emerge from the interactions between nodes
- E.g. packet delivery failures in network protocols and race conditions in distributed applications
- Goals:
- Display the status of multiple devices in parallel
- Break the execution on multiple nodes simultaneously
- Step-execute multiple devices at the same pace
- These require clock synchronization
- Should be efficient and scalable
16. Partially Ordered Synchronization
- Partially ordered synchronization for the evaluation of coordinated break conditions in distributed simulation
- One node acts as the master
- Other nodes always follow the master, i.e. clock_i < clock_master
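The invariant can be sketched with a toy two-node loop. The step sizes and loop structure are illustrative assumptions; the point is only that a follower advances while staying strictly behind the master's clock.

```python
# Sketch of partially ordered synchronization with one master thread:
# the follower catches up but never passes the master, preserving
# clock_i < clock_master at every step. Step sizes are illustrative.

def simulate(master_steps, follower_step=1):
    master_clock = 0
    follower_clock = 0
    order_ok = True
    for _ in range(master_steps):
        master_clock += 10                   # master advances freely
        while follower_clock + follower_step < master_clock:
            follower_clock += follower_step  # follower catches up, never passes
            order_ok = order_ok and (follower_clock < master_clock)
    return master_clock, follower_clock, order_ok
```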
18. Coordinated Break Condition
- Coordinated break:
- Simple: break execution when all nodes' clocks reach time T
  > break when clock() == T
- Conjunction of atomic conditions:
  > break when node1.cond1 && ... && nodeX.condY
- Limitation: arbitrary conditions (e.g. disjunctions of atomic conditions) are not supported
- The constraint comes from the distributed simulation structure
- We sacrifice generality for scalability and performance
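A conjunction of per-node atomic conditions can be sketched as below: each node evaluates its own condition against its local state, and the break fires only when every node reports true. The node states and conditions are invented for illustration.

```python
# Sketch of a coordinated break over a conjunction of per-node atomic
# conditions. Each node checks its own predicate locally; the break
# triggers only when all hold. States and predicates are illustrative.

def coordinated_break(node_states, conditions):
    """conditions maps node id -> predicate on that node's state."""
    return all(cond(node_states[nid]) for nid, cond in conditions.items())

states = {1: {"temp": 30}, 2: {"queue_len": 5}}
conds = {
    1: lambda s: s["temp"] > 25,        # node1.cond1
    2: lambda s: s["queue_len"] >= 5,   # node2.cond2
}
hit = coordinated_break(states, conds)
```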
19. Time Traveling
- Analyze anomalies using trace logs
- Replay events before and at the time of the anomaly
- Periodic checkpointing saves the simulated network state
- Snapshots of CPU, memory, and hardware components
- Also: radio packet queue, receive/send queues, power status
- Small footprint: 5 KB snapshot size
- Flash is too large: use a log-based snapshot instead
- Other mechanisms can be used to trigger a checkpoint:
- Debugging points
- Break points
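The mix of whole-state snapshots and a log-based scheme for the large flash can be sketched as follows. The state layout, interval, and class are illustrative assumptions, not S2DB's implementation.

```python
# Sketch of periodic checkpointing: small state (CPU, memory, queues) is
# deep-copied every interval, while the large flash is reconstructed by
# replaying a write log up to the checkpoint. Layout is illustrative.

import copy

class Checkpointer:
    def __init__(self, interval):
        self.interval = interval
        self.checkpoints = []      # (clock, state snapshot, flash-log length)
        self.flash_log = []        # (addr, value) writes since start

    def write_flash(self, addr, value):
        self.flash_log.append((addr, value))

    def maybe_checkpoint(self, clock, state):
        if clock % self.interval == 0:
            self.checkpoints.append(
                (clock, copy.deepcopy(state), len(self.flash_log)))

    def restore(self, clock):
        # Latest checkpoint at or before `clock`; rebuild flash from the log.
        for cp_clock, state, log_len in reversed(self.checkpoints):
            if cp_clock <= clock:
                flash = {}
                for addr, value in self.flash_log[:log_len]:
                    flash[addr] = value
                return cp_clock, state, flash
        raise ValueError("no checkpoint available")

cp = Checkpointer(interval=100)
cp.maybe_checkpoint(0, {"pc": 0})
cp.write_flash(0x10, 0xAB)
cp.maybe_checkpoint(100, {"pc": 42})
clk, cpu_state, flash = cp.restore(150)   # roll back to the 100-cycle checkpoint
```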
20. Debug Point Cost
(Chart: comparison of debugging point cost)
21. Coordinated Break Points
(Chart: coordinated break point condition with multiple devices; the Y-axis is the ratio to the execution speed on a real device without condition monitoring)
22. Checkpointing Overhead
(Chart: each curve represents a configuration, e.g. 4x1 means 1 host running 4 nodes per host)
23. Conclusion and Summary
- S2DB contributions:
- Debugging points
- Virtual debugging hardware
- Coordinated break conditions for parallel debugging
- Time traveling for sensor network debugging
- The debugger's overhead on simulation is less than 10%
- Ongoing work:
- User interface
- A plug-in for the Eclipse IDE
- We expect feedback from the sensor network community to add new debugging features
24. Questions?
25. Stargate Simulation
- XScale processor
- ARM v5TE instruction set with XScale DSP extensions
- No Thumb instruction set support yet
- MMU, GPIO, interrupt controller, real-time clock
- Flash memory
- Memory-mapped I/O
- State machine based on the Intel Verilog model
- Flash latency estimated using empirical data
- 802.11 wireless card
- Serial interface
- Boots and runs Familiar Linux
26. Related Work
- Sensor network simulation:
- ATEMU, Avrora
- Full-system, multi-simulation, lock-step synchronization
- No sensor network gateway support
- EmTOS
- A wrapper library for TOSSIM and EmStar
- All applications must be recompiled to host machine code and linked against EmTOS
- Other simulation:
- SkyEye
- A full-system ARM emulator, including an LCD and a debugger
- Not intended for sensor networks or multi-simulation
27. Partially Ordered Synchronization
(Diagram: peer synchronization vs. partially ordered synchronization; nodes exchange UPDATE and WAIT messages, and in the partially ordered scheme one node leads as master)
- Peer synchronization is used in the base simulator
- It synchronizes only before a radio read; no order is enforced
- Partially ordered synchronization is used for the evaluation of coordinated break conditions in distributed simulation
- One node acts as the master
- Other nodes always follow the master, i.e. clock_i < clock_master
28. Ensemble Synchronization
- Clock synchronization:
- Execution rates of the simulators should be proportional to those of the real devices
- Lock-step method: synchronize clocks on each serial byte transfer period
- Serial transfer rate: 57.6 Kbits/second (128 Mote cycles)
- Ensemble simulation requires clock synchronization to the slowest simulation thread
- The Stargate simulator is the bottleneck (it is the most complex)
- Communication:
- Packets are assembled using the receiver's local clock
- Packet rate: 19.2 Kbits/second
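The 128-cycle figure is consistent with one serial bit time, assuming the standard Mica-family mote clock of 7.3728 MHz (an assumption; the clock frequency is not stated on the slide):

```python
# Arithmetic check of the 128-cycle synchronization period, assuming
# the ATmega128 mote clock of 7.3728 MHz (Mica2-class hardware):
# one bit at 57.6 Kbit/s lasts 7,372,800 / 57,600 = 128 mote cycles.

MOTE_CLOCK_HZ = 7_372_800   # assumed mote clock frequency
SERIAL_BPS = 57_600         # serial transfer rate from the slide

cycles_per_bit = MOTE_CLOCK_HZ // SERIAL_BPS
```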
29. Debugging over Simulation
- Sensor network research requires substantial engineering, investment, and a steep learning curve
- Configuring and installing network devices is a hassle
- Many bugs are not detected until run-time
- The hardware lacks a user interface; debugging requires hardware modification
- Analyzing erroneous behavior is not easy
- It is hard to distinguish hardware faults from software bugs
- Simulation has significant advantages:
- It provides a controlled environment
- It is a cost-effective solution
- But it is not the same as real-life execution
30. S2DB Building Blocks
- S2DB spreads simulation across a computation cluster for scalability
- Separate operating system threads, possibly on different machines
- Clock synchronization only on a network read from another node