Title: A Robust Protocol for Concurrent On-Line Test (COLT) of NoC-based Systems-on-a-Chip
1A Robust Protocol for Concurrent On-Line Test
(COLT) of NoC-based Systems-on-a-Chip
- Design Automation Conference (DAC)
- June 7, 2007
Praveen S. Bhojwani and Rabi N. Mahapatra
Texas A&M University, College Station
2Agenda
- Introduction
- Concurrent On-Line Test (COLT) basics
- Multi TI-IP Configuration
- Robust TI-IP Operation
- Experimental Setup
- Results
- Summary
- Future Work
3Introduction
- Increasing complexity in SoC designs
- Reliability challenges due to lowering lifetimes
- Due to electro-migration, stress migration, time-dependent dielectric breakdown and thermal cycling
- Post SoC deployment: run-time confidence in correct operation of the SoC
- Precursor to recovery invocation!
4Possible Options
- Option 1: Use a parallel stream of execution for comparison
- Option 2: Turn off executing applications, put SoC into test mode
- Manage test costs
- Test power (up to 1.5x normal mode power)
- Test time (increases with complexity)
5Possible Options
- Option 3: Run tests while still executing applications → concurrent testing
- Impact on application needs to be managed
- Along with other test costs
6Networks-on-Chip (NoC)
- Emergence of Networks-on-Chip (NoC) to address scalability issues with buses for complex multi-core designs
- Power efficient
- Reduced contention
- Benefits of interconnection networks!
- Possibility of reuse as a Test Access Mechanism (TAM)
- Prevents the need for a design-specific TAM
7On-Line Testing in Research
- Manufacturers insert Infrastructure-IPs (I-IPs) into SoCs to improve yield and to provide test support within designs [Zorian, DT02]
- Reuse I-IPs to perform the on-line SoC test
- Detect idle periods of execution and test all SoC components using I-IPs [Marzone, IOLTS05]
- Non-concurrent on-line testing → turn off applications
- But detecting the idle periods is a challenge in itself!
8COLT with a TI-IP
- Concurrent On-Line Testing (COLT) using a Test Infrastructure-IP (TI-IP)
- Test in the presence of executing applications
- No need to turn off applications
- Deployed a TI-IP in a NoC-based design to manage COLT
- Identified challenges to COLT
P. Bhojwani and R. Mahapatra, "An Infrastructure-IP for on-line testing of network-on-chip based SoCs," Proc. ISQED 2007.
9Conceptual NoC-enabled SoC with TI-IP
10Simplified Example
[Figure: animated example of a TI-IP operating in a NoC. Stages shown: Test Request Stage, System Snapshot Determination, Test Acceptance Stage, Test Delivery & Application Stage. Message labels: TR, SSReq, SSResp, TV. Nodes: IP, TI-IP, C.]
C = Core Network Interface
11Motivation
- Explore multi TI-IP operation
- Test cases exposed some possible hazards
- Need for a robust operation specification
- Formalize operation
- Specify Protocol
- Identify Hazards
- Provide mitigation support
12Multi TI-IP Configuration
- For scalability, deploy multiple TI-IPs into the NoC
- Test Vector Delivery cost management
13Multi TI-IP Operation
- Reduce energy consumption for TV delivery
- Test of TI-IPs
- Challenge
- Co-operative operation to manage COLT
- Manage costs
- Token ring of TI-IPs
- TI-IP with the token is the active TI-IP in the system
14Multi TI-IP Operation
Only the TI-IP holding the Test Management Token can accept, schedule, and execute test requests
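The token rule above can be sketched as a small model. This is an illustrative sketch only: the class names, method names, and the round-robin hand-off order are assumptions, not the TI-IP implementation described in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class TiIp:
    """Hypothetical model of one TI-IP in the token ring."""
    name: str
    has_token: bool = False
    accepted: list = field(default_factory=list)

    def handle_test_request(self, core_id: str) -> bool:
        # Only the token holder may accept a test request.
        if not self.has_token:
            return False
        self.accepted.append(core_id)
        return True


class TokenRing:
    """Passes a single Test Management Token around a ring of TI-IPs."""

    def __init__(self, names):
        self.nodes = [TiIp(n) for n in names]
        self.holder = 0
        self.nodes[0].has_token = True

    def active(self) -> TiIp:
        return self.nodes[self.holder]

    def pass_token(self) -> TiIp:
        # Hand the token to the next TI-IP in ring order (wraps around).
        self.nodes[self.holder].has_token = False
        self.holder = (self.holder + 1) % len(self.nodes)
        self.nodes[self.holder].has_token = True
        return self.active()


ring = TokenRing(["TI-IP0", "TI-IP1", "TI-IP2"])
print(ring.active().name)                          # TI-IP0
print(ring.nodes[1].handle_test_request("core5"))  # False: no token yet
ring.pass_token()
print(ring.nodes[1].handle_test_request("core5"))  # True: now the active TI-IP
```

At any time exactly one TI-IP holds the token, so test management decisions are never made by two TI-IPs at once.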
15Protocol Specification
- Identify communicating partners
- TI-IP
- CNI
- IP core under test
- Identify messages between these partners
- Commands for TI-IP operation
- Test Request, Test Response, System Snapshot Determination, System Snapshot Response, Test Vector Data, Test Throttle, CNI Alarm, ...
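One way to make the message set concrete is an enumeration plus a small message record. The message names come from the list above; the field layout, the node identifiers, and the direction of the example message are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class MsgType(Enum):
    # Names follow the message list on the slide; the encoding is assumed.
    TEST_REQUEST = auto()
    TEST_RESPONSE = auto()
    SYSTEM_SNAPSHOT_DETERMINATION = auto()
    SYSTEM_SNAPSHOT_RESPONSE = auto()
    TEST_VECTOR_DATA = auto()
    TEST_THROTTLE = auto()
    CNI_ALARM = auto()


@dataclass(frozen=True)
class Message:
    kind: MsgType
    src: str            # hypothetical node identifier, e.g. "CNI3"
    dst: str            # hypothetical node identifier, e.g. "TI-IP0"
    payload: bytes = b""


# Assumed direction: the CNI of the core to be tested asks a TI-IP for a test.
req = Message(MsgType.TEST_REQUEST, src="CNI3", dst="TI-IP0")
print(req.kind.name, req.src, "->", req.dst)
```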
16Test Request Protocol Step
CNI attached to IP Core to be tested
17System Snapshot Collection Step
System Snapshot Requests sent to all CNIs in
the NoC
18Test Vector Delivery
CNIs attached to cores-under-test
Single TI-IP setup
19Test Throttle Step
Test Throttle Request only sent to necessary CNIs
of IP cores under test
20Test Management Token Step
3 TI-IP setup
21Protocol Hazards
- Identify design aspects that may affect COLT operation
- Done for each protocol step
- Identified 5 types of hazards
- Starvation due to Test/Application
- Test Input Queue Buffer Overflow
- TI-IP Failure
- Test Wrapper Buffer Overflow
- CNI Failure
22Protocol Hazards
- Starvation
- Application and test traffic can each prevent the other from communicating
- Test Input Queue Buffer Overflow
- Excessive test traffic can lead to buffer overflow at the TI-IP input
- TI-IP Failure
- CNI Failure
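The starvation hazard can be illustrated with a toy arbiter that shares one output port between application and test packets. This is a sketch under simplifying assumptions (one packet per cycle, two queues, hypothetical function names); it is not the arbitration used in the simulated NoC.

```python
from collections import deque


def strict_priority(app: deque, test: deque, cycles: int) -> list:
    """Always serve application traffic first; test traffic can starve."""
    out = []
    for _ in range(cycles):
        if app:
            out.append(app.popleft())
        elif test:
            out.append(test.popleft())
    return out


def interleaved(app: deque, test: deque, cycles: int) -> list:
    """Alternate between the two queues whenever both have packets waiting."""
    out = []
    serve_test = False
    for _ in range(cycles):
        if test and (serve_test or not app):
            out.append(test.popleft())
            serve_test = False
        elif app:
            out.append(app.popleft())
            serve_test = True
    return out


def queues():
    # Six application packets and two test packets waiting at the same port.
    return deque(f"A{i}" for i in range(6)), deque(f"T{i}" for i in range(2))


print(strict_priority(*queues(), cycles=4))  # ['A0', 'A1', 'A2', 'A3']
print(interleaved(*queues(), cycles=4))      # ['A0', 'T0', 'A1', 'T1']
```

With strict priority no test packet leaves the port while application packets keep arriving; interleaving lets both classes make progress.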
23Protocol Hazards
- Test Wrapper Buffer Overflow
- Can lead to test data loss
24Hazard Summary
25Hazard Mitigation
- Mitigate the identified hazards for each protocol step
- Mitigation techniques used
- Timeouts
- Priority inversions
- Communication interleaving
- Automatic Retransmission
- Some failures cannot be recovered from!
- Details in the paper
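As an illustration of two of the listed techniques, timeouts and automatic retransmission, the sketch below retries a packet a bounded number of times and reports failure when the retry budget is exhausted. The link model, the retry limit, and all names are hypothetical; the actual mechanisms are described in the paper.

```python
class LossyLink:
    """Hypothetical link that drops the first `drops` transmissions."""

    def __init__(self, drops: int):
        self.drops = drops
        self.sent = 0

    def send(self, packet: str) -> bool:
        # Returns True if an acknowledgement arrives before the timeout.
        self.sent += 1
        if self.drops > 0:
            self.drops -= 1
            return False  # timeout expired without an acknowledgement
        return True


def send_with_retransmit(link: LossyLink, packet: str, max_retries: int = 3) -> bool:
    """Send once, then retransmit after each timeout, up to max_retries times."""
    for _attempt in range(1 + max_retries):
        if link.send(packet):
            return True
    # Retry budget exhausted: this failure cannot be recovered from here.
    return False


ok_link = LossyLink(drops=2)
print(send_with_retransmit(ok_link, "TV#0"), ok_link.sent)      # True 3
dead_link = LossyLink(drops=10)
print(send_with_retransmit(dead_link, "TV#0"), dead_link.sent)  # False 4
```

The second case mirrors the point that some failures cannot be recovered from: the sender gives up and must escalate rather than retry forever.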
26Experimental Setup
- Simulation platform
- NoCSim: NoC interconnection network simulator
- 4x4 2D folded torus
- Application Benchmarks
- Embedded System Synthesis Benchmark Suite (E3S)
- Task graphs from 5 application domains
- Test Benchmarks
- ITC'02 SoC Test Benchmarks
- g1063.soc and d695.soc test cases used
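For readers unfamiliar with the topology: a folded torus has the same logical connectivity as an ordinary 2D torus (folding only changes the physical layout so that wire lengths are more even). A minimal sketch of the neighbor relation on the 4x4 network follows; the coordinate scheme and function name are assumptions, not NoCSim's API.

```python
def torus_neighbors(x: int, y: int, n: int = 4) -> dict:
    """Logical neighbors of tile (x, y) in an n x n 2D torus with wrap-around."""
    return {
        "east": ((x + 1) % n, y),
        "west": ((x - 1) % n, y),
        "south": (x, (y + 1) % n),
        "north": (x, (y - 1) % n),
    }


# A corner tile still has four neighbors because the edges wrap around.
print(torus_neighbors(0, 0))
```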
27Experimental Setup
- Only considering SCAN for now due to test data availability
- Only testing IP cores (for now)
- Assign 1 TI-IP to an SoC and let it occupy a whole NoC tile
- For a 22mm x 22mm chip laid out as a 4x4 2D torus, each tile could be 5mm x 5mm [Towles, DAC01]
- 5.2% area overhead
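The overhead figure follows from the tile and chip dimensions quoted above, assuming the TI-IP occupies exactly one tile. A quick check:

```python
chip_side_mm = 22.0  # chip is 22mm x 22mm
tile_side_mm = 5.0   # one NoC tile is 5mm x 5mm

# One tile dedicated to the TI-IP, as a fraction of total chip area.
overhead = tile_side_mm ** 2 / chip_side_mm ** 2
print(f"{overhead:.1%}")  # 5.2%
```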
28Test Configurations
Each network tile configured for an IBM 405GP
(area constraints)
Task graph assignment done by hand.
29Test Configurations
- Multi TI-IP test configurations tested and operation verified
- Protocol hazard scenarios developed
30Results
Normal on-line test mode vs. starvation hazard test mode energy profile
[Figure: energy (J) vs. time, y-axis from 2.50E-09 to 6.00E-09 J. Series: normal on-line test mode; mitigating starvation impact on test.]
31Results
32Summary
- Lowering design lifetimes
- Concurrent On-Line Testing techniques needed
- Use of TI-IPs to provide COLT support
- Multi TI-IP configuration to manage test vector delivery cost
- Robust TI-IP operation essential
- Protocol Specification
- Hazard Identification
- Hazard Mitigation
33Summary
- Verified TI-IP and multi TI-IP operation using a combination of academic benchmarks
- Developed test cases to generate hazard scenarios
- Verified robust TI-IP operation under these scenarios
34Future Work
- Explore test triggering options
- Event based triggering
- On-line testing of NoC components
- Routers
- Links
- CNI