Software Defined Acoustic Modems

1
Software Defined Acoustic Modems
  • Ryan Kastner
  • Department of Electrical and Computer
    Engineering
  • University of California, Santa Barbara
  • CENG Seminar
  • USC
  • March 22, 2005

2
Outline
  • Underwater Wireless Communication
  • AquaNode
  • Software Defined Acoustic Modems
  • Application Mapping Design Flow
  • Application Specification
  • Reconfigurable Device Architecture - BLOBs
  • Matching Pursuits Core
  • Optimizations: data representation and
    distribution
  • Parameterizable: number of symbols, samples,
    paths
  • Design tradeoffs: area, latency, energy, ...

3
Ecological Research Programs
  • Santa Barbara Channel Long Term Ecological
    Research (SBC LTER)
  • Goals
  • Focuses on understanding the nearshore ecosystems
    of the west coast
  • Time/space variation of individual organisms,
    populations, and ecological communities
  • Moorea LTER
  • Goals
  • Understanding processes in coral reefs, lagoons
    and forereefs in French Polynesia
  • Nature of animal and plant community structure
    and diversity
  • Responses to environmental change induced either
    by human activities or natural cycles

4
Monitoring in Moorea
  • Establish monitoring sites in lagoons and on fore
    reefs surrounding Moorea
  • Response variables measured
  • Weather
  • Tides, Currents and Flows
  • Ocean Temperature and Color
  • Salinity, Turbidity and pH
  • Nutrients
  • Recruitment and Settlement
  • Size and Age Structure
  • Species Abundance
  • Community Diversity

5
Monitoring in the Santa Barbara Channel (SBC)
Stearns Wharf
6
Typical Instrumentation: SBC Mooring
7
Existing SBC Moorings
8
Monitoring Realities
Repetitive data collection is expensive and
requires massive numbers of person-hours
  • 1/3 of the Santa Barbara Coastal LTER budget is
    allocated to collection and management of
    monitoring data
  • An estimated 10,000 scientist-hours are needed to
    conduct a large-scale monitoring program of a
    tropical forest
  • Need for an autonomous, adaptive wireless acoustic
    underwater network for eco-surveillance and
    adaptive sampling

9
Scenario for WetNet for Eco-Surveillance
  • Deploy an ad hoc wireless (acoustic) network in
    the lagoon
  • Network consists of AquaNodes with Conductivity,
    Temperature, Depth (CTD) sensors
  • Ad hoc network allows AquaNodes to relay data to
    a dockside collector
  • AquaNode requirements
  • Low cost, low power wireless modems
  • Integral router
  • Integral CTD sensor suite
  • Additional nitrate, oxygen chemical sensors
  • Real-time data from Moorea available on Web

10
Aquanode
Float
Cabled transducer
Software Defined Acoustic Modem
Conductivity/Temperature/ Depth Sensor
Battery
11
WetNet using Aquanodes
CTD, currents, nutrient data to Internet.
Adaptive sampling commands to AquaNodes.
Wi-Fi or Wi-Max link
Dockside acoustic/RF comms and signal processing.
Cabled hydrophone array
Dock
AquaNodes with acoustic modems/routers, sensors
12
Underwater Acoustic Channel
  • Severe multipath - 1 to 10 msec for shallow water
    at up to 1 km range
  • Doppler Shifts
  • Long latencies: speed of sound underwater is
    approx. 1500 m/sec
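
To put the latency bullet in numbers, here is a minimal sketch that converts range to one-way propagation delay. It assumes a constant sound speed of 1500 m/sec and a straight-line path; the function name is illustrative and not from the slides.

```python
SOUND_SPEED_M_PER_S = 1500.0  # approximate value quoted on the slide


def one_way_delay_s(range_m, speed_m_per_s=SOUND_SPEED_M_PER_S):
    """One-way propagation delay over a straight-line path, in seconds."""
    return range_m / speed_m_per_s


for range_m in (100, 1000):
    delay_ms = one_way_delay_s(range_m) * 1e3
    print(f"{range_m:5d} m -> {delay_ms:.1f} ms one way")
```

At the 1 km range mentioned above, the one-way delay is roughly two thirds of a second, far longer than the 1 to 10 msec multipath spread.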

Dock
AquaNodes with acoustic modems/routers, sensors.
13
Hardware Platform
  • Hardware is wirelessly updatable: no need to
    retrieve equipment to update hardware for
    changing communication protocols, sampling,
    and sensing strategies
  • Software Defined Acoustic Modems: reconfigurable
    hardware is known to provide flexible, high
    performance implementations for DSP applications

Sensor
Software Defined Acoustic Modem
Transducer
14
Software Defined Acoustic Modem
  • Ideal: one piece of hardware for all sensor
    nodes
  • Sensor Interface
  • Must develop common interface with different
    sensors (CTD, chemical, optical, etc.) and
    communication elements (transducer)
  • Wide (constantly changing) variety of sensors,
    sampling strategies
  • Communication Interface
  • Amplifiers, Transducers
  • Signal modulation

Transducer
CTD Sensor
Reconfigurable Hardware Platform
15
Acoustic Modem Requirements
  • Complex, computationally intensive communication
    protocol
  • Limited energy
  • Fast design tools to aid mapping of the
    communication protocols into hardware

Transducer
Communication Protocol
CTD Sensor
Reconfigurable Hardware Platform
Mapping
16
Design Considerations for SDAM
  • Multipath spread: range of 1 to 10 milliseconds
    for shallow water at up to 1 km range
  • Larger bandwidths reduce frequency dependent
    multipaths
  • Transducers
  • Size/weight/cost proportional to wavelength
  • Acceptable propagation losses at 100 meter
    ranges
  • Waveform
  • M-FSK signaling
  • Datasonics/Benthos modems (used in Seaweb,
    FRONT)
  • Narrowband thus sensitive to frequency-selective
    fading.
  • Using more tones increases sensitivity to
    Doppler spread.
  • Proposed: Walsh/m-sequence signaling
    (direct-sequence)
  • Provides frequency diversity due to wide
    bandwidth
  • Can be detected noncoherently

17
Walsh/m-Sequence Waveforms
Chip rate is 5 kcps, approx. 5 kHz bandwidth.
Uses a 25 kHz carrier. Use a 7-chip m-sequence c per
Walsh symbol, 8 bits per Walsh symbol b_i.
Composite symbol duration is thus T = 11.2 msec
(7 x 8 = 56 chips at 5 kcps), which is longer than
the maximum multipath spread.
Data rate is 266 bps, or 133 bps using an 11.2
msec time guard band for channel clearing.
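
The timing figures on this slide can be checked with a few lines of arithmetic. This sketch assumes that the "8 bits per Walsh symbol" are the 8 binary entries of a length-8 Walsh codeword, so that each symbol carries log2(8) = 3 information bits; that reading, and all the names below, are assumptions rather than something the slide states.

```python
import math

CHIP_RATE_CPS = 5000  # 5 kcps, from the slide
MSEQ_CHIPS = 7        # chips in the m-sequence
WALSH_LENGTH = 8      # entries per Walsh symbol (assumed codeword length)


def symbol_duration_s():
    """Composite symbol duration: 7 x 8 chips at the chip rate."""
    return MSEQ_CHIPS * WALSH_LENGTH / CHIP_RATE_CPS


def data_rate_bps(guard_s=0.0):
    """Information rate, assuming log2(WALSH_LENGTH) bits per symbol."""
    bits_per_symbol = math.log2(WALSH_LENGTH)
    return bits_per_symbol / (symbol_duration_s() + guard_s)


print(f"T = {symbol_duration_s() * 1e3:.1f} ms")
print(f"rate without guard band: {data_rate_bps():.0f} bps")
print(f"rate with 11.2 ms guard: {data_rate_bps(0.0112):.0f} bps")
```

Under that assumption the computed rates land within a couple of bps of the 266 and 133 bps quoted above; the small difference is presumably rounding on the slide.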
18
Transmitted Signal
[Figure: transmitted chip sequence, shown in 7-chip groups]
1 1 -1 1 -1 -1 -1 | 1 1 -1 1 -1 -1 -1 | -1 -1 1 -1 1 1 1
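
The chip values listed on this slide are consistent with a 7-chip m-sequence multiplied by successive Walsh entries (+1, +1, -1). The sketch below rebuilds that pattern; the Sylvester ordering of the Walsh rows and the choice of row 2 are assumptions made only so the output matches the listed chips.

```python
M_SEQUENCE = [1, 1, -1, 1, -1, -1, -1]  # 7-chip sequence read off the slide


def hadamard(n):
    """Walsh-Hadamard matrix by Sylvester construction (n a power of two)."""
    rows = [[1]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def spread(walsh_row, mseq=M_SEQUENCE):
    """Multiply every Walsh entry by the whole m-sequence."""
    return [b * c for b in walsh_row for c in mseq]


def periodic_autocorr(seq, lag):
    """Periodic autocorrelation of a +/-1 sequence at the given lag."""
    n = len(seq)
    return sum(seq[i] * seq[(i + lag) % n] for i in range(n))


chips = spread(hadamard(8)[2])  # row 2 is + + - - + + - -
print(len(chips), chips[:21])
print([periodic_autocorr(M_SEQUENCE, k) for k in range(7)])
```

The autocorrelation is 7 at lag 0 and -1 at every other lag, which is the property that makes m-sequences useful for resolving individual multipath arrivals.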
19
Walsh/m-sequence Signal Parameters
[Figure: signal parameters annotated on the chip sequence
1 1 -1 1 -1 -1 -1 | 1 1 -1 1 -1 -1 -1 | -1 -1 1 -1 1 1 1]
20
UWA Walsh/m-sequence GMHT-MP Modem
Generalized multiple hypothesis test (GMHT)
21
Acoustic Modem Performance
  • True multipath intensity profile (MIP)
  • Nf paths assumed by MP estimation
  • N? Number of paths present

MP identifies major paths using one symbol of
information
22
Acoustic Modem Performance
  • Comparison of rake receiver and matching pursuits
  • Symbol Error Rate (SER)
  • Nf paths assumed by MP estimation
  • N? Number of paths present

23
UWA Walsh/m-sequence GMHT-MP Modem
Modem is accurate. How do we implement it?
24
System Design Tools
  • Goal: map application specification to system
    architecture
  • Subject to ever-increasing design constraints:
    lower energy, smaller, faster, etc.

System Design
Communication Protocol: Walsh/m-sequence GMHT-MP
Acoustic Modem
Reconfigurable System
25
System Design and Architecture
  • Problem: take application code and map it to
    some system platform (e.g. reconfigurable
    device)
  • System platforms are extremely (and increasingly)
    complicated, multiprocessing computing systems
  • Mix of hardware and software components
  • Microprocessors: RISC, DSP, network, ...
  • Logic level (FPGA): reconfigurable logic
  • Specs for Xilinx Virtex II
  • 3K to 125K logic cells,
  • Four PowerPC processor cores
  • Complex memory hierarchy - 1,738 KB block RAM,
    external memory, local memory in CLBs
  • Possibility of soft core processors DSP
  • Custom hardware - embedded multipliers, fast
    carry chain logic, etc.
  • Large amount of performance improvement possible,
    IF there is a good mapping

How do we best represent the application for
mapping?
26
Obligatory Design Flow Slide
Syntactic/Semantic Analysis
Device Architecture Description
AST
Specification Language
SUIF
Application Specific Optimizations
Application Behaviors
AST
Reconfigurable System Compiler
Machine SUIF
Function Level SSA CFG Generation
Profiling
PDGSSA Generation
SSA CFG
SSA CFG
µProc Backend
Task Level Optimizations
SSA CFG
µProc Binary
Instruction Level Optimizations
logic bitstream
Logic and Physical Synthesis
Platform Programming Software
Synthesizable HDL
Functional Reconfigurable System
Hardware Description Language (HDL) backend
Commercial Tools
27
Reconfigurable Device Architecture
  • Modeling a Reconfigurable Device as an array of
    BRAM-Level operation blocks (BLOBs)
  • BLOB
  • A multiplier
  • A BRAM
  • Adjacent CLBs
  • Adjacent interconnects

28
Application Specification
  • Can be written in C, SystemC, SystemVerilog,
    linear systems, signal flow graph, CDFGs
  • Must have front end to task graphs
  • Focusing first on a C to task graph

[Figure: example specifications as a signal flow graph,
C code, and linear systems]
29
Application Specific Optimizations
  • Data Representation
  • Number of bits
  • Representation: fixed, floating, ...
  • Linear System Optimizations
  • Convert constant multiplications to shifts and
    adds
  • Minimize number of operations, latency, area,
    etc.
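
As one illustration of turning a constant multiplication into shifts and adds, the sketch below recodes the constant in canonical signed-digit (CSD) form and then multiplies using only shifts, additions, and subtractions. CSD is a standard technique chosen here for illustration; the slides do not say which recoding the design flow uses, and the function names are hypothetical.

```python
def csd_digits(c):
    """Canonical signed-digit form of a positive integer, LSB first.

    Each digit is -1, 0, or +1 and no two adjacent digits are nonzero.
    """
    if c <= 0:
        raise ValueError("constant must be positive")
    digits = []
    while c:
        if c & 1:
            d = 2 - (c & 3)  # +1 when c % 4 == 1, -1 when c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Compute x * c using only shifts, adds, and subtracts."""
    total = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            total += x << shift
        elif d == -1:
            total -= x << shift
    return total


print(csd_digits(7))              # 7 = 8 - 1, so two nonzero digits
print(shift_add_multiply(13, 7))
```

Multiplying by 7 this way costs one shift and one subtraction instead of the two additions that the plain binary form (4 + 2 + 1) would need.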

Transposed form of FIR filters
Replacing constant multiplications
by a multiplier block
30
Intermediate Representation
  • Must exploit fine AND coarse-grain parallelism
  • Ideally want automatic mapping
  • Need a form that can do synthesis to both
    hardware/software

val = pred; for (i = 0; i < ... if (val > 32767) val = 32767; else
if (val < -32768) val = -32768;
31
PDGSSA Representation
CDFG Form
Input Application
val = pred; for (i = 0; i < ... if (val > 32767) val = 32767; else
if (val < -32768) val = -32768;
32
PDGSSA Representation
CDFG Form
PDGSSA Form
33
Advantages of PDGSSA
  • Exploits parallelism
  • Explicitly shows control and data dependences
  • Control structures do not limit data parallelism
  • Regions are hyperblocks: allows aggressive
    optimizations
  • Synthesis to hardware and software
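
The claim that control structures need not limit data parallelism can be illustrated with if-conversion on the 16-bit saturation fragment shown on the earlier slides. This is a sketch of the general idea only, not of the PDGSSA form itself; `select` is a hypothetical stand-in for a hardware multiplexer.

```python
INT16_MAX, INT16_MIN = 32767, -32768


def saturate_branching(val):
    """Control-flow version: the second test waits on the first branch."""
    if val > INT16_MAX:
        return INT16_MAX
    elif val < INT16_MIN:
        return INT16_MIN
    return val


def select(pred, if_true, if_false):
    """Stand-in for a 2-to-1 multiplexer."""
    return if_true if pred else if_false


def saturate_predicated(val):
    """If-converted version: both predicates depend only on val, so a
    hardware mapping could evaluate them in parallel and then select."""
    p_hi = val > INT16_MAX
    p_lo = val < INT16_MIN
    return select(p_hi, INT16_MAX, select(p_lo, INT16_MIN, val))


for v in (-40000, -1, 0, 40000):
    print(v, saturate_branching(v), saturate_predicated(v))
```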

34
Comparing CDFG, PDG
  • Benchmarks: a set of MediaBench functions
  • PDG, CDFG 2-3 times faster than sequential
    execution
  • PDG about 7% faster than CDFG
  • PDG, CDFG approx. same area

35
Comparing Different Predicated Forms
  • Comparison with PSSA, sequential execution
  • PSSA - predicated static single assignment
  • Used by several other projects: CASH, Sea
    Cucumber
  • PDGSSA on average 8% faster than PSSA

36
Map Application to HW/SW Cores
  • Dependence analysis to exploit fine/coarse grain
    parallelism
  • Interprocedural dependencies: selective
    inlining
  • Control dependencies: loop optimizations,
    hoisting, if-conversion
  • Data dependencies: arrays, aliasing, liveness
  • System partitioning
  • Cluster into coarser grained tasks
  • Decide how to divide application onto platform

37
System Partitioning
  • How do you decide where to map different parts of
    the application?
  • Hardware or software? Which processor, which
    memory, exact location, etc.
  • Extremely hard set of problems (NP-Hard)
  • Must be flexible - different applications/systems
    have wide variety of models
  • Fundamental problem - many different heuristic
    methods have been developed
  • Simulated annealing
  • Genetic Algorithms
  • Tabu Search
  • Kernighan/Lin
  • Ant Colony Optimization
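
To make one of the listed heuristics concrete, here is a minimal simulated-annealing sketch for a toy hardware/software partitioning problem. The task timings, areas, area budget, penalty weight, and cooling schedule are all made-up illustration values, not data from the talk.

```python
import math
import random

# (software time, hardware time, hardware area) per task: made-up numbers
TASKS = [(10, 2, 4), (8, 3, 3), (6, 1, 5), (4, 2, 2), (9, 4, 6)]
AREA_BUDGET = 10


def evaluate(assign):
    """Return (total time, total area); assign[i] is True for hardware."""
    time = sum(hw if in_hw else sw for (sw, hw, _), in_hw in zip(TASKS, assign))
    area = sum(a for (_, _, a), in_hw in zip(TASKS, assign) if in_hw)
    return time, area


def cost(assign):
    """Total time plus a heavy penalty for exceeding the area budget."""
    time, area = evaluate(assign)
    return time + 100 * max(0, area - AREA_BUDGET)


def anneal(steps=2000, t0=10.0, alpha=0.995, seed=0):
    rng = random.Random(seed)
    current = [False] * len(TASKS)  # start with everything in software
    best = list(current)
    temp = t0
    for _ in range(steps):
        cand = list(current)
        i = rng.randrange(len(TASKS))
        cand[i] = not cand[i]       # move one task across the boundary
        delta = cost(cand) - cost(current)
        # always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
            if cost(current) < cost(best):
                best = list(current)
        temp *= alpha
    return best


best = anneal()
print(best, evaluate(best))
```

The penalty term lets the search pass through over-budget partitions on the way to better feasible ones, which is the usual reason to prefer annealing over a purely greedy move rule.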

38
Code Generation
  • Once task graph is partitioned
  • Generate code for each task
  • Create communication protocols
  • Data transfer between BRAMs in BLOBs
  • Memory hierarchy: local registers (in CLBs),
    local BRAM, remote BRAM, off chip memory
  • Need code generation from every input
    specification to every computational core
  • Software: use conventional compiler flow
  • Reconfigurable hardware: need flow from task to
    RTL HDL

39
Data Partitioning and Storage Assignment
  • Mapping high-level programs into FPGA-based
    reconfigurable computing architectures with
    distributed block RAM modules
  • Objective: improve utilization of available
    storage resources, optimize system performance,
    and meet design goals, including area,
    latency, and throughput

40
Data Partitioning Challenges
  • Reconfigurable systems are different from
    multiple processor parallel systems
  • Different architecture: multiple processors vs.
    CLBs
  • Different program execution: sequential programs
    with ILP vs. fully parallelized, concurrent
    execution
  • Challenges
  • Indistinct boundaries between local and remote
    memory accesses
  • Data partitioning and storage assignment have more
    compound effects on system performance
  • Flexible memory port configurations

41
Additional Optimizations: Port Configurations
  • Different port configurations support different
    memory bandwidths, but require address generators
    with different complexities
  • Example: single-port 8-bit block RAM
    module vs. dual-port 32-bit block RAM modules
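
The bandwidth side of that tradeoff is simple arithmetic. This sketch assumes each port delivers one word per clock cycle and uses a block of 88 eight-bit samples as an illustrative workload; neither assumption comes from the slide.

```python
import math


def cycles_to_read(total_bits, ports, width_bits):
    """Clock cycles to stream total_bits, one word per port per cycle."""
    return math.ceil(total_bits / (ports * width_bits))


block_bits = 88 * 8  # illustrative: 88 samples of 8 bits each
print("single-port, 8-bit :", cycles_to_read(block_bits, 1, 8), "cycles")
print("dual-port, 32-bit  :", cycles_to_read(block_bits, 2, 32), "cycles")
```

The wider dual-port configuration reads the block in one eighth of the cycles, at the price of the more complex address generation the slide mentions.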

42
Additional Optimizations: Buffer Insertion
  • Similar to software prefetching in
    microprocessors, but implemented by inserting
    buffers
  • Reduce the delay of critical paths, and improve
    clock frequencies

43
Matching Pursuits Core
  • Goal: map matching pursuits to reconfigurable
    device
  • Parameterizable: number of samples, data
    representation
  • Tradeoffs: provides designs with various area,
    latency, energy, ...

System Design Tools
Matching Pursuits Algorithm
Reconfigurable System
44
Matching Pursuit Algorithm
Additional area cost
Check for Nf times
45
Matching Pursuit Algorithm
  • Accurate and low complexity approximation to the
    Maximum Likelihood (ML) solution
  • The redesigned MP is Nf times faster than the
    original MP. Both MPs are faster than ML.
  • Insignificant increase in memory requirement

ASE = 0.003
46
Data Representation
  • Floating point representation
  • Large dynamic range and high precision
  • Costly
  • Fixed point representation
  • Requires fewer resources
  • Comparable performance
  • Bitwidth analysis for trading off estimation
    accuracy and the number of fixed-point bits
  • 8 bits is sufficient
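
A bitwidth analysis of this kind starts from a quantizer. The sketch below rounds a real value to signed 8-bit fixed point with saturation and measures the round-trip error; the split into 6 fractional bits is an assumption for illustration, since the slide gives only the total width.

```python
def to_fixed(x, total_bits=8, frac_bits=6):
    """Quantize x to signed two's-complement fixed point with saturation."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(x * scale)))


def to_float(q, frac_bits=6):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


for x in (0.3, -1.25, 1.9, 5.0):
    q = to_fixed(x)
    print(f"{x:6.2f} -> {q:4d} -> {to_float(q):8.5f}")
```

Inside the representable range the error stays within half a least-significant bit (1/128 here); values outside it clip to the nearest limit, which is why range analysis accompanies the choice of bit count.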

47
MP Core Parameterization
  • Parameterization of the MP core
  • Data representation
  • Data distribution
  • Tradeoff system parameters
  • Parameters in MP
  • M: the number of training symbols (M = 1)
  • Nf: the number of sparse channels (Nf = 15)
  • Ns: the number of samples per symbol (Ns = 88)
  • The amount of hardware resources
  • How much parallelism can be supported?
  • Xilinx Virtex-II XC2V3000
  • 1728 KBits (96 BRAMs, 18 KBits/BRAM)
  • 96 embedded multipliers (18x18 Bits)
  • 3584 CLBs

48
Data Distribution
  • Calculating matched filter outputs through
    matrix-vector multiplications
  • A bank of correlators multiplies each sample of
    the received vector r with the corresponding
    sample of a column in an S matrix

Global scheme Local scheme
Distribute matrix by row
Distribute matrix by column
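
The two distributions can be sketched for the matched-filter product, where output j is the correlation of column j of S with r. Assigning rows or columns to blocks round-robin is an illustrative choice, and the function names are hypothetical.

```python
def matvec_by_column(S, r, blocks):
    """Local scheme: each block owns whole columns of S, so every
    correlator output is finished inside a single block."""
    n_cols = len(S[0])
    out = [0] * n_cols
    for b in range(blocks):
        for j in range(b, n_cols, blocks):  # columns owned by block b
            out[j] = sum(S[i][j] * r[i] for i in range(len(r)))
    return out


def matvec_by_row(S, r, blocks):
    """Global scheme: each block owns a slice of rows and produces only
    partial sums, which must then be combined across all blocks."""
    n_cols = len(S[0])
    partials = []
    for b in range(blocks):
        rows = range(b, len(r), blocks)     # rows owned by block b
        partials.append([sum(S[i][j] * r[i] for i in rows)
                         for j in range(n_cols)])
    # global reduction step: communication between blocks
    return [sum(p[j] for p in partials) for j in range(n_cols)]


S = [[1, 2], [3, 4], [5, 6]]
r = [1, 0, -1]
print(matvec_by_column(S, r, 2), matvec_by_row(S, r, 2))
```

Both give the same outputs; the difference is that the column version needs no reduction across blocks.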
49
Experimental Results: Correlator Bank
  • A bank of correlators multiplies each sample of
    the received vector r with the corresponding
    sample of a column in an S matrix
  • It is possible to achieve a communication-free
    partition
  • A good time/resource tradeoff

50
Experimental Results: Correlator Bank
  • Compared communication-free schemes with row-wise
    partitioning with the same granularity
  • Row-based partitioning results show a very
    similar time/resource tradeoff
  • 30-50% slower due to the large amount of global
    communications and global controls

51
Data Distribution Results
  • Similarities of two schemes
  • As the number of rows/columns distributed into
    the same BRAM increases
  • Execution time increases linearly
  • Area decreases exponentially
  • The local scheme outperforms the global scheme
  • Due to local communication
  • The MAC scheme runs faster than the pipelined
    adder-tree scheme before and after physical
    layout
  • The MAC scheme takes less time in terms of
    synthesis and placement and routing

52
Matching Pursuits Mapping
53
Putting It All Together
  • Parameterizable MP core
  • Running 216 times faster than a high performance
    desktop computer with a 2.17GHz AMD Athlon XP CPU

54
Conclusions
  • AquaNodes
  • Wireless communication devices for underwater
    channel
  • Software Defined Acoustic Modem (SDAM)
  • System Design Flow
  • Application Specification
  • BLOBs
  • Intermediate Representation: PDGSSA
  • Reconfigurable Hardware Synthesis Techniques
  • Matching Pursuits Core
  • Core component in SDAM demodulator
  • Parameterized: samples, symbols, paths
  • Design implementations trade off latency, area,
    power, energy

55
ExPRESS Lab
  • ExPRESS - Extensible, Programmable,
    Reconfigurable Embedded SystemS -
    http://express.ece.ucsb.edu/
  • Students
  • PhD Students: Andrew Brown, Wenrui Gong, Anup
    Hosangadi, Yan Meng, Gang Wang
  • Undergrads: Brian DeRenzi, Talayeh Saderi,
    Willis Hoang
  • Collaborators: Prof. Ronald Iltis, Prof. Hua
    Lee, Prof. Volkan Rodoplu, Prof. Timothy
    Sherwood
  • Sponsors

56
Extra Slides
57
Leveraging of Existing UCSB Oceanographic
Infrastructure
  • Partnership for Interdisciplinary Studies of
    Coastal Oceans (PISCO), http://www.piscoweb.org
  • - 19 moorings
  • - measurements: currents, temperature
  • Santa Barbara Coastal Long Term Ecological
    Research Project, http://sbc.lternet.edu/
  • - 3 moorings, Stearns Wharf instrument
    package
  • - measurements: currents, temperature,
    conductivity, fluorescence, optical
    backscatter, nitrate
  • Moorea Coral Reef LTER
  • - several moorings will be deployed beginning
    in 2005
  • - measurements: currents, temperature,
    conductivity, fluorescence, optical
    backscatter, others to be determined (TBD)
  • Southern California Coastal Ocean Observing
    System (SCCOOS), http://www.sccoos.org/
  • - at least one coastal mooring will be in the
    Santa Barbara Channel
  • - measurements: currents, temperature,
    conductivity, spectral fluorescence, optical
    backscatter, nitrate, others TBD

58
Wireless Communication in Multipath Channels
  • Ubiquitous wireless applications
  • Multipath fading poses a strong negative effect
    on wireless communication
  • Multiple paths due to scattering
  • Received signal consists of multiple delayed and
    attenuated versions of the transmitted signal
  • Signal corruption due to destructive interference
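
A received multipath signal can be modeled as a sum of delayed, scaled copies of the transmitted samples. The sketch below is noiseless and works in whole-sample delays; both simplifications, and the function name, are assumptions for illustration.

```python
def multipath(tx, paths):
    """Sum delayed, attenuated copies of tx.

    paths is a list of (delay_in_samples, gain) pairs.
    """
    max_delay = max(delay for delay, _ in paths)
    rx = [0.0] * (len(tx) + max_delay)
    for delay, gain in paths:
        for n, x in enumerate(tx):
            rx[n + delay] += gain * x
    return rx


tx = [1, -1, 1, -1]
print(multipath(tx, [(0, 1.0)]))            # single direct path
print(multipath(tx, [(0, 1.0), (1, 1.0)]))  # second path, one sample late
```

With the second path added, the middle samples cancel completely, a small instance of the destructive interference named in the last bullet.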

59
Multipath Channel Estimation
  • Recovering corrupted signals due to multipath
    propagation
  • Improving data rate and reliability
  • Enabling accurate radiolocation, MUD
  • Key technique for supporting mobility and high
    data rate processing
  • Found in both acoustic modem and radiolocation
    applications

60
Problem Formulation of Multipath Channel
Estimation
  • r = Sf + n
  • r: received signal
  • f = [f1, f2, ..., fNs]^T
  • fi: channel coefficient of path i
  • S = [S1, S2, ..., SNs]
  • Si: received signal due to path i if fi = 1
  • Si fi: received signal due to path i
  • n: additive white Gaussian noise
  • M: the number of training symbols
  • Ns: the number of samples per symbol duration
  • Sparse channel (Nf
  • Computing an estimate of f given S and r
    containing noise n
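
For the model r = Sf + n defined above, a textbook greedy matching pursuit can be sketched as follows. This is the basic real-valued form, written under the assumption that the columns Si are available as plain lists; it is not the redesigned MP or the hardware core described in the talk, and the names are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(columns, r, n_paths):
    """Greedy sparse estimate of f in r = S f + n.

    columns holds the signature vectors S_i; n_paths plays the role of
    Nf, the number of paths the estimator assumes.
    """
    f_hat = [0.0] * len(columns)
    residual = list(r)
    energy = [dot(c, c) for c in columns]
    for _ in range(n_paths):
        # choose the column that removes the most residual energy
        scores = [dot(c, residual) ** 2 / e for c, e in zip(columns, energy)]
        k = max(range(len(columns)), key=scores.__getitem__)
        coef = dot(columns[k], residual) / energy[k]
        f_hat[k] += coef
        # cancel that path's contribution before the next pass
        residual = [x - coef * c for x, c in zip(residual, columns[k])]
    return f_hat


S_columns = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
r = [1, -1, 3, -3]  # equals 2 * S_2 - 1 * S_4, with no noise
print(matching_pursuit(S_columns, r, 2))
```

The toy columns are orthogonal, so two passes recover both coefficients exactly; with correlated signatures or noise the result is only an approximation of the maximum likelihood estimate.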

61
(No Transcript)
62
Proposed Approach
  • Based on our current efforts on synthesizing C
    programs into RTL designs
  • Integrate traditional program test and
    transformation techniques in parallelizing
    compilation into our system compiler framework
  • Overview
  • Code analysis: calculate the reference
    footprints, analyze the iteration space, and
    determine directions of partitioning
  • Architectural synthesis: obtain performance
    characteristics of the iteration body
  • Data partitioning and granularity adjustment

63
Problem Formulation
64
Experimental Results: Edge Detection
  • Sobel edge detection applies horizontal and
    vertical detection masks to an input image.
    Optimization techniques are utilized
  • In the smaller design, we achieve the 150 MHz
    design goal with a 46x speedup compared to the
    original design
  • Failed to achieve the 150 MHz goal, which
    indicates it is extremely important to consider
    physical attributes of the problem at higher
    levels of the design
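
For reference, the Sobel operation itself can be sketched in a few lines. This is a plain software version using the standard 3x3 masks and the common |Gx| + |Gy| magnitude approximation; it leaves border pixels at zero and is not the optimized hardware design measured above.

```python
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal gradient mask
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical gradient mask


def sobel(image):
    """Return |Gx| + |Gy| for the interior pixels of a 2-D list."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(GX[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = abs(gx) + abs(gy)
    return out


image = [[0, 0, 10, 10] for _ in range(4)]  # vertical edge in the middle
for row in sobel(image):
    print(row)
```

Every output pixel depends only on its own 3x3 window, which is what makes the kernel a natural candidate for the data partitioning discussed in these slides.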