Title: MAPLD 98 - Poster PCD
1 Making a Case for Distributed Adaptive Computing in Remote Sensing Science Data Processing
Clay Gloster, Department of Electrical and Computer Engineering, NC State University, Raleigh, NC, email: gloster_at_eos.ncsu.edu
Marco Figueiredo, marco_at_fpga.gsfc.nasa.gov, SGT Inc./NASA Goddard Space Flight Center, Greenbelt, MD, USA
2 Outline
- Problem Statement
- Potential Solutions
- Application: Image Classification
- An Overview of Adaptive Computing
- Research Goals/Vision
- Summer 97/98 Accomplishments
- Experimental Results
- Conclusions/Future Work
3 Generic Challenge: Scientific Data Processing
- Problem Statement
- Given an application that requires an excessive number of scientific computations
- Design a system that can perform these computations with the following constraints.
- The system must provide improved performance over the current state of the art.
- System cost cannot be excessive!!
- The system must be flexible and easy to adapt to new applications.
- The system development time must be small.
- Technology developed during system development must be easily transferred to many potential users.
4 NASA Challenge: Remote Sensing Science Data Processing
- Large data input/output requirements (Data Intensive)
- The MODIS instrument, to be launched on the EOS-AM1 satellite, has average daily data volumes of 530 MB.
- Large data processing requirements (Compute Intensive)
- The MODIS instrument, to be launched on the EOS-AM1 satellite, has a data processing requirement of 5.7 GFLOPs.
- Algorithms can change even after the instrument is in orbit
- Enhancements made to algorithms
- Errors found in algorithms
- Correction for errors introduced by instrument fatigue
5 Potential Solutions
- General-purpose processors (Software)
- Personal Computers (400 MHz), computer workstations, Supercomputers, etc.
- Application-specific processors (Hardware)
- ICs designed to solve a particular problem
- Special-purpose processors (Hybrid)
- Digital Signal Processors, Math Coprocessors, etc.
- Adaptive Computers (A New Paradigm)
- Consists of a low cost, high performance, software programmable general purpose processor with one or more low cost, high performance hardware programmable coprocessors.
6 Cost Analysis
$$\mathrm{Cost}(X) = \sum_{i=1}^{4} W_i X_i, \qquad W_i = 0.25$$
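The cost function on this slide is a sum of four weighted terms, each with weight 0.25. A minimal sketch of how such a score could be evaluated is shown here. The slide does not say what the four X_i criteria are, so the sample scores, the class name `WeightedCost`, and the method name `cost` are hypothetical placeholders; only the equal weights come from the slide.

```java
// Minimal sketch of the slide's cost function: Cost(X) = sum over i of W_i * X_i.
// The class name, method name, and sample scores are hypothetical.
class WeightedCost {
    // Equal weights for the four criteria, as stated on the slide.
    static final double[] WEIGHTS = {0.25, 0.25, 0.25, 0.25};

    // Returns the weighted sum of the four criterion scores.
    static double cost(double[] scores) {
        if (scores.length != WEIGHTS.length) {
            throw new IllegalArgumentException("expected " + WEIGHTS.length + " scores");
        }
        double total = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            total += WEIGHTS[i] * scores[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up scores for one candidate solution X.
        double[] sampleScores = {1.0, 2.0, 3.0, 4.0};
        System.out.println(cost(sampleScores)); // prints 2.5
    }
}
```

With equal weights the cost reduces to the plain average of the four scores, so unequal weights would be needed to favor one criterion over another.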
7 Application: Image Classification
8 Previous Results
Adaptive Computing has been shown to provide
several orders of magnitude speedup for select
applications.
9 Adaptive Computing
- With the advent of programmable hardware devices called Field Programmable Gate Arrays (FPGAs), we can now reprogram hardware.
- With today's technology, we can download a new function into an FPGA in a time on the order of 200 microseconds.
- Current devices also allow partial configuration of the device while other portions of the device continue to function.
[Figure: FPGA configuration at Time A and at Time B]
10 Adaptive Computing for Space Applications
[Figure: Satellite with Adaptive Computer]
11 Relevant Features of Java Release 1.1
- Object Oriented Programming Language
- The notion of hardware objects is easily implemented
- Threads
- Applets
- Native Methods (interface Java to existing code)
- Remote Method Invocation (RMI)
- The notion of relocatable hardware objects is easily implemented.
- Java Security (Digital Signatures, secure RMI)
Java is a good programming language for this project!
12 Proof of Concept (Summer 1997)
- Develop an understanding of the ASDP project and identify where NC State could make a contribution
- Use the Java language to implement a simple addition program that can be executed remotely.
- Accomplishments
- Implementation of a simple addition program controlled from the local host in Java.
- Implementation of a simple addition program controlled from a remote host in Java.
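A remotely controlled addition program of the kind listed above can be sketched with Java RMI. This is an illustration only, not the project's code: the names `Adder`, `AdderImpl`, and `RemoteAdditionDemo` are hypothetical, the sketch uses the current `java.rmi` API rather than the Java 1.1 tooling, and for brevity it runs client and server in one JVM on the loopback address and calls the exported stub directly instead of looking it up in an RMI registry.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical remote interface: every remotely callable method declares RemoteException.
interface Adder extends Remote {
    int add(int a, int b) throws RemoteException;
}

// Server-side implementation. On an adaptive computer, this is where a native
// method could pass the operands to the hardware (an assumption, not shown here).
class AdderImpl implements Adder {
    @Override
    public int add(int a, int b) {
        return a + b;
    }
}

class RemoteAdditionDemo {
    // Exports an AdderImpl, calls it through its RMI stub, then unexports it.
    static int addRemotely(int a, int b) {
        // Keep the demo on the loopback interface (assumption for a single-machine run).
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        AdderImpl server = new AdderImpl();
        try {
            // Port 0 lets the RMI runtime pick any free port.
            Adder stub = (Adder) UnicastRemoteObject.exportObject(server, 0);
            try {
                return stub.add(a, b);
            } finally {
                // Unexport so the RMI threads do not keep the JVM alive.
                UnicastRemoteObject.unexportObject(server, true);
            }
        } catch (RemoteException e) {
            throw new IllegalStateException("remote call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addRemotely(2, 3)); // prints 5
    }
}
```

In a two-machine setup the server would bind the stub in an RMI registry and the client would obtain it by name; the interface and implementation would stay the same.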
13 Proof of Concept (Summer 1998)
- Use the Java language to implement a new version of the PNN algorithm that can be executed remotely.
- Accomplishments
- Implementation of the PNN algorithm controlled from the local host in Java.
- Implementation of the PNN algorithm controlled from a remote host in Java.
- New implementation of the PNN algorithm that processes a block of data rather than a single pixel or a row.
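The slides do not spell out the PNN computation, so the sketch below assumes the textbook Probabilistic Neural Network form: for each class, average a Gaussian kernel over that class's training vectors and pick the class with the largest response. The class name `PnnSketch`, the smoothing parameter `sigma`, and the tiny training set are hypothetical. The `classifyBlock` method illustrates the block-based idea of handing many pixels to the classifier in one call instead of one call per pixel.

```java
// Sketch of a textbook PNN classifier (an assumption about the algorithm's form).
class PnnSketch {
    // Returns the index of the class whose training vectors give the largest
    // average Gaussian kernel response for this pixel.
    // training[c][k][b] is band b of training vector k for class c.
    static int classifyPixel(double[] pixel, double[][][] training, double sigma) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < training.length; c++) {
            double sum = 0.0;
            for (double[] w : training[c]) {
                double dist2 = 0.0;
                for (int b = 0; b < pixel.length; b++) {
                    double d = pixel[b] - w[b];
                    dist2 += d * d;
                }
                sum += Math.exp(-dist2 / (2.0 * sigma * sigma));
            }
            double score = sum / training[c].length;
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    // Block-based variant: classify every pixel of a block in a single call.
    static int[] classifyBlock(double[][] block, double[][][] training, double sigma) {
        int[] labels = new int[block.length];
        for (int p = 0; p < block.length; p++) {
            labels[p] = classifyPixel(block[p], training, sigma);
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up 4-band training vectors for two classes.
        double[][][] training = {
            {{10, 10, 10, 10}, {12, 11, 10, 9}},
            {{200, 190, 210, 205}}
        };
        double[][] block = {{11, 10, 10, 10}, {198, 192, 208, 204}};
        int[] labels = classifyBlock(block, training, 20.0);
        System.out.println(labels[0] + " " + labels[1]); // prints 0 1
    }
}
```

Passing a whole block per call matters most in the remote and hardware cases, where each call carries a fixed overhead that is then amortized over many pixels.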
14 Current Research Projects
- A generic design methodology for the implementation of high performance remote sensing scientific data processing applications that can drastically improve development time.
- A partial implementation of the PNN algorithm for image classification using floating-point units to evaluate the feasibility of using floating point as opposed to fixed point units in state-of-the-art reconfigurable computing environments.
- A distributed library of generic floating point arithmetic design modules that are fast, modular, and can easily scale.
- A distributed reconfigurable computing implementation of the PNN algorithm for image classification that improves performance by an order of magnitude over the software implementation.
- A prototype implementation of the hardware resource allocation system that manages reconfigurable computing hardware.
15 A Generic Design Methodology
- The von Neumann Paradigm
- Given a typical microprocessor/CPU with a fixed architecture
- Given an application
- Today's scientists are trained to use an existing design methodology to map the application onto the given processor.
- The New Paradigm
- Given a typical reconfigurable computer with an adaptive architecture
- Given an application
- Develop a new design methodology for future scientists to be trained in.
16 PNN: A Case Study
- Use PNN and other applications to develop a generic design methodology
- Use PNN and other applications to reveal the limitations of current reconfigurable computing architectures/systems
- The result of this study will be an environment
- Scientists can enter the concept in enough detail for an engineer to implement.
- Tasks developed as a part of the methodology will be evaluated for potential automation.
- Parallelism/distributed processing should be exploited in this environment whenever possible.
17 A Feasibility Study: Floating Point Units and Reconfigurable Computing
- Develop parameterized VHDL models for addition, subtraction, multiplication, and division
- Investigate the feasibility of implementing modular floating point units
- Assess the current state of the art to identify the capacity of current FPGAs for various floating point units.
- Make recommendations toward migrating from fixed point FPGA implementations to floating point.
- Estimate the time frame when FPGAs will be able to support application development using floating point units.
18 Fixed Point Versus Floating Point
[Figure: Area versus Pipeline Depth/Number of Units, for Floating Point and Fixed Point]
19 Hardware Resource Allocation System
- Single Adaptive Computing Resource Allocation System
- Multiple Adaptive Computing Resource Allocation System
- Preemptive Hardware Resource Allocation System
20 Single RC Resource Allocation System
[Figure: Site 1, Site 2, ..., Site N connected over the Internet to one adaptive computer with coprocessors CP 1, CP 2, ..., CP M1]
- N sites distributed over the Internet
- 1 adaptive computer resource
- M1 hardware programmable modules
21 Multiple RC Resource Allocation System
[Figure: Site 1, Site 2, ..., Site N connected over the Internet to multiple adaptive computers, each a microprocessor (µp) with coprocessors CP 1, CP 2, ..., CP M]
- N sites distributed over the Internet
- m adaptive computer resources
- $\sum_{i=1}^{m} M_i$ hardware programmable modules
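One way to picture the allocation problem on these slides is a shared pool of hardware modules that sites request and return. The sketch below is a hypothetical illustration, not the project's allocation system: the names `ModulePool`, `ModuleId`, `tryAcquire`, and `release` are invented here, and the policy (first free module wins, no preemption, no queuing of waiting sites) is an assumption chosen for brevity.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical pool of hardware programmable modules spread over m adaptive computers.
class ModulePool {
    // Identifies one module by adaptive computer index and module index (both 1-based).
    static final class ModuleId {
        final int computer;
        final int module;

        ModuleId(int computer, int module) {
            this.computer = computer;
            this.module = module;
        }

        @Override
        public String toString() {
            return "CP(" + computer + "," + module + ")";
        }
    }

    // Thread-safe queue of currently free modules.
    private final Queue<ModuleId> free = new ConcurrentLinkedQueue<>();

    // modulesPerComputer[i] is M_i, the module count of adaptive computer i + 1.
    ModulePool(int[] modulesPerComputer) {
        for (int c = 0; c < modulesPerComputer.length; c++) {
            for (int k = 0; k < modulesPerComputer[c]; k++) {
                free.add(new ModuleId(c + 1, k + 1));
            }
        }
    }

    // Returns a free module, or null when every module is in use.
    ModuleId tryAcquire() {
        return free.poll();
    }

    // Returns a module to the pool once the requesting site is done with it.
    void release(ModuleId id) {
        free.add(id);
    }

    // Number of modules currently free.
    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        // Two adaptive computers with 2 and 1 modules: 3 modules in total.
        ModulePool pool = new ModulePool(new int[] {2, 1});
        ModuleId granted = pool.tryAcquire();
        System.out.println("granted " + granted + ", free " + pool.available());
        pool.release(granted);
        System.out.println("free after release " + pool.available());
    }
}
```

The single adaptive computer case is the same pool built from a one-element array, and a preemptive variant would additionally need a way to take a module back from a lower-priority request.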
22 Experimental Results: PNN
- PNN image classification run on a 512x512 image, 4 bands, 5 classes.
- Experiments run on
- A Pentium 166 MHz with 64 MB of memory, running Windows NT
- Pixel and block based versions of the algorithm were evaluated.
- One block = 6 rows x 512 pixels/row x 4 bytes/pixel = 12,288 bytes
- Local and remote versions of the algorithm were evaluated.
- In the remote experiments, client and server were the same machine.
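The block size follows directly from the image geometry: 6 rows x 512 pixels/row x 4 bytes/pixel works out to 12,288 bytes. The short sketch below reproduces that arithmetic and also counts the blocks needed for a 512-row image. The class and method names are hypothetical, and treating the final two rows as a short last block is an assumption, since the slides do not say how the remainder is handled.

```java
// Arithmetic sketch for the block-based experiments; names are hypothetical.
class BlockLayout {
    // Bytes in one full block of image data.
    static int blockBytes(int rowsPerBlock, int pixelsPerRow, int bytesPerPixel) {
        return rowsPerBlock * pixelsPerRow * bytesPerPixel;
    }

    // Number of blocks needed to cover the image, counting a short final block.
    static int blockCount(int imageRows, int rowsPerBlock) {
        return (imageRows + rowsPerBlock - 1) / rowsPerBlock;
    }

    public static void main(String[] args) {
        System.out.println(blockBytes(6, 512, 4)); // prints 12288
        System.out.println(blockCount(512, 6));    // prints 86
    }
}
```

Since 512 is not a multiple of 6, 85 full blocks cover 510 rows and the 86th block would hold the remaining 2 rows.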
23 Local/Remote Image Classification
- Both the client and the server contain adaptive computing resources.
- Both the client and the server contain software (Java and C) versions of the classification algorithm.
- Local Classification is executed on the client.
- Remote Classification is executed on the server via a request from the client.
[Figure: NC State Univ. (gloster.cacc..ncsu.edu) and NASA GSFC (classic.gsfc.nasa.gov)]
24 Remote Image Classification (Pixel Based)
Description          Avg Row   Total
Software (Java)      14.83     7597.79
Software (C)         15.79     8087.24
Hardware (Single)    4.16      2128.38
Hardware (Multiple)  4.25      2156.02
Times reported in CPU seconds
25 Remote Image Classification (Block Based)
Description          Avg Row   Total
Software (Java)      2.65      1358.91
Software (C)         3.61      1847.33
Hardware (Single)    0.35      178.217
Hardware (Multiple)  0.35      180.15
Times reported in CPU seconds
26 Local Image Classification (Pixel Based)
Description          Avg Row   Total
Software (Java)      2.57      1317.31
Software (C)         3.65      1871.36
Hardware (Single)    0.27      141.10
Hardware (Multiple)  0.14      77.45
Times reported in CPU seconds
27 Local Image Classification (Block Based)
Description          Avg Row   Total
Software (Java)      2.56      1309.65
Software (C)         3.69      1889.63
Hardware (Single)    0.28      143.05
Hardware (Multiple)  0.28      142.57
Times reported in CPU seconds
28 An Interesting Result
- Using REMOTE HARDWARE is faster than LOCAL SOFTWARE
- Remote Single Module Classification: 178.22 s
- Local Java Classification: 1309.65 s
- What does this imply?
- One site should develop many applications using the proposed resource allocation system.
- Applications should be served from centers of excellence, i.e., Distributed Active Archive Centers (DAACs)
- Alternatively, applications should be given/licensed to users that have adaptive computing resources in-house.
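The size of this effect can be read off the two block-based totals quoted on this slide: 1309.65 s for local Java against 178.22 s for the remote single hardware module, a speedup of roughly 7.3x. A minimal sketch of that calculation follows; the class name `SpeedupReport` and the method name `speedup` are hypothetical, and the timings are the ones reported in the slides.

```java
// Speedup arithmetic for the reported timings; names are hypothetical.
class SpeedupReport {
    // Ratio of baseline time to accelerated time; values above 1 mean faster.
    static double speedup(double baselineSeconds, double acceleratedSeconds) {
        if (acceleratedSeconds <= 0.0) {
            throw new IllegalArgumentException("accelerated time must be positive");
        }
        return baselineSeconds / acceleratedSeconds;
    }

    public static void main(String[] args) {
        double localJava = 1309.65;        // local software (Java), block based
        double remoteHardware = 178.22;    // remote hardware, single module, block based
        double localHardware = 143.05;     // local hardware, single module, block based

        System.out.println("remote hardware vs local Java: " + speedup(localJava, remoteHardware));
        System.out.println("local hardware vs local Java:  " + speedup(localJava, localHardware));
    }
}
```

Comparing the two ratios gives a rough sense of how much of the hardware advantage survives the remote invocation overhead in these experiments.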
29 Conclusions/Future Work
- The Java programming language is a good language for this project.
- Distributed Adaptive Computing implementations can provide better performance than local software implementations
- Floating point implementations may be feasible for remote sensing science data processing applications.
- An adaptive computing hardware resource allocation system can be beneficial.
- Portions of a generic design methodology can be automated, reducing development time for reconfigurable computing implementations.