Title: Enabling Technologies for Reconfigurable Computing and Software / Configware Co-Design Part 3: Resources for RC and Data-Stream-based Computing
1 Enabling Technologies for Reconfigurable Computing and Software / Configware Co-Design
Part 3: Resources for RC and Data-Stream-based Computing
July 8, 2002, ENST, Paris, France
- Reiner Hartenstein
- University of Kaiserslautern
2 Schedule
time slot
10.00 - 11.00 Reconfigurable Computing (RC)
11.00 - 11.30 coffee break
11.30 - 12.30 Data-Stream-based Computing
12.30 - 14.00 lunch break
14.00 - 15.00 Resources for RC and Data-Stream-based Computing
15.00 - 15.30 Recent developments
15.30 - 16.00 Discussion
3 >> Configware Industry
- Configware Industry
- Terminology
- MoPL data-procedural language
- Anti architecture and circuitry
- Stream-based Memory Architecture
http://www.uni-kl.de
4 Configware heading for mainstream
- Configware market taking off for mainstream
- FPGA-based designs more complex, even SoC
- No design productivity and quality without good configware libraries (soft IP cores) from various application areas.
- FPGA vendors and a growing number of independent configware houses (soft IP core vendors) and design services.
5 OS for PLDs
- separate EDA software market, comparable to the compiler / OS market in computers
- Cadence, Mentor, Synopsys just jumped in
- < 5% of Xilinx / Altera income comes from EDA software
- Alliances with hundreds of partners providing hundreds of IP cores, synthesizable (hopefully) (WWW sites difficult to navigate)
6 >> Terminology
7 Terminology
8 Terminology: Acronyms
- DPU: datapath unit
- DPA: datapath array
- rDPU: reconfigurable DPU
- rDPA: reconfigurable DPA
- RC: reconfigurable computing
- RL: reconfigurable logic
- RA: reconfigurable array
- Software (SW): procedural sources
- Configware (CW): structural sources
- Hardware (HW): hardwired platforms
- ASIC: customizable hardwired platforms
- Flexware (FW): reconfigurable platforms
- FPGA: field-programmable gate array
- FPL: field-programmable logic
*) note: firmware is SW!
9 Babylonian Confusion
- Communication between areas, and between abstraction levels, is difficult, mainly because of non-intuitive, misleading or ambiguous terminology
10 >> MoPL data-procedural language
11 Fundamental Ideas available (1)
- Data Sequencer Methodology
- Data-procedural Languages (duality with von Neumann)
- ... supporting memory bandwidth optimization
- Soft Data Path Synthesis Algorithms
- Parallelizing Loop Transformation Methods
- Compilers supporting Soft Machines
- SW / CW Partitioning Co-Compilers
12 Fundamental Ideas available (2)
- Programming Xputers
- Similarities to programming computers
- How not to get confused by similarities
- What are the benefits vs. computers?
13 Programming Language Paradigms
easy to learn
14 Similar Programming Language Paradigms
very easy to learn
15 JPEG zigzag scan pattern
published in 1993
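The slide's figure is not preserved in this transcript. As a rough illustration of what a zigzag scan pattern is, the following sketch generates the address sequence of the JPEG zigzag scan over an n x n block. It is written in Python rather than MoPL, and the function name `zigzag_scan` is an assumption made for this example, not something taken from the slides.

```python
def zigzag_scan(n=8):
    """Return the (row, col) addresses of an n x n block in JPEG zigzag order."""
    order = []
    for s in range(2 * n - 1):              # s = row + col selects one anti-diagonal
        lo, hi = max(0, s - n + 1), min(s, n - 1)
        rows = list(range(lo, hi + 1))      # valid row indices on this diagonal
        if s % 2 == 0:
            rows.reverse()                  # even diagonals run from bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


if __name__ == "__main__":
    print(zigzag_scan(8)[:10])
```

The point of the example is that the whole pattern is a pure address sequence: no data-dependent control flow is involved, so a data sequencer can produce it independently of the computation applied to the data.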
16 >> Anti architecture and circuitry
17 GAU: Generic Address Unit Scheme
18 BSU: Basic Stepper Unit
19 GAG: Complex Sequencer Implementation
GAG: Generic Address Generator
20 Generic Sequence Examples
atomic scan
linear scan
video scan
-90º rotated video scan
-45º rotated (mirx (v scan))
sheared video scan
non-rectangular video scan
zigzag video scan
spiral scan
feed-back-driven scans
perfect shuffle
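A few of the scan patterns named above can be sketched as plain address sequences. The slide gives only the names, so the definitions below (and all function names) are assumptions chosen for illustration; in particular `column_scan` is only a column-by-column counterpart of the video scan, not a faithful model of the rotated scans on the slide.

```python
def linear_scan(start, step, count):
    """One-dimensional scan: start, start + step, start + 2*step, ..."""
    return [start + i * step for i in range(count)]


def video_scan(width, height):
    """Raster scan, line by line, as (x, y) addresses."""
    return [(x, y) for y in range(height) for x in range(width)]


def column_scan(width, height):
    """Column-by-column counterpart of the video scan (x and y swap roles)."""
    return [(x, y) for x in range(width) for y in range(height)]


def perfect_shuffle(n):
    """Interleave the two halves of 0..n-1 (n even): 0, n/2, 1, n/2 + 1, ..."""
    half = n // 2
    return [i // 2 + (i % 2) * half for i in range(n)]


if __name__ == "__main__":
    print(linear_scan(0, 4, 3))
    print(video_scan(3, 2))
    print(perfect_shuffle(8))
```

Each pattern is fully determined by a handful of parameters (start, step, limits), which is what makes it plausible to produce such sequences with a generic, parameterized address generator instead of instruction-driven address arithmetic.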
21 Slider Demo
address
22 XMDS Scan Pattern Editor GUI
23 >> Stream-based Memory Architecture
24 MoM Xputer Architecture
published in 1990
25 Antimachine MoM architecture
26 Linear Filter Application: 11 x 22, initial
Dissertation Michael Herz
9 x 20 = 180
1620
27 Linear Filter: scanline unrolling
3 x 20 = 60
900
28 90° Rotation of Scan Pattern
3 x 10 = 30
600
29 Linear Filter Application: final
Speed-up factor: 11.2
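The figures for these slides are lost, so the meaning of the numbers is not stated. One reading that reproduces all of them is sketched below: a scan window is stepped over an 11 x 22 data area, the product (180, 60, 30) is the number of scan steps, and the number beneath it (1620, 900, 600) is the number of steps times the window size in words. The window sizes 3 x 3, 5 x 3 and 5 x 4 and the step widths are assumptions picked because they match the slide numbers; they are not taken from the source, and the speed-up factor of 11.2 is not derived here.

```python
def scan_cost(rows, cols, win_rows, win_cols, step_rows=1, step_cols=1):
    """Count scan steps and window words touched for a stepped scan window.

    Returns (vertical steps, horizontal steps, total steps, steps * window size).
    """
    steps_r = (rows - win_rows) // step_rows + 1
    steps_c = (cols - win_cols) // step_cols + 1
    steps = steps_r * steps_c
    return steps_r, steps_c, steps, steps * win_rows * win_cols


if __name__ == "__main__":
    # Assumed window shapes and step widths (hypothetical, see lead-in).
    print(scan_cost(11, 22, 3, 3))          # initial: 3 x 3 window, unit steps
    print(scan_cost(11, 22, 5, 3, 3, 1))    # three scanlines handled per step
    print(scan_cost(11, 22, 5, 4, 3, 2))    # additionally two columns per step
```

Under this reading, each transformation enlarges the window so that fewer, wider steps cover the same area, and the total number of words touched falls from 1620 to 600.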
30 MoM Application Examples
- Image Processing
- Grid-based design rule check, 1983
- 4 by 4 word scan cache
- Pattern-matching based
- Our own nMOS DPLA design
- design rule violation pixel map automatically generated from textual design rules
- 256 MC nMOS, 800 single metal CMOS
- Speed-up > 10000 vs. Motorola 68000
*) machine not yet discovered
31 MoM Architecture Features
- Scan Cache Size adjustable at run time
- Any other shape than square supported
- 2-dimensional memory space
- Supports generic scan patterns
- Subject of parallel access transformations
- compare Francky Catthoor et al.
- Supports visualization
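The interplay of a 2-dimensional memory space, a scan pattern and an adjustable scan cache can be pictured with a small software model. This is only a sketch under assumptions: the function `scan_cache_windows` and its parameters are hypothetical names for this example and do not describe the actual MoM hardware.

```python
def scan_cache_windows(memory, cache_rows, cache_cols, scan_pattern):
    """Yield the cache_rows x cache_cols window located at each scan position."""
    for row, col in scan_pattern:
        yield [line[col:col + cache_cols] for line in memory[row:row + cache_rows]]


if __name__ == "__main__":
    # 4 x 6 two-dimensional memory; each word encodes its own address as 10*row + col.
    memory = [[10 * r + c for c in range(6)] for r in range(4)]
    # Video scan over all positions where a 2 x 2 window fits.
    scan = [(r, c) for r in range(3) for c in range(5)]
    for window in scan_cache_windows(memory, 2, 2, scan):
        print(window)
```

Changing `cache_rows` and `cache_cols` between calls corresponds to resizing the scan cache, including non-square shapes, while the scan pattern stays an independent input.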
32 Hot Research Topic: Memory Architectures
- High Performance Embedded Memory Architectures (Catthoor et al.)
- High Performance Memory Communication Architectures (Herz)
- Custom Memory Management Methodology (Catthoor et al.)
- Data Reuse Transformations (Kougia et al.)
- Data Reuse Exploration (Soudris, Wuytack)
- Rapidly growing market: IP cores, module generators, etc.
33 Processor Memory Performance Gap
von Neumann bottleneck
34 rDPAs: classical cache does not help
- super pipe networks, no parallel computers!
- Stream-based arrays are a memory bandwidth problem
- the memory bandwidth problem is often more dramatic than for microprocessors
- classical interleaving is not practicable, since it is based on sequential instruction streams
- classical caches do not help, since instruction sequencing is not used
- the problem: throughput of parallel data streams, not instruction streams
35 Cache does not help ...
however, the anti machine has no v.N. bottleneck!
36 Data-Stream-based Soft Anti Machine
37 The Disk Farm? or a System On a Card?
Gordon Bell, Jim Gray, ISCA2000
- The 500GB disc card
- LOTS of bandwidth
- A few disks replaced by
- >10s of Gbytes RAM
- and a processor
MicroDrive: 1.7" x 1.4" x 0.2"
2006: ?
1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
2006: 9 GB, 50 MB/s ? (1.6X/yr capacity, 1.4X/yr BW)
Integrated IRAM processor, 2x height
Connected via crossbar switch growing like Moore's law
16 Mbytes, 1.6 Gflops, 6.4 Gops
10,000 nodes in one rack!
100/board = 1 TB, 0.16 Tflops
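The 2006 estimates on this slide can be checked against the 1999 figures and the quoted yearly growth factors. The sketch below simply compounds the factors over the seven years from 1999 to 2006; the function name `project` is an assumption for this example.

```python
def project(value, growth_per_year, years):
    """Compound a value by a constant yearly growth factor."""
    return value * growth_per_year ** years


if __name__ == "__main__":
    years = 2006 - 1999
    capacity_mb = project(340, 1.6, years)      # 340 MB at 1.6X per year
    bandwidth_mb_s = project(5, 1.4, years)     # 5 MB/s at 1.4X per year
    print(round(capacity_mb), "MB,", round(bandwidth_mb_s, 1), "MB/s")
```

The results come out at roughly 9.1 GB and about 53 MB/s, consistent with the "9 GB, 50 MB/s" estimate. The gap between the two growth factors is the point of the slide: capacity grows faster than bandwidth.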
38 >>> Coarse Grain
- END -
39 Appendix
- APPENDIX -
40 Alliances
41 Xilinx Alliances
- The Software Alliance EDA Program
- ... Xilinx Inc.'s Foundation...
- free WebPACK downloadable tool palette
- The Xilinx XtremeDSP Initiative (with Mentor Graphics)
- MathWorks / Xilinx Alliance
- The Wind River / Xilinx alliance
42 The Software Alliance EDA Program
- provides a wide selection of EDA tools
- helps leading EDA vendors to integrate Xilinx Alliance software tightly into their tools
- Acugen Software
- Agilent EEsof EDA
- Aldec
- Aptix
- Auspy Development
- Cadence
- Celoxica
- Dolphin Integration
- Elanix
- Exemplar
- Flynn Systems
- Hyperlynx
- IKOS Systems
- Innoveda
- Mentor Graphics
- MiroTech
- Model Technology
- Protel International
- Simucad
- SynaptiCAD
- Synopsys
- Synplicity
- Translogic
- Virtual Computer Corporation
43 The Xilinx AllianceCORE program
- a cooperation between Xilinx and third-party core developers, to produce a broad selection of industry-standard solutions for use in Xilinx platforms.
- Partners are:
Amphion Semiconductor, Ltd.; ARC Cores; CAST, Inc.; DELTATEC; Derivation Systems, Inc.; Dolphin Integration (Grenoble); Eureka Technology Inc.; Frontier Design Inc.; GV Associates, Inc.; inSilicon Corporation; iCODING Technology Inc.; Loarant Corporation; Mindspeed Technologies - A Conexant Business (formerly Applied Telecom); MemecCore; Mentor Graphics Inventra; NewLogic Technologies, Inc. (Europe); NMI Electronics; Paxonet Communications, Inc.; Perigee, LLC; Rapid Prototypes Inc.; sci-worx GmbH (Hannover, Germany); SysOnChip; TILAB (Telecom Italia Lab); VAutomation; Virtual IP Group, Inc.; XYLON.
44 The Xilinx Reference Design Alliance Program
- The Xilinx Reference Design Alliance Program helps the development of multi-component reference designs that incorporate Xilinx devices and other semiconductors.
- The designs are fully functional, but come with no warranties and no liability.
- Partners are:
ADI Engineering; Innovative Integration; JK microsystems, Inc.; LYR Technologies; NetLogic Microsystems
45 The Xilinx University Program
- The Xilinx University Program provides
- Xilinx Student Edition Software,
- Professor Workshops,
- a Xilinx University User Group,
- Presentation Materials and Lab Files,
- Course Examples,
- Research,
- Books, etc.
46 Altera offers over a hundred IP cores (1)
Altera offers over a hundred IP cores like, for example
- controller,
- UART,
- microprocessor,
- decoder,
- bus control,
- USB controller,
- PCI bus interface,
- Viterbi controller,
- fast Ethernet MAC receiver or transmitter,
- modulator,
- synchronizer,
- DDR SDRAM controller,
- Hadamard transform,
- interrupt controller,
- Real86 16 bit microprocessor,
- floating point,
- FIR filter,
- discrete cosine,
- ATM cell processor,
- and many others.
47 Altera offers over a hundred IP cores (2)
- from Altera
- AMIRIX Systems, Inc.
- Amphion Semiconductor, Ltd.
- Arasan Chip Systems, Inc.
- CAST, Inc.
- Digital Core Design
- Eureka Technology Inc.
- HammerCores
- Innocor
- Ktech Telecommunications, Inc.
- Lexra Computing Engines
- Mentor Graphics - Inventra
- Modelware
- Ncomm, Inc.
- NewLogic Technologies
- Northwest Logic
- Nova Engineering, Inc.
- Palmchip Corporation
- Paxonet Communications
- PLD Applications
- Sciworx
- Simple Silicon
- Tensilica
- TurboConcept.
48 Altera IP core design services
- Altera IP core design services are available from
49 Altera Certified Design Center (CDC) Program
- Certified Design Center (CDC) Program
- Barco Silex
- El Camino GmbH
- Excel Consultants
- Plextek
- Reflex Consulting
- Sci-worx
- Tality
- Zaiq Technologies.
50 The Altera Consultants Alliance Program (ACAP)
- The Altera Consultants Alliance Program (ACAP) lists
- 41 offices in North America and
- 29 in the rest of the world.
51 Development boards
- Development boards are offered by
- Altera
- El Camino GmbH
- Gid'el Limited
- Nova Engineering, Inc.
- PLD Applications
- Princeton Technology Group
- RPA Electronics Design, LLC
- Tensilica.
52 Consultants and services not listed by Xilinx or Altera (index)
- Algotronix, Edinburgh
- Andraka Consulting Group
- Arkham Technology, Pasadena, CA
- Barco Silex, Louvain-la-Neuve, Belgium
- Bottom Line Technologies, Milford, NJ
- Codelogic, Helderberg, South Africa
- Coelacanth Engineering, Norwell, MA
- Comit Systems, Inc., Santa Clara, CA
- EDTN Programmable Logic Design Center
- Flexibilis, Tampere, Finland
- Geoff Bostock Designs, Wiltshire, England
- Great River Technology, Albuquerque, NM
- New Horizons GB Ltd, United Kingdom
- North West Logic
- Silicon System Solutions, Canterbury, Australia
- Smartech, Tampere, Finland
- Tekmosv, Austin, Texas
- The Rockland Group, Garden Valley, CA
- Nick Tredennick, Los Gatos, California
- Vitesse
53 Consultants and services not listed by Xilinx or Altera (1)
- Algotronix, Edinburgh: Reconfigurable Computing and FPL in software radio, communications and computer security
- Andraka Consulting Group: high performance FPGA designs for DSP applications
- Arkham Technology, Pasadena: low cost IP cores for Xilinx and Atmel, embedded processor, DSP, wireless communication, COM / CORBA / DirectX, client-server database programming, software internationalization, PCB design
- Barco Silex, Louvain-la-Neuve, Belgium: IP integration boards for ASIC and FPGA, consultancy, design, sub-contracting
54 Consultants and services not listed by Xilinx or Altera (2)
- Bottom Line Technologies, Milford, New Jersey: FPGA design, training, designing Xilinx parts since 1985
- Codelogic, Helderberg, South Africa: consulting, FPGA design services
- Coelacanth Engineering, Norwell, Massachusetts: design services, test development services, in wireless communication, DSP-based instrumentation, mixed-signal ATE
- Comit Systems, Inc., Santa Clara, California: DSP, ASIC, networking, embedded control in avionics; FPGA / ASIC design and system software
- EDTN Programmable Logic Design Center
55 Consultants and services not listed by Xilinx or Altera (3)
- FirstPass, Castle Rock, Colorado
- Vitesse: ASIC design
- Flexibilis, Tampere, Finland: VHDL IP cores for Xilinx products
- Geoff Bostock Designs, Wiltshire, England: FPGA design services
- Great River Technology, Albuquerque, New Mexico: FPGA design services in digital video and point-to-point data transmission for aerospace, military, and commercial broadcasters
- New Horizons GB Ltd, United Kingdom: FPGA design and training, Xilinx specialist
- North West Logic: FPGA and embedded processor design in digital communications, digital video
56 Consultants and services not listed by Xilinx or Altera (4)
- Silicon System Solutions, Canterbury, Australia: VHDL IP cores for the ASIC and FPGA/CPLD/EPLD markets
- Smartech, Tampere, Finland: ASIC and FPGA design
- Tekmosv, Austin, Texas: Multiple Designs on a Single Gate Array, HDL synthesis, design conversions, chip debug, test generation
- The Rockland Group, Garden Valley, California: a TeleConsulting organization about logic design for FPGAs
- Nick Tredennick, Los Gatos, California: investor and consultant
57 Terms
58 Confusing Terminology
- Computer Science and EE, as well as their R&D and application areas, suffer from a Babylonian confusion.
- Communication not only between Computer Science and EE, but also between their special areas, and even between their different abstraction levels, is made difficult mainly by immature terminology relating to reconfigurable circuits and their applications.
- Terms are rarely standardized and are often used with drastically different meanings, even within the same special area.
- Terms have often been coined so badly that they are not self-explanatory, but misleading. An illustrative example is the comparison of the terms used in VHDL and Verilog.
- "Intuitive" terms are ideal. But intuition often yields the wrong idea. Whenever a new term appears in teaching, I often have to tell the students that the term does not mean what they believe.
59 Terms (1)
à la Ingo Kreuz
Term | Meaning | Example
Hardware | hardwired | processor, ASIC
Flexware | reconfigurable (structurally programmable) | FPLA, FPGA, KressArray
Firmware | microprogram (rarely used after the introduction of RISC processors) | IBM 360 computer family
Software | procedural programs (sequentially executable by a CPU) | Word, C, OS, compiler, etc.
Configware | structural programs, soft IP cores, personalizing CPLD, FPGA, or other Flexware | configuration for rDPA or FPGA, e.g. as a logic circuit, state machine, datapath, function
60 Terms (2)
à la Ingo Kreuz
Term | Meaning | Example
data | objects of computing; whether something is data depends on the moment of observation | bits, numbers, operands, results, any text (also compiler input), lists, graphs, tables, images, ...
data stream | ordered, also parallel, data word lists, obtained by scheduling | I/O data streams for systolic or other arrays
programming | personalization by loading program code | procedural code, or structural code for (re)configuration
program | source text or object code for programming | procedural or structural
61 Terms (3)
à la Ingo Kreuz
Term | Meaning | Example
boot program | simple program to enable programming, usually saved in non-volatile memory | comparable to the starter motor of a car
booting | load and execute a boot program |
62 Hardware Terms (1)
à la Ingo Kreuz
Term | Meaning | Example
machine | execution unit, driven by a deterministic sequencer | von Neumann machine
dataflow machine | not a machine, since without a deterministic sequencer (exotic concept) | (sleeping research area)
CPU | instruction set processor ("von Neumann"): program counter (instruction sequencer) and DPU; mode of operation: deterministically instruction-driven | ARM, Pentium core
63 Hardware Terms (2)
à la Ingo Kreuz
Term | Meaning | Example
DPU | data path unit, processes operands; not a CPU, since without sequencer; not a machine | ALU with registers, multiplexers, etc.
Computer | CPU with RAM and interfaces |
Parallel Computer | ensemble of several computers |
Xputer | deterministically data-driven machine (transport-triggered); data counter(s) used instead of a program counter | MoM architectures (Kaiserslautern)
dataflow machine | indeterministically data-driven (execution sequence unpredictable) | (sleeping research area)
64 Terms on Parallelism (1)
à la Ingo Kreuz
Term | Meaning | Example
parallelism | several levels of parallelism are distinguished | parallel processes, parallelism at instruction set level, pipelines, ...
concurrent | parallel processes run on different CPUs of a parallel computer; may occasionally exchange signals or data | weather forecasting, complex simulations, etc.
ISP (instruction set parallelism) | several CPUs run in parallel by clocked synchronization | VLIW (very long instruction word) computer
65 Terms on Parallelism (2)
à la Ingo Kreuz
Term | Meaning | Example
pipelining | several uniform or different DPUs running simultaneously, connected to a pipeline by buffer registers | pipelined CPUs, pipe networks, systolic arrays, etc.
chaining | several uniform or different DPUs running simultaneously, connected to a pipeline without buffer registers | combinational circuits, complex arithmetic operators
pipe network | ensemble of DPUs, also multiple pipelines, also with irregular or wild structures | systolic arrays, stream-based computing arrays
66 Terms on Parallelism (3)
à la Ingo Kreuz
Term | Meaning | Example
systolic array | pipe network with only linear (straight-on, no branching), uniform pipelines (all DPUs hardwired and with the same functionality) | matrix computation, DSP, DNA sequencing, etc.
stream-based computing arrays (super-systolic arrays) | pipe network, configured before fabrication | image processing, DSP, complex functions and algorithms
(coarse grain) reconfigurable stream-based arrays | stream-based arrays, configurable after fabrication | KressArray
67 Counterparts
à la Ingo Kreuz
category | property | counterpart
programming mode | procedural (classical) | structural (synthesis, design): field-programmable, PLA programming, etc.
machine principle of operation | control-flow-driven (instruction-driven): von Neumann | data-driven: Xputer machine
system principle of operation | instruction-flow-driven (parallel computer etc.) | data-stream-based (systolic array, DPU array, KressArray)
set-up time (datapaths switched through) | during run time (instruction-driven) | before run time: FPGA (at compile time), gate array (at fabrication)
68-
69 Synthesizable Memory Communication
http://kressarray.de
70 Opportunities by new patent laws?
- To clever guys who are keen on patents:
- Don't file patents on the following details!
- Everything shown in this presentation was published years ago.