Title: Future Directions in Computer and Systems Architecture for Scientific Computing
1. Future Directions in Computer and Systems Architecture for Scientific Computing
- Rick Stevens
- Director, Mathematics and Computer Science Division, Argonne National Laboratory
- and
- Professor of Computer Science and Co-Director of The Computation Institute, The University of Chicago
- stevens@mcs.anl.gov
2. Outline of This Talk
- Supercomputing trends
- Possible paths to PetaFLOPS
- Where are clusters going?
- What's after PC clusters?
- What's this thing called the Grid?
- Questions
3. The National Academic Usage of Supercomputers Continues to Grow Exponentially
(Chart: usage rising from gigaflops to teraflops, roughly 1000x growth since 1985.)
Source: Quantum Research; Lex Lane, NCSA
4. (Collage of scientific computing application areas)
Rational Drug Design
Nanotechnology
Tomographic Reconstruction
Phylogenetic Trees
Biomolecular Dynamics
Neural Networks
Crystallography
Fracture Mechanics
MRI Imaging
Reservoir Modelling
Molecular Modeling
Biosphere/Geosphere
Diffraction Inversion Problems
Distribution Networks
Chemical Dynamics
Atomic Scattering
Electrical Grids
Flow in Porous Media
Pipeline Flows
Data Assimilation
Signal Processing
Condensed Matter Electronic Structure
Plasma Processing
Chemical Reactors
Cloud Physics
Electronic Structure
Boilers
Combustion
Actinide Chemistry
Radiation
Fourier Methods
Graph Theoretic
CVD
Quantum Chemistry
Reaction-Diffusion
Chemical Reactors
Cosmology
Transport
n-body
Astrophysics
Multiphase Flow
Manufacturing Systems
CFD
Basic Algorithms Numerical Methods
Discrete Events
PDE
Weather and Climate
Air Traffic Control
Military Logistics
Structural Mechanics
Seismic Processing
Population Genetics
Monte Carlo
ODE
Multibody Dynamics
Geophysical Fluids
VLSI Design
Transportation Systems
Aerodynamics
Raster Graphics
Economics
Fields
Orbital Mechanics
Nuclear Structure
Ecosystems
QCD
Pattern Matching
Symbolic Processing
Neutron Transport
Economics Models
Genome Processing
Virtual Reality
Astrophysics
Cryptography
Electromagnetics
Computer Vision
Virtual Prototypes
Intelligent Search
Multimedia Collaboration Tools
Computer Algebra
Databases
Magnet Design
Computational Steering
Scientific Visualization
Data Mining
Automated Deduction
Number Theory
CAD
Intelligent Agents
5. Can PetaFLOPS Transition Us from Computing as a Means for Explanation to Computing as a Mode for Discovery?
- Two modes of discovery
  - getting to a new place first!!
  - staying in the new place with the time to look around
- Needed environment
  - reliable infrastructure (discovery happens at the edges)
  - predictable and abundant availability of resources (time)
  - freedom from pressure to avoid failure (room for dumb ideas)
- To make this transition we need to make scientific computing resources more abundant, more available, and more usable
6. Best-Prepared Applications Communities
- Astrophysics
- Stars, Supernovae, Neutron Stars, Cosmology
- Weapons Simulations
- Burn Codes, 3D Hydrodynamics, 3D Structures
- Climate and Weather Modeling
- Computational Chemistry
- Molecular Modeling
- Cryptography
7. Less-Prepared Applications Communities
- Computational Linguistics
- Computational Economics
- Operations Research
- Bioinformatics
- Computational Logic
- Complex Systems Simulation
- Engineering/Design Simulations
9-10. (Side-view diagrams of a proposed cryogenic petaflops machine room: a 4 K, 50 W helium-cooled core inside a 77 K nitrogen-cooled shell; fiber/wire interconnects; WDM sources, optical amplifiers, and 980 nm pumps (20 cabinets); a tape silo array (400 silos) and a hard disk array (40 cabinets); front-end computer server, console, cable tray assembly, generators, and 220 V feeds; component dimensions on the order of 0.3-3 m.)
11. Simplified Application View of an HTMT Slice
- DPIM: 512K logical CPUs (128 per slice), 500 MHz clock, 32 MB per logical CPU; bandwidth 8 GB/s per PIM, logical-CPU latency 85 ns; 128 DPIM CPUs and 4 GB RAM per slice
- VORTEX (software layer)
- RTI: 10 GB/s DPIM <-> DV, 40 GB/s DV <-> SPIM
- SPIM: 256K logical CPUs (64 per slice), 1 GHz clock, 4 MB per logical CPU; bandwidth 512 GB/s per logical CPU, latency 20 ns; 64 SPIM CPUs and 256 MB RAM per slice
- CNET: 512 GB/s SPIM <-> SPELL
- SPELL + CRAM: 4096 units (1 per slice), 256 GHz clock, 1 MB CRAM per logical CPU; bandwidth 64 GB/s per logical CPU, latency 20 ns; 1 SPELL and 1 MB CRAM per slice
12. IBM's Blue Gene Machine
- 1 million CPUs
- 1 GFLOPS/CPU, 64 bits, add/mult per cycle @ 500 MHz
- 32 CPUs per chip, 16 MB memory on chip (32 banks), 5-cycle access
- 57 RISC instructions, 8 register banks, 8-way threaded (8M threads)
- 6 I/O channels per die, 8 bits wide, 2 GB/s bidirectional
- DMA engine per channel, active-messaging hardware support
- External memory interface per chip (not used on the internal nodes)
- 3D mesh topology with dynamic routing
- Self-healing capability (routes around bad CPUs and chips)
- Edge chips will have external memory and PC interfaces
- 64 chips per board → 2K CPUs per board → 32K CPUs per rack
- 32 racks, air cooled, about 2,000 sq. ft. floor space
- Chips in design phase today; chip production starts in '01
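Blue Gene's 3D mesh with dynamic routing can be reasoned about with simple Manhattan-distance arithmetic; a minimal sketch, assuming an illustrative 32 × 32 × 32 chip mesh (the slide does not give the actual mesh dimensions):

```python
# Sketch: hop counts under dimension-ordered routing in a 3D mesh.
# The 32 x 32 x 32 layout is a hypothetical example chosen so that
# 32K chips x 32 CPUs/chip ~ 1M CPUs, matching the slide's totals.

def mesh_hops(src, dst):
    """Dimension-ordered routing traverses |dx| + |dy| + |dz| links."""
    return sum(abs(s - d) for s, d in zip(src, dst))

def mesh_diameter(dims):
    """Worst case: a corner-to-corner route."""
    return sum(n - 1 for n in dims)

dims = (32, 32, 32)                      # hypothetical chip mesh
print(mesh_hops((0, 0, 0), (3, 1, 2)))   # 6 hops
print(mesh_diameter(dims))               # 93-hop diameter
```

Dynamic routing lets messages detour around the bad chips the slide mentions, at the cost of hops above this minimum.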
13. IBM Blue Gene: Software and Applications
- Compiler (C and C++) based on PowerPC compilers
- Communications library (simple message passing)
- Dynamic source routing, active-message-like hardware support
- Simple node exec (each node only has 500K memory)
- Application target → protein folding, brute force
  - one chip per atom (32K atoms)
  - bonded force terms computed mostly locally
  - non-bonded terms computed via messaging
- Other possible application targets
  - N-body problems in cosmology?
  - emergent phenomena (network simulation, ALife, automata theory)
14. Chiba City: The Argonne Scalable Cluster
- 8 computing towns: 256 Pentium III systems
- 1 visualization town: 32 Pentium III systems with Matrox G400 cards
- 1 storage town: 8 Xeon systems with 300 GB disk each
- Cluster management: 12 PIII mayor systems, 4 PIII front-end systems, 2 Xeon file servers, 3.4 TB disk
- Management net: Gigabit and Fast Ethernet; Gigabit external link
- High-performance net: 64-bit Myrinet
15. Chiba City System Details
- Communications
  - 64-bit Myrinet computing net
  - switched fast/gigabit Ethernet management net
  - serial control network
- Software environment
  - Linux (based on RH 6.0), plus install-your-own-OS support
  - Compilers: GNU gcc/g++, etc.
  - Libraries and tools: PETSc, MPICH, Globus, ROMIO, SUMMA3d, Jumpshot, visualization, PVFS, HPSS, ADSM, PBS + Maui Scheduler
- Purpose
  - scalable CS research
  - prototype application support
- System: 314 computers
  - 256 computing nodes: PIII 550 MHz, 512 MB, 9 GB local disk
  - 32 visualization nodes: PIII 550 MHz, 512 MB, Matrox G200
  - 8 storage nodes: 500 MHz Xeon, 512 MB, 300 GB disk (2.4 TB total)
  - 10 town mayors, 1 city mayor, other management systems: PIII 550 MHz, 512 MB, 3 TB disk
17. Example: 320-Host Clos Topology of 16-Port Switches
(Diagram: five groups of 64 hosts each, interconnected through a Clos network of 16-port switches.)
(From Myricom)
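The 320-host figure can be sanity-checked with basic Clos sizing arithmetic; a sketch assuming the common half-down/half-up port split, not Myricom's exact wiring (the spine must itself be built recursively from 16-port switches):

```python
# Sketch: sizing the leaf stage of a folded Clos network built from
# fixed-radix switches. The half-hosts/half-uplinks split is the usual
# rearrangeably-nonblocking convention, assumed here for illustration.

def clos_leaf_stage(hosts, radix):
    """Return (leaf switch count, total uplinks into the spine)."""
    down = radix // 2                   # ports facing hosts
    leaves = -(-hosts // down)          # ceiling division
    uplinks = leaves * (radix - down)   # ports facing the spine
    return leaves, uplinks

leaves, uplinks = clos_leaf_stage(320, 16)
print(leaves, uplinks)  # 40 leaf switches, 320 spine-facing links
```

With 320 uplinks to absorb, the spine stage needs more ports than any single 16-port switch has, which is why the full fabric is a multi-level Clos rather than a two-level tree.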
18. Evolution of Cluster Services: Usage Management Model
(Diagram: as the cluster grows, login, file service, scheduling, and management migrate from a single shared computer to one or more dedicated computers per service.)
- Basic goal: improve computing performance; improve system reliability and manageability
- Server strategy: from 1 computer distinct from the computing nodes, to 1 or more computers per service
19. Alarming Trends in Systems Software Availability from Computing Hardware Vendors
(Diagram, 1990 → 2000 → 2010: necessary computing power grows from GigaFLOPs to TeraFLOPs to PetaFLOPs. Over time, the vendor-supplied portion of the stack shrinks to just the operating system and hardware, leaving applications, high-performance tools, high-performance environments, and system software to the customer. Solutions shift from focusing on scientific computing to leveraging business solutions, plus some government research.)
20. The HPC Community Is Leveraging Commodity Hardware
- But it has not yet been able to significantly leverage commodity software, or even commodity software ABIs/APIs beyond the basic kernel and OS services
  - commodity software components don't yet exist to build scalable systems software
  - high performance is not the primary goal of the few commercial efforts building systems-software-type infrastructure for commodity environments
- The high-end community has a perennial labor shortage of HPC systems software developers
- How to establish a critical mass of systems software developers focused on the high end?
21. Targeted Hardware Systems
- End-user integrated clusters (10K-100K nodes)
  - low-cost high-volume CPUs and motherboards (SMP 8-16)
  - low-cost high-volume communications and SANs
  - relatively simple memory subsystems and I/O support
- Vendor-integrated systems (10K nodes)
  - high-performance CPUs and packaging (SMP > 16)
  - proprietary SANs, communications, and interfaces
  - sophisticated memory and I/O subsystems
- New-generation architectures (1K-1M nodes)
  - unique CPUs, ISAs, and packaging
  - unique SANs and communications interfaces
  - ultra-sophisticated memory and I/O subsystems
22. Possible Software Stack for HPC
23. Node Issues
- Floating Point and Integer Performance
- Memory Bandwidth
- Cache Performance and Cache Sizes
- Motherboard Chip Sets
24. Some Cluster Node Options Today
- Intel Pentium III
- AMD Athlon
- Compaq Alpha
- Intel IA-64
- IBM PowerPC Power4
25. Intel Pentium III
- Now available at 1.0 GHz (1000 MHz) for the desktop
- 866, 850, 800, 750, 733, 700, 667, 650, 600, 550, 533, 500, and 450 MHz clock speeds
- 70 new instructions (since the first Pentium)
- P6 architecture
- 133- or 100-MHz system memory bus
- 512K level-two cache or 256K Advanced Transfer Cache
- Intel's 1 GHz Pentium III is shipping now at a list price of $990
- For more information, check out Intel's Web site at www.intel.com/PentiumIII/
26. IA-32 Willamette Demoed @ 1.5 GHz
- The Willamette bus is a source-synchronous 64-bit 100 MHz bus that is quad-pumped, delivering a total of 3.2 GB/s of bandwidth, three times the bandwidth of the fastest Pentium III bus.
- A unique and unexpected aspect of Willamette's microarchitecture is its "double-pumped" ALUs. Claiming the effective performance of four ALUs, the two physical ALUs are each capable of executing an operation every half clock cycle.
- 20-stage pipeline, compared to 10 on the P6
- SSE2 includes support for (dual) double-precision SIMD floating-point operations and uses the MMX register set.
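The 20-stage-vs-10-stage trade-off is essentially clock rate against branch-flush penalty; a sketch with illustrative (not measured) clock rates and branch statistics:

```python
# Sketch: effective throughput of a scalar pipeline that pays a full
# flush on each mispredicted branch. The branch frequency and
# misprediction rate below are illustrative assumptions, not data
# for Willamette or the P6.

def effective_throughput(clock_ghz, flush_penalty_cycles,
                         branch_freq=0.2, mispredict_rate=0.1):
    """Instructions per nanosecond, one instruction per cycle baseline."""
    stall_per_instr = branch_freq * mispredict_rate * flush_penalty_cycles
    cpi = 1.0 + stall_per_instr
    return clock_ghz / cpi

deep = effective_throughput(1.5, 20)     # deeper pipe, higher clock
shallow = effective_throughput(1.0, 10)  # shallower pipe, lower clock
print(deep > shallow)  # the clock gain can outweigh the longer flush
```

With these numbers the deep pipe still wins, which is the bet Willamette's designers made; poor branch prediction would reverse it.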
27. AMD Athlon vs. Pentium III
- The Pentium III's smaller but faster on-chip L2 cache is superior to the Athlon's larger but slower external L2 cache for most benchmarks.
- The Pentium III's cache scales with core CPU frequency, while the Athlon's does not. The Athlon's deficiency in this regard will not be remedied until AMD delivers Thunderbird, which will have a 256K on-chip L2.
(Source: AMD, Intel, MDR)
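The cache claim can be illustrated with an average-memory-access-time (AMAT) calculation; all hit rates and latencies below are illustrative assumptions, not benchmark data for either chip:

```python
# Sketch: AMAT through a two-level cache, comparing a small core-speed
# L2 against a larger but slower external L2. Every number here is an
# assumed input for illustration only.

def amat(l1_hit_ns, l1_rate, l2_hit_ns, l2_rate, mem_ns):
    """Expected latency per access for an L1 -> L2 -> memory hierarchy."""
    return (l1_rate * l1_hit_ns
            + (1 - l1_rate) * (l2_rate * l2_hit_ns
                               + (1 - l2_rate) * mem_ns))

# Small, fast, on-chip L2 (Pentium III-like) vs. larger external L2
fast_small = amat(1.0, 0.95, 4.0, 0.90, 100.0)
slow_large = amat(1.0, 0.95, 12.0, 0.93, 100.0)
print(fast_small < slow_large)  # the faster L2 wins despite missing more
```

A modestly higher hit rate cannot buy back a 3x slower L2 here, matching the slide's benchmark observation.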
28. AMD K7/Athlon (www.amd.com)
- AMD is shipping 1 GHz Athlon processors. List price for the 1 GHz version is $1,000 in quantities of 1,000 units.
Up to 72 instructions can be in execution in the K7's out-of-order integer pipe, floating-point pipe, and load/store unit. (Source: AMD and MDR)
The K7's 22 million transistors occupy 184 mm² in a slightly enhanced version of AMD's 0.25-micron six-layer-metal CS44E process. (Source: AMD)
29. Compaq Alpha 21264/21364
- The 733 MHz four-issue out-of-order 21264 is today's fastest microprocessor.
- The 21364, code-named EV7, will use multiple Rambus channels to greatly reduce memory latency.
- The 21364 will use the 21264 core but increase frequency to 1 GHz with a 0.18-micron process, add a 1.5M on-chip L2, a 6 GByte/s memory port, and 13 GBytes/s of chip-to-chip bandwidth.
(Source: Compaq, MDR)
30. Intel IA-64
- New Intel/HP 64-bit microprocessor
- New architecture influenced by PA-RISC
  - RISC, VLIW, superpipelined, superscalar
- Executes both IA-32 and IA-64 code
- 3 instructions (41-bit) per bundle
  - moderate VLIW, 128 bits/bundle
- Multiple instruction-bundle issues per clock (20 instructions in flight)
- Pipeline depth of 10
- Advanced and speculative mechanisms
  - memory hierarchy hints
  - deep branch prediction
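The bundle arithmetic works out as follows; the 5 bits left over after three 41-bit instructions form IA-64's template field, which encodes instruction dispersal and stops:

```python
# Sketch: IA-64 bundle layout from the slide's numbers
# (3 x 41-bit instructions per 128-bit bundle).

INSTR_BITS = 41
INSTRS_PER_BUNDLE = 3
BUNDLE_BITS = 128

template_bits = BUNDLE_BITS - INSTRS_PER_BUNDLE * INSTR_BITS
print(template_bits)  # 5-bit template per bundle
```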
31. Intel IA-64 Building Blocks
32. Future IBM PowerPC (Power4)
- 170 million transistors
- Two 64-bit 1 GHz five-issue superscalar cores
- Triple-level cache hierarchy
- A 10 GByte/s main-memory interface
- A 45 GByte/s multiprocessor interface
- IBM will see first silicon on Power4 in 1Q00, and systems will begin shipping in 2H01.
33. IBM's Power4
- Power4 includes two >1 GHz superscalar cores
- >100 GBytes/s of bandwidth to a large shared L2 cache
- >55 GBytes/s of bandwidth to memory and other Power4 chips
- Four Power4 chips packaged in an MCM as an eight-processor SMP, with total bandwidths of 40 GBytes/s to memory and 40 GBytes/s to other modules
(Source: IBM, MDR)
34. Conclusions
- What is the best node for your cluster?
  - Well, it depends on your application, of course
  - For integer codes: both Pentium III and Athlon @ 1 GHz
  - For 32-bit floating point: IA-32 probably has the best price/performance today
  - For 64-bit floating point: Compaq Alpha is the clear leader
- How long will these choices be clear?
  - Probably not very long
  - Intel will introduce IA-64 (Itanium) very shortly (mid-summer)
    - 3 GFLOPS @ 800 MHz for 64-bit, 6 GFLOPS for 32-bit
    - limited by front-side-bus bandwidth of 2.1 GB/s
  - AMD's Thunderbird will address the external-cache issues with Athlon
  - AMD is targeting Sledgehammer at IA-64
  - Compaq is revising Alpha (EV67 → EV7) to compete with IA-64
  - IBM is putting considerable resources into Power4, and CMP and MCMs will enable IBM to produce very powerful servers/cluster nodes
35. Possible Extreme Linux Systems in 2010 (conservative)
- 4.8 GHz clocks
- 16-way SMP node
- 300 GFLOPS/node
- >4 GB RAM
- >160 GB disk
- 8 × 400 MB/s interfaces/node
- <1 kW power/node
- $2,000/node + $2,000 other
- 8,000 nodes
- 2.5 PetaFLOPS peak
- Total system price: $16M
- Annual power bill: $8M
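The projection's headline numbers can be sanity-checked; since the slide is ambiguous about whether the "$2,000 other" is per node, both readings are shown:

```python
# Sketch: checking the 2010 projection's arithmetic against its own
# inputs. The FLOPS line needs no assumptions; the price lines show
# two possible readings of "$2,000/node + $2,000 other".

nodes = 8000
peak_pflops = nodes * 300 / 1e6          # 300 GFLOPS/node -> PFLOPS
price_node_only_m = nodes * 2000 / 1e6   # $M, node hardware alone
price_all_in_m = nodes * 4000 / 1e6      # $M, if "other" is per node

print(peak_pflops)        # 2.4 PFLOPS, close to the 2.5 quoted
print(price_node_only_m)  # $16M, matching the quoted system price
print(price_all_in_m)     # $32M under the per-node reading
```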
36. Some Cluster Node Options Tomorrow
37. Compaq Itsy Pocket Computer
- Hardware
  - 200 MHz StrongARM SA-1100 processor
  - 16 MB DRAM, 4 MB flash memory
  - audio CODEC, microphone, speaker
  - LCD display and touch screen
  - serial, IrDA, and USB I/O; pushbuttons
- Software
  - runs Linux
  - MIDI
  - MPEG video playback
  - text-to-speech, speech-to-text
  - wireless Web server
http://www.research.digital.com/wrl/projects/Itsy/index.html
38. iPic: Match-Head-Size Web Server
- PIC 12C509A running @ 4 MHz
- iPic tiny TCP/IP stack
- HTTP 1.0-compliant web server
- Simple telnet server
- 24LC256 I²C EEPROM
http://www-ccs.cs.umass.edu/shri/iPic.html
39. Are We Destined to Compute on Toys?
- CPU: 128-bit RISC chip running at 300 MHz
- RAM: 32 MB DRAM
- Graphics processor: 150 MHz, 4 MB integrated VRAM
- Sound: SPU2, 2 MB RAM (AC3, Dolby Digital, DTS support)
- Drive: 4X DVD, 24X CD read speeds (PlayStation, DVD, audio CD support)
- Dimensions: 178 mm × 301 mm × 78 mm (7 inches × 12 inches × 3 inches)
- Weight: 2.1 kg (4 lbs. 10 oz.)
40. Each vector unit has enough parallelism to complete a vertex operation (19 mul-adds + 1 divide) every seven cycles.
- The PSX2's Emotion Engine provides ten floating-point multiplier-accumulators, four floating-point dividers, and an MPEG-2 decoder to deliver killer multimedia performance.
41. PlayStation 2 Specs
42. With 4 MB of multiported DRAM and 16 pixel processors, the 42.7-million-transistor graphics synthesizer is 16.7 × 16.7 mm in a 0.25-micron process with 0.25-micron gates. (Source: SCE)
The Emotion Engine, heart of Sony's second-generation PlayStation, implements 10.5 million transistors and measures 17 × 14.1 mm in a 0.25-micron four-layer-metal 1.8 V process with 0.18-micron gates. (Source: SCE and Toshiba)
43. IRAM Vision: Intelligent PDA
Intelligent RAM project @ UCB (Patterson, Yelick et al.), one of multiple processor-in-memory projects aimed at addressing the memory bottleneck by putting processors IN the memory.
(UCB V-IRAM/ISTORE)
44. ISTORE Hardware Vision
- System-on-a-chip enables computer and memory without significantly increasing the size of the disk
- 5-7 year target
  - MicroDrive (1.7" × 1.4" × 0.2") in 2006?
    - 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
    - 2006: 9 GB, 50 MB/s? (1.6X/yr capacity, 1.4X/yr bandwidth)
  - integrated IRAM processor
    - 2x height
- Connected via crossbar switch
  - growing like Moore's Law
  - 10,000 nodes in one rack!
(UCB V-IRAM/ISTORE)
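The 2006 MicroDrive projection follows directly from the slide's own growth rates compounded from the 1999 baseline:

```python
# Sketch: compounding the slide's stated growth rates (1.6x/yr
# capacity, 1.4x/yr bandwidth) over the 7 years from 1999 to 2006.

years = 7
capacity_gb = 0.340 * 1.6 ** years   # from 340 MB
bandwidth_mb_s = 5 * 1.4 ** years    # from 5 MB/s

print(round(capacity_gb, 1))         # ~9.1 GB, matching the ~9 GB quoted
print(round(bandwidth_mb_s, 1))      # ~52.7 MB/s, near the ~50 quoted
```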
45. Berkeley's Vector IRAM
- Vector processing
  - high performance for media processing
  - low power/energy for processor control
  - modularity, low complexity
  - scalability
  - well-understood software development (Cray compilers)
- Embedded DRAM
  - high bandwidth for vector processing
  - low power/energy for memory accesses (<2 W)
  - modularity, scalability
  - small system size
(UCB V-IRAM/ISTORE)
46. Block Diagram
(UCB V-IRAM/ISTORE)
47. V-IRAM Design Overview
- Memory system
  - 8 × 2 MByte eDRAM banks
  - single sub-bank per bank
  - 256-bit synchronous interface, separate I/O signals
  - 20 ns cycle time, 6.6 ns column access
  - crossbar interconnect, 12.8 GB/s per direction
  - no caches
- Network interface
  - user-level message passing
  - dedicated DMA engines
  - 4 × 100 MByte/s links
- 64b MIPS scalar core
  - coprocessor interface
  - 16 KB I/D caches
- Vector unit
  - 8 KByte vector register file
  - support for 64b, 32b, and 16b data types
  - 2 arithmetic (1 FP), 2 flag-processing, and 1 load-store unit
  - 4 64-bit datapaths per unit
  - DRAM latency included in the vector pipeline
  - 4 addresses/cycle for strided/indexed accesses
  - 2-level TLB
(UCB V-IRAM/ISTORE)
48. VIRAM-1 Floorplan
(Diagram: DRAM banks 0, 2, 4, and 6 along the top edge and banks 1, 3, 5, and 7 along the bottom; between them sit the MIPS core, control logic, network interface (NI), I/O, and vector lanes 0-3.)
(UCB V-IRAM/ISTORE)
49. Vector Lanes
(Diagram: vector lanes with 64b control paths and 256b datapaths.)
(UCB V-IRAM/ISTORE)
50. V-IRAM Prototype Summary
- Technology
  - 0.18 µm eDRAM CMOS process (IBM)
  - 6 layers of copper interconnect
  - 1.2 V and 1.8 V power supplies
- Memory: 16 MBytes
- Clock frequency: 200 MHz
- Power: 2 W for vector unit and memory
- Transistor count: 140 million
- Peak performance
  - GOPS: 3.2 (64b), 6.4 (32b), 12.8 (16b)
  - GFLOPS: 1.6 (32b)
(UCB V-IRAM/ISTORE)
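The GOPS figures scale with element width: with a fixed set of 64-bit datapaths, halving the element size doubles the elements processed per cycle. A sketch of that relationship:

```python
# Sketch: peak GOPS vs. element width for a fixed 64-bit datapath,
# anchored at the slide's 3.2 GOPS figure for 64b elements.

def peak_gops(base_gops_64b, width_bits):
    """Narrower elements pack more operations into the same datapath."""
    return base_gops_64b * (64 // width_bits)

for w in (64, 32, 16):
    print(w, peak_gops(3.2, w))  # 3.2, 6.4, 12.8 -- matching the slide
```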
51. Petaflops from V-IRAM?
- In 2005: V-IRAM at 6 GFLOPS / 64 MB
- 50 TF per 19" rack (10K CPU/disk assemblies)
  - 100 drawers of 100 processors (like library cards)
  - crossbar-interconnected at N × 100 MB/s
- 20 optically interconnected racks
- 10^15 FLOPS for $20M
- Power: 20 kW × 20 = 400 kW
- Would use mostly the same parts as cell phones and PDAs
- Probably could have only four types of components
52. The Computational Grid → Moving from Local to Global Access to Computing and Information Resources
53. The Emerging Concept of a National-Scale Information Power Grid
54. The Grid Links People with Distributed Resources on a National Scale
http://science.nas.nasa.gov/Groups/Tools/IPG
55. Examples of Grid Testbeds
- Applications-oriented gigabit testbeds
  - CASA, Aurora, BLANCA, etc.
- I-WAY (at SC95 in San Diego)
- GUSTO '97, '98, '99 (Globus project testbed)
- UW and Microsoft HDTV-over-NTON demo at SC99
- STAR TAP iGrid demonstration effort at SC98
- NCSA Alliance National Technology Grid
- SDSC integrated metasystems (Legion testbed)
- NASA Information Power Grid project
- DOE NGI testbeds (e.g. EMERGE)
National Computational Science Alliance, National Technology Grid; Charlie Catlett et al., NCSA
56. The Vision for the Grid
- Persistent, universal, and ubiquitous access to networked resources
- Common tools and infrastructure for building 21st-century applications
- Integrating HPC, data-intensive computing, remote visualization, and advanced collaboration technologies
57. The Grid from a Services View
- Applications: cosmology, chemistry, environment, nanotechnology, biology
- Application toolkits: distributed computing, data-intensive, remote visualization, problem solving, remote instrumentation, and collaborative applications toolkits
- Grid services (middleware): resource-independent and application-independent services, e.g., authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection
- Grid fabric (resources): resource-specific implementations of basic services, e.g., transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass
58. Midwest Networked CAVE and ImmersaDesk Sites
Argonne, UIC-Chicago, UIUC-Urbana, U Wisconsin, U Michigan, Indiana U, U Iowa, Iowa State, U Minnesota, U of Chicago
59. Access Grid Nodes
- Access Grid nodes under development
  - library, workshop
  - ActiveMural room
  - office
  - auditorium
60. CorridorOne → Distance Visualization
- Argonne
- Berkeley Lab
- Los Alamos
- Princeton
- University of Illinois
- University of Utah
- DOE NGI Project
61. Acknowledgements
62. Many Thanks to Our Research Collaborators
- Los Alamos (Ahrens, Painter, Reynders)
- Utah (Johnson, Hansen)
- Princeton (Li, Finkelstein, Funkhouser)
- UIUC (Reed, Brady)
- EVL/UIC (DeFanti, Leigh, Sandin, Brown)
- LCSE/Minnesota (Woodward)
- LBNL (Lucas, Lau, Johnston, Bethel)
- NCSA (Smarr, Catlett, Baker, Cox)
- Kitware (Schroeder, Lorensen)
- UCB (Demmel, Culler)
63. More Thanks Owed For
- AccessGrid and Metro
  - NCSA, Utah, EVL/UIC, UIUC, Princeton, Maui, UNM, GATech, UKY, LANL, LBNL, Boston (20 by end of FY00)
- ActiveMural and MicroMural
  - Princeton, UIUC, LLNL, Minn, EVL
- Immersive and large-format scientific visualization
  - Chicago, NCSA, EVL/UIC, LANL, LSU, Kitware, Utah, UIUC
- ManyWorlds/CAVERNsoft
  - NCSA Alliance, UIUC, LSU, APS GCA, Nalco, AVTC
64. The FL Group at Argonne/Chicago
Rick Stevens, Mike Papka, Terry Disz, Bob Olson, Ivan Judson, Randy Hudson, Lisa Childers, Mark Hereld, Joe Paris, Tom Brown, Tushar Udeshi, Justin Binns, Mary Fritsch, Chenhui Yang. Thanks to Argonne, UChicago, DOE, and NSF for support!! Thanks to NCSA, LANL, LBNL, Princeton, Utah, and EVL for pics and movies.
66. Questions?
75. Partnerships for Advanced Computational Infrastructure: Partner and Leading-Edge Sites
76. Universities with Projects Using NSF Supercomputers
850 projects in 280 universities
77. University High-Speed Networks: NSF vBNS and Internet2 Abilene
78. Prototyping America's 21st-Century Information Infrastructure
The National Technology Grid
79. The Next Step Is the Integration of International Research Networks
- NSF-funded interconnection point to US networks
- Managed by Alliance sites EVL (UIC) and ANL
- Operated by Ameritech Advanced Data Services
- Applications
  - remote instrumentation
  - shared virtual realities
  - tele-immersion
  - distributed computing
  - real-time client/server
  - interactive multimedia
  - tele-instruction
  - tele-medicine
  - digital video
www.startap.net
80. A Telecommunications Trend
- Data traffic will exceed voice traffic in 3-5 years
- Voice traffic will become a small fraction of total traffic within 10 years
- Unlimited POTS voice service will be tossed in as a perk
- Bandwidth prices will eventually begin to follow Moore's Law: 2x capability every 18 months
- Eventually we will think of bandwidth as free, like the air
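Compounding "2x every 18 months" shows what "free like the air" rests on; a sketch with an illustrative 10-year horizon:

```python
# Sketch: compound capability growth at a fixed doubling period.
# The 10-year horizon is an illustrative choice, not from the slide.

def growth(multiple_per_period, period_months, horizon_months):
    return multiple_per_period ** (horizon_months / period_months)

print(growth(2, 18, 120))  # roughly 100x capability over 10 years
```

Two orders of magnitude per decade is the same compounding that made transistors effectively free.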
82. The Vision for the Grid
- Persistent, universal, and ubiquitous access to networked resources
- Common tools and infrastructure for building 21st-century applications
- Integrating HPC, data-intensive computing, remote visualization, and advanced collaboration technologies
83. The Grid from a Services View
- Applications: cosmology, chemistry, environment, nanotechnology, biology
- Application toolkits: distributed computing, data-intensive, remote visualization, problem solving, remote instrumentation, and collaborative applications toolkits
- Grid services (middleware): resource-independent and application-independent services, e.g., authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection
- Grid fabric (resources): resource-specific implementations of basic services, e.g., transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass
84. Teleimmersion Networking Requirements
- Immersive environment
- Sharing of objects and virtual space
- Coordinated navigation and discovery
- Interactive control and synchronization
- Interactive modification of environment
- Scalable distribution of environment
85. Globus-Enabled Support for Priority Traffic on the Internet
Experiments performed on GARnet, a testbed constructed with Cisco support
86. Active Spaces → Exploring Future Workspace Environments
Related work: Xerox PARC Ubicomp (Weiser et al.), MIT Oxygen Project, UIUC's SmartSpaces (Reed et al.), UNC's Office of the Future (Fuchs et al.)
87. Organizational and Social Trends
- Distribution and virtualization of organizations
- Merging of work/play/home/learning
- Work as conversation/communication/collaboration
- Emerging dominance of multidisciplinary problems
- Emerging dominance of consumer products/tech drivers
- What are possible infrastructures and environments to support future work, learning, living, and play?
88. Active Spaces → Workspace Environments
- Ensembles of display and computer systems and software environments cooperatively working to support the user
- New application-environment metaphor
  - above, beyond, behind, and below the desktop
  - includes the desktop and personal devices, but also large-format, immersive, and room-scale systems
- Challenges
  - integrate these different modalities in positive ways to enable more rapid problem solving
  - provide bridges to existing tools and environments
  - combine with high-end visualization and collaboration systems
89. StationOne
Early concept sketch for a single-user visual SuperStation, designed to be both a high-resolution visualization environment and a high-end node in a collaboratory for advanced simulation.
Argonne Futures Lab Group (Stevens et al.)
91. Proposed Totally Active Work Space (TAWS)
Electronic Visualization Lab (DeFanti et al.)
92. Active Spaces → As Spaces
- Blending workspace architecture with information- and technology-rich environments
- How groups of people organize for doing complex tasks
- How space affects work/thinking
- How distance and time affect working
- Inspired by the concept of the medieval craft workshop
93. Active Spaces → As Grid Nodes
- They support several key notions
  - blending of virtual and physical interfaces
  - a variety of display types, from immersion to small 2D displays
  - a variety of scales of interfaces
  - coupling data resources to physical interfaces: provides tangible handles or references to virtual resources
  - scaling of streams and bandwidths to dozens of persistent real-time streams
  - integration of many types of tools and components: allows ad hoc integration of new and existing tools
94. Midwest Networked CAVE and ImmersaDesk Sites
Argonne, UIC-Chicago, UIUC-Urbana, U Wisconsin, U Michigan, Indiana U, U Iowa, Iowa State, U Minnesota, U of Chicago
95. Welcome to Chiba City
97. Advanced Display Environments
- Advanced display systems provide more resolution and capability (e.g. immersion, stereo) than the desktop
  - ActiveMural
  - MicroMural
  - CAVE
  - ImmersaDesk
- The Futures Lab group is evaluating high-end scientific visualization applications in all of these environments
- We are beginning to study which types of devices are appropriate for which types of datasets, analysis goals, and visualization modalities
100. Directions for Active Space Development
- We are constructing experimental distributed intentional workspaces based on the concept of the artist's workshop or studio as imagined in medieval times, but updated for the 21st century
- We are focusing on three general problems
  - exploration of advanced display environments
  - group-to-group collaboration technologies
  - Grid-based interaction and visualization technologies, architecture, systems, and infrastructure
- Computational science, high-end scientific visualization, and parallel systems design and development are our drivers
101. ActiveMural → Beyond the Megapixel
Related work: Stanford's Info Mural (Hanrahan et al.), Princeton's giant display wall (Li et al.), MIT AI Lab's Big Inexpensive Display (Knight et al.), Minnesota's Great Wall of Power (Woodward et al.), EVL's Infinity Wall (DeFanti et al.), UIUC's SmartSpaces wall (Reed et al.), UNC's Office of the Future (Fuchs et al.)
102. ActiveMural, a Tiled Display Wall
- Argonne, Princeton, and UIUC collaboration
- 8' × 16' display wall
- Jenmar Visual Systems BlackScreen technology, >10,000 lumens
- 8 LCD → 15 DLP → 24 DLP projectors
- 8-20 megapixels
- SGI and Linux drivers
- VR and ActiveSpace UI
104. The ActiveMural Lab with Access Grid Tech
105. Corridor One → Attacking the Distance Visualization and Resource Problem
Related work: Argonne/EVL I-WAY project (DeFanti/Stevens et al.), Argonne/ISI Globus project (Foster/Kesselman et al.), DARPA gigabit testbeds (Kahn et al.), NCSA metacomputing project (Catlett/Smarr et al.), NASA IPG project (Johnston et al.)
106. CorridorOne → Distance Visualization
- Argonne
- Berkeley Lab
- Los Alamos
- Princeton
- University of Illinois
- University of Utah
- DOE NGI Project
107. C1 Distance Corridor Systems Architecture
- Data Servers
- Analysis and Manipulation Engines
- Visualization Servers
- Visualization Clients
- Display Device Interfaces
- Advanced Networking Services
108. Our Belief and Conclusion
- The development and success of future integrated networked digital group workspaces will be strongly influenced by a number of technologies, concepts, and issues
  - the emergence of the Grid and the widespread use of commodity systems and components
  - group-oriented collaboration environments and tools
  - increasing integration of display systems and internetworked digital devices into the environment, especially wireless devices and large-format (non-desktop) displays
- New visual and interactive modalities (e.g. teleimmersion) will emerge to address issues of working over distance and time
- These new types of emerging environments need to support both formal and informal interactions
  - informal social use of these environments is critical to their use, adoption, and eventual success
- How people work in these new spaces must be studied, and the impacts on productivity, learning, and development must be learned
109. ActiveMural / µMural Price Performance
- Leverages COTS technology
  - commodity projectors cost $0.006/pixel
  - high-end projectors cost $0.015/pixel
- Platform independent
  - SGI, NT, Linux; multiple configurations
- Complete Linux-based system < $100K
  - screen and frame: $10K
  - projectors: $36K
  - computers with graphics: $40K
- Collaboration with Princeton and UIUC
  - coordinating software efforts
  - projector mount fabrication
  - virtual framebuffer software
  - diagnostics and analysis software
(Images: sub-mm pixels; image blending)
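The $/pixel figures are just projector price divided by pixel count; the prices and XGA resolution below are assumptions chosen to reproduce the quoted numbers, not actual specs from the slide:

```python
# Sketch: cost-per-pixel arithmetic for tiled-display budgeting.
# The $4,500 and $12,000 projector prices and 1024x768 resolution are
# hypothetical inputs for illustration.

def cost_per_pixel(projector_price, x_res, y_res):
    return projector_price / (x_res * y_res)

commodity = cost_per_pixel(4500, 1024, 768)   # ~$0.006/pixel
high_end = cost_per_pixel(12000, 1024, 768)   # ~$0.015/pixel
print(commodity, high_end)
```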
110. The µMural
Portable six-projector tiled display for high-resolution visualization
112. Projector Alignment / Image Blending
117. Research Directions for ActiveMural / µMural
- Continued development and refinement of ActiveMural
  - characterize and improve image quality (blending, filters, etc.)
  - improve calibration, reconfigurability, and tuning
  - software environment (e.g. virtual frame buffers)
  - investigate use of ActiveMural as a multiresolution testbed
  - human factors studies and comparisons with CAVEs and IDesks
- Development of smaller versions of the ActiveMural (µMural) that can be more widely deployed
  - six-panel version under construction
  - will be demonstrated today and at SC99
- StationOne prototype planning
  - office-friendly version of the µMural
118. Access Grid → Integrating Group-to-Group Collaboration and Visualization
Related work: Berkeley and LBNL's MBone tools (Jacobson et al.), Xerox PARC MOO and Jupiter projects (Curtis et al.), Argonne/NEU's Labspace project (Evard/Stevens et al.), EVL's CAVERNsoft (DeFanti et al.), UNC's Office of the Future (Fuchs et al.), DOE's collaboratory pilots (Zaluzec et al.)
119. Group-to-Group Interaction Is Different
- Large-scale scientific and technical collaborations often involve multiple teams working together
- Group-to-group interactions are more complex than individual-to-individual interactions
- The Access Grid project is aimed at exploring and supporting this more complex set of requirements and functions
- The Access Grid will integrate and leverage desktop tools as needed
120. Access Grid Nodes
- Access Grid nodes under development
  - library, workshop
  - ActiveMural room
  - office
  - auditorium
121. Components of an Access Grid Node
125. The Access Grid Is More Than Teleconferencing
- Physical spaces designed to support distributed group work
- Multiple virtual collaborative venues
- Agenda-driven scenarios and sessions
- Integration with Grid services
126. The Five Stages of Joint Work
- Awareness
- Interaction
- Cooperation
- Collaboration
- Organization
127. Group-to-Group Requirements
- Need to enable both subgroup and group communication
- Support multi-layered and simultaneous interactions
- Multiple private / group-private / back channels
- Integration of groupware tools/concepts with collaboration
128. The Docking Concept for Access Grid
Private workspaces are docked into the group workspace.
129. The Illinois True Grid Dark Fiber Infrastructure
- State-funded network testbed ($4.0M/FY00)
- Application oriented
  - QoS and resource scheduling
- Research and development
  - network research
  - advanced systems interfaces
  - technology application
  - experimental protocols
- Abilene/SuperNet access for Chicago
  - possibility of a POP at ANL
  - access to MREN/AADS
- Technology partners
  - Cisco
  - Fore Systems
130. Proposed Terabit/s Midwest Network Testbed
- Includes NCSA, Argonne, U of Oklahoma
- 10 Gbit/s application testbed planned
- Industry partners include Williams Communications, Nortel, Corvis, Cisco
- Integration of Grid middleware with network QoS and network provisioning systems
- Corvis all-optical technologies
  - 400 Gbit/s transport per fiber strand
  - transmission distance to 3,200 km
  - 2.4 Tb/s optical switching