Storage: Alternate Futures - PowerPoint PPT Presentation

Transcript and Presenter's Notes



1
Storage: Alternate Futures
Yotta Zetta Exa Peta Tera Giga Mega Kilo
  • Jim Gray
  • Microsoft Research
  • http://Research.Microsoft.com/Gray/talks
  • IBM Almaden, 1 December 1999

2
Acknowledgments Thank You!!
  • Dave Patterson
  • Convinced me that processors are moving to the
    devices.
  • Kim Keeton and Erik Riedel
  • Showed that many useful subtasks can be done by
    disk-processors, and quantified execution
    interval
  • Remzi Arpaci-Dusseau
  • Re-validated Amdahl's laws

3
Outline
  • The Surprise-Free Future (5 years)
  • 500 mips cpus for $10
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • 10 GBps SANs are ubiquitous
  • 1 Gbps WANs are ubiquitous
  • Some consequences
  • Absurd (?) consequences.
  • Auto-manage storage
  • Raid10 replaces Raid5
  • Disc-packs
  • Disk is the archive media of choice
  • A surprising future?
  • Disks (and other useful things) become
    supercomputers.
  • Apps run in the disk

4
The Surprise-free Storage Future
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • Drives shrink one quantum
  • Standard IO
  • 10 GBps SANs are ubiquitous
  • 1 Gbps WANs are ubiquitous
  • 5 bips cpus for $1K and 500 mips cpus for $10

5
1 Gb RAM Chips
  • Moving to 256 Mb chips now
  • 1Gb will be standard in 5 years, 4 Gb will
    be premium product.
  • Note
  • 256 Mb chip = 32 MB, the smallest memory
  • 1 Gb chip = 128 MB, the smallest memory
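The chip-to-smallest-memory arithmetic above can be checked directly (a trivial sketch, in Python for illustration; the function name is mine):

```python
def smallest_memory_mb(chip_megabits: int) -> int:
    # One chip is the smallest memory you can build;
    # divide megabits by 8 to get megabytes.
    return chip_megabits // 8

print(smallest_memory_mb(256))   # 32
print(smallest_memory_mb(1024))  # 128
```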

6
System On A Chip
  • Integrate Processing with memory on one chip
  • chip is 75% memory now
  • 1 MB cache >> 1960 supercomputers
  • 256 Mb memory chip is 32 MB!
  • IRAM, CRAM, PIM, projects abound
  • Integrate Networking with processing on one chip
  • system bus is a kind of network
  • ATM, FiberChannel, Ethernet,.. Logic on chip.
  • Direct IO (no intermediate bus)
  • Functionally specialized cards shrink to a chip.

7
500 mips System On A Chip for $10
  • 486 now $7; 233 MHz ARM for $10 system on a
    chip: http://www.cirrus.com/news/products99/news-product14.html; AMD/Celeron 266 is $30
  • In 5 years, today's leading edge will be
  • System on chip (cpu, cache, mem ctlr, multiple
    IO)
  • Low cost
  • Low-power
  • Have integrated IO
  • High end is 5 BIPS cpus

8
Standard IO in 5 Years
  • Probably
  • Replace PCI with something better; will still
    need a mezzanine bus standard
  • Multiple serial links directly from processor
  • Fast (10 GBps/link) for a few meters
  • System Area Networks (SANs) ubiquitous (VIA
    morphs to SIO?)

9
Ubiquitous 10 GBps SANs in 5 years
  • 1 Gbps Ethernet is a reality now.
  • Also FiberChannel, MyriNet, GigaNet, ServerNet,
    ATM, ...
  • 10 Gbps x4 WDM deployed now (OC192)
  • 3 Tbps WDM working in the lab
  • In 5 years expect 10x; progress is astonishing
  • Gilder's law: bandwidth grows 3x/year
    http://www.forbes.com/asap/97/0407/090.htm

[Chart: link speeds: 1 GBps, 120 MBps (1 Gbps), 80 MBps, 40 MBps, 20 MBps, 5 MBps]
10
Thin Clients mean HUGE servers
  • AOL hosting customer pictures
  • Hotmail allows 5 MB/user, 50 M users
  • Web sites offer electronic vaulting for SOHO.
  • IntelliMirror replicates client state on the server
  • Terminal server: timesharing returns
  • ... and many more.

11
Remember Your Roots?
12
MAD at 50 Gbpsi
  • MAD: Magnetic Areal Density
  • 3-10 Gbpsi in products
  • 28 Gbpsi in lab
  • 50 Gbpsi paramagnetic limit,
    but... people have ideas.
  • Capacity rises 10x in 5 years (conservative)
  • Bandwidth rises 4x in 5 years (density x rpm)
  • Disk: 50 GB to 500 GB,
  • 60-80 MBps
  • $1k/TB
  • 15 minute to 3 hour scan time.

13
The Absurd Disk
  • 2.5 hr scan time (poor sequential access)
  • 1 aps / 5 GB (VERY cold data)
  • It's a tape!

[Diagram: 1 TB, 100 MB/s, 200 Kaps]
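The scan-time claim follows directly from the numbers on the slide; a quick check (with these round figures the answer is ~2.8 hours, which the slide rounds to 2.5):

```python
capacity_bytes = 1_000_000_000_000   # 1 TB
bandwidth = 100_000_000              # 100 MB/s sequential

# Time to read every byte once, in hours.
scan_hours = capacity_bytes / bandwidth / 3600
print(round(scan_hours, 1))  # 2.8
```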
14
Disk vs Tape
  • Disk
  • 47 GB
  • 15 MBps
  • 5 ms seek time
  • 3 ms rotate latency
  • $9/GB for drive + $3/GB for ctlrs/cabinet
  • 4 TB/rack
  • Tape
  • 40 GB
  • 5 MBps
  • 30 sec pick time
  • Many minute seek time
  • $5/GB for media + $10/GB for drive/library
  • 10 TB/rack

[Guesstimates: CERN: 200 TB on 3480 tapes; disk rack: 20 x 50 GB drives = 1 TB]
The price advantage of tape is narrowing, and
the performance advantage of disk is growing
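The performance gap is easy to quantify from the slide's figures; a sketch of the time to fetch one random 1 MB object (the tape seek is taken as 60 s to stand in for "many minutes", an assumption):

```python
def fetch_seconds(pick_s, seek_s, rotate_s, mb_per_s, object_mb=1.0):
    # Time to pick/position the media plus transfer one object.
    return pick_s + seek_s + rotate_s + object_mb / mb_per_s

disk_s = fetch_seconds(0, 0.005, 0.003, 15)   # ~0.075 s
tape_s = fetch_seconds(30, 60, 0, 5)          # ~90.2 s
print(tape_s / disk_s)  # tape is over 1000x slower per random object
```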
15
Standard Storage Metrics
  • Capacity
  • RAM: MB and $/MB: today at 512 MB and $3/MB
  • Disk: GB and $/GB: today at 50 GB and $10/GB
  • Tape: TB and $/TB: today at 50 GB and
    $12k/TB (nearline)
  • Access time (latency)
  • RAM: 100 ns
  • Disk: 10 ms
  • Tape: 30 second pick, 30 second position
  • Transfer rate
  • RAM: 1 GB/s
  • Disk: 15 MB/s (arrays can go to 1 GB/s)
  • Tape: 5 MB/s (striping is
    problematic, but works)

16
New Storage Metrics Kaps, Maps, SCAN?
  • Kaps How many kilobyte objects served per second
  • The file server, transaction processing metric
  • This is the OLD metric.
  • Maps How many megabyte objects served per second
  • The Multi-Media metric
  • SCAN How long to scan all the data
  • the data mining and utility metric
  • And
  • Kaps/$, Maps/$, TBscan/$
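Using the previous slide's disk figures (10 ms access, 15 MB/s transfer, 50 GB), the three metrics can be computed as a sketch:

```python
def kaps(access_s, mbps):
    # Kilobyte objects served per second.
    return 1 / (access_s + 0.001 / mbps)

def maps(access_s, mbps):
    # Megabyte objects served per second.
    return 1 / (access_s + 1.0 / mbps)

def scan_hours(capacity_gb, mbps):
    # Time to read every byte once.
    return capacity_gb * 1000 / mbps / 3600

print(round(kaps(0.010, 15)))            # 99 Kaps: access time dominates
print(round(maps(0.010, 15)))            # 13 Maps: transfer dominates
print(round(scan_hours(50, 15), 2))      # 0.93 hours for a full scan
```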

17
For the Record (good 1999 devices packaged in a system:
http://www.tpc.org/results/individual_results/Compaq/compaq.5500.99050701.es.pdf)
[Table not preserved in transcript; x100]
Tape is 1 TB with 4 DLT readers at 5 MBps each.
18
For the Record (good 1999 devices packaged in a system:
http://www.tpc.org/results/individual_results/Compaq/compaq.5500.99050701.es.pdf)
[Table not preserved in transcript]
Tape is 1 TB with 4 DLT readers at 5 MBps each.
19
The Access Time Myth
  • The Myth: seek or pick time dominates
  • The reality: (1) Queuing dominates
  • (2) Transfer dominates BLOBs
  • (3) Disk seeks are often short
  • Implication: many cheap servers are better than
    one fast expensive server
  • shorter queues
  • parallel transfer
  • lower cost/access and cost/byte
  • This is obvious for disk arrays
  • This even more obvious for tape arrays

[Diagram: request time = Wait + Transfer + Rotate + Seek]
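The "transfer dominates BLOBs" point is easy to see with the earlier disk figures (5 ms seek, 3 ms rotate, 15 MB/s); a sketch:

```python
SEEK, ROTATE, RATE = 0.005, 0.003, 15e6  # seconds, seconds, bytes/s

def service_time(nbytes):
    # One random request: position the arm, wait for rotation, transfer.
    return SEEK + ROTATE + nbytes / RATE

kb = service_time(1_000)        # ~8.1 ms: seek + rotate dominate
mb = service_time(1_000_000)    # ~74.7 ms: transfer dominates
print(1_000_000 / RATE / mb)    # transfer is ~89% of a 1 MB request
```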
20
Storage Ratios Changed
  • DRAM/disk media price ratio changed
  • 1970-1990: 100:1
  • 1990-1995: 10:1
  • 1995-1997: 50:1
  • today: 30:1 ($0.1/MB disk vs
    $3/MB DRAM)
  • 10x better access time
  • 10x more bandwidth
  • 4,000x lower media price

21
Data on Disk Can Move to RAM in 8 years
[Chart: 30:1 price ratio closing over ~6 years]
22
Outline
  • The Surprise-Free Future (5 years)
  • 500 mips cpus for $10
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • 10 GBps SANs are ubiquitous
  • 1 Gbps WANs are ubiquitous
  • Some consequences
  • Absurd (?) consequences.
  • Auto-manage storage
  • Raid10 replaces Raid5
  • Disc-packs
  • Disk is the archive media of choice
  • A surprising future?
  • Disks (and other useful things) become
    supercomputers.
  • Apps run in the disk.

23
The (absurd?) consequences
  • 256-way NUMA?
  • Huge main memories: now 500 MB - 64 GB memories;
    then 10 GB - 1 TB memories
  • Huge disks: now 5-50 GB 3.5" disks; then 50-500
    GB disks
  • Petabyte storage farms
  • (that you can't back up or restore).
  • Disks >> tapes
  • Small disks: one platter, one inch, 10 GB
  • SAN convergence: 1 GBps point-to-point is easy
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • Drives shrink one quantum
  • 10 GBps SANs are ubiquitous
  • 500 mips cpus for $10
  • 5 bips cpus at the high end

24
The Absurd? Consequences
  • Further segregate processing from storage
  • Poor locality
  • Much useless data movement
  • Amdahl's laws: bus = 10 B/ips; I/O = 1 b/ips

[Diagram: 1 Tips of processors, 100 TB of disks, 10 TBps processor bus, 100 GBps I/O]
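Amdahl's balance rules make the diagram's numbers checkable: with 1 Tips of aggregate processing, 10 bytes of bus traffic and 1 bit of I/O per instruction give:

```python
ips = 1e12                      # 1 Tips of aggregate processing

bus_bytes_per_s = 10 * ips      # Amdahl: 10 B of memory-bus traffic per instruction
io_bytes_per_s = ips / 8        # Amdahl: 1 bit of I/O per instruction per second

print(bus_bytes_per_s / 1e12)   # 10.0  -> 10 TBps bus
print(io_bytes_per_s / 1e9)     # 125.0 -> ~100 GBps I/O, as in the diagram
```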
25
Storage Latency: How Far Away is the Data?
  • Registers: 1 clock ("My Head", 1 min)
  • On-chip cache: 2 clocks ("This Room")
  • On-board cache: 10 clocks ("This Hotel", 10 min)
  • Memory: 100 clocks ("Olympia", 1.5 hr)
  • Disk: 10^6 clocks ("Pluto", 2 years)
  • Tape/Optical robot: 10^9 clocks ("Andromeda", 2,000 years)
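The distance metaphor is just a rescaling: one clock of latency maps to one minute of human time. A quick check of the slide's mapping:

```python
clocks = {
    "registers": 1,          # "my head", 1 min
    "memory": 100,           # ~1.5 hr
    "disk": 10**6,           # ~2 years
    "tape robot": 10**9,     # ~2,000 years
}

def human_years(clock_count):
    minutes = clock_count             # 1 clock -> 1 minute
    return minutes / (60 * 24 * 365)  # minutes in a year

print(round(human_years(clocks["disk"]), 1))     # 1.9
print(round(human_years(clocks["tape robot"])))  # 1903
```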
26
Consequences
  • AutoManage Storage
  • Sixpacks (for arm-limited apps)
  • Raid5 -> Raid10
  • Disk-to-disk backup
  • Smart disks

27
Auto Manage Storage
  • 1980 rule of thumb:
  • A DataAdmin per 10 GB, a SysAdmin per mips
  • 2000 rule of thumb:
  • A DataAdmin per 5 TB
  • A SysAdmin per 100 clones (varies with app).
  • Problem:
  • 5 TB is $60k today, $10k in a few years.
  • Admin cost >> storage cost???
  • Challenge
  • Automate ALL storage admin tasks
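The "admin cost >> storage cost" worry is simple arithmetic; the salary figure below is a hypothetical round number, not from the talk:

```python
storage_cost = 10_000   # $ for 5 TB "in a few years" (from the slide)
admin_salary = 100_000  # $/year per DataAdmin: hypothetical round figure

# The person managing the storage soon costs far more than the storage.
print(admin_salary / storage_cost)  # 10.0
```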

28
The Absurd Disk
  • 2.5 hr scan time (poor sequential access)
  • 1 aps / 5 GB (VERY cold data)
  • It's a tape!

[Diagram: 1 TB, 100 MB/s, 200 Kaps]
29
Extreme case: 1 TB disk alternatives
  • Use all the heads in parallel
  • Scan in 30 minutes
  • Still one Kaps/5GB
  • Use one platter per arm
  • Share power/sheetmetal
  • Scan in 30 minutes
  • One Kaps per GB

[Diagrams: all heads in parallel: 1 TB, 500 MB/s, 200 Kaps; one platter per arm: 5 x 200 GB each, 500 MB/s, 1,000 Kaps]
30
Drives shrink (1.8", 1")
  • 150 Kaps for 500 GB is VERY cold data
  • 3 GB/platter today, 30 GB/platter in 5 years.
  • Most disks are ½ full
  • TPC benchmarks use 9 GB drives (need arms or
    bandwidth).
  • One solution smaller form factor
  • More arms per GB
  • More arms per rack
  • More arms per Watt

31
Prediction 6-packs
  • One way or another, when disks get huge
  • Will be packaged as multiple arms
  • Parallel heads give bandwidth
  • Independent arms give bandwidth + aps
  • Package shares power, packaging, interfaces

32
Stripes, Mirrors, Parity (RAID 0,1, 5)
  • RAID 0 Stripes
  • bandwidth
  • RAID 1 Mirrors, Shadows,
  • Fault tolerance
  • Reads faster, writes 2x slower
  • RAID 5 Parity
  • Fault tolerance
  • Reads faster
  • Writes 4x or 6x slower.

[Layouts: RAID 0 stripes: 0,3,6,.. | 1,4,7,.. | 2,5,8,..; RAID 1 mirrors: 0,1,2,.. | 0,1,2,..; RAID 5 rotating parity: 0,2,P2,.. | 1,P1,4,.. | P0,3,5,..]
33
RAID 10 (stripes of mirrors) wins: wastes space,
saves arms
  • RAID 5
  • Performance
  • 225 reads/sec
  • 70 writes/sec
  • Write:
  • 4 logical IOs,
  • 2 seeks + 1.7 rotates
  • SAVES SPACE
  • Performance degrades on failure
  • RAID1
  • Performance
  • 250 reads/sec
  • 100 writes/sec
  • Write:
  • 2 logical IOs,
  • 2 seeks + 0.7 rotates
  • SAVES ARMS
  • Performance improves on failure
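The arms-vs-space tradeoff comes down to the write path; a sketch using the slide's IO counts with hypothetical round figures for seek and rotation (5 ms and 6 ms), not the slide's exact benchmark numbers:

```python
SEEK_MS, ROTATION_MS = 5.0, 6.0  # hypothetical round figures

# RAID 5 small write: read old data + old parity, write both back.
raid5_write_ms = 2 * SEEK_MS + 1.7 * ROTATION_MS   # 20.2 ms
# RAID 10 small write: write the block on both mirrors.
raid10_write_ms = 2 * SEEK_MS + 0.7 * ROTATION_MS  # 14.2 ms

print(round(1000 / raid5_write_ms))   # 50 writes/sec
print(round(1000 / raid10_write_ms))  # 70 writes/sec
```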

34
The Storage Rack Today
  • 140 arms
  • 4 TB
  • 24 racks, 24 storage processors, 61 in rack
  • Disks: 2.5 GBps I/O
  • Controllers: 1.2 GBps I/O
  • Ports: 500 MBps I/O

35
Storage Rack in 5 years?
  • 140 arms
  • 50TB
  • 24 racks, 24 storage processors, 61 in rack
  • Disks: 14 GBps I/O
  • Controllers: 5 GBps I/O
  • Ports: 1 GBps I/O
  • My suggestion: move the processors into the
    storage racks.

36
It's hard to archive a PetaByte. It takes a LONG
time to restore it.
  • Store it in two (or more) places online (on
    disk?).
  • Scrub it continuously (look for errors)
  • On failure, refresh lost copy from safe copy.
  • Can organize the two copies differently
    (e.g. one by time, one by space)
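A minimal sketch of the scrub-and-refresh loop, assuming one copy is known-good (a real system would keep independent per-block checksums to decide which copy is safe; names here are illustrative):

```python
import hashlib

def digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

def scrub(safe_copy, other_copy):
    """Compare block checksums; refresh damaged blocks from the safe copy."""
    repaired = 0
    for i, (a, b) in enumerate(zip(safe_copy, other_copy)):
        if digest(a) != digest(b):
            other_copy[i] = a   # refresh lost copy from safe copy
            repaired += 1
    return repaired

a = [b"block0", b"block1", b"block2"]  # copy organized one way
b = [b"block0", b"XXXXXX", b"block2"]  # second copy; block 1 damaged
print(scrub(a, b))  # 1
print(a == b)       # True
```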

37
Crazy Disk Ideas
  • Disk Farm on a card: surface-mount disks
  • Disk (magnetic store) on a chip (micromachines
    in silicon)
  • Full apps (e.g. SAP, Exchange/Notes, ...) in the
    disk controller (a processor with 128 MB DRAM)

The Innovator's Dilemma: When New Technologies
Cause Great Firms to Fail, Clayton M. Christensen,
ISBN 0875845851
38
The Disk Farm On a Card
  • The 500GB disc card
  • An array of discs
  • Can be used as
  • 100 discs
  • 1 striped disc
  • 50 fault-tolerant discs
  • ... etc.
  • LOTS of accesses/second
  • and bandwidth

[Diagram: 14" card]
39
Functionally Specialized Cards
  • Storage
  • Network
  • Display
  • Each card: P mips processor, M MB DRAM, ASIC
  • Today: P = 50 mips, M = 2 MB
  • In a few years: P = 200 mips, M = 64 MB
40
Data Gravity: Processing Moves to Transducers
  • Move Processing to data sources
  • Move to where the power (and sheet metal) is
  • Processor in
  • Modem
  • Display
  • Microphones (speech recognition), cameras
    (vision)
  • Storage: data storage and analysis

41
It's Already True of Printers: Peripheral =
CyberBrick
  • You buy a printer
  • You get:
  • several network interfaces
  • a PostScript engine
  • cpu,
  • memory,
  • software,
  • a spooler (soon)
  • and a print engine.

42
Disks Become Supercomputers
Kilo Mega Giga Tera Peta Exa Zetta Yotta
  • 100x in 10 years: 2 TB 3.5" drive
  • Shrink to 1" is 200 GB
  • Disk replaces tape?
  • Disk is a supercomputer!

43
All Device Controllers will be Cray 1s
  • TODAY
  • Disk controller is 10 mips risc engine with 2MB
    DRAM
  • NIC is similar power
  • SOON
  • Will become 100 mips systems with 100 MB DRAM.
  • They are nodes in a federation (can run Oracle
    on NT in disk controller).
  • Advantages
  • Uniform programming model
  • Great tools
  • Security
  • Economics (cyberbricks)
  • Move computation to data (minimize traffic)

[Diagram: Central Processor and Memory on a Tera-Byte Backplane]
44
With Terabyte Interconnect and Supercomputer
Adapters
  • Processing is incidental to
  • Networking
  • Storage
  • UI
  • Disk Controller/NIC is
  • faster than device
  • close to device
  • Can borrow device package power
  • So use idle capacity for computation.
  • Run app in device.
  • Both Kim Keeton (UCB) and Erik Riedel (CMU)
    theses investigate this and show the benefits of
    this approach.

45
Implications
  • Radical:
  • Move the app to the NIC/device controller
  • Higher-higher-level protocols: CORBA / COM.
  • Cluster parallelism is VERY important.
  • Conventional:
  • Offload device handling to NIC/HBA
  • Higher-level protocols: I2O, NASD, VIA, IP, TCP
  • SMP and cluster parallelism is important.

46
How Do They Talk to Each Other?
  • Each node has an OS
  • Each node has local resources: a federation.
  • Each node does not completely trust the others.
  • Nodes use RPC to talk to each other
  • CORBA? COM? RMI?
  • One or all of the above.
  • Huge leverage in high-level interfaces.
  • Same old distributed system story.

[Diagram: two nodes, each running Applications over RPC, streams, datagrams, and "?" on SIO; the nodes are connected by a SAN]
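The talk leaves the RPC flavor open (CORBA? COM? RMI?); as a stand-in sketch, Python's stdlib XML-RPC shows the shape of one node exporting a local resource to the federation rather than raw blocks (all names are illustrative):

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# "Disk node": owns local storage, exports a high-level read call.
store = {"obj1": "hello"}

def read_object(name):
    return store.get(name, "")

server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(read_object)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Client node": talks RPC to the disk node, never sees raw sectors.
proxy = ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.read_object("obj1")
server.shutdown()
print(result)  # hello
```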
47
Basic Argument for x-Disks
  • Future disk controller is a supercomputer:
  • 1 bips processor
  • 128 MB DRAM
  • 100 GB disk plus one arm
  • Connects to SAN via high-level protocols
  • RPC, HTTP, DCOM, Kerberos, Directory
    Services, ...
  • Commands are RPCs
  • management, security, ...
  • Services file/web/db/... requests
  • Managed by a general-purpose OS with a good dev
    environment
  • Move apps to the disk to save data movement
  • need a programming environment in the controller

48
The Slippery Slope
[Spectrum: Nothing (sector server) -> Something (fixed-app server) -> Everything (app server)]
  • If you add function to the server
  • Then you add more function to the server
  • Function gravitates to data.
49
Why Not a Sector Server? (let's get physical!)
  • Good idea, that's what we have today.
  • But:
  • cache added for performance
  • sector remap added for fault tolerance
  • error reporting and diagnostics added
  • SCSI commands (reserve, ...) are growing
  • sharing is problematic (space mgmt, security, ...)
  • Slipping down the slope to a 2-D block server

50
Why Not a 1-D Block Server? Put A LITTLE on the
Disk Server
  • Tried and true design
  • HSC - VAX cluster
  • EMC
  • IBM Sysplex (3980?)
  • But look inside
  • Has a cache
  • Has space management
  • Has error reporting management
  • Has RAID 0, 1, 2, 3, 4, 5, 10, 50,
  • Has locking
  • Has remote replication
  • Has an OS
  • Security is problematic
  • Low-level interface moves too many bytes
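"Moves too many bytes" is quantifiable; a sketch with hypothetical sizes (a 100 GB table of which 1 MB of records qualify for a query):

```python
table_bytes = 100 * 10**9  # table scanned for a query (hypothetical)
match_bytes = 1 * 10**6    # records that actually qualify (hypothetical)

block_server_traffic = table_bytes  # low-level interface ships every block
app_in_disk_traffic = match_bytes   # filtering at the disk ships only matches

print(block_server_traffic // app_in_disk_traffic)  # 100000x less SAN traffic
```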

51
Why Not a 2-D Block Server? Put A LITTLE on the
Disk Server
  • Tried and true design
  • Cedar -> NFS
  • file server, cache, space,..
  • Open file is many fewer msgs
  • Grows to have
  • Directories, naming
  • Authentication, access control
  • RAID 0, 1, 2, 3, 4, 5, 10, 50,
  • Locking
  • Backup/restore/admin
  • Cooperative caching with client
  • File servers are a BIG hit: NetWare
  • SNAP! is my favorite today

52
Why Not a File Server? Put a Little on the Disk
Server
  • Tried and true design
  • Auspex, NetApp, ...
  • Netware
  • Yes, but look at NetWare
  • File interface gives you an app-invocation
    interface
  • Became an app server
  • Mail, DB, Web, ...
  • NetWare had a primitive OS
  • Hard to program, so optimized the wrong thing

53
Why Not Everything? Allow Everything on the Disk
Server (thin clients)
  • Tried and true design
  • Mainframes, Minis, ...
  • Web servers,
  • Encapsulates data
  • Minimizes data moves
  • Scaleable
  • It is where everyone ends up.
  • All the arguments against are short-term.

54
The Slippery Slope
[Spectrum: Nothing (sector server) -> Something (fixed-app server) -> Everything (app server)]
  • If you add function to the server
  • Then you add more function to the server
  • Function gravitates to data.
55
Outline
  • The Surprise-Free Future (5 years)
  • Astonishing hardware progress.
  • Some consequences
  • Absurd (?) consequences.
  • Auto-manage storage
  • Raid10 replaces Raid5
  • Disc-packs
  • Disk is the archive media of choice
  • A surprising future?
  • Disks (and other useful things) become
    supercomputers.
  • Apps run in the disk