Title: Unleashing the Power of Parallel NFS: The Top 5 Things You Should Know Today
1. Unleashing the Power of Parallel NFS: The Top 5 Things You Should Know Today
- A Panasas Webinar
- Brent Welch
- Director of Software Architecture
- Panasas, Inc.
- August 9, 2007
2. Getting Started
- What is the pNFS protocol standard?
  - It is an extension to the NFSv4 file system protocol standard, nearing final ratification by the IETF.
  - It allows direct, parallel I/O between clients and storage devices, eliminating the scaling bottleneck found in today's NAS systems.
  - It supports multiple types of back-end storage systems, including traditional block storage, other file servers, and object storage systems.
- Today we will address:
  - How pNFS meets the performance challenges inherent in NAS and SAN
  - The performance and scalability advantages of pNFS
  - How Panasas and other leading storage companies are contributing to pNFS
  - Why pNFS is important and how your organization can take full advantage of the protocol as it becomes available
3. Cluster storage problem statement
- Compute clusters are growing larger in size (8, 128, 1024, 4096 nodes)
- Drivers: scientific codes, seismic data, digital animation, biotech, EDA
- Each host in the cluster needs uniform access to any stored data
- Demand for storage capacity and bandwidth is growing (GB/sec)
- Apply clustering techniques to the storage system itself
- Maintain simplicity of storage management even at large scale

[Diagram: clients connecting to storage]
4. Parallel computing requires parallel storage
- Economics of commodity x86/Linux systems
  - Drives down cost via standard building blocks
  - Enables users to build larger clusters and larger models
  - Accommodates the huge growth in the number and sizes of files
- The use of parallelism in compute environments is rapidly accelerating
  - Clusters are now the de facto deployment architecture for HPC
  - Embarrassingly parallel and low-latency MPI applications
  - Multi-core processors and multi-threading
- Storage systems must be optimized for parallelism in a standard, economically efficient way
5. Network Attached Storage in the '80s and '90s

[Diagram: four NFS filer heads, each with its own island of storage]

Islands of storage: filer heads create I/O performance bottlenecks, and multiple instances create management challenges.
6. Clustered NAS emerged in the early 2000s to solve manageability issues

[Diagram: NFS clients accessing clustered filer heads]

Bridged islands of storage: in-band filer-head synchronization creates I/O performance bottlenecks, and load balancing becomes a management and performance issue.
7. Parallel clustered storage solves the performance and scalability issues

[Diagram: clients take direct, parallel data paths to a pool of parallel clustered storage, with metadata and management handled out of band]

I/O performance bottlenecks and management challenges are solved once the filers are removed from the data path.
8. The advantage of parallel storage over NFS
- FLUENT CFD analysis
- Serial I/O: increased I/O activity outweighs the solver performance improvement
- Parallel I/O: performance scaling maintained
- Source: Fluent / ANSYS, November 2006
9. Advantage of parallel storage over clustered NFS: Paradigm GeoDepth seismic benchmark
- Clustered NFS, 4 shelves: 7 hours 17 min, average read bandwidth 300 MB/s
- Parallel storage, 2 shelves: 3 hours 35 min, average read bandwidth 500 MB/s
- Parallel storage, 4 shelves: 2 hours 51 min, average read bandwidth 650 MB/s (2.5x faster, i.e., less time)
- Source: Paradigm and Panasas, February 2007
10. The 5 top things you need to know
- Parallel I/O solves the bandwidth bottleneck
11. Impetus for a standard for parallel I/O
- Key storage vendors have existing, incompatible parallel file system products
  - IBM GPFS
  - EMC MPFSi (HighRoad)
  - Panasas ActiveScale
  - IBRIX
  - HP PolyServe
- What about open source? Same interoperability concerns.
  - Red Hat GFS
  - PVFS
  - Lustre
- We need a parallel standard within NFS
12. NFSv4 and pNFS
- NFS was created in the '80s to share data among engineering workstations
- NFSv3 is widely deployed
- NFSv4 was eight years in the making, with lots of new features:
  - Integrated Kerberos (or PKI) user authentication
  - Integrated file locking
  - ACLs (a hybrid of the Windows and POSIX models)
- NFSv4.1 adds even more:
  - Details learned from early NFSv4.0 experience
  - pNFS for parallel I/O
  - Directory delegations for efficiency
  - Sessions for better at-most-once semantics
13. pNFS: The standard for parallel NAS
- pNFS is an extension to the Network File System v4 protocol standard
- Allows for parallel and direct access
  - From parallel NFS clients
  - To storage devices, over multiple storage protocols
- Moves the NFS server out of the data path

[Diagram: pNFS clients take direct, parallel data paths to storage (block (FC) / object (OSD) / file (NFS)), while the NFSv4.1 server(s) handle metadata and management]
14. pNFS Layouts
- The client gets a layout from the NFS server
- The layout maps the file onto storage devices and addresses
- The client uses the layout to perform direct I/O to storage
- At any time the server can recall the layout
- The client commits changes and returns the layout when it's done
- pNFS is optional; the client can always use regular NFSv4 I/O

[Diagram: clients obtain layouts from the NFSv4.1 server and access storage directly]
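To make the idea concrete, a layout can be pictured as a small data structure. The sketch below is purely illustrative (the `Extent` type, device names, and offsets are made up, in the spirit of a block-style layout); it shows how a client could resolve a file offset to a storage device without contacting the server:

```python
from dataclasses import dataclass

@dataclass
class Extent:
    file_offset: int  # where this extent starts within the file
    length: int       # bytes covered by this extent
    device: str       # storage device address holding the bytes
    dev_offset: int   # starting offset on that device

def resolve(layout, offset):
    """Return (device, device offset) for a file byte offset, per the layout."""
    for ext in layout:
        if ext.file_offset <= offset < ext.file_offset + ext.length:
            return ext.device, ext.dev_offset + (offset - ext.file_offset)
    # Range not covered: a real client would issue another LAYOUTGET.
    raise ValueError("offset not covered by this layout")

# Illustrative layout: first MiB on osd-a, second MiB on osd-b.
example_layout = [Extent(0, 1 << 20, "osd-a", 0),
                  Extent(1 << 20, 1 << 20, "osd-b", 4096)]
```

Once the client holds such a mapping, every read or write in the covered range goes straight to the named device, which is what takes the NFS server out of the data path.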
15. pNFS Client
- A common client for different storage back ends
- Wider availability across operating systems
- Fewer support issues for storage vendors

[Diagram: client apps sit on the pNFS client, which selects a layout driver for the back end — 1. SBC (blocks), 2. OSD (objects), 3. NFS (files), 4. PVFS (user level), 5. something new — and speaks NFSv4.1 to the pNFS server; layout metadata grants and revokes flow between the pNFS server and its cluster filesystem]
16. pNFS Protocol Operations
- LAYOUTGET
  - (filehandle, type, byte range) -> type-specific layout
- LAYOUTRETURN
  - (filehandle, byte range) -> server can release state about the client
- LAYOUTCOMMIT
  - (filehandle, byte range, updated attributes, layout-specific info) -> server ensures that data is visible to other clients
  - Timestamps and end-of-file attributes are updated
- GETDEVICEINFO
  - Map a deviceID in a layout to type-specific addressing information
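The four operations above can be sketched as methods on a toy server object. This is a hedged illustration only (the class, device table, and dict-based layout are invented stand-ins; the real protocol is XDR-encoded NFSv4.1, not Python):

```python
class ToyPNFSServer:
    """Illustrative stand-in for an NFSv4.1 metadata server; not a real API."""

    def __init__(self):
        self.device_table = {1: "iscsi://osd-a:3260", 2: "iscsi://osd-b:3260"}
        self.layouts = {}   # filehandle -> outstanding layouts
        self.attrs = {}     # filehandle -> attributes visible to other clients

    def layoutget(self, fh, layout_type, offset, length):
        # (filehandle, type, byte range) -> type-specific layout
        layout = {"type": layout_type, "offset": offset,
                  "length": length, "device_ids": [1, 2]}
        self.layouts.setdefault(fh, []).append(layout)
        return layout

    def layoutreturn(self, fh, layout):
        # Server may now release state it held for this client's layout.
        self.layouts[fh].remove(layout)

    def layoutcommit(self, fh, offset, length, new_size):
        # Make the client's writes visible: update EOF and timestamps.
        self.attrs[fh] = {"size": new_size, "mtime": "now"}

    def getdeviceinfo(self, device_id):
        # Map a deviceID from a layout to addressing information.
        return self.device_table[device_id]
```

A client would call `layoutget`, resolve each `device_id` through `getdeviceinfo`, perform direct I/O, then `layoutcommit` and `layoutreturn`.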
17. pNFS Protocol Callbacks
- NFSv4 servers are stateful, and they generate callbacks to clients to reclaim state about delegated locks and delegated layouts
- NFSv4.1 adds these callback operations:
  - CB_LAYOUTRECALL
    - The server tells the client to stop using a layout, or all layouts
  - CB_RECALL_ANY
    - The server tells the client to release delegations of the client's own choosing, allowing the server to reduce the amount of state it is maintaining
18. pNFS READ
- LOOKUP/OPEN: client to NFS server; returns file handle and state IDs
- LAYOUTGET: client to NFS server; returns layout
- READ: client to storage devices; many reads in parallel
- LAYOUTRETURN: client to NFS server
- Clients can cache layouts for use across multiple READs and multiple LOOKUP/OPEN instances
- The server uses CB_LAYOUTRECALL when the layout is no longer valid

[Diagram: a Linux compute cluster issues READs to the storage devices over parallel data paths, with a control path to the metadata manager]
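The READ sequence above can be sketched end to end. The classes below are hypothetical stand-ins (no real NFS library is used), and a thread pool stands in for the parallel data paths:

```python
from concurrent.futures import ThreadPoolExecutor

class ToyMetadataServer:
    """Invented stand-in for the NFSv4.1 metadata server."""
    def open(self, path):
        return ("fh:" + path, "stateid-1")                # LOOKUP/OPEN
    def layoutget(self, fh, offset, length):
        return {"devices": ["osd-0", "osd-1", "osd-2"]}   # LAYOUTGET
    def layoutreturn(self, fh):
        pass                                              # LAYOUTRETURN

class ToyStorage:
    """Stand-in for the storage devices; each read can run in parallel."""
    def read(self, device):
        return f"<data from {device}>".encode()

def pnfs_read(server, storage, path):
    fh, stateid = server.open(path)           # 1. LOOKUP/OPEN
    layout = server.layoutget(fh, 0, 4096)    # 2. LAYOUTGET
    with ThreadPoolExecutor() as pool:        # 3. READs: direct, parallel
        parts = list(pool.map(storage.read, layout["devices"]))
    server.layoutreturn(fh)                   # 4. LAYOUTRETURN
    return b"".join(parts)
```

Note that only steps 1, 2, and 4 touch the metadata server; the bulk data in step 3 flows straight from the storage devices.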
19. pNFS WRITE
- LOOKUP/OPEN: client to NFS server; returns file handle and state IDs
- LAYOUTGET: client to NFS server; returns layout
- WRITE: client to storage devices
- LAYOUTCOMMIT: client to NFS server; publishes the write
- LAYOUTRETURN: client to NFS server
- The server may restrict the byte range of a write layout to reduce allocation overheads, avoid quota limits, etc.

[Diagram: a Linux compute cluster issues WRITEs to the storage devices over parallel data paths, with a control path to the metadata manager]
20. Example: pNFS over Blocks
- The layout describes an array of blocks or extents
- The NFS server is responsible for block allocation
- The client uses SCSI/SBC commands to read and write data blocks
- iSCSI or FC SAN access
21. Example: pNFS over Files
- The layout describes the set of file servers that store (parts of) a file
- Layout parameters describe how data is striped over the component files
  - Simple striping is supported
- The NFS server is responsible for creating and deleting component files, and for establishing security and access-control state on the data servers
- The client uses NFS commands to read and write data (bytes)
- The data file servers are responsible for block management
- The metadata file server is responsible for attributes and access control
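The simple striping mentioned above amounts to a little arithmetic: a file offset determines which component file holds the byte and where. A minimal sketch, assuming round-robin striping with an illustrative fixed stripe unit (the actual parameters come from the layout):

```python
def locate(offset, stripe_unit, n_servers):
    """Map a file byte offset to (data server index, offset in component file)."""
    stripe = offset // stripe_unit        # which stripe unit, counting from 0
    server = stripe % n_servers           # round-robin across data servers
    local = (stripe // n_servers) * stripe_unit + offset % stripe_unit
    return server, local
```

With a 64 KiB stripe unit over 4 servers, byte 0 lands at the start of server 0's component file, byte 65536 at the start of server 1's, and byte 262144 wraps back to server 0 at local offset 65536.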
22. Example: pNFS over Objects
- The layout describes the set of component objects that store a file
- Layout parameters describe how data is striped over these objects
  - RAID-0, RAID-1 (mirroring), RAID-5, and RAID-6 are all possibilities
- Security credentials grant the client access to individual objects
- The NFS server is responsible for creating and deleting objects, and for granting access credentials
- The client uses iSCSI/OSD commands to read and write data (bytes)
- The Object Storage Device (OSD) is responsible for block management
23. Key pNFS Participants
- Panasas (objects, based on Panasas Storage Cluster OSDs)
- Network Appliance (files over NFSv4)
- IBM (files, based on GPFS)
- EMC (blocks, based on HighRoad MPFSi)
- Sun (files over NFSv4)
- U of Michigan/CITI (files over PVFS2, files over NFSv4)
24. Current Status
- pNFS is part of the IETF NFSv4 minor version 1 standard draft
  - draft-ietf-nfsv4-minorversion1-13.txt
  - Weekly editorial review meetings started this May
  - Anticipate working-group last call this October
  - Anticipate the RFC being published in late Q1 2008
- Prototype interoperability testing began in 2006
  - Connect-a-thon and Bake-a-thon multi-vendor testing sessions, 2-3 times per year
  - March 2007: San Jose. June 2007: Austin. October 2007: Ann Arbor.
- Expect Linux integration into kernel.org by late 2008
- Expect other vendor releases by late 2008
25. The 5 Top Things You Need to Know
- Parallel I/O solves the bandwidth bottleneck
- The industry is standardizing parallel I/O as pNFS
26. Taking full advantage of pNFS
- How do you effectively scale applications?
- Scalability involves several dimensions of hardware and software; an effective solution balances:
  - CPU power
  - Main memory and memory-system throughput
  - Interconnect bandwidth and latency
  - Storage bandwidth and capacity
  - Middleware (MPI)
  - Application structure
- A scalable system requires scaling each system component
- Scalable I/O cannot be overlooked, especially within an application
- One metric: 1 GB/sec of I/O for every teraflop of computing
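That rule of thumb turns into a quick sizing calculation. The helper below is a sketch only (the per-unit bandwidth figure used in the example is illustrative, not a specification of any product):

```python
import math

def storage_units_needed(peak_tflops, unit_gbps, gb_per_tflop=1.0):
    """Storage units required to supply the suggested aggregate bandwidth.

    peak_tflops  -- the cluster's peak compute rate in teraflops
    unit_gbps    -- deliverable bandwidth of one storage unit, in GB/s
    gb_per_tflop -- the rule-of-thumb ratio (1 GB/s per teraflop)
    """
    return math.ceil(peak_tflops * gb_per_tflop / unit_gbps)
```

For example, a 10-teraflop cluster served by hypothetical storage units delivering 0.6 GB/s each would want roughly `storage_units_needed(10, 0.6)` = 17 units to keep I/O from becoming the bottleneck.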
27. The 5 Top Things You Need to Know
- Parallel I/O solves the bandwidth bottleneck
- The industry is standardizing parallel I/O as pNFS
- Your internal codes may need modifying to take full advantage of pNFS
  - For further information on modifying your internal code, request a copy of "Optimizing HPC Applications with Parallel Storage", a previous Panasas webinar on this topic
  - Please email your request to info@panasas.com
- Now is the time to ask your vendors about their plans to support pNFS
28. Panasas: The pNFS Company
- pNFS was originally proposed to the NFS community by Panasas CTO Garth Gibson
  - Special thanks to Gary Grider, Los Alamos NL, and Lee Ward, Sandia NL
- pNFS is leveraged from the Panasas DirectFLOW client architecture
  - Implementation experience guided the pNFS standards effort
- The primary benefits of pNFS are available from Panasas today:
  - Superior bandwidth
  - Unmatched scalability
  - Simplified management
  - Full investment protection in storage hardware and applications
29. Panasas Announces Open Sourcing of DirectFLOW Client Software
- Panasas will open-source the core of the DirectFLOW client software
  - A reference implementation to show how we solve parallel I/O problems
  - Key Panasas components: Storage Access Manager, OSD client, object iSCSI and other network layers, and the parts of the Panasas libraries (common, rpc, sec) needed to compile and link
  - Available later this summer at www.pnfs.com, a community resource site
- Panasas has a dedicated pNFS development center
  - Leverages Panasas engineering expertise and the DirectFLOW source code
  - Focus on the pNFS object layout driver, iSCSI drivers, and other contributions to the open-source pNFS client and server teams
- Why is Panasas doing this?
  - To accelerate industry migration to parallel file systems and speed the integration of pNFS into the Linux kernel and distros
  - To enable our customers to reap the benefits of standards-based parallel storage solutions as soon as possible
30. Panasas parallel storage leadership
- System architecture is inherently parallel
  - A simple software upgrade to pNFS
  - Time-to-market advantage with pNFS
- Commercial production deployment expertise
  - Shipping for 4 years
  - Deployed at 100 sites
- Object-based pNFS server implementation
  - Superior performance: both streaming and random I/O
  - Easy management: 15-minute install, auto-provisioning, load balancing, RAID management
  - High availability: failover, predictive disk management, and parallel reconstruction
  - And all at petascale, enabled by the object architecture
31. The 5 top things you need to know
- Parallel I/O solves the bandwidth bottleneck
- The industry is standardizing parallel I/O as pNFS
- Your internal codes may need modifying to take full advantage of pNFS
- Now is the time to ask your vendors about their plans to support pNFS
- Panasas is leading the charge towards pNFS
32. References
- pNFS Problem Statement. Garth Gibson (Panasas), Peter Corbett (NetApp), Internet-Draft, July 2004, http://www.pdl.cmu.edu/pNFS/archive/gibson-pnfs-problem-statement.html
- NFSv4 pNFS Extensions. G. Goodson (NetApp), B. Welch, B. Halevy (Panasas), D. Black (EMC), A. Adamson (CITI), Internet-Draft, October 2005, http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-pnfs-00.txt
- Linux pNFS Kernel Development. CITI, http://www.citi.umich.edu/projects/asci/pnfs/linux/
- NFSv4 Minor Version 1. http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-minorversion1-12.txt
33. Thank You!
- For more information:
  - www.panasas.com
  - www.pnfs.com
  - info@panasas.com