Unleashing the Power of Parallel NFS: The Top 5 Things You Should Know Today

1
Unleashing the Power of Parallel NFS: The Top 5
Things You Should Know Today
  • A Panasas Webinar
  • Brent Welch
  • Director of Software Architecture
  • Panasas, Inc.
  • August 9, 2007

2
Getting Started
  • What is the pNFS protocol standard?
  • It is an extension to the NFSv4 file system
    protocol standard, nearing final ratification by
    the IETF.
  • It allows direct, parallel I/O between clients
    and storage devices and eliminates the scaling
    bottleneck found in today's NAS systems
  • It supports multiple types of back-end storage
    systems, including traditional block storage,
    other file servers, and object storage systems
  • Today we will address:
  • How pNFS meets the performance challenges
    inherent with NAS and SAN
  • The performance and scalability
    advantages of pNFS
  • How Panasas and other leading storage companies
    are contributing to pNFS
  • Why pNFS is important and how your organization
    can take full advantage of the protocol as it
    becomes available.

3
Cluster storage problem statement
  • Compute clusters are growing larger in size (8,
    128, 1024, 4096)
  • Drivers: scientific codes, seismic data, digital
    animation, biotech, EDA
  • Each host in the cluster needs uniform access to
    any stored data
  • Demand for storage capacity and bandwidth is
    growing (GB/s)
  • Apply clustering techniques to the storage system
    itself
  • Maintain simplicity of storage management even at
    large scale

[Diagram labels: Clients, ?, Storage]
4
Parallel computing requires parallel storage
  • Economics of commodity x86/Linux systems
  • Drives down cost via standard building blocks
  • Enables users to build larger clusters and larger
    models
  • Accommodates the huge growth in the number and sizes of
    files
  • The use of parallelism in compute environments is
    rapidly accelerating
  • Clusters are now the de facto deployment architecture
    for HPC
  • Embarrassingly parallel and low-latency MPI
    applications
  • Multi-core processors and multi-threading

Storage systems must be optimized for parallelism
in a standard, economically efficient way
5
Network Attached Storage in the '80s and '90s
[Diagram labels: four NFS connections, four filer heads]
Islands of storage:
  • Filer heads create I/O performance bottlenecks
  • Multiple instances create management challenges
6
Clustered NAS emerged to solve manageability
issues in the early 2000s
[Diagram labels: four NFS connections, clustered filer heads]
Bridged islands of storage:
  • In-band filer head synchronization creates I/O
    performance bottlenecks
  • Load balancing becomes a management and
    performance issue
7
Parallel Clustered Storage solves the performance
and scalability issues
[Diagram labels: Metadata, Management, direct, parallel data paths]
Pool of parallel clustered storage: I/O
performance bottlenecks and management challenges
are solved as filers are removed from the data path
8
The advantage of parallel storage over NFS
FLUENT CFD analysis
  • Serial I/O: increased I/O activity outweighs
    solver performance improvement
  • Parallel I/O: performance scaling maintained
Source: Fluent / ANSYS, November 2006
9
Advantage of parallel storage over clustered
NFS: Paradigm GeoDepth seismic benchmark
[Chart: run time per configuration; shelf labels in chart order: 4 shelves, 2 shelves, 4 shelves]
  • 7 hours 17 mins, average read bandwidth 300 MB/s
  • 3 hours 35 mins, average read bandwidth 500 MB/s
  • 2 hours 51 mins, average read bandwidth 650 MB/s
  • 2.5X faster (less time)
Source: Paradigm and Panasas, February 2007
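
The 2.5X figure is consistent with the longest and shortest run times quoted on this slide. A small arithmetic check in Python; the helper name is ours, and the slide itself does not say how the figure was computed:

```python
def to_minutes(hours, minutes):
    """Convert an hours-and-minutes run time to total minutes."""
    return hours * 60 + minutes

# Run times as quoted on the slide.
slowest = to_minutes(7, 17)   # 437 minutes
middle = to_minutes(3, 35)    # 215 minutes
fastest = to_minutes(2, 51)   # 171 minutes

speedup_fastest = slowest / fastest   # about 2.56
speedup_middle = slowest / middle     # about 2.03

# Ratio of the highest to the lowest quoted average read bandwidth (MB/s).
bandwidth_ratio = 650 / 300           # about 2.17
```

The run-time ratio (about 2.56) is slightly larger than the bandwidth ratio (about 2.17), so read bandwidth alone does not account for the whole improvement.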
10
The 5 top things you need to know
  • Parallel I/O solves the bandwidth bottleneck

11
Impetus for a standard for parallel I/O
  • Key storage vendors have existing, incompatible
    parallel FS products
  • IBM GPFS
  • EMC MPFSi (HighRoad)
  • Panasas ActiveScale
  • IBRIX
  • HP PolyServe
  • What about open source? Same interoperability
    concerns.
  • Red Hat GFS
  • PVFS
  • Lustre
  • Need parallel standard within NFS

12
NFSv4 and pNFS
  • NFS was created in the '80s to share data among
    engineering workstations
  • NFSv3 is widely deployed
  • NFSv4 was eight years in the making, with lots of
    new stuff
  • Integrated Kerberos (or PKI) user authentication
  • Integrated file locking
  • ACLs (hybrid of Windows and POSIX models)
  • NFSv4.1 adds even more
  • Details learned from early NFSv4.0 experience
  • pNFS for parallel I/O
  • Directory delegations for efficiency
  • Sessions for better at-most-once semantics

13
pNFS: The standard for parallel NAS
  • pNFS is an extension to the Network File System
    v4 protocol standard
  • Allows for parallel and direct access
  • From Parallel Network File System clients
  • To Storage Devices over multiple storage
    protocols
  • Moves the Network File System server out of the
    data path

[Diagram labels: pNFS Clients; NFSv4.1 Server(s); Storage: Block (FC) / Object (OSD) / File (NFS); paths: Metadata, Management, direct, parallel data paths]
14
pNFS Layouts
  • Client gets a layout from the NFS Server
  • The layout maps the file onto storage devices and
    addresses
  • The client uses the layout to perform direct I/O
    to storage
  • At any time the server can recall the layout
  • Client commits changes and returns the layout
    when it's done
  • pNFS is optional; the client can always use
    regular NFSv4 I/O

[Diagram labels: layout, Storage, Clients, NFSv4.1 Server]
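
To make "the layout maps the file onto storage devices and addresses" concrete, here is a minimal Python sketch. It assumes a simple round-robin striped layout; the class name, field names, and device IDs are hypothetical and are not taken from the NFSv4.1 draft, where real layouts are type-specific.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StripedLayout:
    """Toy layout: a file striped round-robin over a list of devices."""
    devices: tuple     # hypothetical device IDs, in stripe order
    stripe_unit: int   # bytes written to one device before moving on

    def locate(self, offset):
        """Map a file byte offset to (device ID, offset on that device)."""
        stripe_no = offset // self.stripe_unit
        device = self.devices[stripe_no % len(self.devices)]
        row = stripe_no // len(self.devices)   # full passes over all devices
        return device, row * self.stripe_unit + offset % self.stripe_unit

layout = StripedLayout(devices=("dev0", "dev1", "dev2"), stripe_unit=64 * 1024)
first = layout.locate(0)                    # start of the file
second = layout.locate(64 * 1024)           # start of the second stripe unit
wrapped = layout.locate(3 * 64 * 1024 + 10) # back on the first device
```

With the layout in hand, the client can compute where each byte lives without asking the server again, which is what allows I/O to go directly to storage.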
15
pNFS Client
  • Common client for different storage back ends
  • Wider availability across operating systems
  • Fewer support issues for storage vendors

[Diagram labels: Client Apps; pNFS Client with Layout Driver; NFSv4.1; pNFS Server; layout metadata grant and revoke; Cluster Filesystem]
Layout drivers:
1. SBC (blocks)
2. OSD (objects)
3. NFS (files)
4. PVFS (user level)
5. Something new
16
pNFS Protocol Operations
  • LAYOUTGET
  • (filehandle, type, byte range) -> type-specific
    layout
  • LAYOUTRETURN
  • (filehandle, byte range) -> server can release
    state about the client
  • LAYOUTCOMMIT
  • (filehandle, byte range, updated attributes,
    layout-specific info) -> server ensures that data
    is visible to other clients
  • Timestamps and end-of-file attributes are updated
  • GETDEVICEINFO
  • Map deviceID in layout to type-specific
    addressing information
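
The four operations can be modeled as methods on a toy, in-memory metadata server. This is only a sketch of the bookkeeping the slide describes; the class, the argument names, and the device address are hypothetical, and this is not the NFSv4.1 wire protocol.

```python
class ToyMetadataServer:
    """In-memory stand-in for the layout state a pNFS server tracks."""

    def __init__(self, device_table):
        self.device_table = dict(device_table)  # device ID -> address info
        self.outstanding = {}                   # (client, filehandle) -> byte range
        self.file_size = {}                     # filehandle -> committed size

    def layoutget(self, client, filehandle, layout_type, byte_range):
        """Grant a layout and remember that this client holds it."""
        self.outstanding[(client, filehandle)] = byte_range
        return {"type": layout_type, "range": byte_range,
                "device_ids": sorted(self.device_table)}

    def getdeviceinfo(self, device_id):
        """Map a device ID from a layout to addressing information."""
        return self.device_table[device_id]

    def layoutcommit(self, client, filehandle, byte_range, new_size):
        """Publish a write: update the end-of-file attribute."""
        if (client, filehandle) not in self.outstanding:
            raise KeyError("client holds no layout for this file")
        current = self.file_size.get(filehandle, 0)
        self.file_size[filehandle] = max(current, new_size)

    def layoutreturn(self, client, filehandle, byte_range):
        """Release the state kept about this client's layout."""
        self.outstanding.pop((client, filehandle), None)

server = ToyMetadataServer({"dev0": "192.0.2.10:3260"})  # address is made up
granted = server.layoutget("clientA", "fh1", "objects", (0, 1024))
address = server.getdeviceinfo(granted["device_ids"][0])
server.layoutcommit("clientA", "fh1", (0, 1024), new_size=1024)
server.layoutreturn("clientA", "fh1", (0, 1024))
```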

17
pNFS Protocol Callbacks
  • NFS Version 4 servers are stateful, and they
    generate callbacks to clients to reclaim state
    about delegated locks and delegated layouts
  • NFSv4.1 adds these callback operations
  • CB_LAYOUTRECALL
  • Server tells the client to stop using a layout,
    or all layouts
  • CB_RECALL_ANY
  • Server tells the client to release delegations of
    its own choosing, allowing the server to reduce
    the amount of state it is maintaining
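
The difference between the two callbacks can be sketched from the client side. The class below is a hypothetical cache of held state, not a real client implementation; the `keep` parameter and the oldest-first release policy are assumptions made for illustration.

```python
class ToyClientState:
    """Toy cache of state (layouts or delegations) a client currently holds."""

    def __init__(self):
        self.held = {}  # filehandle -> state; dicts keep insertion order

    def add(self, filehandle, state):
        self.held[filehandle] = state

    def cb_layoutrecall(self, filehandle=None):
        """Server names what to give up: one file, or everything if None."""
        if filehandle is None:
            released = list(self.held)
            self.held.clear()
        elif filehandle in self.held:
            del self.held[filehandle]
            released = [filehandle]
        else:
            released = []
        return released

    def cb_recall_any(self, keep):
        """Server only asks for a reduction; the client picks what to release.

        This sketch releases the oldest entries until at most `keep` remain.
        """
        excess = max(0, len(self.held) - keep)
        released = list(self.held)[:excess]
        for filehandle in released:
            del self.held[filehandle]
        return released

client = ToyClientState()
for name in ("fh1", "fh2", "fh3"):
    client.add(name, {"layout_for": name})

chosen_by_client = client.cb_recall_any(keep=2)   # client drops its oldest entry
named_by_server = client.cb_layoutrecall("fh3")   # server names the file
everything = client.cb_layoutrecall()             # server recalls all
```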

18
pNFS READ
  • LOOKUP+OPEN: client to NFS server, returns file
    handle and state IDs
  • LAYOUTGET: client to NFS server, returns layout
  • READ: client to storage devices, many reads in
    parallel
  • LAYOUTRETURN: client to NFS server
  • Clients can cache layouts for use with multiple
    READ and multiple LOOKUP+OPEN instances
  • Server uses CB_LAYOUTRECALL when the layout is no
    longer valid

[Diagram labels: Linux Compute Cluster, Metadata Manager, Storage Devices; Control Path, Parallel Data Paths (READ)]
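
The READ step, "many reads in parallel", can be sketched with a thread pool. This assumes the layout has already been obtained and uses in-memory byte strings as stand-ins for storage devices; the device names, the layout dictionary, and the function names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "storage devices". The 24-byte file
# b"AAAABBBBCCCCDDDDEEEEFFFF" is striped over them in 4-byte units.
DEVICES = {
    "dev0": b"AAAADDDD",
    "dev1": b"BBBBEEEE",
    "dev2": b"CCCCFFFF",
}

# What a LAYOUTGET reply might carry in this toy model.
LAYOUT = {"devices": ["dev0", "dev1", "dev2"], "stripe_unit": 4}

def read_unit(layout, stripe_no):
    """READ one stripe unit directly from the device that holds it."""
    names = layout["devices"]
    unit = layout["stripe_unit"]
    device = DEVICES[names[stripe_no % len(names)]]
    start = (stripe_no // len(names)) * unit
    return device[start:start + unit]

def pnfs_read(layout, n_units):
    """Issue the per-unit READs in parallel and reassemble them in order."""
    with ThreadPoolExecutor(max_workers=len(layout["devices"])) as pool:
        # map() yields results in submission order, so the file is
        # reassembled correctly even if reads finish out of order.
        parts = pool.map(lambda n: read_unit(layout, n), range(n_units))
        return b"".join(parts)

data = pnfs_read(LAYOUT, 6)
```

The metadata server appears nowhere in `pnfs_read`: once the layout is cached, only the storage devices are on the data path.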
19
pNFS WRITE
  • LOOKUP+OPEN: client to NFS server, returns file
    handle and state IDs
  • LAYOUTGET: client to NFS server, returns layout
  • WRITE: client to storage devices
  • LAYOUTCOMMIT: client to NFS server, publishes
    write
  • LAYOUTRETURN: client to NFS server
  • Server may restrict byte range of write layout to
    reduce allocation overheads, avoid quota limits,
    etc.

[Diagram labels: Linux Compute Cluster, Metadata Manager, Storage Devices; Control Path, Parallel Data Paths (WRITE)]
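
The role of LAYOUTCOMMIT in the write path can be illustrated with a toy model in which data lands on the devices first and the advertised file size changes only on commit. All names are hypothetical, and the sketch only handles a single write starting at offset 0.

```python
class ToyStripedFile:
    """Toy file: devices hold the bytes, the metadata server holds the size."""

    def __init__(self, n_devices, stripe_unit):
        self.devices = [bytearray() for _ in range(n_devices)]
        self.stripe_unit = stripe_unit
        self.committed_size = 0   # what other clients would see as end-of-file

    def write(self, data):
        """WRITE: send stripe units straight to the devices, round-robin."""
        for start in range(0, len(data), self.stripe_unit):
            stripe_no = start // self.stripe_unit
            target = self.devices[stripe_no % len(self.devices)]
            target += data[start:start + self.stripe_unit]
        return len(data)

    def layoutcommit(self, new_size):
        """LAYOUTCOMMIT: publish the write by updating the file size."""
        self.committed_size = max(self.committed_size, new_size)

f = ToyStripedFile(n_devices=2, stripe_unit=4)
written = f.write(b"abcdefghij")
size_before_commit = f.committed_size   # still 0: not yet published
f.layoutcommit(written)
size_after_commit = f.committed_size
```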
20
Example: pNFS over Blocks
  • Layout describes an array of blocks or extents
  • NFS server is responsible for block allocation
  • Client uses SCSI/SBC commands to read and write
    data blocks
  • iSCSI or FC SAN access
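
A block layout can be pictured as a list of extents. The sketch below resolves a file offset to an offset on a volume; the `Extent` fields and the example values are hypothetical and simplified compared with any real block layout encoding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Extent:
    """One contiguous run of a file mapped onto a volume."""
    file_offset: int     # where the run starts in the file (bytes)
    length: int          # length of the run (bytes)
    volume_offset: int   # where the run starts on the volume (bytes)

def resolve(extents, offset):
    """Return the volume offset that backs the given file offset."""
    for extent in extents:
        if extent.file_offset <= offset < extent.file_offset + extent.length:
            return extent.volume_offset + (offset - extent.file_offset)
    raise LookupError("offset is not covered by this layout")

# Two extents that are contiguous in the file but not on the volume.
layout = [
    Extent(file_offset=0, length=4096, volume_offset=1_000_000),
    Extent(file_offset=4096, length=8192, volume_offset=5_000_000),
]
in_first = resolve(layout, 100)
in_second = resolve(layout, 4096 + 10)
```

An offset outside every extent raises `LookupError`; a real client would go back to the server for a layout covering that range, or use regular NFS I/O.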

21
Example: pNFS over Files
  • Layout describes the set of file servers that
    store (parts of) a file
  • Layout parameters describe how data is striped
    over the component files
  • Simple striping is supported
  • NFS server is responsible for creating and
    deleting component files, and establishing
    security and access control state on data servers
  • Client uses NFS commands to read and write data
    (bytes)
  • Data File Servers are responsible for block
    management
  • Metadata File Server is responsible for
    attributes and access control
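
With simple striping, one client request that spans several stripe units becomes one request per data server. A minimal sketch, assuming round-robin striping with a fixed stripe unit; the function name and the tuple layout are ours:

```python
def split_range(offset, length, stripe_unit, n_servers):
    """Split a file byte range into per-server segments.

    Returns a list of (server index, offset in component file, length).
    """
    segments = []
    end = offset + length
    while offset < end:
        stripe_no = offset // stripe_unit
        within = offset % stripe_unit
        # Stop at the end of this stripe unit or the end of the request.
        chunk = min(stripe_unit - within, end - offset)
        server = stripe_no % n_servers
        component_offset = (stripe_no // n_servers) * stripe_unit + within
        segments.append((server, component_offset, chunk))
        offset += chunk
    return segments

# A 10-byte request starting at byte 6, with 8-byte stripe units on 2 servers.
crossing = split_range(offset=6, length=10, stripe_unit=8, n_servers=2)
# A 20-byte request from the start of the file wraps back to server 0.
wrapping = split_range(offset=0, length=20, stripe_unit=8, n_servers=2)
```

Each resulting segment can then be sent as an ordinary NFS READ or WRITE to the data server that holds that component file.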

22
Example: pNFS over Objects
  • Layout describes the set of component objects
    that store a file
  • Layout parameters describe how data is striped
    over these objects
  • RAID-0, RAID-1 (Mirroring), RAID-5, RAID-6 are
    all possibilities
  • Security credentials grant access to the client
    for individual objects
  • NFS server is responsible for creating and
    deleting objects, and granting access credentials
  • Client uses iSCSI/OSD commands to read and write
    data (bytes)
  • Object Storage Device (OSD) is responsible for
    block management
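
Of the striping options listed, RAID-5 relies on XOR parity. The sketch below shows only the parity arithmetic over equal-length stripe units, not parity rotation or how the object layout encodes it; the function names are ours.

```python
from functools import reduce

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity(units):
    """Parity unit for a stripe: XOR of all data units."""
    return reduce(xor_bytes, units)

def reconstruct(surviving_units, parity_unit):
    """Rebuild the one missing data unit from the survivors and the parity."""
    return reduce(xor_bytes, surviving_units, parity_unit)

# One stripe of three data units, each stored in a different object.
units = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
p = parity(units)

# Lose the middle unit, then rebuild it from the other two plus parity.
rebuilt = reconstruct([units[0], units[2]], p)
```

Because the client knows the layout, it can compute and write parity itself as part of a parallel write.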

23
Key pNFS Participants
  • Panasas (Objects, based on Panasas Storage
    Cluster OSDs)
  • Network Appliance (Files over NFSv4)
  • IBM (Files, based on GPFS)
  • EMC (Blocks, based on HighRoad MPFSi)
  • Sun (Files over NFSv4)
  • U of Michigan/CITI (Files over PVFS2, Files over
    NFSv4)

24
Current Status
  • pNFS is part of the IETF NFSv4 minor version 1
    standard draft
  • draft-ietf-nfsv4-minorversion1-13.txt
  • Weekly editorial review meetings started this May
  • Anticipate working group last call this October
  • Anticipate RFC being published late Q1 2008
  • Prototype interoperability testing began in 2006
  • Connect-a-thon and Bake-a-thon multi-vendor
    testing sessions 2-3 times/year.
  • March 2007 San Jose. June 2007 Austin. October
    2007 Ann Arbor.
  • Expect Linux integration into kernel.org by late
    2008
  • Expect other vendor releases by late 2008

25
The 5 Top Things You Need to Know
  • Parallel I/O solves the bandwidth bottleneck
  • The Industry is standardizing parallel I/O as
    pNFS

26
Taking full advantage of pNFS
  • How do you effectively scale applications?
  • Scalability involves several dimensions of
    hardware and software
  • An effective solution is balanced
  • CPU power
  • Main memory and memory system throughput
  • Interconnect bandwidth and latency
  • Storage bandwidth and capacity
  • Middleware (MPI)
  • Application structure
  • A scalable system requires scaling each system
    component
  • Scalable I/O cannot be overlooked, especially
    within an application
  • One metric is 1 GB/s of I/O for every teraflop of
    computing
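
As a worked example of the 1 GB/s per teraflop rule of thumb, the arithmetic below sizes the I/O target for a cluster. The cluster size and checkpoint size are invented for illustration and do not come from the presentation.

```python
def target_io_gb_per_s(teraflops, gb_per_s_per_teraflop=1.0):
    """I/O bandwidth suggested by the rule of thumb."""
    return teraflops * gb_per_s_per_teraflop

def transfer_seconds(data_gb, io_gb_per_s):
    """Time to move a given amount of data at a given bandwidth."""
    return data_gb / io_gb_per_s

cluster_teraflops = 20   # hypothetical cluster
checkpoint_gb = 4000     # hypothetical checkpoint size

io_target = target_io_gb_per_s(cluster_teraflops)            # 20 GB/s
balanced_time = transfer_seconds(checkpoint_gb, io_target)   # 200 seconds
# The same checkpoint through a single 0.5 GB/s path:
single_path_time = transfer_seconds(checkpoint_gb, 0.5)      # 8000 seconds
```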

27
The 5 Top Things You Need to Know
  • Parallel I/O solves the bandwidth bottleneck
  • The Industry is standardizing parallel I/O as
    pNFS
  • Your internal codes may need modifying to take
    full advantage of pNFS
  • For further information on modifying your
    internal code, request a copy of "Optimizing HPC
    Applications with Parallel Storage", a previous
    Panasas webinar on this topic
  • Please email your request to info@panasas.com
  • Now is the time to ask your vendors about their
    plans to support pNFS

28
Panasas: The pNFS Company
  • pNFS was originally proposed to the NFS community
    by Panasas CTO Garth Gibson
  • Special thanks to Gary Grider, Los Alamos NL, and
    Lee Ward, Sandia NL
  • pNFS draws on the Panasas DirectFLOW client
    architecture
  • Implementation experience guided pNFS standards
    effort
  • The primary benefits of pNFS are available from
    Panasas today
  • Superior bandwidth
  • Unmatched scalability
  • Simplified management
  • Full investment protection in storage hardware
    and applications

29
Panasas Announces Open Sourcing of DirectFLOW
Client Software
  • Panasas to open-source core of DirectFLOW client
    software
  • A reference implementation to show how we solve
    parallel I/O problems
  • Key Panasas components
  • Storage Access Mgr, OSD client, Object iSCSI and
    other network layers, parts of the Panasas
    libraries (common, rpc, sec) that are needed to
    compile and link.
  • Available later this summer at www.pnfs.com, a
    community resource site.
  • Panasas has a dedicated pNFS development center
  • Leverage Panasas engineering expertise and
    DirectFLOW source code
  • Focus on the pNFS object layout driver, iSCSI
    drivers, and other contributions to the open-source
    pNFS client and server teams
  • Why is Panasas doing this?
  • To accelerate industry migration to parallel file
    systems and speed the integration of pNFS into
    Linux Kernel and distros
  • To enable our customers to reap the benefits of
    standards-based parallel storage solutions as
    soon as possible

30
Panasas parallel storage leadership
  • System architecture inherently parallel
  • Simple software upgrade to pNFS
  • Time to market advantage with pNFS
  • Commercial production deployment expertise
  • Shipping for 4 years
  • Deployed at 100 sites
  • Object-based pNFS Server Implementation
  • Superior performance: both streaming and random
    I/O
  • Easy management: 15-minute install,
    auto-provisioning, load balancing, RAID management
  • High availability: failover, predictive disk
    management, and parallel reconstruction
  • And all at PetaScale, enabled by the object
    architecture

31
The 5 top things you need to know
  • Parallel I/O solves the bandwidth bottleneck
  • The Industry is standardizing parallel I/O as
    pNFS
  • Your internal codes may need modifying to take
    full advantage of pNFS
  • Now is the time to ask your vendors about their
    plans to support pNFS
  • Panasas is leading the charge towards pNFS

32
References
  • pNFS Problem Statement. Garth Gibson (Panasas),
    Peter Corbett (Netapp), Internet-draft, July 2004,
    http://www.pdl.cmu.edu/pNFS/archive/gibson-pnfs-problem-statement.html
  • NFSv4 pNFS Extensions. G. Goodson (Netapp), B.
    Welch, B. Halevy (Panasas), D. Black (EMC), A.
    Adamson (CITI), Internet-draft, October 2005,
    http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-pnfs-00.txt
  • Linux pNFS Kernel Development. CITI,
    http://www.citi.umich.edu/projects/asci/pnfs/linux/
  • NFSv4 Minor Version 1.
    http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-minorversion1-12.txt

33
Thank You!
  • For more information
  • www.panasas.com
  • www.pnfs.com
  • info@panasas.com