Title: Everything You Wanted to Know About Storage, but Were Afraid to Ask
1Everything You Wanted to Know About Storage, but
Were Afraid to Ask
2- Do you have a cell phone, PDA, or smartphone?
3- Do you have a DIGITAL CAMERA?
4 5- What do all of these devices have in common?
6- How do you protect your data?
7Digital Footprint Calculator
http://www.emc.com/digital_universe/downloads/web/personal-ticker.htm
8- Are you familiar with RAID?
9RAID 0
- Data is striped across the HDDs in a RAID set
- The stripe size is specified at a host level for
software RAID and is vendor specific for hardware
RAID
- When the number of drives in the array increases,
performance improves because more data can be
read or written simultaneously
- Used in applications that need high I/O
throughput
- Does not provide data protection and availability
in the event of drive failures
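As a rough illustration of striping, the sketch below maps a logical block number to a disk and an offset on that disk. The function name locate_block, the round-robin layout, and the fixed strip size are assumptions made for illustration; actual layouts are host or vendor specific, as noted above.

```python
def locate_block(logical_block: int, num_disks: int, blocks_per_strip: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes strips are laid out round-robin across the disks.
    """
    strip_index = logical_block // blocks_per_strip   # which strip, counting across all disks
    offset_in_strip = logical_block % blocks_per_strip
    disk = strip_index % num_disks                    # strips rotate across the disks
    stripe_row = strip_index // num_disks             # full stripes that precede this strip
    return disk, stripe_row * blocks_per_strip + offset_in_strip


# With 4 disks and 8 blocks per strip, consecutive strips land on consecutive disks.
for block in (0, 8, 16, 24, 32):
    print(block, locate_block(block, num_disks=4, blocks_per_strip=8))
```

Because neighbouring strips sit on different disks, a large sequential request can be served by several drives at once, which is where the throughput gain comes from.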
11RAID 1
- Mirroring is a technique whereby data is stored
on two different HDDs, yielding two copies of
data.
- In addition to providing complete data
redundancy, mirroring enables faster recovery
from disk failure.
- Mirroring involves duplication of data, so the
amount of storage capacity needed is twice the
amount of data being stored. Therefore, mirroring
is considered expensive.
- It is preferred for mission-critical
applications that cannot afford data loss
13Nested RAID
- Mirroring can be implemented with striped RAID by
mirroring entire stripes of disks to stripes on
other disks
- RAID 01 and RAID 10 combine the performance
benefits of RAID 0 with the redundancy benefits
of RAID 1
- These types of RAID require an even number of
disks, the minimum being four.
- RAID 01 is also called mirrored stripe.
- This means that the process of striping data
across HDDs is performed initially and then the
entire stripe is mirrored.
15Nested RAID
- RAID 10 is also called striped mirror
- The basic element of RAID 10 is that data is
first mirrored and then both copies of data are
striped across multiple HDDs in a RAID set
- Some applications that benefit from RAID 10
include the following
- High transaction rate Online Transaction
Processing (OLTP)
- Database applications that
require high I/O rate, random access, and high
availability
18RAID 3
- RAID 3 stripes data for high performance and uses
parity for improved fault tolerance.
- Parity information is stored on a dedicated drive
so that data can be reconstructed if a drive
fails
- RAID 3 is used in applications that involve large
sequential data access, such as video streaming.
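To make the reconstruction idea concrete, here is a minimal sketch that assumes parity is the byte-wise XOR of the data strips, which is the usual choice for single-parity RAID; the slide itself does not name the parity function. The helper name xor_parity and the sample bytes are illustrative only.

```python
from functools import reduce


def xor_parity(strips: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


# Three data strips and the parity strip kept on the dedicated parity drive.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_parity(data)

# Suppose the drive holding data[1] fails: XOR the survivors with the parity.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same function serves for both computing parity and rebuilding a lost strip, because XOR-ing a value twice cancels it out.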
20RAID 4
- Stripes data across all disks except the parity
disk at the block level
- Parity information is stored on a dedicated disk
- Unlike RAID 3, data disks can be accessed
independently so that specific data elements can
be read or written on a single disk without read
or write of an entire stripe
21RAID 5
- RAID 5 is a very versatile RAID implementation
- The difference between RAID 4 and RAID 5 is the
parity location.
- In RAID 4, parity is written to a dedicated drive,
while in RAID 5, parity is distributed across all
disks
- The distribution of parity in RAID 5 overcomes
the write bottleneck of a single parity drive.
- RAID 5 is preferred for messaging,
medium-performance media serving, and relational
database management system (RDBMS)
implementations in which database administrators
(DBAs) optimize data access
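The difference in parity placement can be sketched as follows. The rotation order used for RAID 5 here (starting at the last disk and moving backwards one disk per stripe) is an assumed layout for illustration; real controllers use their own rotation schemes. The function names are hypothetical.

```python
def raid4_parity_disk(stripe: int, num_disks: int) -> int:
    """RAID 4: parity always lives on one dedicated disk (here, the last one)."""
    return num_disks - 1


def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """RAID 5: parity rotates across all disks, one step per stripe (assumed order)."""
    return (num_disks - 1) - (stripe % num_disks)


for stripe in range(5):
    print(stripe, raid4_parity_disk(stripe, 4), raid5_parity_disk(stripe, 4))
```

With rotation, parity updates for different stripes land on different drives, so no single drive absorbs every parity write.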
23RAID 6
- RAID 6 works the same way as RAID 5 except that
RAID 6 includes a second parity element
- This enables survival in the event of the failure
of two disks in a RAID group.
- RAID 6 protects against two disk failures by
maintaining two parities
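The capacity cost of each RAID level described above can be compared with a short calculation. This is a sketch under the assumption that all disks in the set are the same size; the function name usable_capacity_gb and the sample sizes are illustrative.

```python
def usable_capacity_gb(level: str, num_disks: int, disk_gb: int) -> int:
    """Usable capacity of a RAID set built from equally sized disks."""
    data_disks = {
        "0": num_disks,        # striping only, no redundancy
        "1": num_disks // 2,   # every disk has a mirror
        "10": num_disks // 2,  # striped mirrors
        "3": num_disks - 1,    # one dedicated parity disk
        "4": num_disks - 1,    # one dedicated parity disk
        "5": num_disks - 1,    # one disk's worth of distributed parity
        "6": num_disks - 2,    # two disks' worth of parity
    }[level]
    return data_disks * disk_gb


for level, disks in (("0", 4), ("1", 2), ("10", 4), ("5", 5), ("6", 5)):
    print(f"RAID {level} with {disks} x 300 GB:", usable_capacity_gb(level, disks, 300), "GB")
```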
24Hot Spare
- A hot spare refers to a spare HDD in a RAID array
that temporarily replaces a failed HDD of a RAID
set.
- When the failed HDD is replaced with a new HDD,
one of two things happens:
- The hot spare permanently takes the place of the
failed HDD, so it is no longer a hot spare and a
new hot spare must be configured on the array, or
- Data from the hot spare is copied to the new HDD,
and the hot spare returns to its idle state,
ready to replace the next failed drive.
- A hot spare should be large enough to accommodate
data from a failed drive.
- Some systems implement multiple hot spares to
improve data availability.
- A hot spare can be configured as automatic or
user initiated, which specifies how it will be
used in the event of disk failure
26What is an Intelligent Storage System
- Intelligent Storage Systems are RAID arrays that
- Are highly optimized for I/O processing
- Have large amounts of cache for improving I/O
performance
- Have operating environments that provide
- Intelligence for managing cache
- Array resource allocation
- Connectivity for heterogeneous hosts
- Advanced array based local and remote
replication options
27Components of an Intelligent Storage System
- An intelligent storage system consists of four
key components: front end, cache, back end, and
physical disks.
28Components of an Intelligent Storage System
- The front end provides the interface between the
storage system and the host.
- It consists of two components: front-end ports
and front-end controllers
- The front-end ports enable hosts to connect to
the intelligent storage system. Each port has
processing logic that executes the appropriate
transport protocol, such as SCSI, Fibre Channel,
or iSCSI, for storage connections
- Front-end controllers route data to and from
cache via the internal data bus. When cache
receives write data, the controller sends an
acknowledgment back to the host
29Components of an Intelligent Storage System
- Controllers optimize I/O processing by using
command queuing algorithms
- Command queuing is a technique implemented on
front-end controllers
- It determines the execution order of received
commands and can reduce unnecessary drive head
movements and improve disk performance
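The effect of command queuing on head movement can be sketched with a simple reordering. The slide does not say which algorithm a controller uses; the version below assumes an elevator-style sweep (serve everything ahead of the head in ascending order, then everything behind it), and the names total_seek and reorder are hypothetical.

```python
def total_seek(start: int, order: list[int]) -> int:
    """Total head travel, in blocks, when requests are served in the given order."""
    position, travelled = start, 0
    for lba in order:
        travelled += abs(lba - position)
        position = lba
    return travelled


def reorder(start: int, pending: list[int]) -> list[int]:
    """Serve requests at or ahead of the head first, then sweep back (assumed policy)."""
    ahead = sorted(lba for lba in pending if lba >= start)
    behind = sorted((lba for lba in pending if lba < start), reverse=True)
    return ahead + behind


pending = [90, 10, 70, 20]   # block addresses in arrival order
print("arrival order:", total_seek(50, pending))
print("reordered:    ", total_seek(50, reorder(50, pending)))
```

In this small example the reordered queue covers roughly half the distance of the arrival order, which is the kind of saving command queuing aims for.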
30Intelligent Storage System Cache
- Cache is an important component that enhances the
I/O performance in an intelligent storage system.
- Cache improves storage system performance by
isolating hosts from the mechanical delays
associated with physical disks, which are the
slowest components of an intelligent storage
system.
- Accessing data from a physical disk
usually takes a few milliseconds
- Accessing data from cache takes less than a
millisecond.
- Write data is placed in cache and
then written to disk
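The write path described above (place the data in cache, acknowledge, write to disk later) can be sketched as a tiny write-back cache. This is a toy model under simplifying assumptions: dictionaries stand in for cache memory and disk, and the class and method names are invented for illustration.

```python
class WriteBackCache:
    """Toy write-back cache: writes are acknowledged once they are in cache."""

    def __init__(self) -> None:
        self.cache: dict[int, bytes] = {}
        self.dirty: set[int] = set()      # blocks in cache that are not yet on disk
        self.disk: dict[int, bytes] = {}  # stand-in for the physical disks

    def write(self, block: int, data: bytes) -> str:
        self.cache[block] = data
        self.dirty.add(block)
        return "ack"                      # host is released before the disk write

    def read(self, block: int) -> bytes:
        if block in self.cache:           # cache hit: no mechanical delay
            return self.cache[block]
        data = self.disk[block]           # cache miss: fetch from disk, keep a copy
        self.cache[block] = data
        return data

    def flush(self) -> None:
        """Write all uncommitted blocks to disk."""
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
        self.dirty.clear()


cache = WriteBackCache()
print(cache.write(7, b"payload"), cache.disk)   # acknowledged, disk still empty
cache.flush()
print(cache.disk)
```

The window between the acknowledgment and the flush is exactly when uncommitted data is at risk, which is why cache protection matters.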
31Cache Data Protection
- Cache mirroring: Each write to cache is held in
two different memory locations on two independent
memory cards
- Cache vaulting: Cache is exposed to the risk of
uncommitted data loss due to power failure
- One remedy is using battery power to write the
cache content to the disk; storage vendors use a
set of physical disks to dump the contents of
cache during power failure
32Intelligent Storage System Back End
- It consists of two components: back-end ports and
back-end controllers
- Physical disks are connected to ports on the back
end.
- The back-end controller communicates with the
disks when performing reads and writes and also
provides additional, but limited, temporary data
storage.
- The algorithms implemented on back-end
controllers provide error detection and
correction, along with RAID functionality.
- Multiple controllers also facilitate load
balancing
33Intelligent Storage System Physical Disks
- Disks are connected to the back-end with either
SCSI or a Fibre Channel interface
34What Are LUNs
- Physical drives or groups of RAID protected
drives can be logically split into volumes known
as logical volumes, commonly referred to as
Logical Unit Numbers (LUNs)
35High-end Storage Systems
- High-end storage systems, referred to as
active-active arrays, are generally aimed at
large enterprises for centralizing corporate data
- These arrays are designed with a large number of
controllers and cache memory
- An active-active array implies that the host can
perform I/Os to its LUNs across any of the
available paths
36Midrange Storage Systems
- Also referred to as active-passive arrays
- Host can perform I/Os to LUNs only through active
paths
- Other paths remain passive until the active path fails
- Midrange arrays have two controllers, each with
cache, RAID controllers and disk drive
interfaces
- Designed for small and medium enterprises
- Less scalable as compared to high-end arrays
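The difference between active-active and active-passive path behaviour can be sketched as a path-selection function. The dictionary fields (name, state, healthy) and the rule of promoting a single passive path on failure are assumptions for illustration, not a description of any particular multipathing product.

```python
def paths_for_io(paths: list[dict], array_type: str) -> list[str]:
    """Return the names of the paths a host may use for I/O to a LUN."""
    healthy = [p for p in paths if p["healthy"]]
    if array_type == "active-active":
        # Any available path may carry I/O.
        return [p["name"] for p in healthy]
    # Active-passive: use active paths only; fall back to a passive path on failure.
    active = [p["name"] for p in healthy if p["state"] == "active"]
    if active:
        return active
    passive = [p["name"] for p in healthy if p["state"] == "passive"]
    return passive[:1]


paths = [
    {"name": "A", "state": "active", "healthy": True},
    {"name": "B", "state": "passive", "healthy": True},
]
print(paths_for_io(paths, "active-active"))
print(paths_for_io(paths, "active-passive"))
```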
37CLARiiON Whiteboard Video
38DAS
39DAS
Direct-Attached Storage (DAS)
- storage connects directly to servers
- applications access data from DAS using
block-level access protocols
- Examples
- internal HDD of a host,
- tape libraries, and
- directly connected external HDD
40DAS
Direct-Attached Storage (DAS)
- DAS is classified as internal or external, based
on the location of the storage device with
respect to the host.
- Internal DAS: storage device internally
connected to the host by a serial or parallel bus
- distance limitations for high-speed connectivity
- can support only a limited number of devices,
and
- occupy a large amount of space inside the host
41DAS
Direct-Attached Storage (DAS)
- External DAS: server connects directly to the
external storage device
- usually communication via SCSI or FC protocol.
- overcomes the distance and device count
limitations of internal DAS, and
- provides centralized management of storage
devices.
42DAS Benefits
- Ideal for local data provisioning
- Quick deployment for small environments
- Simple to deploy
- Reliability
- Low capital expense
- Low complexity
43DAS Connectivity Options
- host-to-storage device communication via
protocols
- ATA/IDE and SATA: Primarily for internal bus
- SCSI
- Parallel (primarily for internal bus)
- Serial (external bus)
- FC: High-speed network technology
44DAS Connectivity Options
- protocols are implemented on the HDD controller
- a storage device is also known by the name of
the protocol it supports
45DAS Management
- LUN creation, filesystem layout, and data
addressing
- Internal: Host (or 3rd party software) provides
- Disk partitioning (Volume management)
- File system layout
46DAS Management
- External
- Array based management
- Lower TCO for managing data and storage
Infrastructure
47DAS Challenges
- limited scalability
- Number of connectivity ports to hosts
- Number of addressable disks
- Distance limitations
- For internal DAS, maintenance requires downtime
- Limited ability to share resources (unused
resources cannot be easily re-allocated)
- Array front-end port, storage space
- Resulting in islands of over- and
under-utilized storage pools
48Introduction to SCSI
- SCSI-3 is the latest version of SCSI
49SCSI Architecture
Primary commands common to all devices
50SCSI Architecture
Standard rules for device communication and
information sharing
51SCSI Architecture
Interface details such as electrical signaling
methods and data transfer modes
52SCSI Device Model
- SCSI initiator device
- Issues commands to SCSI target devices
- Example SCSI host adaptor
53SCSI Device Model
- SCSI target device
- Executes commands issued by initiators
- Examples SCSI peripheral devices
54SCSI Device Model
- Device requests contain
- Command Descriptor Block (CDB)
55SCSI Device Model
- CDB structure
- 8 bit structure
- defines the command to be executed
- contains operation code, command specific
parameter and control parameter
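As a concrete instance of the CDB fields listed above, the sketch below packs a 10-byte READ(10) command: operation code 0x28, a flags byte, a 4-byte logical block address, a group byte, a 2-byte transfer length, and a control byte. The function name build_read10_cdb is illustrative, and the flags, group, and control bytes are simply left at zero here.

```python
import struct


def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Pack a READ(10) Command Descriptor Block (big-endian fields)."""
    opcode = 0x28    # operation code for READ(10)
    flags = 0        # command-specific flag bits, unused in this sketch
    group = 0        # group number byte
    control = 0      # control byte
    return struct.pack(">BBIBHB", opcode, flags, lba, group, num_blocks, control)


cdb = build_read10_cdb(lba=2048, num_blocks=8)
print(len(cdb), cdb.hex())
```

An initiator would hand bytes like these to the target, which decodes the operation code and parameters and executes the read.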
56SCSI Addressing
Initiator ID: a number from 0 to 15 with the most
common value being 7
57SCSI Addressing
Target ID: a number from 0 to 15
58SCSI Addressing
LUN: a number that specifies a device addressable
through a target
59SCSI Addressing Example
controller
device
target
60Areas Where DAS Fails
- Just-in-time information to business users
- Integration of information infrastructure with
business processes
- Flexible and resilient storage architecture
61The Solution?
- Storage Networking
- FC SAN
- NAS
- IP SAN
62What is a SAN?
- Dedicated high speed network of servers and
shared storage devices
- Provide block level data access
63What is a SAN?
- Resource Consolidation: Centralized storage
and management
- Scalability
- Theoretical limit: approximately 15 million devices
- Secure Access
64Fibre Channel
Latest FC implementations support 8Gb/s
65Fibre Channel
a high-speed network technology that runs on
high-speed optical fiber cables (for front-end
SAN connectivity)
66Fibre Channel
and serial copper cables (for back-end disk
connectivity)
67FC SAN Evolution
68Components of SAN
- three basic components
- servers,
- network infrastructure, and
- storage,
- can be further broken down into the following
key elements
- node ports,
- cabling,
- interconnecting devices (such as FC switches or
hubs),
- storage arrays, and
- SAN management software
69Components of SAN Node ports
- Examples of nodes
- Hosts, storage and tape library
- Ports are available on
- HBA in host
- Front-end adapters in storage
- Each port has transmit (Tx) link and receive
(Rx) link
- HBAs perform low-level interface functions
automatically to minimize impact on host
performance
70Components of SAN Cabling
- Copper cables for short distance
- Optical fiber cables for long distance
- Single-mode
- Can carry single beams of light
- Distance up to 10 km
- Multi-mode
- Can carry multiple beams of light simultaneously
- Distance up to 500 meters
71Components of SAN Cabling
72Components of SAN Cabling (connectors)
- Node Connectors
- SC Duplex Connectors
- LC Duplex Connectors
- Patch panel Connectors
- ST Simplex Connectors
73Components of SAN Interconnecting devices
Hubs, Switches, and Directors
74Components of SAN Storage array
- storage consolidation and centralization
- provides
- High Availability/Redundancy
- Performance
- Business Continuity
- Multiple host connect
75Components of SAN SAN management software
- A suite of tools used in a SAN to manage the
interface between host and storage arrays
- Provides integrated management of SAN
environment
- Web based GUI or CLI
76SAN Interconnectivity Options FC-AL
Fibre Channel Arbitrated Loop (FC-AL)
- Devices must arbitrate to gain control
- Devices are connected via hubs
- Supports up to 127 devices
77SAN Interconnectivity Options FC-SW
Fabric connect (FC-SW)
- Dedicated bandwidth between devices
- Supports up to 15 million devices
- Higher availability than hubs
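The "15 million devices" figure for a switched fabric can be reproduced with a short calculation. It assumes the usual 24-bit fabric address split into domain, area, and port bytes, with 239 domain IDs usable for switches; these details come from general Fibre Channel addressing rather than from the slide, and the constant names are illustrative.

```python
USABLE_DOMAIN_IDS = 239   # switch domain IDs available in a fabric (assumed)
AREAS_PER_DOMAIN = 256    # one byte of area ID
PORTS_PER_AREA = 256      # one byte of port ID

fabric_addresses = USABLE_DOMAIN_IDS * AREAS_PER_DOMAIN * PORTS_PER_AREA
print(f"{fabric_addresses:,} addressable ports")
```

The result is a little under 2 to the power of 24, which matches the approximate theoretical limit quoted for FC-SW.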
78Network-Attached Storage
79Think "File Sharing"
80Sharing Files
81Sharing Files
822.2 GB
834 GB
84Sharing Files
86Sharing Files
87Sharing Files
88What is NAS?
89What is NAS?
- IP-based file sharing device attached to LAN
- Server consolidation
- File-level data access and sharing
90Why NAS?
dedicated to file-serving
91Benefits of NAS
- Support comprehensive access to information
- Improves efficiency and flexibility
- Centralizes storage
- Simplifies management
- Scalability
- High availability through native clustering
- Provides security integration to environment
(user authentication and authorization)
92CPU and Memory
NICs
file sharing protocols
IP network
NAS OS
storage protocols (ATA, SCSI, or FC)
94- Benefits
- Increases performance throughput (service level)
to end users
- Minimizes investment in additional servers
- Provides storage pooling
- Provides heterogeneous file serving
- Uses existing infrastructure, tools, and processes
96- Benefits
- Provides continuous availability to files
- Heterogeneous file sharing
- Reduces cost for additional OS dependent servers
- Adds storage capacity non-disruptively
- Consolidates storage management
- Lowers Total Cost of Ownership
97IP SAN
98Celerra Whiteboard Video
99Driver for IP SAN
- In FC SAN, transfer of block-level data takes
place over Fibre Channel
- Emerging technologies provide for the transfer of
block-level data over an existing IP network
infrastructure
100Why IP?
- Easier management
- Existing network infrastructure can be leveraged
- Reduced cost compared to new SAN hardware and
software
- Supports multi-vendor interoperability
- Many long-distance disaster recovery solutions
already leverage IP-based networks
- Many robust and mature security options are
available for IP networks
101Block Storage over IP - iSCSI
- SCSI over IP
- IP encapsulation
- Ethernet NIC card
- iSCSI HBA
- Hardware-based gateway to Fibre Channel storage
- Used to connect servers
102Block Storage over IP - FCIP
- Fibre Channel-to-IP bridge / tunnel (point to
point)
- Fibre Channel end points
- Used in DR implementations
103iSCSI?
- IP based protocol used to connect host and
storage
- Carries block-level data over IP-based network
- Encapsulates SCSI commands and transports them as
TCP/IP packets
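The encapsulation idea can be sketched with a deliberately simplified framing: a SCSI CDB is wrapped in a small header, and the resulting bytes are what would travel as the payload of a TCP connection. The 2-byte header used here (LUN, CDB length) is a toy format invented for this example; a real iSCSI PDU starts with a 48-byte basic header segment and carries many more fields.

```python
import struct


def encapsulate(cdb: bytes, lun: int) -> bytes:
    """Wrap a SCSI CDB in a toy header: 1-byte LUN, 1-byte CDB length."""
    return struct.pack(">BB", lun, len(cdb)) + cdb


def decapsulate(payload: bytes) -> tuple[int, bytes]:
    """Recover (LUN, CDB) from the toy framing on the target side."""
    lun, cdb_len = struct.unpack(">BB", payload[:2])
    return lun, payload[2:2 + cdb_len]


# A 10-byte READ(10) CDB: opcode 0x28, LBA 2048, 8 blocks.
cdb = struct.pack(">BBIBHB", 0x28, 0, 2048, 0, 8, 0)
payload = encapsulate(cdb, lun=3)     # bytes an initiator would send over TCP/IP
print(decapsulate(payload) == (3, cdb))
```

The point of the sketch is the layering: the SCSI command is unchanged, and only the transport wrapped around it is IP based.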
104Components of iSCSI
- iSCSI host initiators
- Host computer using a NIC or iSCSI HBA to
connect to storage
- iSCSI initiator software may need to be
installed
- iSCSI targets
- Storage array with embedded iSCSI capable
network port
- FC-iSCSI bridge
- LAN for IP storage network
- Interconnected Ethernet switches and/or routers
105- No FC components
- Each iSCSI port on the array is configured with
an IP address and port number
- iSCSI initiators connect directly to the array
106- Bridge device translates iSCSI/IP to FCP
- Standalone device
- Integrated into FC switch (multi-protocol
router)
- iSCSI initiator/host configured with bridge as
target
- Bridge generates virtual FC initiator
107- Array provides FC and iSCSI connectivity natively
- No bridge devices needed
108FCIP (Fibre Channel over IP)?
- FCIP is an IP-based storage networking technology
- Combines advantages of Fibre Channel and IP
- Creates virtual FC links that connect devices in
a different fabric
- FCIP is a distance extension solution
- Used for data sharing over geographically
dispersed SAN
109FCIP (Fibre Channel over IP)?
110FCoE Whiteboard Video
111Question 1
What was EMC's revenue in 2009?
Ask the Audience
112EMC Corporation: 2009 At a Glance
Revenues: $14 billion
Net Income: $1.9 billion
Employees: 41,500
Countries where EMC does business: >80
R&D Investment: $1.5 billion
Operating Cash Flow: $3.3 billion
Free Cash Flow: $2.6 billion
Founded: 1979
113IDC Digital Universe Study
114Question 2
How much digital information was created
worldwide in 2009?
Ask the Audience
115The Digital Universe 2009-2020
Growing by a Factor of 44
2009: 0.8 ZB
One Zettabyte (ZB) = 1 trillion gigabytes
Source: IDC Digital Universe Study, sponsored by
EMC, May 2010
1161.2 ZB in 2010 is Equal to . . .
75 Billion Fully Loaded 16GB iPads
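The iPad comparison follows directly from the definition of a zettabyte given above, as this short calculation shows. The 2020 figure is derived here from the 2009 value and the growth factor of 44; it is not quoted from the slide. Variable names are illustrative.

```python
GB_PER_ZB = 10 ** 12                       # one zettabyte is 1 trillion gigabytes

universe_2010_gb = 12 * GB_PER_ZB // 10    # 1.2 ZB expressed in gigabytes
ipads = universe_2010_gb // 16             # number of fully loaded 16 GB iPads
print(f"{ipads:,} iPads")

universe_2020_zb = 8 * 44 / 10             # 0.8 ZB in 2009 grown by a factor of 44
print(universe_2020_zb, "ZB projected for 2020")
```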
117What is Driving the Digital Explosion?
Web 2.0 Applications
Ubiquitous Content-Generating Devices
3G/4G
Secure Collaboration
Longer Data Retention Periods
SEC 17a-4
Freedom of Information Act
HIPAA
Sarbanes-Oxley
Regulation Landscape
118Question 3
What percentage of the 0.8 zettabytes of digital
information is created by individuals?
Ask the Audience
119The Digital Information World
Individuals create data; companies manage it!
Create
Manage
Source: IDC Digital Universe Study, sponsored by
EMC, May 2010
120Question 4
How much storage capacity was available on the
first Symmetrix 4200 that EMC shipped in 1990?
Ask the Audience
121EMC's Tiered Storage Platforms
Broadest Range of Function, Performance, and
Connectivity
[Figure: EMC platforms and connectivity options (iSCSI, IP, FICON, SAN, NAS, CAS, Fibre Channel), showing the ADIC Scalar family, EMC Centera, EMC Centera 4-Node, Symmetrix, DMX-3, DMX-3 950, Invista, and Connectrix]
Drive options shown: low-cost Fibre Channel 500 GB 7,200 rpm; SATA 250 GB and 500 GB 7,200 rpm; Fibre Channel 73 GB and 146 GB 10k/15k rpm; Fibre Channel 300 GB 10k rpm
1990: Symmetrix 4200 Integrated Cached Disk Array
introduced with a capacity of 24 gigabytes.
2009: Symmetrix V-Max Systems are available with up to
2 petabytes of usable storage in a single system.
122Managing Information Storage: Trends, Challenges
and Options
123Question 6
What is the number 1 challenge identified by IT
and storage managers?
Ask the Audience
124Digital Information Storage Challenges
Most important activities/constraints identified
as challenges by IT/storage managers
- Managing Storage Growth
- Designing, deploying, and managing backup and
recovery
- Designing, deploying, and managing storage in a
virtualized server environment
- Designing, deploying, and managing disaster
recovery solutions
- Storage consolidation
- Making informed strategic / big-picture decisions
- Integrating storage in application environments
(such as Oracle, Exchange, etc.)
- Designing and deploying multi-site environments
- Lack of skilled storage professionals
Managing Information Storage: Trends, Challenges
and Options 2010-2011
Source: Input from over 1,450 storage
professionals worldwide, http://education.EMC.com/ManagingStorage/
125Building an Effective Storage Mgmt Organization
Hire an additional 22 storage professionals . .
.
Based on EMC study: Managing Information
Storage: Trends, Challenges and Options
(2010-2011), www.emc.com/managingstorage
126Where Managers Plan to Find Storage Expertise
Based on EMC study: Managing Information
Storage: Trends, Challenges and Options
(2010-2011), www.emc.com/managingstorage
127Top IT Certifications by Salary
Source: Certification Magazine, December 2009
128Storage Role Across IT Disciplines
- Leverage the functionalities of storage
technology products to:
- Systems Architects/Administrators
- Maximize performance, increase availability, and
avoid costly server upgrades.
- Network Administrators
- Maximize performance of your network and help
you plan in advance.
- Database Administrators
- Maximize performance, increase availability, and
realize faster recoverability of your database.
- Application Architect
- Increase the performance and availability of your
application
- IT Project Managers
- Plan and execute your IT projects, which involve or
are impacted by storage technology components
129EMC Academic Alliance
130Key Pillars of IT
Over the last 20 years, the business IT perspective
on the data center has focused on 4 pillars of
Information Technology: operating systems,
databases, networking, and software application
development. Based on today's IT infrastructure,
Information Storage is the 5th pillar of IT!
131Question 7
What is the name of the EMC-authored book that
was released in May 2009?
Ask the Audience
132Information Storage and Management (ISM)
http://education.EMC.com/ismbook
133Information Storage and Mgmt (ISM)
- Section 1. Storage System
KEY CONCEPT COVERAGE
Data and Information
Structured and Unstructured Data
Storage Technology Architectures
Core Elements of a Data Center
Information Management
Information Lifecycle Management
Host, Connectivity, and Storage
Block-Level and File Level Access
File System and Volume Manager
Storage Media and Devices
Disk Components
Zoned Bit Recording
Logical Block Addressing
Little's Law and the Utilization Law
Hardware and Software RAID
Striping, Mirroring, and Parity
RAID Write Penalty
Hot Spares
Intelligent Storage System
Front-End Command Queuing
Cache Mirroring and Vaulting
Logical Unit Number (LUN)
LUN Masking
High-end Storage System
Midrange Storage System
134Information Storage and Mgmt (ISM)
- Section 2. Storage Networking Technologies and
Virtualization
KEY CONCEPT COVERAGE
Storage Consolidation
Fibre Channel (FC) Architecture
Fibre Channel Protocol Stack
Fibre Channel Ports
Fibre Channel Addressing
World Wide Names (WWN)
Zoning
Fibre Channel Topologies
NAS Device
Remote File Sharing
NAS Connectivity and Protocols
NAS Performance and Availability
MTU and Jumbo Frames
Fixed Content and Archives
Single-Instance Storage
Object Storage and Retrieval
Content Authenticity
Internal and External DAS
SCSI Architecture
SCSI Addressing
iSCSI Protocol
Native and Bridged iSCSI
FCIP Protocol
Memory Virtualization
Network Virtualization
Server Virtualization
Storage Virtualization
In-Band and Out-of-Band Implementations
Block-Level and File Level Virtualization
135Information Storage and Mgmt (ISM)
- Section 3. Business Continuity
KEY CONCEPT COVERAGE
Synchronous and Asynchronous Replication
LVM-Based Replication
Host-Based Log Shipping
Disk-Buffered Replication
Three-Site Replication
Data Consistency
Business Continuity
Information Availability
Disaster Recovery
BC Planning
Business Impact Analysis
Operational Backup
Archival
Retention Period
Bare-Metal Recovery
Backup Architecture
Backup Topologies
Virtual Tape Library
Data Consistency
Host-Based Local Replication
Array-Based Local Replication
Copy on First Access (CoFA)
Copy on First Write (CoFW)
Restore and Restart
136Information Storage and Mgmt (ISM)
- Section 4. Storage Security and Management
KEY CONCEPT COVERAGE
Alerts
Management Platform Standards
Internal Chargeback
Storage Security Framework
The Risk Triad
Security Domain
Infrastructure Right Management
Access Control
and in the Cloud
137EMC Academic Alliance
Developing tomorrow's Information Storage
Professionals today!
- Partnering with leading Institutes of Higher
Education worldwide to bridge the storage
knowledge gap in Industry
- Providing EMC, Customers and Partners with a source
to hire storage educated graduates
- Hundreds of institutions globally, educating
thousands of students
- Offering unique open course on Information
Storage and Management
- Focus on concepts and principles
- Opportunity for EMC to give back as the industry
leader
- For the latest list of participating institutions
and to introduce us to your Alma Mater, visit
- http://education.EMC.com/academicalliance
138Becoming an Academic Partner: Required Steps . . .
- Institution enrolls via the EAA online
application.
- http://info.emc.com/mk/get/EAA_APPL_form?srcHBX_Account_Numberemc-emccom
- Institution identifies faculty to teach course
and administer the program.
- Institution identifies faculty to attend the 5
day ISM Faculty Readiness Seminar (FRS) and clear
ISM certification exam.
- Institution accesses secure Faculty website to
download teaching aids such as chapter
PowerPoints, quizzes, simulators, etc.
- Institution promotes ISM course to students.
- Institution schedules and begins teaching the ISM
course.
139Summary
- Information storage is one of the fastest growing
sectors within IT.
- Information growth and complexity create
challenges and career opportunities
- Business and industry are looking for IT
professionals who know all 5 pillars.
- Those who obtain the skills through formal
education and industry qualification have an
advantage.