Storage Network Designs for OLTP Business Continuity

About This Presentation

Title:

Storage Network Designs for OLTP Business Continuity

Description:

Title: Storage Management 2003
Last modified by: TSousa
Created Date: 2/5/2002 6:20:53 PM
Document presentation format: On-screen Show
Company: TechTarget

Slides: 65

Transcript and Presenter's Notes

1
Storage Network Designs for OLTP Business Continuity

Marc Farley
President, Building Storage Networks, Inc.

2
Agenda

The Vendor Neutral Approach
Overview of OLTP High Availability
I/O Redundancy Methods
Storage Network Technologies
Storage Networking for HA OLTP

3
Vendor Neutral Approach

Generic terms, not vendor terms
Assumed basic knowledge of SAN, NAS, RAID

4
And now, for something completely different..
5
OLTP Environments

Mission critical business applications
Business in real-time
Expensive equipment and software
Aggressive performance objectives
Highly skilled IT staff
Hands-on computing operations

6
OLTP Database Software

Oracle
8i Oracle Parallel Server (OPS)
9i Real Application Clusters (RAC)
IBM
DB2 UDB
Informix
MS SQL Server
Sybase, MySQL, others

7
OLTP OS Platforms

IBM S/390 MVS
Unix Systems
Windows 2000
HA Linux

8
OLTP Requirements

99.999% uptime
Non-degrading response time
High transaction rates
Seamless scalability
Cost relief
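
To put the uptime target in perspective, here is a small arithmetic sketch of the downtime budget that an availability percentage leaves per year. The function name and the 365-day year are assumptions made for illustration, not part of the original slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        print(f"{target}% uptime allows {allowed_downtime_minutes(target):.2f} minutes/year")
```

At 99.999% the budget works out to a little over five minutes per year, which is why the later slides insist on redundancy at every layer.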

9
Database Storage Approaches

Raw partitions
Bypass OS I/O buffering
File system
Facilitates data management
NFS mounted
Offloads the DB server (e.g., NetApp with Oracle)

10
ACID Properties of OLTP

Atomicity: No partial transactions
Consistency: All tables are in a consistent state before and after a completed transaction
Isolation: One transaction cannot contaminate other transactions
Durability: Transactions are complete only when the database updates are written to disk storage

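
The durability rule above can be illustrated with a minimal sketch: a commit is only reported once the record has been pushed through the OS buffers to the storage device. The function name commit_record and the log file layout are hypothetical; real databases use their own log formats and I/O paths.

```python
import os


def commit_record(log_path: str, record: bytes) -> None:
    """Append a record and return only after it has been flushed to stable storage."""
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(record)
        while view:                      # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # block until the OS reports the data is on disk
    finally:
        os.close(fd)
```

Without the fsync call the write could sit in host memory, and a crash would lose a transaction that the application believed was complete.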
11
Challenges of OLTP

Major systems integration effort
Intricate tuning and monitoring
Little tolerance for errors
Complex data structures and relationships
Time and sequence-sensitive processes
Must be adhered to for data integrity
Shifting workloads and bottlenecks

12
OLTP Database Files

Data files
Database data, tablespaces
Redo log files, archive log files
Reconstruct or rollback transactions
Control files
File layout information

13
OLTP Table Space Storage

Use many spindles to distribute hot spots
RAID 0+1 recommended
File system recommended over raw partitions
Easier data management

14
Striping for Performance
[Diagram: a RAID controller (microsecond performance) striping across six disk drives (millisecond performance, from rotational latency and seek time)]
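
As a rough sketch of how striping spreads hot spots over spindles, the function below maps a logical block to a drive and an offset on that drive. The function name, the six-drive layout and the stripe unit size are illustrative assumptions; real controllers use their own layouts.

```python
def locate_block(logical_block: int, num_drives: int, blocks_per_stripe_unit: int):
    """Map a logical block number to (drive index, block offset on that drive)."""
    stripe_unit = logical_block // blocks_per_stripe_unit   # which stripe unit overall
    drive = stripe_unit % num_drives                        # units rotate across drives
    unit_on_drive = stripe_unit // num_drives               # how many units precede it on that drive
    offset = unit_on_drive * blocks_per_stripe_unit + logical_block % blocks_per_stripe_unit
    return drive, offset


if __name__ == "__main__":
    for block in (0, 4, 8, 24):
        print(block, locate_block(block, num_drives=6, blocks_per_stripe_unit=4))
```

Consecutive stripe units land on different drives, so a burst of I/O against neighbouring blocks is served by several drives in parallel instead of queuing on one.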
15
My Personal Favorite, RAID 0+1
[Diagram: a RAID controller with ten disk drives arranged as five mirrored pairs of striped members, numbered 1 to 5]
16
OLTP Redo Log Storage

Raw partitions recommended
Sequential high speed writes
Separate mirror pairs per log file group
Capacity for 30 to 60 minutes of data
Goal is to limit disk contention for current and
active log files
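
The 30 to 60 minute capacity guideline turns into a simple sizing calculation once a redo generation rate is known. The sketch below assumes a made-up rate of 2 MB/s purely for illustration; measure the actual rate on the system being sized.

```python
def redo_log_capacity_mb(redo_rate_mb_per_s: float, minutes: float) -> float:
    """Capacity needed to hold the given number of minutes of redo data."""
    return redo_rate_mb_per_s * 60 * minutes


if __name__ == "__main__":
    assumed_rate = 2.0  # MB/s, hypothetical figure
    for minutes in (30, 60):
        print(f"{minutes} min at {assumed_rate} MB/s: {redo_log_capacity_mb(assumed_rate, minutes):.0f} MB")
```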

17
OLTP Archive Log Storage

File system or NFS mounting is required
NFS mounting is recommended
Mirroring or RAID
Goal is to have easy access in case they are
needed for reconstruction

18
High Availability

The ability for a system or application to
immediately continue its mission after loss or
damage to system components, systems, facilities
and data

19
Availability Threats

Expected
Scaling limitations
Processor
Storage capacity
Network
Consolidations
Product life cycles

Unexpected
Failures
Bugs
Virus
Operator errors
Disasters

20
HA Engages All Elements

Systems
Application
Network connections
Network services
Storage and I/O subsystems

21
Scoping the Risks
Scope         System                    Network                      Storage
Component     HBA                       Cable                        Disk drive
System        Server                    Switch                       Subsystem
Pathological  Virus attack on platform  Service provider outage      Environmental media loss
Site          Server rooms gutted       All external communications  Total data loss
22
Managing the Risks

Local copies of data
Immediate availability
Remote, nearby
Immediate availability to several hours
Remote, far away
One to several days availability

23
Disaster/Availability Radii
Local
Remote Nearby
Remote Far Away
24
Nobody Expects..

Weird things to happen to them
Disintegration of media
Underground flooding through tunnels
Fires in Telco switching centers

25
High Availability for OLTP

Duplication of functions
Without degrading performance
Without risking data integrity
Brute force techniques
Automation and efficiency
Cost is always an issue
And high availability DOES cost

26
A Long Time Ago in a Job Not So Far Away.
You must learn to be a master of redundancy if you are going to be a storage geek.
Remember Marc, there is only one concept:
REDUNDANCY!
Redundancy. Again!
Got it, Jim. Let's eat!
Whatever
27
Eventually, I Learned to Appreciate His Teachings

REDUNDANCY = NSPoF (No Single Point of Failure)

Don't get the giant spicy Polish for lunch; it's too much for the digestion
28
OLTP HA Requires Complete Redundancy Protection

Client network
Server systems and components
Application modules
I/O Channels and Networks
Storage subsystems and components
Data

29
A Quick Look At Clustered Storage
Shared Everything: Both servers share control of a common storage address space
Shared Nothing: Each server controls its own storage address space
30
Examples of OLTP Clusters
Microsoft SQL Server: Data is exchanged between servers; failover paths only
Oracle 9i RAC: Data is accessed directly from storage
31
One more time, with subsystems
Microsoft SQL Server: Same subsystem but different address spaces
Oracle 9i RAC: All storage is shared by all cluster nodes
32
I/O Redundancy

Host to subsystem
Mirroring: Host to independent targets
Multi-pathing: Host to a single target
Subsystem to subsystem
Store and forward
Local
Remote

33
Disk Mirroring: Redundant Storage Targets
Independent, identically sized storage address spaces
One controller
Two controllers
34
Disk Mirroring: I/Os to 2 Targets

Brute force redundancy: fast and simple
Both read and write I/Os
Overlapped reads for performance
Local connections
Limited capacity
I/O bottlenecks for random I/O activity if targets are disk drives
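
The mirroring behaviour on this slide can be sketched in a few lines: every write goes to both targets, and reads alternate between them so they can overlap. The class name MirroredVolume and the use of in-memory dictionaries as stand-ins for disk targets are assumptions for illustration only.

```python
from itertools import cycle


class MirroredVolume:
    """Toy host-based mirror over two independent targets (dicts stand in for disks)."""

    def __init__(self):
        self.targets = [{}, {}]
        self._next_reader = cycle(range(len(self.targets)))

    def write(self, block: int, data: bytes) -> None:
        # Brute force redundancy: the same write is issued to every target.
        for target in self.targets:
            target[block] = data

    def read(self, block: int) -> bytes:
        # Alternate targets so consecutive reads can be serviced in parallel.
        return self.targets[next(self._next_reader)][block]


if __name__ == "__main__":
    volume = MirroredVolume()
    volume.write(7, b"redo entry")
    print(volume.read(7), volume.read(7))
```

Because each target holds a full copy, losing either one leaves the data readable from the other.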

35
Disk Mirroring for Redo Log Files

Log files are a common bottleneck
Use raw partitions
Redundancy is required
Mirroring is adequate
Use highest RPM with lowest seek times
Put on a separate channel from database I/O
Use separate mirrored pairs per group

36
Mirroring to Storage Subsystems
[Diagram: two controllers mirroring to two storage subsystems with independent, identically sized storage address spaces]
37
Mirroring to Subsystems

Targets are subsystems, not disks
Separate address spaces
Capacity scales to subsystem max
Double level redundancy
Mirroring plus RAID
Multiple disk spindles reduce I/O bottlenecks

38
Disk Mirroring Datafiles from Host to Storage Subsystems

Disk mirroring + subsystem RAID
Excellent capacity scaling
Adjacent and across campus/town
One subsystem outside site radius
Requires longer distance cabling
Reads and writes both transmitted

39
Multi-Pathing: Redundant Paths Between a Host and a Subsystem
[Diagram: a failed path (X) between the host and the application data volume]
Pathing software determines that a transmission error has occurred and switches to a redundant path
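
A minimal sketch of what pathing software does: try the active path, and on a transmission error switch to the next path to the same target. The names PathError, MultiPathDevice, failed_hba and healthy_path are hypothetical, and the paths are plain Python callables standing in for HBA/link/controller routes.

```python
class PathError(Exception):
    """Raised when an I/O cannot be completed over a given path."""


class MultiPathDevice:
    """Toy multi-pathing layer: several paths, one target."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.active = 0  # index of the path currently in use

    def submit(self, request: str) -> str:
        for _ in range(len(self.paths)):
            try:
                return self.paths[self.active](request)
            except PathError:
                # Fail over: make the next path the active one and retry.
                self.active = (self.active + 1) % len(self.paths)
        raise PathError("all paths to the target failed")


def failed_hba(request: str) -> str:
    raise PathError("timeout on first HBA")


def healthy_path(request: str) -> str:
    return f"completed {request}"


if __name__ == "__main__":
    device = MultiPathDevice([failed_hba, healthy_path])
    print(device.submit("write block 42"))
```

The application sees a completed I/O; only when every path fails is an error passed back to it.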
40
Multi-pathing vs Mirroring

Mirroring assumes independent, but similar
storage targets
Multi-pathing assumes multiple paths to the exact
same target
Mirroring can use a single HBA, multi-pathing
needs two HBAs

41
Path Failures
[Diagram: path from the host to the application data volume, with three failure points]
1. HBA problem
2. Link, switch or network problem
3. Subsystem controller problem
42
Transmission failures are recognized after SCSI timeouts are exceeded
The I/O is retried and eventually an error is passed back to the process that issued the I/O
43
Path Failover for OLTP I/O

Redundant path resources take over activities for
a failed path to sustain operations without
disrupting service or risking data integrity

44
Store and Forward
Independent, identically sized storage address spaces
[Diagram: a host and two storage subsystems, A and B]
45
Store and Forward: One Host I/O and Two Copies of Data

Only real option for remote copies
Does not forward read I/Os
Proprietary protocols and methods
Standards are emerging, e.g., FC/IP
First step to storage snapshots

46
Store and Forward Acknowledgements
[Diagram: asynchronous and synchronous acknowledgement flows between subsystems A and B]
47
Trade-offs with Acknowledgement Handling

Synchronous
Always preferred
Slowest performance
State of copy is precise
Asynchronous
Fastest performance
Least precise knowledge of copy status
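
The trade-off can be sketched as follows: in synchronous mode the host is acknowledged only after both copies are written, while in asynchronous mode the host is acknowledged immediately and the second copy catches up later. The class StoreAndForward and its dictionaries (subsystem_a, subsystem_b) are illustrative stand-ins, not any vendor's implementation.

```python
from collections import deque


class StoreAndForward:
    """Toy model of subsystem A forwarding writes to subsystem B."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.subsystem_a = {}
        self.subsystem_b = {}
        self.pending = deque()  # writes acknowledged to the host but not yet on B

    def write(self, block: int, data: bytes) -> str:
        self.subsystem_a[block] = data
        if self.synchronous:
            self.subsystem_b[block] = data      # host waits for the remote write
        else:
            self.pending.append((block, data))  # forwarded later
        return "ack"

    def forward_pending(self) -> None:
        while self.pending:
            block, data = self.pending.popleft()
            self.subsystem_b[block] = data


if __name__ == "__main__":
    link = StoreAndForward(synchronous=False)
    link.write(1, b"txn")
    print("B behind by", len(link.pending), "write(s)")
    link.forward_pending()
    print("B now holds", link.subsystem_b)
```

The length of the pending queue is exactly the "least precise knowledge of copy status": anything still queued when A is lost never reaches B.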

48
Store and Forward: Local and Remote Copies

Local, nearby copy techniques
Synchronous
Fiber optic cabling, optical/DWDM services
Remote, far away copy techniques
Asynchronous
ATM gateways, OC-12 or less, FC/IP
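
One reason synchronous copies stay nearby is propagation delay. The sketch below estimates the round-trip time added to every synchronous write, assuming roughly 5 microseconds per kilometre one way in optical fiber (light travels at about 200,000 km/s in glass). It ignores switch, gateway and protocol overheads, so real figures will be higher.

```python
FIBER_DELAY_US_PER_KM = 5.0  # approximate one-way propagation delay in fiber


def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay, in milliseconds, over the given distance."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM / 1000


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km} km: at least {round_trip_ms(km):.1f} ms added per synchronous write")
```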

49
Mirroring vs Synchronous Store and Forward for
Local Nearby Copies

Mirroring
Async I/O
Reads and writes
No snapshot tie-in
Uses more host slots
Least costly

Store and Forward
Async or Sync I/O
Writes only
Snapshot ready
May conserve host I/O slots
Most costly

50
Combining Mirroring with Store and Forward
Store and Forward Radius
Local
Nearby
Remote Far Away
Mirroring Radius
51
Data Redundancy for OLTP

Backup
Snapshots
Delta (log files)

52
Backup for OLTP

A whole subject unto itself
Disaster recovery primarily
Cold? Who can afford to do that anymore?
Hot: put DB in backup mode
Backup snapshot image of data

53
Subsystem Snapshots for OLTP
1. Flush host buffers (sync, sync)
2. Create snapshot
[Diagram: database server attached to disk storage subsystems A, B and C]
54
Logical Snapshots for OLTP
1. The address space is mapped
2. First updates
3. Second updates
Overwritten data locations are not returned to the free space pool. (Undelete)
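
The logical snapshot idea can be sketched with an address map: updates go to new locations, and the old locations are kept rather than freed, so an earlier map still resolves to the earlier data. The class SnapshotVolume and its fields are hypothetical names for illustration; this is one simple way to model the behaviour, not a description of any product.

```python
class SnapshotVolume:
    """Toy logical snapshot: new writes never overwrite old physical locations."""

    def __init__(self):
        self.physical = []   # append-only pool; old locations are never freed
        self.live_map = {}   # logical block -> index into self.physical
        self.snapshots = []  # saved copies of the address map

    def write(self, logical_block: int, data: bytes) -> None:
        self.physical.append(data)
        self.live_map[logical_block] = len(self.physical) - 1

    def create_snapshot(self) -> int:
        self.snapshots.append(dict(self.live_map))  # map the address space
        return len(self.snapshots) - 1

    def read(self, logical_block: int, snapshot_id=None) -> bytes:
        mapping = self.live_map if snapshot_id is None else self.snapshots[snapshot_id]
        return self.physical[mapping[logical_block]]


if __name__ == "__main__":
    volume = SnapshotVolume()
    volume.write(0, b"original")
    snap = volume.create_snapshot()
    volume.write(0, b"first update")
    volume.write(0, b"second update")
    print(volume.read(0), volume.read(0, snapshot_id=snap))
```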
55
Delta Redundancy with Log Files

Recording of all transaction activities
Roll forward, bring up to date
Roll Backward, go to known good state
Terrific tool for remote redundancy
Not HA
Process cannot have holes in it
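
Rolling forward and backward can be sketched with a log that records, for each change, the old and new value. The tuple layout (key, old value, new value) and the function names are assumptions for illustration; real redo and undo logs are far more elaborate.

```python
def roll_forward(state: dict, log: list) -> dict:
    """Apply every logged change in order to bring an older copy up to date."""
    for key, _old, new in log:
        state[key] = new
    return state


def roll_backward(state: dict, log: list) -> dict:
    """Undo logged changes in reverse order to return to a known good state."""
    for key, old, _new in reversed(log):
        if old is None:
            state.pop(key, None)  # the key did not exist before this change
        else:
            state[key] = old
    return state


if __name__ == "__main__":
    remote_copy = {"account": 100}
    log = [("account", 100, 80), ("account", 80, 50), ("audit", None, 1)]
    current = roll_forward(dict(remote_copy), log)
    print(current)
    print(roll_backward(dict(current), log))
```

Both directions replay the log entry by entry, which is why the process cannot have holes in it: a missing entry leaves every later change applied to the wrong starting value.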

56
Remote Redundancy w/ Log Files
d(x) = f(x) - f(x-1)
f(x): Current to Log File Switch Checkpoint
d(x): Latest Redo Log File
f(x-1): Previous Instance
57
And now, some thoughts from our sponsor..
How come I always end up doing all the work?
He never does anything except eat and sleep
Redundancy is a way of life
Managing Redundancy is Hard Work
58
SAN Considerations