Pareto's Law and IT Business Continuity: Solve 80% of Your Problems, Disaster-Proof the Mission Critical

Slides: 30

Provided by: elainep6

1
Pareto's Law and IT Business Continuity
Solve 80% of Your Problems: Disaster-Proof the
Mission-Critical Aspects of Your Infrastructure

AITP Meeting, April 22, 2008
IT Disaster Preparedness: Start Planning Yesterday
Presented by Bob Lamendola

2
Bad Things Do Happen

Natural disasters
Man-made disasters
Pandemics
System failures
Scheduled maintenance
Human error
Personnel events
Terrorism

3
Recent Examples
Northeast Blackout - August, 2003
Hurricane Katrina - August, 2005
MTA Strike, NYC - December, 2005
Avian Influenza - Ongoing
4
Recent Examples
Steam Pipe Explosion, NYC - July 18, 2007

September 11, 2001

5
Revealing Research

According to a survey of US companies conducted
by Info-Tech Research Group, more than 60 percent
of IT departments did not have formal plans and
procedures in place to deal with the East Coast
blackout.
Although more than 76 percent of companies
surveyed said that the blackout had an impact on
their organization, most of them admitted that
they were not sufficiently prepared.

6
Regulatory Compliance is Not an Option

Sarbanes-Oxley
HIPAA
Gramm-Leach-Bliley
California Act
Basel II
CLERP-9
New Federal Rules of Civil Procedure
Email e-discovery rules

7
Definitions

High Availability
High availability refers to a system or component
that is continuously operational for a desirably
long length of time. Availability can be measured
relative to "100% operational" or "never
failing." A widely held but difficult-to-achieve
standard of availability for a system or product
is known as "five 9s" (99.999 percent)
availability.
Source: TechTarget Data Center Media
Disaster Recovery
Duplicating computer operations after a
catastrophe occurs, such as a fire or earthquake.
It includes routine off-site backup as well as a
procedure for activating vital information
systems in a new location.
Source: PC Magazine
Business Continuity
Business continuance (sometimes referred to as
business continuity) describes the processes and
procedures an organization puts in place to
ensure that essential functions can continue
during and after a disaster. Business continuance
planning seeks to prevent interruption of
mission-critical services, and to reestablish
full functioning as swiftly and smoothly as
possible.
Source: Bitpipe.com
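
The "five 9s" figure in the High Availability definition is easier to judge once it is converted into permitted downtime. The following is a minimal sketch of that arithmetic; the function name and the 365-day year are assumptions made for illustration, not part of the original slides.

```python
# Minutes in a non-leap year (assumption: leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Compare a few common availability targets.
for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% available -> {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, "five 9s" leaves a little over five minutes of downtime per year, which helps explain why the definition calls it difficult to achieve.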

8
Business Impact Analysis

What is a Business Impact Analysis?
It is a technique for identifying both tangible
and intangible impacts on a business process,
function, or department, usually over time, based
on given criticalities.
A Business Impact Analysis
Provides senior management with the information
needed to devise a recovery strategy and recovery
prioritization
Provides supporting data to define an appropriate
DR program budget
Identifies who and what are vital to the
business's survival
Internal: suppliers, customers, shareholders, IT
systems, manufacturing processes
External: government departments, regulators,
trade bodies, competitors, pressure groups
Evaluates recovery priorities and time scales
Criticality of each function to business survival
Assesses the potential cost of disaster
Direct and indirect costs of loss of service
capability

9
Business Impact Analysis

Identifies the high risk areas of the existing
infrastructure
Single points of failure
Recovery time limitations
Identifies the business critical applications and
the systems they run on
Identifies the areas of vulnerability within the
environment
Focuses on the delivered service
Business applications like CRM, order processing,
dispatch, and billing
Internal applications like payroll and HR
Communications like email and Web sites
Answers how not having the capability affects the
business
Is the application critical to the business?
Is the function duplicated elsewhere?
What viable alternatives exist?

10
Contingency Plan Criteria

Factors to consider
The scale of the organization and its IT systems
The nature of the operation
An online system may need to be restored within
hours, whereas a customer billing operation may
not be harmed by a few days' delay, provided no
data is lost
The relative costs of different options
A company with several linked sites may be able
to move operations to an alternative site
The perceived likelihood of disaster occurring
Companies in earthquake zones are likely to
invest more in disaster recovery than average

11
Recovery Objectives

Recovery Time Objective (RTO)
The period of time within which technical
services and/or business functions must be
recovered and available after an outage (e.g. one
business day), measured from the time of disaster
to the resumption of production operations.
Recovery Point Objective (RPO)
The acceptable level of data loss exposure
following an unplanned event. This is the point
in time (prior to the disaster) to which lost
data can be restored, typically the last backup
taken offsite.
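
The two objectives above can be checked against the timeline of an actual or rehearsed outage. The sketch below is illustrative only: the function names, the sample timestamps, and the 24-hour targets are assumptions, not values from the presentation.

```python
from datetime import datetime, timedelta


def data_loss_window(last_offsite_backup: datetime, disaster: datetime) -> timedelta:
    """Data created in this window cannot be restored (compare with the RPO)."""
    return disaster - last_offsite_backup


def recovery_time(disaster: datetime, production_resumed: datetime) -> timedelta:
    """Time from the disaster to resumed production (compare with the RTO)."""
    return production_resumed - disaster


def meets_objectives(last_offsite_backup, disaster, production_resumed, rpo, rto):
    """Report whether one outage stayed within the stated RPO and RTO."""
    return {
        "rpo_met": data_loss_window(last_offsite_backup, disaster) <= rpo,
        "rto_met": recovery_time(disaster, production_resumed) <= rto,
    }


# Hypothetical outage: 16 hours of data at risk, 30 hours to recover.
result = meets_objectives(
    last_offsite_backup=datetime(2008, 4, 21, 22, 0),
    disaster=datetime(2008, 4, 22, 14, 0),
    production_resumed=datetime(2008, 4, 23, 20, 0),
    rpo=timedelta(hours=24),  # assumed target
    rto=timedelta(hours=24),  # assumed target
)
print(result)
```

In this hypothetical case the nightly offsite backup satisfies the RPO, but the recovery takes longer than the RTO allows, which would point toward a faster recovery configuration rather than more frequent backups.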

12
Frequency of Downtime
[Chart: frequency of downtime by type of disaster scenario]
13
The Business-Critical 80%: Then and Now

10 years ago, financial applications were the
top priority. Today, an organization's
mission-critical areas are:
Communications
Email, handhelds, telecommunications
Revenue generating systems
Order entry systems, payment processing systems
Backend operations
Financial systems, ERP systems

14
Pareto's Law

In 1906, Pareto's Principle was born. An Italian
economist, Vilfredo Pareto, observed that 20
percent of his country's people owned 80
percent of the wealth. This principle was
broadened in the mid-20th century by Dr. Joseph
Juran, who penned a universal rule called the
"vital few and trivial many": the principle that
20 percent of input is always responsible for 80
percent of the output. Juran's work, although
expanding widely on Pareto's, remained known as
Pareto's Principle, or the 80/20 rule.
Pareto's Principle applies to business continuity
management. 20 percent of the threats to an
organization will result in 80 percent of
invocations. The business continuity manager's
main task is to identify the Pareto Principle
risks and mitigate these. These risks will not
be the headline grabbers; they will be the
mundane threats of fire and flood; of natural
disaster; of loss of critical IT and telecoms
systems; and of loss of human resources.
Source: David Honour, Continuity Central
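
One way to identify the "Pareto Principle risks" is to rank threats by how often they have caused the continuity plan to be invoked and keep the smallest group that accounts for 80 percent of invocations. The sketch below assumes such counts are available; the threat names and numbers are invented for illustration and are not data from the presentation.

```python
from typing import Dict, List


def vital_few(invocations_by_threat: Dict[str, int], share: float = 0.8) -> List[str]:
    """Return the highest-ranked threats that together reach `share` of all invocations."""
    total = sum(invocations_by_threat.values())
    if total == 0:
        return []
    # Rank threats from most to least frequent.
    ranked = sorted(invocations_by_threat.items(), key=lambda kv: kv[1], reverse=True)
    selected, running = [], 0
    for threat, count in ranked:
        if running >= share * total:
            break
        selected.append(threat)
        running += count
    return selected


# Hypothetical invocation counts for ten threats (100 invocations in total).
history = {
    "power outage": 50, "hardware failure": 30, "human error": 4,
    "flood": 4, "fire": 3, "telecom outage": 3, "staff shortage": 2,
    "software fault": 2, "pandemic": 1, "terrorism": 1,
}
print(vital_few(history))
```

With this invented data, two of the ten threats account for 80 of the 100 invocations. Real histories will rarely split so neatly, so the 80/20 ratio is better read as a prompt to rank threats than as a guarantee.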

15
Disaster-Proofing Against the 20%
16
Cold Site
Version 1: Pre-designated equipment resident at an
alternate location, not typically used for any
other purpose but DR
Version 2: Contract for equipment/facility used on
a temporary basis during a declared emergency;
several providers offer these services
17
Warm Site
Pre-designated equipment resident at an alternate
location, not typically used for any other
purpose but DR, periodically refreshed with live
data
Data refresh can be accomplished in a number
of ways, including leased line and tape
18
Hot Site
Pre-designated equipment resident at an alternate
location
May be used for purposes other than DR,
with real-time or near real-time replication of
data
19
DR Configurations Recap
20
Technical Server Contingency Planning Solutions
21
System Backups

Servers can be backed up through a distributed
system, in which each server has its own drive,
or through a centralized backup device. Four
types of system backup methods are available to
preserve a server's data:
Full
Captures all files on disk
Incremental
Captures files that were created or changed since
the last backup
Differential
Backup of stored files that were created or
modified since the last full backup
Block Level Backup
Works like a differential backup, but changes are
captured at the block level rather than the file
level, which reduces the space requirement
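
The difference between the first three methods comes down to which reference point a file's modification time is compared against. The sketch below illustrates that selection rule; the `FileInfo` type, the integer timestamps, and the function name are assumptions for illustration, and block-level backup is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileInfo:
    path: str
    modified: int  # hypothetical timestamp, e.g. hours since some starting point


def select_for_backup(files: List[FileInfo], mode: str,
                      last_full: int, last_backup: int) -> List[str]:
    """Return the paths a backup run would capture under the given method."""
    if mode == "full":
        return [f.path for f in files]  # everything on disk
    if mode == "incremental":
        cutoff = last_backup            # changes since the last backup of any kind
    elif mode == "differential":
        cutoff = last_full              # changes since the last full backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return [f.path for f in files if f.modified > cutoff]


# Hypothetical timeline: full backup at t=3, most recent incremental at t=7.
files = [FileInfo("a.db", 1), FileInfo("b.log", 5), FileInfo("c.cfg", 9)]
for mode in ("full", "incremental", "differential"):
    print(mode, select_for_backup(files, mode, last_full=3, last_backup=7))
```

The trade-off follows from the cutoff: incremental runs are smaller, but a restore needs the last full backup plus every incremental since, while a differential restore needs only the last full backup and the latest differential.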

22
RAID: Redundant Array of Independent Disks

Provides disk redundancy and fault tolerance
for data storage and increases the effective mean
time between data-loss failures. RAID is used to
mask disk drive and disk controller failures.
RAID technology uses three data redundancy
techniques and five RAID levels to provide
varying degrees of redundancy.
Mirroring
Writes data simultaneously to separate hard
drives
Parity
A technique of determining whether data has been
lost or overwritten
Striping
Improves the performance of the hardware array
controller by distributing data across all drives
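
Striping and parity can be combined so that the contents of one failed drive are rebuilt from the survivors. The sketch below assumes XOR parity, as used in parity-based RAID levels such as RAID 5; the slides do not name a specific parity scheme, and the function names and sample data are hypothetical.

```python
from functools import reduce
from typing import List


def stripe(data: bytes, drives: int) -> List[bytes]:
    """Split data into one equal-sized block per drive, zero-padding the end."""
    block_size = -(-len(data) // drives)  # ceiling division
    padded = data.ljust(block_size * drives, b"\0")
    return [padded[i * block_size:(i + 1) * block_size] for i in range(drives)]


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Stripe hypothetical data across three data drives plus one parity drive.
blocks = stripe(b"ORDERS-2008Q1", drives=3)
parity = xor_blocks(blocks)

# Simulate losing drive 1 and rebuild it from the other drives and the parity.
rebuilt = xor_blocks([blocks[0], blocks[2], parity])
print(rebuilt == blocks[1])
```

XOR works here because XOR-ing the parity with every surviving block cancels those blocks out and leaves only the missing one. Mirroring needs no such calculation, since the second drive simply holds an identical copy.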

23
Standby Servers

Servers can be pre-built and staged in an
off-site location. When the disaster recovery
plan is put into effect, the servers are brought
into operation and data is restored to them to
resume the services of the affected resource.

24
Electronic Vaulting and Remote Journaling

These are similar technologies that provide
additional data backup capabilities, with backups
made to remote tape drives over communication
links. Remote journaling and electronic vaulting
enable shorter recovery times and reduced data
loss should the server be damaged between
backups.

25
Server Load Balancing

This technology increases server and
application availability. Through load balancing,
traffic can be distributed dynamically across
groups of servers running a common application so
that no single server is overwhelmed. With this
technique, a group of servers appears as a single
server to the network. Using load balancing among
different sites can enable the application to
continue to operate as long as one or more sites
remain operational.
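
The behavior described above, spreading traffic across a group while routing around failed members, can be sketched with a simple round-robin scheduler. This is an illustration only: the class, its method names, and the site names are hypothetical, and production load balancers use richer health checks and scheduling policies.

```python
class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any that are marked down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Try each server at most once per call, starting from the rotation point.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers remain")


# Hypothetical three-site deployment.
balancer = RoundRobinBalancer(["site-a", "site-b", "site-c"])
print([balancer.pick() for _ in range(3)])

balancer.mark_down("site-b")  # simulate a site failure
print([balancer.pick() for _ in range(4)])
```

Callers only ever ask the balancer for a server, so the group looks like a single server to them, and requests keep being served as long as at least one site stays healthy.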

26
Synchronous Server Replication/Mirroring

This method uses a disk-to-disk copy and
maintains a replica of the database or file
system by applying changes to the replicating
server at the same time changes are applied to
the protected server. With synchronous
mirroring, the RTO can be minutes. Mirroring
should be used for critical applications that can
accept little or no downtime and little or no
data loss.
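
The defining property of the synchronous approach is that a write is acknowledged only once both copies hold it. The sketch below models that rule with two in-memory dictionaries; the class and exception names are hypothetical, and rejecting writes while the replica is unreachable is one possible policy chosen for illustration, not something the slides prescribe.

```python
class ReplicaUnavailable(Exception):
    """Raised when a write cannot be mirrored to the replicating server."""


class SynchronousMirror:
    """Acknowledge a write only after both copies have applied it."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.replica_online = True

    def write(self, key, value):
        # Refuse the write outright if it cannot reach both copies,
        # so the protected server never gets ahead of its replica.
        if not self.replica_online:
            raise ReplicaUnavailable("replica unreachable; write rejected")
        self.primary[key] = value
        self.replica[key] = value
        return True  # acknowledgement to the caller


mirror = SynchronousMirror()
mirror.write("order-1001", "paid")
print(mirror.primary == mirror.replica)
```

Because every acknowledged write already exists on the replica, failing over to it loses no acknowledged data. The cost is that each write waits on the replica, so latency and replica outages directly affect the protected server.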

27
Disaster Recovery Planning Cycle
28
Bottom Line