Business Continuity and Disaster Recovery Planning

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

Continuity of Operations Plan (COOP)

Provide procedures and capabilities to sustain an organization essentials, strategic functions at alternative site for up to 30 days

Business Resumption / Recovery Plan ( BRP)

Provide procedures for recovery business operations immediately following a disaster

Continuity of Support Plan

Provide procedures and capabilities for recovering a major application or general support system

Cyber Incident Response Plan

Provide strategies to detect , respond to and Limit consequences of malicious cyber incident

Mean Time to Repair (MTTR)

The Mean Time to Repair (MTTR) describes how long it will take to recover a specific failed system.

Categorization - NIST - ComputerSecurity Incident Handling Guide - SP 800-61

DoS - Malicious Code - Unauthorized access - Inappropriate usage - Multiple components

Minimum Operating Requirements (MOR)

Minimum Operating Requirements ( MOR ) describe the minimum environmental and connectivity requirements in order to operate computer equipment.

Mobile Site

Mobile sites are "data centers on wheels": towable trailers that contain racks of computer equipment, as well as HVAC, fire suppression, and physical security.

other terms may be substituted for Maximum Tolerable Downtime.

These include Maximum Allowable Downtime (MAD), Maximum Tolerable Outage (MTO), and Maximum Acceptable Outage (MAO).

BCP/DRP frameworks

800-34 Rev. 1 "Contingency Planning Guide for Federal Information Systems" ISO/IEC 27031 focuses on BCP Separate ISO plan for disaster recovery is "ISO/IEC 24762:2008, Information technology—Security techniques—Guidelines for information and communications technology disaster recovery services."

Cold Site

A cold site is the least expensive recovery solution to implement. It does not include backup copies of data nor does it contain any immediately available hardware.

Develop the contingency planning policy statement

A formal department or agency policy provides the authority and guidance necessary to develop an effective contingency plan.

Hot Site

A hot site is a location that an organization may relocate to following a major disruption or disaster. It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured computers.

Call Tree

A key tool leveraged for staff communication by the Crisis Communications Plan is the Call Tree, which is used to quickly communicate news throughout an organization without overburdening any specific person.

Failure and recovery metrics

A number of metrics are used to quantify how frequently systems fail, how long a system may exist in a failed state, and the maximum time to recover from failure. These metrics include the Recovery Point Objective ( RPO ), Recovery Time Objective (RTO), Work Recovery Time (WRT), Mean Time Between Failures (MTBF), Mean Time to Repair (MTTR), and Minimum Operating Requirements ( MOR ).

redundant site

A redundant site is an exact production duplicate of a system that has the capability to seamlessly operate all necessary IT operations without loss of services to the end user of the system.

simulation test , also called a walk through dril

A simulation test , also called a walk through drill (not to be confused with the discussion-based structured walk-through), goes beyond talking about the process and actually has teams to carry out the recovery process.

Activate Team

Activate team If a disaster is declared, then the recovery team needs to be activated. Depending on the scope of the disaster, this communication could prove extremely difficult. The use of calling trees,

Assess

Assess Though an initial assessment was carried out during the initial response portion of the disaster recovery process, a more detailed and thorough assessment will be performed by the disaster recovery team.

Assessing the critical state

Assessing the critical state can be difficult because determining which pieces of the IT infrastructure are critical depends solely on how it supports the users within the organization.

Check List - Consistency Testing

Checklist (also known as consistency ) testing lists all necessary components required for successful recovery and ensures that they are, or will be, readily available should a disaster occur.

Communicate

Communicate One of the most difficult aspects of disaster recovery is ensuring that consistent timely status updates are communicated back to the central team managing the response and recovery process.

Conduct the business impact analysis BIA

Conduct the business impact analysis ( BIA ): The BIA helps to identify and prioritize critical IT systems and components. A template for developing the BIA is also provided to assist the user.

Develop an IT contingency plan

Develop an IT contingency plan : The contingency plan should contain detailed guidance and procedures for restoring a damaged system.

Develop recovery strategy

Develop recovery strategies : Thorough recovery strategies ensure that the system may be recovered quickly and effectively following a disruption.

Environmental

Environmental—Threats focused on information systems or datacenter environments include items such as power issues (blackout, brownout, surge, spike), system component or other equipment failures, and application or software flaws.

Types of disruptive events

Errors and omissions:Natural disasters:Electrical or power problems:Temperature and humidityWarfare, terrorism, and sabotage:Financially motivated attackers:Personnel shortages:

Identify preventive controls

Identify preventive controls : Measures taken to reduce the effects of system disruptions can increase system availability and reduce contingency life cycle costs.

Natural

Natural—The most obvious type of threat that can result in a disaster is naturally occurring. This category includes threats such as earthquakes, hurricanes, tornadoes, floods, and some types of fires. Historically, natural disasters have provided some of the most devastating disasters that an organization can have to respond to.

Partial and complete business interruption

Partial and complete business interruption Arguably, the most high fidelity of all DRP tests involves business interruption testing . However, this type of test can actually be the cause of a disaster, so extreme caution should be exercised before attempting an actual interruption test.

Plan maintenance

Plan maintenance : The plan should be a living document that is updated regularly to remain current with system enhancements."

Plan testing , training , and exercis

Plan testing , training , and exercises : Testing the plan identifies planning gaps, whereas training prepares recovery personnel for plan activation; both activities improve plan effectiveness and overall agency preparedness.

Incident Response Plan - IRP -6 Phases

Preparation - Identification - Containment - Eradication - Recovery - Lessons Learned

Project Initiation

Project Initiation involves seven distinct milestones

Occupant Emergency Plan (OEP)

Provide coordinated procedures to minimize loss of life or injury protecting property damage in response to physical attack

Disaster Recovery Plan (DRP)

Provide detail procedures to facilitate recovery of capabilities as an alternate site

Crisis Management Plan ( CMP )

Provide procedure to disseminating status reports to personal and the public

Business Continuity Plan ( BCP)

Provide procedures for sustaining essential business operations while recovering from significant disruption

Reconstitution

Reconstitution The primary goal of the reconstitution phase is to successfully recover critical business operations at either primary or secondary site.

Respond

Respond In order to begin the disaster recovery process, there must be an initial response that begins the process of assessing the damage. Speed is essential during this initial assessment.

The Disaster Recovery Process

Respond, Activate Team, Communicate, Assess, Reconstitute

Starting emergency power

Starting emergency power Though it might seem simple, converting a datacenter to emergency power, such as backup generators that will begin taking the load as the UPS fail, is not to be taken lightly.

BCP/DRP-focused risk assessmen

The BCP/DRP-focused risk assessment determines what risks are inherent to which IT assets. A vulnerability analysis is also conducted for each IT system and major application.

Relationship between BCP and DRP

The Business Continuity Plan is an umbrella plan that includes multiple specific plans, most importantly the Disaster Recovery Plan. The Disaster Recovery Plan serves as a subset of the overall Business Continuity Plan

DRP Review

The DRP review is the most basic form of initial DRP testing and is focused on simply reading the DRP in its entirety to ensure completeness of coverage.

Disaster Recovery Planning

The Disaster Recovery Plan (DRP) provides a short-term plan for dealing with specific IT-oriented disruptions.

Recovery Point Objective RPO

The Recovery Point Objective (RPO) is the amount of data loss or system inaccessibility (measured in time) that an organization can withstand. The point prior to the outage to which data are to be restored that is the last point of know good data

Recovery Time Objective RTO

The Recovery Time Objective (RTO) describes the maximum time allowed to recover business or IT systems. RTO is also called the systems recovery time.

Service Delivery Objectives - SDO

The SDO is the level of acceptable service that me be achieved with the Recovery Time Objective - RTO

Identify critical assets

The critical asset list is a list of those IT assets that are deemed business essential by the organization.

Determine Maximum Tolerable Downtime

The primary goal of the BIA is to determine the Maximum Tolerable Downtime ( MTD ), which describes the total time a system can be inoperable before an organization is severely impacted.

NIST Special Publication 800 34

provides a visual means for understanding the interrelatedness of a BCP and a DRP, as well as Continuity of Operations Plan ( COOP ), Occupant Emergency Plan ( OEP ), and others.

Maximum Tolerable Downtime is composed of two metrics:

the Recovery Time Objective ( RTO ) and the Work Recovery Time ( WRT )

Warm Site

A warm site has some aspects of a hot site, for example, readily accessible hardware and connectivity, but it will have to rely upon backup data in order to reconstitute a system after a disruption. It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured computers.

AIW

Acceptable Interruption Window

Business Impact Analysis - BIA

Business Impact Analysis ( BIA ) is the formal method for determining how a disruption to the IT system(s) of an organization will impact the organization's requirements, processes, and interdependencies with respect to the business mission.

Change Mangement

Change management includes tracking and documenting all planned changes, formal approval for substantial changes, and documentation of the results of the completed change. All changes must be auditable.

Human

Human—The human category of threats represents the most common source of disasters. Human threats can be further classified by whether they constitute an intentional or unintentional threat.

IPF

Information process facility

System Development Lifecycle - SDLC

Initiation - Development/ Acquisition - Implementation - Operate/ Maintenance - End of Life / Disposition

Maximum Tolerable Outage - MTO

MTO is the total time the operations can subs tainted at an alternate site

Mean Time Between Failures

Mean Time Between Failures (MTBF) quantifies how long a new or repaired system will run before failing.

Defining Incident Management Processes -CMU/SEI

Prepare - Protect - Detect - Triage - Respond

Business Continuity Planning

The overarching goal of a BCP is for ensuring that the business will continue to operate before, throughout, and after a disaster event is experienced.

Downtime consists of two elements

The systems recovery time and the work recovery time. Therefore, MTD = RTO + WRT.

Disasters or disruptive events

The three common ways of categorizing the causes for disasters are whether the threat agent is natural, human, or environmental in nature.

Work Recovery Time (WRT)

Work Recovery Time (WRT) describes the time required to configure a recovered system.

CSIRTs

computer security incident response teams - CSIRTs

Parrellel Processing

parallel processing . This type of test is common in environments where transactional data is a key component of the critical business processing. Typically, this test will involve recovery of critical processing components at an alternate computing facility

NIST 800-34, Contingency Planning Guide to achieving a sound, logical BCP/DRP.

• Project Initiation • Scope the Project • Business Impact Analysis • Identify Preventive Controls • Recovery Strategy • Plan Design and Development • Implementation, Training, and Testing • BCP/DRP Maintenance


Ensembles d'études connexes

TCI lesson 15: early societies in west africa

View Set

PrepU Chapter 35: Assessment of Musculoskeletal Function

View Set

Nurs 309 Exam (Ch 64 - Diabetes)

View Set

Microbiology 20 Advanced Chapters 1, 4, 5, and 6

View Set