Business Continuity and Disaster Recovery

Risk Analysis Process

1. Identify company assets 2. Identify potential threats 3. Estimate loss 4. Identify countermeasures 5. Respond to risk

BCP Steps

1. Scope/Plan Initiation 2. Criticality Prioritization 3. Identification of resource dependencies 4. Estimate Downtime 5. Outline response options

BCP: Continuity Planning

1. Strategy development 2. Provisions and processes 3. Plan approval 4. Plan implementation 5. Training and education

Warm Site

A leased or rented facility that is usually partially configured with some equipment, such as HVAC, and foundation infrastructure components, but not the actual computers. MTD 72 hours

Cold Site

A leased or rented facility that supplies the basic environment, electrical wiring, air conditioning, plumbing, and flooring, but none of the equipment or additional services. A cold site is essentially an empty data center. It may take weeks to get the site activated and ready for work.

Database: Electronic vaulting

A process that involves moving a database to a remote site using bulk transfers. The remote site can be a hot site or simply an offsite location. However, there might be a significant delay between the time you declare a disaster and the time your database is ready for operation with current data.

Emergency-Response Guidelines

A process that outlines the company's and individuals' responsibilities for responding to emergency situations. It provides the steps the first employees on the scene should follow to detect the emergency, respond to it, notify the appropriate personnel, and activate the BCP.

Total Risk

The risk a company faces if it chooses not to implement any type of safeguard. A company may choose this path if the results of a cost/benefit analysis indicate it is the best course of action. threats × vulnerability × asset value = total risk

Mean Time to Repair (MTTR)

An estimate of how long it will take to fix a piece of equipment and get it back into production.

Single Point of Failure

Any component that can cause an entire system to fail. If a computer has data on a single disk, failure of the disk can cause the computer to fail, so the disk is a single point of failure.

Senior Management's Role

Approval and buy-in are essential to the success of the overall BCP effort. If senior management is involved throughout the development and implementation phases of the plan, approval should be a relatively straightforward process. Senior management is also responsible for initiating the BCP/DRP in the event of an emergency.

BCP: Risk Acceptance/Mitigation

BCP documentation should record the outcome of the risk analysis for each risk: why a risk was deemed acceptable, which risks were deemed unacceptable, and the strategy for handling them. Business leaders should formally document their risk acceptance decisions.

Vital Records Program

BCP documentation should always outline this program. It states where the critical records will be stored and how they are backed up. The most important steps are to identify the vital records and then locate them.

BCP: Risk Assessment

BCP step that recaps the decision-making process undertaken during the BIA. Should include all the risks considered and the quantitative/qualitative analyses performed to assess these risks. **Must be updated on a regular basis.**

BIA: Likelihood Assessment

BCP team should draw up a comprehensive list of events that can be a threat to the business. The team should identify the likelihood that each risk will occur and use the ARO to determine the number of times a business expects to experience a disaster each year.

BCP: Strategy Development

Bridges the gap between BIA and BCP planning. The BCP team must prioritize the list of risks using the quantitative/qualitative results to determine which risks to address first. The MTD estimates should also be used to determine which risks are deemed acceptable and which should be mitigated.

Maximum Tolerable Downtime (MTD)

Calculation to determine how long an organization can remain in operation without assets, systems, or processes functioning. The function with the shortest MTD gets first priority.
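
The "shortest time first" rule can be sketched as a simple sort. This is only an illustration; the business functions and hour values are hypothetical, not taken from any real BIA.

```python
# Hypothetical business functions and their MTDs in hours (illustrative values only).
mtd_hours = {
    "payroll": 72,
    "order processing": 4,
    "email": 24,
}

# Shortest MTD first: that function gets first recovery priority.
recovery_order = sorted(mtd_hours, key=mtd_hours.get)

for rank, function in enumerate(recovery_order, start=1):
    print(f"{rank}. {function} (MTD {mtd_hours[function]} h)")
```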

Reciprocal Agreement

Contract between two organizations stating that, in the event of a disaster, they will aid each other by sharing their IT processing capabilities. These agreements have no initial cost, which makes this the cheapest option.

Checklist Test

Copies of the BCP are distributed to the different departments and functional areas for review. Method ensures that nothing has been left out.

Business Continuity Plan (steps)

1. Define scope 2. Prioritize criticality 3. Identify resource dependencies 4. Estimate downtime 5. Outline response options

Prudent Man Rule

Demonstrates that management has taken reasonable actions to ensure safety standards by practicing due care and due diligence in accordance with accepted best practices.

Fail Safe

An electrical hardware lock will be unlocked when power is removed, e.g. emergency exit doors are configured so that personnel are not locked inside during a fire or other emergency.

Annualized Loss Expectancy (ALE)

Estimates the annual loss resulting from an incident. This identifies the maximum annual budgetary amount to spend on the protection of an asset. The annual cost of deploying countermeasures should not exceed the ALE. ALE = SLE × ARO
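
A minimal sketch of the ALE formula and the budget rule above. The function names and dollar figures are assumptions made up for this example.

```python
def annualized_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE x ARO."""
    return sle * aro


def countermeasure_within_budget(annual_cost: float, ale: float) -> bool:
    """The yearly cost of a countermeasure should not exceed the ALE."""
    return annual_cost <= ale


# Hypothetical figures: a 50,000 loss per incident, expected once every two years.
sle = 50_000.0
aro = 0.5
ale = annualized_loss_expectancy(sle, aro)

print(ale)                                        # 25000.0
print(countermeasure_within_budget(20_000, ale))  # True
print(countermeasure_within_budget(30_000, ale))  # False
```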

BIA Step 2. Identify resource requirements

Evaluation of resources that are required to resume mission/business processes and related interdependencies.

Function Recovery

Has the ability to automatically recover functions. Gives the system the ability to roll back changes to a secure state.

Annualized Rate of Occurrence (ARO)

Identifies how often a successful threat attack is expected to occur in a year.

Recovery Strategy

Insurance should be included to enable your organization to recover from a disaster. Try to include property insurance that supports actual cash value clauses so your property damages can be compensated based on fair market value.

Restoration

Involves bringing a business facility and environment back to a workable state.

Recovery

Involves bringing business operations and processes back to a working state.

Quantitative

Involves the use of numbers and formulas to reach a decision. This type of data often expresses options in terms of the dollar value to the business.

High Availability (HA)

A combination of technologies and processes that work together to ensure that a specific service or system is always up and running.

Residual Risk (RR)

Monetary value that identifies the asset value that countermeasures do not protect. (threats × vulnerability × asset value) × controls gap = residual risk
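
One way to read the residual risk formula numerically is sketched below. Treating threats and vulnerability as scores between 0 and 1 is an assumption made for this example, and all values are hypothetical.

```python
def total_risk(threats: float, vulnerability: float, asset_value: float) -> float:
    """threats x vulnerability x asset value = total risk."""
    return threats * vulnerability * asset_value


def residual_risk(total: float, controls_gap: float) -> float:
    """total risk x controls gap = residual risk."""
    return total * controls_gap


# Hypothetical inputs: threat and vulnerability scored 0..1 (an assumption),
# a 100,000 asset, and controls that leave 25% of the value unprotected.
total = total_risk(threats=0.5, vulnerability=0.5, asset_value=100_000)
residual = residual_risk(total, controls_gap=0.25)

print(total)     # 25000.0
print(residual)  # 6250.0
```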

Disk-to-Disk Backup

Increasingly used by organizations as the backup solution in their disaster recovery strategy. Organizations that use this method must remember to maintain geographical diversity: the backup disks must be located offsite. Managed service providers are often used to manage remote backup locations.

Disaster

Occurs when a facility becomes unusable for 24 hours or longer.

BCP: Identify Preventive Controls

Once threats are recognized, identify and implement controls and countermeasures to reduce the organization's risk level in an economical manner

DRP: Training, Awareness, and Documentation

Orientation training for all new employees. Initial training for employees taking on a new disaster recovery role for the first time. Detailed refresher training for disaster recovery team members. Brief awareness refreshers for all other employees.

(BCP): Testing

The plan should be tested at least once annually, and again whenever significant changes occur to assets, systems, or processes.

BCP Goal

Primary goal is to maintain business operations with reduced or restricted infrastructure capabilities or resources.

Security Planning Objectives

Protect the safety of employees. *top priority* Identify and implement corrective actions for organization survival. Return all processes to normal operations.

Mutual Assistance Agreements (MAAs)

Provides an inexpensive alternative to disaster recovery sites, but they are not commonly used because they are difficult to enforce. Organizations participating in this service may also be shut down by the same disaster, and they also raise confidentiality concerns.

BCP: Policy Statement

Provides the guidance necessary to develop a BCP and assigns roles to carry out tasks.

(BCP): Maintenance

Put steps in place to ensure the BCP is a living document that is updated regularly. The BCP team should still meet periodically to discuss the plan, review the results of plan tests, and make further adjustments if needed. Drastic changes may require a new plan; old BCP copies should then be destroyed or replaced so that all copies stay consistent.

Recovery Team Role

Responsible for implementing the recovery efforts to get the organization functioning at the secondary or alternate site when the primary site experiences a disaster.

Disaster Recovery Plan

Should include risk assessments to determine likelihood of a disaster, the criticality prioritization to determine the essential business operations and documented resource dependencies to determine what supporting resources are necessary to support essential business operations.

Hierarchical Storage Management (HSM)

System that uses an automated robotic backup jukebox consisting of 32 or 64 optical or tape backup devices. All the drive elements within this system are configured as a single drive array (a bit like RAID).

Exposure Factor (EF)

The amount of damage that the risk poses to the asset, expressed as a percentage of the asset's value. "e.g. an incident would cause 70% damage to the building."

Single Loss Expectancy (SLE)

The amount of loss expected for any single threat attack on a given asset. This is a monetary value that describes how much the incident will cost in terms of lost asset value. SLE = AV × EF
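
A small worked example of the SLE formula. The building value is hypothetical; the 70% exposure factor mirrors the EF card's example.

```python
def single_loss_expectancy(asset_value: float, exposure_factor_percent: float) -> float:
    """SLE = AV x EF, with EF given as a percentage of the asset's value."""
    return asset_value * exposure_factor_percent / 100


# Hypothetical building worth 200,000 with a 70% exposure factor.
sle = single_loss_expectancy(asset_value=200_000, exposure_factor_percent=70)

print(sle)  # 140000.0
```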

Accept Risk

The company understands the level of risk it is faced with, as well as the potential cost of damage, and decides to just live with it and not implement the countermeasure.

BIA: Identify Priorities

The first task in BIA is to identify business priorities: the functions most essential to day-to-day operations. The list should be ranked in order of importance. Team members from each department should create a list of important business functions, and those lists should be combined to create a master list. This will be followed by assigning an AV to each asset and developing an MTD/RTO. **The goal of the BCP process is to ensure that your RTOs are less than your MTDs**

Due Diligence

The implementation and support of security policies and procedures.

Backups

The most important area to maintain for business continuity. They're the only means by which an organization can restore data in the event of a disaster or of failed or damaged systems.

Control Gap (CG)

The percentage of the asset value that a countermeasure cannot protect. This identifies the effectiveness of the countermeasures or security controls implemented for the asset.

Synchronous replication

The primary and secondary repositories are always in sync, which provides true real-time duplication.

Automated Recovery

The system is able to perform trusted recovery activities to restore itself against at least one type of failure. e.g. hardware RAID provides automated recovery against the failure of a hard drive.

Recovery Time Objective (RTO)

This defines the maximum amount of time that a system resource can remain unavailable before there is an unacceptable impact on the other system resources, supported mission/business processes and the MTD. "e.g. how long would it take to rebuild a server"

Risk Avoidance

This occurs when a company decides to terminate the activity that is introducing the risk.

Software Escrow Arrangement

Unique tool used to protect a company from failures of software developers to provide adequate support for its products or against the possibility that the developer will go out of business.

Service Bureaus

Used for leasing computer time. This company owns large server farms and fields of workstations. Organizations typically purchase a portion of that processing capacity. Used to provide support for IT needs in the event of a disaster. Can be accomplished on site or remotely.

BCP: Implementation

After approval from senior management, the BCP team should develop the implementation schedule as soon as possible. This will be followed by BCP maintenance to ensure that the plan is still meeting business needs.

BCP: Plan Approval

After the BCP has been designed, it's time to gain top-level management endorsement. It's beneficial to have senior management involved throughout the development phases. Senior management approval and buy-in are essential to the success of the overall BCP effort.

BCP: Training and Education

All personnel involved in the plan should receive some type of training on the overall plan and their individual responsibilities. People with more responsibilities should be trained and evaluated on their specific BCP tasks to ensure efficiency. Finally, one backup person should be trained on all the BCP tasks for redundancy.

RAID 1

Also called mirroring. The most expensive fault tolerance system. Uses 2 disks that hold the same data. If one disk fails the other can continue to operate. No performance boost, just redundancy.

RAID 5

Also called striping with parity. The most commonly used level. Uses 3 or more disks, with the equivalent of one disk's capacity holding parity information that is distributed across the disks. If a disk fails, its data can be recovered using the data and parity information stored on the remaining disks. Combines disk striping across multiple disks with even parity for data redundancy.
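
How parity lets a lost disk be rebuilt can be sketched with XOR on a single stripe. This is a simplified illustration, not how any particular RAID controller is implemented: real RAID 5 rotates parity across disks, and the block contents and helper name here are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (yields even parity per bit position)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data disks (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])  # True
```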

RAID 0

Also called striping. This process breaks data into units and stores the units across a series of disks. Increases read and write performance but doesn't provide fault tolerance.

RAID 10

Also known as RAID 1+0 or stripe of mirrors. Combines disk striping and disk mirroring. Multiple disks are striped creating a single volume. A second set of disks is then added to mirror the first set. Redundancy + Performance

Mobile Sites

Alternative to traditional recovery sites. Consist of self-contained trailers or other easily relocated units. Include all environmental control systems necessary to maintain a safe computing environment. Are usually configured as cold sites or warm sites.

BCP: Legal and Regulatory Requirements

Business leaders must exercise due diligence to ensure that shareholders' interests are protected. Federal, state, and local laws or regulations may require them to implement various degrees of BCP. Organizations might also have contractual obligations to their clients that require them to implement sound BCP practices. It's therefore essential to include the organization's legal counsel in the BCP process, because they're familiar with the legal, regulatory, and contractual obligations that apply to the organization.

Quality of Service (QoS)

Controls that protect the integrity of data networks under load. This assists in improving the end-user experience. Systems can be used to prioritize certain traffic types that have a low tolerance for interference and are high business requirements. Factors that contribute to QoS: bandwidth, latency, jitter, packet loss, and interference.

DRP: Maintenance

DRP is a living document. When organizations make changes, the DRP must adapt to those changes as well. Change management processes (doc updates) should be followed when infrastructure changes occur. Training and exercises must be conducted to ensure that staff members' skills are sharp.

Disk Shadowing

Data is written to (and read from) 2 or more independent disks. Process is transparent to the user.

RAID 6

Data is written (striped) across 4 or more disks with 2 parity stripes. Can recover from the failure of two disks. Provides redundancy.

Simulation Test

Disaster recovery team members are presented with a scenario and asked to develop an appropriate response. Some of these response measures are then tested. This may involve the interruption of noncritical business activities and the use of some operational personnel.

DRP: Testing and Maintenance

Every disaster recovery plan must be tested on a periodic basis to ensure that the plan's provisions are viable and that it meets an organization's changing needs.

Hot Site

Facility that is leased or rented and is fully configured and ready to operate within a few hours. The only missing resources from a hot site are usually the data, which will be retrieved from a backup site, and the people who will be processing the data. MTD 24 hours.

Cyber Incident Response Plan

Focuses on malware, hackers, intrusions, attacks, and other security issues. Outlines procedures for incident response

Continuity of operations (COOP) plan

Focuses on restoring an organization's (usually a headquarters element) essential functions at an alternate site and performing those functions for up to 30 days before returning to normal operations. This term is commonly used by the U.S. government to denote BCP.

BCP: Develop Recovery Strategies

Formulate methods to ensure systems and critical functions can be brought online quickly.

Business Impact Assessment (BIA)

Identifies critical processes/assets and the effect of their loss on the company. Identifies threats that can affect those processes/assets. Establishes the maximum downtime the organization can survive without them. Determines the actual cost of loss as accurately as possible.

BCP: Business Impact Assessment (BIA)

Identify critical functions and systems and allow the organization to prioritize them based on necessity. Identify vulnerabilities and threats, and calculate risks

Manual Recovery

If a system fails, it does not fail in a secure state. Instead, an administrator is required to manually perform the actions necessary to implement a secured or trusted recovery after a failure or system crash.

Fault tolerance

It's the ability of a system to suffer a fault but continue to operate. This is achieved by adding redundant components such as additional disks within a redundant array of inexpensive disks (RAID), or additional servers within a failover cluster configuration.

Cloud Computing

Many organizations now use this for disaster recovery because providers offer on-demand services at low cost. It's often quite cost-effective and allows the organization to avoid incurring most of the operating cost until the cloud site activates in a disaster.

Transfer Risk

Many types of insurance are available to companies to protect their assets. If a company decides the total risk is too high to gamble with, it can purchase insurance.

Countermeasures

Minimize the risk with the least amount of money spent. The annual cost of the countermeasure should never exceed the ALE for the asset. Must reduce risk to an acceptable level. Should not introduce new vulnerabilities. Budget must be considered.

BIA Step 1. Determine mission/business processes

Mission/business processes supported by the system are identified and the impact of a system disruption to those processes is determined along with outage impacts and estimated downtime. The downtime should reflect the maximum that an organization can tolerate while still maintaining the mission.

Asynchronous Replication

Occurs when the primary and secondary data volumes are out of sync. Synchronization may take place in seconds, hours, or days, depending upon the technology in place.

Structured walk-through

Often referred to as a table-top exercise. Members of the disaster recovery team gather in a large conference room and role-play a disaster scenario. The team members then refer to their copies of the disaster recovery plan and discuss the appropriate responses to that particular type of disaster. This testing can be done at the same time as live production.

The Salvage Team

Once the site is deemed "safe" for people, this team steps in. Their job is to return the company to its full original capabilities at the original or a new location. They must rebuild or repair the IT infrastructure. They typically have more time than the recovery team. They must ensure the reliability of the IT infrastructure by returning the least mission-critical processes first and then restoring the more important processes.

Business Continuity Plan (BCP)

Plan that addresses how an organization will respond to disruption of critical systems. Identifies the long-term actions to return all operations back to normal.

DRP Goal

Primary goal is to minimize risk to the organization from delays and interruptions in providing services.

Risk Analysis

Process of identifying threats and taking action to reduce risk.

Database: Remote Journaling

Process that transfers data in a more expeditious manner than electronic vaulting. It still uses bulk transfer mode, but on a more frequent basis (once every hour or more frequently). Copies of the database transaction logs are transferred to the remote site; they are not applied to a live database server but are maintained on a backup device.

System resilience

Refers to the ability of a system to maintain an acceptable level of service during an adverse event. It refers to the ability of a system to return to a previous state after an adverse event.

Salvage Team Role

Responsible for returning the organization to normal operations after a disaster, once it's deemed safe to return to the primary site. Also responsible for rebuilding the primary site and returning the organization's business processes to it. The team returns the least business-critical processes first to test the reliability of the site and returns mission-critical functions last.

Mitigate Risk

Risks are reduced to a level considered acceptable enough to continue conducting business.

Least Business-critical services

These services should be restored last, once the organization has stabilized after an incident.

Mission-Critical services

Should be restored first within their maximum tolerable downtime when the organization is stabilized.

Automated Recovery w/ Undue Loss

Similar to automated recovery: the system can restore itself against at least one type of failure, but it also ensures that specific objects are protected to prevent their loss. Attempts to restore data or other objects.

Read-through test

Simplest but most critical. In this test, you distribute copies of disaster recovery plans to the members of the disaster recovery team for review. The process ensures that key personnel are aware of their responsibilities, provides an opportunity to review the plan, and confirms that key personnel are available.

Full Backups

Store complete copies of data. They duplicate every file on the system regardless of the archive bit. The archive bit on every file is reset, turned off, or set to 0. To restore, restore only the last backup. This is the fastest restore method.

Differential Backups

Stores all files that have been modified since the time of the most recent full backup. Only files that have the archive bit turned on, enabled, or set to 1 are duplicated. However, the archive bit does not change. To restore, restore the last full backup and the last differential backup. Next to a full backup, this is the fastest restore method.

Incremental Backups

Stores files that have been modified since the time of the most recent full or incremental backup. Only files that have the archive bit turned on, enabled, or set to 1 are duplicated. After completion, the archive bit on all duplicated files is reset, turned off, or set to 0. To restore, restore the full backup and every subsequent incremental backup.
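
The restore rules on the full, differential, and incremental backup cards can be sketched as a function that picks which backups to restore. This sketch assumes a schedule uses either differentials or incrementals after a full backup, never both; the labels and the `restore_chain` name are hypothetical.

```python
def restore_chain(backups):
    """Return the labels to restore, in order.

    `backups` is a chronological list of (label, kind) tuples, where kind is
    "full", "differential", or "incremental". Assumes differentials and
    incrementals are not mixed after the same full backup.
    """
    # Start from the most recent full backup.
    last_full = max(i for i, (_, kind) in enumerate(backups) if kind == "full")
    chain = [backups[last_full][0]]
    later = backups[last_full + 1:]

    differentials = [label for label, kind in later if kind == "differential"]
    incrementals = [label for label, kind in later if kind == "incremental"]

    if differentials:
        # Last full + last differential only.
        chain.append(differentials[-1])
    else:
        # Last full + every subsequent incremental.
        chain.extend(incrementals)
    return chain


diff_week = [("Sun-full", "full"), ("Mon-diff", "differential"), ("Tue-diff", "differential")]
incr_week = [("Sun-full", "full"), ("Mon-incr", "incremental"), ("Tue-incr", "incremental")]

print(restore_chain(diff_week))  # ['Sun-full', 'Tue-diff']
print(restore_chain(incr_week))  # ['Sun-full', 'Mon-incr', 'Tue-incr']
```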

BCP: Project Scope and Planning

1. Structured analysis of the business's organization from a crisis planning point of view 2. Creation of a BCP team with the approval of senior management 3. An assessment of the resources available to participate in BCP 4. An analysis of the legal and regulatory landscape that governs an organization's response to a catastrophic event

Fail-open

System will fail in an open state, granting all access.

Qualitative

Takes non-numerical factors, such as emotions, investor/customer confidence, workforce stability, and other concerns, into account. This type of data often results in categories of prioritization (such as high, medium, and low).

Parallel test

Test that ensures that systems can actually perform at the alternate locations in the event of an emergency. Some systems are moved to the alternate site, and their results are compared with the regular processing that's done at the primary site.

Full interruption

Testing involves shutting down operations at the primary site and processing everything at the alternate site to simulate a disaster. Can reveal holes in the plan, and areas of improvements. Senior management approval must be obtained beforehand.

BCP: Provision and Processes

The BCP team designs the specific procedures and mechanisms that will mitigate the risks deemed unacceptable during the strategy development stage. The assets that "must" be protected: People, building/facilities, and infrastructure.

BCP: Team Selection

The BCP team should consist of representatives from each of the organization's departments responsible for the core services performed by the business. Key support departments (IT, security, legal) and senior management should all be included in constructing the team.

Asset Value (AV)

The cost of a resource to the organization including both quantitative and qualitative values.

Mean Time Between Failures (MTBF)

The estimated lifetime of a piece of equipment; it is calculated by the vendor of the equipment or a third party. The reason for using this value is to know approximately when a particular device will need to be replaced.

Database: Remote Mirroring

The most advanced and most expensive backup solution. A live database server is maintained at the backup site. The remote server receives copies of the database modifications at the same time as the production server at the primary site. Best used with hot sites.

Risk Analysis

The process of identifying threats and taking action to reduce risk. To perform this analysis, take the following general steps: 1. Identify company assets (asset identification) and assign a value to each asset (asset valuation). 2. Identify potential threats, vulnerabilities, and risks. 3. Estimate the potential loss per incident. 4. Identify potential countermeasures. 5. Respond to the risk.

Work Recovery Time (WRT)

The remainder of the overall MTD value after the RTO has passed. This deals with restoring data, testing processes and then making everything "live" for production purposes.
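
Read as arithmetic, this card says WRT = MTD - RTO. A minimal sketch under that reading, with hypothetical hour values:

```python
def work_recovery_time(mtd_hours: float, rto_hours: float) -> float:
    """WRT is what remains of the MTD once the RTO has passed."""
    if rto_hours > mtd_hours:
        # The RTO must fit inside the MTD for the plan to be viable.
        raise ValueError("RTO exceeds MTD")
    return mtd_hours - rto_hours


# Hypothetical process: 72-hour MTD, 48-hour RTO.
wrt = work_recovery_time(mtd_hours=72, rto_hours=48)

print(wrt)  # 24
```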

Fail-secure

The system will default to a secure state in the event of a failure, blocking all access. e.g. an electrical lock will be locked when power is removed.

Due Care

This practice is the development of security policies and procedures, which removes an organization's burden of negligence in the case of a security breach.

Recovery Point Objective (RPO)

This represents the point in time, prior to a disruption or system outage, to which mission/business process data must be recovered (given the most recent backup copy of the data) after an outage

Maximum Tolerable Downtime (MTD)

This represents the total amount of time leaders/managers are willing to accept for a mission/business process outage or disruption and includes all impact considerations. Also known as maximum tolerable outage (MTO). "e.g. how long will systems be down"

Recovery Team

This team is responsible for putting the BCP/DRP into action and restoring IT capabilities as swiftly as possible. They have a shorter time frame in which to operate: failure to restore business processes within the MTD/RTO means the company fails.

BIA Step 3. Identify recovery priorities

Uses the results from the "Identify Resources Step" to link system resources to critical mission/business processes. Priority can then be established for sequencing recovery activities and resources.

Crisis Management

When disasters strike, panicking often occurs. The best way to combat this is to make sure individuals are trained on the DRP recovery procedures and know the proper notification and immediate response processes. Overall, this will ensure that key employees know how to handle emergency situations.

BCP: Develop Contingency Plan

Write procedures and guidelines for how the organization can still stay functional in a crippled state

