D8-Business Continuity and Disaster Recovery Planning
Mean Time to Repair (2.3)
The Mean Time to Repair (MTTR) describes how long it will take to recover a specific failed system. It is the best estimate for reconstituting the IT system so that business continuity may occur.
Continuity of Operations Plan: COOP (2.6)
(Purpose) provide procedures and capabilities to sustain an organization's essential, strategic functions at an alternate site for up to 30 days (NIST SP 800-34). (Scope) addresses the subset of an organization's missions that are deemed most critical; usually written at headquarters level; not IT focused
Business Recovery (or Resumption) Plan: BRP (2.6)
(Purpose) provide procedures for recovering business operations immediately following a disaster (NIST SP 800-34). (Scope) addresses business processes; not IT focused; IT addressed based only on its support for business process
Business Continuity Plan: BCP (2.6)
(Purpose) provide procedures for sustaining essential business operations while recovering from a significant disruption (NIST SP 800-34). (Scope) addresses business processes; IT addressed based only on its support for business process
Crisis Communications Plan (2.6)
(Purpose) provide procedures for disseminating status reports to personnel and the public (NIST SP 800-34). (Scope) addresses communications with personnel and the public; not IT focused
Common BCP/DRP mistakes (4.2)
-Lack of management support -Lack of business unit involvement -Lack of prioritization among critical staff -Improper (often overly narrow) scope -Inadequate telecommunications management -Inadequate supply chain management -Incomplete or inadequate crisis management plan -Lack of testing -Lack of training and awareness -Failure to keep the BCP/DRP up to date
Cold Site (2.5)
A cold site is the least expensive recovery solution to implement. It does not include backup copies of data nor does it contain any immediately available hardware. After a disruptive event, a cold site will take the longest amount of time of all recovery solutions to implement and restore critical IT services for the organization. Especially in a disaster area, it could take weeks to get vendor hardware shipments in place so organizations using a cold site recovery solution will have to be able to withstand a significantly long MTD. A cold site is typically a datacenter with a raised floor, power, utilities, and physical security, but not much beyond that.
Hot Site (2.5)
A hot site is a location that an organization may relocate to following a major disruption or disaster. It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured computers. The hot site will have all necessary hardware and critical applications data mirrored in real time. A hot site allows the organization to resume critical operations within a very short period of time, sometimes in less than an hour. It is important to note the difference between a hot site and a redundant site. Hot sites can quickly recover critical IT functionality; recovery may even be measured in minutes instead of hours. However, a redundant site will appear to the end user to be operating normally no matter what the state of operations is for the IT program. A hot site has all the same physical, technical, and administrative controls implemented as the production site.
Call Trees (2.7)
A key tool leveraged for staff communication by the Crisis Communications Plan is the Call Tree, which is used to quickly communicate news throughout an organization without overburdening any specific person. The Call Tree works by assigning each employee a small number of other employees they are responsible for calling in an emergency event. For example, the organization president may notify executive leadership of an emergency situation and they, in turn, will notify their top tier managers. The top tier managers will then call the people they have been assigned to call. The Call Tree continues until all affected personnel have been contacted.
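The notification flow described above can be sketched as a breadth-first walk over a tree of contacts. This is only an illustration: the names in `CALL_TREE` and the helper functions are hypothetical and are not taken from any particular plan or tool.

```python
from collections import deque

# Hypothetical call tree: each person maps to the people they must call.
CALL_TREE = {
    "president": ["exec_a", "exec_b"],
    "exec_a": ["manager_1", "manager_2"],
    "exec_b": ["manager_3"],
    "manager_1": ["staff_1", "staff_2"],
    "manager_2": [],
    "manager_3": ["staff_3"],
}


def call_order(tree, root):
    """Return (caller, callee) pairs, tier by tier, starting from root."""
    calls = []
    queue = deque([root])
    seen = {root}
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            if callee in seen:
                continue  # never call the same person twice
            seen.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls


def max_calls_per_person(tree):
    """Largest number of calls any single person is responsible for."""
    return max((len(callees) for callees in tree.values()), default=0)


if __name__ == "__main__":
    for caller, callee in call_order(CALL_TREE, "president"):
        print(f"{caller} calls {callee}")
```

Keeping `max_calls_per_person` small is what prevents any one person from being overburdened; in this sample tree nobody places more than two calls.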
Failure and Recovery Metrics (2.3)
A number of metrics are used to quantify how frequently systems fail, how long a system may exist in a failed state, and the maximum time to recover from failure. These metrics include the Recovery Point Objective (RPO), Recovery Time Objective (RTO), Work Recovery Time (WRT), Mean Time Between Failures (MTBF), Mean Time to Repair (MTTR), and Minimum Operating Requirements (MOR).
Redundant Site (2.5)
A redundant site is an exact production duplicate of a system that has the capability to seamlessly operate all necessary IT operations without loss of services to the end user of the system. A redundant site receives data backups in real time so that in the event of a disaster, the users of the system have no loss of data. It is a building configured exactly like the primary site and is the most expensive recovery option because it effectively more than doubles the cost of IT operations. To be fully redundant, a site must receive real-time data backups from the production system, and the end user should not notice any difference in IT services or operations during a disruptive event.
Simulation Test/Walk-Through Drill (3.1)
A simulation test, also called a walk-through drill (not to be confused with the discussion-based structured walk-through), goes beyond talking about the process and actually has teams carry out the recovery process. A pretend disaster is simulated, and the team must respond to it as directed by the DRP. The scope of simulations varies significantly and tends to grow more complicated and involve more systems as smaller disaster simulations are successfully managed. Though some will see the goal as being able to successfully recover the systems impacted by the simulation, ultimately, the goal of any testing of a DRP is to help ensure that the organization is well prepared in the event of an actual disaster.
Training (3.2)
Although there is an element of DRP training that comes as part of performing the tests discussed above, there is certainly a need for more detailed training on some specific elements of the DRP process. Another aspect of training is to ensure adequate representation on staff of those trained in basic first aid and CPR.
Structured Walk-Through/Tabletop (3.1)
Another test that is commonly completed at the same time as the checklist test is that of the structured walk-through, which is also often referred to as a tabletop exercise. During this type of DRP test, usually performed prior to more in-depth testing, the goal is to allow individuals who are knowledgeable about the systems and services targeted for recovery to thoroughly review the overall approach. The term structured walk-through is illustrative, as the group will talk through the proposed recovery procedures in a structured manner to determine whether there are any noticeable omissions, gaps, erroneous assumptions, or simply technical missteps that would hinder the recovery process from successfully occurring.
Parallel Processing (3.1)
Another type of DRP test is parallel processing. This type of test is common in environments where transactional data is a key component of the critical business processing. Typically, this test involves recovering critical processing components at an alternate computing facility and then restoring data from a previous backup. Note that regular production systems are not interrupted.
Partial and Complete Business Interruption (3.1)
Arguably, the highest-fidelity DRP test is business interruption testing. However, this type of test can actually be the cause of a disaster, so extreme caution should be exercised before attempting an actual interruption test. As the name implies, the business interruption style of testing has the organization actually stop processing normal business at the primary location and instead leverage the alternate computing facility. These types of tests are more common in organizations where fully redundant, often load-balanced, operations already exist.
Related Plans (2.6)
As discussed previously, the Business Continuity Plan is an umbrella plan that contains other plans. In addition to the Disaster Recovery Plan, other plans include the Continuity of Operations Plan (COOP), the Business Resumption/Recovery Plan (BRP), Continuity of Support Plan, Cyber Incident Response Plan, Occupant Emergency Plan (OEP), and the Crisis Management Plan (CMP). Table 8.2, from NIST Special Publication 800-34, summarizes these plans.
Assessing the Critical State (2.2)
Assessing the critical state can be difficult because determining which pieces of the IT infrastructure are critical depends solely on how they support the users within the organization. For example, without consulting all of the users, a simple mapping program may not seem to be a critical asset for an organization. However, if there is a user group that drives trucks and makes deliveries for business purposes, this mapping software may be critical for them to schedule pickups and deliveries.
BCP/DRP Mistakes (4.2)
Business continuity and disaster recovery planning is a business's last line of defense against failure. If other controls have failed, BCP/DRP is the final control. If it fails, the business may fail. The success of BCP/DRP is critical, but many plans fail. The BCP team should consider the failures of other organizations' plans and view their own under intense scrutiny. They should ask themselves this question: "Have we made mistakes that threaten the success of our plan?"
Change Management (4.1)
Change management includes tracking and documenting all planned changes, formal approval for substantial changes, and documentation of the results of the completed change. All changes must be auditable. The BCP team should be a member of the change control board and attend all meetings. The goal of the BCP team's involvement on the change control board is to identify any changes that must be addressed by the BCP/DRP.
Checklist (3.1)
Checklist (also known as consistency) testing lists all necessary components required for successful recovery and ensures that they are, or will be, readily available should a disaster occur. For example, if the disaster recovery plan calls for the reconstitution of systems from tape backups at an alternate computing facility, does the site in question have an adequate number of tape drives on hand to carry out the recovery in the indicated window of time? The checklist test is often performed concurrently with the structured walk-through or tabletop testing as a solid first testing threshold. The checklist test is focused on ensuring that the organization has, or can acquire in a timely fashion, a sufficient level of the resources on which its successful recovery depends.
DEVELOPING A BCP/DRP (2)
Developing a BCP/DRP is vital for an organization's ability to respond to and recover from an interruption in normal business functions or a catastrophic event. In order to ensure that all planning has been considered, the BCP/DRP has a specific set of requirements to review and implement. Listed below are the high-level steps, according to NIST 800-34, for achieving a sound, logical BCP/DRP. NIST 800-34 is the National Institute of Standards and Technology's Information Technology Contingency Planning Guide. -Project Initiation -Scope the Project -Business Impact Analysis -Identify Preventive Controls -Recovery Strategy -Plan Design and Development -Implementation, Training, and Testing -BCP/DRP Maintenance[2]
ISO/IEC 27031 (5.2)
ISO/IEC 27031 is part of the ISO 27000 series, which also includes ISO 27001 and ISO 27002 (discussed in Chapter 1, "Domain 1: Information Security Governance and Risk Management"). ISO/IEC 27031 focuses on BCP (DRP is handled by another framework; see below).
Activate Team (1.5)
If a disaster is declared, then the recovery team needs to be activated. Depending on the scope of the disaster, this communication could prove extremely difficult. The use of calling trees, which will be discussed in Section "Call Trees" in this chapter, can help to facilitate this process to ensure that members can be activated as smoothly as possible.
Respond (1.5)
In order to begin the disaster recovery process, there must be an initial response that begins the process of assessing the damage. Speed is essential during this initial assessment. The initial assessment will determine if the event in question constitutes a disaster.
Project Initiation (2.1)
In order to develop the BCP/DRP, the scope of the project must be determined and agreed upon. Project Initiation involves seven distinct milestones[3] as listed below: 1. "Develop the contingency planning policy statement: A formal department or agency policy provides the authority and guidance necessary to develop an effective contingency plan. 2. Conduct the business impact analysis (BIA): The BIA helps to identify and prioritize critical IT systems and components. A template for developing the BIA is also provided to assist the user. 3. Identify preventive controls: Measures taken to reduce the effects of system disruptions can increase system availability and reduce contingency life cycle costs. 4. Develop recovery strategies: Thorough recovery strategies ensure that the system may be recovered quickly and effectively following a disruption. 5. Develop an IT contingency plan: The contingency plan should contain detailed guidance and procedures for restoring a damaged system. 6. Plan testing, training, and exercises: Testing the plan identifies planning gaps, whereas training prepares recovery personnel for plan activation; both activities improve plan effectiveness and overall agency preparedness. 7. Plan maintenance: The plan should be a living document that is updated regularly to remain current with system enhancements."[4]
DRP Testing (3.1)
In order to ensure that a Disaster Recovery Plan represents a viable plan for recovery, thorough testing is needed. Given the DRP's detailed tactical subject matter, it should come as no surprise that routine infrastructure, hardware, software, and configuration changes will alter the way the DRP needs to be carried out. Organizations' information systems are in a constant state of flux, but unfortunately, many of these changes do not readily make their way into an updated DRP. To ensure both the initial and continued efficacy of the DRP as a feasible recovery methodology, testing needs to be performed.
Mean Time between Failures (2.3)
Mean Time Between Failures (MTBF) quantifies how long a new or repaired system will run before failing. It is typically generated by a component vendor and is largely applicable to hardware as opposed to applications and software.
Minimum Operating Requirements (2.3)
Minimum Operating Requirements (MOR) describe the minimum environmental and connectivity requirements in order to operate computer equipment. It is important to determine and document what the MOR is for each IT-critical asset because, in the event of a disruptive event or disaster, proper analysis can be conducted quickly to determine if the IT assets will be able to function in the emergency environment.
Mobile Site (2.5)
Mobile sites are "datacenters on wheels": towable trailers that contain racks of computer equipment, as well as HVAC, fire suppression, and physical security. They are a good fit for disasters such as a datacenter flood, where the datacenter is damaged but the rest of the facility and surrounding property are intact. They may be towed onsite, supplied power and network, and brought online.
Recovery Strategy (2.5)
Once the BIA is complete, the BCP team knows the Maximum Tolerable Downtime. This metric, as well as others including the Recovery Point Objective and Recovery Time Objective, is used to determine the recovery strategy. A cold site cannot be used if the MTD is 12 hours, for example. As a general rule, the shorter the MTD, the more expensive the recovery solution will be.
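As a rough illustration of how the MTD narrows the choice of recovery strategy, the sketch below filters candidate sites by an assumed recovery time and then picks the cheapest one that still fits. The hour figures and cost ranks in `SITE_OPTIONS` are illustrative assumptions chosen to match the qualitative descriptions in these notes (cold sites take weeks, hot sites about an hour); they are not values from NIST SP 800-34 or any vendor.

```python
# Illustrative assumptions only: (site name, assumed hours to recover,
# relative cost rank where 1 is the cheapest option).
SITE_OPTIONS = [
    ("cold site", 14 * 24, 1),
    ("warm site", 48, 2),
    ("hot site", 1, 3),
    ("redundant site", 0, 4),
]


def viable_sites(mtd_hours):
    """Return the sites whose assumed recovery time fits within the MTD."""
    return [name for name, hours, _cost in SITE_OPTIONS if hours <= mtd_hours]


def cheapest_viable_site(mtd_hours):
    """Return the lowest-cost site that still meets the MTD, or None."""
    candidates = [
        (cost, name) for name, hours, cost in SITE_OPTIONS if hours <= mtd_hours
    ]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    print(viable_sites(12))
    print(cheapest_viable_site(12))
```

With a 12-hour MTD the cold and warm sites drop out under these assumed figures, which mirrors the general rule that a shorter MTD forces a more expensive solution.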
CONTINUED BCP/DRP MAINTENANCE (4)
Once the initial BCP/DRP is completed, tested, trained, and implemented, it must be kept up to date. Business and IT systems change quickly, and IT professionals are accustomed to adapting to that change. BCP/DRPs must keep pace with all critical business and IT changes.
Communicate (1.5)
One of the most difficult aspects of disaster recovery is ensuring that consistent timely status updates are communicated back to the central team managing the response and recovery process. This communication often must occur out-of-band, meaning that the typical communication method of leveraging an office phone will quite often not be a viable option. In addition to communication of internal status regarding the recovery activities, the organization must be prepared to provide external communications, which involve disseminating details regarding the organization's recovery status with the public.
Identify Preventive Controls (2.4)
Preventive controls prevent disruptive events from having an impact. For example, as stated in Chapter 10, "Domain 10: Physical (Environmental) Security," HVAC systems are designed to prevent computer equipment from overheating and failing. The BIA will identify some risks that may be mitigated immediately. This is another advantage of performing BCP/DRP, including the BIA: it improves your security, even if no disaster occurs.
DRP TESTING AND TRAINING (3)
Testing, training, and awareness must be performed for the "disaster" portion of a BCP/DRP. Skipping these steps is one of the most common BCP/DRP mistakes. Some organizations "complete" their DRP and then consider the matter resolved and put the big DRP binder on a shelf to collect dust. This approach is wrong on numerous levels. First, a DRP is never complete, but is rather a continually amended method for ensuring the organization's ability to recover in an acceptable manner. Second, while well-meaning individuals carry out the creation and update of a DRP, even the most diligent of administrators will make mistakes. To find and correct these issues before they hinder recovery in an actual disaster, testing must be carried out on a regular basis. Third, any DRP that will be effective will have some inherently complex operations and maneuvers to be performed by administrators. There will always be unexpected occurrences during disasters, but each member of the DRP team should be exceedingly familiar with the particulars of their role in the DRP, which calls for training on the process. Finally, awareness of the general user's role in the DRP, as well as awareness of the organization's emphasis on ensuring the safety of personnel and business operations in the event of a disaster, is imperative. This section will provide details on steps to effectively test, train, and build awareness for the organization's DRP.
Conduct BCP/DRP-Focused Risk Assessment (2.3)
The BCP/DRP-focused risk assessment determines what risks are inherent to which IT assets. A vulnerability analysis is also conducted for each IT system and major application. This is done because most traditional BCP/DRP evaluations focus on physical security threats, both natural and human.
BCI (5.3)
The Business Continuity Institute published a six-step Good Practice Guidelines (GPG) in 2008 that describes the Business Continuity Management (BCM) process: "Section 1 consists of the introductory information plus BCM Policy and Programme Management. Section 2 is Understanding the Organisation. Section 3 is Determining BCM Strategy. Section 4 is Developing and Implementing BCM Response. Section 5 is Exercising, Maintaining & Reviewing BCM arrangements. Section 6 is Embedding BCM in the Organisation's Culture."[10]
[9] ISO/IEC 27031 Information technology, Security techniques, Guidelines for ICT Readiness for Business Continuity (final committee draft).
[10] Business Continuity Management Good Practice Guidelines 2008.
Relationship between BCP and DRP (1.3)
The Business Continuity Plan is an umbrella plan that includes multiple specific plans, most importantly the Disaster Recovery Plan. The Disaster Recovery Plan serves as a subset of the overall Business Continuity Plan, because a BCP would be doomed to fail if it did not contain a tactical method for immediately dealing with disruption of information systems. Figure 8.1, from NIST Special Publication 800-34, provides a visual means for understanding the interrelatedness of a BCP and a DRP, as well as Continuity of Operations Plan (COOP), Occupant Emergency Plan (OEP), and others.
Conduct Business Impact Analysis (2.3)
The Business Impact Analysis (BIA) is the formal method for determining how a disruption to the IT system(s) of an organization will impact the organization's requirements, processes, and interdependencies with respect to the business mission.[5] It is an analysis to identify and prioritize critical IT systems and components. It enables the BCP/DRP project manager to fully characterize the IT contingency requirements and priorities.[6] The objective is to correlate each IT system component with the critical service it supports. It also aims to quantify the consequence of a disruption to the system component and how that will affect the organization. The primary goal of the BIA is to determine the Maximum Tolerable Downtime (MTD) for a specific IT asset. This will directly impact which disaster recovery solution is chosen.
DRP Review (3.1)
The DRP review is the most basic form of initial DRP testing and is focused on simply reading the DRP in its entirety to ensure completeness of coverage. This review is typically performed by the team that developed the plan, whose members read through it to quickly check the overall plan for any obvious flaws. The DRP review is primarily just a sanity check to ensure that there are no glaring omissions in coverage or fundamental shortcomings in the approach.
Disaster Recovery Planning (1.2)
The Disaster Recovery Plan (DRP) provides a short-term plan for dealing with specific IT-oriented disruptions. Mitigating a malware infection that shows risk of spreading to other systems is an example of a specific IT-oriented disruption that a DRP would address. The DRP focuses on efficiently attempting to mitigate the impact of a disaster and the immediate response and recovery of critical IT systems in the face of a significant disruptive event. Disaster Recovery Planning is considered tactical rather than strategic and provides a means for immediate response to disasters.
Recovery Point Objective (2.3)
The Recovery Point Objective (RPO) is the amount of data loss or system inaccessibility (measured in time) that an organization can withstand. "If you perform weekly backups, someone made a decision that your company could tolerate the loss of a week's worth of data. If backups are performed on Saturday evenings and a system fails on Saturday afternoon, you have lost the entire week's worth of data. This is the Recovery Point Objective. In this case, the RPO is 1 week."[7] The RPO represents the maximum acceptable amount of data/work loss for a given process because of a disaster or disruptive event.
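The weekly-backup example in the quotation can be checked with simple date arithmetic. The sketch below is a minimal illustration; the function names and the specific dates are assumptions chosen to match the Saturday-evening backup and Saturday-afternoon failure described above.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup, failure_time):
    """Time span of work lost: everything since the last good backup."""
    return failure_time - last_backup


def meets_rpo(last_backup, failure_time, rpo):
    """True if the data lost at failure_time stays within the RPO."""
    return data_loss_window(last_backup, failure_time) <= rpo


# Assumed dates: backup on Saturday evening, failure the following
# Saturday afternoon, just before the next weekly backup would run.
last_backup = datetime(2024, 1, 6, 20, 0)
failure = datetime(2024, 1, 13, 14, 0)

if __name__ == "__main__":
    print(data_loss_window(last_backup, failure))
    print(meets_rpo(last_backup, failure, timedelta(weeks=1)))
    print(meets_rpo(last_backup, failure, timedelta(days=1)))
```

The loss here is just under a full week, so a one-week RPO is met while a one-day RPO is not; tightening the RPO would require more frequent backups.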
Recovery Time Objective and Work Recovery Time (2.3)
The Recovery Time Objective (RTO) describes the maximum time allowed to recover business or IT systems. RTO is also called the systems recovery time. This is one part of Maximum Tolerable Downtime: once the system is physically running, it must be configured. Work Recovery Time (WRT) describes the time required to configure a recovered system. "Downtime consists of two elements, the systems recovery time and the work recovery time. Therefore, MTD = RTO + WRT."[8]
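The relationship MTD = RTO + WRT quoted above is plain addition, which a short sketch makes concrete. The function names and the sample hour values are hypothetical and serve only as a worked example.

```python
def max_tolerable_downtime(rto_hours, wrt_hours):
    """MTD = RTO + WRT, per the formula quoted above."""
    return rto_hours + wrt_hours


def fits_within_mtd(rto_hours, wrt_hours, mtd_hours):
    """True if planned recovery plus configuration time stays within the MTD."""
    return max_tolerable_downtime(rto_hours, wrt_hours) <= mtd_hours


if __name__ == "__main__":
    # Hypothetical figures: 8 hours to get systems running, 4 to configure them.
    print(max_tolerable_downtime(8, 4))
    print(fits_within_mtd(8, 4, mtd_hours=12))
    print(fits_within_mtd(10, 4, mtd_hours=12))
```

In planning terms, if the BIA fixes the MTD at 12 hours and configuration is expected to take 4 hours, the RTO can be no longer than 8 hours.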
Identify Critical Assets (2.3)
The critical asset list is a list of those IT assets that are deemed business essential by the organization. These systems' DRP/BCP must have the best available recovery capabilities assigned to them.
Determine Maximum Tolerable Downtime (2.3)
The primary goal of the BIA is to determine the Maximum Tolerable Downtime (MTD), which describes the total time a system can be inoperable before an organization is severely impacted. It is the maximum time it takes to execute the reconstitution phase. Reconstitution is the process of moving an organization from disaster recovery back to business operations. Maximum Tolerable Downtime is composed of two metrics: the Recovery Time Objective (RTO) and the Work Recovery Time (WRT); see below.
Reconstitution (1.5)
The primary goal of the reconstitution phase is to successfully recover critical business operations at either the primary or a secondary site. If an alternate site is leveraged, adequate safety and security controls must be in place in order to maintain the expected degree of security the organization typically employs. The use of an alternate computing facility for recovery should not expose the organization to further security incidents. In addition to the recovery team's efforts at reconstitution of critical business functions at an alternate location, a salvage team will be employed to begin the recovery process at the primary facility that experienced the disaster. Ultimately, the expectation is, unless wholly unwarranted given the circumstances, that the primary site will be recovered and that the alternate facility's operations will "fail back" or be transferred again to the primary center of operations.
Disasters (1.4)
Disaster causes are commonly categorized by whether the threat agent is natural, human, or environmental in nature.[1] Natural: The most obvious type of threat that can result in a disaster is naturally occurring. This category includes threats such as earthquakes, hurricanes, tornadoes, floods, and some types of fires. Historically, natural disasters have been some of the most devastating events that an organization may have to respond to. Human: The human category of threats represents the most common source of disasters. Human threats can be further classified by whether they constitute an intentional or unintentional threat. Environmental: Threats focused on information systems or datacenter environments include items such as power issues (blackout, brownout, surge, spike), system component or other equipment failures, and application or software flaws.
Assess (1.5)
Though an initial assessment was carried out during the initial response portion of the disaster recovery process, a more detailed and thorough assessment will be performed by the disaster recovery team. The team will proceed to assess the extent of the damage to determine the proper steps necessary to ensure the organization's ability to meet its mission.
Starting Emergency Power (3.2)
Though it might seem simple, converting a datacenter to emergency power, such as backup generators that will begin taking the load as the UPS units fail, is not to be taken lightly. Specific training and testing of changing over to emergency power should be regularly performed.
Disruptive Events (1.4)
Types of disruptive events include: Errors and omissions: typically considered the most common source of disruptive events. This type of threat is caused by humans who unintentionally serve as a source of harm. Natural disasters: include earthquakes, hurricanes, floods, tsunamis, etc. Electrical or power problems: loss of power may cause availability issues and integrity issues due to corrupted data. Temperature and humidity failures: may damage equipment due to overheating, corrosion, or static electricity. Warfare, terrorism, and sabotage: the threat can vary dramatically based on geographic location, industry, brand value, and the interrelatedness with other high-value target organizations. Financially motivated attackers: attackers who seek to make money by attacking victim organizations; examples include exfiltration of cardholder data, identity theft, pump-and-dump stock schemes, bogus antimalware tools, and corporate espionage, among others. Personnel shortages: may be caused by strikes, pandemics, or transportation issues. A lack of staff may lead to operational disruption.
SPECIFIC BCP/DRP FRAMEWORKS (5)
Given the patchwork of overlapping terms and processes used by various BCP/DRP frameworks, this chapter focuses on universal best practices, without attempting to map to a number of different (and sometimes inconsistent) terms and processes described by various BCP/DRP frameworks.
Disasters or Disruptive Events (1.4)
Given that organizations' Business Continuity and Disaster Recovery Plans are created because of the potential of disasters impacting operations, understanding disasters and disruptive events is necessary.
The Disaster Recovery Process (1.5)
Having discussed the importance of Business Continuity and Disaster Recovery Planning and examples of threats that justify this degree of planning, we will now focus on the fundamental steps involved in recovering from a disaster.
Terms and acronyms used by ISO/IEC 27031 (5.2)
ICT: Information and Communications Technology; ISMS: Information Security Management System
Calling Tree Training/Test (3.2)
Another example of combined training and testing concerns calling trees, which were discussed previously in Section "Call Trees." The hierarchical relationships of calling trees can make outages in the tree problematic. Individuals with calling responsibilities are typically expected to be able to answer within a very short time period or otherwise make arrangements.
ISO/IEC 27031 is designed to (5.2)
"Provide a framework (methods and processes) for any organization: private, governmental, and non-governmental. Identify and specify all relevant aspects including performance criteria, design, and implementation details, for improving ICT readiness as part of the organization's ISMS, helping to ensure business continuity. Enable an organization to measure its continuity, security and hence readiness to survive a disaster in a consistent and recognized manner."[9]
Continuity of Support Plan/IT Contingency Plan (2.6)
(Purpose) provide procedures and capabilities for recovering a major application or general support system (NIST SP 800-34). (Scope) same as IT contingency plan; addresses IT system disruptions; not business process focused
Warm Site (2.5)
A warm site has some aspects of a hot site, for example, readily accessible hardware and connectivity, but it will have to rely upon backup data in order to reconstitute a system after a disruption. It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured computers.
Alternate Terms for MTD (2.3)
Depending on the business continuity framework that is used, other terms may be substituted for Maximum Tolerable Downtime. These include Maximum Allowable Downtime (MAD), Maximum Tolerable Outage (MTO), and Maximum Acceptable Outage (MAO).
Reciprocal Agreement (2.5)
A reciprocal agreement is a bidirectional agreement between two organizations in which each organization promises the other that it can move in and share space if it experiences a disaster. It is documented in the form of a contract written to gain support from outside organizations in the event of a disaster. Reciprocal agreements are also referred to as Mutual Aid Agreements (MAAs) and are structured so that each organization will assist the other in the event of an emergency.
NIST SP 800-34 (5.1)
The National Institute of Standards and Technology (NIST) Special Publication 800-34 Rev. 1 "Contingency Planning Guide for Federal Information Systems" may be downloaded at http://csrc.nist.gov/publications/nistpubs/800-34-rev1/sp800-34-rev1_errata-Nov11-2010.pdf. The document is high quality and public domain. Plans can sometimes be significantly improved by referencing SP 800-34 when writing or updating a BCP/DRP.
BCP AND DRP OVERVIEW AND PROCESS (1)
The terms and concepts associated with Business Continuity and Disaster Recovery Planning are often misunderstood. Clear understanding of what is meant by both Business Continuity and Disaster Recovery Planning, as well as what they entail, is critical for the CISSP® candidate.
Business Continuity Planning (1.1)
Though many organizations use the phrases Business Continuity Planning (BCP) and Disaster Recovery Planning interchangeably, they are two distinct disciplines. The overarching goal of a BCP is to ensure that the business will continue to operate before, throughout, and after a disaster event is experienced. The focus of a BCP is on the business as a whole and on ensuring that those critical services that the business provides or critical functions that the business regularly performs can still be carried out both in the wake of a disruption and after the disruption has been weathered.
