Chapter 5
If the probability density function (p.d.f.; for a discrete random variable, the probability mass function, p.m.f.) of failure is given by f(t), then the reliability can be expressed as
R(t) = 1 - F(t) = 1 - ∫[0, t] f(τ) dτ = ∫[t, ∞) f(τ) dτ (p. 163 of chapter 5)
The mathematical model of FTA is primarily concerned with
predicting the probability of an output failure event from the probabilities of the events that cause this output failure. For simplification purposes, we assume all the input failures are independent of each other. Two basic constructs for predicting the output events are the AND-gate and the OR-gate.
The results of FTA are particularly beneficial for designers to identify any risks involved in the design, and more specifically to
1. Allocate failure probabilities among lower levels of system components
2. Compare different design alternatives in terms of reliability and risks
3. Identify the critical paths for system failure occurrence and provide implications for avoiding certain failures
4. Help to improve system maintenance policy for more efficient performance
With the basic sources of information available and a preliminary assessment of the system structure, a team approach is applied to develop the FMEA results; the basic steps are
1. Define system requirements. Requirements for system reliability need to be clearly defined, as do the TPMs (such as the MTBF of the system) and the system operating environments. With high-level system requirements defined and refined at a lower level, the system structures can be identified, via a top-down approach, from system to subsystem level, down to the components and eventually the hardware and software units that construct the system. This provides a big picture for system reliability and a starting point to conduct FMEA analysis.

2. Construct the system FFBD. One thing we have to keep in mind is that FMEA analysis has to be based on the system design synthesis and integration results. Ideally, FMEA analysis should be paired with system functional analysis and functional allocation, as FMEA analysis is tied to each system component. To perform an FMEA analysis, the following materials/information are needed from functional analysis:
a. System functional architecture and mission requirements.
b. System FFBD.
c. System operational information for each of the functions, similarly to Figure 4.10; input/output, mechanism, and constraints information are required to identify the failure mode and its effects.
d. Rules, assumptions, and standards pertaining to components. Understanding the limitations of the current feasible technology and COTS constraints helps us to make predictions of the failure mode more meaningful. The ground rules generally include system mission, phase of the mission, operating time and cycle, derivation of failure mode data (i.e., supplier data, historical log, statistical analysis, subject matter experts' estimates, etc.), and any possible failure detection concepts and methodologies.

3. Requirements allocation and function allocation. With the FFBD, the reliability-related TPMs are allocated to the component level; this is parallel to the function allocation process described in Section 4.3.2. With the requirements allocated to the lower levels, the effects of failure on the system can be specified at a quantitative level.

4. Identify failure modes. A failure mode is the manner in which a failure occurs at a system, subsystem, or component level. There are many types of failure mode involved in a single component; to derive a comprehensive analysis of the failure modes, designers should look at various sources of information, including similar systems, supplier data, historical data for critical incident and accident studies, and any information related to environmental impacts. A typical component failure mode should include the following aspects:
a. Failure to operate in proper time
b. Intermittent operation
c. Failure to stop operating at the proper time
d. Loss of output
e. Degraded output or reduced operational capability

5. Identify causes and effects of failure. The cause of a failure mode is usually an internal process or external influence, or an interaction between these. It is very possible that more than one failure could be caused by one process, and a particular type of failure could have multiple causes. Typical causes of a failure include aging and natural wearing out of the materials, defective materials, human error, violation of procedures, damaged components due to environmental effects, and damage due to the storage, handling, and transportation of the system.
There are many tools to aid designers in laying out the sources and effects of failures and their relationships; for example, the "fishbone" diagram by Ishikawa, the Swiss cheese model, and the human factors analysis and classification system (HFACS) for human error analysis have been widely used to identify the complex structure of error cause-effect relationships. Failure effect analysis assesses the potential impact of a failure on the components or the overall system. Failures impact systems at different levels, depending on the types of failures and their influences. Generally speaking, there are three levels of failure effect:
a. Local effect: Local effects are those effects that result specifically from the failure mode of the component at the lowest level of the structure itself.
b. Parent-level effect: These are the effects that a failure has on the operation and performance of the functions at the immediate next higher level.
c. End-level effect: These effects are the ones that impact the operation and functions of the overall system as a whole. A small failure could cause no immediate effect on the system for a short period of time, cause degraded overall system performance, or cause the system to fail with catastrophic effects.

6. Identify failure detection methods. This section of FMEA identifies the methods by which the occurrence of a failure is detected. These methods include:
a. Human operators
b. Warning devices (visual or auditory)
c. Automatic sensing devices
The detection methods should include the conditions of detection (i.e., normal vs. abnormal system operations) and the times and frequencies of detection (i.e., periodic maintenance checking to identify signs of potential failure, or diagnosis of failure when symptoms are observed).

7. Assign failure severity. After all failure modes and their effects on the system are identified, the level of impact of these failures on the system needs to be ranked by assigning an appropriate severity score. This will enable design teams to prioritize failures based on the "seriousness" of the effect, so that they can be addressed in a very efficient way, especially given the limited resources available. To assign a severity score for each of the failure modes, each failure effect is evaluated in terms of the worst consequences of the failure on the higher-level system, and through iterative team efforts, a quantitative score is assigned to that particular failure. Table 5.5 illustrates a typical severity ranking and scales.

8. Assign failure mode frequency and probability of detection. This is the start of the second half of the FMEA analysis, the criticality analysis (CA). It adds additional information about system failure so that a better design can be achieved by avoiding these failures. The CA part of FMEA enables designers to identify the system reliability- and maintainability-related concerns and address these concerns in the design phase. The first step of CA is to transfer the data collected to determine the failure rate; that is, the frequency of failure occurrence. This rate is often expressed as a probability distribution, as failure occurs in a random manner. It also includes information about the accuracy of failure detection methods, combining the probability of failure detection (i.e., a correct hit detection, or a false alarm) to provide the level of uncertainty of failure occurrence and the probability of the failure being detected.

9. Analyze failure criticality.
Once the failure rate and detection probability of the failure have been identified, information pertaining to the failure needs to be consolidated to form a criticality assessment of the failure so that it can be addressed in the design. Criticality can be assessed quantitatively or qualitatively. For a quantitative assessment, various means or measures can be used to calculate the critical values of the components.
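One commonly used quantitative measure (not necessarily the one this chapter has in mind) is the MIL-STD-1629-style failure mode criticality number, Cm = β × α × λp × t. The sketch below is a minimal illustration; the failure modes and all rating values are hypothetical.

# A minimal sketch of a quantitative criticality calculation, using the
# MIL-STD-1629-style criticality number Cm = beta * alpha * lambda_p * t.
# All failure-mode data below are hypothetical illustration values.

def mode_criticality(beta, alpha, lambda_p, t):
    """Failure mode criticality number.

    beta     -- conditional probability that the failure effect occurs
    alpha    -- failure mode ratio (fraction of part failures in this mode)
    lambda_p -- part failure rate (failures per hour)
    t        -- operating time (hours)
    """
    return beta * alpha * lambda_p * t

# Hypothetical failure modes for one component: (name, beta, alpha, lambda_p)
modes = [
    ("loss of output",  1.0, 0.4, 2e-5),
    ("degraded output", 0.5, 0.5, 2e-5),
    ("intermittent",    0.1, 0.1, 2e-5),
]

t = 1000  # mission time in hours
for name, beta, alpha, lam in modes:
    print(f"{name}: Cm = {mode_criticality(beta, alpha, lam, t):.5f}")

# Item criticality is the sum over all failure modes of the item.
Cr = sum(mode_criticality(b, a, l, t) for _, b, a, l in modes)
print(f"item criticality Cr = {Cr:.5f}")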
System reliability has four major characteristics (the four elements determining the reliability of a system):
1. It is a probability. A system becomes unreliable due to failures that occur randomly, which, in turn, also makes system reliability a random variable. The probability of system reliability provides a quantitative measure for such a random phenomenon. For example, a reliability of 0.90 for a system to operate for 80 h implies that the system is expected to function properly for at least 80 h, 90 out of 100 times. A probability measures the odds or the fraction/percentage of the number of times that the system will be functional, not the percentage of the time that the system is working. An intuitive definition of reliability is as follows: suppose there are n identical components that are simultaneously subjected to a test under design operating conditions; during the interval of time [0, t], n_f(t) components are found to have failed and the remaining n_s(t) survived. At time t, the reliability can be estimated as R(t) = n_s(t)/n (see the sketch after this list).

2. Satisfactory performance is specified for system reliability. It defines the criteria at which the system is considered to be functioning properly. These criteria are derived from systems requirement analysis and functional analysis, and must be established to measure reliability. Satisfactory performance may be a particular value to be achieved, or sometimes a fuzzy range, depending on the different types of systems or components involved.

3. System reliability is a function of time. As seen in the definition of system reliability, reliability is defined for a certain system operation time period. If the time period changes, one would expect the value for reliability to change also. It is common sense that one would expect the chance of system failure over an hour to be a lot lower than that over a year! Thus, a system is more reliable over a shorter period of time. Time is one of the most important factors in system reliability; many reliability-related factors are expressed as a function of time, such as MTBF.

4. Reliability needs to be considered under the specified operating conditions. These conditions include environmental factors such as temperature, humidity, vibration, or surrounding locations. These environmental factors specify the normal conditions under which the system is functional. As mentioned in Chapter 2, almost every system may be considered an open system, as it interacts with the environment, and the environment will have a significant impact on system performance. For example, if you submerge a laptop computer under water, it will probably fail immediately. System reliability has to be considered in the context of the designed environment; it is an inherent system characteristic. System failures should be distinguished from accidents and damage caused by violation of the specified conditions. This is why product warranties do not cover accidents caused by improper use of systems.
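To make the intuitive estimate R(t) = n_s(t)/n concrete, here is a minimal simulation sketch. It assumes exponentially distributed component lifetimes with a hypothetical failure rate chosen so that theory predicts R(80) ≈ 0.90, matching the example in item 1.

# A minimal sketch of the intuitive reliability estimate R(t) = n_s(t)/n:
# simulate n identical components under test and count survivors at time t.
# The exponential lifetime and failure rate below are illustrative assumptions.
import random

def estimate_reliability(n, t, failure_rate):
    """Estimate R(t) as the fraction of n components still working at time t."""
    survivors = 0
    for _ in range(n):
        time_to_failure = random.expovariate(failure_rate)
        if time_to_failure > t:
            survivors += 1
    return survivors / n

# 10,000 components, hypothetical failure rate of 0.00131 failures/h:
# theory gives R(80) = exp(-0.00131 * 80), about 0.90.
print(estimate_reliability(10_000, 80, 0.00131))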
Starting from a very high level, requirements regarding system reliability are defined both quantitatively and qualitatively, including
1. Performance and effectiveness factors for system reliability
2. System operational life cycle for measuring reliability
3. Environmental conditions in which the system is expected to be used and maintained (such as temperature, humidity, vibration, radiation, etc.)
To conduct an FMEA analysis, there are some basic requirements that need to be fulfilled first. These requirements include:
1. System structure in schematic form. Without a basic understanding of the system architecture, especially the hardware and software structures, one cannot identify the possible consequences if one or more components fail. This is the starting point of FMEA analysis.
2. System functional FFBD. As we stated earlier, the FFBD is the foundation for many analyses; only with the FFBD specified can functions be allocated to components, components allocated to hardware, software, or humans, and the operational relationships between the components defined, all of which is necessary to conduct FMEA analysis.
3. Knowledge of system requirements. System hardware and software architecture is derived based on requirements. At any point in the design, requirements are needed to verify the design decisions; ultimately, this is important for FMEA-related analysis, since FMEA is an inductive approach and everything is assessed on an empirical basis.
4. A comprehensive understanding of the system components. This includes, but is not limited to, access to current technology, understanding of COTS items, and knowledge of supply chain operations and structures related to system components.
Series Structure
A series system functions if and only if all of its components are functioning. If any one of the components fails, then the system fails; as seen in Figure 5.3, a failed component will cause the whole path to be broken.
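For independent components, this means the system reliability is the product of the component reliabilities, R = R1 × R2 × ... × Rn. A minimal sketch, with hypothetical component values:

# Series-structure reliability: with independent components, the system
# works only if every component works, so the system reliability is the
# product of the component reliabilities.
import math

def series_reliability(component_reliabilities):
    return math.prod(component_reliabilities)

# Three hypothetical components in series:
print(series_reliability([0.95, 0.90, 0.92]))  # 0.7866

Note that each added component multiplies in another factor below 1, so the product only shrinks as the series grows.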
Qualitative analysis is used when the failure rate for the item is not available.
A typical method used in qualitative analysis is to use the risk priority number (RPN) to rank and identify concerns or risks associated with the components due to the design decisions. The number provides a means to delineate the more critical aspects of the system design. The RPN can be determined from:

RPN = (severity rating) × (frequency rating) × (probability of detection rating)

Generally speaking, a component with a high frequency of failure, high impact/severity of failure effect, and difficulty of failure detection usually has a high RPN. Such components should be given high priority in the design consideration.
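A minimal sketch of RPN ranking; the component names and ratings below are hypothetical illustration values on typical 1-10 scales.

# Rank hypothetical components by RPN = severity * frequency * detection.
components = [
    # (name, severity, frequency, detection-difficulty)
    ("pump seal",     8, 6, 7),
    ("control board", 9, 2, 4),
    ("fan bearing",   4, 7, 3),
]

ranked = sorted(components, key=lambda c: c[1] * c[2] * c[3], reverse=True)
for name, sev, freq, det in ranked:
    print(f"{name}: RPN = {sev * freq * det}")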
What is an AND Gate
All the input events (Ei, i = 1, 2, ..., n) attached to the AND-gate must occur in order for the output event (A) above the gate to occur. That is to say, in terms of the probability model, the output event is the intersection of all the input events. For example, for the three-branched AND-gate illustrated in Figure 5.10, if P1 = 0.95, P2 = 0.90, and P3 = 0.92, then

P(A) = P1 × P2 × P3 = (0.95)(0.90)(0.92) ≈ 0.79
How is FMEA a bottom-up inductive process
Generally speaking, FMEA is a bottom-up inductive approach to analyzing the possible component failure modes within the system, classifying them into different categories, severities, and likelihoods, and identifying the consequences caused by these failures in order to develop a proactive approach to prevent them from occurring, along with the related maintenance policy for these failures. It is an inductive process because FMEA starts with detailed, specific examples and cases of failure to gradually derive general propositions regarding system reliability predictions (as opposed to the deductive approach, where the specific examples are derived from the general propositions, as in the fault tree analysis approach).
What is an OR gate
If the output failure occurs when one or more of the input events (Ei, i = 1, 2, ..., n) occurs, then an OR-gate is used for this causal relationship. In terms of the probability model, the OR-gate structure represents the union of the input events attached to it. For example, for the three-event OR-gate illustrated in Figure 5.12, if P4 = 0.95, P5 = 0.90, and P6 = 0.92, then

P(B) = 1 - (1 - P4)(1 - P5)(1 - P6) = 1 - (1 - 0.95)(1 - 0.90)(1 - 0.92) = 1 - 0.0004 = 0.9996
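A minimal sketch verifying both gate calculations above, assuming independent input events:

# AND-gate: intersection of independent events (product of probabilities).
# OR-gate: union of independent events (complement of the product of
# complements).
import math

def and_gate(probs):
    return math.prod(probs)

def or_gate(probs):
    return 1 - math.prod(1 - p for p in probs)

print(and_gate([0.95, 0.90, 0.92]))  # 0.7866, about 0.79
print(or_gate([0.95, 0.90, 0.92]))   # 0.9996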
Failure rate is a function of time; it varies with different time intervals and at different times in the system life cycle. Thus, when plotting the system failure rate over time from a system life cycle perspective, it exhibits a so-called "bathtub" curve shape. At the beginning of the system life cycle, the system is being designed, concepts are being explored, and system components are being selected and evaluated. At this stage, because the system is immature, there are many "bugs" that need to be fixed; there are many incompatibilities among components, and many errors are being fixed. The system gradually becomes more reliable as the design effort proceeds; thus, the failure rate of the system components decreases. This is the typical behavior of the system failure rate in the early life cycle period, as shown in the first segment of the failure rate curve in Figure 5.2: the decreasing failure rate period, or the "infant mortality region." Once the system is designed and put into operation, it reaches its steady-state period in terms of failure rate and presents a relatively constant failure rate behavior. The system is in its maturity period, as presented in the middle region in Figure 5.2. In this stage, system failure is more of a random phenomenon with a steady failure rate, which is expected under normal operating conditions. When the system approaches the end of its life cycle, it is in its wear-out phase, characterized by its incompatibility with new technology and user needs and by its worn-out condition caused by age; it presents a significantly increasing pattern of failure occurrence, as seen in the last region of the life cycle in Figure 5.2. Failures are no longer solely due to randomness but to the deterministic factors mentioned above; it is time to retire the system and start designing a new one, and a new bathtub curve will evolve again.
When one component fails, there is a switch to connect a backup component instantly, as shown in Figure 5.6. This type of system is also called a redundant standby network.
In a standby system, the backup component is not put into operation until the preceding component fails. For example, in Figure 5.6, at the beginning, only Component 1 is operative while Components 2 and 3 are standing by. When Component 1 fails, Component 2 is immediately put into use until it fails; then Component 3 becomes operational. When Component 3 fails, the system stops working, and this is considered a system failure. In standby systems, the failures of individual components are not totally independent of each other; this is different from a purely parallel network, in which failures occur independently. In standby structures, failures occur one at a time, while in a parallel network, two parts can fail at the same time.
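A minimal Monte Carlo sketch contrasting the two structures, assuming a perfect, instantaneous switch and exponential component lifetimes (both simplifying assumptions for illustration). With the same three components, the standby network comes out more reliable than the purely parallel one, because components are not "used up" while standing by.

# Standby: components operate one at a time, so system life is the SUM
# of the component lifetimes. Parallel: all components operate from time
# zero, so system life is the MAX of the component lifetimes.
import random

def standby_survives(t, rate, n_components=3):
    total_life = sum(random.expovariate(rate) for _ in range(n_components))
    return total_life > t

def parallel_survives(t, rate, n_components=3):
    longest = max(random.expovariate(rate) for _ in range(n_components))
    return longest > t

trials, t, rate = 100_000, 100, 0.01
print(sum(standby_survives(t, rate) for _ in range(trials)) / trials)   # ~0.92
print(sum(parallel_survives(t, rate) for _ in range(trials)) / trials)  # ~0.75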
Difference between TPM and DDP
In module 5, we have reviewed one of the important aspects of systems design: DDPs and TPMs. DDPs are design-dependent parameters, specifications that are derived from the systems requirements and further define the system in detail. The quantitative values of DDPs are called TPMs. One of the critical roles of systems engineers is to translate the requirements into TPMs that accurately reflect the systems requirements. In this module, we first reviewed one of the most important TPMs for almost every system: system reliability.
As an assumption for system status, the system is either in a functional condition or a state of failure, so the cumulative probability distribution function of failure F(t) is the complement of R(t), or
R(t) + F(t) = 1
So, knowing the distribution of failure, we can derive the reliability by
R(t) = 1 - F(t)
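For example, assuming the commonly used exponential failure distribution with constant failure rate λ:

F(t) = 1 - e^(-λt), so R(t) = 1 - F(t) = e^(-λt)

With λ = 0.002 failures per hour, the reliability over 100 h is R(100) = e^(-0.2) ≈ 0.819.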
Reliability is a function of time t. Let T be a random variable denoting the time to failure. The reliability function at time t can be expressed as a cumulative probability: the probability that the system survives at least until time t without any failure, which is expressed as

R(t) = P(T > t)
Generally speaking, there are four basic steps involved in conducting an FTA analysis:
Step 1: Develop the functional reliability diagram. Develop a functional block diagram for system reliability, based on the system FFBD model, focusing on the no-go functions and the functions of diagnosis and detection. Starting from the system FFBD and following a top-down approach, a tree structure for critical system failure events is identified. This diagram includes information about the structures of the no-go events, what triggers/activates the events, and what the likelihoods and possible consequences of those events are.

Step 2: Construct the fault tree. Based on the relationships described in the functional diagram, a fault tree is constructed by using the symbols from Figure 5.8. The fault tree is based on the functional diagram but is not exactly the same, in the sense that functional models follow the system operational sequences of functions while the FTA tree follows the logical paths of cause-effect failure relationships; it is very possible that, for different operational modes, multiple FTAs may be developed for a single functional path. In constructing the fault tree, the focus is on the sequence of failure events for a specific functional scenario or mission profile.

Step 3: Develop the failure probability model. After the fault tree is constructed, the next step is to quantify the likelihood of failure occurrence by developing the probability model of the fault tree. Just as in understanding the models of reliability theory, readers need to familiarize themselves with basic probability and statistics theory.

Step 4: Identify the critical fault path. With the probability of failure of the system or of a higher-level subsystem, a path analysis can be conducted to identify the key causal factors that contribute most significantly to the failure. Certain models can be applied to aid the analysis, based on the assumptions made about the fault tree, such as Bayesian models, Markov decision models, or simply a Monte Carlo simulation. For a more comprehensive review of these models, readers can refer to the Reliability Design Military Handbook (MIL-HDBK-338B; U.S. Department of Defense 1988).
After talking about AND-gates and OR-gates, some readers may easily see that the calculation of the AND-gate is similar to the series structure and the OR-gate is similar to the parallel structure of the reliability network. Why is that?
This is because the logic for the AND and OR of failure events is the same as that for the reliability events in the series and parallel structures. Understanding the basic probability model for the AND-gate and OR-gate, we can solve any composite fault tree structure; we just start from the bottom level and work our way up, until the probabilities for all the events are obtained (example on page 191 of chapter 5).
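A minimal sketch of this bottom-up evaluation; the tree and its probabilities below are hypothetical (not the page 191 example). A fault tree is represented as nested ("AND"/"OR", children) tuples with basic event probabilities at the leaves, and evaluated recursively.

# Evaluate a fault tree bottom-up, assuming independent basic events.
import math

def evaluate(node):
    if isinstance(node, float):          # leaf: basic event probability
        return node
    gate, children = node
    probs = [evaluate(child) for child in children]
    if gate == "AND":                    # intersection of independent events
        return math.prod(probs)
    if gate == "OR":                     # union of independent events
        return 1 - math.prod(1 - p for p in probs)
    raise ValueError(f"unknown gate: {gate}")

# Top event: an OR of one basic event and an AND of two basic events.
tree = ("OR", [0.01, ("AND", [0.05, 0.10])])
print(evaluate(tree))  # 1 - (1 - 0.01)(1 - 0.005) = 0.01495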
Failure mode effect analysis (FMEA), sometimes called failure mode, effects, and criticality analysis (FMECA), is
a commonly used analysis tool for analyzing failures associated with system components. It was originally developed by NASA to improve the reliability of hardware design for space programs. Although the original FMEA document is no longer in effect, the FMEA methodology has been well preserved and tested and has evolved. Nowadays, FMEA has become a well-accepted standard for identifying reliability problems in almost any type of system, ranging from military to domestic applications and from mechanical design to computer software design.
in terms of quantitative modeling methodology for systems engineering, probability and statistics are perhaps
the most important subjects besides operations research; due to the uncertain and dynamic nature of complex system design, one can hardly find any meaningful solution to a system design problem without addressing its statistical nature.
according to chapter 5, technical performance measures (TPMs) are
the quantitative values for the DDPs that describe, estimate, or predict the system's technical behaviors. TPMs define the attributes of the system that make the system unique so that it can be realized. Examples of TPMs include system functional parameters (such as size, weight, velocity, power, etc.), system reliability (i.e., mean time between failures [MTBF]), system maintainability (i.e., mean time between maintenance [MTBM]), usability (i.e., human error), and system sustainability. TPMs provide estimated quantitative values that describe the system performance requirements. They measure the attributes or characteristics inherent within the design, specifically the DDPs.
Nevertheless, the types of parameters and TPMs involved in different systems vary a great deal;
development of TPMs primarily relies on a clear understanding of the nature of the system and on the knowledge and experience of the developers.
what do well-defined TPMs ensure
They ensure that (a) the requirements reflect the customers' needs, and (b) the measurements (metrics) provide designers with the necessary guidance to develop their benchmark. Another advantage of using TPMs to balance cost, scheduling, and performance specifications throughout the life cycle is to specify measurements of success. Technical performance measurements can be used to compare actual versus planned technical development and design. They also report the degree to which system requirements are met in terms of performance, cost, schedule, and progress in implementation. Performance metrics are traceable to the original requirements.
If there is only one component involved and maintenance actions are performed when this component fails so that it becomes functional again, the failure rate is estimated by dividing the total number of failures by the total time the component is functional (total time minus downtime).
A k-out-of-n system is functioning if and only if at least k components of the n total components are functioning.
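For n independent, identical components, each with reliability r, the k-out-of-n reliability follows from the binomial distribution: R = sum over i = k, ..., n of C(n, i) × r^i × (1 - r)^(n - i). A minimal sketch:

# k-out-of-n reliability for independent, identical components: sum the
# binomial probabilities of having at least k working components.
from math import comb

def k_out_of_n_reliability(k, n, r):
    return sum(comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

# Example: a 2-out-of-3 structure with component reliability 0.9.
print(k_out_of_n_reliability(2, 3, 0.9))  # 0.972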
Failure rate is defined
in a time interval [t1, t2] as the probability that a failure per unit time occurs in the interval, given that no failure has occurred prior to t1, the beginning of the interval. Thus, the failure rate can be formally expressed as (Elsayed 1996; p. 164):

λ(t1, t2) = [R(t1) - R(t2)] / [(t2 - t1) × R(t1)]

Letting t1 = t and t2 = t + Δt and taking the limit as Δt → 0 gives the instantaneous failure rate (hazard function) λ(t) = f(t)/R(t).
A fault tree analysis (FTA) model is
a graphical method for identifying the different ways in which a particular component/system failure could occur. Compared to the FMEA model, which is considered a "bottom-up" inductive approach, FTA is a deductive approach, using graphical symbols and block diagrams to determine the events and the likelihood (probability) of an undesired failure event occurring. FTA is widely used in reliability analysis, where the cause-effect relationships between different events are identified.
System reliability is an inherent system characteristic;
it starts as a design-independent parameter from the user requirements; along with the design process, reliability will eventually be translated into system DDPs, and TPMs will be derived in a specific and quantitative format, so that the reliability of the components can be verified. This translation process requires rigorous mathematical models to measure reliability.
Understanding this characteristic of system failure enables us to
make feasible plans for preventive and corrective maintenance activities to prolong system operations and make correct decisions about when to build a new system or fix the existing one.
From the above example, we can see that the general procedure for solving a system reliability problem is quite simple and straightforward;
no matter how complex the system structure is, it can always be decomposed into one of the two fundamental structures, series and parallel. So, one would follow these steps:

1. Obtain the reliability value for the individual components for the time t.
2. Start from the most basic structure, and gradually work up to the next level, until the whole system structure is covered.

For the above example, we started with the bottom level of the structure, which is A and B, obtaining the reliability R_AB, so A and B may be treated as equivalent to one component in terms of reliability; then we address A/B, C, and D, which form the next level's basic structure, a three-branched parallel structure; finally, we obtain the system reliability, as the overall structure is a large series network between A/B/C/D and E.
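A minimal sketch of this procedure in code; the component reliability values are hypothetical, since the figure's numbers are not reproduced here.

# Decompose-and-work-up: A and B in series; that combination in parallel
# with C and D; the result in series with E.
import math

def series(rs):
    return math.prod(rs)

def parallel(rs):
    return 1 - math.prod(1 - r for r in rs)

RA, RB, RC, RD, RE = 0.90, 0.85, 0.80, 0.75, 0.95

R_AB = series([RA, RB])            # bottom-level structure
R_ABCD = parallel([R_AB, RC, RD])  # next level: three parallel branches
R_system = series([R_ABCD, RE])    # overall series network with E
print(R_system)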
If the MTBF is given, then we can use Equation 5.15 to
obtain the failure rate by λ = 1/MTBF, and Equation 5.18 can be used to obtain the reliability.
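A minimal sketch, assuming Equation 5.18 is the constant-failure-rate exponential reliability function R(t) = e^(-λt); the MTBF and mission time values are hypothetical.

# From MTBF to failure rate to reliability, assuming a constant failure
# rate (exponential reliability function).
import math

MTBF = 500.0             # hypothetical mean time between failures, hours
failure_rate = 1 / MTBF  # Equation 5.15: lambda = 1/MTBF
t = 100                  # mission time, hours
R_t = math.exp(-failure_rate * t)
print(R_t)               # exp(-0.2), about 0.819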
FMEA usually consists of two related but separate analyses;
one is FMEA, which investigates the possible failure modes at different system levels (components or subsystems) and their effects on the system if failure occurs; the second is criticality analysis (CA), which quantifies the likelihood of failure occurrence (i.e., failure rate) and ranks the severity of the effects caused by the failures. This ranking is usually accomplished by analyzing historical failure data from similar systems/components and through a team approach, derived in a subjective manner.
The original requirements are derived from users, mission planning, and feasibility analysis. Once the high-level requirements are obtained,
lower-level requirements are developed as the system design evolves; system requirements need to be allocated to the system components. System reliability is allocated in the system TPMs and integrated within the functional analysis and functional allocation processes. When allocating to the lower levels of the system, there is, unfortunately, no template or standard to follow, as every system is different and there may be tens of thousands of parts involved at multiple levels. Most of the allocations utilize a trial-evaluation-modify cycle until a feasible solution is reached. This approach uses a bottom-up procedure as well as the top-down process, as different COTS components are considered for selection. Under these circumstances, it is very difficult to arrive at optimum solutions; usually a feasible solution meeting the system requirements and complying with all other design constraints is pursued, and this process is also iterative and often involves users.
Generally, system reliability can be defined as
the probability that a system or a product will operate properly for a specific period of time in a satisfactory manner under the specified operating conditions (Blanchard and Fabrycky 2006). From the definition, it is easy to see that reliability is a measure of the system's success in providing its functions properly without failure.
FTA can easily assist designers in assessing how resistant the system is to various risk sources. FTA is not good at finding the bottom-level initiating faults; that is why it works best when combined with FMEA, which exhaustively locates the failure modes at the bottom level and their local effects. Performing FTA and FMEA together may give a more complete picture of the inherent characteristics of system reliability, thus providing a basis for developing the most efficient and cost-effective system maintenance plans.
For a constant failure rate, the failure rate can also be estimated by using the following formula:

λ = (number of failures) / (total operating hours)

For example, 4 failures observed over 2,000 cumulative operating hours give λ = 4/2,000 = 0.002 failures per hour.
The identification of TPMs evolves from
the development of system operational requirements and the maintenance and support concept. During the system design processes, one of the largest contributors to "risks" is the lack of an adequate system specification in precise quantitative forms.
the more components we have in the series structure, the less reliable the system is, and the more components we add to a parallel system,
the more reliable it is.
Components may be connected in different structures or networks within the system configuration; these could be in series, in parallel, or a combination thereof (the k-out-of-n structure was also reviewed).
The failure rate is one of the most important measures for the systems designers, operators, and maintainers, as
they can derive the MTBF, or the mean life of components, by taking the reciprocal of the failure rate, as expressed in Equation 5.15. MTBF is one common measure of system reliability due to its simplicity of measurement and its direct relationship to the system reliability measure. (Formulas can be found in the module 5 Word document.)
FTA models are usually paralleled with
functional analysis, providing a concise and orderly description of the different possible events and the combinations thereof that could lead to a system/subsystem failure. FTA is commonly used as a design method, based on the analysis of similar systems and historical data, to predict causal relationships in terms of failure occurrences for a particular system configuration.