Six Sigma Test 10 (ASQ PART 3)
Types of Variation in Measurements: Bias
5) Bias = difference between the observed average measurement and the true (reference) value, evaluated with respect to a standard master at various measurement points of the measuring range. 6) Widespread understanding of measurement system analysis was accepted after the introduction of QS-9000
Experimental Design & Interaction Effects
7. Interaction Effects a. The DOE approach screens a large number of factors with highly fractionated experiments. b. Interactions occur when the effect of one input factor on the output depends on the level of another input factor.
Box-and-whisker plot
Box-and-whisker plot a. Pictorial view of the minimum, maximum, median, and interquartile range in one graph b. Better than distribution plots because it is easier to find outliers and it gives more information. c. Normally distributed data = center line located in the middle d. Skewed data = unequal whiskers or presence of outlier data outside the whiskers e. Structure: i. Quartiles divide the data into four sets, denoted Q1, Q2, Q3, Q4 ii. Q2 = median
Box-and-whisker plot Uses?
Box-and-whisker plot Used to: i. Visually compare two or more populations and see shifts in median and variation. ii. Mine information from a database.
A CP value less than 1 means?
A CP value less than 1 means that the process is wider than the specification, with defects spilling out over the edges. A CP value greater than 1 means that the effective width of the process variation is less than the required specification, with fewer defects occurring.
Attributes of sample data: Center of Sample
Attributes of Sample Data------Center of sample i. Center is quantified in 3 ways: Mean, Mode, Median
CP = PPK meaning?
Cp = Ppk---> Diagnosis = Your process is operating at its entitlement level of variation. Solution= Continue to monitor the capability of your process. Redesign your process to improve its entitlement level of performance.
Descriptive (enumerative) study
Descriptive (enumerative) study i. Collecting, organizing, summarizing, and presenting population data. ii. Shows the properties of a set of data such as mean, median, mode, dispersion, shape, etc. iii. Graphical tools used (histograms, pie charts, box plots)
Examples of Special cause variation:
Examples of Special cause variation: a. Changes in quality b. Operators with varying skill levels c. Changes to process settings d. Environmental variations.
Go/no go gage.=
Go/no go gage.= This gage simply tells you if the part passes or it fails. There are only two possible outcomes.
Types of Measurement Scales:
Measurement Scales: 1. Ratio Scale 2. Interval Data Scale 3. Ordinal Data Scale 4. Nominal Data Scale
One-tail Test
One-tail Test a. The level of alpha risk determines the level of confidence (1 - alpha). This risk factor is used to determine the critical value of the test statistic, which is compared to the calculated value. b. This test is used to test whether the sample value is larger or smaller than the population value.
3 uses of scatter diagrams?
Scatter Diagram a. Used for: i. Root Cause Analysis ii. Estimation of Correlation Coefficient iii. Regression line Calculation to make predictions.
Simple Linear Correlation
Simple Linear Correlation a. Relationship between 2 or more data sets b. Measures the strength and direction of the relationship between variables
Statistical process control errors:
Statistical process control errors: a. Any process can be monitored using basic SPC techniques. b. Control charts separate assignable causes from random variation. c. Type I error = reacting to random variation as if it were a special cause (a false alarm; no real change in the process). d. Type II error = failure to address a defective process created by special cause variation. ****Use historical data with caution.
Steps for getting good results from DOE:
Steps for getting good results from DOE: a. Watch for process drifts and shifts b. Preserve all raw data---not just averages c. Avoid unplanned changes d. Get buy-in from all parties.
The various contributors to the measurement system variation can now be calculated. There are five that need to be calculated:
The various contributors to the measurement system variation can now be calculated. There are five that need to be calculated: • Equipment variation (EV) • Appraiser variation (AV) • Repeatability and reproducibility (GRR) • Part variation (PV) • Total variation (TV)
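These five quantities follow the AIAG average-and-range formulas. Below is a minimal Python sketch, assuming a study of 3 appraisers, 10 parts, and 3 trials; the K1/K2/K3 constants are the AIAG table values for that study size, and the input statistics are hypothetical placeholders, not data from this section.

```python
import math

# AIAG average-and-range GR&R sketch (assumed study: 3 appraisers, 10 parts, 3 trials)
K1 = 0.5908            # constant for 3 trials
K2 = 0.5231            # constant for 3 appraisers
K3 = 0.3146            # constant for 10 parts
n_parts, n_trials = 10, 3

r_double_bar = 0.35    # average range across all appraisers (hypothetical)
x_bar_diff = 0.45      # max appraiser average - min appraiser average (hypothetical)
r_p = 2.10             # range of the part averages (hypothetical)

ev = r_double_bar * K1                                      # equipment variation
av = math.sqrt(max((x_bar_diff * K2) ** 2
                   - ev ** 2 / (n_parts * n_trials), 0.0))  # appraiser variation (set to 0 if negative)
grr = math.sqrt(ev ** 2 + av ** 2)                          # repeatability & reproducibility
pv = r_p * K3                                               # part variation
tv = math.sqrt(grr ** 2 + pv ** 2)                          # total variation

print(f"%GRR of total variation: {100 * grr / tv:.1f}%")
```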
Two types of quantitative data:
Two types of quantitative data: a. Continuous (variable) data-→ measurements on a continuous scale with an infinite number of possible values (temperature, length, weight) b. Discrete (attribute) data-→ result from counting the occurrence of events (e.g., number of bubbles)
Types of Attribute study involving ranking of scores:
Types of Attribute study involving ranking of scores: 1. Paper corrections 2. Tasting coffee/ tea/ wine
Types of Experimental objectives:
Types of Experimental objectives: i. Comparative -conclude if the factor is significant ii. Screening -select a few important main effects iii. Response Surface ---find optimal settings and weak points of processes. iv. Mixture—determines the best proportions
Types of Sampling Methods:
Types of Sampling Methods: a. Random sampling b. Sequential sampling c. Stratified sampling d. Rational Subgrouping
Variation in Process capability
Variation in process capability can be a. Physical or Mechanical (tool, machine, equipment, environment, maintenance) b. Procedural (workload, operator, accuracy, legibility) c. Two types: Special & Common Cause
Role of Histograms in Statistical Process Control:
Histogram—visual picture of the variation and center of the process.
Gage R&R study Calculations
l. Calculations (see formula on page 191). You must calculate the following:
i. The average for each trial for each appraiser
ii. The average and range for each part for each appraiser
iii. The overall average and average range for each appraiser
iv. The overall average and the average range for the parts
You then determine the average range for the three appraisers, then the difference between the maximum appraiser average and the minimum appraiser average. A has the maximum average (3.157); C has the minimum average (2.695). Thus, the difference is 3.157 - 2.695 = 0.462. Next determine the range of the part averages (Rp). The largest part average is for Part 3 (4.099); the smallest part average is for Part 5 (1.936). So, Rp = 4.099 - 1.936 = 2.163
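A hedged sketch of these averages and ranges in code; the data array below is randomly generated for illustration, not the page-191 study data.

```python
import numpy as np

# Hypothetical measurements indexed as [appraiser, part, trial]
rng = np.random.default_rng(1)
data = rng.normal(3.0, 0.5, size=(3, 5, 3))    # 3 appraisers, 5 parts, 3 trials

appraiser_avg = data.mean(axis=(1, 2))         # overall average per appraiser
appraiser_avg_range = (data.max(axis=2)
                       - data.min(axis=2)).mean(axis=1)   # average range per appraiser

x_bar_diff = appraiser_avg.max() - appraiser_avg.min()    # max minus min appraiser average
part_avg = data.mean(axis=(0, 2))              # average per part across appraisers and trials
r_p = part_avg.max() - part_avg.min()          # range of the part averages (Rp)
```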
Design of Experimentation Levels?
2. Levels a. Settings or possible values of a factor in an experimental design throughout the progress of the experiment b. Levels can be quantitative (e.g., 3 different temperatures) or qualitative (high, medium, low)
Experimental Objectives : Main Effects
6. Main Effects a. Estimate of the effect of a factor independent of any other factors. b. Average main effects-→ average the results for each level of each factor c. The larger the main effect, the more influence that factor has on the quality characteristic i. Factors with the greatest difference between the "high" and "low" results are the factors with the greatest impact on the quality characteristic of interest.
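A minimal sketch of computing main effects (and the interaction) from a hypothetical 2x2 two-level design; the response values are invented for illustration.

```python
# Each run: (level of A, level of B, observed response) -- hypothetical data
runs = [(-1, -1, 20.0), (+1, -1, 30.0), (-1, +1, 25.0), (+1, +1, 45.0)]

def main_effect(factor_index):
    """Average response at the factor's high level minus average at its low level."""
    hi = [y for *levels, y in runs if levels[factor_index] == +1]
    lo = [y for *levels, y in runs if levels[factor_index] == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

effect_a = main_effect(0)   # 15.0 -> A has the greater influence
effect_b = main_effect(1)   # 10.0
# Interaction AB: half the difference between A's effect at high B and at low B
interaction_ab = ((45.0 - 25.0) - (30.0 - 20.0)) / 2   # 5.0 -> A's effect depends on B
```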
Attributes of sample data: Skewness of the sample shape
Attributes of sample data: iv. Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.
Poisson Distribution:
B. Poisson Distribution: 1) Discrete data only 2) DPU= defects per unit
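A short sketch of how DPU drives Poisson probabilities; the DPU value is hypothetical.

```python
from scipy.stats import poisson

dpu = 0.25                                # hypothetical defects per unit
p_zero_defects = poisson.pmf(0, dpu)      # P(unit has no defects) = exp(-DPU) ~ 0.779
p_two_or_more = 1 - poisson.cdf(1, dpu)   # P(unit has 2 or more defects)
```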
Types of Variation in Measurements: Linearity -
Linearity - Accuracy of measurement at various measurement points of measuring range in the equipment.
Population"= collection of data to be considered
Population"= collection of data to be considered a. Sample= subset of populations b. Samples randomly selected so that they represent the population from which they are drawn. c. Population parameters= population mean and standard deviation. d. Sample Statistic= statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure.
Random sampling
Random sampling i. Every item has equal probability of being picked ii. Lot being sampled has to be homogenous for random sampling 1. Random samples are used in population sampling situations when reviewing historical or batch data. 2. The key to random sampling is that each unit in the population has an equal probability of being selected in the sample. 3. Using random sampling protects against bias being introduced in the sampling process, and hence, it helps in obtaining a representative sample. iii. In general, random samples are taken by assigning a number to each unit in the population and using a random number table or Minitab to generate the sample list. Absent knowledge about the factors for stratification for a population, a random sample is a useful first step in obtaining samples
Factors affecting sample size include:
Factors affecting sample size include: a. How confident (accurate) the analyst wants to be that the sample will provide a good estimate of the true population mean. The more confidence required, the greater the sample size needed. b. How close (precision) to the "truth" the analyst wants to be. For more precision, a greater sample size is needed. c. How much variation exists or is estimated to exist in the population. If there is more variation, a greater sample size is needed.
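These three factors appear directly in the standard sample-size formula for estimating a mean, n = (z·σ/E)². A sketch, with hypothetical inputs:

```python
import math
from scipy.stats import norm

def sample_size_for_mean(sigma, margin_of_error, confidence=0.95):
    """n = (z * sigma / E)^2, rounded up to the next whole unit."""
    z = norm.ppf(1 - (1 - confidence) / 2)   # two-sided critical value
    return math.ceil((z * sigma / margin_of_error) ** 2)

sample_size_for_mean(sigma=10, margin_of_error=2)   # ~97
sample_size_for_mean(sigma=10, margin_of_error=1)   # ~385: more precision -> larger n
sample_size_for_mean(sigma=20, margin_of_error=2)   # ~385: more variation -> larger n
```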
Interval Data Scale
Interval Data Scale i. Intervals between adjacent scale values are equal with respect to the attribute being measured. (difference between 30 degrees C and 20 Degrees C) ii. Application: a) Addition and subtraction of scale values, b) T-test, c)Standard Deviation, d) Mean
Causes of "Within Appraiser" inconsistencies due to human error
"Within Appraiser" inconsistencies due to human error : 1. Misunderstanding of the criteria 2. Human mood swings 3. Ergonomics of the inspection area 4. Fatigue. iv. Types of Attribute study involving ranking of
Design of Experimentation
1. Factor a. Variable controlled by the experimenter.
History of Design of Experiments
1. History of DOE a. Structured, organized method used to determine the relationship between the different factors (X's) affecting a process and the output of that process (Y). b. DOE was first applied in agriculture (R. A. Fisher's field experiments)
Purpose of Hypothesis testing
1. Purpose of Hypothesis testing--→ to reach a statistical conclusion about rejecting or failing to reject a statement about a population.
Types of Variation in Measurements: Reproducibility-
3) Reproducibility---("between appraiser" variation) -----Variation in measurement when measured by two or more appraisers multiple times. Measurements are taken from the same equipment by more than one appraiser over a short period of time. It is the variation in the average of the measurements made by the different appraisers when measuring the same characteristic on the same part. This is also called the "between system" variation. Appraiser variability is an estimate of standard deviation due to reproducibility.
Purpose of Design of Experiments Data matrix & ANOVA
3. Data matrix—table organizing the data into columns for analysis (columns = factors; rows = different experiments) 4. ANOVA i. Provides accurate results even if the data matrix is small
Design of Experimentation Treatment
3. Treatment a. Single level assigned to a single factor or experimental unit during the experiment b. Specific combination of factor levels whose effect is to be compared with other treatments c. Treatment combination= series of levels for all factors in the experiment
8 Common Errors in performing Gage R&R:
8 Common Errors in performing Gage R&R:
a. In process control situations, not selecting samples covering the tolerance spread i. Best to either pick samples outside the specification limits or random samples from the process to study.
b. Non-random samples measured-→ results in bias in the repetitive measurement trials.
c. Use of untrained appraisers or process-unrelated employees in the experiment due to a lack of appraisers-→ results in inflated reproducibility errors.
d. Samples altered during the study.
e. Experimenters absent during the R&R study i. Assigning appraisers or remotely studying the process. ii. Important for the experimenter to be present during the human interaction portion of the process (loading, setting, aligning, unloading, etc.) iii. Risk of invalidating results and having to start over
f. Publishing the results with appraisers' names-→ risk of unhealthy comparisons between appraisers, or appraisers feel uncomfortable and refuse to work on future studies; best to represent appraisers as A, B, C
g. Assuming that the Gage R&R results are valid forever i. GR&R results must be periodically validated, just as gage calibration is done at a regular frequency. ii. Changes in measurement methods, appraisers, settings, or the software/firmware of the equipment may occur.
h. Assuming that a Gage R&R performed on a specific piece of equipment is the same for all other equipment of that kind.
A Type I error in hypothesis testing
A Type I error in hypothesis testing is the probability of rejecting the null hypothesis when it is true. In this example, it is the probability that research will show that the knee joint replacement is causing nerve damage when it is not. Instead of accepting the null hypothesis of no damage, the researchers make a mistake of rejecting it and concluding that the replacement does cause damage.
Statistical Process Control
A. Statistical Process Control 1. Control Process a. Feedback loop which measures actual performance, compares it with a standard, and acts on the difference
AIAG Reference: Process Control Situations
AIAG Reference: Process Control Situations i. The measurement result & decision criteria determine "process stability, direction, and compliance with the natural process variation" (that is, SPC, process monitoring, capability, and process improvement); the availability of samples over the entire operating range becomes very important. ii. Process Capability Study = independent estimate of process variation used to assess the adequacy of the measurement system for process control; tool used to calculate percent GR&R to process variation.
AIAG Reference: Variation in Sample parts
AIAG Reference: Variation in Sample parts = the selection of sample parts from the process to represent the entire production operating range when: i. An independent estimate of process variation is not available ii. Determining process direction and the continued suitability of the measurement system for process control.
AIAG Reference: Product Controls Situations
AIAG Reference: a. Product Controls Situations i. Assessment of the measurement system is based on the percent GR&R to tolerance ii. Measurement result & decision criteria determine the "conformance or nonconformance to the feature specification"
Analytical (inferential) Study
Analytical (inferential) Study i. Involves making inferences, hypothesis testing, and making predictions. ii. Data taken from the sample are used to make estimates or inferences about the population from which the sample was drawn. iii. Tools used include hypothesis testing and scatter diagrams to determine the relationships between variables and make predictions using regression equations.
Assumptions that ensure the validity of the tests.
Assumptions that ensure the validity of the tests. a. Measurement system capability---confirm the measurement system is capable before running the experiment. b. Process Stability--→ establishment of baselines to detect process drifts & shifts + setting standard set points for the start and end of the process. c. Residuals-→ estimates of experimental error; should be normally & independently distributed with a mean of zero and constant variance.
Attributes of sample data: Sample shape & Kurtosis
Attributes of sample data: a. Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case. b. The histogram is an effective graphical technique for showing both the skewness and kurtosis of a data set. c. The skewness for a normal distribution is zero, and any symmetric data should have a skewness near zero. Negative values for the skewness indicate data that are skewed left, and positive values indicate data that are skewed right. By skewed left, we mean that the left tail is long relative to the right tail.
Attributes of sample data: Shape of sample
Attributes of sample data: b. Shape of sample i. Smooth curve that serves as umbrella covering the tops of bars in the histogram ii. Shape: symmetry, skewness, and kurtosis. iii. A fundamental task in many statistical analyses is to characterize the location and variability of a data set. A further characterization of the data includes skewness and kurtosis.
Attributes of sample data: Variation
Attributes of sample data: Variation (dispersion or spread of sample) i. Dispersion of data is quantified with either the sample range or sample standard deviation.
Attributes of sample data:
Attributes of sample data: a. Variation (dispersion or spread of sample) b. Shape of sample c. Center of sample d. Normal distribution.
Steps for Process Capability Studies
B. Steps for Process Capability Studies 1) Measurement system verification 2) Identification of rational sample subgroups for plotting a control chart 3) Process stability 4) Measuring process capability
Balanced Designs
Balanced Designs a. An experiment where each level of each factor is repeated the same number of times for the set of runs or combinations of levels that make up the experiment.
FMEA and Process Capability
Best to perform a comprehensive process FMEA to identify the characteristics that merit a full process capability study + identify the characteristics that need statistical process control + control plan for detailed SPC planning. 5) Specifications and tolerances are obtained from engineering drawings and customer contracts. a. Examples of specifications and tolerances (restaurant wait times, next-day delivery) b. Unlike manufacturing, service providers must study the market and customer expectations through surveys and informal interviews to identify the specifications and tolerances
Binomial Probability Distribution:
Binomial Probability Distribution: 1) Used to model discrete data (e.g., number of defects, wrong medicines, etc.) 2) Must meet the following conditions to apply the binomial distribution: a. Applicable only when N > 50 (not good for smaller populations) b. The sample size (n) < 0.1(N)---the sample size should be less than 10% of the population 3) Normal Approximation of the Binomial: a. For large values of n, the distributions of the count X and the sample proportion p are approximately normal (Central Limit Theorem) b. Can only be used when np > 10 and n(1-p) > 10
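A sketch comparing the exact binomial with its normal approximation; n, p, and the cutoff are hypothetical.

```python
import math
from scipy.stats import binom, norm

n, p = 200, 0.3
assert n * p > 10 and n * (1 - p) > 10      # conditions for the approximation

mu, sigma = n * p, math.sqrt(n * p * (1 - p))
exact = binom.cdf(70, n, p)                 # exact P(X <= 70)
approx = norm.cdf(70.5, mu, sigma)          # normal approximation (continuity-corrected)
```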
Design of Experimentation Block
Block a. Portion of the experimental material or environment that is common to itself and distinct from other portions. b. Mitigates the effect of variables we are not studying and reduces the effect of special cause variation by grouping experiments in batches of test runs. i. Randomized block design—used to nullify the effect of a selected source of variability by grouping ****In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) that are similar to one another. Typically, a blocking factor is a source of variability that is not of primary interest to the experimenter. An example of a blocking factor might be the sex of a patient; by blocking on sex, this source of variability is controlled for, thus leading to greater accuracy.
C. Hypothesis Tests for Means 1. Z-test
C. Hypothesis Tests for Means 1. Z-test a. Statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. b. If the sample size is at least 30, the sample standard deviation can be used to estimate the population standard deviation. c. The test statistic, Z, is compared with a critical value, which is based on alpha (significance level) for a one-tail or two-tail test.
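A minimal one-sample z-test sketch with hypothetical values.

```python
import math
from scipy.stats import norm

def z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-tail one-sample z-test; returns (test statistic, reject H0?)."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    z_crit = norm.ppf(1 - alpha / 2)        # critical value for a two-tail test
    return z, abs(z) > z_crit

z, reject = z_test(x_bar=10.3, mu0=10.0, sigma=0.8, n=36)   # z = 2.25 -> reject H0
```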
What is C (p)?
CP is a measure of short-term process or characteristic capability. Use only the short-term standard deviation to calculate its value. Using a long-term standard deviation in its calculation gives you incorrect results.
Causes of non-linearity:
Causes of non-linearity: i. Instrument not calibrated properly at both the high and low ends of the operating range ii. Error in one or more of the master part measurements iii. Worn instruments iv. Characteristics of the instrument design. c. An average percent bias P-value near 0-→ significant bias issues with the instrument. (The report also gives a breakdown of percent bias at the measurement points covered during the study)
Central Limit Theorem and Sampling Distribution of the Mean
Central Limit Theorem and Sampling Distribution of the Mean 1) The central limit theorem is an important principle used in statistical process control. 2) Foundation of statistical procedures. 3) The distribution of averages tends to be normal even when the distribution from which the averages are computed is non-normal.
Central Limit Theorem and Sampling Distribution of the Mean 3 key statements
Central Limit Theorem and Sampling Distribution of the Mean 3 key statements: a. Mean of the sample distribution of means is equal to the mean of the population from which the samples were drawn. b. Variance of the sampling distribution of means is equal to the variance of the population from which the samples were drawn divided by the size of the samples. c. If the original population was distributed normal, the sampling distribution of means will be normal. If the original population is not normally distributed, the sampling distribution of means will increasingly approximate a normal distribution as the sample size increases.
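A simulation sketch of these three statements, drawing sample means from a deliberately non-normal (exponential) population.

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)   # skewed, non-normal population

n = 30
sample_means = rng.choice(population, size=(10_000, n)).mean(axis=1)

print(sample_means.mean(), population.mean())        # statement a: nearly equal
print(sample_means.var(), population.var() / n)      # statement b: nearly equal
# statement c: a histogram of sample_means looks approximately normal
```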
Charts used to calculate the mean and standard deviation.
Charts used to calculate the mean and standard deviation. i. Histogram ii. Run charts iii. Control charts iv. Scatter diagram ***Any process can be monitored using SPC techniques. If you have some control of the process, you then make changes to improve the variation (range) or move the target (average)
Chi-Square & Probability Distribution:
Chi-Square: 1) The chi-square distribution has the following properties: a. The mean of the distribution is equal to the number of degrees of freedom: μ = v. b. The variance is equal to two times the number of degrees of freedom: σ² = 2v c. When the degrees of freedom are greater than or equal to 2, the maximum value for Y occurs when Χ² = v - 2. d. As the degrees of freedom increase, the chi-square curve approaches a normal distribution. e. Chi-square is non-negative & non-symmetric f. There are many different chi-square distributions, one for each number of degrees of freedom
Classification of experimental designs: Blocked Factorials
Classification of experimental designs: Blocked Factorials ---reduces the number of runs and uses blocking to run the experiment in subsets.
Classification of experimental designs: Completely randomized
Classification of experimental designs: Completely randomized ---best when only one factor analyzed
Classification of experimental designs: Factorials
Classification of experimental designs: Factorials ---best when investigating several factors at two or more levels and interaction is necessary.
Classification of experimental designs: Fractional Factorials
Classification of experimental designs: Fractional Factorials ---- runs a subset of the factor-level combinations while yielding a close estimate of the full factorial
Classification of experimental designs: Latin square/ Youden square
Classification of experimental designs: Latin square/ Youden square ---Investigates the primary factor while controlling for the effects of other variables. i. Latin square is not a factorial design
Classification of experimental designs: Mixture designs
Classification of experimental designs: Mixture designs---factor levels expressed as percentages. The property of interest depends on the proportions of the mixture components and not the amount.
Classification of experimental designs: Randomized Blocks-
Classification of experimental designs: Randomized Blocks---single factor when material or environment can be blocked.
Classification of experimental designs: Response surface
Classification of experimental designs: Response surface---contour diagrams or maps of how the factors influence the response and direction for optimization of variable settings
Classification of experimental designs:
Classification of experimental designs: a. Completely randomized b. Factorials c. Blocked factorials d. Fractional factorials e. Randomized blocks f. Mixture designs g. Response surface h. Latin square/ Youden square
Common Cause Variation (Natural variation)
Common Cause Variation (Natural variation) a. Can't be controlled by process operators b. Sources of variation within the process itself. c. Common cause variation resides within processes in statistical control and can be characterized by the following to estimate predictability: i. Location (process average) ii. Spread (piece-to-piece variability) iii. Shape (distribution)
Common interpretations of Cp and Cpk:
Common interpretations of Cp and Cpk: 1. Higher values are better for Cp and Cpk 2. The Cp value does not change as the process is being centered to target unless something in the process changes. 3. Cp and Cpk values will be equal if the process is perfectly centered. 4. Cpk is always equal to or smaller than Cp. 5. Negative Cpk→ the process average is outside the engineering specifications. 6. If the process has a one-sided specification-→ either the Cp upper limit or the Cp lower limit is calculated. 7. A high Cpk from a small sample may not be of much use, as the confidence interval for Cpk will be very wide.
Control Limit verses Specification limit grid
Control Limit versus Specification limit grid a. First quadrant—if the process is in a state of statistical control (within natural process control limits and follows other applicable rules) and meets specification, the situation is "good" b. Third quadrant-----if the process is not in a state of statistical control and does not meet specification---→ the Green Belt should stop the process and immediately investigate. c. Fourth quadrant→ engineers hesitate to invest time investigating an out-of-control process-→ Green Belts should explain that an out-of-control process lacks predictability and may go outside the specification limits at any time, and recommend stopping
Correlation vs. causation
Correlation vs. causation a. Causation = produces effect or gives rise to action. b. No such thing as an absolute cause for an event, the identification of which satisfies and completes all inquiry. c. If two variables are found to be associated, that does not mean that a cause-and-effect relationship exists between the 2 variables.
Cumulative frequency distribution
Cumulative frequency distribution a. Provide insight into how often a certain phenomenon (feature) is below a certain value. b. Statistical technique can be used to see how likely an event like a flood is going to happen again in the future, based on how often it happened in the past.
DOE & SDCA
DOE---best used for optimizing a process versus as problem-solving tools. a. SDCA ---standardize, do, check, act— i. Most commonly used once a process has improved to update control plans and process sheets to lock in the improvements & standardize the changes throughout the organization. ii. Outcome of applying sound recommendations based on the correct interpretations of valid effects obtained from proper design of experiments.
Data collection Methods:
Data collection Methods: a. Surveys -----Low response rate (10-15%) b. Face-to-Face interviews & Focus Groups------High data integrity and can clarify with respondents c. Mystery shopping d. Customer feedback-------Can be reactive after product failure; best to gather up-front information before designing product/service e. Automatic data capture i. Can eliminate errors involved with manual data capture. f. Manual data capture (prone to errors)
Data collection methods -----(5W1H)
Data collection methods -----(5W1H) a. Method of answering the questions (what, where, when, why, who, and how) before collecting data to ensure that data collection is effective. b. Data coding---best used when manual data entry is involved; prevents errors and avoids repetitive recording of numbers.
Data collection methods --- Reasons for data collection?
Data collection methods ---Reasons for data collection a. Legal/ regulatory b. Contractual requirements c. Analysis, improvement, & knowledge management
Design and Analysis of One-factor Experiments
Design and Analysis of One-factor Experiments a. Completely randomized when no tests are omitted b. ANOVA analysis of the results c. Significant values exceed the F-statistic derived from samples i. F-test analysis is the basis for model evaluation of both single factor and multi-factor experiments.
Difference between Statistical Process Control (SPC) and Statistical Quality Control?
Difference between Statistical Process Control (SPC) and Statistical Quality Control? Statistical Process Control (SPC) a. Application of statistical techniques for measuring and analyzing the variations in the process. Statistical Quality Control (SQC) a. Use of statistical techniques to measure & improve quality.
Design of Experimentation Difference between replication & repetition:
Difference between replication & repetition: i. Replication = process of running the experimental trials in a random manner, resetting each trial condition. ii. Repetition = running experimental trials under the same setup of parameters
Drawing Valid Statistical Conclusions
Drawing Valid Statistical Conclusions 1) Statistics are used to draw valid conclusions after the analysis of representative samples from a homogenous population.
Effect of R&R on process cability:
Effect of R&R on process cability: a. Higher gage R&R, the higher the error in C(p) assessment., and greater increases in capability. b. Example: If the observed C(p)=1 and Gage R&R = 50%, then actual C(p)= 1.23
Effects and Confounding
Effects and Confounding a. Certain situations exist where the effects of factor variables after interaction can't be differentiated from each other; their contribution to the result on the response variable is intertwined and cannot be separated.
Examples of common cause variation:
Examples of common cause variation: a. Sequencing of the process b. Manufacturing equipment design c. Natural variation in incoming material supply d. Measuring equipment design
Design of Experimentation Experimental Design
Experimental Design a. Formal design or pattern that includes responses, factors, levels, blocks, treatments and the use of planned grouping, randomization, and replication. 6. Experimental Error a. Variation in the response variable when levels and factors are held constant. b. Experimental error must be subtracted out to determine the true factor effects
Experimental Objectives
Experimental Objectives a. Objectives are derived from questions or enterprise goals. b. Factors to consider in setting objectives: i. The design must be affordable and fit the time available. ii. Start with a screening design, whose purpose is to determine the variables and levels for future study. c. Confirmation tests determine whether the test successfully identified better process conditions
F-distribution
F-distribution 1) The F distribution is the distribution of the ratio of two estimates of variance. 2) It is used to compute probability values in the ANOVA analysis of variance. 3) Used in hypothesis testing (to test whether the variances of 2 or more populations are equal)
Fractional Factorial Design
Fractional Factorial Design = balanced experimental design with fewer than all combinations of all levels of all factors. a. Used for quick exploratory tests where interactions are insignificant and many tests are needed rapidly.
Frequency Distribution
Frequency Distribution a. Used to illustrate the location & spread of numerical data. b. Peaks represent concentration of data i. Unimodal (one peak) ii. Bimodal (two peaks) iii. Multimodal (mixture of populations) iv. No peak/ flat curve (rectangular distribution) c. Width of curve represents spread of data i. Thinner distribution= less variation
Full-Factorial Experiments
Full-Factorial Experiments a. Looks at every possible combination in order to complete a full study of interactions (contains all levels of all factors; no treatment is omitted) b. Two-way ANOVA is used to evaluate the statistical significance of the results--→ compares the between-treatment variation with the within-treatment variation c. The larger the effect, the more likely the results are significant.
Gage R&R for destructive testing----Types of Challenges:
Gage R&R for destructive testing --- the part or sample is altered or destroyed during testing. Gage R&R for destructive testing----Types of Challenges: 1. One-sided specification 2. Skewed distribution 3. Fully automated equipment with no or minimal appraiser interaction 4. Destructive testing 5. New product introduction where only a few units are available 6. Destructive testing 7. New product introduction where only a few units are available 8. Multiple station comparison 9. Equipment that requires resetting or calibration after every measurement 10. Incomplete GR&R (units shipped during the study due to urgency)
Gage Reproducibility & Repeatability Testing
Gage Reproducibility & Repeatability Testing c. Gage repeatability and reproducibility (R&R) is a method for finding out the variations within a measurement system. Basically, there are 3 sources for variation: variation of the part, variation of the measurement device, and variation of operator. Variation caused by operator and interaction between operator and part is called reproducibility and variation caused by measurement device is called repeatability. d. Check for only the precision of a measurement system. e. Variation due to Reproducibility - operator problems (training, skill, knowledge) or the machine problem (faulty design) f. Variation due Repeatability= equipment problems
CPK greater than 1.33 indicates?
Generally, a CPK greater than 1.33 indicates that a process or characteristic is capable in the short term. Values less than 1.33 tell you that the variation is either too wide compared to the specification or that the location of the variation is offset from the center of the specification. It may be a combination of both width and location.
Goal of Rational Subgrouping?
Goal of Rational Subgrouping? The goal should be to minimize the chance of special causes in variation in the subgroup and maximize the chance for special causes between subgroups. Subgrouping over time is the most common approach; subgrouping can be done by other suspected sources of variation (e.g., location, customer, supplier, etc.) v. For example, an equipment leasing business was trying to improve equipment turnaround time. They selected five samples per day from each of three processing centers. Each processing center was formed into a subgroup. vi. When using subgrouping, form subgroups with items produced under similar conditions. To ensure items in a subgroup were produced under similar conditions, select items produced close together in time.
Graphical Method: Tally
Graphical method = Tally a. Used to count defects by type, class, or category b. Provides a visual idea of the distribution shape c. Tally mark concentration and spread indicate distribution shape. i. Tally marks of 5 are crossed out as a group ii. Isolated groups of tally marks = uneven distribution.
Hypothesis Tests for Means Two-mean, Equal Variance T-test
Hypothesis Tests for Means Two-mean, Equal Variance T-test a. Tests between two sample means when the sigmas are unknown but considered equal. 5. Two-mean, Unequal Variance T-test a. Tests between two sample means with unknown sigmas that are not equal.
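Both versions are available in scipy through the equal_var flag; a sketch with hypothetical data.

```python
from scipy.stats import ttest_ind

a = [5.1, 4.9, 5.3, 5.0, 5.2]
b = [5.6, 5.4, 5.8, 5.5, 5.7]

t_eq, p_eq = ttest_ind(a, b, equal_var=True)        # two-mean, equal-variance t-test
t_uneq, p_uneq = ttest_ind(a, b, equal_var=False)   # Welch's test for unequal variances
```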
Hypothesis Tests for Means One-Way Anova
Hypothesis Tests for Means-----One-Way Anova a. The one-way analysis of variance (ANOVA) is used to determine whether there are any significant differences between the means of two or more independent (unrelated) groups (although you tend to only see it used when there are a minimum of three, rather than two groups). b. Required conditions that must be met to conduct one-way ANOVA: i. Normally distributed population of interest ii. Samples are independent for each other iii. Each population have the same variance.
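A sketch using scipy's one-way ANOVA on three hypothetical groups.

```python
from scipy.stats import f_oneway

g1 = [20.1, 19.8, 20.3, 20.0]
g2 = [21.2, 21.0, 21.5, 21.1]
g3 = [19.5, 19.9, 19.7, 19.6]

f_stat, p_value = f_oneway(g1, g2, g3)   # small p-value -> at least one group mean differs
```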
C. Hypothesis Tests for Means P-test
Hypothesis Tests for Means-----P-test a. Used when testing a claim about a population proportion; requires a fixed number of independent trials having constant probabilities, with each trial having two possible outcomes (binomial distribution) b. When np < 5 or n(1-p) < 5, the binomial distribution is used to test the hypothesis relating to a proportion.
Hypothesis Tests for Means Chi-square
Hypothesis Tests for Means----Chi-square a. Conditions: i. All expected frequencies are at least 1 ii. At most 20% of the expected frequencies are less than 5
Hypothesis Tests for Means Contingency Tables
Hypothesis Tests for Means----Contingency Tables a. Two-dimensional classification tables with rows and columns of original frequencies and count data that can be analyzed to determine whether the two variables are independent or have significant association. b. Chi-square will test if there is dependency between the two classifications. ****The contingency coefficient shows the strength of the correlation, while the chi-square test shows significant dependency.
Hypothesis Tests for Means F-test
Hypothesis Tests for Means----F-test a. The F-test is designed to test if two population variances are equal. It does this by comparing the ratio of two variances. So, if the variances are equal, the ratio of the variances will be 1. b. This ratio of sample variances is the test statistic used. If the null hypothesis is false, we reject the null hypothesis that the ratio is equal to 1, along with our assumption that the variances were equal.
Hypothesis Tests for Means Paired T-test
Hypothesis Tests for Means----Paired T-test a. Measures whether means from a within-subjects test group vary over 2 test conditions. b. This t‐test compares one set of measurements with a second set from the same sample. It is often used to compare "before" and "after" scores in experiments to determine whether significant change has occurred.
C. Hypothesis Tests for Means Student T- test
Hypothesis Tests for Means----Student T-test a. Used to make inferences about a population mean when the population variance is unknown and the sample size is small. b. Only applies to samples drawn from a normally distributed population. c. Can be applied to any sample size, but it is typically reserved for samples of 30 or fewer; for larger samples the t-distribution approaches the normal distribution.
If CP = CPK and PP = PPK, then interpretation?
If CP = CPK and PP = PPK, then interpretation? Diagnosis= Overall, your process or characteristic is centered within its specifications. Solution= As needed, focus on reducing the long-term variation in your process or characteristic while maintaining on-center performance.
If CP = PP and CPK = PPK, then interpretation?
If CP = PP and CPK = PPK, then interpretation? Diagnosis = Your process or characteristic suffers from a consistent offset in its center location. Solution = Focus on correcting the set point of your process or characteristic until it's centered.
Interpreting Process Capability
Interpreting Process Capability b. Process reached Six Sigma quality-→ Cp = 2 and Cpk = 1.5 c. Process is capable-→ Cp & Cpk ≥ 1.33 d. Process barely meets specification & will have 0.27 percent defective units-→ Cp or Cpk value = 1 e. Process producing units outside the engineering specifications-→ Cp or Cpk < 1 f. Specification is loose, or identifies an opportunity to move to a less expensive process-→ Cp or Cpk > 3
Least Square Method:
Least Squares Method: a. Mathematical procedure to identify the linear equation that best fits a set of ordered pairs by finding values for the slope and y-intercept. b. Goal of this method = reduce the total squared error between the observed y values and the predicted ŷ values c. Standard error of estimate = tells the accuracy of y versus x; measures the amount of dispersion of the observed data around the regression line i. Low values = data points are very close to the line and a small percentage of error
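A sketch of a least-squares fit and the standard error of estimate, using hypothetical data.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.2, 5.9, 8.1, 9.8])

slope, intercept = np.polyfit(x, y, 1)   # minimizes the total squared error
y_hat = slope * x + intercept            # predicted values on the regression line

# Standard error of estimate: dispersion of observed y around the regression line
see = np.sqrt(((y - y_hat) ** 2).sum() / (len(x) - 2))
```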
Linearity & Bias
Linearity & Bias a. Linearity Study used when equipment is accurate at one of measurement but varies in accuracy at other times. i. Tells how biased the measuring equipment is compared to the "master"
Locational Data
Locational data-→ used to identify where the data are coming from or to locate a defect which cannot be found with discrete or continuous data
Short-term Vs. Long-term Capability Traits of Long-term capability indices (PP and PPK)
Long-term capability indices (PP and PPK) The same capability indices that you calculate for short-term variation can also be calculated for long-term, or total, variation. To differentiate them from their short-term counterparts, these long-term capability indices are called PP and PPK. (The P stands for "performance.") The only difference in their formulas is that you use σLT in place of σST. These long-term capability indices are important because no process or characteristic operates in just the short term. Every process extends out over time to create long-term performance.
Long-term process capability---→
Long-term process capability---→ the same measurements may include a variety of samples (different streams of machines, multiple operators/spindles/cavities/measuring equipment) i. Process capability calculations (Pp, Ppk) are performed using the sample standard deviation of the values. ii. Parameters not following a normal distribution (flatness, wait time) are normalized with transformation techniques before analyzing the data. iii. Variation and shifts in the mean occur in the long term. iv. The same capability indices that you calculate for short-term variation can also be calculated for long-term, or total, variation. To differentiate them from their short-term counterparts, these long-term capability indices are called PP and PPK. (The P stands for "performance.") v. The only difference in their formulas is that you use σLT in place of σST. These long-term capability indices are important because no process or characteristic operates in just the short term. Every process extends out over time to create long-term performance.
Measurement correlation
Measurement correlation a. Used when measurements are taken simultaneously with multiple measurement devices of the same type for parts coming from multiple streams of manufacturing. 6) Percent agreement: a. Attribute Gage R&R study-→ compares appraisers b. Go/no-go gage = this gage simply tells you if the part passes or fails; there are only two possible outcomes. i. This gage will simply tell if the part is within specifications. It does not tell you how "close" the result is to nominal, only that it is within specifications. To determine the effectiveness of the gage, you must conduct an attribute gage R&R study to assess the variation. Variation can be due either to human judgment or to machine variation (automatic measurement gauging where parts are screened as good/bad by the machine) ii. Results can be expressed as accept/reject or a rating of 1-5
Measuring Process Capability
Measuring Process Capability i. Process variation < specification limits-→ must reduce non-conformances + reduce variation ii. Crux of the Six Sigma methodology = reduction of variation. 1. Containment action = in some cases, may have to off-center the process distribution in a direction that requires rework and salvage rather than scrapping of parts; used to buy time while trying to find a way to reduce the variation. 2. Can also revisit the specification limits from the customer and engineering standpoint (usually due to tight, unrealistic specifications created by designers who have not reviewed the limitations in technology or the process capability)
Methods of reducing errors:
Methods of reducing errors: a. Data collection plan b. Calibration schedule for data collection equipment c. Repeatability & reproducibility (R&R) studies on the measurement system d. Statistical testing to remove outliers e. Clear & complete instruction & training f. Redundant error correction system for digital data transmission g. Auxiliary information recording (units, time of collection, condition, measurement equipment, data recorder)
How is the status of processes monitored?
Monitoring the status of processes: The status of processes can be monitored by averaging the measurement values in subgroups ( using central limit theorem) using SPC control charts.
Multi-variate Studies
Multi-variate Studies 1. Tool for analyzing the 3 types of variation: a. Cyclical (lot-to-lot or part-to-part variation) b. Temporal (shift-to-shift; changes over time) c. Positional (within-part variation) 2. Minimizes variation by identifying areas in which there is excess or deficient variation. 3. Tool for investigating the stability or consistency of a process. 4. Aids in identifying the source or location of variation within the process.
Multiple Linear Regression
Multiple Linear Regression a. Explains a higher portion of the variation in y b. Requires finding out how close the calculated b-values are to the actual coefficients for the population.
2. Experimental Objectives ANOVA , Alpha risk, Beta risk
Must use ANOVA to check whether the perceived difference between the "high" and "low" results is a statistically significant difference in the dependent variable (and not due to errors) d. Alpha risk = probability that the analysis will show that there is a difference when no difference actually exists e. The higher the power of the experiment, the lower the beta risk f. A higher number of replications or a larger sample size-→ reduced beta risk and a more precise experiment.
Natural process limits verses Specification limits
Natural process limits verses Specification limits a. Specification limits are the values between which products or services should operate. These limits are usually set by customer requirements.
Nominal Data Scale
Nominal Data Scale i. Values on the scale have no "numeric" meaning in the way that usually applies with numbers, and there is no ordering scheme (e.g., color-coded wires) ii. Application: counting, mode, chi-square
Non-normal data distribution vs. Normal distribution
Non-normal data distribution i. Box-Cox and Johnson transformations are used to change data from a non-normal to a normal distribution. c. Normally distributed data--→ use the normal distribution table to estimate process capability + estimate the mean and sigma using the control chart data
Normal Distributions
Normal Distributions 1) Continuous distribution used for variable data (length, mass, time, etc.) 2) Bell-shaped curve = averages of measurements of individual data follow a normal distribution even if the individual data are from a different distribution. 3) Standard deviation (Z-scores) = Z-scores correspond to areas under the bell curve 4) The standard normal distribution has mean = 0 and SD = 1 5) According to the central limit theorem, the sampling distribution of a statistic (like a sample mean) will follow a normal distribution as long as the sample size is sufficiently large. Therefore, when we know the standard deviation of the population, we can compute a z-score and use the normal distribution to evaluate probabilities for the sample mean.
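A sketch of points 3-5 using the standard normal distribution; the sample-mean example values are hypothetical.

```python
from scipy.stats import norm

# Areas under the bell curve between z-scores
norm.cdf(1) - norm.cdf(-1)    # ~0.6827 of values fall within +/-1 sigma
norm.cdf(3) - norm.cdf(-3)    # ~0.9973 within +/-3 sigma

# P(sample mean > 10.2) when mu = 10, sigma = 0.6, n = 36 (central limit theorem)
1 - norm.cdf(10.2, loc=10, scale=0.6 / 36 ** 0.5)   # ~0.0228
```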
Normal Probability plots
Normal Probability plots a. Used to test whether random data come from a normal distribution b. Used to test normality before doing further analysis. c. Will produce a straight line when data are manually plotted on probability graph paper d. Tests used to check the normality of random data: i. Empirical cumulative distribution function tests: 1. Anderson-Darling test for normality (most widely used with statistical software) 2. Kolmogorov-Smirnov test for normality ii. Correlation-based tests: 1. Ryan-Joiner test 2. Shapiro-Wilk test
Null Hypothesis
Null Hypothesis-→ believed to be true unless there is evidence to the contrary.
Objective of process capability studies:
Objective of process capability studies: i. To determine if the process is in statistical control ii. To determine if the process is capable of meeting specifications. e. If the process is not capable-→ take action to improve capability. f. If the process is capable→review the percent nonconformance outside the specification limits.
Ordinal Data Scale
Ordinal Data Scale i. Intervals between adjacent scale values are indeterminate (e.g., categorization of defects by criticality, functional failures, performance degradation, cosmetic defects) ii. Application: a) "Greater than" & "less than" operations, b) Median, c) Interquartile range, d) Sign test
Parametric & Nonparametric tests
Parametric & Nonparametric tests a. Parametric tests= descriptive measure calculated using population data. i. Assumes that data are normally distributed. b. Nonparametric tests i. Distribution-free testing-→makes no assumption regarding the population distribution. ii. Applied to ranked data in which data are not specific in any continuous data or attribute data.
Design of Experimentation Planned Grouping
Planned Grouping a. Practice done to promote uniformity within blocks and minimize the effect of unwanted variables. This makes the experiment more effective in determining assignable causes.
Process Capability (Learning)
Process Capability (Learning) a. Specification limits reflect the voice of the customer; process variation reflects the voice of the process b. Process capability is measured by the capability index (Cp) and process performance index (Cpk)
Process Capability Indices
Process Capability Indices a. Used to quantify process capability in a single number. b. Focus on variable (continuous) data c. C(pk) ≥ 1-→ process is capable (natural process limits lie inside the specification limits); Six Sigma requires specification limits at ±6σ, i.e., C(p) = 2 d. 6σ process = one where σ ≤ (1/12 × specification width) and the process average may shift by up to 1.5σ e. Percent violating the specification limits is based on Z-values corresponding to 4.5σ f. C(p)--→ shows how good C(pk) could be if the process were centered (Cp doesn't take into account whether the process is centered in the specification) g. C(r)--→ shows the percent of the specification used up by process variation (lower values of Cr are better)----------------------→ [Cr = inverse of C(p)]
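A minimal sketch computing Cp, Cpk, and Cr from these definitions; the specification limits and process statistics are hypothetical.

```python
def capability(usl, lsl, mean, sigma_st):
    cp = (usl - lsl) / (6 * sigma_st)                   # potential capability
    cpk = min(usl - mean, mean - lsl) / (3 * sigma_st)  # accounts for centering
    cr = 1 / cp                                         # fraction of spec used by the process
    return cp, cpk, cr

cp, cpk, cr = capability(usl=10.6, lsl=9.4, mean=10.1, sigma_st=0.1)
# cp = 2.0, cpk ~ 1.67 (off-center), cr = 0.5 -> half the specification is used
```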
Process Capability for Attributes Data
Process Capability for Attributes Data a. Discrete (attribute) data--→ binomial process capability = process capability is simply the average proportion defective, which is then converted to a process sigma Z. b. Percent defective = percentage of parts sampled during final inspection that are defective i. Percent defective = 100 × average p ii. Process Z = capability index computed from the average p by finding the value from the standard normal distribution such that the area to the right of the value is the average p c. Chi-square distribution—used to calculate the confidence intervals for capability indices (calculation is not important) *** If the data are normal but not stable-→ use Pp and Ppk. *** If the data are normal and stable-→ use Cp and Cpk
Process Performance Indices
Process Performance Indices a. Provides a picture of current process operations b. Used to compare & prioritize improvement efforts
Process Performance vs Specification
Process Performance vs Specification a. Natural Process Limits i. Natural process limits= +/- 3σ-→ if the natural process limits are outside the specification limits then the process can't meet specifications. ii. Derived from process variation after removal of all special causes and the process reached statistical stability. iii. Green Belts expected to review those out-of-control violations, assign special causes, and recalculate the control limits before firming up the limits of implementation.
Process capability, Common Cause , Special Cause Variation.
Process capability = ability of the process to meet the expected specifications a. Every process has variation due to common causes and special causes, both internal and external to the process. b. Common cause variation (random process variation) = influences process capability c. Special cause variation (assignable causes) = removed before estimating process capability.
Purpose of Design of Experiments
Purpose of Design of Experiments a. The objective of a designed experiment is to generate knowledge about a product/service. b. Find the effect that a set of independent variables has on a set of dependent variables. c. Control factors = independent variables that the experimenter controls (the rest are called noise factors) d. Focus on how the outcome depends on the variation. e. May involve designing a set of experiments to identify optimal conditions
Design of Experimentation Randomization
Randomization a. Organizes the experiment to have treatment combinations done in a chance manner, improving statistical validity.
Ratio Data Scale
Ratio Data Scale: 1) There is a rational zero point for this scale; ratios can be equivalent 2) Application: a) Geometric mean b) Coefficient of variation c) Multiplication & division of scale values
Types of Variation in Measurements: Repeatability
Repeatability: 1. Variation in measurement that occurs when the SAME measuring system (appraiser, equipment, materials) is used; reflected in the R values 2. Repeatability is referred to as equipment variation, and the R averages indicate differences between appraisers. 3. If two R averages are compared, the smaller R value means there were fewer errors and the appraiser is more consistent (inverse relationship between the R value and precision)
Repeatability: Equipment Variation (EV)
Repeatability: Equipment Variation (EV) This is the "within appraiser" variation. It measures the variation one appraiser has when measuring the same part (and the same characteristic) using the same gage more than one time. The calculation is EV = R-bar (average range) × K1, where K1 is a constant that depends on the number of trials. For 2 trials, K1 is 0.8862. For 3 trials, K1 is 0.5908.
Design of Experimentation Replication
Replication a. Observations or measurement to increase precision, reduce measurement errors, and balance unknown factors. b. Repetition of the set of all treatment combinations to be compared in experiment for sake of achieving reliability.
Reproducibility: Appraiser Variation (AV)
Reproducibility: Appraiser Variation (AV) This is the "between appraisers" variation. It is the variation in the average of the measurements made by the different appraisers when measuring the same characteristic on the same part.
Required Sample size
Required Sample size a. The sample size needed for hypothesis testing depends on: i. Type I and Type II risk ii. Minimum value to be detected between the population means iii. Variation in the characteristic being measured.
Purpose of Design of Experiments Response Variables
Response Variables i. Variables that show the observed results of an experimental treatment ii. Outcome of the experiment as a result of controlling levels, interactions, and number of factors
Run chart
Run chart a. Best tool for showing the stability of a process: i. Oscillation = process lacks steadiness ii. Trend = upward/downward pattern due to tool wear, gradual temperature changes, or loosening of a fixture. iii. Mixtures = process points appearing on either side of the chart with nothing close to the centerline -→ mix-up between two different machines, two different operators, or two lots of material. iv. Clustering = measurement problems
SPC Theory
SPC Theory 1. X-bar & R charts are the most common type of control chart; the subgroup average and range are calculated for each subgroup and plotted in order of production on separate charts.
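A minimal sketch of the X-bar & R chart limit calculations (UCL/LCL = X-double-bar +/- A2 x R-bar; UCL_R = D4 x R-bar, LCL_R = D3 x R-bar), assuming subgroups of size 5 and the standard tabulated constants for that size; the data are randomly generated for illustration:

    import numpy as np

    # Each row is one subgroup of 5 consecutive parts (hypothetical measurements).
    subgroups = np.random.default_rng(0).normal(10.0, 0.1, size=(25, 5))

    x_bars = subgroups.mean(axis=1)
    ranges = subgroups.max(axis=1) - subgroups.min(axis=1)
    x_dbar, r_bar = x_bars.mean(), ranges.mean()

    A2, D3, D4 = 0.577, 0.0, 2.114   # control chart constants for subgroup size n = 5
    print(f"X-bar chart: UCL={x_dbar + A2*r_bar:.3f}, CL={x_dbar:.3f}, LCL={x_dbar - A2*r_bar:.3f}")
    print(f"R chart:     UCL={D4*r_bar:.3f}, CL={r_bar:.3f}, LCL={D3*r_bar:.3f}")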
Sampling Methods with Respect to Statistical Process Control
Sampling with Respect to Statistical Process Control 1) Random Sampling a. Measurements made within a batch are not a subgroup b. Used for batch processing -→ (1) pick random samples (2) average the measurements of the samples (3) plot the average as one data point of a subgroup. 2) Systematic Sampling a. Used when performing individuals/moving range SPC monitoring by sampling every nth part. b. Used when parts are coming off a conveyor c. Used in transactional process situations -→ assessment of service quality by sampling every nth customer. 3) Subgroup Approach to Sampling a. Used for plotting X-bar & R charts or X-bar & s charts. b. "Within group" variation should contain only chance causes -→ the reason why consecutive parts are sampled for the X-bar chart. c. Subgroup intervals should be planned to capture special causes.
Interpretation of Scatter Diagrams
Scatter Diagram b. Find the cause-and-effect relation or the correlation between variables. c. A linear relationship must be present to estimate the correlation coefficient (Pearson's correlation) i. No correlation if the data are spread out with no inclination to right or left ii. X-axis = independent variable iii. Results: Positive correlation / Negative correlation / No correlation / Non-linear correlation. ****Association and causality are not the same thing e. The closer r is to +1 or -1, the stronger the association and the higher the likelihood that the variables are related.
Scatter Diagrams & irrelevant variables
Scatter Diagrams & irrelevant variables Mathematical relationships can be found with irrelevant variables -→ must use engineering judgment to select variables
Coefficient of Determination & Scatter Diagrams
Scatter Diagrams: Coefficient of Determination = how well the regression line fits the data -→ how much of the variability in the y-values can be explained by the fact that they are related to the x-values.
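A minimal sketch, assuming SciPy: the coefficient of determination is simply r squared, so the share of y-variability explained by x can be computed directly from Pearson's r; the data pairs are illustrative:

    from scipy.stats import pearsonr

    x = [1, 2, 3, 4, 5, 6]
    y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

    r, p_value = pearsonr(x, y)
    print(f"r = {r:.3f}, r^2 = {r**2:.3f}")   # r^2 = share of y-variation explained by x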
Sequential sampling
Sequential sampling i. Best for destructive testing and reliability testing, where a higher cost is involved in testing each unit. ii. Samples are tested sequentially until the desired results are reached.
Interpretation of Gage R&R
Several points lie outside the control limits in the sample = samples cover the spread of the process specification -→ must verify whether it is a recording error or variation due to special causes. h. Best to have the points within the control limits i. The spread of measurement points & outliers will show whether an appraiser is consistently measuring higher or lower than others.
Short-term vs. Long-term Capability: Traits of the Short-term capability index (CP)
Short-term capability index (CP) Determining the width between the two rigid specification limits is easy; it is simply the distance between the upper specification limit (USL) and the lower specification limit (LSL). But with variation that trails out at the tails, how do you determine the width of the process? To get over this hurdle, Six Sigma practitioners have defined the effective limits of any process as being three standard deviations away from the average level. At this setting, these limits surround 99.7 percent, or virtually all, of the variation in the process. (Figure 13-9 shows these limits graphically.) So to compare the width of the specification to the short-term width of the process, you use the following formula: Cp = (USL - LSL) / (6 x short-term standard deviation). A short sketch follows.
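A minimal sketch of the Cp formula above, with illustrative specification limits and an assumed short-term standard deviation:

    USL, LSL = 10.5, 9.5          # specification limits (illustrative)
    sigma_st = 0.12               # short-term standard deviation of the process

    Cp = (USL - LSL) / (6 * sigma_st)
    print(f"Cp = {Cp:.2f}")       # > 1 means the process fits inside the specification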
Simple Linear Correlation Coefficient
Simple Linear Correlation Coefficient a. Indicates the strength and direction of the relationship between the dependent and independent variable. b. Strength of the coefficient = how close the value is to +1 or -1. c. Rho = population correlation coefficient d. Simple regression = straight line that best fits a series of ordered pairs (x, y) e. Assumes that the data points lie along a straight line whose equation is y = B0 + (B1)(X1)
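A minimal sketch of fitting the straight line y = B0 + B1*x by least squares, assuming NumPy; the ordered pairs are illustrative:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.2, 2.9, 4.1, 4.8, 6.1])

    B1, B0 = np.polyfit(x, y, deg=1)       # slope first, then intercept
    print(f"y = {B0:.3f} + {B1:.3f} * x")
    print(f"prediction at x = 6: {B0 + B1*6:.3f}")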
Sources of Measurement Variation:
Sources of Measurement Variation: a. Part-to-part variability b. Measurement system variability c. Variation due to gage (repeatability) d. Variation due to operators (reproducibility)
Special Cause variation: (Assignable causes)
Special Cause variation: (Assignable causes) a. Unusual event that the operator can adjust or remove. b. Process outputs will be unpredictably influenced by random events if all the special causes are not mitigated.
Specification limits verses Control limits:
Specification limits verses Control limits: a. Control limits are calculated from process data. They represent how your process actually performs. Specification limits are defined by your customer and represent the desired performance of your process. b. Specification limits and control limits are used for different purposes. Control limits let you assess whether your process is stable. Specification limits allow you to assess how capable your process is of meeting customer requirements.
Specification vs. Process capability
Specification -→ derived from the customer or engineering point of view. Capable process -→ process variation is significantly lower than the width of the specification (upper limit - lower limit)
Standard error of the mean
Standard error of the mean: a. Used to calculate the margin of error (used to find the confidence interval) b. 95% of the time, the sample mean lies within about 2 standard errors of the true mean -→ the true mean should lie within x-bar +/- 1.96(sigma / square root of n)
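A minimal sketch of the margin of error and 95% confidence interval, x-bar +/- 1.96(sigma/sqrt(n)), assuming the population sigma is known; the values are illustrative:

    import math

    x_bar, sigma, n = 50.2, 3.0, 36      # sample mean, known population SD, sample size
    se = sigma / math.sqrt(n)            # standard error of the mean
    margin = 1.96 * se                   # margin of error at 95% confidence

    print(f"SE = {se:.3f}, 95% CI = ({x_bar - margin:.2f}, {x_bar + margin:.2f})")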
Statistical Process Delay.
Statistical Process Delay. a. Misapplication of statistical techniques to pretend that information is being used to run a process, when in reality it is only for show rather than being applied to actually improve the process
Stem-and-leaf plot
Stem-and-leaf plot a. Used to quickly identify repetitive data within a class interval b. May reveal measurement errors if the data values are not evenly distributed in the cells. c. Like a tally column, but the last digit of the data is recorded i. Leaf unit = tells which unit is used (pg. 163 + Khan Academy video on reading stem-plots)
Steps for Process Capability Studies Measurement System Verification
Steps for Process Capability Studies 1) Measurement System Verification a. Measurement system analysis (1st step in a process capability study) ---- variations in the measurement system can mislead the process capability study & process stability monitoring b. Remove sources of variation in the measurement system -→ reduces the measurement variation present in the overall process variation and process tolerance.
Steps for Process Capability Studies Identification of rational sample subgroups for plotting a control chart
Steps for Process Capability Studies Identification of rational sample subgroups for plotting a control chart a. Subgroup sizes of 2-10 (usually less than 5 is best) b. Stability of the process is observed on the control chart by plotting consecutive samples, taken at equal intervals from the process, vs. average/range c. Must make sure that "within subgroup" variation is less than "between subgroup" variation (see the sketch after this list). d. Low-volume processes (subgroup size = 1): i. Individual charts are plotted to monitor stability where the subgroup size = 1 ii. When subgroup size = 1 -→ less sensitive in detecting shifts in the process e. High-volume processes (subgroup size > 10) -→ use average & standard deviation (X-bar & s) charts (best for detecting shifts in the process but costly) f. Average & range chart -→ more economical choice for detecting shifts in the process
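A hedged sketch of the within- vs. between-subgroup comparison in item c, assuming NumPy; the subgroup layout and data are illustrative, and the two quantities are simply computed side by side for inspection:

    import numpy as np

    # Each row is one subgroup of 4 consecutive parts (hypothetical measurements).
    subgroups = np.random.default_rng(1).normal(5.0, 0.2, size=(30, 4))

    within = subgroups.std(axis=1, ddof=1).mean()    # average within-subgroup SD
    between = subgroups.mean(axis=1).std(ddof=1)     # SD of the subgroup means
    print(f"within = {within:.3f}, between = {between:.3f}")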
Steps for Process Capability Studies Measuring Process Capability
Steps for Process Capability Studies Measuring Process Capability a. Construct a histogram using the original readings (not the averages) from the control chart. i. Normality (normally distributed data) = histogram with most points grouped around a single peak and fairly symmetric tails on each side ii. Can conclude that the data are drawn from a normal population. iii. The more data used, the greater the significance
Steps for Process Capability Studies Process Stability
Steps for Process Capability Studies Process Stability a. Process is stable if the control chart shows no special causes after 20-30 subgroups are plotted (more points plotted = greater confidence in the conclusion) b. Control chart monitoring (process stability) = not affected by non-normal distribution of data c. Process capability -→ normality is required for continuous data
Steps for conducting Gage R&R study
Steps for conducting Gage R&R study g. Planning: Check equipment calibration & availability of appraisers & supervisors. Inform appraisers of the measurement criteria and inspection method h. Sample Selection: Hand-pick samples covering the spread (avoid random sampling). Appraisers should not be involved in the sample selection process. i. Data tracking: Create a table for experimenting and comparing samples between trials and between appraisers j. Trials: Each appraiser within the group must measure all samples in every trial to ensure completeness of the study (data imbalances result if the study is incomplete) k. Data Collection Sheet: data can be collected either in the randomized data collection sheet or by using the tabular calculation sheet. l. Calculations: (See formula on page 191)
Stratified sampling
Stratified sampling i. No homogeneity in the lot (Mixture of parts from different machines, different streams, different raw material lots, different process settings) ii. Like random samples, stratified random samples are used in population sampling situations when reviewing historical or batch data. iii. Stratified random sampling is used when the population has different groups (strata) and the analyst needs to ensure that those groups are fairly represented in the sample. iv. In stratified random sampling, independent samples are drawn from each group. The size of each sample is proportional to the relative size of the group. 1. For example, the manager of a lending business wanted to estimate the average cycle time for a loan application process. She knows there are three types (strata) of loans (large, medium and small). Therefore, she wanted the sample to have the same proportion of large, medium and small loans as the population. She first separated the loan population data into three groups and then pulled a random sample from each group
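A minimal sketch of proportional stratified sampling, assuming pandas; the loan strata mirror the example above, and the column names and counts are hypothetical:

    import pandas as pd

    loans = pd.DataFrame({
        "stratum": ["large"] * 20 + ["medium"] * 50 + ["small"] * 130,
        "cycle_days": range(200),     # placeholder cycle-time values
    })

    # Sampling the same fraction from each stratum keeps the sample proportional
    # to the relative size of each group.
    sample = loans.groupby("stratum", group_keys=False).sample(frac=0.10, random_state=0)
    print(sample["stratum"].value_counts())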
T-distribution
T-distribution 1) The t-distribution allows us to conduct statistical analyses on certain data sets that are not appropriate for analysis using the normal distribution. 2) Sample sizes are sometimes small, and often we do not know the standard deviation of the population. When either of these problems occurs, statisticians rely on the distribution of the t statistic (also known as the t score).
Techniques for assuring data accuracy and integrity: 10 common causes of errors?
Techniques for assuring data accuracy and integrity 1) Common causes of errors: a. Units of measure not defined b. Poor legibility c. Loss of precision (rounding errors) d. Distortion of data (emotional bias) e. Inconsistency f. Poor training g. Ambiguous terminology h. Clerical/typo errors i. Guesswork/personal bias (inadequate use of validation techniques) j. Multiple points of data entry --- inconsistency and errors
Test Statistic-
Test Statistic-→calculated value made from sample information used to test the null hypothesis
Tests for Means, Variances & Proportions:
Tests for Means, Variances & Proportions: 1. Continuous data --- large samples --- normal distribution used to find the CI 2. Continuous data --- small samples --- samples less than 30 use the t-distribution (see the sketch below). 3. Confidence intervals for variances --- the CI is not symmetrical around the average, so the Chi-square distribution must be used to find the CI. 4. CI for proportions --- normal distribution used to find the CI
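A minimal sketch contrasting items 1 and 2: for a small sample with unknown population SD, the confidence interval uses the t-distribution with n - 1 degrees of freedom, assuming SciPy; the data are illustrative:

    import numpy as np
    from scipy import stats

    data = np.array([9.8, 10.1, 10.3, 9.9, 10.2, 10.0, 9.7, 10.4])   # n < 30
    mean = data.mean()
    sem = stats.sem(data)                       # s / sqrt(n)

    ci = stats.t.interval(0.95, df=len(data) - 1, loc=mean, scale=sem)
    print(f"95% CI (t-based): ({ci[0]:.3f}, {ci[1]:.3f})")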
The T distribution has the following properties:
The T distribution has the following properties: a. The mean of the distribution is equal to 0. b. The variance is equal to v / (v - 2), where v is the degrees of freedom and v > 2. c. The variance is always greater than 1, although it is close to 1 when there are many degrees of freedom. With infinite degrees of freedom, the t distribution is the same as the standard normal distribution. d. The t-distribution is bell-shaped, but smaller sample sizes show increased variability (flatter shape) e. As sample size increases, the distribution approaches the normal distribution f. Mean = 0 g. Used when the population SD is unknown h. Variance > 1, but approaches 1 from above as the sample size increases.
There are many sources of variation in the process. However, these sources can be grouped into three categories:
There are many sources of variation in the process. However, these sources can be grouped into three categories: a. variation due to the process itself b. variation due to sampling c. variation due to the measurement system
Three indices (Cp, Cpk, & Cpm) all require process stability.
Three indices (Cp, Cpk, & Cpm) all require process stability. i. Cp & Cpk are better measures because they reflect capability derived from common cause variation. ii. Pp & Ppk can be applied to data collected from incoming inspection material to obtain an indication of process capability at the supplier end where SPC data is unavailable. iii. Process data containing special cause variation -→ Pp & Ppk can be used for information about the process as long as the data follows a normal distribution. iv. Pp & Ppk can be used where SPC-implemented suppliers are not willing to share their process data in time sequence with customers, or where suppliers have not implemented SPC in their processes.
Two calculations used to identify how capable a process is
Two calculations used to identify how capable a process is: i. Capability index (Cp) ii. Process performance (Cpk)
Two reasons that one should move from discrete to continuous measurement:
Two reasons that one should move from discrete to continuous measurement: a. Control charts based on continuous data are more sensitive to process changes than those based on discrete data. b. When designed experiments are used in process improvement, changes in continuous data may be observed even though the discrete measurement hasn't changed.
Two types of statistical studies:
Two types of statistical studies: a. Descriptive studies i. Purpose is to present data in a way to facilitate understanding. b. Inferential studies. i. Analyze data from a sample to infer properties of the population from which the sample was drawn
Two types of studies used for drawing statistical conclusions:
Two types of studies used for drawing statistical conclusions: a. Descriptive (enumerative) study b. Analytical (inferential) Study
Two-tail test
Two-tail test a. Used to test whether a population shift has occurred in either direction.
Types of Error in Hypothesis Testing
Types of Error in Hypothesis Testing a. Type I Error i. Null hypothesis is incorrectly rejected ii. Alpha error = producer's risk (e.g., incoming goods were good but incorrectly labeled defective) b. Type II Error i. Failure to reject the null hypothesis when it is false ii. Beta error = consumer's risk (defective product labeled as good)
Types of Variation in Statistical Study:
Types of Variation in Statistical Study: a. Calibration - drift in average measurements from an absolute value. b. Stability - drift in the absolute value over time. c. Repeatability - ("within appraiser" variation) --- variation in measuring equipment when measurements are made by one appraiser in the same setting at the same time. Equipment measurement variation is expressed as a standard deviation
Design of Experimentation Variables: Dependent Variables & Independent Variables.
Variables: a. Dependent Variables i. The independent variable is the cause of the apparent change in the dependent variable ii. Dependent variable values are compared in research. iii. The dependent variable can't change without the independent variable b. Independent Variables i. Most commonly an input to the experiment; ideally does not have interactions with other variables
Visual indicator of nonrandom patterns 3 uses?
Visual indicator of nonrandom patterns Used to: i. Identify patterns in process data ii. Detect non-random variation. iii. Detect trends, oscillation, clusters, & mixtures. d. Typically used when subgroup size = 1 i. Subgroup size > 1 -→ means or medians must be calculated and connected (like control charts) e. Provides real-time feedback using variables data. f. P < 0.05 -→ statistically significant non-random pattern (see the sketch below) g. Number of runs above or below the median = presence of clustering, mixtures, trends, or oscillation.
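A hedged sketch of the runs-above/below-median test implied by items f and g, using the Wald-Wolfowitz normal approximation; the data are illustrative and the implementation is one common way to compute the p-value, not the only one:

    import numpy as np
    from scipy.stats import norm

    data = np.array([5.1, 5.3, 5.0, 5.4, 5.2, 5.6, 5.5, 5.7, 5.8, 5.9, 6.0, 6.1])
    median = np.median(data)
    signs = data[data != median] > median     # True = above median, False = below

    runs = 1 + int(np.sum(signs[1:] != signs[:-1]))   # number of runs
    n1, n2 = int(signs.sum()), int((~signs).sum())

    # Expected runs and variance under the hypothesis of randomness.
    exp_r = 2 * n1 * n2 / (n1 + n2) + 1
    var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    z = (runs - exp_r) / var_r ** 0.5
    p = 2 * (1 - norm.cdf(abs(z)))
    print(f"runs = {runs}, p = {p:.3f}")      # p < 0.05 suggests a non-random pattern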
Weibull Plots
Weibull Plots a. Used for reliability data when the underlying distribution is unknown b. Can be used to estimate the shape parameter beta and the mean time between failures (MTBF) or failure rate. c. Can be done manually or via software. d. These parameters relate directly to the stages of the product life cycle.
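A minimal sketch of estimating the Weibull shape (beta) and MTBF from failure times, assuming SciPy and MTBF = eta x Gamma(1 + 1/beta); the failure-time values are illustrative:

    from scipy.stats import weibull_min
    from scipy.special import gamma

    failures = [120, 340, 410, 560, 700, 850, 990, 1200]   # hours to failure
    beta, loc, eta = weibull_min.fit(failures, floc=0)     # location fixed at 0

    mtbf = eta * gamma(1 + 1 / beta)                       # mean time between failures
    print(f"beta = {beta:.2f}, eta = {eta:.0f} h, MTBF = {mtbf:.0f} h")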
When seeking to reduce variation, MSA should be evaluated for 2 reasons:
When seeking to reduce variation, MSA should be evaluated for 2 reasons: a. All data from the process is filtered through the measurement system b. It often represents the most cost-effective way to reduce variation. 8) Measurement systems commonly consume 40-50% of the process specification (tolerance) through measurement error.
The F distribution has two parameters:
a. The F distribution has two parameters: degrees of freedom numerator (dfn) and degrees of freedom denominator (dfd). b. The dfn is the number of degrees of freedom on which the variance estimate in the numerator is based. The dfd is the number of degrees of freedom on which the variance estimate in the denominator is based. The dfd is often called the degrees of freedom error, or dfe. In the simplest case of a one-factor between-subjects ANOVA, dfn = a - 1 (where a is the number of groups).
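A minimal sketch of a one-factor between-subjects ANOVA with SciPy, where dfn = a - 1 = 2 for the a = 3 illustrative groups and dfd = N - a = 9; the group data are hypothetical:

    from scipy.stats import f_oneway

    g1 = [5.1, 5.3, 5.2, 5.4]
    g2 = [5.6, 5.5, 5.8, 5.7]
    g3 = [5.0, 5.2, 5.1, 5.3]

    F, p = f_oneway(g1, g2, g3)     # dfn = 3 - 1 = 2, dfd = 12 - 3 = 9
    print(f"F = {F:.2f}, p = {p:.4f}")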
If the characteristic or process variation is centered between its specification limits
c. If the characteristic or process variation is centered between its specification limits, the calculated value for CPK is equal to the calculated value for CP. But as soon as the process variation moves off the specification center, it's penalized in proportion to how far it's offset. CPK is very useful and very widely used. i. That's because it compares the width of the specification with the width of the process while also accounting for any error in the location of the central tendency. This approach is much more realistic than the one the CP method offers. ii. Generally, a CPK greater than 1.33 indicates that a process or characteristic is capable in the short term. Values less than 1.33 tell you that the variation is either too wide compared to the specification or that the location of the variation is offset from the center of the specification. It may be a combination of both width and location.
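A minimal sketch of the off-center penalty described above: CPK takes the worse of the two one-sided capabilities, so it equals CP only when the process is centered; the numbers are illustrative:

    USL, LSL = 10.5, 9.5
    mean, sigma_st = 10.2, 0.12   # process average is offset from the 10.0 center

    Cp  = (USL - LSL) / (6 * sigma_st)
    Cpk = min(USL - mean, mean - LSL) / (3 * sigma_st)
    print(f"Cp = {Cp:.2f}, Cpk = {Cpk:.2f}")   # Cpk < Cp because of the offset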
Rational Subgrouping
e. Rational Subgrouping i. Process of putting measurements into meaningful groups to better understand the important sources of variation. ii. Used in process sampling situations when data is collected in real time during process operations. iii. It involves grouping measurements produced under similar conditions, sometimes called short-term variation. This type of grouping assists in understanding the sources of variation between subgroups, sometimes called long-term variation.