Analyze Phase
Hypothesis Tests w/ Normal or Non-Normal Data
1- and 2-sample t-tests, variance tests, and ANOVA for normal data; non-parametric tests for non-normal data; 1- and 2-proportion tests and the chi-square test for attribute data.
Hypothesis of Proportions (Non-Normal Data)
1-Proportion, 2-Proportion; Chi-Square (>2)
Hypothesis of Median Tests (Non-Normal Data)
1-Sample Sign, 1-Sample Wilcoxon, Mann-Whitney for 2 Samples; Kruskal-Wallis (>2), Mood's Median (>2), Friedman (>2)
Non-Parametric Tests (Median)
1-Sample Sign, 1-Sample Wilcoxon, Mann-Whitney (2-sample Wilcoxon rank-sum), Kruskal-Wallis, Mood's Median, Friedman
Hypothesis of Means Tests (Normal Data)
1-Sample z/t Test, 2-Sample t-test, Paired t-test, One-way ANOVA or Balanced ANOVA (>2 samples)
Z Distribution
1-sample z-test of means
Conducting a Hypothesis Test
1. Choose the population parameter to be tested. 2. State the H0 and the HA. 3. Choose the Hypothesis Test to analyze the situation. 4. Identify the consequences of making a bad decision. 5. Establish the sample size required to detect the difference. 6. Collect data and perform the appropriate Hypothesis Test with a statistical software package like Minitab. 7. Use the p-value/test statistic/CI to make the decision.
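The steps above can be sketched in code. This is a minimal example, assuming SciPy in place of Minitab and using made-up cycle-time data; the target of 10 minutes and the sample values are illustrative only.

```python
# Minimal hypothesis-testing workflow sketch (illustrative data).
from scipy import stats

# Steps 1-2: parameter = mean cycle time; H0: mu = 10, HA: mu != 10
target = 10.0
sample = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5]  # step 6: collected data

# Steps 3/6: one-sample t-test (normal continuous data, sigma unknown)
res = stats.ttest_1samp(sample, popmean=target)

# Step 7: decide with the p-value at alpha = 0.05
alpha = 0.05
reject_h0 = res.pvalue <= alpha
print(f"t = {res.statistic:.3f}, p = {res.pvalue:.3f}, reject H0: {reject_h0}")
```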
Inference
A conclusion reached on the basis of evidence and reasoning. In Six Sigma problem solving methodology, we move from describing the nature of the sample data to inferring what will happen in the future with our data (descriptive to inferential).
One-Sample Sign Test
A hypothesis test that determines whether a statistically significant difference exists between the median of a non-normally distributed continuous data set and a standard. It provides a way to determine if there is truly a difference between the standard and a particular data set median or whether the difference is due to random chance. Evaluate with the point estimate, confidence interval, or p-value. The sign (positive or negative) indicates whether an observation is above or below the target. An alternative to the One-sample Z and One-sample t-tests. An example would be testing whether a call center, which has guaranteed a median hold time of 1 minute, is performing as promised.
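The sign test reduces to an exact binomial test on the number of observations above the hypothesized median (ties dropped). A minimal sketch, assuming SciPy (>= 1.7 for `binomtest`) and made-up hold times for the call-center example:

```python
# One-sample sign test as a binomial test on signs (illustrative data).
from scipy.stats import binomtest

target_median = 1.0  # guaranteed 1-minute median hold time
hold_times = [0.8, 1.3, 0.6, 1.1, 0.9, 2.4, 0.7, 1.6, 0.5, 0.9]

above = sum(x > target_median for x in hold_times)
below = sum(x < target_median for x in hold_times)

# Under H0 (median = target), signs are Binomial(n, 0.5)
res = binomtest(above, above + below, p=0.5)
print(f"{above} above / {below} below, p = {res.pvalue:.3f}")
```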
Mood's Median Test
A hypothesis test that determines whether a statistically significant difference exists between the medians of two or more independent sets of non-normally distributed continuous data. It is useful for determining if a particular stratum or group could provide insight into the root cause of process issues. Tests equality of population medians in a one-way design (aka median or sign scores test). Alternative to one-way ANOVA. More robust to outliers, but less powerful for large sample sizes, than Kruskal-Wallis. Special case of Pearson's Chi-Square test. An example: Assembly Line A products have a median production cycle time of 10.3 minutes, Assembly Line B products have a median of 9 minutes, and Assembly Line C products have a median of 11.5 minutes, and you want to determine if any of the 3 lines truly has a different median cycle time from the others or if the difference is just due to random chance.
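The assembly-line example can be sketched with SciPy's `median_test` (an assumption; the card does not name software). The cycle-time samples below are invented to match the example's medians:

```python
# Mood's median test on three assembly lines (illustrative data).
from scipy.stats import median_test

line_a = [10.1, 10.5, 10.3, 10.8, 9.9, 10.4]
line_b = [9.0, 8.7, 9.4, 9.1, 8.9, 9.3]
line_c = [11.5, 11.2, 11.9, 11.4, 11.7, 11.3]

# Counts above/below the grand median feed a chi-square test
stat, p, grand_median, table = median_test(line_a, line_b, line_c)
print(f"chi2 = {stat:.2f}, p = {p:.4f}, grand median = {grand_median}")
```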
2-Proportion Test
A hypothesis test that examines the difference between two population proportions using information from two samples, when the two sets of samples are statistically independent. The p-value and Confidence Interval are estimated based on the binomial distribution, or the normal approximation to the binomial distribution for sample sizes greater than 50. The method of estimating the proportions separately is preferred for large sample sizes, whereas the pooled estimate of the proportion is used to calculate the p-value for small sample sizes. H0: P1-P2 = P0; HA: P1-P2 not = P0. If P0 = 0, H0: P1 = P2 and HA: P1 not equal to P2.
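The pooled-estimate version of the test can be sketched directly from the normal approximation. A minimal example, assuming made-up defective counts:

```python
# Two-proportion z-test with the pooled estimate (illustrative counts).
from math import sqrt
from scipy.stats import norm

x1, n1 = 52, 400   # defectives and sample size, sample 1
x2, n2 = 74, 380   # defectives and sample size, sample 2

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)             # pooled proportion under H0: P1 = P2
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))              # two-sided p-value
print(f"z = {z:.3f}, p = {p_value:.4f}")
```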
1-Proportion Test
A hypothesis test that examines the population proportion using information from one sample and comparing it to a target value. Used to determine if a single process/population proportion is statistically different from a target specified value. The p-value and Confidence Interval are estimated based on the binomial distribution, or the normal approximation to the binomial distribution for sample sizes > 50. Hypothesis Statements: H0: P = P0, HA: P not = P0.
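An exact (binomial-based) version of the test is one line with SciPy's `binomtest` (>= 1.7, an assumption); the target proportion and counts below are illustrative:

```python
# One-proportion exact binomial test (illustrative counts).
from scipy.stats import binomtest

defectives, n = 18, 200          # observed defectives in the sample
p0 = 0.05                        # H0: P = 0.05

res = binomtest(defectives, n, p=p0, alternative='two-sided')
print(f"p-hat = {defectives / n:.3f}, p-value = {res.pvalue:.4f}")
```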
Weibull Distribution (Continuous but Non-Normal)
A mathematical distribution showing the probability of failure or survival of a material as a function of the stress. Life data analysis.
Standard Normal Distribution (Continuous)
A normal distribution with a mean of 0 and a standard deviation of 1. Area under curve=1. Z-score.
Cluster Sampling
A probability sampling technique in which clusters of participants within the population of interest are selected at random, followed by data collection from all individuals in each cluster. Divide the population into groups and randomly select the groups from which to collect random data (e.g. Exit Polls).
Systematic Sampling
A procedure in which the selected sampling units are spaced regularly throughout the population; that is, every n'th unit is selected.
F-Statistic
A ratio of two measures of variance: (1) between-groups variance, which indicates differences among sample means, and (2) within-groups variance, which is essentially an average of the sample variances. F = MS(between) : MS(within).
Random Sampling
A sample that fairly represents a population because each member has an equal chance of inclusion. Selected by chance or random order.
Point Estimation
A single numerical value as the estimate for an unknown population parameter.
R-squared value
A statistical measurement of the strength of the correlation between two variables and ranges from 0 to 1.
Two-Sample t-Test
A statistical method used to compare the means of 2 groups of subjects (groups are independent samples) to see if the difference is statistically significant or chance. If p <0.05, the null is rejected and the means are different. If Std Dev (Sample 1) > 2*Std Dev (Sample 2), use an unpooled method (unequal variances). Otherwise consider both the same and use a pooled method. Compare t-value and t-critical.
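The pooled-vs-unpooled rule of thumb above can be sketched in code. A minimal example, assuming SciPy and invented samples:

```python
# Two-sample t-test with the pooled/unpooled decision (illustrative data).
import statistics
from scipy import stats

group1 = [12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2]
group2 = [11.2, 11.0, 11.5, 10.8, 11.4, 11.1, 11.3]

s1, s2 = statistics.stdev(group1), statistics.stdev(group2)
# Rule of thumb from the card: unpooled (Welch) if one std dev is > 2x the other
equal_var = max(s1, s2) <= 2 * min(s1, s2)

res = stats.ttest_ind(group1, group2, equal_var=equal_var)
print(f"pooled = {equal_var}, t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```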
One-Way ANOVA
A statistical test used to analyze data from an experimental design with one independent variable that has three or more groups (levels) of numerical or categorical data with a numerical/continuous output.
Balanced ANOVA
A statistical test used to determine whether or not different groups have different means. Two or more factors with two or more levels (numerical or categorical) with numerical or continuous output. Observations are taken of all levels of all factors. Determining if all means are equal or if at least one factor is different. Unrestricted model considers all factors irrespective of restrictions.
MANOVA
A statistical test used to evaluate the relationship between three or more levels of an independent variable (numerical or categorical) and two or more dependent variables. Numerical or continuous output.
Stratified Sampling
A type of probability sampling in which the population is divided into groups with a common attribute and a random sample is chosen from each group.
Confidence Interval of Mean Formula
Alpha = 1 - Confidence Level = risk of a wrong decision. Increasing the sample size (or the alpha risk) decreases the CI width. First subtract the Confidence Level from 1 to find the alpha risk. Then divide alpha by 2 and look up the critical value in the z/t table. Find the MOE by multiplying the z/t value by the SE (Std. Dev. / square root of n). The CI is the sample mean plus or minus the MOE, giving the lower and upper confidence limits.
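The recipe above, step by step, as a short sketch (illustrative sample, 95% confidence, t critical value since sigma is unknown):

```python
# Confidence interval of the mean, following the card's recipe.
from math import sqrt
from statistics import mean, stdev
from scipy.stats import t

sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]
confidence = 0.95
alpha = 1 - confidence                         # risk of a wrong decision

n = len(sample)
se = stdev(sample) / sqrt(n)                   # standard error
t_crit = t.ppf(1 - alpha / 2, df=n - 1)        # two-sided critical value
moe = t_crit * se                              # margin of error
lci, uci = mean(sample) - moe, mean(sample) + moe
print(f"{mean(sample):.2f} +/- {moe:.2f} -> ({lci:.2f}, {uci:.2f})")
```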
Measurement Error
An error that occurs when there is a difference between the information desired by the researcher and the information provided by the measurement process (MSA/Gage R&R).
Expected Count
An estimate of how many observations should be in a cell of a two way table if there were no association between the row and column variables. (Row Total * Column Total)/Total Sample Size (n)
Standard Error
An estimate of the standard deviation of the sampling distribution of a statistic. Inversely proportional to sample size. The Std. Dev. for the sampling distribution of means is called the SE of the mean. The rate of change in the SE approaches 0 at SS ~ 30. For samples >30, the T and Z distributions become nearly equivalent. Represents the average distance that the observed values fall from the fitted line.
One-Sample T Test
An inferential statistical procedure that uses the mean for one sample of data for either estimating the mean of the population or testing whether the mean of the population equals some claimed value (e.g. customer service time vs. benchmarked value).
Confidence Interval
An interval estimate for an unknown population parameter that is dependent on the confidence level desired and the sample size.
Hypothesis of Std Dev (Non-Normal Data)
Bonett's Test for 1 or 2 Samples; Levene's Test (>2)
Granularity
Can be seen in a dot plot. May look symmetric like a normal distribution, but the data is not continuous (values cluster at discrete levels). Causes: measurement system resolution and categorization of data.
Multiple Modes
Causes: mixtures of distributions, trends or patterns (lack of independence), catastrophic failure.
Hypothesis of Std Dev Tests (Normal Data)
Chi-Squared (1 Sample), F-test (2 Sample), Bartlett's (>2 samples)
Friedman Test
Compares medians of three or more matched groups. Ranks the values in each block (separately by row) from low to high, then sums the ranks in each group (column); if the sums are very different, the p-value will be small (using Chi-Square). Use when: the underlying measurement is on an ordinal scale, the distribution of a dependent variable is highly skewed, or for a repeated-measures type of experiment to determine if a factor has an effect. Assumptions: the block is a random sample from the population; there is no interaction between blocks (rows) and treatment levels (columns); each block is measured on three or more different occasions; data should be ordinal/continuous; the samples do not need to be normally distributed. Hypothesis statement: H0-column effects are all the same; HA-they are not all the same. Preferred when the same parameter has been measured under different conditions on the same subject. Alternative to one-way ANOVA (with repeated measures). Similar to Kruskal-Wallis and an extension of the Paired Sign test.
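A minimal sketch, assuming SciPy: each list is one treatment (column) measured on the same six blocks (rows); the scores are made up.

```python
# Friedman test across three treatments on matched blocks (illustrative data).
from scipy.stats import friedmanchisquare

method_a = [7.1, 6.8, 7.4, 7.0, 6.9, 7.2]
method_b = [6.5, 6.2, 6.9, 6.4, 6.3, 6.6]
method_c = [7.8, 7.5, 8.0, 7.7, 7.6, 7.9]

stat, p = friedmanchisquare(method_a, method_b, method_c)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```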
One Sample Wilcoxon Signed Rank Test
Compares two related or matched samples to assess whether the median of pairwise averages of differences is zero. Determines whether two dependent samples were selected from populations having the same distribution. Used to examine the effectiveness of two methods/products etc. Assumes: data is at least ordinal in nature, pairs are independent of each other, the dependent variable is continuous in nature. Alternative to the Paired-t and One-sample t-tests. Similar to the One-sample sign test but more discriminating/efficient; requires a symmetrical distribution of the paired-difference data. Steps: 1. Determine the magnitudes of the differences of paired data. 2. Estimate pairwise averages or Walsh averages (the means of each possible pair of differences, including the pair of each value with itself). 3. Evaluate the corresponding: point estimate (estimated median of Walsh averages), test statistic (W-the number of Walsh averages exceeding the null hypothesized value), Confidence Interval, and p-value.
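For the one-sample use, shift the data by the hypothesized median and test whether the shifted values are symmetric about zero. A minimal sketch, assuming SciPy and made-up data:

```python
# One-sample Wilcoxon signed-rank test against a hypothesized median.
from scipy.stats import wilcoxon

target = 5.0                                   # hypothesized median
sample = [5.3, 4.6, 5.8, 5.1, 6.2, 4.9, 5.5, 5.7, 4.8, 5.4]

res = wilcoxon([x - target for x in sample])   # signed-rank test on differences
print(f"W = {res.statistic}, p = {res.pvalue:.4f}")
```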
Poisson (Discrete/Attribute)
Defect data
Binomial (Discrete/Attribute)
Defective data. Does not require natural logarithmic base for probability calculation.
Chi-Square (Continuous)
Depends on sample size. X^2 value.
Hypothesis Testing and Attribute Data
Depends on the practical difference to be detected and the statistical confidence you want to have.
Hypothesis Testing and Continuous Data
Depends on the practical differences to be detected, the inherent variation in the process. The statistical confidence you wish to have (alpha value).
Sampling Error
Error due to differences among samples drawn at random from the population. The difference between an estimate and the true value. The only source of error that statistics can accommodate.
Lack of Measurement Validity
Error in measurement when equipment does not actually measure what it is intended to.
Interval Estimation
Estimate an unknown parameter using an interval of values that is likely to contain the true value of that parameter. Estimation of the amount of variability in a sample statistic when many samples are repeatedly taken from a population.
Kruskal-Wallis Test
Extension of the Mann-Whitney test and alternative to one-way ANOVA. A rank-based non-parametric test used to test whether samples are likely to derive from the same population by comparing average ranks of more than two independent samples. H is the test statistic. The Chi-Square distribution approximates the distribution of H (if no group has fewer than five observations). Similar to Mood's Median test but more powerful (narrower confidence interval) and less robust to outliers.
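A minimal sketch with SciPy (an assumption) and illustrative samples:

```python
# Kruskal-Wallis H test across three independent groups (illustrative data).
from scipy.stats import kruskal

group1 = [2.9, 3.0, 2.5, 2.6, 3.2]
group2 = [3.8, 2.7, 4.0, 2.4, 3.5]
group3 = [2.8, 3.4, 3.7, 2.2, 2.0]

h_stat, p = kruskal(group1, group2, group3)
print(f"H = {h_stat:.3f}, p = {p:.4f}")
```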
F Distribution (Continuous)
F value. 2-Sample Variance.
Sampling
Factors: procedure, size, and participation (responsiveness). Sample the entire population when: the population is very small, when you have extensive resources, or when you do not expect a high response rate.
Type 2 Error (beta)
False negative (beta; setting a guilty man free). Concluding no difference when there really is one. 1 - beta = Power of the Test (beta is generally 10%, giving 90% power).
More than 2 Variance Tests
For normal data, Bartlett's test is used. For non-normal data, Levene's test is used for 2 variances and more than 2 variances. H0 says there is no difference among any of the variances. HA says that at least one is different.
Sample Size
Function of: the size of the population of interest, how important it is to be accurate (e.g. Confidence Level of 95%), and how important it is to be precise (e.g. MOE or Confidence Interval). In general, increasing the SS improves precision by decreasing the MOE/CI within a given Confidence Level.
Hypothesis
General concept/types, p-value and risk, alpha/beta, statistical vs. practical significance.
Pattern
Graphical analysis: multi vari chart (largest family of variation) and classes and causes of distribution.
Multi-Vari Analysis
Graphically displays patterns of variation through a series of charts. Used to identify possible Xs or families of variation. Use to analyze the effects of categorical/discrete inputs on a continuous output and to illustrate ANOVA. Balanced data is required to generate these properly, and an equal number of data points is required for each condition. *Collected data should represent at least 80% of the variation in a process. Hypothesis testing is performed based on the findings to establish statistical significance. Can also be used to assess capability, stability, and relationships between Xs and Ys.
ANOVA Introduction
If there are more than two samples of data, you could plausibly do pairwise comparisons (2-sample t-tests), but the alpha risk would increase as the number of means increases (family-wise alpha = 1-(1-a)^m, where m = number of pairs of means). Analysis of Variance is used to test for more than two means of quantitative populations. H0 = all means are equal. HA = at least one is different. ANOVA extends the 2-sample t-test to testing the equality of more than two population means. Verify data normality -> test for equal variances -> review residual plots -> review interval plots.
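Both the alpha-inflation formula and the one-way ANOVA itself can be sketched briefly. This assumes SciPy and made-up samples; m = 6 corresponds to all pairs among four means:

```python
# Family-wise alpha inflation, then a one-way ANOVA (illustrative data).
from scipy.stats import f_oneway

alpha = 0.05
m = 6                                    # number of pairwise comparisons
family_alpha = 1 - (1 - alpha) ** m      # risk of at least one false positive
print(f"family-wise alpha for {m} t-tests: {family_alpha:.3f}")

a = [20.1, 19.8, 20.5, 20.2, 19.9]
b = [21.4, 21.1, 21.8, 21.3, 21.6]
c = [20.0, 20.3, 19.7, 20.4, 20.1]
f_stat, p = f_oneway(a, b, c)            # H0: all means equal
print(f"F = {f_stat:.2f}, p = {p:.4f}")
```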
Inferential
Inference/CI and sampling methods and Central Limit Theorem
Non-Valued Add
Lean approach using root-cause analysis by studying: VSM, process mapping, and process data.
Student T Distribution (Continuous)
Mean = 0; bell-shaped like the standard normal but with heavier tails, approaching the standard normal as sample size grows. T-value.
Kurtosis
Measure of the fatness of the tails of a probability distribution relative to that of a normal distribution. Indicates likelihood of extreme outcomes. Causes: mixtures (multiple batches), sorting or selecting, trends or patterns (lack of independence), non-linear relationships (X and Y). Mesokurtic=normal kurtosis is 0. Platykurtic=flat with short tails (negative). Distributions with multiple means shifting over time overlaying each other produce a flat head. Leptokurtic=peaked with long tails (positive). Distributions with very different variances overlaying each other produce a peaked head.
Causes of Non-Normal Distributions
Multiple sources of variation. Skewness, kurtosis, multiple modes, and granularity.
Chi-Square (Discrete/Attribute)
Nominal data in a contingency table
Mann-Whitney Test (aka Wilcoxon Rank-Sum Test or Wilcoxon-Mann-Whitney Test)
Nonparametric test used to test whether two samples are likely to derive from the same population. By comparing differences between the medians of two independent samples and computing the ranks of two samples. An alternative to the two-sample t-test. Evaluate: Point estimate (difference of medians of the samples), Test statistics (W-minimum of two Rank Sums), Confidence Interval, P-value.
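A minimal sketch with SciPy (an assumption) on two made-up independent samples:

```python
# Mann-Whitney U test on two independent samples (illustrative data).
from scipy.stats import mannwhitneyu

sample1 = [14, 11, 16, 12, 18, 13, 15]
sample2 = [22, 19, 25, 17, 21, 24, 20]

res = mannwhitneyu(sample1, sample2, alternative='two-sided')
print(f"U = {res.statistic}, p = {res.pvalue:.4f}")
```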
95% CI of Mean
Normal distribution: a = .05 = 5%, so z(0.025) = 1.96. T distribution: t = 2.201 (n = 12, df = 11). Use z for samples > 30; otherwise use t.
Sampling Bias
Occurs when a sample is collected in such a way that some members of the intended population are less likely to be included than others, resulting in a biased/non-random sample.
Proportion Tests
One and Two-Sample Proportion (binomial: pass/fail data), Chi-Square Contingency Table (association data)
Tests for Variance (Normal)
One, Two, and more than two Variance Tests.
Tests for Means (Normal)
One, Two-Sample and Paired T-test. ANOVA for more than two samples.
Parametric Tests (Normal Data)
One-sample t-test, Paired t-test, Two-sample t-test, One-way ANOVA, Two-way ANOVA, One-way ANOVA with repeated measures
Significant Difference
Practical and statistical. Most common error in statistics is to not tie these two differences together. Hypothesis testing is used to detect the difference.
Lognormal Distribution (Continuous but Non-Normal)
Probabilistic design and failure.
Inferential Statistics
Procedures used to draw conclusions about process or population characteristics from information contained in a sample drawn from a population in a way that accounts for randomness and uncertainty in the observations.
Descriptive Statistics
Procedures used to summarize and describe the characteristics of a set of measurements (bar charts, pie charts, and parameters such as mean and Std. Dev.).
Type 1 Error (alpha)
Rejecting the null hypothesis when it is true. Difference concluded when there is none. False positive. 1 - alpha = Confidence Level (alpha is generally 5%).
Skewness
Skewness is a characteristic of an asymmetrical distribution. A distribution is "negatively" skewed when a higher frequency of scores are found above the mean than below it (long tail to the left). A distribution is "positively" skewed when a higher frequency of scores are found below the mean than above it (long tail to the right). Skewness is 0 for normal distribution. Causes: natural limits, artificial limits, mixtures of data sets, nonlinear relationships, interaction of two inputs, and nonrandom patterns across time (aging).
Non-Normal Data Causes
Skewness is a natural state when processes are operating near limits, while the other three causes are usually symptoms of a problem. A common reaction is to transform the data. First determine if data transformation is appropriate or if you can remove the underlying causes first.
Practical Significance
The amount of difference that will be of practical value to business.
Hypothesis Testing
The comparison of sample results (point and interval estimation) with a known population parameter or a target/specified value. Helps us make fact-based decisions about whether there is significant difference in process or population parameter before and after improvements.
Statistical Significance
The magnitude of difference or change required to distinguish between a true difference and one that could have occurred by chance. Statistically significant differences can be found without practical differences.
Statistical Power
The probability (1 - beta) of correctly rejecting H0 when it is false. Used in establishing a sample size that will allow for the proper overlap of distributions. Power increases when: 1. the Difference to detect increases, 2. the Std. Dev. decreases, 3. the Sample Size increases, 4. the Significance Level (a) increases.
P-Value
The probability level which forms the basis for deciding if results are statistically significant (not due to chance). The risk involved in wrongly rejecting the H0. If the sample size is greater than 30, use the z-value; otherwise use the t-value. Standard z-tables give left-tail (cumulative) areas, so for a right-tail probability either look up the negative z value or subtract the table value from 1.
Alpha Risk
The probability of accepting the alternate hypothesis when, in reality, the null hypothesis is true. Usually .01-.1 (5% in manufacturing and 10% in transactional projects). The critical value is associated with alpha (aka level of significance). Not affected by sample size. If the p-value is less than or equal to alpha (e.g. .05), reject the H0.
Beta Risk
The probability of accepting the null hypothesis when, in reality, the alternate hypothesis is true. Related inversely to SS and alpha. Usual value of beta risk is 10%.
Margin of Error (MOE)
The range of values below and above the estimate in a Confidence Interval.
Null Hypothesis (H0)
The statistical hypothesis tested by the statistical procedure; usually a hypothesis of no difference or no relationship. Generally assumed to be true. Either reject the null or fail to reject it.
Central Limit Theorem
The theory that, as sample size increases, the distribution of sample means of size n, randomly selected, approaches a normal distribution. Helps us understand the risk we are taking: If the population is normal, the sampling distribution of means will be exactly normal regardless of SS. If the population is non-normal, sampling distribution of means will be normal when SS is >30.
Exponential Distribution (Continuous but Non-Normal)
Time between events
Purpose of Hypothesis Testing
To integrate the Voice of the Process (VOP) with the Voice of the Business (VOB), to make data based decisions to resolve problems, and to avoid high costs of experimental efforts by using existing data. Influenced by: past experience (belief/perception), current need (relevance/preference), evidence (data), and risk (tolerance for making the wrong decision).
Error DF
Total DF-DF (each source)
Non-Parametric Tests
Use when: data is obviously non-normal, the sample is too small for the Central Limit Theorem to lead to normality of averages, the distribution is not known, or the data is nominally or ordinally scaled. Use the median to describe central tendency. Used if even one distribution is non-normal. Still evaluates a p-value. In general, these tests: 1. rank order the data, 2. sum the data by ranks, 3. sign the data above or below the median, and 4. calculate, compare, and test the median. These tests make minimal assumptions about the data; the main one is that the samples come from identical continuous (or symmetrical) distributions. Central tendency measures: ordinal data-median; skewed continuous data-median; nominal data-mode.
Paired t-Test
Used to determine if the mean difference between two sets of observations is zero. Each subject is measured twice, resulting in paired and related data sets (the pairs themselves are independent of each other). Sample sizes of the two sets of observations must be the same. Is the mean difference 0 (H0)? Not 0 (HA).
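A minimal sketch, assuming SciPy: the same subjects measured before and after an improvement, with made-up values.

```python
# Paired t-test on before/after measurements of the same subjects.
from scipy.stats import ttest_rel

before = [8.2, 7.9, 8.5, 8.1, 7.8, 8.4, 8.0, 8.3]
after  = [7.6, 7.5, 7.9, 7.7, 7.4, 7.8, 7.5, 7.7]

res = ttest_rel(before, after)   # H0: mean difference = 0
print(f"t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```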
ANOVA Purpose
Used to investigate and model the relationship between a dependent variable and one or more independent variables. The classification variable (factor) usually has three or more levels (if there are only two, a 2-sample t-test can be used). An error (residual) is the difference in what the model predicts and the true observation. Assumptions-1. Observations are adequately described by the model and checked by normality at each level. 2. Errors/residuals are normally and independently distributed. 3. Homogeneity of variance among factor levels is checked by equal variance tests across all levels.
Chi-Square Test and Contingency Table (Bivariate/2-Way)
Used to simultaneously compare more than two sample proportions with each other. Works better with five or more observations in each cell of the contingency table (observations can be pooled by combining cells). HA is that at least one proportion is different.
Contingency Table
Used to test if the proportion is contingent upon (or dependent upon) the categorical factors or variables linked to data groups (aka used to test the relationship between two categorical factors or variables in a population). H0 = no relationship (independent); HA = relationship (dependent). Steps: 1. Calculate expected counts, 2. Calculate Chi-Square, 3. Calculate DF, 4. Compare the calculated Chi-Square value with the Chi-Square table.
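The four steps above are what SciPy's `chi2_contingency` performs; it also returns the expected counts computed with the (row total * column total) / n formula. The pass/fail counts below are made up:

```python
# Chi-square test of independence on a 3x2 contingency table.
from scipy.stats import chi2_contingency

observed = [[30, 10],    # e.g. Line A: pass, fail
            [25, 15],    # Line B
            [20, 20]]    # Line C

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
print("expected counts:", expected.round(1))
```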
2-Variance Test
Used to test if two populations have the same variance. For normal data, the F-test is used to conduct the two-variance test. For non-normal data that is less skewed, Bonett's is better. Bonett's can also be used for any continuous distribution. For extremely skewed non-normal data, Levene's is the most appropriate.
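A minimal sketch of the normal and robust options, assuming SciPy (which offers Bartlett's and Levene's tests but not Bonett's) and made-up samples:

```python
# Two-variance tests: Bartlett's (assumes normality) vs Levene's (robust).
from scipy.stats import bartlett, levene

s1 = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8, 4.4]    # low spread
s2 = [5.0, 3.1, 5.6, 2.8, 4.9, 3.3, 5.2]    # high spread

b_stat, b_p = bartlett(s1, s2)   # use for normal data
l_stat, l_p = levene(s1, s2)     # use for skewed / non-normal data
print(f"Bartlett p = {b_p:.4f}, Levene p = {l_p:.4f}")
```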
1-Variance Test
Used to test if the variance of a population is equal to a target, specified value, or hypothesized population variance. For normal data, the Chi-Square method is used to calculate the CI. For non-normal data, Bonett's method is better than Chi-Square.
Unequal Variance
Usually the result of differences in the shape of the distribution with some special causes (e.g. extreme tails, outliers, multiple modes).
Alternative Hypothesis (HA)
Will be considered true if the null hypothesis is false.