six sigma test 3
Non-parametric Friedman two-way ANOVA by ranks is often used when
the assumptions of parametric two-way ANOVA cannot be satisfied. Downside: no way to test the blocks, so no way to determine whether background variables have more effect, less effect, or the same effect on treatments
the purest of all the DOE techniques because it neatly separates main effects from interaction effects
the full factorials
5 steps of hypothesis testing
1)statement of hypothesis 2)select the level of significance 3)compute test statistic 4)formulate the decision rule 5) make a decision
KW step 3: the test statistic
H is the test statistic. H is approximated by the chi squared distribution if all the sample sizes are at least 5, with a (alpha) and K - 1 as the entering arguments
Kruskal-Wallis one- way analysis of variance by ranks
Non-parametric most commonly used alternative to the parametric one-way ANOVA
The 5-step hypothesis testing procedure can be used for any parametric or non-parametric test (T/F)
True
"voice of the manager or engineer" must be replaced by
"voice of the customer."
null hypothesis
(Ho) is a statement concerning a population that is generally believed to be true.
t computed formula
(x-bar - u)/[s/SQRT(n)]
z computed formula
(x-bar - u)/[sigma/SQRT(n)]
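The two computed-statistic formulas above can be sketched in a few lines of Python. The function names and the sample numbers are illustrative assumptions, not values from these notes.

```python
import math

def z_computed(x_bar, mu, sigma, n):
    """Z = (x-bar - u) / [sigma / SQRT(n)], population sigma known."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

def t_computed(x_bar, mu, s, n):
    """t = (x-bar - u) / [s / SQRT(n)], sample standard deviation s."""
    return (x_bar - mu) / (s / math.sqrt(n))

# Hypothetical data: sample mean 9,900 against a claimed mean of 10,000.
z = z_computed(x_bar=9900.0, mu=10000.0, sigma=400.0, n=64)
t = t_computed(x_bar=52.0, mu=50.0, s=4.0, n=16)
print(z, t)
```

The only difference between the two is whether the population sigma or the sample s goes in the denominator.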
friedman step 5: compute chi squared and make decision
The ranking is done for each block instead of as a combined set of data.
Every manufactured product (or process) possesses one (or two) major features/characteristics that appropriately reflect the results of changes or adjustments to the process
Visual, workmanship and cosmetic characteristics that are present or absent are designated as attributes. Attribute ratings ARE NOT sensitive enough for problem-solving efforts
friedman step 4: decision rule
If chi squared-computed > chi squared-critical, reject the H0. One-tail test with the risk on the right.
for full factorials: The number of treatments needed is
the number of levels (usually two) to the nth power, where n is the number of factors (e.g., four variables would be 2^4 = 16 treatments)
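The treatment count above can be checked by enumerating every combination. This is a minimal sketch; the factor names and level labels are hypothetical.

```python
from itertools import product

def full_factorial(factors):
    """Return every combination of levels, one dict per treatment."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]

# Four hypothetical factors, each studied at two levels.
factors = {
    "temperature": ["low", "high"],
    "pressure": ["low", "high"],
    "speed": ["low", "high"],
    "material": ["A", "B"],
}
treatments = full_factorial(factors)
print(len(treatments))  # levels ** factors = 2 ** 4
```

Adding a fifth two-level factor doubles the list to 32 treatments, which is the exponential growth the limitation card refers to.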
"When you can measure what you're speaking about and express it in numbers, you know something about it, but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind."
Lord Kelvin (1824 - 1907)
structural weaknesses in classical/Taguchi approach
1. Technique a. Classical (Box) and Taguchi use fractional factorial method (Fisher and Shainin use the full factorial method). b. If Taguchi orthogonal array approach fails to discover the RED X, no alternatives. c. Shainin DOE uses 10 distinct techniques, each suited for a particular problem or application. 2. Clue generation a. In fractional factorial and Taguchi DOE, engineers and/or teams guess at possible causes of the problem b. In Shainin DOE, "Listen to the parts. The parts and processes are smarter than the engineers."
prereqs for the mann whitney u test
1. data can be converted into ranks. 2. samples are independent. Note: Independence can be tested.
The Mann-Whitney U test was published when and by whom
1947 by Henry B. Mann and D. R. Whitney
kaizen usually involved use of
7 tools of quality control Industrial engineering methods
interaction
A common condition where the effect of one factor depends on the level of another factor
parametric one way ANOVA
A treatment is a cause, or specific source, of variation in a set of data.
larger samples mann u test (rarely the case)
As the sizes of the two independent samples increase, the sampling distribution of the statistic U tends to become normally distributed. If one of the sample sizes is > 20, the Z-test is applied. Formula for Z: Z = {SumR1 - SumR2 - (n1 - n2)[(n1 + n2 + 1)/2]} / SQRT{n1n2[(n1 + n2 + 1)/3]}
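A short sketch of the large-sample Z calculation, with the numerator and denominator written out separately. The rank sums and sample sizes are made-up illustration values.

```python
import math

def mann_whitney_z(sum_r1, sum_r2, n1, n2):
    """Large-sample Z for the Mann-Whitney test from the two rank sums."""
    total = n1 + n2 + 1
    numerator = sum_r1 - sum_r2 - (n1 - n2) * total / 2
    denominator = math.sqrt(n1 * n2 * total / 3)
    return numerator / denominator

# Hypothetical rank sums for two samples of 25 observations each.
z = mann_whitney_z(sum_r1=700, sum_r2=575, n1=25, n2=25)
print(round(z, 4))
```

When each rank sum equals its expected value under H0, the numerator is zero and Z = 0, which is a quick sanity check on the formula.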
balance (full factorials)
All possible combos of variables and levels are tested in a balanced design. allows systematic separation and quantification of all main effects and interaction effects
Limitations of full factorials
As number of factors/variables increases, number of treatments increases exponentially. From a practical pov, full factorials should be limited to four factors or less
wilcoxon step 5: make a decision
Compute the after - before differences; only the + and - changes are considered (zero differences are dropped). Order the absolute differences from lowest to highest. Tied differences are given an average rank. Each assigned rank is given the sign of the original difference. All negative ranks are summed (T-) and all positive ranks are summed (T+). The lower of the two sums is T-computed. T-critical comes from the Wilcoxon table
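The ranking procedure above can be sketched as follows. The helper names and the before/after outputs are hypothetical, and the critical value still has to come from a Wilcoxon table.

```python
def average_ranks(values):
    """Rank values from low to high; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wilcoxon_t(before, after):
    """Return (T+, T-, T-computed) for paired observations."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zeros
    ranks = average_ranks([abs(d) for d in diffs])
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)

# Hypothetical operator output under the old and proposed procedures.
before = [60, 52, 58, 55, 61, 50, 57, 54]
after = [64, 55, 57, 60, 61, 56, 59, 52]
t_plus, t_minus, t_computed = wilcoxon_t(before, after)
print(t_plus, t_minus, t_computed)
```

T+ and T- always add up to n(n + 1)/2 for the n non-zero differences, which makes an easy arithmetic check.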
(kaizen) All processes have two aspects
Control-taking action on deviations to maintain a given process state, adhering to established SOPs Improvement requires experimentally modifying the process to produce better results
Metrics such as percent defective do or do not consider the degree of defectiveness?
DO NOT
a method for carrying out carefully planned experiments on a process
Design of experiments
best way to solve chronic quality problems and reduce variation.
Designed experiments
Designed experiments developed by _____ of the U.S.A. solved quality related problems
Dorian Shainin (1914 - 2000)
Sir Ronald A. Fisher (1890 - 1962)
Experimental area Block—Uniform properties Plot—Basic unit of design which could be split by drawing a line. Treatment (e.g., application of fertilizer)
one way ANOVA step 3: test statistic
F is the test statistic. F = estimated population variance between the treatments / estimated population variance within the samples. F-critical entering arguments are a (alpha), K - 1, and N - K, where K = # of treatments and N = total number of observations
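As a rough illustration of the F ratio and its two degrees-of-freedom arguments, here is a small sketch. The function name and the three treatment samples are assumptions made for the example.

```python
def one_way_anova_f(groups):
    """Return (F, K - 1, N - K) for a list of treatment samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-treatment and within-sample sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical readings for three treatments.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_num, df_den = one_way_anova_f(groups)
print(f_value, df_num, df_den)
```

F-computed would then be compared with the F-critical value looked up for alpha and these two degrees of freedom.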
quantifying attributes, case of the electric shaver
Frequency spectrum analyzer, decibel meter, human ear. The ear was more accurate at determining noise level. Selected the quietest and noisiest units, did a component search (take apart and swap components), and were able to ID the Red X and increase tolerances. It all goes back to: define the problem (noise) and decide how to measure it. Took some trial and error, but once a system using the Bo Derek scale was found, the rest was just mechanics
Orthogonal array designed experiments developed by _____ (Ford put him on the map)
Genichi Taguchi (1924 - 2012) of Japan.
If the treatments are not normally distributed and/or the SD's are not equal, then use the non-parametric Kruskal-Wallis one-way analysis of variance by ranks procedure
H is the test statistic. H is approximated by chi squared if the sample sizes, n, > or equal to 5 Data is at least ordinal level (capable of ranking)
wilcoxon step 1: null/alternative hypotheses
H0: There is no difference between the operators' output using the old procedure versus the proposed procedure. Ha: There is a difference in the two procedures. The proposed procedure increases output (one-tailed test)
KW step 1: null/alternative
H0: four treatments are more-or-less equal. Ha: At least one of the treatments is different
small samples (usually the case) with mann whitney u test step 1: null and alternative hypotheses
H0: There is no difference between business majors and engineers with respect to problem-solving aptitude. Ha: There is a significant difference between business majors and engineers with respect to problem-solving aptitude. Can be either two-tail (there is a difference) or one-tail (one group is better or worse than the other)
one way ANOVA step 1: null/alternative
H0: u85 = u87 = u89 = u91 Ha: At least one treatment mean is different
step 1: statement of hypothesis
Ho: What is generally held to be true. Ha: What the analyst believes to be true based on the sampled evidence.
null (Ho) vs alternative (Ha)
Ho: the process is in control Ha: the process is out of control
Poor manufacturing processes
How many processes are "certified?" Almost none. Six Sigma companies do a Process Certification "scrub." 1. "scrub"-process audit against standardized practices along with needed remedial actions. 2. Six Sigma "scrubs" are initiated both before and at the end of an improvement project
most practiced aspect of statistics
Hypothesis testing
only way to scientifically validate a theory or statement based on a sample of data.
Hypothesis testing
one way ANOVA step 4: the decision rule
If F-computed > F-critical, reject the H0.
KW step 4: the decision rule
If H-computed > chi squared-critical, reject the H0.
one way ANOVA step 5: find f and reach decision
If H0 is rejected, use one of the following to find the significant differences: 1. If the sample sizes are all the same, use Tukey's HSD test. 2. If the sample sizes differ, use Tukey-Kramer.
wilcoxon step 4: the decision rule
If T-computed < T-critical, reject H0
full factorials analysis method used when
If a treatment and a block are each tested at two levels (e.g., optimal setting and most likely worst setting), and it is possible to obtain repeat readings for a given set of experimental conditions
which procedures are more powerful in controlling Type-I and Type-II errors.
If assumptions for parametric procedures can be reasonably satisfied, parametric should be used because they are more powerful in controlling Type-I and Type-II errors
block design step 5: compute F and make decision
If the H0 for the treatments is rejected, use Tukey to determine the significant differences
advantages of statistically designed experiments
Interactions can be detected and measured. Can use the same observations to estimate several different effects Experimental error is quantified and used to determine confidence in the results
poor management (and 3 sins of omission/commission)
Juran's 85:15 rule: 85% of all quality problems caused by management, 15% by workers. "sins of omission and commission" 1. Lack of knowledge. 2. No leadership on variation reduction. 3. No resources or time allocated to variation reduction. alternative: "Our quality thinking should be reduced process variation around the nominal as an operating philosophy for never- ending quality improvement."—Theresa Wagner, VP Continuous Improvement, Sunny Fresh Foods, Baldrige Award 1999, 2005
________ IS NOT a fancy name for statistical methods, it IS NOT an alternative to statistical methods, NOR is it limited to only product quality
Key variable isolation
friedman step 3: statistical test
Like non-parametric Kruskal-Wallis one-way ANOVA by ranks, chi squared is used as the test statistic. K - 1 is the degrees of freedom.
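A minimal sketch of the Friedman statistic follows. The notes do not state the formula; the sketch assumes the standard textbook form, chi squared = 12/[nK(K + 1)] x Sum(Rj^2) - 3n(K + 1), where n is the number of blocks and Rj is the rank sum for treatment j. The data table is hypothetical.

```python
def average_ranks(values):
    """Rank values from low to high; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def friedman_chi_squared(table):
    """table[i][j] is the reading for block i under treatment j."""
    n = len(table)       # number of blocks
    k = len(table[0])    # number of treatments
    rank_sums = [0.0] * k
    for row in table:    # ranking is done within each block
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    chi2 = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
    return chi2, k - 1

# Hypothetical data: 4 blocks (rows) by 3 treatments (columns).
table = [[1, 3, 5], [3, 4, 5], [2, 5, 8], [6, 4, 9]]
chi2, df = friedman_chi_squared(table)
print(chi2, df)
```

With K - 1 = 2 degrees of freedom and a = 0.05 the tabled chi squared-critical is 5.991, so this made-up data set would lead to rejecting H0.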
block design step 3: statistical test
Like parametric one-way ANOVA, F is the test statistic. F = Mean square of the treatments/ Mean square of the error MSTr/MSE
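The MSTr/MSE ratio can be sketched as below. The sketch assumes one reading per treatment-block cell and the usual sums-of-squares decomposition; the function name and the small data table are made up for illustration.

```python
def randomized_block_f(table):
    """table[i][j] is the reading for block i under treatment j.

    Returns (F for treatments, F for blocks).
    """
    n = len(table)       # number of blocks
    k = len(table[0])    # number of treatments
    grand = sum(sum(row) for row in table) / (n * k)
    treat_means = [sum(row[j] for row in table) / n for j in range(k)]
    block_means = [sum(row) / k for row in table]
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_treat = n * sum((m - grand) ** 2 for m in treat_means)
    ss_block = k * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block
    ms_treat = ss_treat / (k - 1)
    ms_block = ss_block / (n - 1)
    ms_error = ss_error / ((k - 1) * (n - 1))
    return ms_treat / ms_error, ms_block / ms_error

# Hypothetical data: 3 blocks (rows) by 3 treatments (columns).
table = [[1, 3, 5], [3, 4, 5], [2, 5, 8]]
f_treatments, f_blocks = randomized_block_f(table)
print(f_treatments, f_blocks)
```

The second value is the optional test of the blocks; the treatment F is the one the decision rule is normally applied to.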
flaws w/ traditional approach
Not possible to hold all other variables constant. No way to account for joint variation (i.e., interactions) No way to account for experimental error
step 5: make a decision
Only one of two statements can be made. The Ho IS NOT true The Ho may be true. Since Zcomputed < Zcritical , the QC inspector can reject the Ho. The claim that the average tensile strength of the bar-stock = 10,000 psi IS NOT true.
If the larger of the two samples has < or equal to 20 observations, the small sample approach is followed (mann whitney)
Otherwise, the samples are considered large. with parametric procedures large samples are generally > or equal to 30.
two commonly used ANOVA procedures used to conduct one-factor designed experiments
Parametric one-way ANOVA Non-parametric Kruskal-Wallis test
commonly used two-factor ANOVA procedures used to conduct two-factor designed experiments:
Parametric randomized complete block design (2way ANOVA) Non-parametric Friedman two-way ANOVA by ranks Full factorials (multiple readings possible)
If blocking (i.e., background) variables are considered, and they may at discretion be tested, then a two-factor experiment is conducted.Three approaches available:
Parametric two-way ANOVA- strict assumptions, tests blocks. Non-parametric Friedman two-way ANOVA by ranks-only tests treatments, not blocks. Full factorial method
some of the causes of variation "the big 5"
Poor management Poor product/process specifications Inadequate quality tools and systems Poor manufacturing practices Poor supplier materials
primary input variables and their percentages
Red x- 40-50% of green y Pink x- 20-30% Pale pink x- about 10-15%
Two concepts of particular interest: replication and randomization
Replication: collection of more than one observation for the same set of experimental conditions. Randomization: to reduce bias, variables not specifically controlled as factors should be randomized
randomization (full factorials)
Set patterns introduce biases into the experiment. There could be: a. A shift over time b. A trend over time c. A cyclical pattern over time Experiments must be run in random order
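One simple way to get a random run order is to shuffle the list of treatments before the experiment starts. This is only a sketch; the function name, the seed, and the three two-level factors are assumptions.

```python
import random
from itertools import product

def randomized_run_order(treatments, seed=None):
    """Return the treatments in a random order (seed only for repeatability)."""
    rng = random.Random(seed)
    order = list(treatments)
    rng.shuffle(order)
    return order

# Hypothetical 2^3 full factorial: three factors at two levels each.
treatments = list(product(["low", "high"], repeat=3))
run_order = randomized_run_order(treatments, seed=42)
for run_number, combo in enumerate(run_order, start=1):
    print(run_number, combo)
```

Every treatment still appears exactly once; only the sequence changes, so a shift, trend, or cycle over time is not lined up with any one factor.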
friedman step 1: null/alternative
Stated in basically the same manner as the one-way ANOVA. Ho: The treatments are more-or-less equal. Ha: At least one treatment is different. Hypotheses about block means ARE NOT tested.
In most cases, if the data is interval or ratio level and comparing two independent population means that come from approximately normal distributions and the sample sizes are less than 30, use _________
Student's t-distribution; parametric null hypothesis would be, for example, H0: u1 = u2 .
wilcoxon step 3: the test statistic
T, not Student's t
KW step 5: compute H and reach decision
The data is ranked from low to high as a combined set
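The combined ranking and the H statistic can be sketched as follows. The notes do not give the formula; this assumes the standard form H = 12/[N(N + 1)] x Sum(Ri^2/ni) - 3(N + 1), with no correction for ties. The three samples are hypothetical.

```python
def average_ranks(values):
    """Rank values from low to high; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, K - 1); the data are ranked as one combined set."""
    combined = [x for g in groups for x in g]
    ranks = average_ranks(combined)
    n_total = len(combined)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, len(groups) - 1

# Hypothetical readings for three treatments, five observations each.
groups = [[11, 14, 12, 18, 13], [15, 16, 17, 19, 22], [20, 21, 23, 24, 25]]
h, df = kruskal_wallis_h(groups)
print(round(h, 2), df)
```

With K - 1 = 2 degrees of freedom and a = 0.05, chi squared-critical is 5.991, so H-computed for this made-up data exceeds it and H0 would be rejected.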
randomized block design step 1: null/alternative
The hypotheses statements for the treatments are made in the same manner as done in a one-way ANOVA (Mean of method a= mean of method b= mean of method c Alternative- at least one is different) Testing hypotheses about block means is optional
Limitations of all types of fractional factorials
The lower the % of treatment tests, the more contaminated/confounded the results, meaning marginal or plain wrong results. Principle: if the basic construction of the experiment is flawed, the results will also be flawed
step 2: select level of significance
The producer's risk (a), A-priori. Type-I error or the probability of rejecting a true Ho. probability of Type-I error low bc evidence designed to prove Ho false. Typically a = 0.05.
power curves
The set of probabilities in column 3 is the "power function" of the test. The higher the probabilities, the greater the discriminatory power of the test. Merely a graphic representation of the probability of rejecting a submitted lot or Ho for alternative values of u.
Response variable
The variable being investigated, also called the dependent variable or Green Y.
assumptions for parametric one way ANOVA
Treatments are normally distributed Treatment standard deviations are equal Samples are independent Data is at least interval level
Like the Kruskal-Wallis test, Friedman test, and Mann-Whitney U test (the populations are independent), the _______ test is a non-parametric procedure to test dependent populations
Wilcoxon
most commonly used test statistics
Z (standard normal), Student's-t, chi squared, F
The Mann-Whitney test of significance
a non-parametric procedure used to test relationship between two indep populations (i.e., they differ or one is better than the other). non-parametric null hypothesis would be, H0: There is no significant difference between the two populations.
*Attribute scales (e.g., Bo Derek): Provides
a rating that can be used to solve the defect problem.
When experiments are conducted which involve two factors, and it is not possible to obtain repeat readings for a given set of experimental conditions, _____may be used.
a two-way ANOVA
Has the Green Y been defined and quantified in terms of:
a. Defect levels or field failure levels? b. Cost, safety, or environmental impact? c. Time (i.e., Longevity)?
Three schools of thought on DOEs:
a. Sir Ronald Fisher (1890 - 1962): full factorials, & George Box (1919 - 2013): fractional, tried to simplify Fisher b. Genichi Taguchi (1924 - 2012): orthogonal arrays, simplified Box, fractional c. Dorian Shainin (1914 - 2000): non-parametric, let process parts do the talking, 10 different techniques, separates main effects from interactions
manufacturing programs should be utilized to determine:
a. Whether SOPs are necessary? b. If necessary, are SOPs too difficult or too bureaucratic for employees to follow? c. Are the SOPs written in simple terms? d. Is employee input encouraged? e. Do employees follow SOPs only when it is necessary or when evaluated? f. Is positive and visual control enforced?
full factorial experiments
all possible treatment combinations formed from 2+ factors, each being studied at two levels, are examined so that interactions (differential effects) as well as main effects can be estimated
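For the smallest case, two factors at two levels, main effects and the interaction can be estimated as differences of averages. The sketch below assumes levels coded as -1 and +1 and one response per cell; the factor labels and responses are invented.

```python
def effects_2x2(responses):
    """responses maps (a, b) coded levels (-1 or +1) to the observed response."""
    def contrast(sign):
        high = [y for (a, b), y in responses.items() if sign(a, b) > 0]
        low = [y for (a, b), y in responses.items() if sign(a, b) < 0]
        return sum(high) / len(high) - sum(low) / len(low)

    return {
        "A": contrast(lambda a, b: a),        # main effect of factor A
        "B": contrast(lambda a, b: b),        # main effect of factor B
        "AB": contrast(lambda a, b: a * b),   # interaction effect
    }

# Hypothetical responses for the four treatment combinations.
responses = {(-1, -1): 20.0, (1, -1): 30.0, (-1, 1): 25.0, (1, 1): 55.0}
effects = effects_2x2(responses)
print(effects)
```

A non-zero AB value means the effect of A changes with the level of B, which is the interaction the full factorial is able to separate from the main effects.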
type 2 error
b (beta) is the consumer's risk: accepting a false Ho thinking it is true. The P(b) is relatively high because the test is not designed to prove the Ho true, only false. AFTER the fact
principles of full factorials
balance, replication, randomization,
most common method of handling background variables
blocking
want to determine whether sample data are compatible with the hypothesis that they were drawn from a population that follows some specified functional form
chi squared goodness-of-fit test.
central weakness of fractional factorial and Taguchi approaches, is the
chronic inability to separate the main effects from the interaction effects
properties of chi squared
The computed value of chi squared is always positive because the difference between what is observed (fo) and what is expected (fe) is squared. There is not one but a family of chi squared distributions, one for each degree of freedom. Degrees of freedom is K - 1, where K is the number of cells, categories, intervals, or classes. The chi squared distribution is positively skewed (skewed to the right); the risk is always on the right tail
If you cannot identify and isolate the key variables as part of a problem-solving system then ______ cannot be achieved.
continuous improvement
kaizen
Continuous improvement: implies everyone will be involved, at little expense, to consistently improve the situation at work and in life. A philosophy of continuous improvement; the belief that all aspects of life should be constantly improved. Like Six Sigma, it does not concern itself with creating new processes
primary variables
controllable variables believed most likely to have an effect (i.e., the vital few)
small sample mann u test step 4: the decision rule
Critical values for the statistic U are given in the table of Critical Values of U in the Mann-Whitney test. Ho is rejected if the computed U is < the critical value
step 4: formulate the decision rule
defines the conditions when you can reject the Ho. The critical value of the test statistic is based on a and if the test is two-tailed or one-tailed. If the Ha is a "not equal to," the test is two-tailed. If Ha is < or >, the test is one-tailed and the risk (burden of proof) is in direction of the sign
Even if only 2 or 3 process variables remain under consideration, determining which one of them—or their interactions—are the cause of process problems can present real difficulties. In these cases, what provides most direct solution.
designed experiments
experimental error
due to those variables that are not considered explicitly and are analogous to trivial many common causes of variation. a. Represent the "noise level" b. "unknown and unknowable."
randomized complete block design
Each treatment must be applied to each different kind of experimental material, called a block. Treatment differences are found using Tukey. Called two-way analysis of variance since we classify each observation according to two criteria: the treatment and the block. 1. K = # of treatments 2. n = # of blocks in a given treatment. The assumptions are the same as the parametric one-way ANOVA
designed experiment
experiment where one or more factors, called independent variables, believed to have an effect on the experimental outcome, are identified and manipulated according to a predetermined plan
Full factorial designed experiments developed by _____ and fractional factorials by______ both of Great Britain.
fisher (1890-1962) george box (1919-2013)
Kaizen does NOT cover radical (kaikaku) innovations (not black belt)
fundamental idea behind kaizen comes from the Shewhart/Deming PDSA cycle. "Japanese" approach-roots well-established in scientific method
traditional approach to problem solving
hold all factors constant except one.
important fact about a particular hypothesis test has to do with
how well the test controls Type-II errors
In the Six Sigma ______ Stage, designed experiments play an important role in quality improvements
improve
replication (full factorials) -the purpose of replication (i.e., repeating)
in each combination or cell is to determine the variation or inconsistency within each cell
Type-II error (b)
is the probability of accepting a false null hypothesis.
Type-I error (a)
is the risk of rejecting a true null hypothesis.
why managers cause lack of variation reduction
lack of knowledge/leadership
Operating Characteristic (OC) and Power Curves give information on
likelihood of accepting/rejecting a false Ho.
Inadequate quality tools and systems
majority of managers/engineers totally unaware of the existence of designed experiment methodologies.
OC curve
merely a convenient way of showing graphically the probability of accepting a submitted lot or the Ho for alternative values of u
Full factorials should be limited to
no more than four factors or excessive complexity creeps in design at exponential rate
Wilcoxon Signed Rank Test of Differences.
Non-parametric procedure by Frank Wilcoxon (1945). The differences must be at least ordinal level so that they can be ranked. If the assumptions to use Z or t can't be satisfied and/or you need results quickly, then non-parametric procedures should be seriously considered
chi squared goodness of fit test
non-parametric procedure. NOT testing a parameter but a statement, e.g., "HO: The population is normally distributed." compares what we observed with what we expected based on our theory of how the data is distributed.
what assumption underlies many of the hypothesis testing procedures done in practice
normally distributed population
block design step 4:decision rule
same basic rule applies to treatments as well as blocks. *If F-computed > F-critical, reject the H0. One-tail test with the risk on the right
Full factorials are used for _______studies.
optimization
downside of non parametric
Parametric procedures do a better job of controlling the p-value, i.e., the probability of error
when is the mann-whitney u test used
populations are not normally distributed or data are not at least interval scaled
Before start of a designed experiment, important to describe, define, and quantify the _____
problem
Kaizen approaches the operation using _____thinking (i.e., how do we do this) rather than _____thinking (i.e., what are we trying to do).
process; functional
A good experimental plan depends on
purpose of the experiment (should be to ID red x) Physical restrictions on process of taking measurements. Management's intent
attribute numerical rating scale
Reflects overall appearance/performance of like products rather than individual defects of the product. Intended for problem-solving, not for determining if the product is or is not fit-for-use. Attribute: either you have it or you don't (counted, % defective, etc.); not sensitive enough. To discover the Red X, set up an artificial (Bo Derek, Likert) ordinal scale. Use designed experiments whenever you want to figure out what variables affect the output
the green y
represents the magnitude of the problem (output/response variable) that must be solved
small sample mann u test step 5: make a decision
Scores are ranked from lowest to highest or vice versa. Ties are given averaged ranks. Two statistics are computed, U and U': U = n1n2 + [n1(n1 + 1)]/2 - SumR1 and U' = n1n2 + [n2(n2 + 1)]/2 - SumR2. The smaller computed value is used to arrive at a decision to either accept or reject the H0. If H0 is rejected, it means that the majority of low ranks were found in one group and the majority of high ranks were found in the other group.
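The small-sample calculation can be sketched like this: rank the combined scores, sum the ranks per group, then apply the two formulas. The helper names and aptitude scores are hypothetical.

```python
def average_ranks(values):
    """Rank values from low to high; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(sample1, sample2):
    """Return (U, U') computed from the combined ranking."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    sum_r1 = sum(ranks[:n1])
    sum_r2 = sum(ranks[n1:])
    u = n1 * n2 + n1 * (n1 + 1) / 2 - sum_r1
    u_prime = n1 * n2 + n2 * (n2 + 1) / 2 - sum_r2
    return u, u_prime

# Hypothetical aptitude scores (note the tie at 82).
business = [70, 82, 75, 90]
engineers = [85, 82, 95, 99, 88]
u, u_prime = mann_whitney_u(business, engineers)
print(u, u_prime, min(u, u_prime))
```

U and U' always add up to n1n2, so only the smaller one needs to be compared with the tabled critical value.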
A major cause of poor product/process specifications lies in the difference between selling and marketing
selling: Management/engineers determine product requirements in isolation then push product onto customers. Marketing: company first makes a painstaking effort to explore customer wants and then designs products to fit those needs.
who created and using only the full factorial method, improved the productivity of the British farm and was knighted for his great contribution?
sir ronald fisher
One of the most powerful tools for a Black Belt
statistically designed experiments
When do you use designed experiments?
When studying a process, to learn how variables affect output, to determine which variables are important and which are not, and to change the process average and/or reduce process variation
chi squared computed
sum[(fe - fo)^2/fe]
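A tiny sketch of the computed chi squared. Because the difference is squared, (fe - fo) and (fo - fe) give the same result. The observed and expected counts are made up.

```python
def chi_squared_computed(observed, expected):
    """Sum of (fo - fe)^2 / fe over all cells."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Hypothetical counts in 5 categories, with a uniform expectation of 20 each.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]
chi2 = chi_squared_computed(observed, expected)
df = len(observed) - 1  # K - 1
print(chi2, df)
```

With K - 1 = 4 degrees of freedom and a = 0.05 the tabled critical value is 9.488, so this made-up sample would not reject the hypothesized distribution.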
The main duty for all employees, not just managers and engineers, is
systematic reduction of variation.
type 1 error
the probability of rejecting a true Ho. The probability of a Type-I error is relatively low because the evidence is designed or intended to prove the Ho false.
Use the Mann-Whitney U test to determine if ...
there's a statistically significant difference between two independent samples from the same population when: a. the population is not normally distributed b. there is doubt that the measurement scale is interval level. The test statistic is U for small samples and Z for large samples
purpose of wilcoxon
to find out if there is any difference between two sets of paired (related) observations
use one way parametric ANOVA (F is the test statistic) if..
treatments are normally distributed, treatment standard deviations (sigmas) are equal, samples are independent, data is at least interval level
1 tailed vs 2 tailed
two-tailed test: analyst believes mean is significantly different from 10,000 1 tailed- the Ha may be written as < or > indicating the analyst believes the test statistic is significantly different in a particular direction. This is one-tailed test.
designed experiments (DOE)
used to determine the important factors (i.e., the Red X, Pink X, and Pale Pink X) to compare a variety of options or to find the optimal setting for the important factors
background variables
variables identified by the designers which may have an effect but cannot/should not be manipulated or controlled
statistically designed experiments
varying 2+ variables simultaneously and obtaining multiple measurements under the same experimental conditions
when is randomized complete block design appropriate:
when one can cross-classify the experimental units according to two criteria or can apply treatments to different kinds of experimental material
crucial part of problem-solving process lies in choosing ______
which characteristics should be measured and evaluated
Effective use of a numerical rating scale can create a consistent quantification system
will always be a certain amount of subjectivity. no such thing as perfect data. Each attribute rating system should be backed up by prominently displayed samples of each defect level
If you assume the population under study follows some prescribed theoretical distribution without doing something to shore up your assumption...
you could find yourself "on very thin ice" and could draw an erroneous conclusion
The beauty of full factorials
you have balance, replication, and randomness that enable a complete quantification of the factors plus their interactions