Misuse of Statistics
unjustified, incorrect
"Using numbers such that, either by intent or through ignorance or carelessness, the conclusions are ____________ or ___________."
misuse
A _________ of statistics occurs when a statistical argument asserts a falsehood.
Pseudoreplication
A technical error associated with analysis of variance (ANOVA). Complexity hides the fact that statistical analysis is being attempted on a single sample (N=1). For this degenerate case the variance cannot be calculated (division by zero). What is this called? _________________.
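The degenerate case described above can be sketched in Python. This is a minimal illustration, not a full analysis of variance: the helper names `sample_variance` and `variance_or_none` are hypothetical, and the measurements are made up.

```python
def sample_variance(values):
    """Unbiased sample variance: summed squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    # With a single observation, n - 1 is 0 and this division fails.
    return squared_deviations / (n - 1)


def variance_or_none(values):
    """Return the sample variance, or None when it cannot be computed."""
    try:
        return sample_variance(values)
    except ZeroDivisionError:
        return None


# Hypothetical data: three independent samples versus one lone sample.
three_samples = [1.0, 2.0, 3.0]
one_sample = [4.2]

print(variance_or_none(three_samples))  # a number: spread can be estimated
print(variance_or_none(one_sample))     # None: N=1 gives no estimate of spread
```

Repeated measurements taken from that one sample do not change the count of independent samples, which is why the underlying N=1 is easy to overlook.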
gambler's fallacy
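Assumes that the likelihood computed for a future event still applies once part of that event has already occurred. For example, after nine heads in a row, the chance that a fair coin lands heads on the tenth toss is still 50%, not the 1-in-1,024 chance of ten straight heads computed before the first toss. What is this called? __________ ______________.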
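The difference between the two likelihoods can be checked by brute-force enumeration. The sketch below assumes a fair coin with independent tosses; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_all_heads(n_tosses):
    """Probability, before any toss, that every one of n tosses is heads."""
    outcomes = list(product("HT", repeat=n_tosses))
    all_heads = [o for o in outcomes if all(face == "H" for face in o)]
    return Fraction(len(all_heads), len(outcomes))


def prob_next_head_given_streak(streak):
    """Probability of heads on the next toss, given `streak` heads so far."""
    outcomes = list(product("HT", repeat=streak + 1))
    # Keep only the sequences in which the streak has already happened.
    with_streak = [o for o in outcomes if all(face == "H" for face in o[:streak])]
    next_is_head = [o for o in with_streak if o[-1] == "H"]
    return Fraction(len(next_is_head), len(with_streak))


print(prob_all_heads(10))              # likelihood measured before tossing
print(prob_next_head_given_streak(9))  # likelihood once nine heads occurred
```

The first figure describes the whole ten-toss sequence; the second describes only the toss that is still in the future.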
prosecutor's fallacy
Has led, in the UK, to the false imprisonment of women for murder when the courts were given the prior statistical likelihood of a woman's 3 children dying from sudden infant death syndrome (SIDS) as being the chance that their already dead children died from the syndrome. The courts then handed down convictions in spite of the statistical inevitability that a few women would suffer this tragedy. Meadow's calculations were irrelevant to these cases, but even if they had been relevant, the same methods of calculation would have shown that the probability of two cases of infanticide was even smaller (one in billions). What is this called? __________ ______________.
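The reasoning can be sketched with Bayes' rule. Every number below is hypothetical and chosen only to show the structure of the argument; none comes from the actual court cases, and the function name is an assumption.

```python
from fractions import Fraction


def posterior_innocence(p_deaths_if_innocent, p_deaths_if_guilty, prior_guilty):
    """Bayes' rule: P(innocent | the deaths were observed)."""
    prior_innocent = 1 - prior_guilty
    innocent_path = p_deaths_if_innocent * prior_innocent
    guilty_path = p_deaths_if_guilty * prior_guilty
    return innocent_path / (innocent_path + guilty_path)


# Hypothetical inputs, for illustration only.
p_two_natural_deaths = Fraction(1, 1_000_000)  # P(two deaths | innocent)
p_deaths_if_double_murder = Fraction(1)        # P(two deaths | guilty)
prior_double_murder = Fraction(1, 10_000_000)  # P(guilty) before any evidence

posterior = posterior_innocence(
    p_two_natural_deaths, p_deaths_if_double_murder, prior_double_murder
)

print(float(p_two_natural_deaths))  # the figure mistaken for P(innocent)
print(float(posterior))             # P(innocent | two deaths) under these inputs
```

Under these made-up inputs the rare natural explanation is still far more probable than the even rarer criminal one, which is the comparison the fallacy leaves out.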
false causality
If the number of people buying ice cream at the beach is statistically related to the number of people who drown at the beach, nobody would claim that ice cream causes drowning, because it obviously does not. (In this case, both drowning and ice cream buying are related through a third factor: the number of people at the beach.) In well-designed studies, the effect of false causality can be eliminated by randomly assigning people to a "treatment group" or a "control group" and giving the treatment only to the treatment group. What is this called? _______ ____________.
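The role of the third factor can be shown with a small constructed data set. The beach days below are invented so that ice cream sales and drownings each depend on crowd size but not on each other; `pearson` is a plain implementation of the Pearson correlation coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up beach days: (crowd size, ice creams sold, drownings).
# Both counts scale with the crowd; their day-to-day wobbles are unrelated.
days = []
for crowd in (100, 500, 900):
    for ice_wobble in (-1, 1):
        for drown_wobble in (-1, 1):
            ice_creams = 3 * crowd // 10 + 5 * ice_wobble
            drownings = crowd // 100 + drown_wobble
            days.append((crowd, ice_creams, drownings))

# Correlation across all days, ignoring crowd size.
overall = pearson([d[1] for d in days], [d[2] for d in days])

# Correlation among days that share the same crowd size.
within_crowd = {}
for crowd in (100, 500, 900):
    same = [d for d in days if d[0] == crowd]
    within_crowd[crowd] = pearson([d[1] for d in same], [d[2] for d in same])

print(overall)
print(within_crowd)
```

Across all days the two counts are strongly correlated, yet among days with the same crowd size the correlation vanishes. Random assignment in an experiment serves the same purpose: it balances such third factors between the groups.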
Proof Null Hypothesis
In a statistical test, the null hypothesis (H_0) is considered valid until enough data proves it wrong. But if the data do not give us enough proof to reject H_0, this does not automatically prove that H_0 is correct. By judicial analogy, this can be compared with the truly guilty defendant who is released just because the proof is not enough for a guilty verdict. This does not prove the defendant's innocence, but only that there is not proof enough for a guilty verdict. What is this called? _______ of the _____ ____________.
Data dredging
In data dredging, large compilations of data are examined in order to find a correlation, without any pre-defined choice of a hypothesis to be tested. Note that data dredging is a valid way of finding a possible hypothesis but that hypothesis must then be tested with data not used in the original dredging. The misuse comes in when that hypothesis is stated as fact without further validation. What is this called? ______ _____________.
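One way to see why a dredged correlation needs fresh data is to count how often chance alone produces a "finding". The sketch below assumes independent tests, no real effects, and a conventional significance threshold of 0.05; the function name is hypothetical.

```python
def chance_of_spurious_hit(num_tests, alpha=0.05):
    """P(at least one false positive) over independent tests of true nulls."""
    # Each test stays quiet with probability 1 - alpha; all must stay quiet.
    return 1 - (1 - alpha) ** num_tests


for num_tests in (1, 20, 100):
    print(num_tests, round(chance_of_spurious_hit(num_tests), 3))
```

Under these assumptions, checking 20 unrelated correlations yields at least one "significant" result roughly 64% of the time, so a hypothesis found this way should be retested on data that played no part in the search.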
statistical fallacy
In some cases, the misuse of statistics may be accidental, while in other cases it is purposeful and for the gain of the perpetrator. When the statistical reasoning involved is false or misapplied, this constitutes a ____________ __________.
Data manipulation
Informally called "fudging the data," this practice includes selective reporting (see also publication bias) and even simply making up false data. The easiest and most common examples involve choosing a group of results that follow a pattern consistent with the preferred hypothesis while ignoring other results or "data runs" that contradict the hypothesis. What is this called? ______ _____________.
Misreporting of estimated error
Many people may not realize that the randomness of the sample is very important: non-random sampling makes the estimated error unreliable. Conversely, people may think that it is impossible to learn the opinion of tens of millions of people by polling just a few thousand. This is also inaccurate. What is this called? ____________ __ __________ __________. Also known as misunderstanding of estimated error.
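The second point can be checked with the standard margin-of-error formula for a proportion. The sketch assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion of 0.5; the population and sample sizes are made up.

```python
from math import sqrt


def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Margin of error for a proportion estimated from a simple random sample."""
    standard_error = sqrt(proportion * (1 - proportion) / sample_size)
    # The finite population correction shrinks the error only when the
    # sample is a sizeable fraction of the population.
    finite_population_correction = sqrt(
        (population_size - sample_size) / (population_size - 1)
    )
    return z * standard_error * finite_population_correction


# Hypothetical polls of 2,000 people in populations of very different sizes.
small_country = margin_of_error(2_000, 1_000_000)
large_country = margin_of_error(2_000, 50_000_000)

print(round(small_country, 4))
print(round(large_country, 4))
```

Both polls come out at about plus or minus two percentage points: the error depends almost entirely on the sample size, not on the population size. The formula says nothing about a non-random sample, which is why the estimated error is unreliable in that case.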
Discarding unfavorable data
Organizations that do not publish every study they carry out, such as tobacco companies denying a link between smoking and cancer, anti-smoking advocacy groups and media outlets trying to prove a link between smoking and various ailments, or miracle pill vendors, are likely to use this tactic. What is this called? __________ __________ ______. Another term related to this concept is cherry picking.
types, misuse
Other _______ of _________: Other misuses include comparing apples and oranges, using the wrong average, regression toward the mean, and the umbrella phrase garbage in, garbage out. Some statistics are simply irrelevant to an issue.
Overgeneralization
Overgeneralization is a fallacy occurring when a statistic about a particular population is asserted to hold among members of a group for which the original population is not a representative sample. For example, suppose 100% of apples are observed to be red in summer. The assertion "All apples are red" would be an instance of overgeneralization because the original statistic was true only of a specific subset of apples (those in summer), which is not expected to be representative of the population of apples as a whole. What is this called? _________________.
Ludic fallacy
Probabilities are based on simple models that ignore real (if remote) possibilities. Poker players do not consider that an opponent may draw a gun rather than a card. What is this called? _______ ____________.
Biased samples
Scientists have learned at great cost that gathering good experimental data for statistical analysis is difficult. Calling only people who have home phones will exclude younger people who use only cell phones. As a result, the poll sample will be biased because it is not representative of the entire population. What is this called? ________ ___________.
practical significance
Statistical significance is a measure of probability; practical significance is a measure of effect. A baldness cure is statistically significant if a sparse peach-fuzz usually covers the previously naked scalp. The cure is practically significant when a hat is no longer required in cold weather and the barber asks how much to take off the top. The bald want a cure that is both statistically and practically significant: it will probably work, and if it does, it will have a big hairy effect. Scientific publication often requires only statistical significance. This has led to complaints (for the last 50 years) that statistical significance testing is a misuse of statistics. What is this called? Confusing statistical significance with __________ _______________.
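The two kinds of significance can be separated numerically. The sketch below uses a one-sample z-test with a known standard deviation and Cohen's d as the effect measure; the study figures are invented, and the function names are hypothetical.

```python
from math import erfc, sqrt


def two_sided_p_value(mean_difference, std_dev, sample_size):
    """Two-sided p-value of a one-sample z-test against a difference of zero."""
    z = mean_difference / (std_dev / sqrt(sample_size))
    return erfc(abs(z) / sqrt(2))


def effect_size(mean_difference, std_dev):
    """Cohen's d: the difference expressed in standard-deviation units."""
    return mean_difference / std_dev


# Hypothetical study: a tiny average improvement measured on a huge sample.
p_value = two_sided_p_value(mean_difference=0.01, std_dev=1.0, sample_size=1_000_000)
d = effect_size(mean_difference=0.01, std_dev=1.0)

print(p_value)  # probability measure: is the effect distinguishable from chance?
print(d)        # effect measure: is the effect big enough to matter?
```

With a million subjects the p-value is far below any usual threshold, yet the effect is a hundredth of a standard deviation, well under the 0.2 that is conventionally called a small effect. The result is statistically significant but not practically significant.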
Loaded questions
The answers to surveys can often be manipulated by wording the question in a way that steers the respondent toward a certain answer. Another way to do this is to precede the question with information that supports the "desired" answer. For example, more people will likely answer "yes" to the question ... What is this called? ________ __________.