Statistics Chapter 4

Sample

A sample is the part of the population from which we actually collect information. We use information from a sample to draw conclusions about the entire population.

Convenience Sample

A sample selected by taking the members of the population that are easiest to reach is called a convenience sample. Convenience samples often produce unrepresentative data.

Sample Survey

A sample survey is a study that uses an organized plan to choose a sample that represents some specific population. The first step in planning a sample survey is to say exactly what population we want to describe. The second step is to say exactly what we want to measure, that is, to give exact definitions of our variables.

Simple Random Sample (SRS)

A simple random sample of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected. An SRS not only gives each individual an equal chance to be chosen but also gives every possible sample an equal chance to be chosen.
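
As a rough illustration of this definition, the sketch below draws an SRS with Python's standard `random.sample`, which picks n distinct individuals so that every set of n is equally likely. The roster of 30 students and the helper name `simple_random_sample` are made up for the example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return an SRS of size n: every set of n individuals is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population: a class roster of 30 students.
students = [f"student_{i:02d}" for i in range(1, 31)]
srs = simple_random_sample(students, 5, seed=1)
print(srs)
```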

Treatment

A specific condition applied to the individuals in an experiment is called a treatment. If an experiment has several explanatory variables, a treatment is a combination of specific values of these variables.

Response Bias

A systematic pattern of incorrect responses in a sample survey leads to response bias. Careful training of interviewers and careful supervision to avoid variation among the interviewers can reduce response bias. Good interviewing technique is another aspect of a well-done sample survey.

Making Inferences

-If the individuals WERE randomly selected and the individuals WERE randomly assigned to groups, inference about the population and inference about cause and effect CAN both be made.
-If the individuals WERE NOT randomly selected and the individuals WERE randomly assigned to groups, inference about the population CANNOT be made but inference about cause and effect CAN be made.
-If the individuals WERE randomly selected and the individuals WERE NOT randomly assigned to groups, inference about the population CAN be made but inference about cause and effect CANNOT be made.
-If the individuals WERE NOT randomly selected and the individuals WERE NOT randomly assigned to groups, inference about the population CANNOT be made and inference about cause and effect CANNOT be made.
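
The four cases above reduce to two independent yes/no checks, which the following minimal sketch encodes. The function name `scope_of_inference` and the dictionary keys are invented for illustration.

```python
def scope_of_inference(randomly_selected, randomly_assigned):
    """Return which conclusions the study design supports."""
    return {
        # Random selection justifies generalizing to the population.
        "inference_about_population": randomly_selected,
        # Random assignment justifies cause-and-effect conclusions.
        "inference_about_cause_and_effect": randomly_assigned,
    }

# Example: volunteers (not randomly selected) who are randomly assigned to groups.
print(scope_of_inference(randomly_selected=False, randomly_assigned=True))
```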

Criteria for Establishing Causation

-The association is strong.
-The association is consistent.
-Larger values of the explanatory variable are associated with stronger responses.
-The alleged cause precedes the effect in time.
-The alleged cause is plausible.
In the absence of an experiment, good evidence of causation requires a strong association that appears consistently in many studies, a clear explanation for the alleged causal link, and careful examination of lurking variables.

Three Principles of Experimental Design

1. Control for lurking variables that might affect the response: Use a comparative design and ensure that the only systematic difference between the groups is the treatment administered.
2. Random assignment: Use impersonal chance to assign experimental units to treatments. This helps create roughly equivalent groups of experimental units by balancing the effects of uncontrolled lurking variables across the treatment groups.
3. Replication: Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups.

Block and Randomized Block Design

A block is a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments. In a randomized block design, the random assignment of experimental units to treatments is carried out separately within each block. Blocks are another form of control. They control the effects of some outside variables by bringing those variables into the experiment to form the blocks. A randomized block design allows us to draw separate conclusions about each block. Control what you can, block on what you can't control, and randomize to create comparable groups.
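
A minimal sketch of a randomized block design, assuming the blocks are already known: units are shuffled and dealt to treatments separately inside each block. The blocks ("men", "women"), the unit names, and the helper `randomized_block_assignment` are hypothetical.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Randomly assign units to treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # chance decides the order within this block only
        # Deal the shuffled units to the treatments in turn.
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks formed by a variable expected to affect the response.
blocks = {
    "men": ["m1", "m2", "m3", "m4"],
    "women": ["w1", "w2", "w3", "w4"],
}
plan = randomized_block_assignment(blocks, ["drug", "placebo"], seed=3)
for unit, (block, treatment) in sorted(plan.items()):
    print(unit, block, treatment)
```

Because the shuffle happens inside each block, every block ends up with the same mix of treatments, so the blocking variable cannot be confounded with the treatment.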

Matched Pairs Design

A common type of randomized block design for comparing two treatments is a matched pairs design. The idea is to create blocks by matching pairs of similar experimental units. Then we can use chance to decide which member of a pair gets the first treatment. The other subject in that pair gets the other treatment. That is, the random assignment of subjects to treatment is done within each matched pair. Just as with other forms of blocking, matching helps reduce the effect of variation among the experimental units.
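
One way to picture the random assignment within each matched pair is a coin flip per pair, sketched below. The pairs and the treatment labels "A" and "B" are made up, and `matched_pairs_assignment` is a hypothetical helper.

```python
import random

def matched_pairs_assignment(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, use chance to decide who gets which treatment."""
    rng = random.Random(seed)
    plan = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # the "coin flip" for this pair
        plan[first], plan[second] = order
    return plan

# Hypothetical pairs of similar subjects (for example, matched on age).
pairs = [("s1", "s2"), ("s3", "s4"), ("s5", "s6")]
plan = matched_pairs_assignment(pairs, seed=7)
print(plan)
```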

Lurking Variable

A lurking variable, or confounding variable, is a variable that is not among the explanatory or response variables in a study but that may influence the response variable.

Rigorously Controlled Design

A rigorously controlled design carefully assigns subjects to different treatment groups so that those that get a specific treatment are similar in an important way.

Table of Random Digits

A table of random digits is a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties:
-Each entry in the table is equally likely to be any of the 10 digits 0 through 9.
-The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part.
How to choose an SRS using the table:
-Step 1: Label. Give each member of the population a numerical label of the same length.
-Step 2: Table. Read consecutive groups of digits of the appropriate length from the table, ignoring any group that is not a label or that repeats a label already chosen. Your sample contains the individuals whose labels you find.
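
The label-then-read procedure can be sketched in a few lines. Everything concrete here is an assumption for illustration: a population of 28 hotels labeled 01 to 28, a made-up line of digits, and the helper name `srs_from_digit_table`.

```python
def srs_from_digit_table(digits, labels, n):
    """Pick n individuals by reading fixed-width groups from a digit string."""
    width = len(next(iter(labels)))  # all labels share one length
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        group = digits[start:start + width]
        # Skip groups that are not labels or that repeat an earlier pick.
        if group in labels and group not in chosen:
            chosen.append(group)
        if len(chosen) == n:
            break
    return [labels[g] for g in chosen]

# Step 1: Label. Hypothetical population of 28 hotels labeled 01..28.
labels = {f"{i:02d}": f"hotel_{i:02d}" for i in range(1, 29)}
# Step 2: Table. A made-up line of digits, read two at a time.
line = "6905164817871740951784534064898720197245"
picked = srs_from_digit_table(line, labels, 3)
print(picked)  # ['hotel_05', 'hotel_16', 'hotel_17']
```

Reading in pairs gives 69, 05, 16, 48, 17, and so on; 69 and 48 are skipped because they are not labels, so the first three usable labels are 05, 16, and 17.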

Voluntary Response Sample

A voluntary response sample consists of people who choose themselves by responding to a general appeal. Voluntary response samples show bias because people with strong opinions (often in the same direction) are most likely to respond.

Basic Data Ethics

All planned studies must be reviewed in advance by an institutional review board charged with protecting the safety and well-being of the subjects. The purpose of an institutional review board is not to decide whether a proposed study will produce valuable information or whether it is statistically sound. The board's purpose is to protect the rights and welfare of human subjects (including patients) recruited to participate in research activities. All individuals who are subjects in a study must give their informed consent before data are collected. Subjects must be informed in advance about the nature of a study and any risk of harm it may bring. Subjects must then consent in writing. All individual data must be kept confidential. Only statistical summaries for groups of subjects may be made public. Confidentiality is not the same as anonymity. Anonymity means that individuals are anonymous: their names are not known even to the director of the study. Even where anonymity is possible, it prevents any follow-up to reduce nonresponse or inform individuals of results. Randomized comparative experiments can answer questions that can't be answered without them. The interests of the subjects must always prevail over the interests of science and society.

Single-blind

An experiment can be single-blind if the individuals who are interacting with the subjects and measuring the response variable don't know which treatment the individuals received. In other single-blind experiments, subjects are unaware of which treatment they are receiving, but the people interacting with them and measuring their response variables do know.

Experiment

An experiment deliberately imposes some treatment on individuals to measure their responses. The purpose of an experiment is to determine whether the treatment causes a change in the response. When our goal is to understand cause and effect, experiments are the only source of fully convincing data.

Observational Study

An observational study observes individuals and measures variables of interest but does not attempt to influence the responses. Sample surveys are one kind of observational study. In a cross-sectional study, data are collected at one point in time. In a retrospective study, data are collected from the past (for example, from existing records). In a prospective study, data are collected going forward, as events unfold. The goal of an observational study can be to describe some group or situation, to compare groups, or to examine relationships between variables. An observational study is a poor way to gauge the effect that changes in one variable have on another variable.

Statistically Significant

An observed effect so large that it would rarely occur by chance is called statistically significant. A statistically significant association in data from a well-designed experiment does imply causation.
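
To see what "would rarely occur by chance" can mean in practice, the sketch below re-randomizes the group labels many times and counts how often chance alone produces a difference in means at least as large as the one observed. The response values are invented, and this re-randomization approach is one possible illustration, not a method stated in these notes.

```python
import random

def rerandomization_p_value(group_a, group_b, reps=2000, seed=0):
    """Estimate how often chance alone gives a difference in means this large."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # redo the random assignment
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            extreme += 1
    return extreme / reps

# Hypothetical responses from two treatment groups.
treatment = [12, 14, 15, 16, 18, 19]
control = [5, 6, 7, 8, 9, 10]
p_value = rerandomization_p_value(treatment, control)
print(p_value)
```

A small proportion means the observed difference would rarely arise from the random assignment alone.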

Confounding

Confounding occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other (ex: sugar and mass of soda can on buoyancy). With no association between the lurking variable and the explanatory variable, there can be no confounding (ex: sugar and caffeine of soda on buoyancy). Observational studies of the effect of one variable on another often fail because of confounding between the explanatory variable and one or more lurking variables. Outside the lab, badly designed experiments often yield worthless results because of confounding. Tidbit: Explain how the confounding variable is associated with the explanatory variable and affects the response variable.

Completely Randomized Design

In a completely randomized design, the treatments are assigned to all the experimental units completely by chance. Notice that the definition of a completely randomized design does not require that each treatment be assigned to an equal number of experimental units. Using chance to assign treatments in an experiment does not guarantee a completely randomized design when you force the groups to have an equal number of experimental units. Tidbit: You are expected to describe how the treatments are assigned to the experimental units and to clearly state what will be measured or compared.

Double-blind

In a double-blind experiment, neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received.

Systematic Sample

In a systematic sample, we begin at some starting point and choose every kth member to be part of the sample.
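
A minimal sketch of a systematic sample, assuming the starting point is picked at random from the first k members (the definition above only says "some starting point"). The population list and the helper `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Start at a random position among the first k, then take every kth member."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: a random start in 0..k-1
    return population[start::k]

# Hypothetical population: customers numbered 1 to 100 in arrival order.
customers = list(range(1, 101))
sample = systematic_sample(customers, k=10, seed=5)
print(sample)
```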

Random Assignment

In an experiment, random assignment means that experimental units are assigned to treatments at random, that is, using some sort of chance process.
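
One simple chance process is to shuffle the units and deal them to the treatments in turn, as in the sketch below. The 12 unit names and the helper `randomly_assign` are made up, and dealing in turn (which gives equal group sizes here) is just one choice of chance process.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experimental units.
plants = [f"plant_{i}" for i in range(1, 13)]
groups = randomly_assign(plants, ["fertilizer", "no_fertilizer"], seed=11)
print(groups)
```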

Comparative Experiment

Most well-designed experiments compare two or more treatments. Comparison alone isn't enough to produce results we can trust. If the treatments are given to groups that differ greatly when the experiment begins, bias will result.

Nonresponse

Nonresponse occurs when an individual chosen for the sample can't be contacted or refuses to participate. CAUTION: Some students misuse the term "voluntary response" to explain why certain individuals don't respond in a sample survey. Nonresponse can occur only after a sample has been selected. In a voluntary response sample, every individual has opted to take part, so there won't be any nonresponse.

Sampling v. Nonsampling Errors

Sampling errors come from the act of choosing a sample. Random sampling error and undercoverage are common types of sampling error. Nonsampling errors have nothing to do with choosing a sample. Examples include nonresponse, response bias, and the wording of questions.

Sampling Frame

Sampling often begins with a list of individuals from which we will draw our sample. This list is called the sampling frame. Ideally, the sampling frame should list every individual in the population.

Factors, Levels

Sometimes the explanatory variables in an experiment are called factors. Many experiments study the joint effects of several factors. In such an experiment, each treatment is formed by combining a specific value (often called a level) of each of the factors.
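
Forming treatments from factors and levels amounts to listing every combination, which the standard `itertools.product` does directly. The two factors and their levels below are invented for illustration.

```python
from itertools import product

# Hypothetical factors and their levels.
factors = {
    "temperature": ["low", "high"],
    "fertilizer": ["none", "organic", "synthetic"],
}

# Each treatment combines one level of every factor.
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for t in treatments:
    print(t)
print(len(treatments))  # 2 levels x 3 levels = 6 treatments
```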

Bias

The design of a statistical study shows bias if it systematically favors certain outcomes. In a voluntary response sample, people choose whether to respond. In a convenience sample, the interviewer makes the choice. In both cases, personal choice produces bias. Tidbit: When asked to identify bias, identify the direction of the bias (overestimation or underestimation).

Experimental Units and Subjects

Experimental units are the smallest collection of individuals to which treatments are applied. When the units are human beings, they often are called subjects.

Population

The population in a statistical study is the entire group of individuals about which we want information.

Inference

The purpose of a sample is to give us information about a larger population. The process of drawing conclusions about a population on the basis of sample data is called inference because we infer information about the population from what we know about the sample. Inference from convenience samples or voluntary response samples would be misleading because these methods of choosing a sample are biased. We are almost certain that the sample does not fairly represent the population. The first reason to rely on random sampling is to eliminate bias in selecting samples from the list of available individuals.

Wording of Questions

The wording of questions is the most important influence on the answers given to a sample survey. Confusing or misleading questions can introduce strong bias, and changes in wording can greatly change a survey's outcome. Even the order in which questions are asked matters. CAUTION: Don't trust the results of a sample survey until you have read the exact questions asked.

Margin of Error

The results of random sampling don't change haphazardly from sample to sample. Because we deliberately use chance, the results obey the laws of probability that govern chance behavior. These laws allow us to say how likely it is that sample results are close to the truth about the population. The second reason to use random sampling is that the laws of probability allow trustworthy inference about the population. Results from random samples come with a margin of error that sets bounds on the size of the likely error. Larger random samples give better information about the population than smaller samples.
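
A small simulation can make the last point concrete. The sketch below assumes a very large population in which 60% hold some opinion (a made-up figure), draws many random samples of two sizes, and compares how much the sample proportions vary.

```python
import random
import statistics

def sample_proportions(p, n, reps, rng):
    """Simulate reps random samples of size n; return each sample proportion."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(42)
# Assumed true population proportion: 0.6.
small = sample_proportions(p=0.6, n=100, reps=200, rng=rng)
large = sample_proportions(p=0.6, n=1600, reps=200, rng=rng)

spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(spread_small, spread_large)
```

The proportions from the larger samples cluster more tightly around 0.6, which is why larger random samples come with a smaller margin of error.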

Lack of Realism

A serious threat to an experiment is that the treatment, the subjects, or the environment may not be realistic. Lack of realism can limit our ability to apply the conclusions of an experiment to the settings of greatest interest.

Stratified Random Sample and Strata

To select a stratified random sample, first classify the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample. If the individuals in each stratum are less varied than the population as a whole, a stratified random sample can produce better information about the population than an SRS of the same size.
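
A minimal sketch of the two steps, assuming the strata are already formed and the same number is drawn from each one (the notes do not fix the per-stratum sample size). The class-year strata and the helper `stratified_random_sample` are hypothetical.

```python
import random

def stratified_random_sample(strata, n_per_stratum, seed=None):
    """Take a separate SRS in each stratum and combine them."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical strata: students grouped by class year.
strata = {
    "freshmen": [f"fr_{i}" for i in range(1, 41)],
    "sophomores": [f"so_{i}" for i in range(1, 41)],
    "juniors": [f"jr_{i}" for i in range(1, 41)],
}
sample = stratified_random_sample(strata, n_per_stratum=4, seed=2)
print(sample)
```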

Cluster Sample and Clusters

To take a cluster sample, first divide the population into smaller groups. Ideally, these clusters should mirror the characteristics of the population. Then choose an SRS of the clusters. All individuals in the chosen clusters are included in the sample. You might say that strata are ideally "similar within, but different between," while clusters are ideally "different within, but similar between." Cluster samples are often used for practical reasons. They don't offer the statistical advantage of better information about the population that stratified random samples do. That's because clusters are often chosen for ease or convenience, so they may have as much variability as the population itself.
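
In contrast to stratified sampling, the randomness in a cluster sample is applied to whole groups. The sketch below picks an SRS of clusters and keeps everyone in them; the homeroom clusters and the helper `cluster_sample` are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose an SRS of clusters, then include every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    individuals = [person for name in chosen for person in clusters[name]]
    return chosen, individuals

# Hypothetical clusters: homerooms of 5 students each.
clusters = {
    f"room_{r}": [f"room_{r}_student_{i}" for i in range(1, 6)]
    for r in range(1, 9)
}
chosen, individuals = cluster_sample(clusters, n_clusters=2, seed=4)
print(chosen)
print(individuals)
```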

Undercoverage

Undercoverage occurs when some groups in the population are left out of the process of choosing the sample (a list of the entire population is rarely available).

