Stats section 1
Stratified sample
One obtained by separating the population into homogeneous, non-overlapping groups called strata, and then obtaining a simple random sample from each stratum.
Matched-pairs design
An experimental design in which the experimental units are paired up. The pairs are selected so that they are related in some way, for example: the same person before and after treatment, twins, husband and wife, or the same geographical location. There are only 2 levels of treatment. Example: does listening to music help students learn? Match students on IQ and gender (e.g., 2 females with IQ in the 110-115 range, or 2 males with IQ in the 105-110 range). Assign treatment: for each pair of students, flip a coin to decide which student in the pair gets the quiet room and which gets the room with music. Administer treatment: give both students the same material and the same time to study, give the test, and compute the difference in scores for each matched pair. Because the students in a pair are alike on the matched characteristics, any difference in scores is attributed to the treatment.
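The music example can be sketched in a few lines of Python. This is only an illustration: the student labels, the exam scores, and the helper names `assign_within_pairs` and `pair_differences` are made up for the sketch and are not part of the original notes.

```python
import random

def assign_within_pairs(pairs, rng):
    """For each matched pair, flip a coin to decide who studies with music."""
    assignments = []
    for a, b in pairs:
        if rng.random() < 0.5:  # the "coin flip"
            assignments.append({"music": a, "quiet": b})
        else:
            assignments.append({"music": b, "quiet": a})
    return assignments

def pair_differences(assignments, scores):
    """Score difference (music minus quiet) for each matched pair."""
    return [scores[p["music"]] - scores[p["quiet"]] for p in assignments]

# Hypothetical students, already matched on IQ range and gender.
pairs = [("F1", "F2"), ("M1", "M2"), ("F3", "F4")]
# Hypothetical exam scores, invented for illustration only.
scores = {"F1": 82, "F2": 78, "M1": 75, "M2": 80, "F3": 90, "F4": 85}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
assignments = assign_within_pairs(pairs, rng)
differences = pair_differences(assignments, scores)
mean_difference = sum(differences) / len(differences)
print(assignments)
print(differences, mean_difference)
```

The analysis works on one difference per pair rather than on the raw scores, which is what distinguishes a matched-pairs design from comparing two independent groups.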
A poll is being conducted outside of a school to obtain a sample of the population of an entire country. What is the frame for this type of sampling? Who would be excluded from the survey and how might this affect the results of the survey?
Frame = population of an entire country. Any person not going to the school is excluded. This would cause sampling bias due to undercoverage.
cohort study
An observational study in which a group of participants (the cohort) is observed over a long period of time. Over this time, characteristics of the individuals are recorded. At the end of the study, the value of the response variable is recorded for each individual.
Convenience sample
one in which the individuals in the sample are easily obtained.
Placebo controlled experiment
An experiment in which one group receives a placebo. The placebo control group serves as a baseline against which to compare the results of the other treatments.
Experimental design types
1) Completely randomized design. 2) Matched-pairs design: both treatments are given to each experimental unit, or to each member of a related pair. Ex: each child gets the milk with Xylitol and the milk without Xylitol, so the individual is matched with himself or herself. Ex: do husbands and wives have similar IQs? This is a matched-pairs design because as soon as you select the husband, the wife automatically comes as the pair.
What is a closed question? What is an open question? Discuss the advantages and disadvantages of each type of question.
A closed question has fixed choices for answers, whereas an open question is a free-response question. Closed questions are easier to analyze, but limit the responses. Open questions allow respondents to state exactly how they feel, but are harder to analyze due to the variety of answers and possible misinterpretation of answers.
What is a designed experiment?
A designed experiment is a study in which a researcher assigns individuals to groups, intentionally changes the value of an explanatory variable, and then records the value of the response variable for each group.
What is a lurking variable?
A lurking variable is an explanatory variable that was not considered in a study, but that affects the value of the response variable in the study. In addition, lurking variables are typically related to explanatory variables in the study. A relation that appears to exist between a certain explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study. These variables are called lurking variables.
What does it mean when an observational study is retrospective?
A retrospective study requires that individuals look back in time or requires the researcher to look at existing records.
Factor
A variable whose effect on the response variable is to be assessed by the experimenter
What is a confounding variable?
An explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study.
What is an observational study? What is a designed experiment? Which allows the researcher to claim causation between an explanatory variable and a response variable?
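An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables. In a designed experiment, the researcher assigns individuals to groups, intentionally changes the value of an explanatory variable, and records the value of the response variable for each group. Only a designed experiment allows the researcher to claim causation between an explanatory variable and a response variable.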
Suppose you assigned 40 subjects to each of the three treatment groups. In addition, you decided to control the variable exercise by having each subject perform 150 minutes of cardiovascular exercise each week by walking on a treadmill. However, the 40 subjects in the placebo group decided they did not want to walk on the treadmill and skipped the weekly exercise. Explain how exercise is now a confounding variable.
Any difference in the change in the response variable cannot be attributed to the treatment level alone. Because the placebo group skipped the exercise, it may be the exercise, rather than the treatment, that caused the change in the response variable.
What is a case-control study?
Case-control studies are observational studies that are retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records. In case-control studies, individuals that have a certain characteristic are matched with those that do not. A disadvantage to this type of study is that it requires individuals to recall information from the past. Plus it requires the individuals to be truthful in their responses. An advantage of case-control studies is that they are relatively inexpensive to conduct and can be done relatively quickly.
To determine customer opinion of their safety features, Toyota randomly selects 40 dealerships during a certain week and surveys all customers visiting the dealerships.
Cluster
Explain what is meant by confounding. What is a lurking variable? What is a confounding variable?
Confounding in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study. Confounding is potentially a major problem with observational studies. Often, the cause of confounding is a lurking variable.
What is a cross-sectional study?
Cross-sectional studies are observational studies that collect information about individuals at a specific point in time or over a very short period of time. For example, a researcher might want to assess the risk associated with smoking by looking at a group of people, determining how many are smokers and comparing the incidence rate of lung cancer of the smokers to the nonsmokers. A clear advantage of cross-sectional studies is that they are cheap and quick to do. However, cross-sectional studies have limitations. For the lung cancer study, it could be that individuals develop cancer after the data are collected, so the study will not give the full picture.
Completely randomized design
Each experimental unit is randomly assigned to a treatment. Ex: with 600 experimental units, take a random sample of 200 and assign them to group 1, randomly select 200 more for group 2, and the remaining 200 go to group 3.
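A minimal Python sketch of the 600-unit example follows. It assumes the number of units divides evenly among the groups; the function name `completely_randomized` and the integer unit labels are hypothetical.

```python
import random

def completely_randomized(units, n_groups, rng):
    """Shuffle all units, then cut the shuffled list into equal-sized groups.

    Assumes len(units) is divisible by n_groups.
    """
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_groups
    return [shuffled[i * size:(i + 1) * size] for i in range(n_groups)]

units = range(1, 601)    # 600 experimental units, labeled 1..600
rng = random.Random(42)  # fixed seed so the sketch is repeatable
groups = completely_randomized(units, 3, rng)
print([len(g) for g in groups])
```

Shuffling once and slicing is equivalent to drawing 200 at random, then 200 more from the remainder, and leaving the last 200 for group 3.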
A simple random sample is always preferred because it obtains the same information as other sampling plans but requires a smaller sample size.
False. Other sampling techniques may provide more information for less cost than a simple random sample. Quite often, the goal is not to obtain the best sample, but to obtain as much information as possible about the population at the least cost. With this goal in mind, it may be advantageous to use sampling techniques other than simple random sampling.
When obtaining a stratified sample, the number of individuals included within each stratum must be equal
False. A stratified sample is obtained by separating the population into nonoverlapping groups called strata and then obtaining a simple random sample from each stratum. The individuals within each stratum should be homogeneous (or similar) in some way. In order for the stratified sample to be representative of the population, the number of individuals sampled from each stratum should be proportional to the size of the stratum in the population, not equal across strata. For example, if one wanted to take a stratified sample of 100 individuals from a population that is 53% female and 47% male, then 53 females and 47 males should be sampled.
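The 53/47 example can be checked with a short Python sketch. The population of 1,000 labeled individuals and the function name `proportional_stratified_sample` are assumptions made for illustration. The simple rounding used here is not guaranteed to add up to exactly n for every population.

```python
import random

def proportional_stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized in proportion
    to the stratum's share of the population (rounded to a whole number)."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 53% female, 47% male.
strata = {
    "female": [f"F{i}" for i in range(530)],
    "male": [f"M{i}" for i in range(470)],
}
sample = proportional_stratified_sample(strata, 100, random.Random(1))
print({name: len(chosen) for name, chosen in sample.items()})
```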
Blocking:
Grouping together similar (homogeneous) experimental units. Each homogeneous group is called a BLOCK. Treatments are then randomized within each block, so differences between the blocks do not distort the comparison of the response variable across treatments.
Which is the superior observational study? Why?
Neither a case-control study nor a cross-sectional study is always superior to the other. Both have advantages and disadvantages that depend on the situation. Both studies are inexpensive and can be done relatively quickly. A case-control study is limited in that it requires individuals to recall information correctly and to answer questions truthfully. A cross-sectional study is limited in that it only gives information at a specific point in time or over a very short period of time, and might miss valuable information that occurs outside of that point in time.
A study is conducted to determine if there is a relationship between brain cancer and use of cell phones. Doctors ask their patients with brain cancer about their use of cell phones. Does the description correspond to an observational study or an experiment?
Observational: because it examines individuals in a sample, but does not try to influence the response variable.
To determine her power usage, Carolyn divides up her day into three parts: morning, afternoon, and evening. She then measures her power usage at 3 randomly selected times during each part of the day.
Stratified
Study: Students are divided into two groups; one group is taught math with traditional techniques, the other with a reformed method. One year later, a test is given to measure proficiency. What is the study description?
The study is an experiment: the researchers control one variable (the teaching method) to determine its effect on the response variable (proficiency).
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at Daimler-Chrysler selects every 13th van that comes off the assembly line starting with the second until she obtains a sample of 110 vans.
Systematic
Randomized Block design
The experimental units are divided into homogeneous groups called blocks. Within each block, the experimental units are randomly assigned to treatments.
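As a rough Python sketch of this design, the code below randomizes treatments separately inside each block. The blocks (by gender), the unit labels, the two treatments, and the function name `randomized_block` are all hypothetical choices for illustration.

```python
import random

def randomized_block(blocks, treatments, rng):
    """Within each block, shuffle the units and deal out treatments in turn,
    so every treatment appears (nearly) equally often in every block."""
    assignment = {}
    for units in blocks.values():
        shuffled = list(units)
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks: units are homogeneous within a block.
blocks = {
    "female": ["F1", "F2", "F3", "F4"],
    "male": ["M1", "M2", "M3", "M4"],
}
treatments = ["music", "quiet"]
assignment = randomized_block(blocks, treatments, random.Random(2))
print(assignment)
```

Because the randomization happens inside each block, each treatment group ends up with the same mix of blocks, unlike a completely randomized design where the mix is left to chance.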
In the report, the researchers stated, "These results remain significant after adjustment for socioeconomic status." What does this mean?
The researchers made an effort to avoid confounding by accounting for potential lurking variables.
In the report, the researchers stated that "the research team also hasn't ruled out that a common factor like genetics could be causing both the emotions and the diabetes." Explain what this sentence means. Choose the correct answer below.
The researchers may be concerned with confounding that occurs when the effects of two or more explanatory variables are not separated or when there are some explanatory variables that were not considered in a study, but that affect the value of the response variable.
Determine whether the following statement is true or false. Explain. When taking a systematic random sample of size n, every group of size n from the population has the same chance of being selected.
The statement is false because a systematic sample is obtained by selecting every kth individual from the population. The first individual selected corresponds to a random number between 1 and k. Certain groups would never be selected this way, such as the second half of the population.
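A small Python sketch makes the point concrete. It uses an assumed population of 20 individuals and k = 5 (both chosen only for illustration) and lists every sample a systematic scheme can produce.

```python
import math
import random

def systematic_sample(population, k, start):
    """Every k-th individual, beginning at 1-based position `start` (1 <= start <= k)."""
    return population[start - 1::k]

population = list(range(1, 21))  # hypothetical population labeled 1..20
k = 5

# Only k different samples can ever be drawn, one per possible starting point.
all_possible = [systematic_sample(population, k, s) for s in range(1, k + 1)]
for s in all_possible:
    print(s)

# Number of distinct groups of size 4 that exist in the population.
n_groups_of_4 = math.comb(len(population), 4)
print(len(all_possible), "reachable samples out of", n_groups_of_4, "groups")

# An actual draw: pick the starting point at random between 1 and k.
start = random.Random(3).randint(1, k)
chosen = systematic_sample(population, k, start)
```

A group such as [1, 2, 3, 4] never appears in `all_possible`, so not every group of size n has the same chance of selection.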
Sampling error
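The error that results from using a sample to estimate information about a population. It occurs because a sample gives incomplete information about the population.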
What does it mean when an observational study is prospective?
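A prospective study collects data going forward in time: a group of individuals is followed over a period and their characteristics and the response variable are recorded as time passes. Cohort studies are prospective.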
Experiment
A controlled study conducted to determine the effect that varying one or more explanatory variables (factors) has on a response variable. Any combination of the values of the factors is called a TREATMENT. Example: 2 factors with 2 levels each combine into 4 treatments. Factors: smoking vs. nonsmoking, low-protein diet vs. high-protein diet.
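The treatments in the smoking/diet example can be enumerated with a short Python sketch. The dictionary layout and variable names are illustrative assumptions.

```python
from itertools import product

# Two factors, each with two levels (from the example above).
factors = {
    "smoking": ["smoker", "nonsmoker"],
    "diet": ["low protein", "high protein"],
}

# A treatment is one combination of a level from every factor.
treatments = [dict(zip(factors, levels)) for levels in product(*factors.values())]
for t in treatments:
    print(t)
print(len(treatments), "treatments")
```

The number of treatments is the product of the number of levels of each factor, here 2 x 2 = 4.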
experimental unit
A person, object, or some other well-defined item upon which a treatment is applied. When the experimental unit is a person, it is called a subject.
a designed experiment
allows the researcher to claim causation between an explanatory variable and a response variable
Control group
The group that receives a baseline treatment, against which the other treatments can be compared.
Steps in designing an experiment
Design: describe the overall plan for conducting the experiment. What is the plan to be followed?
Step 1: Identify the problem to be solved. State the problem clearly, and identify the response variable and the population the study applies to.
Step 2: Determine the factors that affect the response variable. These are the EXPLANATORY VARIABLES. Then decide which factors will be fixed (controlled) and which will be manipulated, keeping in mind that some existing factors can't be controlled.
Step 3: Determine the number of experimental units.
Step 4: Determine the level of each explanatory variable (factor).
Control: there are 2 ways to control a factor. 1) Fix the level of the factor at one value throughout the experiment. Use this for a factor whose impact on the response variable does not interest you. Ex: everyone eats 5 slices of bread one hour before receiving alcohol; we don't care about its effect on reaction time. 2) Set the level of the factor at predetermined levels. Use this for a factor whose impact on the response variable does interest you. Ex: how varying the amount of alcohol impacts reaction time: 1 oz every hour, 3 oz every hour, and a placebo give 3 levels of the explanatory variable. The combination of the levels of all varied factors represents a treatment in the experiment.
Randomize: some variables cannot be controlled. Experimental units are randomly assigned to treatment groups, which "averages out" the effects of uncontrolled explanatory variables. Ex: the innate reaction time of the subjects; after randomization, the average innate reaction time in the 3 treatment groups should be close.
Step 5: Conduct the experiment. 1) REPLICATION: each treatment is applied to more than one experimental unit, to make sure that the effect of a treatment is not due to some characteristic of a single experimental unit. Each treatment group should have the same number of experimental units. 2) Collect and process the data by measuring the value of the response variable in each replication. Differences in the value of the response variable can then be attributed to differences in the level of the treatment.
Step 6: Test the claim, that is, test the research objective/goal.
Placebo
An innocuous medication, such as a sugar tablet, that looks, tastes, and smells like the experimental medication.
Blinding technique
Nondisclosure of the treatment that is being received by the experimental unit. Single blinding: the experimental unit does not know WHICH treatment he or she is getting. Double blinding: NEITHER the experimental unit NOR the researcher in contact with the experimental unit knows which treatment the experimental unit is getting.
Cluster Sample
Obtained by selecting all individuals within a randomly selected collection or group of individuals. Put another way, a cluster sample is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.
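A minimal Python sketch of cluster sampling follows. The dealership names, the customer labels, and the function name `cluster_sample` are hypothetical and only stand in for real clusters.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every individual in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    individuals = [person for name in chosen for person in clusters[name]]
    return chosen, individuals

# Hypothetical clusters: customers grouped by the dealership they visited.
clusters = {
    "dealer_A": ["a1", "a2", "a3"],
    "dealer_B": ["b1", "b2"],
    "dealer_C": ["c1", "c2", "c3", "c4"],
    "dealer_D": ["d1"],
}
chosen, individuals = cluster_sample(clusters, 2, random.Random(5))
print(chosen)
print(individuals)
```

The randomness is applied to the groups, not to the individuals. This is the opposite of a stratified sample, where every group is used and individuals are sampled within each one.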
Systematic sample
Obtained by selecting every kth individual from the population. The first individual selected corresponds to a random number between 1 and k.
response variable
variable of interest to be measured in the study. The value of response variable is affected by the explanatory variable.
Sampling bias
Occurs when the technique used to obtain the sample's individuals tends to favor one part of the population over another. All convenience samples have sampling bias, because they are not chosen through random sampling. Undercoverage: the proportion of one segment of the population is lower in the sample than it is in the population, which can lead to incorrect predictions. Ex: a telephone survey undercovers people who don't have a telephone, which biases the results if those people differ from people who do have one.