Stats
Which sampling method would most likely be used in a poll of voters regarding a referendum calling for a national housing development program? A. Use systematic random sampling. By varying the k-value, a sampling of the whole nation can be selected to achieve the needed accuracy. B. Use cluster sampling. Each geographical region can be treated as a mini-population, because a nation has so many individuals. C. Use stratified random sampling. Since this is a national issue, different geographical locations are likely to have similar views. D. Use convenience sampling. A large number of voters is needed to accurately reflect the nation's voters; so a quick way to collect information is needed. E. Use simple random sampling. People are very likely to have different opinions on the issue based on how much they use any present services that would be affected.
Use stratified random sampling. Since this is a national issue, different geographical locations are likely to have similar views.
Unstacked Data table
Data values are stored in two columns; each column is a variable from a different group. Information in a row does NOT correspond to the same individual.
Define simple random sampling
A sample of size n from a population of size N is obtained through simple random sampling if every possible sample of size n has an equally likely chance of occurring. The sample is then called a simple random sample.
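A minimal sketch of this definition, assuming a hypothetical population of 10 numbered individuals. Python's standard `random.sample` draws without replacement, so every possible sample of size n is equally likely; the function name and the seed are illustrative only.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every size-n subset is equally likely."""
    rng = random.Random(seed)  # seed only makes the sketch reproducible
    return rng.sample(population, n)

population = list(range(1, 11))  # hypothetical individuals numbered 1..10
sample = simple_random_sample(population, 3, seed=0)
print(sample)
```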
Categorical variables have 2 main types of graphs
Bar charts (which have spaces between the bars) and pie charts
two types of variables
Categorical and Numerical
The number of times a value is observed in a data set is called a
Frequency
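As a small illustration, frequencies can be tallied with `collections.Counter` from the standard library. The ticket counts below are made-up data, not taken from any card.

```python
from collections import Counter

# Hypothetical data: number of speeding tickets reported by 10 drivers.
tickets = [0, 0, 1, 0, 2, 0, 1, 3, 0, 0]

frequencies = Counter(tickets)  # maps each value to how often it was observed
for value in sorted(frequencies):
    print(value, frequencies[value])
```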
What are two types of Variables in statistics
Numerical and Categorical
In statistics, the data we work with is just one part of a bigger picture called the
Population
Coded Data
Values that represent something else. Ex. 0 is male, 1 is female
Systematic Sampling
Sampling every kth member of a population after randomly determining the first individual by selecting a random number between 1 and k.
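A rough sketch of the procedure, assuming a hypothetical list of 20 numbered members and k = 4 (so the random start falls between 1 and 4). The function name is illustrative.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k members, then take every kth member."""
    rng = random.Random(seed)
    start = rng.randint(1, k)        # random starting position, 1..k inclusive
    return population[start - 1::k]  # convert to a 0-based index, step by k

population = list(range(1, 21))      # hypothetical members numbered 1..20
sample = systematic_sample(population, 4, seed=1)
print(sample)
```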
Simple Random Sampling
Use a random number generator or random number table to determine which members are chosen. Ex. drawing numbers from a hat, a computer-aided random number generator, or a random number table.
Statistics rests on Two major Concepts
Variation and Data
Data are more than just numbers, because data have
context
Histogram
Has no gaps between bins unless a bin contains no values.
Categorical variables are also referred to as ______ variables.
qualitative
A(n) ____________________ is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.
stratified sample
Can a categorical (qualitative) variable have values that are numeric? Why or why not?
Yes; it is possible to have numeric variables that do not count or measure anything, and, as a result, are categorical (qualitative). Ex. ZIP codes or Social Security numbers.
Why is random assignment used to assign people to treatment groups and control groups in a controlled experiment? A. To make the groups as similar as possible, minimizing bias. B. To make sure that the percentage of men and women in each group is exactly the same. C. To ensure that the researchers do not know which groups subjects are members of. D. To ensure that the groups have equal numbers.
A. To make the groups as similar as possible, minimizing bias.
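One way random assignment might be carried out, sketched under the assumption of eight hypothetical subjects split evenly: shuffle the list, then cut it in half. Chance alone decides who lands in which group.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = subjects[:]   # copy so the original order is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 9)]  # hypothetical subject labels
treatment, control = randomly_assign(subjects, seed=42)
print("treatment:", treatment)
print("control:", control)
```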
To determine customer opinion of their safety features, Toyota randomly selects 90 dealerships during a certain week and surveys all customers visiting the dealerships. A. Systematic B. Stratified C. Cluster D. Simple random E. Convenience
Cluster
Indicate whether the study is an observational study or a controlled experiment. A group of boys is randomly divided into two groups. One group watches violent cartoons for one hour, and the other group watches cartoons without violence for one hour. The boys are then observed to see how many violent actions they take in the next two hours, and the two groups are compared.
Controlled experiment
Of the following, which is the only method of data collection suitable for making conclusions about causal relationships? A. Controlled experiments B. Observational studies C. Anecdotes D. All three are suitable.
Controlled Experiments
Indicate whether the study is an observational study or a controlled experiment. A researcher was interested in the effects of exercise on academic performance in elementary school children. She went to the recess area of an elementary school and identified some students who were exercising vigorously and some who were not. The researcher then compared the grades of the exercisers with the grades of those who did not exercise.
Observational
Do people walk faster in an airport when they are departing (getting on a plane) or after they have arrived (getting off a plane)? An interested passenger watched a random sample of people departing and a random sample of people arriving and measured the walking speed (in feet per minute) of each. What type of study design is being performed? A. Randomized block experimental design B. Observational study C. Questionnaire D. Completely randomized experimental design
Observational Study
Suppose a pharmaceutical company is designing a double-blind experiment to test their new allergy medication. They divide the 200 subjects by gender and then randomly assign the men and women to either receive the medication or a placebo. The researcher finds that even the group receiving the placebo (a sugar pill) showed a decrease in allergy symptoms. Which of the following accounts for this result? Response bias; Confounding; Blocking; Placebo effect
Placebo effect
Cluster Sampling
The population is divided into clusters, each reflecting the variability within the population; a certain number of clusters are then randomly selected, and every member of each selected cluster is surveyed.
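A minimal sketch of the idea, assuming four hypothetical dealerships as clusters with made-up customer labels. Whole clusters are chosen at random, and then everyone in a chosen cluster is surveyed.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly pick whole clusters, then keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # sample cluster names
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical clusters: customers grouped by the dealership they visited.
clusters = {
    "dealership_1": ["c1", "c2", "c3"],
    "dealership_2": ["c4", "c5"],
    "dealership_3": ["c6", "c7", "c8", "c9"],
    "dealership_4": ["c10"],
}
chosen, members = cluster_sample(clusters, 2, seed=3)
print(chosen, members)
```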
IBM wants to administer a satisfaction survey to its current customers. Using their customer database, the company randomly selects 30 customers and asks them about their level of satisfaction with the company. A. Simple random B. Stratified C. Cluster D. Systematic E. Convenience
Simple Random
IBM wants to administer a satisfaction survey to its current customers. Using their customer database, the company randomly selects 60 customers and asks them about their level of satisfaction with the company. A. Convenience B. Systematic C. Stratified D. Simple random E. Cluster
Simple Random
To determine her stress level, Miranda divides up her day into three parts: morning, afternoon, and evening. She then measures her stress level at 2 randomly selected times during each part of the day. A. Stratified B. Convenience C. Systematic D. Simple random E. Cluster
Stratified
The outcome variable in a question about causality is also referred to as what? The response variable; The treatment variable; The predicting variable; The control group
The Response Variable
In an experiment studying the association between a treatment variable and an outcome variable, the group of people who do NOT receive the treatment are called what? The placebo group; The non-treatment group; The treatment group; The control group
The control group
Inferences based on voluntary response samples are generally not reliable
True, because it is often the case that the individuals who volunteer do not accurately represent the population.
A study was done to see whether a smaller dose of flu vaccine could be used successfully. In this study, the usual amount of vaccine was injected into half the patients, and the other half of the patients had only a small amount of vaccine injected. The response was measured by looking at the production of antibodies. In the end, the lower dose of vaccine was just as effective as a higher dose for those under 65 years old. What more do we need to know to be able to conclude that the lower dose of vaccine was equally effective at preventing the flu for those under 65? A. This is an observational study and causation cannot be inferred. B. The patients need to be randomly assigned the full or lower dose. Without randomization there could be bias; however, with randomization we can infer causation. C. This is a controlled experiment and causation cannot be inferred. D. The sample size must be at least 500 to ensure the sample size is large enough to infer causation.
B. The patients need to be randomly assigned the full or lower dose. Without randomization there could be bias; however, with randomization we can infer causation.
A group of educators want to determine how effective tutoring is in raising students' grades in a math class, so they arrange free tutoring for those who want it. Then they compare final exam grades for the group that took advantage of the tutoring and the group that did not. Suppose the group participating in the tutoring tended to receive higher grades on the exam. Does that show that the tutoring worked? If not, explain why not and suggest a confounding variable. A. Because this was an observational study, it only shows an association; it does not show that the tutoring worked. It could be that an insufficient number of students were studied to show that the tutoring worked. B. The study shows that the tutoring worked. C. Because this was an observational study, it only shows an association; it does not show that the tutoring worked. It could be that more motivated students attended the tutoring and that was what caused the higher grades.
C. Because this was an observational study, it only shows an association; it does not show that the tutoring worked. It could be that more motivated students attended the tutoring and that was what caused the higher grades.
When obtaining a stratified sample, the number of individuals included within each stratum must be equal.
False. Within stratified samples, the number of individuals sampled from each stratum should be proportional to the size of that stratum in the population.
Two-Way Table
Shows the counts for each combination of values of two categorical variables. Ex. columns Male, Female; first row 2, 3; second row 1, 2; Total: 3, 5
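A sketch of how such a table could be tallied. The counts match the example on the card, but the row labels "Yes" and "No" are an assumption added for illustration, since the card does not name the second variable.

```python
from collections import Counter

# Hypothetical records: (gender, answer). "Yes"/"No" are assumed row labels.
records = (
    [("Male", "Yes")] * 2 + [("Female", "Yes")] * 3
    + [("Male", "No")] * 1 + [("Female", "No")] * 2
)
counts = Counter(records)  # frequency of each (gender, answer) combination

columns = ["Male", "Female"]
rows = ["Yes", "No"]
totals = {c: sum(counts[(c, r)] for r in rows) for c in columns}

print(f"{'':8}" + "".join(f"{c:>8}" for c in columns))
for r in rows:
    print(f"{r:8}" + "".join(f"{counts[(c, r)]:>8}" for c in columns))
print(f"{'Total':8}" + "".join(f"{totals[c]:>8}" for c in columns))
```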
A(n) _______________ is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.
cluster sample
Pareto Chart
Is a bar graph in which the bars are arranged from tallest to shortest. (This cannot necessarily be done in a histogram.)
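The ordering behind a Pareto chart can be sketched without any plotting: count each category, then list the categories from most to least frequent. The complaint categories below are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical data: reasons for customer complaints.
complaints = ["late"] * 5 + ["damaged"] * 2 + ["wrong item"] * 3

# most_common() returns (category, count) pairs sorted from tallest bar to shortest.
pareto_order = Counter(complaints).most_common()
for category, count in pareto_order:
    print(f"{category:12} {'#' * count}")
```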
Variability on graphs
Is how spread out the bins are horizontally, not how tall the bins are.
The _________________ is/are a subset of the population that is being studied.
sample
Causality two variables
Treatment variable - whether or not a specific treatment is used; outcome (or response) variable - whether or not a certain outcome is seen
Which of the following is an identifying mark of an observational study? A. Subjects are often given a placebo or harmless pill that they believe is actually an effective treatment. B. Researchers observe but do not interact with subjects in an experiment. C. Observational studies are controlled experiments. D. Subjects in the study are put into the treatment group or the control group either by their own actions or by the decision of someone not involved in the research study.
D. Subjects in the study are put into the treatment group or the control group either by their own actions or by the decision of someone not involved in the research study.
Which of the following is used to summarize two potentially related categorical variables? Frequency table; Stacked table; Histogram; A two-way table
A Two-Way Table
A teacher asks 90 students who drive how many speeding tickets they received in the last year. Predict the shape of the distribution and explain. A. The distribution will be right-skewed. Most people will have no tickets, but there will be a few people with 1, 2, 3, or more tickets. B. The distribution will be right-skewed. Most people will have at least one ticket, but there will be a few people with no tickets. C. The distribution will be left-skewed. Most people will have at least one ticket, but there will be a few people with no tickets. D. The distribution will be roughly symmetric. The number of people who have fewer tickets than the mean and the number of people who have more tickets than the mean is roughly equal. E. The distribution will be left-skewed. Most people will have no tickets, but there will be a few people with 1, 2, 3, or more tickets.
A. The distribution will be right-skewed. Most people will have no tickets, but there will be a few people with 1, 2, 3, or more tickets.
A pharmaceutical company wants to conduct a survey of 35 individuals who have high cholesterol. The company has obtained a list from primary care physicians throughout the country of 5600 individuals who are known to have high cholesterol. Design a sampling method to obtain the individuals in the sample. Be sure to support the choice. What sampling method(s) is/are valid? Select all that apply. A. Select a hospital at random from a list of all of the country's hospitals. Survey individuals that pass by the front desk until there are responses from 35 individuals with high cholesterol. B. Group the individuals by common primary care physician. Assign each physician a different number and use a random number table to select physicians until the total number of patients of all the selected physicians is at least 35 C. Alphabetize the list of 5600 individuals by last name and select one of the first 160 individuals at random. Starting from the selected individual, read down the list and select every 160th individual. D. Group the individuals by common primary care physician. For each group, assign all the individuals different numbers, and use a random number table to select an appropriate number of individuals.
B. C. and D.
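The arithmetic behind option C can be checked with a short sketch: with 5600 individuals and a target of 35, the step is 5600 / 35 = 160, so any random start among the first 160 yields exactly 35 selections. The seed is arbitrary.

```python
import random

N, n = 5600, 35  # list size and desired sample size, from the question
k = N // n       # step between selected individuals

start = random.Random(0).randint(1, k)     # random position among the first k
positions = list(range(start, N + 1, k))   # 1-based positions on the alphabetized list

print(k, len(positions))
```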
Which of the following is a reason we can never draw cause-and-effect conclusions from observational studies? A. Observational studies are not scientific in nature. B. Potential confounding variables may explain the differences between groups rather than the treatment variable. C. Researchers may be biased in the observations they choose to record. D. Observational studies often do not involve a large enough sample to draw cause-and-effect conclusions.
B. Potential confounding variables may explain the differences between groups rather than the treatment variable.
The distribution of in-state annual tuition for all colleges and universities in the United States is bimodal. What is one possible reason for this bimodality? A. The distribution might be bimodal because students with poor parents and students with wealthy parents attend college or university in roughly equal quantities. B. The distribution might be bimodal because there are a few colleges/universities that have an in-state tuition price that is much higher than the typical college or university. C. The distribution might be bimodal because private colleges and public colleges tend to differ in amount of tuition. D. The distribution might be bimodal because each state sets in-state tuition prices; the wealth of the state determines in-state tuition prices.
C. The distribution might be bimodal because private colleges and public colleges tend to differ in amount of tuition
Indicate whether the following study is an observational study or a controlled experiment. A researcher is interested in the effect of music on memory. She randomly divides a group of students into three groups: those who will listen to quiet music, those who will listen to loud music, and those who will not listen to music. After the appropriate music is played (or not played), she gives all the students a memory test. A. This is an observational study. The researcher observes only the results of the memory test. Without observing the subjects' performance on memory tests prior to assignment to different groups, the study is not sufficiently active to make this a controlled experiment. B. This is an observational study. The number of treatment groups is not sufficiently large to control for all relevant factors aside from the effect of music on memory; a large number of treatment groups is essential to conducting a controlled experiment. C. This is a controlled experiment. She assigns students to the control and treatment groups at random in order to control for all relevant factors aside from the effect of music on memory, which is essential to conducting a controlled experiment. D. This is a controlled experiment. She controls for other factors prior to assigning students to the control group and two treatment groups. This is essential to conducting a controlled experiment.
C. This is a controlled experiment. She assigns students to the control and treatment groups at random in order to control for all relevant factors aside from the effect of music on memory, which is essential to conducting a controlled experiment.
A study reported on the effects of vitamin C in breast milk for breast-feeding mothers. The children whose mothers had chosen to take high doses of vitamin C had a 30% lower risk of developing allergies. Can you conclude that the use of vitamin C caused the reduction in allergies? Why or why not? A. You can conclude that the use of vitamin C caused the reduction in allergies because the mothers were separated into treatment or control groups when they elected whether to take high doses of vitamin C. This is sufficient to identify causation. B. You can conclude that the use of vitamin C caused the reduction in allergies because the researchers did not know beforehand which mothers chose to take high doses of vitamin C, eliminating the possibility of bias. This is sufficient to identify causation. C. You cannot conclude that the use of vitamin C caused the reduction in allergies because the researchers did not randomly assign mothers to treatment and control groups. This step is necessary for identifying causation. D. You cannot conclude that the use of vitamin C caused the reduction in allergies because the mothers are aware that they are in a study, confounding results. Eliminating this bias is necessary for identifying causation.
C. You cannot conclude that the use of vitamin C caused the reduction in allergies because the researchers did not randomly assign mothers to treatment and control groups. This step is necessary for identifying causation.
Which of the following would produce a simple random sample? Select all that apply. A. Ask someone their opinion on which three books are best, and select those three. B. List the names in alphabetical order and take the first three on the list. C. Number the books from 1 to 9 and use a random number table to produce 3 different one-digit numbers corresponding to the books selected. D. List each book on a separate piece of paper, place them all in a hat, and pick three.
C. and D.
Some people believe that wearing copper bracelets is a good treatment for arthritis of the hand. To test this belief, suppose you recruit 100 people and supply them all with copper bracelets. After the patients wear the bracelets for a month, you ask them whether or not their pain is less than it was before they began wearing the bracelets. Explain how to improve this study. A. To improve the study, accept only patients who believe that copper bracelets are a good treatment for arthritis. Then, assign the copper bracelets to patients in the treatment group, and nothing to patients in the control group. After a month, survey the patients on the levels of their pain. B. To improve the study, accept only patients who do not believe that copper bracelets are a good treatment for arthritis. Then, assign the copper bracelets to patients in the treatment group, and nothing to patients in the control group. After a month, survey the patients on the levels of their pain. C. To improve the study, the patients should be randomly divided into two groups; one group will be given the copper bracelets, and the other group will be given nothing. After a month, the patients will be surveyed on the levels of their pain. D. To improve the study, the patients should be randomly divided into two groups; one group will be given the copper bracelets, and the other group will be given non-copper bracelets. After a month, the patients will be surveyed on the levels of their pain.
D. To improve the study, the patients should be randomly divided into two groups; one group will be given the copper bracelets, and the other group will be given non-copper bracelets. After a month, the patients will be surveyed on the levels of their pain.
A _______ experiment is one in which neither the subject nor the researcher knows which treatment the subject is receiving.
Double-Blind
Stacked Data Table
Each row contains data for a single individual. Ex. one row for Baby A listing weight, gender, etc. across the columns, then one row for Baby B listing weight, gender, etc.
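To contrast the two layouts, here is a small sketch that starts from stacked rows (one baby per row) and regroups the weights into unstacked columns, one per gender. The babies, weights, and field names are all made up for illustration.

```python
# Stacked layout: each row (dict) describes one individual.
stacked = [
    {"baby": "A", "gender": "female", "weight_lb": 7.1},
    {"baby": "B", "gender": "male", "weight_lb": 8.0},
    {"baby": "C", "gender": "female", "weight_lb": 6.4},
    {"baby": "D", "gender": "male", "weight_lb": 7.5},
]

# Unstacked layout: one column of weights per group; values that share a
# position across columns do NOT belong to the same individual.
unstacked = {}
for row in stacked:
    unstacked.setdefault(row["gender"], []).append(row["weight_lb"])

print(unstacked)
```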
Stemplot
Is for numerical data. Ex. stem | leaves: 0 | 1111113333555666; 1 | 0564; 2 | 088675
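A rough sketch of how a stemplot could be built, assuming non-negative integers below 100 so that the tens digit is the stem and the ones digit is the leaf. The data values are made up, and stems with no values are simply not printed in this simplified version.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: tens digit as the stem, ones digits as sorted leaves."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return [
        f"{stem} | {''.join(str(leaf) for leaf in stems[stem])}"
        for stem in sorted(stems)
    ]

data = [1, 3, 5, 6, 10, 15, 16, 14, 20, 28, 27]  # hypothetical observations
lines = stemplot(data)
print("\n".join(lines))
```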
What is Statistics?
Statistics is the science of collecting, organizing, summarizing, and analyzing data to answer questions and/or draw conclusions
Stratified Random Sampling
The population is divided into two or more groups (strata) according to some criterion, and subsamples are randomly chosen from each stratum.
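A minimal sketch, assuming three hypothetical strata of sizes 60, 30, and 10 and a 10% sampling fraction, so that each stratum contributes in proportion to its size. The stratum names and the rounding rule are illustrative choices.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample from every stratum, proportional to its size."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical population of 100 voters split by type of area.
strata = {
    "urban": list(range(0, 60)),
    "suburban": list(range(60, 90)),
    "rural": list(range(90, 100)),
}
sample = stratified_sample(strata, 0.1, seed=0)
print({name: len(chosen) for name, chosen in sample.items()})
```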