Statistics Chapter 1
A gambler wanted to compare two types of poker strategies. One type is playing tight and the other is a loose style of play. It is a common belief that playing tight will win more pots. This belief is tested by having 10 poker players use each type of strategy in a game and comparing the number of pots won under each method of play. A coin flip was used to determine which type of poker strategy each player would follow first. Results indicated that there was no difference between the two types of strategy. -What type of experimental design is this? -What is the response variable in this experiment? -What is the treatment? -Identify the experimental units. -Why is a coin used to decide the poker strategy each player would follow first?
-Matched pairs -The number of pots won. -The types of strategy -The poker players -To eliminate bias as to which strategy was used first
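The coin-flip step in this matched-pairs design can be sketched in a few lines of Python. This is only an illustration: the player labels, the function name `assign_first_strategy`, and the seed are assumptions, not part of the original exercise.

```python
import random

def assign_first_strategy(players, seed=None):
    """Flip a fair coin per player to decide which strategy is played first."""
    rng = random.Random(seed)
    order = {}
    for player in players:
        first = rng.choice(["tight", "loose"])  # the coin flip
        second = "loose" if first == "tight" else "tight"
        order[player] = (first, second)  # every player uses both strategies
    return order

# Hypothetical labels for the 10 poker players
players = [f"player_{i}" for i in range(1, 11)]
schedule = assign_first_strategy(players, seed=1)
```

Because each player plays both strategies, the player acts as his or her own matched pair; the coin only decides the order.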
A newspaper article reported, "The famous magazine survey of more than 5,000 women aged 18-34 found about 42 percent considered themselves overweight or obese." This survey has bias. What is the type of bias? What is a possible remedy?
-Sampling bias -Use systematic random sampling.
What is a Pareto chart?
A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.
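The ordering that defines a Pareto chart can be sketched as follows. The defect categories are made-up data and `pareto_order` is a hypothetical helper name; the sketch only computes the bar order and heights, not the plot itself.

```python
from collections import Counter

def pareto_order(observations):
    """Return (category, frequency, relative frequency) in decreasing order of frequency."""
    counts = Counter(observations)
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(category, n, n / total) for category, n in ordered]

# Made-up qualitative data
defects = ["scratch", "dent", "scratch", "chip", "scratch", "dent"]
rows = pareto_order(defects)
# Bar order: scratch (3), dent (2), chip (1)
```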
What is a closed question? What is an open question? Discuss the advantages and disadvantages of each type of question.
A closed question has fixed choices for answers, whereas an open question is a free-response question. Closed questions are easier to analyze, but limit the responses. Open questions allow respondents to state exactly how they feel, but are harder to analyze due to the variety of answers and possible misinterpretation of answers.
What is a designed experiment?
In a designed experiment, a researcher assigns individuals to a certain group, intentionally changes the value of an explanatory variable, and then records the value of the response variable for each group.
What is a frame?
A frame is a list of the individuals in the population being studied.
What does it mean when a part of the population is under-represented?
A part of the population is under-represented when it is proportionally smaller in a sample than in its population.
What does it mean when an observational study is prospective?
A prospective study collects the data over time.
What does it mean when an observational study is retrospective?
A retrospective study requires that individuals look back in time or requires the researcher to look at existing records.
Consider a survey conducted of people in a certain country. One aspect of the survey is to measure the weight of people in a particular very small town. (a) What is the population of interest for this portion of the survey? (b) What is the variable of interest for this portion of the survey? (c) Is the variable qualitative or quantitative? (d) What is the level of measurement for the variable? (e) Is a census feasible in this situation? (f) Is a sample feasible in this situation?
(a) All people in the town (b) Weight (c) Quantitative (d) Ratio (e) Yes (f) Yes
What is an observational study?
An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.
Treatment
Any combination of the values of the factors (explanatory variables)
What are some solutions to nonresponse?
Attempt callbacks. Offer rewards and incentives.
Experimental unit.
A person, object, or some other well-defined item upon which a treatment is applied
What is a case-control study?
Case-control studies are observational studies that are retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records.
To determine customer opinion of its pricing, Greyhound Lines randomly selects 100 buses during a certain week and surveys all passengers on the buses. What type of sampling is used?
Cluster
A radio station asks its listeners to call in their opinion regarding the use of pesticides in residential areas. What type of sampling is used?
Convenience
What is a cross-sectional study?
Cross-sectional studies are observational studies that collect information about individuals at a specific point in time or over a very short period of time.
True or false. When taking a systematic random sample of size n, every group of size n from the population has the same chance of being selected.
False, because certain groups would never be selected.
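A small enumeration makes this concrete. The sketch below assumes a toy population of 12 numbered individuals and a sampling interval of k = 3; both values and the helper name `systematic_samples` are illustrative only.

```python
from math import comb

def systematic_samples(population, k):
    """All samples obtainable by picking a start among the first k and taking every k-th individual."""
    return [population[start::k] for start in range(k)]

population = list(range(1, 13))          # toy population: individuals 1..12
samples = systematic_samples(population, k=3)
possible = len(samples)                  # only 3 samples can ever occur
all_groups = comb(len(population), 4)    # 495 groups of size 4 exist
```

A group such as 1, 2, 3, 4 is one of the 495 groups of size 4 but can never be produced by the systematic procedure.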
True or False. A simple random sample is always preferred because it obtains the same information as other sampling plans but requires a smaller sample size.
False, because other sampling techniques may provide more information for less cost than a simple random sample.
True or False. When obtaining a stratified sample, the number of individuals included within each stratum must be equal.
False. A stratified sample is obtained by separating the population into nonoverlapping groups called strata and then obtaining a simple random sample from each stratum. The individuals within each stratum should be homogeneous (or similar) in some way. For the stratified sample to be representative of the population, the number of individuals sampled from each stratum should be proportional to the size of the stratum in the population. For example, if one wanted to take a stratified sample of 100 individuals from a population that is 53% female and 47% male, then 53 females and 47 males should be sampled.
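The proportional allocation in the 53%/47% example can be sketched as below. The stratum counts (5300 and 4700) are assumed purely so that the percentages match the example, and `proportional_allocation` is a hypothetical helper.

```python
def proportional_allocation(strata_sizes, n):
    """Split a total sample size n across strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    # Note: with awkward proportions, rounding may make the parts not sum to n exactly.
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

# Assumed population counts giving 53% female and 47% male
strata = {"female": 5300, "male": 4700}
allocation = proportional_allocation(strata, n=100)
# allocation is {"female": 53, "male": 47}
```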
To help assess student learning in her developmental art courses, an art professor at a community college implemented pre- and post-tests for her developmental art students. A knowledge-gained score was obtained by taking the difference of the two test scores. What is the response variable in this experiment? What is the treatment?
Design: matched pairs. Response variable: difference in test scores. Treatment: the art course.
Suppose you are interested in comparing brand A interior enamel paint to brand B interior enamel paint. Design an experiment to determine which paint is better for painting kitchens.
Matched-pairs design because experimental units are paired up and there are only two levels of treatment.
Surveys tend to suffer from low response rates. Based on past experience, a researcher determines that the typical response rate for an email survey is 30%. She wishes to obtain a sample of 300 respondents, so she emails the survey to 2000 randomly selected email addresses. Assuming the response rate for her survey is 30%, will respondents form an unbiased sample?
No. The survey still suffers from undercoverage (sampling bias), nonresponse bias, and potentially response bias.
A polling organization conducts a study to estimate the percentage of households that have their children in private schools. It mails a questionnaire to 1034 randomly selected households across the United States and asks the head of each household if he or she has their children in private schools. Of the 1034 households selected, 12 responded. The survey design is flawed. Which of the following best describes the problem?
Nonresponse
Distinguish between nonsampling error and sampling error.
Nonsampling error is the error that results from undercoverage, nonresponse bias, response bias, or data-entry errors. Sampling error is the error that results because a sample is being used to estimate information about a population.
What does it mean when sampling is done without replacement?
Once an individual is selected, the individual cannot be selected again.
What is replication in an experiment?
Replication is applying each treatment to more than one experimental unit.
An anti-war advocate wants to estimate the percentage of people who favor cutting back on funds for the military. She conducts a nationwide survey of 1850 randomly selected adults 18 years and older. The interviewer asks the respondents, "Do you favor supporting peace by limiting our military involvement in world affairs?" The survey has bias. Which of the following best describes the problem?
Response bias
Suppose you are conducting a survey regarding violence towards siblings in the Durham school district. You obtain a cluster sample of 17 households within the Durham school district and sample all children in the randomly selected households. The survey is administered by the parents. The survey has bias. Which of the following best describes the problem?
Response bias
A school psychologist wants to test the effectiveness of a new method of teaching statistics. She recruits 600 second-grade students and randomly divides them into two groups. Group 1 is taught by means of the new method, while group 2 is taught via traditional methods. The same teacher is assigned to both groups. At the end of the year, an achievement test is administered and the results of the two groups are compared.
Response variable: the score on the achievement test. Explanatory variable manipulated: method of teaching (2 levels of treatment). Type of experimental design: completely randomized. Subjects: the 600 students.
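A completely randomized assignment like the one in this experiment might be sketched as follows. The student IDs, treatment labels, seed, and the function name `completely_randomized` are all assumptions for illustration.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle the experimental units, then deal them out evenly across treatments."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

students = list(range(1, 601))  # hypothetical IDs for the 600 students
groups = completely_randomized(students, ["new method", "traditional"], seed=42)
```

Every student has the same chance of landing in either group, and no characteristic of the student influences the assignment.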
The owner of a shopping mall wishes to expand the number of shops available in the food court. He has a market researcher survey the first 100 customers who come into the food court during weekday evenings to determine what types of food the shoppers would like to see added to the food court. (a) The survey has bias. Determine whether the flaw is due to the sampling method or the survey itself. What is the cause of the bias? Identify the cause of the error.
Sampling bias. Ask customers throughout the day on both weekdays and weekends.
A retail store manager wants to conduct a study regarding the shopping habits of his customers. He selects the first 60 customers who enter his store on a Saturday morning. What is the type of bias? What is a possible remedy?
Sampling bias. Use systematic random sampling.
Suppose that a magazine predicted that Candidate A would defeat Candidate B in a certain election. It conducted a poll of its subscribers with a response rate of 25%. On the basis of the results, the magazine predicted that Candidate A would win with 57% of the popular vote. However, Candidate B won the election with about 62% of the popular vote. At the time of this poll, most subscribers belonged to the party of Candidate A. Name two biases that led to this incorrect prediction.
Sampling bias: Using an incorrect frame led to undercoverage. Nonresponse bias: The low response rate caused bias.
Nonresponse bias
Bias that can arise when individuals chosen for the sample are unwilling or unable to participate in the survey.
Define statistics.
Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw a conclusion and answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.
To determine her breathing rate, Miranda divides up her day into three parts: morning, afternoon, and evening. She then measures her breathing rate at 3 randomly selected times during each part of the day. What type of sampling is used?
Stratified
To determine her stress level, Carrie divides up her day into three parts: morning, afternoon, and evening. She then measures her stress level at 2 randomly selected times during each part of the day. What type of sampling is used?
Stratified
For a poll of voters regarding a referendum calling for a national food and drug administration, design a sampling method to obtain the individuals in the sample. Which sampling method would most likely be used in a poll of voters regarding a referendum calling for a national food and drug administration?
Stratified random sampling. Because this is a national issue, voters within the same geographical location are likely to have similar views, so geographical regions make natural strata.
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at Daimler-Chrysler selects every 15th van that comes off the assembly line starting with the eighth until she obtains a sample of 80 vans. What type of sampling is used?
Systematic
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at General Foods selects every 19th soup can that comes off the assembly line starting with the fifth until she obtains a sample of 150 soup cans. What type of sampling is used?
Systematic
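The positions selected in the two assembly-line examples above follow directly from the starting item and the interval. A minimal sketch, where `systematic_positions` is a hypothetical helper name:

```python
def systematic_positions(start, k, n):
    """Positions of the items selected: the start-th item, then every k-th after it, n items in all."""
    return [start + i * k for i in range(n)]

vans = systematic_positions(start=8, k=15, n=80)    # 8, 23, 38, ..., 1193
cans = systematic_positions(start=5, k=19, n=150)   # 5, 24, 43, ..., 2836
```

Note that no list of all vans or cans is needed in advance, only the rule for picking them as they come off the line.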
Which sampling method does not require a frame?
Systematic
Consider the following question from a recent poll. Thinking about how the social security issue might affect your vote for major offices, would you vote only for a candidate who shares your views on social security or consider a candidate's position on social security as just one of many important factors? [rotated] Why is it important to rotate the two choices presented in the question?
The choices need to be rotated to minimize response biases.
A poll is being conducted at a mall to obtain a sample of the population of an entire country. What is the frame for this type of sampling? Who would be excluded from the survey and how might this affect the results of the survey?
The frame is the entire population of the country. Excluded: Any person who does not shop at the mall is excluded. This could result in sampling bias due to undercoverage.
What are the advantages of having a presurvey with open questions to assist in constructing a questionnaire that has closed questions?
The researcher can learn common answers. A presurvey could give the researcher an idea of what the most common responses are from a population. The researcher could then use these responses as the answers to closed questions in the actual survey.
Determine whether the quantitative variable is discrete or continuous: percentage of a car's surface which is rusted. Is the variable discrete or continuous?
The variable is continuous because it is not countable.
Determine whether the variable is qualitative or quantitative: grams of carbohydrates in a donut. Is the variable qualitative or quantitative?
The variable is quantitative because it is a numerical measure.
True or False. Inferences based on voluntary response samples are generally not reliable.
True, because it is often the case that the individuals who volunteer do not accurately represent the population. The statement is true because individuals that participate in the sample are not chosen using random sampling. Certain individuals may be more or less inclined to volunteer and this introduces biases (knowingly or unknowingly).
Grouping together similar experimental units and then randomly assigning the experimental units within each group to a treatment is called _________.
blocking
To determine customer opinion of its check-in service, American Airlines randomly selects 60 flights during a certain week and surveys all passengers on the flights. What type of sampling is used?
cluster
A(n) _________ is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.
cluster sample
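A cluster sample can be sketched as a random draw of whole groups. The flight numbers and passenger labels below are made up, and `cluster_sample` is a hypothetical helper.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly select whole clusters and keep every individual in each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # random sample of cluster names
    return {name: list(clusters[name]) for name in chosen}

# Made-up flights (clusters) and their passengers (individuals)
flights = {
    "flight_101": ["p1", "p2", "p3"],
    "flight_202": ["p4", "p5"],
    "flight_303": ["p6", "p7", "p8", "p9"],
    "flight_404": ["p10"],
}
surveyed = cluster_sample(flights, num_clusters=2, seed=7)
```

The randomness is applied to the clusters, not to the individuals: once a flight is chosen, all of its passengers are surveyed.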
Which allows the researcher to claim causation between an explanatory variable and a response variable?
designed experiment
A(n) _________ is a person or object that is a member of the population being studied. Choose the correct answer below.
individual
Sampling bias
is a bias in which a sample is collected in such a way that some members of the intended population are less likely to be included than others.
Define matched-pairs design.
A matched-pairs design is a special case of a randomized block design. It can be used when the experiment has only two treatment conditions and subjects can be grouped into pairs based on some blocking variable. Then, within each pair, subjects are randomly assigned to different treatments.
Define completely randomized design.
A completely randomized design is probably the simplest experimental design in terms of data analysis and convenience. With this design, subjects are randomly assigned to treatments.
Response bias
is the tendency of a person to answer questions on a survey untruthfully or misleadingly. For example, they may feel pressure to give answers that are socially acceptable.
Determine the level of measurement of the variable. Makes of a car
nominal
A frequency distribution lists the _________ of occurrences of each category of data.
number
Determine the level of measurement of the variable below. Highest degree conferred (high school, bachelor's, and so on)
ordinal
A _________ is a numerical summary of a population.
parameter
A relative frequency distribution lists the _________ of occurrences of each category of data.
proportion
IBM wants to administer a satisfaction survey to its current customers. Using their customer database, the company randomly selects 40 customers and asks them about their level of satisfaction with the company. What type of sampling is used?
simple random
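Drawing a simple random sample from a frame, as in the customer-database example, might look like the sketch below. The database size of 5000 and the ID format are assumptions; `random.sample` draws without replacement, so no customer is selected twice.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n individuals from the frame without replacement, each subset equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: a customer database with 5000 entries
customers = [f"customer_{i}" for i in range(1, 5001)]
selected = simple_random_sample(customers, n=40, seed=3)
```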
A _________ is a numerical summary of a sample.
statistic
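The difference between a parameter and a statistic can be shown with a toy population. The 20 weights below are invented values, and the sample size of 5 and the seed are arbitrary choices.

```python
import random
from statistics import mean

# Invented population: weights (in pounds) of all 20 residents of a very small town
population = list(range(150, 250, 5))   # 150, 155, ..., 245

parameter = mean(population)            # numerical summary of the population

rng = random.Random(0)
sample = rng.sample(population, 5)      # a simple random sample of 5 residents
statistic = mean(sample)                # numerical summary of the sample
```

The parameter is a single fixed number (197.5 here), while the statistic changes from sample to sample and only estimates it.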
A(n) _________ is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.
stratified sample
Why is it rare for frames to be completely accurate?
It is rare for frames to be accurate because frames are obtained periodically, whereas populations are constantly changing.
Define randomized block design.
In a randomized block design, the experimenter divides subjects into subgroups called blocks, such that the variability within blocks is less than the variability between blocks. Then, subjects within each block are randomly assigned to treatment conditions.
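Random assignment within blocks can be sketched as follows. The blocks (by experience level), subject labels, and treatment names are hypothetical, as is the helper name `randomized_block`.

```python
import random

def randomized_block(units_by_block, treatments, seed=None):
    """Within each block, shuffle the units and deal them out evenly across treatments."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in units_by_block.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomization happens separately inside each block
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks of similar subjects
blocks = {
    "novice": ["n1", "n2", "n3", "n4"],
    "expert": ["e1", "e2", "e3", "e4"],
}
assignment = randomized_block(blocks, ["A", "B"], seed=5)
```

Each block ends up with the same number of subjects on each treatment, so differences between blocks cannot be confused with differences between treatments.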
_________ are the characteristics of the individuals of the population being studied.
variables
True or False. When conducting a cluster sample, it is better to have fewer clusters with more individuals when the clusters are heterogeneous.
True, because when the clusters are heterogeneous, they are scaled down versions of the population. The statement is true because a heterogeneous cluster likely resembles the heterogeneity of the population. In other words, each cluster is a scaled-down representation of the overall population.