Stat 1342 - Unit I Exam Review
What is a closed question? What is an open question?
A closed question has fixed choices for answers, whereas an open question is a free-response question.
What is a designed experiment?
A designed experiment is when a researcher assigns individuals to a certain group, intentionally changes the value of an explanatory variable, and then records the value of the response variable for each group.
What is an observational study?
An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.
Define treatment
Any combination of the values of the factors (explanatory variables)
A mutual fund rating agency ranks a fund's performance by using one to five stars. A one-star mutual fund is in the bottom 20% of its investment class; a five-star mutual fund is in the top 20% of its investment class. Interpret the meaning of a four-star mutual fund.
A four-star fund is in the 4th quintile of the funds. That is, it is above the bottom 60%, but below the top 20% of the ranked funds.
Define experimental unit.
A person, object, or some other well-defined item upon which a treatment is applied
A(n) _________ is a person or object that is a member of the population being studied.
Individual
Determine the level of measurement of the variable below. Year of birth of college students
Interval
Determine the level of measurement of the variable. Years of elections: 1988, 1990, 1992, 1994, and 1996
Interval
What is a Pareto chart?
a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.
What is a frame?
a list of the individuals in the population being studied.
The mean finish time for a yearly amateur auto race was 186.35 minutes with a standard deviation of 0.341 minute. The winning car, driven by Ted, finished in 185.62 minutes. The previous year's race had a mean finishing time of 110.8 with a standard deviation of 0.145 minute. The winning car that year, driven by Kate, finished in 110.52 minutes. Find their respective z-scores. Who had the more convincing victory? a) Ted had a finish time with a z-score of b) Kate had a finish time with a z-score of c) Which driver had a more convincing victory?
a) -2.14 b) -1.93 c) Ted had the more convincing victory because his finish time has the lower z-score (-2.14 is further below the mean than -1.93).
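The z-score comparison can be checked with a short sketch. The function name `z_score` and the variable names are illustrative, not part of the exercise; the figures come from the question.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Finish times, means, and standard deviations from the exercise.
ted = z_score(185.62, 186.35, 0.341)
kate = z_score(110.52, 110.8, 0.145)

# A lower finish time is better, so the more negative z-score
# marks the more convincing victory.
more_convincing = "Ted" if ted < kate else "Kate"
print(round(ted, 2), round(kate, 2), more_convincing)
```

Because a race is won by the smallest time, the driver whose z-score is further below zero beat the field by more standard deviations.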
Explain the meaning of the following percentiles in parts (a) and (b). (a) The 10th percentile of the weight of males 36 months of age in a certain city is 12.0 kg. (b) The 90th percentile of the length of newborn females in a certain city is 53.8 cm.
a) 10% of 36-month-old males weigh 12.0 kg or less, and 90% of 36-month-old males weigh more than 12.0 kg. b) 90% of newborn females have a length of 53.8 cm or less, and 10% of newborn females have a length that is more than 53.8 cm.
The accompanying data represent the monthly rate of return of a certain company's common stock for the past few years. Complete parts (a) and (b) below. a) Determine and interpret the quartiles. b) Check the data set for outliers. Select the correct choice below and, if necessary, fill in the answer box to complete your choice.
a) 1st = -0.05 / 2nd = 0.035 / 3rd = 0.095 /Of the monthly returns, 25% are less than or equal to the first quartile, 50% are less than or equal to the second quartile, and 75% are less than or equal to the third quartile. b) 0.48
The data in the table represent the tuition for all 2-year community colleges in a region in 2014-2015. a) Construct a cumulative frequency distribution. b) Construct a cumulative relative frequency distribution.
a) For each class, add its frequency to the frequencies of all classes above it (a running total going down the table). b) Divide each cumulative frequency by the sum of all frequencies.
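A minimal sketch of both running totals. The frequencies below are hypothetical, since the tuition table from the exercise is not reproduced here.

```python
from itertools import accumulate

# Hypothetical class frequencies (NOT the tuition table from the exercise).
frequencies = [4, 10, 6]

# Cumulative frequency: running total down the table.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: each running total divided by the grand total.
total = sum(frequencies)
cumulative_relative = [c / total for c in cumulative]

print(cumulative)
print(cumulative_relative)
```

The last cumulative frequency always equals the total number of observations, and the last cumulative relative frequency always equals 1.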
The manager of a shopping mall wishes to expand the number of shops available in the food court. He has a market researcher survey the first 90 customers who come into the food court during weekday mornings to determine what types of food the shoppers would like to see added to the food court. Complete parts (a) and (b) below. a) What is the cause of the bias? b) Which of the following is the best way to remedy this problem?
a) sampling bias b) Ask customers throughout the day on both weekdays and weekends.
Define placebo
An innocuous medication, such as a sugar tablet, that looks, tastes, and smells like the experimental medication
What is the population in the study? What is the sample in the study? A polling organization contacts 1029 adult men who are 40 to 60 years of age and live in the United States and asks whether or not they had seen their family doctor within the past 6 months.
The population is all adult men who are 40 to 60 years of age and live in the United States. The sample is the 1029 adult men who are 40 to 60 years of age and live in the United States who were contacted.
Determine whether the variable is qualitative or quantitative. Hair color
Qualitative because it is an attribute characteristic.
Determine whether the variable is qualitative or quantitative. Favorite brand of computer
The variable is qualitative because it is an attribute characteristic.
Determine whether the variable is qualitative or quantitative. Street name of address
The variable is qualitative because it is an attribute characteristic.
Determine whether the variable is qualitative or quantitative. Amount of sugar in a bag
The variable is quantitative because it is a numerical measure
What is a bar graph?
A bar graph is a horizontal or vertical representation of the frequency or relative frequency of the categories. The height of each rectangle represents the category's frequency or relative frequency.
In Marissa's calculus course, attendance counts for 20% of the grade, quizzes count for 15% of the grade, exams count for 45% of the grade, and the final exam counts for 20% of the grade. Marissa had a 100% average for attendance, 89% for quizzes, 86% for exams, and 81% on the final. Determine Marissa's course average.
88.25%
Explain the difference between a population and a sample
A population is the entire group that is being studied while a sample is a subset of the population that is being studied
Define simple random sampling
A sample of size n from a population of size N is obtained through simple random sampling if every possible sample of size n has an equally likely chance of occurring.
Define factor
A variable whose effect on the response variable is to be assessed by the experimenter
Discuss the advantages and disadvantages of each type of question.
Closed questions are easier to analyze, but they limit the responses. Open questions allow respondents to state exactly how they feel, but are harder to analyze due to the variety of answers.
Which allows the researcher to claim causation between an explanatory variable and a response variable?
Designed experiment
Explain the difference between a single-blind and a double-blind experiment.
In a single-blind experiment, the subject does not know which treatment is received. In a double-blind experiment, neither the subject nor the researcher in contact with the subject knows which treatment is received.
Which is the superior observational study? Why?
Neither study is always superior to the other. Both have advantages and disadvantages that depend on the situation.
Determine the level of measurement of the variable. Hair colors
Nominal
It is extremely important for a researcher to clearly define the variables in a study because this helps to determine the type of analysis that can be performed on the data. For example, if a researcher wanted to describe people based on Social Security number, what level of measurement would the variable "Social Security number" be? Now suppose the researcher felt that certain people with a greater birth weight received higher numbers. Does the level of measurement of the variable change? If so, how?
Nominal / Yes it changes to ordinal
Distinguish between nonsampling error and sampling error.
Nonsampling error is the error that results from undercoverage, nonresponse bias, response bias, or data-entry errors. Sampling error is the error that results because a sample is being used to estimate information about a population.
What does it mean when sampling is done without replacement?
Once an individual is selected, the individual cannot be selected again.
Determine the level of measurement of the variable below. Assessed value of a house
Ratio
Determine whether the underlined numerical value is a parameter or a statistic. Explain your reasoning. In a phone survey of 100 random homes in a country, 21 % of families had garages.
Statistic because the data set of 100 random homes is a sample.
Define confounding.
The effects of two factors (explanatory variables) on the response variable cannot be distinguished.
Define response variable
The quantitative or qualitative variable for which the experimenter wishes to determine how its value is affected by the explanatory variable
Determine whether the study depicts an observational study or an experiment. Seventh-grade students are randomly divided into two groups. One group is taught English using traditional techniques. The other is taught English using a reform method. After 1 year, each is given an achievement test to compare its proficiency with that of the other group.
The study is an experiment because the researchers control one variable to determine the effect on the response variable.
Determine whether the study depicts an observational study or an experiment. Thirty university students are divided into two groups. One group receives free tutoring in mathematics, the other doesn't. After one semester, scores on final mathematical examinations are compared.
The study is an experiment because the researchers control one variable to determine the effect on the response variable.
Determine whether the underlined value is a parameter or a statistic. A study of 6,076 adults in public rest rooms found that 23% did not wash their hands before exiting.
The value is a statistic because the 6,076 adults in public rest rooms are a sample.
Determine whether the quantitative variable is discrete or continuous. Number of bacteria seen on a microscope slide
The variable is discrete because it is countable
Determine whether the quantitative variable is discrete or continuous. Number of days of rainfall in a city in one year
The variable is discrete because it is countable.
Determine whether the quantitative variable is discrete or continuous. Number of notes in a song
The variable is discrete because it is countable.
Determine whether the quantitative variable is discrete or continuous. Running time of a film
The variable is continuous because it is measured, not counted.
Violent crimes include rape, robbery, assault, and homicide. The following is a summary of the violent-crime rate (violent crimes per 100,000 population) for all states of a country in a certain year. Complete parts (a) through (d). Q1 = 272.8, Q2 = 387.9, Q3 = 529.7 a) Provide an interpretation of these results. Choose the correct answer below. b) Determine and interpret the interquartile range. c) The violent-crime rate in a certain state of the country in that year was 1,496. Would this be an outlier? d) Do you believe that the distribution of violent-crime rates is skewed or symmetric?
a) 25% of the states have a violent-crime rate that is 272.8 crimes per 100,000 population or less. 50% of the states have a violent-crime rate that is 387.9 crimes per 100,000 population or less. 75% of the states have a violent-crime rate that is 529.7 crimes per 100,000 population or less. b) 256.9 / The middle 50% of all observations have a range of 256.9 crimes per 100,000 population. c) The lower fence is -112.55 crimes per 100,000 population. The upper fence is 915.05 crimes per 100,000 population. / Yes, because it is greater than the upper fence. d) The distribution of violent-crime rates is skewed right.
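The interquartile range and the outlier fences can be reproduced with a small sketch. It assumes the usual 1.5 × IQR rule for fences; the helper name `is_outlier` is illustrative.

```python
# Quartiles from the exercise (violent crimes per 100,000 population).
q1, q3 = 272.8, 529.7

iqr = q3 - q1                    # spread of the middle 50% of the data
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

def is_outlier(x):
    """True when x falls below the lower fence or above the upper fence."""
    return x < lower_fence or x > upper_fence

print(round(iqr, 2), round(lower_fence, 2), round(upper_fence, 2))
print(is_outlier(1496))
```

A negative lower fence is not a problem here: no crime rate can fall below it, so only unusually high values can be flagged.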
A concrete mix is designed to withstand 3000 pounds per square inch (psi) of pressure. The following data represent the strength of nine randomly selected casts (in psi). 3950, 4080, 3300, 3200, 2930, 3830, 4080, 4050, 3700 Compute the mean, median and mode strength of the concrete (in psi). a) Compute the mean strength of the concrete. b) Compute the median strength of the concrete. c) Compute the mode strength of the concrete.
a) 3680 b) 3830 c) 4080
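These three measures of center can be checked with Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Strengths of the nine casts from the exercise, in psi.
strengths = [3950, 4080, 3300, 3200, 2930, 3830, 4080, 4050, 3700]

mean_strength = statistics.mean(strengths)      # sum divided by the count
median_strength = statistics.median(strengths)  # middle value once sorted
mode_strength = statistics.mode(strengths)      # most frequent value

print(mean_strength, median_strength, mode_strength)
```

With nine observations the median is simply the fifth value in sorted order, and 4080 is the only value that appears more than once.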
Researchers wanted to test the effectiveness of a new drug therapy for treating patients with allergies. To do this, they identified 120 patients with a diagnosis of allergies. Patients were randomly assigned to one of three treatment groups. Forty patients were randomly assigned to receive the new drug therapy, another 40 received the older drug therapy, and the final 40 received a placebo therapy. To measure the effectiveness of the treatment, researchers scored each patient on a standardized rating scale for allergies. After collecting and comparing the scores for the three treatment groups, the researchers concluded that the new drug therapy is significantly more effective than both the older drug therapy and the placebo therapy in the treatment of allergies. Complete parts (a) through (f). a) What type of experimental design is this? b) What is the population being studied? c) What is the response variable in this study? d) What are the treatment(s)? e) Identify the experimental units. Choose the correct answer below. f) Which figure illustrates the design of this experiment?
a) Completely randomized design b) All patients with a diagnosis of allergies c) The score on the standardized rating scale for allergies d) The new drug therapy, the older drug therapy, and the placebo therapy e) The 120 patients with a diagnosis of allergies f) circle picture
A polling organization conducts a study to estimate the percentage of households that have more than one computer. It mails a questionnaire to 1030 randomly selected households across the country and asks the head of each household if he or she has more than one computer. Of the 1030 households selected, 13 responded. a) Which of these best describes the bias in the survey? b) How can the bias be remedied?
a) Nonresponse bias b) The polling organization should try contacting households that do not respond by phone or face-to-face.
As part of a college literature course, students must select three classic works of literature from the provided list and complete critical book reviews for each selected work. Write a short description of the processes that can be used to generate a simple random sample of three books. Obtain a simple random sample of size 3 from this list. a) Which of the following would produce a simple random sample? Select all that apply. b) Use the portion of the random number table provided below to obtain a simple random sample of size 3 from this list. If you start on the left and take the first three numbers between 1 and 9, what three books would be selected from the numbered list?
a) Number the books from 1 to 9 and use a random number table to produce 3 different one-digit numbers corresponding to the books selected. / List each book on a separate piece of paper, place them all in a hat, and pick three. b) Pride and Prejudice, The Sun Also Rises, Crime and Punishment
A marketing research firm wants to determine the most effective method of selling kitchen utensils: print, radio, television, or online. They recruit 470 volunteers to participate in the study. The researcher segments the volunteers by age. Of the 470 volunteers, 130 are under age 20, 120 are 20-39 years old, 130 are 40-59 years old, and 90 are 60 years old or older. The volunteers from each group are randomly assigned to either the print advertising group, the radio group, the television group, or the online group. Each group is exposed to the advertising. After 1 hour, a recall exam is given with the proportion of correct answers recorded. Complete parts (a) through (f) below. a) What type of experimental design is this? b) What is the response variable in this experiment? c) What is the explanatory variable that is manipulated and set at various levels? d) How many levels of treatment are there? e) What variable serves as the block?
a) Randomized block design b) The score on the recall exam c) The type of advertising d) 4 e) Age
An anti-drunk driving advocate wants to estimate the percentage of people who favor increasing the penalty for drunk driving. She conducts a nationwide survey of 1830 randomly selected adults 18 years and older. The interviewer asks the respondents, "Do you favor harsher penalties for those convicted for drunk driving?" a) Which of these best describes the bias in the survey? b) How can the bias be remedied?
a) Response bias b) The interviewer should reword the question.
To help assess student learning in her music theory courses, a music professor at a community college implemented pre- and post-tests for her music theory students. A knowledge-gained score was obtained by taking the difference of the two test scores a) What type of experimental design is this? b) What is the response variable in this experiment? c) What is the treatment?
a) matched-pairs design b) difference in test scores c) music theory course
To determine if topiramate is an effective treatment for alcohol dependence, researchers conducted a 14-week trial of 371 men and women aged 18 to 65 years diagnosed with alcohol dependence. In this double-blind, randomized, placebo-controlled experiment, subjects were randomly given either 300 milligrams (mg) of topiramate (183 subjects) or a placebo (188 subjects) daily, along with a weekly compliance enhancement intervention. The variable used to determine the effectiveness of the treatment was self-reported percentage of heavy drinking days. Results indicated that topiramate was more effective than placebo at reducing the percentage of heavy drinking days. The researchers concluded that topiramate is a promising treatment for alcohol dependence. Complete parts (a) through (f). a) What does it mean for the experiment to be placebo-controlled? b) What does it mean for the experiment to be double-blind? Why do you think it is necessary for the experiment to be double-blind? c) What does it mean for the experiment to be randomized? d) What is the population for which this study applies? What is the sample? e) What are the treatments? f) What is the response variable?
a) The experiment has a control group that takes a placebo, an innocuous treatment that looks like the medication but contains no active drug. This control group serves as a baseline that can be compared with the group taking topiramate. b) Neither the subject nor the researcher knows which treatment the subject is receiving. / The experiment is double-blind so that the subjects receiving the medication do not behave differently and so the individual monitoring the subjects does not treat those receiving medication differently from those receiving a placebo. c) It means that the subjects are randomly assigned to take either the topiramate or the placebo. d) The population is all men and women aged 18 to 65 years with alcohol dependence. The sample is the 371 men and women aged 18 to 65 years diagnosed with alcohol dependence. e) 300 mg of topiramate or a placebo daily, and a weekly compliance enhancement intervention f) Percentage of heavy drinking days
A researcher wanted to determine the number of televisions in households. He conducts a survey of 40 randomly selected households and obtains the data in the accompanying table. Complete parts (a) through (h) below. a) Are these data discrete or continuous? Explain. d) What percentage of households in the survey have three televisions? e) What percentage of households in the survey have four or more televisions? h) Describe the shape of the distribution.
a) The given data are discrete because they can only have whole number values. d) 20% e) 7.5% h) skewed right
A community college has 7648 students currently enrolled in classes. To gain the students' opinions about an upcoming building project, the college president wishes to obtain a simple random sample of 8 students. He numbers the students from 1 to 7648. Complete parts (a) and (b) below. a) Using the provided random number table, the president closes his eyes and drops his ink pen on the table. It points to the digit in row 2, column 4. Using this position as the starting point and proceeding downward, determine the numbers for the 8 students who will be included in the sample. b) The president uses technology to produce the following random numbers. 5457, 7077, 2537, 4266, 5520, 7077, 1457, 876, 6761, 2537. Determine the numbers for 8 students who will be included in the sample.
a) The numbers for the students are 7538, 2636, 4027, 1048, 4652, 5819, 6902, 7454. b) The numbers for the students are 5457, 7077, 2537, 4266, 5520, 1457, 876, 6761.
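Part (b) amounts to sampling without replacement: read the generated numbers in order, skip repeats and anything outside 1 to 7648, and stop at 8. A minimal sketch, with `select_sample` as a hypothetical helper name:

```python
def select_sample(random_numbers, population_size, sample_size):
    """Keep the first distinct, in-range numbers until the sample is full."""
    chosen = []
    for number in random_numbers:
        in_range = 1 <= number <= population_size
        if in_range and number not in chosen:   # skip repeats (no replacement)
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# Random numbers produced by technology, as listed in the exercise.
generated = [5457, 7077, 2537, 4266, 5520, 7077, 1457, 876, 6761, 2537]
sample = select_sample(generated, population_size=7648, sample_size=8)
print(sample)
```

The second 7077 and the second 2537 are discarded because a student who has already been selected cannot be selected again.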
a) Which position had the most MVPs? b) How many MVPs played first base(1B)? c) How many more MVPs played outfield (OF) than first base? d) There are three outfield positions (left field, center field, right field). Given this, how might the graph be misleading?
a) The position with the most MVPs was outfield (OF). b) 15 c) 20 d) The chart seems to show that one position has many more MVPs because three positions are combined into one. They should be separated.
Researchers wanted to determine if having a video game console in the bedroom is associated with obesity. The researchers administered a questionnaire to 391 twelve-year-old adolescents. After analyzing the results, the researchers determined that the body mass index of the adolescents who had a video game console in their bedroom was significantly higher than that of the adolescents who did not have a video game console in their bedroom. Complete parts (a) through (e) below. a) Why is this an observational study? What type of observational study is this? b) What is the response variable in the study? Is the response variable qualitative or quantitative? What is the explanatory variable? c) Can you think of any lurking variables that may affect the results of the study? d) In the report, the researchers stated, "These results remain significant after adjustment for socioeconomic status." What does this mean? e) Does a video game console in the bedroom cause a higher body mass index? Explain.
a) The researchers administered a questionnaire to obtain their data without trying to influence an explanatory variable of the study. Cross-sectional study b) The response variable is the body mass index of the adolescents. The response variable is quantitative. The explanatory variable is whether the adolescent has a video game console in the bedroom or not. c) Yes. For example, possible lurking variables might be eating habits and the amount of exercise per week. d) The researchers made an effort to avoid confounding by accounting for potential lurking variables. e) No. Because this is an observational study, the researchers cannot claim causation. The results show only that a video game console in the bedroom is associated with a higher body mass index.
Researchers wanted to determine if there was an association between the level of trauma of an individual and their risk of diabetes. The researchers studied 1954 people over the course of 11 years. During this 11-year period, they interviewed the individuals and asked questions about their daily lives and the hassles they face. In addition, hypothetical scenarios were presented to determine how each individual would handle the situation. These interviews were videotaped and studied to assess the emotions of the individuals. The researchers also determined which individuals in the study experienced any type of diabetes over the 11-year period. After their analysis, the researchers concluded that the trauma-free individuals were less likely to experience diabetes. Complete parts (a) through (c). a) What type of observational study was this? Explain. b) What is the response variable? What is the explanatory variable? c) In the report, the researchers stated that "the research team also hasn't ruled out that a common factor like genetics could be causing both the emotions and the diabetes." Explain what this sentence means. Choose the correct answer below.
a) This was a cohort study, because a group of individuals was identified and then observed over a long period of time (11 years). b) The response variable is whether or not diabetes was contracted, because it is the variable of interest. The explanatory variable is the level of trauma, because it affects the other variable. c) The researchers may be concerned with confounding that occurs when the effects of two or more explanatory variables are not separated or when there are some explanatory variables that were not considered in a study, but that affect the value of the response variable.
Researchers wanted to determine if there was an association between daily cantaloupe consumption and the occurrence of skin cancer. The researchers looked at 93,284 women and asked them to report their cantaloupe-eating habits. The researchers also determined which of the women had nonmelanoma skin cancer. After their analysis, the researchers concluded that consumption of two or more servings of cantaloupe per day was associated with a reduction in nonmelanoma skin cancer. Complete parts (a) through (c) below. a) What type of observational study was this? Explain. b)What is the response variable in the study? Is the response variable qualitative or quantitative? What is the explanatory variable? c)In their report, the researchers stated that "After adjusting for various demographic and lifestyle variables, daily consumption of two or more servings was associated with a 30% reduced prevalence of nonmelanoma skin cancer." Why was it important to adjust for these variables?
a) This was a cross-sectional study because the information about the individuals was collected at a specific point in time. b) The response variable is whether the woman has nonmelanoma skin cancer or not. The response variable is qualitative. The explanatory variable is the woman's daily cantaloupe consumption. c) Adjusting for these variables helps avoid confounding: demographic and lifestyle variables are potential lurking variables that could also affect the occurrence of skin cancer.
To determine customer opinion of their pricing, Amtrak randomly selects 70 trains during a certain week and surveys all passengers on the trains. What type of sampling is used?
cluster
A(n) _________ is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.
cluster sample
A radio station asks its listeners to call in their opinion regarding the use of pesticides in residential areas. What type of sampling is used?
convenience
Relative Frequency calculation
frequency / sum of all frequencies
What is a case-control study?
observational studies that are retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records.
What is a cross-sectional study?
observational studies that collect information about individuals at a specific point in time or over a very short period of time.
A(n) _________ is a numerical summary of a population.
parameter
Apple wants to administer a satisfaction survey to its current customers. Using their customer database, the company randomly selects 60 customers and asks them about their level of satisfaction with the company. What type of sampling is used?
simple random
A(n) ________ is a numerical summary of a sample.
statistic
To determine her body temperature, Carrie divides up her day into three parts: morning, afternoon, and evening. She then measures her body temperature at 3 randomly selected times during each part of the day. What type of sampling is used?
stratified
A(n) _________ is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.
stratified sample
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at Daimler-Chrysler selects every 15th van that comes off the assembly line, starting with the fourth, until she obtains a sample of 60 vans. What type of sampling is used?
systematic
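A systematic sample is fully determined by its starting position, its step, and its size. The sketch below reads the exercise as every 15th van starting with the 4th until 60 vans are chosen; `systematic_sample` is a hypothetical helper name.

```python
def systematic_sample(start, step, sample_size):
    """Positions start, start + step, start + 2*step, ... (sample_size of them)."""
    return [start + step * i for i in range(sample_size)]

# Assumed reading of the exercise: start at van 4, take every 15th, stop at 60 vans.
positions = systematic_sample(start=4, step=15, sample_size=60)
print(positions[:3], positions[-1], len(positions))
```

Under that reading the manager inspects vans 4, 19, 34, and so on, which means at least 889 vans must come off the line before the sample is complete.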
sample variance
s²
Statistics
the science of collecting, organizing, summarizing, and analyzing information to draw a conclusion and answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.
Find the population mean or sample mean as indicated. Population: 1, 8, 17, 10, 24
μ = 12 (the population mean is denoted by the Greek letter μ)
_________ are the characteristics of the individuals of the population being studied.
variables
Find the population mean or sample mean as indicated. Sample: 19, 16, 2, 3, 10
x̄ = 10 (the sample mean is denoted by x̄)
Fill in the blank. The _______ represents the number of standard deviations an observation is from the mean.
z-score
population variance
σ²
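The two variance symbols differ in their divisor: σ² divides the sum of squared deviations by N, while s² divides by n - 1. A small sketch using Python's standard `statistics` module; the data values are chosen only for illustration.

```python
import statistics

# Illustrative data only; treat it first as a whole population, then as a sample.
data = [1, 8, 17, 10, 24]

population_variance = statistics.pvariance(data)  # sigma squared: divide by N
sample_variance = statistics.variance(data)       # s squared: divide by n - 1

print(population_variance, sample_variance)
```

For the same numbers, s² is always at least as large as σ² because the sum of squared deviations is divided by a smaller number.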
