Stats final MA 180 UAB

Suppose the equation of a least-squares regression line is ŷ = −3.17 − 2.4x. What can be said about the y-intercept?

-3.17

Suppose two events E and F are disjoint. What is​ P(E and​ F)?

0

The probability of observing a particular value of a continuous random variable​ _______.

0

According to the Empirical​ Rule, 68% of the area under the normal curve is within one standard deviation of the mean. What percent of the area under the normal curve is more than one standard deviation from the​ mean?

32%

Can a qualitative variable have values that are​ numeric? Why or why​ not?

Yes. Numeric values can act as labels without counting or measuring anything (for example, a rating from 1 to 5).

An experiment was performed to look at the reflectivity of different paints used on roads. Four paints were used​ (call them Paint​ 1, Paint​ 2, Paint​ 3, and Paint​ 4). Twenty-four sections of roads with similar travel patterns and weather patterns were used. Each type of paint was randomly assigned to one of the 24 sections of road so that each paint type was used on 6 different road sections. The percent of reflectivity after 6 months was determined and recorded. How many degrees of freedom does the​ F-statistic have in this​ problem?

3 numerator and 20 denominator
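The degrees of freedom above follow from the one-way ANOVA rule stated later in this set: groups minus 1 for the numerator, total sample size minus groups for the denominator. A minimal sketch, where the function name is only an illustration:

```python
# Degrees of freedom for a one-way ANOVA F-statistic.
# numerator   = number of groups - 1
# denominator = total sample size - number of groups
def anova_degrees_of_freedom(num_groups, total_n):
    numerator_df = num_groups - 1
    denominator_df = total_n - num_groups
    return numerator_df, denominator_df

# Paint experiment: 4 paints, 24 road sections in total.
paint_df = anova_degrees_of_freedom(num_groups=4, total_n=24)
print(paint_df)  # (3, 20)
```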

John performed a​ one-sample z-test for proportions and obtained a​ p-value of 0.35. John decided to reject the null hypothesis. What is the probability John made a Type I​ Error?

0.35

After constructing any relative frequency​ distribution, what should be the sum of the relative​ frequencies?

1 or 100%

Suppose the probability that a randomly selected​ man, aged​ 55-59, will die of cancer during the course of the year is 300/100,000. How would you find the probability that at least 1 man out of​ 1,000 of this age will die of cancer during the course of the​ year?

1-(0.997)^1000
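The expression above uses the complement rule: P(at least one) = 1 − P(none). A short sketch of the arithmetic, assuming the 1,000 men are independent of one another (the variable names are illustrative):

```python
# Complement rule: P(at least one death) = 1 - P(no deaths).
# Assumes the 1,000 men die (or not) independently of each other.
p_death = 300 / 100_000          # 0.003 per man per year
n_men = 1_000

p_none = (1 - p_death) ** n_men  # (0.997)^1000
p_at_least_one = 1 - p_none

print(round(p_at_least_one, 4))  # roughly 0.95
```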

What critical value should be used to construct a​ 90% confidence interval for the population mean when the population standard deviation is​ known?

1.645

What critical value should be used to construct a​ 95% confidence interval for the population mean when the population standard deviation is​ known?

1.96

According to the Empirical​ Rule, 68% of the area under the normal curve is within one standard deviation of the mean. What percent of the area under the normal curve is more than one standard deviation above the​ mean?

16%

What critical value should be used to construct a​ 99% confidence interval for the population mean when the population standard deviation is​ known?

2.58
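The z critical values on these cards (1.645, 1.96, 2.58) can be checked with Python's standard library. A minimal sketch; the helper name z_critical is an assumption made for illustration:

```python
from statistics import NormalDist

# Two-sided z critical value: leave (1 - confidence) / 2 in each tail
# of the standard normal curve.
def z_critical(confidence):
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```

The 99% value, 2.576, is the same number the card rounds to 2.58.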

According to the Empirical​ Rule, 95% of the area under the normal curve is within two standard deviations of the mean. What percent of the area under the normal curve is more than two standard deviations from the​ mean?

5%
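The Empirical Rule answers in this set all come from the same arithmetic: subtract the area inside from 1, and halve the result when only one tail is wanted. A small sketch (the function names are illustrative):

```python
# Area outside k standard deviations, given the area within.
def area_outside(area_within):
    return 1 - area_within

# By symmetry of the normal curve, each tail gets half of the outside area.
def area_one_tail(area_within):
    return area_outside(area_within) / 2

print(round(area_outside(0.68), 2))   # 0.32, more than 1 SD from the mean
print(round(area_one_tail(0.68), 2))  # 0.16, more than 1 SD above the mean
print(round(area_outside(0.95), 2))   # 0.05, more than 2 SDs from the mean
```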

A random sample of 25 students in an Introductory Statistics course were asked how many hours of sleep they got last night. The average of these 25 students was 5.4 hours with a standard deviation of 1.3 hours. Suppose all conditions are met for inference using the​ one-sample t-methods. Calculate the upper bound of a​ 95% confidence interval for the mean number of hours students in the Introductory Statistics course slept last night.

5.9366 hours [stat crunch: Stats > t Tests > One Sample > with summary > input values]
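The same upper bound can be reproduced by hand with the one-sample t-interval formula, mean ± t* · s/√n. This sketch hard-codes t* = 2.0639, the t-table value for 24 degrees of freedom at 95% confidence, rather than computing it:

```python
import math

n = 25
sample_mean = 5.4   # hours
sample_sd = 1.3     # hours
t_star = 2.0639     # from a t-table: 24 df, 95% confidence (hard-coded assumption)

standard_error = sample_sd / math.sqrt(n)   # 1.3 / 5 = 0.26
margin = t_star * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

print(round(lower, 4), round(upper, 4))  # 4.8634 5.9366
```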

A medical study was investigating if getting a flu shot actually reduced the risk of developing the flu. A hypothesis test is performed. Suppose the null hypothesis was rejected with a​ p-value of 0.0002. The power of the test was 0.90. What type of error could be made and what is the probability of making that​ error?

A Type I error could be made with a probability of 0.0002.

A confidence interval for a population mean​ __________.

A confidence interval for a population mean gives a range of plausible values for the true population mean at a stated level of confidence.

A critical value is _____________.

A critical value is the number of standard errors​ (or standard​ deviations) to move from the mean of a sampling distribution to correspond to a specified level of confidence.

Which distribution shape​ (skewed left, skewed​ right, or​ symmetric) is most likely to result in the mean being substantially smaller than the​ median?

A distribution that is skewed left will likely have a mean that is smaller than the median since the extreme values in the tail tend to pull the mean to the left.

Why is it important that the relationship between the explanatory and response variable be linear when performing a linear regression​ analysis?

A linear regression analysis relies on a straight line being fit between the points on a scatterplot.

When should a paired​ t-test be performed instead of a​ two-sample t-test?

A paired​ t-test should be performed instead of a​ two-sample t-test when each observation in one group has a dependence on a particular observation in the other group.

When should a paired​ t-test be​ performed?

A paired​ t-test should be performed when the variable of interest is​ quantitative, there are two groups being​ compared, and the samples taken are dependent.

Which of the following statements is true about a normal density curve as σ increases? A) The curve becomes more spread out. B) The curve becomes less spread out. C) There is no change in the spread of the curve. D) There is not enough information to determine the effect on the spread of the curve.

A) The curve becomes more spread out.

It is hypothesized that 50% of Americans attend church regularly. Which of the following would be an example of making a Type I Error?

A study was conducted that had evidence to reject the null hypothesis. In​ reality, half of Americans actually do attend church regularly.

Which of the following statements is not a requirement for a probability density function or state that they all are. A) The curve must be symmetric and centered at zero. B) The total area under the curve must equal one. C) Every point on the curve must be on or above the​ x-axis. D) These are all requirements for a probability density function

A) The curve must be symmetric and centered at zero.

Identify which of the following statements about the graph of a probability density function is true or state that they are both true or neither are true. A) The graph must always be on or above the horizontal axis. B) The graph must always be to the right of the vertical axis. C) Both of the first two statements are true. D) Neither of the first two statements is true.

A) The graph must always be on or above the horizontal axis.

An investigator conducts an experiment with four treatment groups. The response variable is growth of a plant during the experiment. She performs an​ F-test. What is the null hypothesis the researcher is testing with the​ F-test?

All four treatment groups have the same average growth of the plants during the experiment.

Which of the following statements is not true about binomial probability​ distributions?

As the probability of success​ increases, the probability distribution for a binomial variable becomes bell shaped.

An investigator conducts an experiment with four treatment groups. The response variable is growth of a plant during the experiment. She performs an​ F-test. What is the alternative hypothesis the researcher is testing with the​ F-test?

At least one treatment group has a different average growth of plants during the​ experiment, but not all four necessarily have different mean growths.

A​ p-value is the probability​ _____________.

A​ p-value is the probability of observing the actual​ result, a sample​ mean, for​ example, or something more unusual just by chance if the null hypothesis is true

Identify which of the following is not a property of the standard normal curve. A) The​ mean, median, and mode are all equal to zero. B) It has inflection points at μ ± 2σ. C) It is symmetric about its mean of​ zero, and has standard deviation equal to 1. D) As the value of z​ increases, the graph​ approaches, but never​ equals, zero.

B) It has inflection points at μ ± 2σ.

Which of the following would increase the width of a confidence interval for a population​ mean? A) Decrease the sample standard deviation. B) Increase the sample size C) Increase the level of confidence D) All of the above

C) Increase the level of confidence

Which of the following is a property of the standard normal​ curve, but not necessarily a property of every normal​ curve? A) The​ mean, median, and mode are all equal. B) The area under the curve is one. C) The mean is zero and the standard deviation is one. D) The curve is symmetric about the mean.

C) The mean is zero and the standard deviation is one.

Suppose every student in a class is surveyed and it is reported that​ 75% of the class plans to take another math class. Is this an example of descriptive or inferential​ statistics? Explain.

Descriptive statistics; the results for the class are described without making any generalizations about the population of all students at the school.

Researchers conducted a study and obtained a​ p-value of 0.75. Based on this​ p-value, what conclusion should the researchers​ draw?

Fail to reject the null hypothesis but do not accept the null hypothesis as true either.

Type II error

Failing to reject a false null hypothesis

Which of the following statements about probability is not​ true?

For any event E, 0 < P(E) < 1, where P(E) is the probability of event E.

It is recommended that adults get 8 hours of sleep each night. A researcher hypothesized college students got less than the recommended number of hours of sleep each​ night, on average. The researcher randomly sampled 50 college students and calculated a sample mean of 7.9 hours per night. The researcher performed a hypothesis test. What is the null​ hypothesis?

H0: μx=8 hours per night

Alex hypothesized​ that, on​ average, students study less than the recommended two hours per credit hour each week outside of class. Which of the following is​ Alex's alternative​ hypothesis?

H1: μ<2 hours per week per credit

Why does the formula for calculating the sample​ variance, involve squaring the difference between each value and the​ mean?

If the differences were not squared, the sum of all deviations from the mean would always be zero since the positive deviations are balanced by the negative deviations
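A quick numeric check of this point, using a small made-up data set (the values are hypothetical, chosen only so the arithmetic is clean):

```python
# Hypothetical data set; the mean is 5.
data = [2, 4, 4, 5, 10]
mean = sum(data) / len(data)

deviations = [x - mean for x in data]           # -3, -1, -1, 0, 5
sum_of_deviations = sum(deviations)             # positives cancel negatives
sum_of_squared_deviations = sum(d ** 2 for d in deviations)

print(sum_of_deviations)          # 0.0
print(sum_of_squared_deviations)  # 36.0
```

Squaring keeps every term non-negative, so the total measures spread instead of cancelling out.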

In the formula for the sample variance, why do we divide by n − 1 instead of n?

If the formula divided by n, the sample variance would be biased and consistently underestimate the population variance

Suppose every student in a class is surveyed and it is found that​ 75% of the class plans to take another math class. It is reported that​ 75% of all students at the school plan to take another math class. Is this an example of descriptive or inferential​ statistics? Explain.

Inferential​ statistics; the results of the class sample are extended to make a generalization about the population of all students at the school.

A student wondered if more than​ 10% of students enrolled in an introductory Chemistry class dropped before the midterm. He noticed that 2 out of 15 of his friends in the class dropped before the midterm. Based on his​ sample, he performed a hypothesis test. Is the hypothesis test a​ one-tailed or​ two-tailed test?

It is a​ one-tailed test since the alternative hypothesis states that the parameter is greater than the hypothesized value.

Suppose the equation of a least-squares regression line is ŷ = −3.17 − 2.4x. What can be said about the correlation coefficient?

It is​ negative, but its exact value cannot be determined from the given information.

When analyzing two quantitative​ variables, what is the first thing that should be​ done?

Make a scatterplot.

The probability that a randomly selected adult in a particular community is a smoker is​ 20%. The probability that a randomly selected adult in the community is a​ smoker, given that the adult earns more than​ $75,000 per​ year, is​ 10%. Are the events​ "is a​ smoker" and​ "earns more than​ $75,000 per​ year" independent? Explain.

No, because the probability of smoking is different for people who earn over​ $75,000 per​ year, the events are not independent.
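The independence check here is a comparison of P(A) with P(A | B): the events are independent only if the two are equal. A minimal sketch with the numbers from the question:

```python
import math

p_smoker = 0.20                    # P(smoker)
p_smoker_given_high_income = 0.10  # P(smoker | earns more than $75,000)

# Events are independent exactly when P(A | B) equals P(A).
independent = math.isclose(p_smoker, p_smoker_given_high_income)

print(independent)  # False
```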

A survey found that​ 5% of adults have not visited a dentist in the last five years. Suppose you ask 50 adults selected at random if they have visited a dentist in the last five years. Should a normal distribution be used to approximate the distribution of the random variable x that counts the number of adults who have not visited a dentist in the last five​ years?

No; since np < 5, the normal distribution should not be used.
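The check behind this answer is a rule of thumb: the normal approximation to the binomial is reasonable only when both np and n(1 − p) are at least 5. A short sketch, where the cutoff of 5 is the one this card uses:

```python
n = 50     # adults asked
p = 0.05   # probability an adult has not visited a dentist in five years

expected_successes = n * p        # np
expected_failures = n * (1 - p)   # n(1 - p)

# Rule of thumb from the card: both expected counts should be at least 5.
normal_ok = expected_successes >= 5 and expected_failures >= 5

print(expected_successes, normal_ok)  # 2.5 False
```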

Data were collected on many different variables of a fast food​ chain's sandwiches several years ago. Two variables were the serving size​ (in ounces) of a sandwich and the number of calories in the sandwich. A hungry customer wanted to estimate the number of calories in a sandwich based on its serving size. With this in​ mind, which variable would go on the​ y-axis in the​ scatterplot?

Number of calories goes on the​ y-axis, since it is the response variable.

In a normal​ distribution, approximately​ 68% of the area under the normal curve is within how many standard​ deviation(s) of the​ mean?

One

The ____ is/are the entire group of individuals or items being studied

Population

In​ regression, what is the proportion of variation in the response variable that is explained by the regression model​ called?

R^2

Type I error

Rejecting a true null hypothesis

Suppose you want to know if more technical service calls are made to homes with cable television or with satellite dish television. Should you use frequencies or relative frequencies to make the​ comparison? Why?

Relative frequencies should be used since there is likely a difference in the number of users of cable and satellite television. If you make comparisons using​ frequencies, the results can be very misleading for different population sizes.

Jan performed a study and obtained a​ p-value of 1.24. What conclusion should Jan​ make?

She made an error since it is not possible to get a​ p-value of 1.24.

Which measure of center must be equal to an actual data​ value? Explain why.

Since the mode is the most frequent observation that occurs in the data​ set, it must be an actual value from the data set

Which of the following statements correctly describes the complement of event​ E?

The complement of event E is the set of outcomes which are in the sample space but not in event E.

April calculated a correlation coefficient between sex and GPA as −0.25. She said there is a weak correlation between a​ person's sex and their GPA. Which of the following is an appropriate comment about​ April's statement?

The correlation coefficient does not make sense to describe the relationship between a categorical and quantitative variable.

What is the definition of the correlation​ coefficient?

The correlation coefficient is a measure that describes the direction and strength of the linear relationship between two quantitative variables.

A collection of data on class sizes at a community college produces the​ five-number summary below. Comment on the shape of the distribution of class sizes. Min=12 Q1=22 Q2=35 Q3=38 Max=40

The distribution appears to be skewed left since the median is further from the first quartile than the third quartile.​ Also, the left whisker would be longer than the right whisker in a boxplot for the data.

The following​ five-number summary represents the annual snowfall totals for a Midwest town for the last 75 years. Comment on the shape of the distribution of snowfall totals. Min=14 Q1=17 Q2=21 Q3=29 Max=38

The distribution appears to be skewed right since the median is closer to the first quartile than the third quartile. Also, the right whisker would be longer than the left whisker in a boxplot of the data.

How is the​ best-fitting line between the points in a scatterplot​ defined?

The line that gives the smallest sum of the squared vertical distances between each point and the line
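To make the "smallest sum of squared vertical distances" idea concrete, the sketch below fits a line to five made-up points with the usual slope and intercept formulas, then confirms that nudging the line makes the sum of squared errors larger. The data and function names are hypothetical.

```python
# Hypothetical (x, y) points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Least-squares slope and intercept.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Sum of squared vertical distances between the points and a given line.
def sum_squared_errors(b1, b0):
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

best = sum_squared_errors(slope, intercept)
print(round(slope, 3), round(intercept, 3), round(best, 3))  # 0.6 2.2 2.4
print(sum_squared_errors(slope + 0.1, intercept) > best)     # True
print(sum_squared_errors(slope, intercept - 0.5) > best)     # True
```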

What is the mean of a probability​ distribution?

The mean is the expected value of the random variable.

Identify which statement about the mean of a discrete random variable is not true or state that they are all true.

The mean must be a possible value of the random variable.

Which measure of center​ (mean or​ median) is​ resistant? Explain what it means for that measure to be resistant.

The median is resistant because it is not sensitive to extreme values in the data set. If the largest observation was​ doubled, for​ example, the median would not change since that largest value does not factor into its computation.
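The doubling example in this answer can be tried directly. A small sketch with made-up data:

```python
from statistics import mean, median

original = [3, 5, 7, 9, 11]   # hypothetical data
doubled = [3, 5, 7, 9, 22]    # same data with the largest value doubled

print(median(original), median(doubled))  # 7 7
print(mean(original), mean(doubled))      # the mean shifts from 7 to 9.2
```

The median stays at 7 while the mean is pulled upward, which is what "resistant" means here.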

Suppose x̄ = 60, H0: μx = 50, HA: μx > 50, and the p-value from a one-sample test is 0.04. What does this p-value mean?

The probability of getting a sample mean of 60 or more if the true population mean is 50 is 0.04.

When looking at a scatterplot of two quantitative​ variables, what do we typically look​ for?

The relationship between the two variables and if there are any deviations from the pattern​ (outliers or clusters of​ points, for​ example).

Is the average body temperature of humans really 98.6°​F? After sampling​ 15,600 healthy people from around the​ country, researchers found a sample mean of 98.5°F. The​ p-value was 0.0001. Which of the following is​ true?

The results are​ "statistically significant" because the sample size was quite large and the​ p-value was quite small.

Describe the sample variance in words rather than with a formula.

The sample variance is the sum of the squared deviations from the mean, divided by (n − 1).
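The verbal description translates line for line into code. This sketch uses a small hypothetical data set and compares the hand calculation with Python's statistics module; the sample standard deviation is just the square root of the result.

```python
import math
import statistics

data = [2, 4, 4, 5, 10]   # hypothetical data; the mean is 5
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean, divided by (n - 1).
sample_variance = sum((x - mean) ** 2 for x in data) / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(sample_variance, sample_sd)               # 9.0 3.0
print(statistics.variance(data) == sample_variance)  # True
```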

In the least-squares regression equation ŷ = 33.967 + 11.358x, what is 11.358?

The slope of the least squares regression line

Identify the requirements for a discrete probability distribution.

The sum of the probabilities must equal one. Each probability must be between zero and one inclusive

Which of the following is NOT a condition of the Analysis of Variance​ model?

The treatment group means must fall on a straight line.

Which of the following is not a criterion for the binomial​ distribution?

The trials must be dependent.

What is wrong with the following definition of the correlation​ coefficient? The correlation coefficient measures the strength and direction of the linear relationship between two variables.

The two variables must be quantitative.

If​ someone's gross annual income has a​ z-score of positive​ 2, what can be​ concluded?

Their income is 2 standard deviations above the mean income

Gina calculated a correlation coefficient between hours studied and grade point average as​ +0.75. Which of the following is a correct statement based on this correlation​ coefficient?

There is a fairly strong positive relationship between hours studied and grade point​ average, indicating that grade point averages tend to be higher for students who study more.

What does a correlation coefficient of 0​ indicate?

There is no linear relationship between the two quantitative variables.

Researchers timed 21 subjects as they tried to complete​ paper-and-pencil mazes. Each subject attempted a maze both with and without the presence of a floral aroma. Subjects were randomized with respect to which trial they did first. Suppose a paired​ t-test is to be performed to determine whether there is evidence to indicate that the time to complete the maze is faster in scented trials compared to unscented​ trials, on average. The​ p-value from the paired​ t-test is 0.11. Which of the following is the most appropriate conclusion based on this​ p-value?

There is not sufficient evidence to indicate that the individuals complete mazes faster with a floral aroma present compared to when no floral aroma is​ present, on average.

A certain marathon has had a wheelchair division since 1977. An interested fan wondered who is​ faster: the​ men's marathon winner or the​ women's wheelchair marathon​ winner, on average. A paired​ t-test was​ performed, and the​ p-value was found to be 0.001. Which of the following is the correct​ conclusion?

There is sufficient evidence to indicate that the​ men's running winning time and the​ women's wheelchair winning time each year are​ different, on average.

Brett is a huge sports fan. He hypothesized half of sports fans liked football the​ best, one-quarter liked baseball the​ best, 15% liked basketball the​ best, and​ 5% liked hockey the​ best, and the rest liked some other sport the best. He surveyed 100 sports fans and asked what sport they liked the best. Assuming all conditions are​ satisfied, which of the following tests should Brett use to test his​ hypothesis?

The​ goodness-of-fit chi-square test

Why does sample size need to be accounted for in the​ t-distribution?

The​ t-distribution changes for different sample sizes.

Suppose you want to calculate the z-score for your height. How will the z-scores compare if you use your height in inches versus centimeters?

The​ z-scores will be the same regardless of the unit used for your height because​ z-scores are unitless.
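A quick way to see that z-scores are unitless is to compute one in both units. The heights below are hypothetical, and the helper z_score is only an illustration:

```python
import math
from statistics import mean, stdev

heights_in = [62, 65, 68, 70, 75]            # hypothetical heights in inches
heights_cm = [h * 2.54 for h in heights_in]  # same heights in centimeters

# z-score: how many standard deviations a value lies from the mean.
def z_score(value, values):
    return (value - mean(values)) / stdev(values)

z_inches = z_score(70, heights_in)
z_cm = z_score(70 * 2.54, heights_cm)

print(math.isclose(z_inches, z_cm))  # True
```

The conversion factor multiplies both the deviation and the standard deviation, so it cancels in the ratio.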

In a normal distribution, approximately 99.7% of the area under the normal curve is within how many standard deviation(s) of the mean?

Three

Explain how to find the mean of a discrete random variable.

To find the mean of a random​ variable, multiply each value of the random variable by its probability and then add those products.
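The multiply-then-add recipe can be sketched as follows, using a made-up probability distribution (the values and probabilities are hypothetical):

```python
import math

# Hypothetical discrete distribution: value -> probability.
distribution = {0: 0.2, 1: 0.5, 2: 0.3}

# Requirements for a discrete probability distribution:
# each probability is between 0 and 1, and they sum to 1.
assert all(0 <= p <= 1 for p in distribution.values())
assert math.isclose(sum(distribution.values()), 1.0)

# Mean (expected value): multiply each value by its probability, then add.
expected_value = sum(x * p for x, p in distribution.items())

print(round(expected_value, 2))  # 1.1
```

Note that 1.1 is not a possible value of the variable, which matches the card stating the mean need not be one.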

True or​ false? A histogram and a relative frequency​ histogram, constructed from the same​ data, always have the same basic shape.

True. A relative frequency histogram will have a different scale on the y-axis but the same shape as a regular histogram.

In a normal​ distribution, approximately​ 95% of the area under the normal curve is within how many standard​ deviation(s) of the​ mean?

Two

A research organization keeps track of what citizens think is the most important problem facing the country today. They randomly sampled a number of people in 2003 and again in 2009 using a different random sample of people in 2009 than in 2003 and asked them to choose the most important problem facing the country today from the following​ choices, war,​ economy, health​ care, or other. Which of the following is the correct test to use to determine if the distribution of​ "problem facing this country​ today" is different between the two different​ years?

Use a​ chi-square test of homogeneity.

When is it appropriate to use the pooled​ two-sample t-methods?

Use the pooled​ two-sample t-methods when the samples come from different populations with the​ same, or nearly the​ same, standard deviations.

A graduate student wanted to estimate the average time spent studying among graduate students at her school. She randomly sampled graduate students from her school and obtained a​ 99% confidence interval of​ (17.3,22.5) hours/week. In the context of the​ problem, which of the following interpretations is​ correct?

We are​ 99% sure that the average amount of time spent studying among graduate students at this​ student's school is between 17.3 and 22.5 hours per week

In a​ chi-square test, when would the null hypothesis be​ true?

When all observed counts are the same as their expected counts

When will a​ chi-square statistic be​ 0?

When all observed counts are the same as their expected counts
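The chi-square statistic is the sum of (observed − expected)² / expected over all categories, so it is 0 exactly when every observed count matches its expected count. A minimal sketch with hypothetical counts:

```python
# Chi-square statistic: sum over categories of (O - E)^2 / E.
def chi_square_statistic(observed, expected):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected = [50, 25, 15, 10]           # hypothetical expected counts
perfect_match = [50, 25, 15, 10]      # observed equals expected
some_mismatch = [45, 30, 15, 10]      # two categories differ by 5

print(chi_square_statistic(perfect_match, expected))  # 0.0
print(chi_square_statistic(some_mismatch, expected))  # 1.5
```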

When are conclusions said to be​ "statistically significant"?

When the​ p-value is less than a given significance level

Elmo likes music. He wondered if listening to music while studying will improve scores on an exam. Fifty students who were to take the midterm in a week agreed to be part of a study. Half were randomly assigned to listen to classical music while studying for the exam. The other half were told not to listen to any music while studying for the exam. A hypothesis test is to be performed to determine if the average scores of those listening to music while studying for the exam were higher than those who did not listen to any music while studying for the exam. Which of the following hypothesis tests should be​ used?

a​ two-sample t-test

The​ _________________ is/are a subset of the population that is being studied.

sample

The probability of obtaining x successes in n independent trials of a binomial experiment is given by P(x) = nCx · p^x · (1 − p)^(n − x), where p is the probability of success. What does the n − x represent in the formula?

the number of failures

The​ F-statistic in a​ one-way Analysis of Variance problem has how many numerator degrees of​ freedom?

the number of groups being compared minus 1

The probability of obtaining x successes in n independent trials of a binomial experiment is given by P(x) = nCx · p^x · (1 − p)^(n − x), where p is the probability of success. What does nCx represent in the formula?

the number of ways to get x successes in n trials

The probability of obtaining x successes in n independent trials of a binomial experiment is given by P(x) = nCx · p^x · (1 − p)^(n − x), where p is the probability of success. What does (1 − p)^(n − x) represent in the formula?

the probability of failure raised to the number of failures

The probability of obtaining x successes in n independent trials of a binomial experiment is given by P(x) = nCx · p^x · (1 − p)^(n − x), where p is the probability of success. What does p^x represent in the formula?

the probability of success raised to the number of successes
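The three pieces of the binomial formula named on these cards (nCx, p^x, and (1 − p)^(n − x)) map directly onto code. A sketch using hypothetical values n = 10, x = 3, p = 0.5:

```python
import math

def binomial_probability(n, x, p):
    ways = math.comb(n, x)                  # nCx: ways to get x successes in n trials
    success_part = p ** x                   # probability of success, x times
    failure_part = (1 - p) ** (n - x)       # probability of failure, n - x times
    return ways * success_part * failure_part

# Hypothetical example: exactly 3 successes in 10 trials with p = 0.5.
prob = binomial_probability(n=10, x=3, p=0.5)
print(prob)  # 0.1171875, which is 120 / 1024
```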

Describe the sample standard deviation in words rather than with a formula.

The sample standard deviation is the square root of the sum of the squared deviations from the mean divided by (n − 1).

What does the standard error of the distribution of sample means​ estimate?

the standard deviation of the distribution of sample means

The​ F-statistic in a​ one-way Analysis of Variance problem has how many denominator degrees of​ freedom?

the total sample size of all groups combined minus the number of groups being compared

It is assumed that approximately​ 15% of adults in the U.S. are​ left-handed. Consider the probability that among 100 adults selected in the​ U.S., there are at least 30 who are​ left-handed. Given that the adults surveyed were selected without​ replacement, can the probability be found by using the binomial probability formula with x counting the number who are​ left-handed? Why or why​ not?

​Yes, because the 100 adults represent less than​ 5% of the U.S. adult​ population, the trials can be treated as independent.

Cuckoos lay their eggs in the nests of other​ (host) birds. The eggs are then adopted and hatched by the host​ birds, but the potential host birds lay eggs of different sizes. A random sample of sparrow host eggs and wagtail host eggs was taken and the length of the cuckoo eggs for each host was recorded. Based on the sample​ data, suppose a​ 95% confidence interval for the difference in mean lengths of cuckoo eggs​ (sparrow hosts−wagtail ​hosts) is ​(−0.6, −0.1) mm. Is there evidence at the​ 5% significance level to indicate that cuckoos do change the size of their eggs between sparrow and wagtail​ hosts, on​ average?

​Yes, since 0 is not between the lower and upper bounds of the confidence interval.

A professor wondered if there was a difference in the proportion of students who dropped math classes between females and males. The professor randomly selected 20 math classes around campus and recorded the gender of the individual and whether or not a student enrolled in the class at the beginning of the term dropped the class at some point during the term. Assuming all conditions are​ satisfied, which of the following tests should the researcher​ use?

​two-sample z-test for proportions

