stats final

A sample regression equation is given by \(\hat{y} = 155 - 34x_1 - 12x_2\). If \(x_1 = 3\) and \(x_2 = 2\), the predicted value of y would be ________.

29. The predicted value is computed by substituting the given values of \(x_1\) and \(x_2\) into the regression equation \(\hat{y} = b_0 + b_1x_1 + b_2x_2\): \(\hat{y} = 155 - 34(3) - 12(2) = 29\).

We can use the ______ transformation, x = µ + zσ, to compute x values for given probabilities.

We can use the *inverse* transformation, x = µ + zσ, to compute x values for given probabilities.

In the following table, likely voters' preferences of two candidates are cross-classified by gender.

| | Male | Female |
|---|---|---|
| Candidate A | 150 | 130 |
| Candidate B | 100 | 120 |

The p-value is _______________.

between 0.05 and 0.10. The test statistic is \(\chi^2 = 3.247\) with df = (2 − 1)(2 − 1) = 1, and the p-value can be found using the \(\chi^2_{df}\) distribution table, Excel, or R.
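The calculation behind this card can be sketched in Python. This is only an illustration: the function names are hypothetical, and the p-value helper relies on the identity P(χ² > x) = erfc(√(x/2)), which holds for one degree of freedom only.

```python
import math

def chi_square_independence(table):
    """Return (statistic, df, expected) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency under independence: (row total)(column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, expected

def chi_square_sf_df1(x):
    """Right-tail probability for df = 1 only (assumption: not valid for other df)."""
    return math.erfc(math.sqrt(x / 2))

# Rows: Candidate A, Candidate B; columns: Male, Female
voters = [[150, 130], [100, 120]]
stat, df, expected = chi_square_independence(voters)
p_value = chi_square_sf_df1(stat)
print(round(stat, 4), df, round(p_value, 4))
```

For tables with more than one degree of freedom, a statistics library or the χ² table would be needed for the p-value.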

For the goodness-of-fit test, the expected category frequencies are found using the _________________________.

proportions specified under the null hypothesis. The expected frequencies, ei = npi, are computed under the assumption that the proportions specified in the null hypothesis are correct.

The following scatterplot indicates that the relationship between the two variables x and y is ________.

strong and positive

The standard error of the estimate measures ________.

the variability of the observed y-values around the predicted y-values. The standard error of the estimate, \(s_e\), is a point estimate of the standard deviation of the random error, which reflects the variability of the observed y-values around the predicted y-values.

If a test statistic has a value of X and is assumed to be \(\chi^2\) distributed with df degrees of freedom, then the p-value for a right-tailed test can be found by using Excel's command ________________________.

'=CHISQ.DIST.RT(X, Deg_freedom)'. Excel's command '=CHISQ.DIST.RT(X, Deg_freedom)' returns the right-tailed probability of the chi-squared distribution.

If we reject a null hypothesis at the 1% significance level, then we have _________ evidence that the null hypothesis is false.

If we reject a null hypothesis at the 1% significance level (α = 0.01), then we have *very strong* evidence that the null hypothesis is false.

The null hypothesis typically corresponds to a presumed default state of nature.

true. We think of the null hypothesis as corresponding to a presumed default state of nature or status quo.

A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up.

| Day of the Week | Observed |
|---|---|
| Monday | 192 |
| Tuesday | 189 |
| Wednesday | 202 |
| Thursday | 199 |
| Friday | 218 |

At the 5% significance level, the critical value is _______.

9.488. The critical value \(\chi^2_{\alpha,df}\) for the right-tailed test can be found using the \(\chi^2_{df}\) distribution table; here α = 0.05 and df = k − 1 = 5 − 1 = 4.

A(n) ______ error is committed when we reject the null hypothesis when the null hypothesis is true.

A *Type I* error is committed when we reject the null hypothesis when the null hypothesis is true.

A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the model: Salary = β0 + β1 Service + ε. The following ANOVA table summarizes a portion of the regression results.

| | df | SS | MS | F |
|---|---|---|---|---|
| Regression | 1 | 555,420 | 555,420 | 7.64 |
| Residual | 27 | 1,962,873 | 72,699 | |
| Total | 28 | 2,518,293 | | |

| | Coefficients | Standard Error | t-stat | p-value |
|---|---|---|---|---|
| Intercept | 784.92 | 322.25 | 2.44 | 0.02 |
| Service | 9.19 | 3.20 | 2.87 | 0.01 |

Which of the following is the predicted monthly salary of an employee who has worked for 48 months at the bank?

$1,226. Use the regression equation \(\hat{y} = b_0 + b_1x\) with the given coefficients and x = 48: \(\hat{y} = 784.92 + 9.19(48) = 1226.04\).

Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and commercial construction. The regression uses lumber sales (in $100,000s) as the response variable with housing starts (in 1,000s) and commercial construction (in 1,000s) as the explanatory variables. The estimated model is Lumber Sales = β0 + β1 Housing Starts + β2 Commercial Construction + ε. The following ANOVA table summarizes a portion of the regression results.

| | df | SS | MS | F |
|---|---|---|---|---|
| Regression | 2 | 180,770 | 90,385 | 103.3 |
| Residual | 45 | 39,375 | 875 | |
| Total | 47 | 220,145 | | |

| | Coefficients | Standard Error | t-stat | p-value |
|---|---|---|---|---|
| Intercept | 5.37 | 1.71 | 3.14 | 0.0030 |
| Housing Starts | 0.76 | 0.09 | 8.44 | 0.0000 |
| Commercial Construction | 1.25 | 0.33 | 3.78 | 0.0005 |

If Housing Starts were 17,000 and Commercial Construction was 3,200, the best estimate of Lumber Sales would be ________.

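$2,229,000. Use the regression equation \(\hat{y} = b_0 + b_1x_1 + b_2x_2\) with the given coefficients, \(x_1 = 17\), and \(x_2 = 3.2\) (both in 1,000s): \(\hat{y} = 5.37 + 0.76(17) + 1.25(3.2) = 22.29\). Because lumber sales are measured in $100,000s, this corresponds to $2,229,000.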

A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following ANOVA table shows a portion of the regression results.

| | df | SS | MS | F |
|---|---|---|---|---|
| Regression | 1 | 78.53 | 78.53 | 3.58 |
| Residual | 23 | 504.02 | 21.91 | |
| Total | 24 | 582.55 | | |

| | Coefficients | Standard Error | t-stat | p-value |
|---|---|---|---|---|
| Intercept | 40.10 | 14.08 | 2.848 | 0.0052 |
| Advertising | 2.88 | 1.52 | 1.895 | 0.0608 |

Which of the following is the predicted Sales for a firm with Advertising of $500?

$54,500. Use the regression equation \(\hat{y} = b_0 + b_1x\) with the given coefficients. Advertising of $500 corresponds to x = 5 (in $100s), so \(\hat{y} = 40.10 + 2.88(5) = 54.5\) (in $1,000s).

In the following table, individuals are cross-classified by their age group and income level.

| Age group | Low income | Medium income | High income |
|---|---|---|---|
| 21-35 | 120 | 100 | 75 |
| 36-50 | 150 | 160 | 100 |
| 51-65 | 160 | 180 | 160 |

Which of the following is the estimated joint probability for the "low income and 21-35 age group" cell?

0.0996. The joint probability of the cell is computed as the observed cell frequency divided by the sample size: 120/1,205 = 0.0996.

Suppose the round-trip airfare between Boston and Orlando follows the normal probability distribution with a mean of $387.20 and a standard deviation of $68.50. What is the probability that a randomly selected airfare between Boston and Orlando will be between $325 and $425?

0.5275. The normal transformation implies that any value x of X has a corresponding value z of Z given by \(z = \frac{x - \mu}{\sigma}\). Compute P($325 ≤ X ≤ $425) using a z table that provides cumulative probabilities P(Z ≤ z) for positive and negative values of z, with \(P(z_1 \le Z \le z_2) = P(Z \le z_2) - P(Z \le z_1)\). The appropriate Excel function is =NORM.DIST(425,387.2,68.5,TRUE)-NORM.DIST(325,387.2,68.5,TRUE) = 0.5275.
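As a rough cross-check of the Excel formula, the same probability can be sketched in Python using only the standard library. The helper names `normal_cdf` and `normal_between` are illustrative, not part of any library.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a normal distribution, via the error function."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_between(low, high, mean, sd):
    """P(low <= X <= high) = P(X <= high) - P(X <= low)."""
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)

# Airfare example from the card: mean 387.20, standard deviation 68.50
airfare_prob = normal_between(325, 425, 387.20, 68.50)
print(round(airfare_prob, 4))
```

Here `normal_cdf(x, mean, sd)` plays the role of Excel's NORM.DIST(x, mean, sd, TRUE).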

According to the 2011 Gallup daily tracking polls (www.gallup.com, February 3, 2012), Mississippi is the most conservative U.S. state, with 53.4 percent of its residents identifying themselves as conservative. What is the probability that at least 100 but fewer than 115 respondents of a random sample of 200 Mississippi residents identify as conservative?

0.7099. If \(\bar{P}\) is normal, we can transform it into a standard normal random variable as \(Z = \frac{\bar{P} - p}{\sqrt{p(1-p)/n}}\), so any value \(\bar{p}\) of \(\bar{P}\) has a corresponding value \(z = \frac{\bar{p} - p}{\sqrt{p(1-p)/n}}\). Compute \(P(0.5 \le \bar{P} < 0.575)\), noting that \(P(z_1 \le Z \le z_2) = P(Z \le z_2) - P(Z \le z_1)\), and use the z table. The appropriate Excel function is =NORM.DIST(115/200,0.534,SQRT(0.534*(1-0.534)/200),TRUE)-NORM.DIST(100/200,0.534,SQRT(0.534*(1-0.534)/200),TRUE) = 0.7099.
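The standardization of the sample proportion can be sketched as follows. This is a minimal illustration with hypothetical helper names; it assumes the normal approximation to the sampling distribution of the sample proportion is adequate, as the card does.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal random variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def proportion_standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1.0 - p) / n)

p, n = 0.534, 200
se = proportion_standard_error(p, n)

# Convert the respondent counts 100 and 115 into sample proportions, then into z values
z_low = (100 / n - p) / se
z_high = (115 / n - p) / se
prob = standard_normal_cdf(z_high) - standard_normal_cdf(z_low)
print(round(se, 4), round(z_low, 2), round(z_high, 2), round(prob, 4))
```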

Professor Elderman has given the same multiple-choice final exam in his Principles of Microeconomics class for many years. After examining his records from the past 10 years, he finds that the scores have a mean of 76 and a standard deviation of 12. What is the probability Professor Elderman's class of 36 has a class average below 78?

0.8413. If \(\bar{X}\) is normal, we can transform it into a standard normal random variable as \(Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\), so any value \(\bar{x}\) of \(\bar{X}\) has a corresponding value \(z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\). Compute \(P(\bar{X} < 78)\): here \(z = \frac{78 - 76}{12/\sqrt{36}} = 1\). Use the z table. The appropriate Excel function is =NORM.DIST(78,76,12/SQRT(36),TRUE) = 0.8413.

The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company.

| Race | Coordinator | Analyst | Manager | Director |
|---|---|---|---|---|
| White | 32 | 20 | 25 | 9 |
| Black | 35 | 10 | 25 | 5 |
| Hispanic | 32 | 15 | 13 | 2 |
| Asian | 10 | 11 | 10 | 0 |

Assuming that race and seniority are independent, which of the following is the expected frequency of Asian directors?

1.95. The expected frequency for a test of independence is computed as \(e_{ij} = \frac{(\text{Row } i \text{ total})(\text{Column } j \text{ total})}{\text{Sample size}}\). Here the Asian row total is 31, the Director column total is 16, and the sample size is 254, so the expected frequency is (31)(16)/254 = 1.95.

In the following table, individuals are cross-classified by their age group and income level.

| Age group | Low income | Medium income | High income |
|---|---|---|---|
| 21-35 | 120 | 100 | 75 |
| 36-50 | 150 | 160 | 100 |
| 51-65 | 160 | 180 | 160 |

Assuming age group and income are independent, the expected "low income and 21-35 age group" cell frequency is ________.

105.27. Using the expected joint probability \(p_{ij}\) of the cell, the expected frequency is computed as \(e_{ij} = np_{ij}\). This can also be found as \(e_{ij} = \frac{(\text{Row } i \text{ total})(\text{Column } j \text{ total})}{\text{Sample size}}\); here (295)(430)/1,205 = 105.27.
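The row-total-times-column-total rule can be applied to every cell at once. The sketch below uses the age group and income table from this card; `expected_frequencies` is a hypothetical helper name chosen for illustration.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: (row total)(column total) / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: age groups 21-35, 36-50, 51-65; columns: low, medium, high income
age_income = [
    [120, 100, 75],
    [150, 160, 100],
    [160, 180, 160],
]

n = sum(sum(row) for row in age_income)
expected = expected_frequencies(age_income)

# Observed joint probability and expected frequency for the "low income, 21-35" cell
joint_prob = age_income[0][0] / n
print(n, round(joint_prob, 4), round(expected[0][0], 2))
```

The expected counts always add up to the sample size, which is a quick sanity check on the arithmetic.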

Daily mail is delivered to your house between 1:00 p.m. and 5:00 p.m. Assume delivery times follow the continuous uniform distribution. Determine the percentage of mail deliveries that are made after 4:00 p.m.

25%. For any subinterval [c, d] of the interval [a, b], the probability is computed as \(P(c \le X \le d) = \frac{d - c}{b - a}\). Here \(P(4 \le X \le 5) = \frac{5 - 4}{5 - 1} = 0.25\), or 25%.

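A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below.

| Suit | Observed |
|---|---|
| Spades | 410 |
| Hearts | 405 |
| Clubs | 370 |
| Diamonds | 415 |

For the goodness-of-fit test, the value of the test statistic is _____.

3.125. With an expected frequency of 1,600(0.25) = 400 for each suit, \(\chi^2 = \sum \frac{(o_i - e_i)^2}{e_i} = \frac{100 + 25 + 900 + 225}{400} = 3.125\).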
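A short sketch of the goodness-of-fit statistic for the card-dealing data follows. It assumes equal hypothesized proportions of 1/4 per suit, as implied by dealing at random from an infinite deck; the function name is illustrative.

```python
def goodness_of_fit_statistic(observed, proportions):
    """Chi-square goodness-of-fit statistic: sum of (o_i - e_i)^2 / e_i with e_i = n * p_i."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, proportions))

# Observed counts for spades, hearts, clubs, diamonds
suits = [410, 405, 370, 415]
hypothesized = [0.25] * len(suits)

stat = goodness_of_fit_statistic(suits, hypothesized)
df = len(suits) - 1  # k - 1 degrees of freedom for a goodness-of-fit test
print(stat, df)
```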

The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals.

| Class | Observed | Expected \(p_i\) |
|---|---|---|
| Height < 150 | 10 | 0.09 |
| 150 ≤ Height < 160 | 6 | 0.21 |
| 160 ≤ Height < 170 | 18 | 0.31 |
| 170 ≤ Height < 180 | 17 | 0.25 |
| Height ≥ 180 | 9 | 0.14 |

For the heights subdivided into five intervals, the expected frequency of males who are shorter than 150 cm is _____.

5.4. The expected frequency for the goodness-of-fit test for normality is computed as \(e_i = np_i\); here 60(0.09) = 5.4.

A statistics student is asked to estimate y = β0 + β1x + ε. She calculates the following values: \(\sum x_i = 220\), \(\sum (x_i - \bar{x})^2 = 440\), \(\sum (x_i - \bar{x})(y_i - \bar{y}) = -568\), \(\sum y_i = 385\), \(\sum (y_i - \bar{y})^2 = 1{,}120\), n = 11. Which of the following is the predicted value of y if x equals 2?

58.22. The slope \(b_1\) and the intercept \(b_0\) of the simple regression equation are computed as \(b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}\) and \(b_0 = \bar{y} - b_1\bar{x}\). The simple linear regression equation is \(\hat{y} = b_0 + b_1x\). Here \(b_1 = -568/440 \approx -1.29\), \(\bar{x} = 20\), \(\bar{y} = 35\), and \(b_0 = 35 - (-1.29)(20) = 60.80\), so \(\hat{y} = 60.80 - 1.29(2) = 58.22\).
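The slope and intercept formulas can be checked numerically from the summary sums. In this sketch the function name is hypothetical; note that the card's 58.22 appears to come from rounding the slope to −1.29 before continuing, while carrying full precision gives a slightly different value.

```python
def least_squares_from_sums(sum_x, sum_y, sxx, sxy, n):
    """Return (b0, b1) from sum of x, sum of y, sum of squared x deviations, and cross-deviations."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Summary values from the card
b0, b1 = least_squares_from_sums(sum_x=220, sum_y=385, sxx=440, sxy=-568, n=11)
y_hat = b0 + b1 * 2

# Same calculation with the slope rounded to two decimals first, as in the card
b1_rounded = round(b1, 2)
b0_rounded = 385 / 11 - b1_rounded * (220 / 11)
y_hat_rounded = b0_rounded + b1_rounded * 2

print(round(y_hat, 2), round(y_hat_rounded, 2))
```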

On average, a certain kind of kitchen appliance requires repairs once every four years. Assume that the times between repairs are exponentially distributed. Which of the following Excel commands computes the probability that the appliance will work no more than three years without requiring repairs?

EXPON.DIST(3,0.25,1). Excel's command EXPON.DIST requires three arguments: the x value, λ, and the true/false value "Cumulative." Here the rate is λ = 1/4 = 0.25 repairs per year and x = 3.
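For reference, the cumulative exponential probability that this Excel call returns is \(P(X \le x) = 1 - e^{-\lambda x}\). A minimal Python sketch, with an illustrative function name:

```python
import math

def exponential_cdf(x, rate):
    """P(X <= x) for an exponential distribution with the given rate (lambda)."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-rate * x)

# One repair every four years on average, so the rate is 1/4 per year
rate = 1 / 4
prob = exponential_cdf(3, rate)
print(round(prob, 4))
```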

Which of the following identifies the range for a correlation coefficient?

Any value between −1 and 1. The range of the sample correlation coefficient rxy is −1 ≤ rxy ≤ 1.

Suppose you want to determine if gender and major are independent. Which of the following tests should you use?

Chi-square test for independence. The chi-square test of a contingency table—also called a test of independence—analyzes the relationships between two qualitative variables.

Consider the sample regression equation: \(\hat{y} = 12 + 2x_1 - 6x_2 + 6x_3 + 2x_4\). When \(x_1\) increases 1 unit and \(x_2\) increases 2 units, while \(x_3\) and \(x_4\) remain unchanged, what change would you expect in the predicted y?

Decrease by 10. In the multiple linear regression model, each coefficient measures the change in the predicted value of the response variable \(\hat{y}\) given a unit increase in the associated explanatory variable, holding all other explanatory variables constant. Here the change is 2(1) − 6(2) = −10.

In the following table, individuals are cross-classified by their age group and income level.

| Age group | Low income | Medium income | High income |
|---|---|---|---|
| 21-35 | 120 | 100 | 75 |
| 36-50 | 150 | 160 | 100 |
| 51-65 | 160 | 180 | 160 |

To test that age group and income are independent, the null and alternative hypotheses are _________________________________________________________________________.

H0: Age group and income are independent; HA: Age group and income are dependent. The null hypothesis for the chi-square test for independence implies that two variables are independent and the alternative hypothesis implies that two variables are dependent.

In the following table, likely voters' preferences of two candidates are cross-classified by gender.

| | Male | Female |
|---|---|---|
| Candidate A | 150 | 130 |
| Candidate B | 100 | 120 |

To test that gender and candidate preference are independent, the null hypothesis is ___________________________.

H0: Gender and candidate preference are independent. The null hypothesis for the chi-square test for independence implies that two variables are independent and the alternative hypothesis implies that two variables are dependent.

A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up.

| Day of the Week | Observed |
|---|---|
| Monday | 192 |
| Tuesday | 189 |
| Wednesday | 202 |
| Thursday | 199 |
| Friday | 218 |

For the goodness-of-fit test, the null and alternative hypotheses are ________________________________________.

H0: p1 = p2 = p3 = p4 = p5 = 1/5; HA: Not all population proportions are equal to 1/5. When setting up the competing hypotheses for a multinomial experiment, we have essentially two choices. We can set all population proportions equal to some specific value, or equal to one another. The sum of the category probabilities for a multinomial experiment is p1 + p2 + ... + pk = 1.

A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The table below shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college.

| College | Observed | Proportion |
|---|---|---|
| 1 | 457 | 0.20 |
| 2 | 206 | 0.08 |
| 3 | 301 | 0.13 |
| 4 | 792 | 0.29 |
| 5 | 336 | 0.15 |
| 6 | 373 | 0.15 |

For the goodness-of-fit test, the alternative hypothesis states that __________________________________________.

HA: At least one of the population proportions is different from its hypothesized value. For the goodness-of-fit test, the alternative hypothesis is that at least one of the proportions is different from its hypothesized value.

A television network is deciding whether or not to give its newest television show a spot during prime viewing time at night. If this is to happen, it will have to move one of its most viewed shows to another slot. The network conducts a survey asking its viewers which show they would rather watch. The network receives 827 responses, of which 428 indicate they would like to see the new show in the lineup. Which of the following is an appropriate set of hypotheses to test if the television network should give its newest show a spot during prime time at night?

H0: p ≤ 0.50, HA: p > 0.50. The competing hypotheses are H0: p ≤ p0, HA: p > p0. This is referred to as a right-tailed test of the population proportion.

The Boston public school district has had difficulty maintaining on-time bus service for its students ("A Year Later, School Buses Still Late," Boston Globe, October 5). Suppose the district develops a new bus schedule to help combat chronic lateness on a particularly woeful route. Historically, the bus service on the route has been, on average, 12 minutes late. After the schedule adjustment, the first 36 runs were an average of eight minutes late. As a result, the Boston public school district claimed that the schedule adjustment was an improvement—students were not as late. Assume a population standard deviation for bus arrival time of 12 minutes. Which of the following can be used to determine whether the schedule adjustment reduced the average lateness time of 12 minutes?

H0: μ ≥ 12, HA: μ < 12. To establish whether the mean is less than some value μ0, the following hypothesis test should be performed: H0: μ ≥ μ0, HA: μ < μ0.

For a multinomial experiment, which of the following is not true?

The trials are dependent. A multinomial experiment consists of a series of n independent and identical trials of a random experiment.

When testing whether the correlation coefficient differs from zero, the value of the test statistic is t20 = −2.95 with a corresponding p-value of 0.0061. At the 5% significance level, can you conclude that the correlation coefficient differs from zero?

Yes, since the p-value is less than 0.05. Using the p-value approach, the decision rule is to reject the null hypothesis H0: ρxy = 0 if the p-value is less than the significance level α, and not to reject it otherwise. Here 0.0061 < 0.05. To get the p-value for a two-tailed test, Excel's function T.DIST.2T or R's function cor.test can be used.

Unlike the coefficient of determination, the sample correlation coefficient in a simple linear regression ________.

indicates whether the slope of the regression line is positive or negative. The coefficient of correlation in a simple linear regression indicates the strength and direction (whether the slope of the regression line is positive or negative) of the linear association between two numeric variables.

What are the degrees of freedom for the goodness-of-fit test for normality?

k − 3. The goodness-of-fit test statistic for normality follows the \(\chi^2_{df}\) distribution with df = k − 3, where k is the number of intervals.

When two regression models applied on the same data set have the same response variable but a different number of explanatory variables, the model that would evidently provide the better fit is the one with a ________.

lower standard error of the estimate and a higher adjusted coefficient of determination. If two models differ in the number of explanatory variables, the adjusted \(R^2\) is a better goodness-of-fit measure than \(R^2\). The comparison of two models becomes straightforward when one of them has a better (lower) \(s_e\) and a better (higher) adjusted \(R^2\). Unfortunately, it may happen that one model has a better \(s_e\) but a worse adjusted \(R^2\).

Consider the following sample regression equation \(\hat{y} = 150 - 20x\), where y is the demand for Product A (in 1,000s) and x is the price of the product (in $). The slope coefficient indicates that if ________.

the price of Product A increases by $1, then we predict the demand to decrease by 20,000. The slope coefficient \(b_1\) indicates that if x increases by 1 unit, then y is predicted to change by \(b_1\); here \(b_1 = -20\), with demand measured in 1,000s.

For the chi-square test of a contingency table, the expected cell frequencies are found as ________________.

the row total multiplied by the column total divided by the sample size. The expected frequencies for each cell of the contingency table are calculated assuming the null hypothesis (two qualitative variables are independent of one another) is true, and they are computed as \(e_{ij} = \frac{(\text{Row } i \text{ total})(\text{Column } j \text{ total})}{\text{Sample size}}\).

Over the past 30 years, the sample standard deviations of the rates of return for stock X and stock Y were 0.20 and 0.12, respectively. The sample covariance between the returns of X and Y is 0.0096. When testing whether the correlation coefficient differs from zero, the value of the test statistic is t28 = 2.31. At the 5% significance level, the critical value is t0.025,28 = 2.048. The conclusion to the hypothesis test is ________.

to reject H0; we can conclude that the correlation coefficient differs from zero. Using the critical value approach for this two-tailed test, the decision rule is to reject the null hypothesis if the absolute value of the test statistic exceeds the critical value, and not to reject it otherwise. Here |2.31| > 2.048.
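The test statistic quoted in this card can be reproduced from the summary values. The sketch below assumes the usual statistic \(t_{df} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\) with df = n − 2, which the card does not state explicitly; the critical value 2.048 is taken from the card rather than computed.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Summary values from the card: 30 years of returns for stocks X and Y
s_x, s_y, s_xy, n = 0.20, 0.12, 0.0096, 30

r = s_xy / (s_x * s_y)          # sample correlation coefficient
t_stat = correlation_t_statistic(r, n)
critical_value = 2.048          # t_{0.025,28}, as given in the card

# Two-tailed decision rule: reject H0 when |t| exceeds the critical value
reject_h0 = abs(t_stat) > critical_value
print(round(r, 2), round(t_stat, 2), reject_h0)
```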

The exponential distribution is related to the Poisson distribution.

true. The exponential distribution is related to the Poisson distribution even though the Poisson distribution deals with discrete random variables.

The standard deviation of \(\bar{X}\) (standard error of the sample mean) equals the population standard deviation divided by the square root of the sample size, or, equivalently, \(se(\bar{X}) = \frac{\sigma}{\sqrt{n}}\).

true. To distinguish the variability between samples from the variability between individual observations, we refer to the standard deviation of the sample mean as the standard error of the sample mean.

Social-desirability bias refers to a systematic difference between a group's "socially acceptable" responses to a survey or poll and this group's ultimate choice.

true. Social-desirability bias refers to a systematic difference between a group's "socially acceptable" responses to a survey or poll and this group's ultimate choice. This is used as one explanation of the polling missteps that occurred in the 2016 election between Donald Trump and Hillary Clinton. Voters might have provided incorrect answers to a survey or poll because they thought others would look unfavorably on their ultimate choices.

Consider the following simple linear regression model: y = β0 + β1x + ε. The random error term is ________.

ε. The simple linear regression model is defined as y = β0 + β1x + ε, where y and x are the response variable and the explanatory variable, respectively, and ε is the random error term. The coefficients β0 and β1 are the unknown parameters to be estimated.

The sample standard deviations for x and y are 10 and 15, respectively. The covariance between x and y is −120. The correlation coefficient between x and y is ________.

−0.8. The correlation coefficient is computed as \(r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-120}{(10)(15)} = -0.8\).

Consider the sample regression equation \(\hat{y} = 10 - 5x\), with an \(R^2\) value of 0.65. Which of the following is the value of the sample correlation between y and \(\hat{y}\)?

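0.81. The sample correlation between the response variable y and its predicted value \(\hat{y}\) is the nonnegative square root of \(R^2\), that is, \(r_{y\hat{y}} = \sqrt{R^2} = \sqrt{0.65} \approx 0.81\). The correlation between x and y has the same magnitude but takes the sign of the regression coefficient \(b_1\), so here \(r_{xy} \approx -0.81\).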

