Quiz #3 Business Analytics Multiple Choice Questions

________ sampling applies to populations that are divided into natural subsets and allocates the appropriate proportion of samples to each subset.

Stratified sampling

The average cost for the sample of x sales of a product is x̄ = $xxx and the sample standard deviation is s = $xx. The hypothesized mean is μ0 = $xxx. Compute the value of the test statistic.

(sample mean − hypothesized mean) / (sample standard deviation / square root of the sample size). Which test statistic to use depends on the type of hypothesis test, and its proper form depends on assumptions about the population. μ0 is the hypothesized value of the population mean. For a one-sample test for the mean when the population standard deviation is unknown, the formula is t = (x̄ − μ0) / (s / √n).
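The test statistic formula on this card can be sketched in Python. The card elides the actual figures ($xxx), so the sample values below are hypothetical and chosen only to make the arithmetic easy to follow; the function name is also just illustrative.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return t = (x_bar - mu0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical inputs (not from the card): x_bar = 105, mu0 = 100, s = 15, n = 36
t_stat = one_sample_t(105, 100, 15, 36)
print(round(t_stat, 2))  # 2.0
```

Here the standard error is 15/6 = 2.5, so a $5 gap between the sample mean and the hypothesized mean gives t = 2.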

Identify the power of the test from the following probabilities.

1 − β. This is the probability of correctly rejecting the null hypothesis when it is indeed false. We would like the power of the test to be high.

Excel questions

1. Regression 2. ?

How many additional dummy variables are required if a categorical variable has 4 levels?

3 dummy variables
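As a rough illustration of why a variable with k levels needs k − 1 dummy variables, here is a minimal Python sketch. The level names and the helper `dummy_code` are hypothetical; the first level is treated as the baseline, which is one common convention and not something the card specifies.

```python
def dummy_code(values, levels):
    """Code a categorical column with k levels as k - 1 dummy (0/1) columns.

    The first level is the baseline and is represented by all zeros.
    """
    others = levels[1:]
    return [[1 if v == level else 0 for level in others] for v in values]

levels = ["North", "South", "East", "West"]  # hypothetical 4-level variable
rows = dummy_code(["North", "East", "West", "South"], levels)
print(rows)  # [[0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each row has three columns, and the baseline level ("North") is identified by all three being zero, so a fourth column would be redundant.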

In statistical terminology, the variable of interest is called a ________.

Factor

Which of the following Excel functions computes the p-value for the chi-square test?

=CHISQ.TEST(G5:H6,G10:H11), which takes the observed range and the expected range. For comparison, the p-value for a t-test is calculated with =TDIST(x, deg_freedom, tails), where x = t, deg_freedom = n − 1 (degrees of freedom), and tails = 1 for a one-tailed test or 2 for a two-tailed test.

(15)From the table above, determine the cumulative probability for z at a 95% confidence level.

=NORM.DIST(x, mean, standard_dev, TRUE), where z = (x − μ) / σ

(32)What is the observed level of significance?

=TDIST(x, deg_freedom, tails), with tails = 1 for a one-tailed test. The observed significance level, or p-value, for a specific statistical test is the probability (assuming the null hypothesis is true) of observing a value of the test statistic that is at least as extreme as the one actually observed.

For a two-sample hypothesis which tests for differences in population parameters (1) and (2), a two-tailed test seeks evidence that population parameter:

A two-tailed test seeks evidence that population parameter (1) differs from population parameter (2) (H1: ≠). A one-sample hypothesis test involves a single population parameter, such as the mean, proportion, or standard deviation. To conduct the test we use a single sample of data from the population, and the three types of test are lower-tailed, upper-tailed, and two-tailed. A two-sample test involves comparing two populations for differences in means, proportions, or other population parameters. In Excel: lower-tail test, T.DIST; upper-tail test, T.DIST.RT; two-tail test, T.DIST.2T for the p-value (T.INV.2T gives the two-tailed critical value).

Which of the following is a valid one-sample hypothesis test?

A test that involves a single population parameter, such as a mean or proportion, and a single sample of data. Three types of one-sample tests:
1. H0: parameter ≤ constant, H1: parameter > constant
2. H0: parameter ≥ constant, H1: parameter < constant
3. H0: parameter = constant, H1: parameter ≠ constant
Assumptions:
• The dependent variable must be continuous (interval/ratio).
• The observations are independent of one another.
• The dependent variable should be approximately normally distributed.
• The dependent variable should not contain any outliers.

Which of the following is true while applying the Excel ANOVA tool?

Analysis of variance (ANOVA) is a statistical technique used to check whether the means of two or more groups are significantly different from each other. ANOVA checks the impact of one or more factors by comparing the means of different samples. For example, we can use ANOVA to test whether all the medication treatments were equally effective.

________ states that if the sample size is large enough, the sampling distribution of the mean is approximately normally distributed, regardless of the distribution of the population and that the mean of the sampling distribution will be the same as that of the population.

Central Limit Theorem

In sampling, ________ involves assessing the value of an unknown population parameter-such as a population mean, population proportion, or population variance-using sample data.

Estimation

From the standard deviation formula, Σ(x − x̄)² / (n − 1), identify the estimator.

x̄ (X bar). Estimators are sample statistics such as the sample mean, sample variance, or sample proportion.

Type II error occurs when the test:

H0 is false and the test incorrectly fails to reject H0. A Type II error, or false negative, is where a test result indicates that a condition is absent when it is actually present. A Type II error is committed when we fail to believe a true condition.

In a two-sample test for differences in means, the hypotheses are of the form:

H0: μ1 − μ2 (≥, ≤, or =) 0 and H1: μ1 − μ2 (<, >, or ≠) 0, respectively.

For a two-sample hypothesis test for differences in population parameters (1) and (2), which of the following is the correct form of an upper-tailed test?

H0: population parameter (1) − population parameter (2) ≤ D0; H1: population parameter (1) − population parameter (2) > D0. This test seeks evidence that the difference between population parameter (1) and population parameter (2) is greater than some value, D0. When D0 = 0, the test simply seeks to conclude whether population parameter (1) is larger than population parameter (2). In Excel, the upper-tail p-value uses T.DIST.RT. (Slide 27, Ch. 7)

(33)Which of the following is the conclusion?

H1 has the lower-tail form (<). The conclusion is either to reject or to fail to reject H0.

________ means that the variation about the regression line is constant for all values of the independent variable.

Homoscedasticity

Which of the following helps in evaluation of autocorrelation?

The Durbin-Watson statistic is commonly used to evaluate autocorrelation. To understand autocorrelation, consider instances based on cross-sectional and time series data. In cross-sectional data, if a change in the income of person A affects the savings of person B (a person other than A), then autocorrelation is present. In time series data, if the observations are correlated with one another, especially when the time intervals are small, these inter-correlations are termed autocorrelation.

Interaction is:

In statistics, an interaction may arise when considering the relationship among three or more variables, and describes a situation in which the simultaneous influence of two variables on a third is not additive. Most commonly, interactions are considered in the context of regression analyses.

The Ransin Sports Company has noted that the size of individual customer orders is normally distributed with a mean of $xxx and a standard deviation of $x. Which of the following is the answer for the probability that the next individual who buys a product will make a purchase of more than $xxx?

Mean = 36, standard deviation = 8. The probability that the next individual makes a purchase of more than $40 is =1-NORM.DIST(40,36,8,TRUE).
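The same calculation can be checked outside Excel. This is a minimal sketch using Python's standard-library `statistics.NormalDist`, with the mean of 36, standard deviation of 8, and threshold of 40 taken from the answer on this card.

```python
from statistics import NormalDist

# Order size modelled as Normal(mean=36, sd=8), as in the card's answer
orders = NormalDist(mu=36, sigma=8)

# Equivalent of =1-NORM.DIST(40,36,8,TRUE)
p_more_than_40 = 1 - orders.cdf(40)
print(round(p_more_than_40, 4))  # 0.3085
```

The threshold sits half a standard deviation above the mean (z = 0.5), which is why roughly 31% of orders are expected to exceed it.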

When two or more independent variables in the same regression model can predict each other better than the dependent variable, the condition is referred to as ________.

Multicollinearity

Which of the following is true about multicollinearity?

Multicollinearity refers to a situation in which two or more explanatory variables in a multiple regression model are highly linearly related. We have perfect multicollinearity if, for example, the correlation between two independent variables is equal to 1 or −1.

Which of the following is true about Excel outputs Multiple R?

Multiple R is the correlation coefficient. It tells you how strong the linear relationship is. For example, a value of 1 means a perfect positive relationship and a value of zero means no linear relationship at all. It is the square root of R squared.

Which of the following is true of the t-distribution?

The t-distribution is actually a family of probability distributions with a shape similar to the standard normal distribution. Different t-distributions are distinguished by an additional parameter called degrees of freedom (df = n − 1). 1. The t statistic can be considered an estimated z statistic. 2. With a small sample size, the t statistic provides a relatively poor estimate of z. When the population standard deviation is unknown, you can use the t statistic, assuming all relevant assumptions are satisfied. As the number of degrees of freedom increases, the t-distribution converges to the standard normal distribution.

Which of the following is true of prediction intervals?

A prediction interval provides a range for predicting the value of a new observation from the same population. A confidence interval is associated with the sampling distribution of a statistic, but a prediction interval is associated with the distribution of the random variable itself. The formula for a 100(1 − α)% prediction interval for a new observation is on Slide 32, Ch. 6. In short, a prediction interval is an estimate of an interval in which future observations will fall, with a certain probability.

Which of the following types of sampling involves using random procedures to select a sample?

Probability Sampling

In Excel's Trendline tool, the value of the________ gives the measure of fit of the line to the data.

R squared

Which of the following denotes the power of the test?

The value of 1 − β is called the power of the test: P(rejecting H0 | H0 is false). It is sensitive to the sample size. Statistical power is inversely related to β, the probability of making a Type II error. The further the true mean is from the hypothesized value, the smaller β is. As α increases, β decreases. (Slide 10)

Which of the following describes periodic sampling?

Periodic sampling is another name for systematic sampling, a type of probability sampling method in which sample members from a larger population are selected according to a random starting point and a fixed periodic interval. This interval, called the sampling interval, is calculated by dividing the population size by the desired sample size.
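The procedure described on this card (random start, then every k-th member) can be sketched as follows. The function name, the customer list, and the seed are hypothetical, and the sketch assumes the population size is at least the sample size so the interval is at least 1.

```python
import random

def systematic_sample(population, sample_size, rng=random):
    """Pick a random start, then take every interval-th member."""
    # Sampling interval = population size / desired sample size (integer part)
    interval = len(population) // sample_size
    # Random starting point within the first interval
    start = rng.randrange(interval)
    return [population[start + i * interval] for i in range(sample_size)]

customers = list(range(1, 101))  # hypothetical population of 100 customer IDs
sample = systematic_sample(customers, 10, random.Random(0))
print(sample)
```

With 100 customers and a sample of 10, the interval is 10, so the selected IDs are always exactly 10 apart; only the starting point is random.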

Which of the following Excel tools is used for a two-sample test for equality of variances?

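F-Test Two-Sample for Variances (in the Data Analysis tools). T.DIST.2T is a function that returns a two-tailed t-distribution probability; it does not test equality of variances.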

For a one-tailed test, the critical value:

The critical value determines whether or not the test statistic falls in the rejection region. If the level of significance is 0.05, then the critical value of the one-tailed test is the value of the t-distribution with n − 1 degrees of freedom that cuts off a tail area of 0.05. For an upper-tailed test it is computed as =T.INV(1 - α, n - 1).

Which of the following is true about one-tailed and two-tailed tests?

The critical value divides the sampling distribution into two parts, a rejection region and a non-rejection region. A one-tailed test is one where we specify a direction for the relationship: H0 is of the form ≥ or ≤ (so H1 is < or >). A two-tailed test has H0 of the form = and H1 of the form ≠.

Which of the following is true about the observed errors associated with estimating the value of the dependent variable using the regression line?

The errors can be negative or positive

Standard residuals:

Standard residuals are residuals divided by their standard deviation. The residual standard deviation is a statistical term used to describe the standard deviation of points around a fitted linear function, and is an estimate of the accuracy of the dependent variable being measured.

While checking for linearity by examining the residual plot, the residuals must:

The residuals should not be either systematically high or low. So, the residuals should be centered on zero throughout the range of fitted values. In other words, the model is correct on average for all fitted values

Level of significance is the probability of:

The null hypothesis is rejected if the p-value is less than a predetermined level, α. α is called the significance level, and is the probability of rejecting the null hypothesis given that it is true (a type I error). It is usually set at or below 5%.

Which of the following accurately describes a sampling distribution of the mean?

The sampling distribution of the mean is the distribution of the means of all possible samples of a fixed size n from some population. The symbol μM (or μx̄) refers to the mean of this distribution, and it equals the population mean μ. Standard error of the mean = standard deviation / √n.

Which of the following probabilities gives the confidence coefficient?

1 − α, the probability of correctly failing to reject H0 when it is actually true. For a confidence coefficient of 0.95, we mean that we expect 95 out of 100 samples to support the null hypothesis rather than the alternative hypothesis when H0 is actually true.

Which of the following is true of the equation n ≥ (zα/2)² π(1 − π) / E² for computing the sample size required to achieve a desired confidence interval half-width for a proportion?

As the sample size increases, the width of the confidence interval decreases, giving a more precise estimate. The most conservative approach is to use 0.5 for the estimate of the true proportion (π). The half-width of a confidence interval is the same as the margin of error. If the half-width of the resulting confidence interval is within the required margin of error, then we have achieved our goal; if not, we can use the new sample standard deviation s to determine a new sample size.
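A small Python sketch of the sample-size calculation for a proportion, assuming the standard form n ≥ (zα/2)² π(1 − π) / E², where E is the desired half-width. The function name and the 0.03 half-width are hypothetical; the default of 0.5 for π is the conservative choice mentioned on this card.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(half_width, confidence=0.95, pi_estimate=0.5):
    """Smallest n with n >= z^2 * pi * (1 - pi) / E^2."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 at 95%
    return math.ceil(z**2 * pi_estimate * (1 - pi_estimate) / half_width**2)

# Hypothetical target: half-width of 0.03 at 95% confidence
print(sample_size_for_proportion(0.03))  # 1068
```

Using π = 0.5 maximizes π(1 − π), so any other estimate of the proportion gives a smaller required sample size for the same half-width.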

While testing hypotheses for regression coefficients, the t-test for the slope is expressed as:

t = (b1 − 0) / standard error

Categorical variables that have been coded are called ________.

Dummy variables. (In programming contexts, categorical variables are also referred to as enumerations or enumerated types.)

For which of the following is the value of the estimator said to be biased?

An estimator of a population parameter whose expected value does NOT equal the population parameter. (See Slide 7.)

In which of the following cases is a proportion of the observations of a sample used in estimating the confidence interval?

When the variable is categorical (for example gender, college, or high school) and we are interested in the share of the sample that has a certain characteristic. The sample proportion is x/n, where x is the number in the sample having the desired characteristic.

Which of the following is true of the R-squared (R2) value in Excel's Trendline function?

R-squared measures how well the regression line fits the data, that is, how close the data are to the fitted regression line. It is also known as the coefficient of determination, or the coefficient of multiple determination for multiple regression. A value of 100% indicates that the model explains all the variability of the response data around its mean. In Excel it is computed to determine how closely the data conform to a linear relationship. R² values range from 0 to 1, with 1 representing a perfect fit between the data and the line drawn through them, and 0 representing no statistical correlation between the data and a line.

When using the t-statistic in multiple regression to determine if a variable should be removed:

If |t| < 1, the standard error will decrease and adjusted R² will increase if the variable is removed.

Which of the following is true about chi-square distribution?

The chi-square distribution is characterized by the number of degrees of freedom, v. It has the following properties: the mean of the distribution is equal to the number of degrees of freedom (μ = v); the variance is equal to two times the number of degrees of freedom (σ² = 2v); when the degrees of freedom are greater than or equal to 2, the maximum value for Y occurs when χ² = v − 2; and as the degrees of freedom increase, the chi-square curve approaches a normal distribution.

Which of the following is true about the rejection region?

It is located in the upper tail of the F-distribution, since ANOVA is only a one-tailed, upper-tailed test. A test statistic that falls in the rejection region leads to rejecting the null hypothesis.

For a lower-tail test, the p-value in the output from an Excel tool:

It is the probability to the left of the test statistic t in the t-distribution, and is found using the Excel function =T.DIST(t, n-1, TRUE). In a two-sample setting, a lower-tail test seeks to conclude whether population parameter (1) is smaller than population parameter (2).

Which of the following statements is true when using the Excel Regression tool?

it can be used for both simple and multiple linear regression

Which of the following is true about the value of the power of the test?

It is sensitive to the sample size and indicates how good the test is. The power of the test can be increased by taking larger samples, which enable us to detect smaller differences between sample statistics and population parameters. Rejecting a null hypothesis when it is false is what every good hypothesis test should do. A high value of 1 − β (near 1.0) means it is a good test, and a low value (near 0.0) means it is a bad test.
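To see how sample size drives power, here is a rough Python sketch for the simplest case: an upper-tailed z-test with a known population standard deviation. That setup, the function name, and all the numbers are assumptions made for illustration; the cards themselves mostly deal with t-tests, where the calculation is more involved.

```python
import math
from statistics import NormalDist

def power_upper_tail_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power = P(reject H0 | true mean is mu_true) for H1: mu > mu0."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # critical value of the test
    # How many standard errors the true mean lies above the hypothesized mean
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    return 1 - z.cdf(z_crit - shift)

# Hypothetical scenario: mu0 = 100, true mean = 102, sigma = 8
for n in (16, 64, 256):
    print(n, round(power_upper_tail_z(100, 102, 8, n), 3))
```

The printed power rises toward 1 as n grows, because a larger sample shrinks the standard error and makes the same $2 difference easier to detect.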

Which of the following generates a scatter chart in Excel with the values predicted by the regression model included?

Line Fit Plots, although a scatter chart with a trendline is visually superior.

Which of the following is true of linear functions used in predictive analytical models?

Linear functions show a steady increase or decrease over the range of x. Interpolation: estimating values between two data points. Extrapolation: projecting beyond the observed range, based on the last few points and what they imply.

(14)Based on the data in the table above, calculate the margin of error at 95% confidence interval.

zα/2 × standard deviation / square root of the sample size. In Excel: =CONFIDENCE.NORM(alpha, standard_dev, size), or =CONFIDENCE.T(alpha, standard_dev, size) when the t-distribution is used.

(16)From the table above, calculate the lower confidence interval estimate at a confidence level of 95%

Sample mean minus the margin of error: =mean - CONFIDENCE(0.05, standard_dev, size)
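The margin of error and the lower confidence limit can be sketched in Python using the normal-based formula zα/2 × σ / √n, which is what CONFIDENCE.NORM computes. The table these cards refer to is not available, so the mean, standard deviation, and sample size below are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(alpha, standard_dev, size):
    """Normal-based margin of error: z_{alpha/2} * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)

# Hypothetical sample: mean 50, standard deviation 10, n = 25
mean, sd, n = 50.0, 10.0, 25
margin = margin_of_error(0.05, sd, n)
lower, upper = mean - margin, mean + margin
print(round(margin, 2), round(lower, 2), round(upper, 2))  # 3.92 46.08 53.92
```

The lower limit is the sample mean minus the margin of error, and the upper limit is the sample mean plus it.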

Which of the following is true about multiple linear regression?

It has more than one independent variable. Partial regression coefficients represent the expected change in the dependent variable when the associated independent variable is increased by one unit while the values of all other independent variables are held constant. It uses least squares to estimate the intercept and slope coefficients that minimize the sum of squared error terms over all observations.

(45) What is the expected value for a 90 year-old piece of furniture according to the data of 43)?

Plug the value x = 90 into the trendline equation.

(43) The following table exhibits the age of antique furniture and the corresponding prices. What is the relationship between the age of the furniture and their values? (Hint: Use scatter diagram and the Excel Trendline tool where necessary).

Use a scatter diagram and the Trendline tool: highlight the data, insert a scatter chart, choose Add Chart Element > Trendline, and find the trendline that best fits the data.

Which of the following is true about determining the proper form of the hypotheses?

H0 is always assumed to be true in testing. The form depends on the type of hypothesis test as well as certain assumptions about the population. The other options are false: testing never statistically proves H0 true, failure to reject H0 does not prove H1 wrong, and H1 is not assumed to be true in testing.

Which of the following is true of calculating confidence intervals for larger samples?

The larger the sample, the smaller the sampling error. The interval is a range of values between which the value of the population parameter is believed to be, along with a probability that the interval correctly estimates the true (unknown) population parameter. This probability is called the level of confidence, denoted by 1 − α, where α is a number between 0 and 1. As the level of confidence, 1 − α, decreases, zα/2 decreases, and the confidence interval becomes narrower. (Slide 24)

The Ransin Sports Company has noted that the size of individual customer orders is normally distributed with a mean of $xxx and a standard deviation of $x. If a soccer team of xx players were to make the next batch of orders, what is the probability that the mean purchase would exceed $xxx?

This uses the standard error of the mean. Take the $8 standard deviation from the previous problem and divide by the square root of the sample size: $8/√16 = $2. Use this standard error in place of the standard deviation in the probability calculation.
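A short Python sketch of this calculation, using the mean of 36 and standard deviation of 8 from the earlier Ransin card and the n = 16 on this one. The threshold of $40 is an assumption carried over from that earlier card, since the question here shows only $xxx.

```python
import math
from statistics import NormalDist

mean, sd, n = 36, 8, 16
standard_error = sd / math.sqrt(n)  # 8 / 4 = 2.0

# Sampling distribution of the mean purchase for a group of 16 orders
team_mean = NormalDist(mu=mean, sigma=standard_error)

# Assumed threshold of $40 (the card elides the actual value)
p_mean_over_40 = 1 - team_mean.cdf(40)
print(round(p_mean_over_40, 4))  # 0.0228
```

Because the mean of 16 orders varies much less than a single order, the same $40 threshold is now two standard errors above the mean, so the probability is far smaller than for one individual.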

(44) Refer to the table above, which of the following equations correctly expresses the relationship between the two variables?

Use the Trendline tool and display the equation of the line on the chart.

For the formula for calculating the confidence level with known standard deviation, the value ________ represents the value of a standard normal random variable with a cumulative probability of 1 - α / 2.

zα/2 is the value of the standard normal random variable for an upper tail area of α/2 (or a lower tail area of 1 − α/2). It is computed as =NORM.S.INV(1 - α/2). Example: if α = 0.05 (for a 95% confidence interval), then NORM.S.INV(0.975) = 1.96.

