business analytics final gupta
in the movie Forrest Gump, the public school required an IQ of at least (greater than) 80 for admittance. if IQ test scores are normally distributed with mean 100 and standard deviation 16, what percentage of people qualify for admittance to the school?
.8944
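A quick check of this answer, sketched with Python's standard-library `statistics.NormalDist` (the variable names are illustrative assumptions, not part of the original problem):

```python
from statistics import NormalDist

# IQ scores: normal with mean 100 and standard deviation 16
iq = NormalDist(mu=100, sigma=16)

# P(IQ > 80) = 1 - P(IQ <= 80); the z value is (80 - 100) / 16 = -1.25
p_admit = 1 - iq.cdf(80)
print(round(p_admit, 4))
```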
if [a,b] denotes an arbitrary interval of numbers on the real number line and x is a continuous random variable, find P(x = a)
0
the mean of a standard normal distribution is always equal to __________
0
a weather forecaster predicts that the may rainfall in a local area will be between 1 and 7 inches but has no idea where within the interval the amount will be. Let x be the amount of may rainfall in the local area and assume that x is uniformly distributed over the interval 1 to 7 inches. Calculate the standard deviation of may rainfall
1.732
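For a uniform distribution on [a, b], the mean is (a + b)/2 and the standard deviation is (b - a)/sqrt(12). A minimal sketch of both formulas (the function names are hypothetical):

```python
import math

def uniform_mean(a, b):
    """Mean of a continuous uniform distribution on [a, b]."""
    return (a + b) / 2

def uniform_sd(a, b):
    """Standard deviation of a continuous uniform distribution on [a, b]."""
    return (b - a) / math.sqrt(12)

# May rainfall uniformly distributed between 1 and 7 inches
print(uniform_mean(1, 7), round(uniform_sd(1, 7), 3))
```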
exponential smoothing
A weighted-moving-average forecasting technique in which data points are weighted by an exponential function.
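One common form of simple exponential smoothing updates the level as S_t = a*y_t + (1 - a)*S_(t-1). The sketch below assumes the first observation is used as the starting level (other initializations exist), and the data are made up for illustration:

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    Assumption: the initial level is the first observation.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    if not series:
        return []
    levels = [series[0]]
    for y in series[1:]:
        # New level = weighted average of the newest point and the old level
        levels.append(alpha * y + (1 - alpha) * levels[-1])
    return levels

# Hypothetical data, smoothing constant 0.5
print(exponential_smoothing([10, 20, 30], 0.5))
```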
based on a random sample of 25 units of product X, the average weight is 102 lb and the sample standard deviation is 10 lb. we would like to decide if there is enough evidence to establish that the average weight for the population X is greater than 100 lb. assume the population is normally distributed. what is the critical value used to test the claim at a=.05
1.711
For nonnormal populations, as the sample size (n) ________, the distribution of sample means approaches a(n) ________ distribution.
increases, normal
which of the following possible response variables is most appropriate to predict using a regression tree?
monthly sales of used cars
classification tree
predicting a qualitative or categorical response variable
regression tree
predicting a quantitative response variable
the coefficient of determination measures the _________ explained by the simple linear regression model
proportion of variation
unlike a classification tree, a regression tree enables us to predict the value of a ________ response variable
quantitative
which of the following possible response variables is most appropriate to predict using a regression tree?
rate of return on an investment
seasonal
regular periodic up and down movements that repeat within the calendar year
it is appropriate to use the uniform distribution to describe a continuous random variable x when
relative frequencies of all possible values of x are about the same
The fill weight of a certain brand of adult cereal is normally distributed with a mean of 910 grams and a standard deviation of 5 grams. We calculated the value of z for a specific box of this brand of cereal, and the z value was negative. This negative z value indicates that
the fill weight is less than 910 grams
which of the following possible response variables is most appropriate to predict using a classification tree?
whether or not an applicant will accept a job offer
the _____________ of the simple linear regression model is the mean value of y when x is zero
y-intercept
consider the trash bag problem. Suppose that an independent laboratory has tested trash bags and has found that no 30 gallon bags that are currently on the market have a mean breaking strength of 50 pounds or more. on the basis of these results, the producer of the new, improved trash bag feels sure that its 30 gallon bag will be the strongest such bag on the market if the new trash bag's mean breaking strength can be shown to be at least 50 pounds. the mean of the sample of 39 trash bag breaking strengths in table 1.10 is x = 50.573. let u denote the mean of the breaking strengths of all possible trash bags of the new type and assume that o equals 1.61. using the 95 percent confidence interval, can we be 95 percent confident that u is at least 50 pounds?
yes, because 95% interval is above 50
when testing a hypothesis about a single mean, if the sample size is 51 and the population standard deviation is known, the correct test statistic to use is
z
when testing the difference between two population proportions using large independent random samples, the _________ test statistic is used
z
Consolidated Power, a large electric power utility, has just built a modern nuclear power plant. This plant discharges waste water that is allowed to flow into the Atlantic Ocean. The Environmental Protection Agency (EPA) has ordered that the waste water may not be excessively warm so that thermal pollution of the marine environment near the plant can be avoided. Because of this order, the waste water is allowed to cool in specially constructed ponds and is then released into the ocean. This cooling system works properly if the mean temperature of waste water discharged is 60°F or cooler. Consolidated Power is required to monitor the temperature of the waste water. A sample of 100 temperature readings will be obtained each day, and if the sample results cast a substantial amount of doubt on the hypothesis that the cooling system is working properly (the mean temperature of waste water discharged is 60°F or cooler), then the plant must be shut down and appropriate actions must be taken to correct the problem. Suppose that Consolidated Power decides to use a level of significance of a = .05, and suppose a random sample of 100 temperature readings is obtained. If the sample mean of the 100 temperature readings is x = 60.383, test H0 versus Ha and determine whether the power plant should be shut down and the cooling system repaired. Assume o = 2.
z = 1.92; because 1.92 > 1.645, reject H0 and shut down the plant
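The test statistic here is z = (x - 60) / (o / sqrt(n)), compared with the upper-tail critical value for a = .05. A minimal sketch, with a hypothetical helper name:

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """One-sample z statistic when the population standard deviation is known."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

z = z_statistic(60.383, 60, 2, 100)

# Upper-tail critical value for a one-sided test at alpha = .05
critical = NormalDist().inv_cdf(1 - 0.05)

# Reject H0 (mean <= 60) when z exceeds the critical value
reject_h0 = z > critical
print(z, critical, reject_h0)
```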
the mean and the standard deviation of the sample of 100 bank customer waiting times are x = 5.01 and s = 2.116. calculate a t based 95 percent confidence interval for u, the mean of all possible bank customer waiting times using the new system
[4.590,5.430]
recall that a bank manager has developed a new system to reduce the time customers spend waiting to be served by tellers during peak business hours. the mean waiting time during peak business hours under the current system is roughly 9 to 10 minutes. the bank manager hopes that the new system will have a mean waiting time that is less than six minutes. the mean of the sample of 91 bank customer waiting times is x = 5.41. if we let u denote the mean of all possible bank customer waiting times using the new system and assume that o equals 2.42: calculate 99 percent confidence intervals for U
[4.757, 6.063]
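When o is known, the interval is x ± z * o / sqrt(n). The sketch below reproduces the 99 percent interval; the function name is an illustrative assumption:

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """z-based confidence interval for a mean with known population sigma."""
    # Two-sided interval: put (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(5.41, 2.42, 91, 0.99)
print(round(low, 3), round(high, 3))
```

Because the upper limit is above 6, this interval alone does not support the claim that the mean waiting time is less than six minutes.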
find P(x>172) if u=175 and variance =9
.8413
____________ says that if the sample size is sufficiently large, then the sample means are approximately normally distributed
the central limit theorem
in logistic regression _________
the dependent variable is categorical
the correlation coefficient may assume any value between
-1 and 1
a golf tournament organizer is attempting to determine whether hole (pin) placement has a significant impact on the average number of strokes for the 13th hole on a given golf course. Historically, the pin has been placed in front of the right corner of the green and the historical mean number of strokes for the hole has been 4.25, with a standard deviation of 1.6 strokes. on a particular day during the most recent golf tournament, the organizer placed the hole (pin) in the back left corner of the green. Sixty-four golfers played the hole with the new placement on that day. Determine the probability of the sample average number of strokes exceeding 4.75 using the historical mean and standard deviation
.0062
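Sample-mean questions like this one use the sampling distribution of the mean, which has standard deviation o / sqrt(n). A short sketch (hypothetical function name):

```python
import math
from statistics import NormalDist

def prob_sample_mean_exceeds(threshold, mu, sigma, n):
    """P(sample mean > threshold) using the normal sampling distribution."""
    sampling_dist = NormalDist(mu, sigma / math.sqrt(n))
    return 1 - sampling_dist.cdf(threshold)

# 64 golfers, historical mean 4.25 and standard deviation 1.6; z = 2.5
p = prob_sample_mean_exceeds(4.75, 4.25, 1.6, 64)
print(round(p, 4))
```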
in a bottle filling process, the amount of drink injected into 15 oz bottles is normally distributed with a mean of 15 oz and a standard deviation of .34 oz. Bottles containing less than 14.23 oz do not meet the bottler's quality standard. What percentage of filled bottles do not meet the standard?
.0119
suppose that we will randomly select a sample of 64 measurements from a population having a mean equal to 20 and a standard deviation equal to 4. calculate the probability that we will obtain a sample mean greater than 21; that is, calculate P(X>21). Hint: find the z value corresponding to 21 by using Ux and Ox because we wish to calculate a probability about X
.0228
Packages of sugar bags for Sweeter Sugar Inc. have an average weight of 16 ounces and a standard deviation of .2 ounces. The weights of the sugar packages are normally distributed. What is the probability that 16 randomly selected packages will have a weight in excess of 16.075 ounces?
.0668
Find P(x̄ < 35) if μ = 40, σ = 16, n = 16.
.1056
suppose that we will randomly select a sample of 64 measurements from a population having a mean equal to 20 and a standard deviation equal to 4. calculate the probability that we will obtain a sample mean less than 19.385; that is, calculate P(X<19.385)
.1093
of the 223 sampled customers, what is the sample proportion of those who churned
.143
given that the length an athlete throws a hammer is a normal random variable with mean 50 feet and standard deviation 5 feet, what is the probability he throws it no less than 55 feet
.1587
the probability of a new employee passing a test is .20 what are the odds of the employee passing the test
.25
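Odds are the probability of the event divided by the probability of its complement, p / (1 - p). A minimal sketch:

```python
def odds(p):
    """Convert a probability p (0 <= p < 1) into odds in favor of the event."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

# Probability of passing the test is .20
print(odds(0.20))
```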
what is the probability that a standard normal random variable will be between .3 and 3.2
.3814
Packages of sugar bags for Sweeter Sugar Inc. have an average weight of 16 ounces and a standard deviation of .2 ounces. The weights of the sugar packages are normally distributed. What is the probability that 9 randomly selected packages will have an average weight in excess of 16.075 ounces?
.1303
suppose that an airline quotes a flight time of 133 minutes between two cities. furthermore, suppose that historical flight records indicate that the actual flight time between the two cities, x, is uniformly distributed between 115 and 151 minutes. letting the time unit be one minute, find P(128<=x<=144)
.4444
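For a uniform distribution, an interval probability is the interval's length divided by the total range. A small sketch (the function name and the clipping behavior are assumptions for illustration):

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniformly distributed on [a, b]."""
    # Clip the requested interval to the support [a, b]
    lo = max(lo, a)
    hi = min(hi, b)
    return max(hi - lo, 0) / (b - a)

# Flight time uniform on [115, 151]: (144 - 128) / 36
print(round(uniform_prob(128, 144, 115, 151), 4))
```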
of these 40 silver card holders, what is the proportion that did not upgrade?
.5250
find P(X<402), if u = 400, o = 200, and n = 100
.5398
assume that the ages for first marriages are normally distributed with a mean of 26 years and a standard deviation of 4 years. what is the probability that a person getting married for the first time is between 20 and 30 years of age?
.7745
suppose that an airline quotes a flight time of 133 minutes between two cities. furthermore, suppose that historical flight records indicate that the actual flight time between the two cities, x, is uniformly distributed between 115 and 151 minutes. letting the time unit be one minute, write the formula for the probability curve of x.
f(x) = 1/36 for 115 <= x <= 151 (0 otherwise)
if we are testing the difference between the means of two normally distributed independent populations with samples of n1=10, n2=10, the degrees of freedom for the t statistic is __________
18
a simple regression analysis with 20 observations would yield __________ degrees of freedom error and ___________ degrees of freedom total
18, 19
in testing H0: u=23 versus HA: U > 23 using the critical value rule, when x =26, s = 6, and n = 20, what is the value of the test statistic? assume that the population from which the sample is selected is normally distributed
2.24
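With the population standard deviation unknown, the test statistic is t = (x - 23) / (s / sqrt(n)) with n - 1 = 19 degrees of freedom. A sketch of the arithmetic (hypothetical function name):

```python
import math

def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic; compare with a t critical value on n - 1 df."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

t = t_statistic(26, 23, 6, 20)
print(round(t, 2))
```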
a mail order business prides itself in its ability to fill customers orders in less than six calendar days, on average. periodically, the operations manager selects a random sample of customer orders and determines the number of days required to fill the orders. On one occasion when a sample of 39 orders was selected, the average number of days was 6.65 with a population standard deviation of 1.5 days. calculate the appropriate test statistic to test the hypothesis (Z value)
2.71
in a manufacturing process, a random sample of 9 manufactured bolts has a mean length of 3 inches with a variance of .09 and is normally distributed. What is the 90 percent confidence interval for the mean length of the manufactured bolt
2.8140 to 3.1860
in a manufacturing process, a random sample of 36 manufactured bolts has a mean length of 3 inches with a standard deviation of .3 inches. what is the 99 percent confidence interval for the true mean length of the manufactured bolt
2.864 to 3.136
based on the information given in the table above, what is the MSD
3.3333
a soccer player takes a shot on goal 15 times and scores a goal 3 of those attempts. what are the odds of the soccer player scoring a goal
3/12
a weather forecaster predicts that the may rainfall in a local area will be between 1 and 7 inches but has no idea where within the interval the amount will be. Let x be the amount of may rainfall in the local area and assume that x is uniformly distributed over the interval 1 to 7 inches. Calculate the expected (mean value) may rainfall.
4.0
determine the predicted sales for this month (month = 26)
45.9
in a manufacturing process, we are interested in measuring the average length of a certain type of bolt. Past data indicate that the standard deviation is .25 inches. How many manufactured bolts should be sampled in order to make us 95 percent confident that the sample mean bolt length is within .02 inches of the true mean bolt length
601
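The required sample size comes from n = (z * o / E)^2, rounded up to the next whole number. A minimal sketch (the function name is an assumption):

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n so the sample mean is within `margin` of the true mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: a fractional unit cannot be sampled
    return math.ceil((z * sigma / margin) ** 2)

print(sample_size_for_mean(0.25, 0.02, 0.95))
```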
decision trees
A decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm.
If we are testing the hypothesis about the mean of a population of paired differences with samples of n1 = 10, n2 = 10, the degrees of freedom for the t statistic is
9
which of the following statements is not a property of the normal probability distribution
95.44 percent of all possible observed values of the random variable x are within plus or minus three standard deviations of the population mean
the sample size is 100. what is the degrees of freedom?
99
if the random variable x is normally distributed, _______ percent of all possible observed values of x will be within three standard deviations of the mean
99.73
an accountant wishes to predict direct labor cost (y) on the basis of the batch size (x) of a product produced in a job shop. Data for 12 production runs are given in the table below, along with the excel output from fitting a least squares regression line to the data
B0 = 18.488, B1 = 10.146
a new company is in the process of evaluating its customer service. the company offers two types of sales: (1) internet sales and (2) store sales. The marketing research manager believes that the internet sales are more than 10 percent higher than store sales. the null hypothesis would be
Pinternet - Pstore <= .1
a new company is in the process of evaluating its customer service. the company offers two types of sales (1) internet sales and (2) store sales. the marketing research manager believes that the internet sales are more than 10 percent higher than store sales. the alternative hypothesis for this problem would be stated as
Pinternet - Pstore > .10
in multiple regression analysis, the explained sum of squares divided by the total sum of squares yields the ______
R^2
consider the trash bag problem. Suppose that an independent laboratory has tested trash bags and has found that no 30 gallon bags that are currently on the market have a mean breaking strength of 50 pounds or more. on the basis of these results, the producer of the new, improved trash bag feels sure that its 30 gallon bag will be the strongest such bag on the market if the new trash bag's mean breaking strength can be shown to be at least 50 pounds. the mean of the sample of 39 trash bag breaking strengths in table 1.10 is x = 50.573. if we let u denote the mean of the breaking strengths of all possible trash bags of the new type and assume that o equals 1.61, calculate a 95 percent confidence interval for u
[50.068, 51.078]
a sustained long term change in the level of the variable that is being forecasted per unit of time is
a trend
dummy variable
a variable that takes on one of two values (usually one or zero)
in a simple linear regression analysis, the correlation coefficient (r) and the slope (b) _____________ have the same sign
always
in testing for the equality of means from two independent populations, if the hypothesis of equal populations means is rejected at a = .01, it will _____________ be rejected at a = .05
always
using the p-value rule, if a null hypothesis is rejected at a significance level of .01, it will ______________ be rejected at a significance level of .05
always
when using simple exponential smoothing, the value of the smoothing constant a can be
between 0 and 1
which of the following is not a continuous probability distribution?
binomial
when the magnitude of the seasonal swing does not depend on the level of a time series, we call this ____________ variation
constant seasonal
the ____________ measures the strength of the linear relationship between the dependent variable and the independent variable
correlation coefficient
the value of the test statistic is compared with a(n) _________ in order to decide whether the null hypothesis can be rejected
critical value
the __________ component of the time series measures the fluctuations in a time series due to economic conditions of prosperity and recession with a duration of approximately 2 years or longer
cyclical
as the standard deviation decreases, the width of the confidence interval ___________
decreases
the area under the curve of a valid continuous probability distribution must _______________
equal 1
irregular
erratic very short movements that follow no regular pattern
all of the following are assumptions of the error terms in the simple linear regression model except
error terms are dependent on each other
the area under the normal curve between z = 0 and z = 1 is _____________ the area under the normal curve between z = 1 and z = 2
greater than
the following results were obtained as part of a simple regression analysis: r^2 = .9162; F statistic from the F table = 3.59; calculated value of F from the ANOVA table = 81.87; a = .05; p-value = .000. the null hypothesis of no linear relationship between the dependent variable and the independent variable _______________
is rejected
______________ values of the standard deviation result in a normal curve that is wider and flatter
larger
which of the following would you find on a classification tree?
leaf
the ___________ regression method is used when the response variable is qualitative or a categorical variable
logistic
trend
long-run growth or decline
cycle
long-run up and down fluctuation around the trend level
which of the following methods do we use to best fit the data in logistic regression
maximum likelihood
suppose that we will randomly select a sample of n=116 elements from a population and that we will compute the sample proportion Phat of these elements that fall into a category of interest. if the true population proportion p equals .8: find the mean and the standard deviation of the sampling distribution of Phat
mean of p-hat = .8, standard deviation of p-hat = .0371
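The sampling distribution of p-hat has mean p and standard deviation sqrt(p(1 - p)/n). A short sketch of that calculation (hypothetical function name); the standard deviation here is about .0371, which rounds to .04:

```python
import math

def phat_sampling_distribution(p, n):
    """Return (mean, variance, standard deviation) of the sample proportion."""
    variance = p * (1 - p) / n
    return p, variance, math.sqrt(variance)

mean, variance, sd = phat_sampling_distribution(0.8, 116)
print(mean, round(sd, 4))
```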
find the mean, variance, and standard deviation of the sampling distribution of the sample proportion Phat. p=.6, n=241
mean = .6, standard deviation = .0316, variance = .000996
suppose that we will randomly select a sample of 64 measurements from a population having a mean equal to 20 and a standard deviation equal to 4. find the mean and the standard deviation of the sampling distribution of the sample mean
mean = 20, standard deviation = .5
the specific shape of each normal distribution is determined by its ____________ and___________
mean, standard deviation
which is not the goodness of fit test in logistic regression
minitab regression
true negative
model predicts no and the actual outcome is no
true positive
model predicts yes and the actual outcome is yes
false negative
model predicts no but the actual outcome is yes (type 2 error)
false positive
model predicts yes but the actual outcome is no (type 1 error: rejecting H0 when it is true)
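These four outcomes can be tallied from paired actual and predicted labels. The sketch below uses made-up 0/1 data (1 = yes, 0 = no) and a hypothetical function name:

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = yes, 0 = no)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1   # model yes, actual yes
        elif p == 0 and a == 0:
            counts["TN"] += 1   # model no, actual no
        elif p == 1 and a == 0:
            counts["FP"] += 1   # model yes, actual no
        else:
            counts["FN"] += 1   # model no, actual yes
    return counts

# Hypothetical labels for five customers
actual = [1, 0, 1, 0, 1]
predicted = [1, 0, 0, 1, 1]
print(confusion_counts(actual, predicted))
```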
recall that a bank manager has developed a new system to reduce the time customers spend waiting to be served by tellers during peak business hours. the mean waiting time during peak business hours under the current system is roughly 9 to 10 minutes. the bank manager hopes that the new system will have a mean waiting time that is less than six minutes. the mean of the sample of 91 bank customer waiting times is x = 5.41. if we let u denote the mean of all possible bank customer waiting times using the new system and assume that o equals 2.42: using the 99 percent confidence interval, can the bank manager be 99 percent confident that u is less than 6 minutes? explain
no, the 99% interval extends above 6
suppose that we will randomly select a sample of 64 measurements from a population having a mean equal to 20 and a standard deviation equal to 4. Describe the shape of the sampling distribution of the sample mean
normally distributed
the normal approximation of the binomial distribution is appropriate when
np>= 5 and n(1-p) >= 5
Consolidated Power, a large electric power utility, has just built a modern nuclear power plant. This plant discharges waste water that is allowed to flow into the Atlantic Ocean. The Environmental Protection Agency (EPA) has ordered that the waste water may not be excessively warm so that thermal pollution of the marine environment near the plant can be avoided. Because of this order, the waste water is allowed to cool in specially constructed ponds and is then released into the ocean. This cooling system works properly if the mean temperature of waste water discharged is 60°F or cooler. Consolidated Power is required to monitor the temperature of the waste water. A sample of 100 temperature readings will be obtained each day, and if the sample results cast a substantial amount of doubt on the hypothesis that the cooling system is working properly (the mean temperature of waste water discharged is 60°F or cooler), then the plant must be shut down and appropriate actions must be taken to correct the problem. Consolidated Power wishes to set up a hypothesis test so that the power plant will be shut down when the null hypothesis is rejected. Set up the null hypothesis H0 and the alternative hypothesis Ha that should be used.
null hypothesis: mean <=60 alternate hypothesis : mean >60
the crown bottling company has just installed a new bottling process that will fill 7 ounce bottles of the popular crown classic cola soft drink. both overfilling and underfilling bottles are undesirable. underfilling leads to customer complaints and overfilling costs the company considerable money. in order to verify that the filler is set up correctly, the company wishes to see whether the mean bottle fill, u, is close to the target fill of 7 ounces. to this end, a random sample of 36 filled bottles is selected from the output of a test filler run. if the sample results cast a substantial amount of doubt on the hypothesis that the mean bottle fill is the desired 7 ounces, then the filler's initial setup will be readjusted. The bottling company wants to set up a hypothesis test so that the filler will be readjusted if the null hypothesis is rejected. Set up the null and alternative hypotheses for this hypothesis test.
null hypothesis: mean = 7, alternate hypothesis mean <> 7
the exact spread of the t distribution depends on the ___________
number of degrees of freedom
a recent study conducted by the state government attempts to determine whether the voting public supports a further increase in cigarette taxes. The opinion poll recently sampled 1,500 voting age citizens. 1,020 of the sampled citizens were in favor of an increase in cigarette taxes. The state government would like to decide if there is enough evidence to establish whether the proportion of citizens supporting an increase in cigarette taxes is significantly greater than .66. identify the null hypothesis
p<=.66
based on this classification tree, which of the following silver card holders would the bank classify as an upgrader (assuming they classify those with an upgrade probability estimate of at least .5 as upgraders)
platprofile(1)
when a least squares line is fit to the 8 observations in the fuel consumption data, we obtain SSE = 4.427. Calculate s^2 and s
s^2 = .738, s = .859
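In simple linear regression the mean square error is s^2 = SSE / (n - 2), and the standard error s is its square root. A minimal sketch (hypothetical function name):

```python
import math

def regression_standard_error(sse, n):
    """Return (s^2, s) for a simple linear regression with n observations."""
    # Two degrees of freedom are used by the intercept and the slope
    s_squared = sse / (n - 2)
    return s_squared, math.sqrt(s_squared)

s_squared, s = regression_standard_error(4.427, 8)
print(round(s_squared, 3), round(s, 3))
```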
a confidence interval for the population mean is an interval constructed around the ____________
sample mean
Those fluctuations that are associated with climate, holidays, and related activities are referred to as ________ variations.
seasonal
in simple regression analysis, the quantity that gives the amount by which Y(dependent variable) changes for a unit change in X(independent variable) is called the ________
slope of the regression line
the z value tells us the number of _______________ that a value of x is from the mean
standard deviations
When testing a null hypothesis about a single population mean and the population standard deviation is unknown, if the sample size is less than 30, one compares the computed test statistic for significance with a value from the ___________ distribution.
t distribution
in testing the difference between the means of two normally distributed populations using independent random samples with equal variances, the correct test statistic to use is
t statistic
in simple regression analysis, if the correlation coefficient is a positive value, then ____________
the slope of the regression line must also be positive
entropy Rsquare
the square of the simple correlation coefficient between the observed 0 and 1 upgrade values and the corresponding upgrade probability estimates
when the population is normally distributed, population standard deviation is unknown and the sample size is n = 15, the confidence interval for the population mean is based on
the t distribution
rejecting a true null hypothesis is called a ___________ error
type 1
a continuous probability distribution having a regular shape, where the probability is evenly distributed over an interval of numbers is a(n) _____________distribution
uniform
which one of the following is not an assumption about the residuals in a regression model?
variance of zero