BST- Practice for Final
What is $z_{\alpha/2}$ for a 95% confidence interval of the population mean? Multiple Choice 0.48 0.49 1.645 1.96
1.96 Explanation $P(Z \ge z_{\alpha/2}) = \alpha/2$. Confidence level = $100(1 - \alpha)\%$. Use the z table or Excel's NORM.S.INV function: =NORM.S.INV(1-0.05/2) = 1.96.
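For anyone checking the table lookup outside Excel, here is a minimal Python sketch of the same calculation. It assumes Python 3.8 or later (for `statistics.NormalDist`); the helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist


def z_critical(confidence_level: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    # Same idea as Excel's =NORM.S.INV(1 - alpha/2): inverse CDF of the standard normal.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


z_95 = z_critical(0.95)
print(round(z_95, 2))  # 1.96
```

Calling `z_critical(0.90)` in the same way gives roughly 1.645, the other plausible choice in the list.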
Many experts believe that _____ of the data in the world today were created in the last two years alone. Multiple choice question. 75% 80% 90% 95% all
90%
This is a catch-phrase, meaning a massive amount of both structured and unstructured data. Multiple choice question. Big data Unstructured Cross-sectional Structured Time series
Big data
Many cities around the United States are installing LED streetlights, in part to combat crime by improving visibility after dusk. An urban police department claims that the proportion of crimes committed after dusk will fall below the current level of 0.84 if LED streetlights are installed. Specify the null and alternative hypotheses to test the police department's claim. Multiple Choice H0: p = 0.84 and HA: p ≠ 0.84 H0: p < 0.84 and HA: p ≥ 0.84 H0: p ≤ 0.84 and HA: p > 0.84 H0: p ≥ 0.84 and HA: p < 0.84
H0: p ≥ 0.84 and HA: p < 0.84 Explanation Hypothesis testing is used to resolve conflicts between two competing hypotheses on a particular population parameter of interest.
Which of the following BEST describes a frequency distribution for qualitative data? Multiple choice question. It groups data into histograms, and records the proportion (fraction) of observations in each histogram. It groups data into intervals called classes, and records the proportion (fraction) of observations in each class. It groups data into intervals called classes, and records the number of observations in each class. It groups data into categories, and records the number of observations in each category.
It groups data into categories, and records the number of observations in each category.
As a general guideline, we use the alternative hypothesis as a vehicle to establish something new, or contest the status quo, for which a corrective action may be required. T or F
T Explanation As a general guideline, we use the alternative hypothesis as a vehicle to establish something new, that is, to contest the status quo. In general, the null hypothesis regarding a particular population parameter of interest is specified with one of the following signs: =, ≥, ≤; the alternative hypothesis is then specified with the corresponding opposite sign: ≠, >, <.
In a one-tailed test, the rejection region is located under one tail (left or right) of the corresponding probability distribution, while in a two-tailed test this region is located under both tails. T or F
T Explanation A one-tailed test involves a null hypothesis that can be rejected only on one side of the hypothetical value. In a two-tailed test, we can reject the null hypothesis on either side of the hypothesized value of the population parameter.
Which of the following is an example of inferential statistics? Summarize the variability of the exam scores of 40 students based on all 40 exam scores. Test the longevity of all light bulbs based on a sample of 100 light bulbs. Find the average height of 50 female students at State University. Calculate a mutual fund's average return for the last five years.
Test the longevity of all light bulbs based on a sample of 100 light bulbs.
For a given confidence level $100(1 - \alpha)\%$ and population standard deviation σ, the width of the confidence interval for the population mean is wider, the smaller the sample size n. T or F
True Explanation The margin of error is $z_{\alpha/2}\,\sigma/\sqrt{n}$, so for a given confidence level $100(1 - \alpha)\%$ and population standard deviation σ, a smaller sample size n gives a wider interval. Likewise, for a given confidence level and sample size n, the interval is wider the greater the population standard deviation σ.
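The effect of n on the width can be seen numerically. The sketch below is only an illustration: σ = 10 and the three sample sizes are made-up values, and `ci_width` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma: float, n: int, confidence_level: float = 0.95) -> float:
    """Width of the z-based confidence interval for the mean: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sigma / sqrt(n)


# Hypothetical sigma = 10; only n changes between the three cases.
widths = {n: ci_width(sigma=10.0, n=n) for n in (25, 100, 400)}
for n, width in widths.items():
    print(n, round(width, 2))
```

Each fourfold increase in n halves the width, because n enters through its square root.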
If a small segment of the population is sampled then an estimate will be less precise. T or F
True Explanation The estimate will be less precise if the variability of the underlying population is high or a small segment of the population is sampled.
The null hypothesis typically corresponds to a presumed default state of nature. T or F
True Explanation We think of the null hypothesis as corresponding to a presumed default state of nature or status quo.
We draw a random sample of size 25 from the normal population with variance 2.8. If the sample mean is 21.5, what is a 90% confidence interval for the population mean? [20.5075, 22.4925] [20.6310, 22.3690] [20.9495, 22.0505] [21.0268, 21.9732]
[20.9495, 22.0505] Explanation The confidence interval for the population mean is computed as $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. Use the z table. The appropriate Excel functions are: Lower Limit =21.5-CONFIDENCE.NORM(0.10,SQRT(2.8),25) = 20.9495. Upper Limit =21.5+CONFIDENCE.NORM(0.10,SQRT(2.8),25) = 22.0505.
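As a cross-check on the Excel formulas, the same interval can be sketched in Python. This assumes Python 3.8 or later; `mean_confidence_interval` is an illustrative name, not a library function.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, variance, n, confidence_level):
    """z-based interval for the mean when the population variance is known."""
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Margin of error: z_{alpha/2} * sigma / sqrt(n), with sigma = sqrt(variance).
    margin = z * sqrt(variance) / sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = mean_confidence_interval(21.5, 2.8, 25, 0.90)
print(f"[{lower:.4f}, {upper:.4f}]")  # [20.9495, 22.0505]
```

Note that the question gives the variance (2.8), so the standard deviation is its square root. The distractor [20.5075, 22.4925] appears to correspond to using 2.8 in place of the square root of 2.8.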
Consider the following hypotheses that relate to the medical field: H0: A person is free of disease. HA: A person has disease. In this instance, a Type I error is often referred to as __________. Multiple Choice a false positive a false negative a negative result the power of the test
a false positive Explanation A Type I error is committed when we reject the null hypothesis when the null hypothesis is actually true. We often refer to this type of result as a false positive.
The interval scale of measurement Multiple choice question. allows for the use of negative values. is used to measure certain types of qualitative data. allows for the construction of meaningful ratios between the data values. is a weaker scale of measurement than the nominal scale.
allows for the use of negative values.
The hypothesis statement H: μ < 60 is an example of a(an) __________ hypothesis.
alternative Explanation In general, the null hypothesis regarding a particular population parameter of interest is specified with one of the following signs: =, ≥, ≤; the alternative hypothesis is then specified with the corresponding opposite sign: ≠, >, <.
What is the most typical form of a calculated confidence interval? Multiple Choice a. Point estimate ± Standard error b. Point estimate ± Margin of error c. Population parameter ± Standard error d. Population parameter ± Margin of error
b. Point estimate ± Margin of error Explanation It is common to construct a confidence interval as Point estimate ± Margin of error.
A stacked column chart is an advanced version of the______ chart.
bar
A(n) ______ depicts the frequency or the relative frequency for each category of a qualitative variable as a series of horizontal or vertical bars, the lengths of which are proportional to the values that are depicted. Multiple choice question. ogive polygon pie chart bar chart
bar chart
One method of graphical presentation for qualitative data is a _____. Multiple choice question. histogram stem-and-leaf diagram bar chart scatterplot
bar chart
A bar chart is sometimes referred to as a ______chart.
column
Relative frequency distributions are generally more useful than frequency distributions when Multiple choice question. comparing data sets of different sizes. comparing small data sets of the same size. comparing large data sets of the same size.
comparing data sets of different sizes.
______tables help us summarize the relationship between two categorical variables.
contingency
A _____ variable is characterized by uncountable values within an interval. Multiple choice question. nominal qualitative continuous discrete
continuous
Consider the following variable: a runner's time in a 100-meter race. This variable is best categorized as a ______ variable. Multiple choice question. nominal qualitative continuous discrete
continuous
Data that are collected about many subjects at the same point in time or without regard to differences in time are known as ______ data. Multiple choice question. cross-sectional correlated time series parametric
cross-sectional
When a researcher examines quantitative data and wants to know the number of observations that fall below the upper limit of a particular class, the researcher is BEST served by creating a ______. Multiple choice question. bar chart relative frequency distribution pie chart cumulative frequency distribution
cumulative frequency distribution
The branch of statistics that summarizes important aspects of a data set is often referred to as ______ statistics. descriptive correlated parametric nominal
descriptive
A ______ is a way to organize qualitative data into categories and record the number of observations in each category. frequency distribution polygon correlation coefficient scatterplot
frequency distribution
For quantitative data, a ______ groups data into classes and records the number of observations that falls into each class. Multiple choice question. pie chart scatterplot stacked bar chart frequency distribution
frequency distribution
In order to summarize qualitative data, a useful tool is a(n) ______. Multiple choice question. stem-and-leaf diagram ogive histogram frequency distribution
frequency distribution
The branch of statistics that draws conclusions about a large set of data based on a smaller set of data is often referred to as ______ statistics. inferential descriptive nominal parametric
inferential
It is important to note that numerical results are not very useful unless they are accompanied with clearly stated actionable business __________.
insights
A sociologist notes the birth year of 50 individuals. Nominal Ordinal interval Ratio
interval
Samples are primarily used to ______. make inferences about population parameters. contradict the population data. define a population.
make inferences about population parameters.
The ratio scale is Multiple choice question. less sophisticated than the ordinal scale. more sophisticated than the nominal scale. less sophisticated than the interval scale.
more sophisticated than the nominal scale.
In general, the null and alternative hypotheses are __________. Multiple Choice additive correlated multiplicative mutually exclusive
mutually exclusive Explanation We use hypothesis testing to resolve conflicts between two competing hypotheses on a particular population parameter of interest, and these competing hypotheses are mutually exclusive.
When we reject the null hypothesis when it is actually false, we have committed __________. Multiple Choice no error a Type I error a Type II error a Type I error and a Type II error
no error Explanation A Type I error is committed when we reject the null hypothesis when the null hypothesis is actually true. A Type II error is made when we do not reject the null hypothesis and the null hypothesis is actually false. Rejecting a null hypothesis that is false is the correct decision.
To construct a 95% confidence interval for µ, the sampling distribution of X (X bar) must be _______.
normal Explanation For estimating the population mean and the population proportion, the sampling distribution of the underlying statistic is approximately normal. The symmetry implied by the normal distribution allows us to construct a confidence interval by adding and subtracting the same margin of error to the point estimate.
The hypothesis statement H: µ = 25 is an example of a(an) __________ hypothesis.
null Explanation In general, the null hypothesis regarding a particular population parameter of interest is specified with one of the following signs: =, ≥, ≤; the alternative hypothesis is then specified with the corresponding opposite sign: ≠, >, <.
In general, we use sample data because sample data has more variability than population data. population data are inadequate. obtaining data from the population is often an expensive process. sample data is more precise than population data.
obtaining data from the population is often an expensive process.
Rating products from one to five stars generates __________data.
ordinal
A(n) ______ is a segmented circle whose segments portray the relative frequencies of the categories of some qualitative variable. Multiple choice question. pie chart ogive histogram bar chart
pie chart
One method of graphical presentation for qualitative data is a(n) ______. Multiple choice question. pie chart histogram ogive polygon
pie chart
A ______ includes all items of interest in a statistical problem. sample population pilot program subset
population
A research analyst collects data on the weekly closing price of gold throughout the year. The scale for these data is ______. Multiple choice question. ordinal nominal interval ratio
ratio
Generally, for a frequency distribution, the width of each interval is the_______ for each interval.
same
Sampling, rather than surveying an entire population, can offer some substantial benefits. Some of those benefits include obtaining less variability in the data set. saving money and time. eliminating the risk of error.
saving money and time.
A ______ ____chart is designed to visualize more than one categorical variable, plus it allows for the comparison of composition within each category.
stacked, column
The null hypothesis in a hypothesis test refers to __________. the desired outcome the default state of nature the altered state of nature the desired state of nature
the default state of nature Explanation We can think of a null hypothesis as corresponding to a presumed default state of nature or status quo. An alternative hypothesis, on the other hand, contradicts the default state of nature or status quo.
A local courier service advertises that its average delivery time is less than 6 hours for local deliveries. When testing the two hypotheses, H0: μ ≥ 6 and HA: μ < 6, μ stand for __________. Multiple Choice the mean delivery time the standard deviation of the delivery time the number of deliveries that took less than 6 hours the proportion of deliveries that took less than 6 hours
the mean delivery time Explanation Hypothesis testing is used to resolve conflicts between two competing hypotheses on a particular population parameter of interest.
Data that are collected by recording a characteristic of a subject over several time periods are referred to as ______ data. nonparametric time series cross-sectional correlated
time series
A contingency table shows the frequencies for _____ categorical variables.
two
The width of the confidence interval is _____ times the margin of error.
two Explanation Because we add and subtract the margin of error from $\bar{x}$, the width of the confidence interval is two times the margin of error.
A characteristic of interest that differs among various observations is referred to as a ______. Multiple choice question. constant variable coefficient parameter
variable
Three steps are essential for doing good statistics. Which of the following is not one of these three steps? Make sure the numeric data is linear Clearly communicate information into verbal and written language Find the right data Use the appropriate statistical tools
Make sure the numeric data is linear
This represents the least sophisticated level of measurement. Nominal scale Ratio scale Interval scale ordinal scale
Nominal scale
This scale represents the least sophisticated level of measurement. Multiple choice question. Ratio scale Ordinal scale Interval scale Nominal scale
Nominal scale
Time it takes each student to complete a final exam. Categorical Numerical; discrete Numerical; continuous
Numerical; continuous
The number of patrons who frequent a restaurant. Categorical Numerical; discrete Numerical; continuous
Numerical; discrete
With this data, we are only able to both categorize and rank the data with respect to some characteristic or trait. Multiple choice question. Ratio scale Interval scale Nominal scale Ordinal scale
Ordinal scale
A meteorologist records the amount of monthly rainfall over the past year. Nominal Ordinal interval Ratio
Ratio
An investor monitors the daily stock price of BP following the 2010 oil disaster in the Gulf of Mexico. Nominal Ordinal interval Ratio
Ratio
Which of the following is an example of cross-sectional data? Quarterly housing starts collected over the last 60 years Daily price of DuPont stock during the first quarter Results of market research testing current consumer preferences for soda drinks GDP of the United States from 1990-2010
Results of market research testing current consumer preferences for soda drinks
These days, it has become easy to access data by simply using a search engine like_______
There are several guidelines to follow when constructing graphs that summarize statistical data. Which of the following statements is LEAST accurate? Multiple choice question. Axes should be clearly labeled. Graphs should have a lot of adornments. The simplest graph that effectively communicates the data should be used. Axes that are numerical should be to the appropriate scale.
Graphs should have a lot of adornments.
What is the purpose of calculating a confidence interval? Multiple Choice A. To provide a range of values that has a certain large probability of containing the sample statistic of interest. B. To provide a range of values that, with a certain measure of confidence, contains the sample statistic of interest. C. To provide a range of values that, with a certain measure of confidence, contains the population parameter of interest. D. To provide a range of values that has a certain large probability of containing the population parameter of interest.
C. To provide a range of values that, with a certain measure of confidence, contains the population parameter of interest. Explanation The concept of confidence intervals is used to estimate the unknown population parameters. For a given calculated confidence interval, the probability of containing this parameter is actually either 1 or 0 because this interval may contain the parameter or not.
Which of the following is an example of descriptive statistics? Estimate the percent of all U.S. voters who approve of the President's performance. Test whether the average lifetime of all Brand A batteries exceeds 500 hours based on lifetime data from a sample of 20 Brand A batteries. Conclude which of two brands of coffee is preferred by all coffee consumers based on taste-test preferences of 200 coffee drinkers. Calculate the percent of 2500 U.S. voters in an opinion poll who approve of the President's performance.
Calculate the percent of 2500 U.S. voters in an opinion poll who approve of the President's performance.
Colors of cars in a mall parking lot. Categorical Numerical; discrete Numerical; continuous
Categorical
Question 34
Explanation a. The null hypothesis reflects the status quo, which means the average wait time is within 10 minutes. The alternative hypothesis contradicts the status quo, which means the average wait time is more than 10 minutes. b. Rejecting the null hypothesis means there is evidence that the average waiting time is greater than 10 minutes, or the population mean is to the right side of the hypothesized value of 10 minutes. Therefore, a one-tailed test should be conducted. c. The manager is interested in supporting the claim that the average wait time is within 10 minutes (i.e., the null hypothesis is true). As a result, the manager is concerned about a Type I error of rejecting the null hypothesis when it is true. d. The customers are interested in refuting the claim that the average wait time is within 10 minutes (i.e., the null hypothesis is false). In other words, the customers are concerned about a Type II error of not rejecting the null hypothesis when it is false.
On the basis of sample information, we either "accept the null hypothesis" or "reject the null hypothesis." T or F
F Explanation On the basis of sample information, we either "reject the null hypothesis" or "do not reject the null hypothesis." Only one of two hypotheses is true and the hypotheses cover all possible values of the population parameter.
A Type II error is made when we reject the null hypothesis and the null hypothesis is actually false. T or F
False Explanation A Type II error is made when we do not reject the null hypothesis that is actually false.
A confidence interval provides a value that, with a certain measure of confidence, is the population parameter of interest. T or F
False Explanation A confidence interval provides a range of values that, with a certain level of confidence, contains the population parameter of interest.
For a given confidence level $100(1 - \alpha)\%$ and sample size n, the width of the confidence interval for the population mean is narrower, the greater the population standard deviation σ. T or F
False Explanation For a given confidence level $100(1 - \alpha)\%$ and sample size n, the width of the confidence interval for the population mean is wider, the greater the population standard deviation σ.
You want to test if more than 20% of homes in a neighborhood have recently been sold through a short sale, at a foreclosure auction, or by the bank following an unsuccessful foreclosure auction. You take a sample of 60 homes from this neighborhood and find that 14 fit your criteria. The appropriate null and alternative hypotheses are __________. Multiple Choice H0: p̄ ≤ 0.23 and HA: p̄ > 0.23 H0: p̄ = 0.23 and HA: p̄ ≠ 0.23 H0: p ≤ 0.20 and HA: p > 0.20 H0: p = 0.20 and HA: p ≠ 0.20
H0: p ≤ 0.20 and HA: p > 0.20 Explanation The competing hypotheses are H0: p ≤ p0 and HA: p > p0. It is referred to as a right-tailed test of the population proportion.
Expedia would like to test if the average round-trip airfare between Philadelphia and Dublin is less than $1,200. The correct hypothesis statement would be __________. Multiple Choice H0: μ ≠ $1,200 and HA: μ = $1,200 H0: μ < $1,200 and HA: μ ≥ $1,200 H0: μ ≥ $1,200 and HA: μ < $1,200 H0: μ = $1,200 and HA: μ ≠ $1,200
H0: μ ≥ $1,200 and HA: μ < $1,200 Explanation The competing hypotheses are H0: μ ≥ μ0 and HA: μ < μ0.
Which measurement scales are associated with quantitative data? Multiple choice question. Nominal and ordinal Interval and ratio Ordinal and interval Nominal and ratio
Interval and ratio
How many rows are there in a contingency table? Multiple choice question. 1 2 3 A contingency table does not have rows It depends 4
It depends
Consider the following hypotheses: H0: μ ≤ 57.1 and HA: μ > 57.1. A sample of 22 observations yields a sample mean of 58.2. Assume that the sample is drawn from a normal population with a population standard deviation of 3.6. (You may find it useful to reference the appropriate table: z table or t table) a-1. Find the p-value. Multiple choice: p-value ≥ 0.10; p-value < 0.01; 0.01 ≤ p-value < 0.025; 0.025 ≤ p-value < 0.05; 0.05 ≤ p-value < 0.10. a-2. What is the conclusion if α = 0.10? Multiple choice: Do not reject H0 since the p-value is smaller than α. Do not reject H0 since the p-value is greater than α. Reject H0 since the p-value is smaller than α. Reject H0 since the p-value is greater than α. a-3. Interpret the results at α = 0.10. Multiple choice: We cannot conclude that the sample mean is greater than 57.1. We conclude that the sample mean is greater than 57.1. We cannot conclude that the population mean is greater than 57.1. We
Review - Steps Explanation a-1. With n = 22, the value of the test statistic is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{58.2 - 57.1}{3.6/\sqrt{22}} = 1.43$; the p-value = P(Z ≥ 1.43) = 0.0764. a-2. We reject H0 because the p-value < 0.10 = α. b-1. With n = 108, the value of the test statistic is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{58.2 - 57.1}{3.6/\sqrt{108}} = 3.18$; the p-value = P(Z ≥ 3.18) = 0.0007. b-2. We reject H0 because the p-value < 0.10 = α.
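The steps in this explanation can be sketched in Python as a check on the table work. This is a minimal sketch, assuming Python 3.8 or later; `right_tailed_z_test` is an illustrative helper name. Because the code does not round z before looking up the probability, its p-value for n = 22 comes out slightly below the table-based 0.0764.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p-value) for H0: mu <= mu0 versus HA: mu > mu0 with sigma known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Right-tailed test: the p-value is the area to the right of z.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


alpha = 0.10
z_22, p_22 = right_tailed_z_test(58.2, 57.1, 3.6, 22)
z_108, p_108 = right_tailed_z_test(58.2, 57.1, 3.6, 108)

print(round(z_22, 2), p_22 < alpha)    # z rounds to 1.43; reject H0 at alpha = 0.10
print(round(z_108, 2), p_108 < alpha)  # z rounds to 3.18; reject H0 at alpha = 0.10
```

Both p-values fall below α = 0.10, matching the "reject H0" conclusions in a-2 and b-2.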
Exercise 7-6 Static A random sample is drawn from a normally distributed population with mean μ = 12 and standard deviation σ = 1.5. [You may find it useful to reference the z table.] a. What is the expected value and the standard error of the sampling distribution of the sample mean with n = 20 and n = 40? (Round the standard error to 3 decimal places.) b. Can you conclude that the sampling distribution of the sample mean is normally distributed for both sample sizes? Multiple choice: Yes, both the sample means will have a normal distribution. No, both the sample means will not have a normal distribution. No, only the sample mean with n = 20 will have a normal distribution. No, only the sample mean with n = 40 will have a normal distribution. c. If the sampling distribution of the sample mean is normally distributed with n = 20, then calculate the probability that the sample mean is less than 12.5. (If approp
Review Steps --- and process question 35
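When reviewing the steps for Exercise 7-6, a short Python sketch may help with the arithmetic. It is not the textbook's worked solution; it assumes the standard results that the sample mean has expected value μ and standard error σ/√n, and it uses unrounded values, so a z-table answer may differ in the last digit.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 12, 1.5  # population mean and standard deviation from the exercise

# Part a: the expected value of the sample mean is mu; the standard error is sigma / sqrt(n).
se_20 = sigma / sqrt(20)
se_40 = sigma / sqrt(40)
print(mu, round(se_20, 3), round(se_40, 3))

# Part c: P(sample mean < 12.5) when n = 20, using a normal distribution
# centered at mu with standard deviation equal to the standard error.
prob_below_12_5 = NormalDist(mu, se_20).cdf(12.5)
print(round(prob_below_12_5, 4))
```

For part b, the population itself is stated to be normal, so the sample mean is normally distributed for any sample size.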
Exercise 7-7 Algo A random sample is drawn from a population with mean μ = 56 and standard deviation σ = 4.7. [You may find it useful to reference the z table.] a. What is the expected value and the standard error of the sampling distribution of the sample mean with n = 14 and n = 34? (Round the standard error to 3 decimal places.) b. Can you conclude that the sampling distribution of the sample mean is normally distributed for both sample sizes? Multiple choice: Yes, both the sample means will have a normal distribution. No, both the sample means will not have a normal distribution. No, only the sample mean with n = 14 will have a normal distribution. No, only the sample mean with n = 34 will have a normal distribution. c. If the sampling distribution of the sample mean is normally distributed with n = 14, then calculate the probability that the sample mean falls between 56 and 58. (If appropriate, round final
Review Steps for problem 36
______column charts help us summarize the relationship between two categorical variables.
Stacked
A sales invoice is what type of data? Multiple choice question. Unstructured Structured Cross-sectional Time series
Structured
A sales invoice is what type of data? Multiple choice question. Unstructured Structured Time series Cross-sectional
Structured
__________data often consist of numerical information that is objective and is not open to interpretation. Multiple choice question. Structured Cross-sectional Unstructured Time series
Structured
For a given sample size, any attempt to reduce the likelihood of making one type of error (Type I or Type II) will increase the likelihood of the other error. T or F
True Explanation It is not always easy to determine which of two errors has more serious consequences. For a given evidence, there is a trade-off between these errors; by reducing Type I error, we implicitly increase Type II error, and vice versa.
It is generally believed that no more than 0.50 of all babies in a town in Texas are born out of wedlock. A politician claims that the proportion of babies born out of wedlock is increasing. In testing the politician's claim, how does one define the population parameter of interest? Multiple Choice The current proportion of babies born out of wedlock The mean number of babies born out of wedlock The number of babies born out of wedlock The general belief that the proportion of babies born out of wedlock is no more than 0.50
The current proportion of babies born out of wedlock Explanation Hypothesis testing is used to resolve conflicts between two competing hypotheses on a particular population parameter of interest.
The national average for an eighth-grade reading comprehension test is 73. A school district claims that its eighth-graders outperform the national average. In testing the school district's claim, how does one define the population parameter of interest? Multiple Choice The mean score on the eighth-grade reading comprehension test The number of eighth graders who took the reading comprehension test The standard deviation of the score on the eighth-grade reading comprehension test The proportion of eighth graders who scored above 73 on the reading comprehension test
The mean score on the eighth-grade reading comprehension test Explanation Hypothesis testing is used to resolve conflicts between two competing hypotheses on a particular population parameter of interest.
Which of the following characteristics of interest is a variable? Multiple choice question. The number of degrees in a circle The number of months in a year The number of pizzas ordered from Pizza Hut per day The number of letters in the English alphabet
The number of pizzas ordered from Pizza Hut per day
Which of the following scenarios is an example of the interval scale? Multiple choice question. An undergraduate's major The outside temperature (in degrees Fahrenheit) A runner's time in a race (in seconds) The price of a stock (in dollars)
The outside temperature (in degrees Fahrenheit)
When constructing a graph, which of the following statements is MOST accurate? Multiple choice question. The simplest graph should be used for a given set of data. It is common to give the vertical axis a very high value as an upper limit. The vertical axis should be stretched so that an increase (or decrease) of the data appears pronounced.
The simplest graph should be used for a given set of data.
_____ data can include hourly, daily, weekly, monthly, quarterly, or annual observations.
Time Series
A Type I error is committed when we reject the null hypothesis, which is actually true. T or F
True Explanation A Type I error is committed when we reject the null hypothesis, which is actually true.
A professional sports organization is going to implement a test for steroids. The test gives a positive reaction in 94% of the people who have taken the steroid. However, it erroneously gives a positive reaction in 4% of the people who have not taken the steroid. What are the probabilities of Type I and Type II errors, given the null hypothesis "the individual has not taken steroids"? Multiple Choice Type I: 4%, Type II: 6% Type I: 6%, Type II: 4% Type I: 94%, Type II: 4% Type I: 4%, Type II: 94%
Type I: 4%, Type II: 6% Explanation We consider two types of errors in hypothesis testing: a Type I error and a Type II error. A Type I error is committed when we reject the null hypothesis when the null hypothesis is actually true. A Type II error is made when we do not reject the null hypothesis and the null hypothesis is actually false. Type I: a positive result for someone who has not taken the steroid, given as 4%. Type II: a negative result for someone who has taken the steroid, 100% − 94% = 6%.
Which characteristic of big data does the following describe? The credibility and quality of data. Multiple choice question. Variety (data come in all types, forms, and granularity, both structured and unstructured). Veracity. Value (organizations must develop a methodical plan for formulating business questions, curating the right data, and unlocking the hidden potential in big data). Volume (an immense amount of data is compiled from a single source or a wide range of sources, including business transactions, household and personal devices, manufacturing equipment, social media, and other online portals). Velocity (data from a variety of sources get generated at a rapid speed).
Veracity
Suppose taxi fares from Logan Airport to downtown Boston are known to be normally distributed, and a sample of seven taxi fares produces a mean fare of $19.51 and a 95% confidence interval of [$19.16, $19.84]. Which of the following statements is a valid explanation of the confidence interval? Multiple Choice 95% of all taxi fares are between $19.16 and $19.84. We are 95% confident that a randomly selected taxi fare will be between $19.16 and $19.84. The mean amount of a taxi fare is $19.51, 95% of the time. We are 95% confident that the average taxi fare between Logan Airport and downtown Boston will fall between $19.16 and $19.84.
We are 95% confident that the average taxi fare between Logan Airport and downtown Boston will fall between $19.16 and $19.84. Explanation Technically, the 95% confidence interval for the population mean μ implies that for 95% of the samples, the procedure (formula) produces an interval that contains μ. Informally, we can report with 95% confidence that μ lies in the given interval. It is not correct to say that there is a 95% chance that μ lies in the given interval. It either does or does not fall in the interval. The probability is either zero or one. In addition, the interval is not about the sample. We know the true value for the sample.
You are considering buying insurance for your new laptop computer, which you have recently bought for $1,400. The insurance premium for three years is $113. Over the three-year period there is a 10% chance that your laptop computer will require work worth $538, a 5% chance that it will require work worth $932, and a 2% chance that it will completely break down with a scrap value of $120. Should you buy the insurance? (Assume risk neutrality.) Yes, because the expected cost of repair is greater than the cost of the insurance. No, because the expected cost of repair is less than the cost of the insurance.
Yes Explanation Let X represent the repair cost if you do not buy the insurance. A complete breakdown costs the purchase price less the scrap value, $1,400 − $120 = $1,280, and the remaining probability of no repair is 1 − 0.10 − 0.05 − 0.02 = 0.83. Therefore, E(X) = 538 × 0.10 + 932 × 0.05 + 1,280 × 0.02 + 0 × 0.83 = 126. If you are risk neutral, you should buy the insurance because the expected cost of repair ($126) is greater than the cost of the insurance ($113). In other words, a risk-neutral person gains $13 ($126 − $113) by buying the insurance.
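The expected-value arithmetic can be sketched in a few lines of Python. The variable names are illustrative; the numbers come from the question, with the breakdown loss taken as the purchase price minus the scrap value.

```python
purchase_price = 1400
scrap_value = 120
premium = 113

# (repair cost, probability) pairs; the last entry is the "no repair needed" case.
outcomes = [
    (538, 0.10),
    (932, 0.05),
    (purchase_price - scrap_value, 0.02),  # complete breakdown: 1,280
    (0, 0.83),
]

# Expected value of a discrete random variable: sum of value * probability.
expected_cost = sum(cost * prob for cost, prob in outcomes)
buy_insurance = expected_cost > premium  # decision rule for a risk-neutral buyer

print(round(expected_cost, 2), buy_insurance)  # 126.0 True
```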
The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 890 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 840. Assume that the population standard deviation is 290. The value of the test statistic is __________. Multiple Choice t99 = 1.724 t99 = −1.724 z = 1.724 z = −1.724
z = −1.724 Explanation When testing the population mean and the population standard deviation is known, the value of the test statistic is computed as $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$. Here $z = \frac{840 - 890}{290/\sqrt{100}} = -1.7241$.
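A quick Python check of the test statistic is sketched below. It uses the values from the question; `z_statistic` is an illustrative helper name. The statistic is z rather than t because the population standard deviation is given.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """Test statistic for the population mean when sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


z = z_statistic(sample_mean=840, mu0=890, sigma=290, n=100)
print(round(z, 3))  # -1.724
```

The sign is negative because the sample mean (840) lies below the hypothesized value (890), which is the direction the owner's claim predicts.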
