Stat Exam Prep

Scores on Ms. Bond's test have a mean of 70 and a standard deviation of 11. Michelle has a score of 48. Convert Michelle's score to a z-score. (Round to two decimal places if necessary.)

-2
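
The arithmetic behind this card can be sketched in a few lines of Python. The function name `z_score` is an illustrative choice, not something from the source.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd

# Michelle's score of 48 on a test with mean 70 and standard deviation 11
michelle_z = z_score(48, 70, 11)
print(round(michelle_z, 2))  # -2.0
```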

If P(A)=.80, P(B)=.65 and P(AUB)=.78, then P(A|B)=

.9750

The starting salaries of individuals with an MBA degree are normally distributed with a mean of $40,000 and a standard deviation of $5,000. What percentage of MBA's will have starting salaries of $34,000 to $46,000?

About 77% (0.7699)

Forty percent of all registered voters in a national election are female. A random sample of 5 voters is selected. What is the probability that there are no females in the sample?

0.0778
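
A minimal sketch of the binomial calculation, assuming the five voters are sampled independently with a 0.40 chance of being female each time. The helper name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# No females among 5 voters when 40% of voters are female
p_no_females = binomial_pmf(0, 5, 0.40)
print(round(p_no_females, 4))  # 0.0778
```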

The random variable x is known to be uniformly distributed between 70 and 90. The probability of x having a value between 80 and 95 is

0.5

Suppose we had a data set of a call center where customers were asked to choose between the following three options: hear account information, billing questions, and customer service. Using the given order of the three options, using 0-1 dummy variables to encode the categorical variables, which of the following combinations would yield an entry "customer service"?

001
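
One way to see where "001" comes from is to build the 0-1 codes directly. This sketch assumes one dummy column per option, in the order listed on the card; the function name `dummy_code` is made up for illustration.

```python
# Options in the order given on the card
options = ["hear account information", "billing questions", "customer service"]

def dummy_code(choice, categories):
    """Return a 0-1 string with a 1 in the position of the chosen category."""
    return "".join("1" if category == choice else "0" for category in categories)

codes = {option: dummy_code(option, options) for option in options}
print(codes["customer service"])  # 001
```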

The strength of a cluster can be measured by comparing the average distance in a cluster to the distance between cluster centroids. One rule of thumb is that the ratio for between-cluster distance to within-cluster distance should exceed what value for useful clusters?

1

What is the total area under the normal distribution curve?

1

Compute the geometric mean for the following data on growth factors of an investment for 10 years. 1.10, .50, .70, 1.21, 1.25, 1.12, 1.16, 1.11, 1.13, 1.22

1.0148
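
The geometric mean is the nth root of the product of the n growth factors. A short sketch (the function name is an illustrative choice):

```python
from math import prod

growth_factors = [1.10, 0.50, 0.70, 1.21, 1.25, 1.12, 1.16, 1.11, 1.13, 1.22]

def geometric_mean(values):
    """nth root of the product of n positive values."""
    return prod(values) ** (1 / len(values))

gm = geometric_mean(growth_factors)
print(round(gm, 4))  # about 1.0148
```

A value slightly above 1 means the investment grew by roughly 1.5% per year on average, despite the two losing years.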

A sample of 2,500 people was asked how many cups of coffee they drink in the morning. You are given the following information: 0=700, 1=900, 2=600, 3=300. The expected number of cups of coffee is

1.2
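
The expected value here is a frequency-weighted average. A sketch using the counts from the card:

```python
# Number of cups -> number of people who reported that many
counts = {0: 700, 1: 900, 2: 600, 3: 300}

total_people = sum(counts.values())                       # 2500
total_cups = sum(cups * people for cups, people in counts.items())
expected_cups = total_cups / total_people

print(expected_cups)  # 1.2
```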

A survey of 100 random high school students finds that 85 students watched the Super Bowl, 25 students watched the Stanley Cup Finals, and 20 students watched both games. How many students did not watch either?

10

The College Board reported that the mean Math Level 2 SAT subject test score was 686 with a standard deviation of 96. Assuming scores follow a bell-shaped distribution, use the empirical rule to find the percentage of students who scored less than 494.

2.5%

The manager of a grocery store has selected a random sample of 100 customers. The average length of time it took these 100 customers to check out was 3.0 minutes. It is known that the SD of the checkout time is one minute. The 95% confidence interval for the average checkout time for all customers is________________.

2.804 to 3.196
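
Because the population SD is known, the interval is the sample mean plus or minus z times sigma / sqrt(n). A minimal sketch, with `z_interval` as a hypothetical helper name:

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when the population SD is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_interval(3.0, 1.0, 100)
print(round(low, 3), round(high, 3))  # 2.804 3.196
```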

In order to determine an interval for the mean of a population with unknown SD, a sample of 24 items is selected. The mean of the sample is determined to be 23. The number of degrees of freedom for reading the t-value is

23

Manhattan distance is the distance traveled as if traveled along rectangular city blocks. The Manhattan distance for the standardized observations of (-1.85, 0.65) and (0.55, -.75) is

3.80
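
Manhattan distance adds up the absolute differences coordinate by coordinate. A short sketch (function name chosen for illustration):

```python
def manhattan_distance(u, v):
    """Sum of absolute coordinate differences between two observations."""
    return sum(abs(a - b) for a, b in zip(u, v))

d_manhattan = manhattan_distance((-1.85, 0.65), (0.55, -0.75))
print(round(d_manhattan, 2))  # 3.8, i.e. 2.40 + 1.40
```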

A researcher has collected the following sample data. The mean of the sample is 5. 3,5,12,3,2. What is the SD?

4.062
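
The sample variance, standard deviation, and coefficient of variation for this data set can all be checked with Python's standard `statistics` module, which divides by n - 1 for sample statistics:

```python
from statistics import mean, stdev, variance

data = [3, 5, 12, 3, 2]

sample_mean = mean(data)            # 5
sample_var = variance(data)         # sum of squared deviations / (n - 1)
sample_sd = stdev(data)             # square root of the sample variance
cv_percent = sample_sd / sample_mean * 100

print(sample_var, round(sample_sd, 3), round(cv_percent, 2))  # 16.5 4.062 81.24
```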

Compute the relative frequency for students who earned a C shown in the table below. A=10, B=31, C=36, D=8. total 83

0.43 (about 43%)

Euclidean distance can be used to calculate the dissimilarity between two observations. Let u = (25, $350) correspond to a 25-year-old customer that spent $350 at Store A in the previous fiscal year. Let v = (53, $420) correspond to a 53-year-old customer that spent $420 at Store A in the previous fiscal year. Calculate the dissimilarity between these two observations using Euclidean distance.

75.39
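
Euclidean distance is the square root of the sum of squared coordinate differences. A sketch using the coordinates u = (25, 350) and v = (53, 420), which are the values that reproduce the card's answer:

```python
from math import dist

u = (25, 350)   # age, dollars spent
v = (53, 420)

d_euclidean = dist(u, v)   # sqrt(28**2 + 70**2)
print(round(d_euclidean, 2))  # 75.39
```

Note that the spending difference dominates the result because the two variables are not standardized.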

Which of the following graphs provides information on outliers and IQR of a data set?

Box Plot

_____________ is a measure of calculating dissimilarity between clusters by calculating the distance between the centroids of the two clusters.

Centroid linkage

Natalie needs to compare the number of employees by job title for the last five years. Which of the following charts should Natalie use?

Clustered-column (bar) chart

______________ is a measure of calculating dissimilarity between clusters by considering only the two most dissimilar observations in the two clusters.

Complete linkage

________________ are collected from several entities at the same point in time.

Cross-sectional data

A retail owner offers a discount on product A and predicts that the customers would purchase products B and C in addition to product A. Identify the technique used to make such a prediction.

Data Mining

A student wants to determine if pennies are really fair, meaning equally likely to land heads up or tails up. He flips a random sample of 50 pennies and finds that 28 of them land heads up. What are the appropriate null and alternative hypotheses?

H0: p = 0.5, Ha: p ≠ 0.5

_____________ is the most critical step of the decision making process.

Identifying and defining the problem

Which statement is true about mutually exclusive events?

If events A and B cannot occur at the same time, they are called mutually exclusive.

Data-ink is the ink used in a table or chart that

Is necessary to convey the meaning of the data to the audience

Which of the following is true of Euclidean distances?

It is commonly used as a method of measuring dissimilarity between quantitative observations.

DJ needs to display data over time. Which of the following charts should he use?

Line chart

_____________ is the dissimilarity measure that is more robust to outliers than Euclidean distance.

Manhattan distance

If A and B are independent events with P(A)=.38 and P(B)=.55, then P(A|B)=

.38 (for independent events, P(A|B) = P(A))

To summarize and analyze data with both a crosstabulation and charting, Excel typically pairs

PivotCharts with PivotTables

Which of the following analytical techniques helps us arrive at the best decision?

Prescriptive analytics

Data-driven decision making tends to decrease a firm's _________.

Risk

A __________ is a graphical presentation of the relationship between two quantitative variables.

Scatter chart

A ___________ decision is concerned with how the organization should achieve the goals and objectives set by its strategy.

Tactical

Which of the following statements is correct?

The binomial distribution is a discrete probability distribution and the normal distribution is a continuous probability distribution.

For a population with an unknown distribution, the form of the sampling distribution of the sample mean is__________.

approximately normal for a large sample size

A chart that is recommended as an alternative to a pie chart is a

bar chart

The charts that are helpful in making comparisons between categorical variables are

bar charts and column charts

In interval estimation, as the sample size becomes larger the interval estimate

becomes narrower

The correlation coefficient will always take the values

between -1 and +1

The data preparation technique used in market segmentation to divide consumers into different homogeneous groups is called

cluster analysis

An alternative for a stacked column chart when comparing more than a couple of quantitative variables in each category is

clustered column chart

Complete linkage can be used to measure the distance between _______________ in cluster analysis.

clusters

A graphical presentation that uses vertical bars to display the magnitude of quantitative data is known as a

column chart

In preparing categorical variables for analysis, it is usually best to _____________.

convert the categories to binary, dummy variables

A data visualization tool that updates in real time and gives multiple outputs is called

data dashboard

The US Internal Revenue Service uses _________ to identify patterns that distinguish questionable annual income tax filing.

data mining

Fields may be chosen to represent all of the following except ________ in the body of a PivotTable.

filters

The finite population correction factor should be used in the computation of the standard deviation of the sample mean and of the sample proportion when n/N is

greater than .05

A two-dimensional graph representing the data using different shades of color to indicate magnitude is called

heat map

Bar charts use

horizontal bars to display the magnitude of the quantitative variable

Tactical decisions are concerned with

how the organization should achieve the goals and objectives set by its strategy.

Deleting the grid lines in a table and the horizontal lines in a chart

increases the data-ink ratio

A disadvantage of stacked-column charts and stacked-bar charts is that

it can be difficult to perceive small differences in areas

In a business, the values indicating the business's current operating characteristics, such as its financial position, the inventory on hand, and customer service metrics, are typically known as

key performance indicators

A time series plot is also known as a

line chart

Single linkage can be used to measure the distance between clusters that are the _______ in cluster analysis.

most similar

A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size

n has the same probability of being selected

In k-means clustering, k represents the

number of clusters

Euclidean distance can be used to measure the distance between ______ in cluster analysis.

observations.

K-means clustering is the process of ______________.

organizing observations into distinct groups based on a measure of similarity.

In many cases, white space in a chart can improve

readability

A _____________ acts as a representative of the population.

sample

The value of the _____ is used to estimate the value of the population parameter.

sample statistic

A useful chart for displaying multiple variables is the ____________.

scatter chart matrix

The basis for using a normal probability distribution to approximate the sampling distribution of the sample mean and the sample proportion is

the central limit theorem

All of the events in the sample space that are not part of the specified event are called

the complement of the event

When clustering only by dummy variables that represent categorical variables, the simplest measure of similarity between two observations is called the

matching coefficient

Simulation optimization helps ______________.

to find good decisions in highly complex and highly uncertain settings.

The symbol U indicates the

union of events

The CEO of a company wants to estimate the percent of employees that use company computers to go on Facebook during work hours with 95% confidence. He selects a random sample of 150 of the employees and finds that 53 of them logged onto Facebook that day. What is the estimate of the standard error of the proportion?

.039, from √(.353(1-.353)/150)
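
The estimated standard error of a sample proportion is sqrt(p(1 - p)/n), with the sample proportion standing in for the unknown population value. A minimal sketch:

```python
from math import sqrt

logged_on, n_employees = 53, 150

p_hat = logged_on / n_employees                       # point estimate, about 0.353
se_p_hat = sqrt(p_hat * (1 - p_hat) / n_employees)    # estimated standard error

print(round(p_hat, 3), round(se_p_hat, 3))  # 0.353 0.039
```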

A sample of 51 observations will be taken from a process (an infinite population). The population proportion equals 0.85. The probability that the sample proportion will be between 0.9115 and 0.946 is

.0819
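
This probability can be checked by treating the sample proportion as approximately normal with mean p and standard error sqrt(p(1 - p)/n). The normal approximation is an assumption of this sketch, as it is of the textbook answer.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.85, 51
standard_error = sqrt(p * (1 - p) / n)        # 0.05 for these inputs

sampling_dist = NormalDist(mu=p, sigma=standard_error)
prob_between = sampling_dist.cdf(0.946) - sampling_dist.cdf(0.9115)

print(round(prob_between, 4))  # about 0.0819
```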

The p-value is equal to: (n = 49, x̄ = 54.8, σ = 28, H0: μ = 50, Ha: μ ≠ 50)

.2302
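
A sketch of the calculation, assuming a two-tailed z test of H0: μ = 50 against Ha: μ ≠ 50 with known σ = 28, which is the reading consistent with the card's answer:

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma, mu_0 = 49, 54.8, 28, 50

z = (x_bar - mu_0) / (sigma / sqrt(n))          # test statistic, about 1.2
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # area in both tails

print(round(z, 2), round(p_value, 4))
```

Software gives about 0.2301; the card's .2302 comes from a z-table rounded to four decimals (2 × 0.1151).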

The assembly time for a product is uniformly distributed between 6 and 10 minutes. The probability density function has what value in the interval between 6 and 10?

.25

The CEO of a company wants to estimate the percent of employees that use company computers to go on Facebook during work hours with 95% confidence. He selects a random sample of 150 of the employees and finds that 53 of them logged onto Facebook that day. What is the point estimate of the proportion of the population that logged onto Facebook that day?

.35

The time between arrivals of vehicles at a particular intersection follows an exponential probability distribution with a mean of 12 seconds. What is the probability that the arrival time between vehicles is 6 seconds or less?

.3935
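
For an exponential distribution with mean m, P(X <= x) = 1 - e^(-x/m). A short sketch (function name is illustrative):

```python
from math import exp

def exponential_cdf(x, mean):
    """P(X <= x) for an exponential distribution with the given mean."""
    return 1 - exp(-x / mean)

p_six_or_less = exponential_cdf(6, 12)
print(round(p_six_or_less, 4))  # 0.3935
```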

In a multiple regression analysis involving 15 independent variables and 200 observations, SST=800 and SSE= 240. The coefficient of determination is

.700

A researcher has collected the following sample data. The mean of the sample is 5. 3,5,12,3,2. What is the variance?

16.5

A health-conscious student faithfully wears a device that tracks his steps. Suppose that the distribution of the number of steps he takes in a day is normally distributed with a mean of 10,000 steps and a SD of 1,500 steps. On what percent of days does he exceed 13,000 steps?

2.28%
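
Exceeding 13,000 steps means landing more than two standard deviations above the mean, so the answer is the upper-tail area of the normal curve. A minimal sketch:

```python
from statistics import NormalDist

steps = NormalDist(mu=10_000, sigma=1_500)

p_exceeds = 1 - steps.cdf(13_000)      # upper-tail area beyond z = 2
print(round(p_exceeds * 100, 2))       # 2.28 (percent)
```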

A researcher has collected the following sample data. The mean of the sample is 5. 3,5,12,3,2. What is the coefficient of variation?

81.24%

Compute the IQR for the following data set: 10, 15, 17, 21, 25, 12, 16, 13, 11, 22

9.50
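
Quartile conventions differ between textbooks and tools. The card's answer matches the "exclusive" method, which locates the pth percentile at position p(n + 1) in the sorted data; that is also the default in Python's `statistics.quantiles`. A sketch under that assumption:

```python
from statistics import quantiles

data = [10, 15, 17, 21, 25, 12, 16, 13, 11, 22]

# method="exclusive" places quartiles at positions 0.25(n + 1) and 0.75(n + 1)
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q3, iqr)  # 11.75 21.25 9.5
```

With the "inclusive" method the quartiles, and therefore the IQR, would come out differently.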

A manager of a fast food restaurant wants the drive through employee to ask every 5th customer if he or she is satisfied with the service. Who makes up the population?

All customers who use the drive through window at the restaurant.

The ratio of the amount of ink used in a table or chart that is necessary to convey information to the total amount of ink used in the table and chart is known as data-ink ratio. Using additional ink that is not necessary to convey information has what effect on the data-ink ratio?

It reduces the data-ink ratio

____________ merges maps and statistics to present data collected over different geographies.

The geographic information system

If the Euclidean distance were to be represented in a right triangle, which of the following would be considered the distance between two observations?

The hypotenuse

Larger values of α have the disadvantage of increasing the probability of making a

Type 1 error

As the number of degrees of freedom for a t distribution increases, the difference between the t distribution and the standard normal distribution

becomes smaller

In order to visualize three variables in a two-dimensional graph, we use a

bubble chart

Average linkage is a measure of calculating dissimilarity between two clusters by

computing the average distance between every pair of observations between two clusters.
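
The linkage measures in this set (single, complete, average, centroid) differ only in how they summarize distances between two clusters. The sketch below compares them on two tiny clusters whose points are hypothetical, made up purely for illustration.

```python
from itertools import product
from math import dist

# Hypothetical clusters, made up for illustration
cluster_a = [(0, 0), (0, 2)]
cluster_b = [(3, 0), (5, 0)]

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

# Distance for every pair with one point from each cluster
pair_distances = [dist(p, q) for p, q in product(cluster_a, cluster_b)]

single_linkage = min(pair_distances)       # two most similar observations
complete_linkage = max(pair_distances)     # two most dissimilar observations
average_linkage = sum(pair_distances) / len(pair_distances)
centroid_linkage = dist(centroid(cluster_a), centroid(cluster_b))

print(single_linkage, round(complete_linkage, 3),
      round(average_linkage, 3), round(centroid_linkage, 3))
```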

Corporate-level managers use _____________ to summarize sales by region, current inventory levels, and other company wide metrics all in a single screen.

data dashboards

A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a _____.

dendrogram

In order to manage an organization's human resource activities, such as hiring employees, tracking, and influencing employee retention, HR personnel use ________________.

descriptive and predictive analytics

Jaccard's coefficient is different from the matching coefficient in that the former:

does not count matching zero entries while the latter does
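
The difference is easiest to see on a pair of 0-1 vectors. The two observations below are hypothetical, and the function names are illustrative; the sketch also assumes the observations share at least one non-zero position so the Jaccard denominator is not zero.

```python
def matching_coefficient(u, v):
    """Share of variables on which the two observations agree (1-1 or 0-0)."""
    matches = sum(a == b for a, b in zip(u, v))
    return matches / len(u)

def jaccard_coefficient(u, v):
    """Like the matching coefficient, but 0-0 agreements are left out entirely."""
    both_one = sum(a == 1 and b == 1 for a, b in zip(u, v))
    both_zero = sum(a == 0 and b == 0 for a, b in zip(u, v))
    return both_one / (len(u) - both_zero)

# Hypothetical dummy-coded observations
obs_1 = [1, 0, 0, 1, 0]
obs_2 = [1, 0, 1, 1, 0]

matching = matching_coefficient(obs_1, obs_2)   # 4 agreements out of 5
jaccard = jaccard_coefficient(obs_1, obs_2)     # 2 shared ones out of 3 non-0-0 positions
print(matching, round(jaccard, 3))              # 0.8 0.667
```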

Any data value with a z-score less than -3 or greater than +3 is considered to be a(n)

outlier

A __________ is used for examining data with more than two variables, and it includes a different vertical axis for each variable.

parallel-coordinates plot

A simple random sample of 31 observations was taken from a large population. The sample mean equals 5. Five is a

point estimate

Bayes' theorem is a method used to compute __________ probabilities.

posterior

Revised probabilities of events based on additional information are

posterior probabilities

Observation refers to the

set of recorded values of variables associated with a single entity

A line chart that has no axes but is used to provide information on overall trends for time series data is called a

sparkline

A method for modifying variables that reduces bias prior to cluster analysis is

standardization

The goal of __________ is to use the variable values to identify relationships between observations.

unsupervised learning

