Chapter 2 Business Analytics

The histogram below represents scores achieved by 250 job applicants on a personality profile. How many job applicants scored above 50?

50

The average score for a class of 30 students was 75. The 20 male students in the class averaged 70. The 10 female students in the class averaged

85
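The answer follows because the class total is the sum of the two group totals. A minimal sketch of that arithmetic (variable names are illustrative, not from the source):

```python
# Class total = male total + female total, so the female average can be
# recovered from the overall average and the male average.
n_total, mean_total = 30, 75
n_male, mean_male = 20, 70
n_female = n_total - n_male  # 10

female_total = n_total * mean_total - n_male * mean_male  # 2250 - 1400 = 850
mean_female = female_total / n_female
print(mean_female)  # 85.0
```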

An operations management professor is interested in how her students performed on her midterm exam. The histogram shown below represents the distribution of exam scores (where the maximum score is 100) for 50 students. Based on this histogram, how would you characterize the students' performance on this exam?

Exam scores are fairly normally distributed. The majority of scores are between 70 and 90 points, while 23% of scores are above 90 and 12% of scores are 70 or below.

A data set is typically a rectangular array of data, with observations in columns and variables in rows.

False

A distribution with a high kurtosis has almost all of its observations within three standard deviations of the mean.

False

All nominal data may be treated as ordinal data.

False

Phone numbers, Social Security numbers, and zip codes are examples of numerical variables.

False

Suppose that a sample of 10 observations has a standard deviation of 3, then the sum of the squared deviations from the sample mean is 30.

False
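The statement is false because the sample standard deviation divides the sum of squared deviations by n - 1, so the sum equals (n - 1) times s squared. A small sketch, with a hypothetical helper name:

```python
def sum_of_squared_deviations(n, s):
    """Sum of squared deviations implied by a sample standard deviation s."""
    # s^2 = SS / (n - 1)  =>  SS = (n - 1) * s^2
    return (n - 1) * s ** 2

print(sum_of_squared_deviations(10, 3))  # 81
```

With n = 10 and s = 3 the sum is 81, not 30.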

A financial analyst collected useful information for 30 employees at Gamma Technologies, Inc. These data include each selected employee's gender, age, number of years of relevant work experience prior to employment at Gamma, number of years of employment at Gamma, number of years of post-secondary education, and annual salary.

Gender - categorical, nominal
Age - numerical, continuous
Prior experience - numerical, discrete
Gamma experience - numerical, discrete
Education - numerical, discrete
Annual salary - numerical, continuous

Suppose that an analysis of a set of test scores reveals that:

IQR = Q3 - Q1 = 60. This means that the middle 50% of the test scores are between 45 and 105

As a measure of variability, what is defined as the maximum value minus the minimum value?

Range

Suppose that an analysis of a set of test scores reveals that:

Since 34 is less than Q1 = 45, the observation 34 is among the lowest 25% of the values. The value 84 is a bit smaller than the middle value, which is Q2 = 85. Since Q3 = 105, the value 104 is larger than about 75% of the values.

In an effort to provide more consistent customer service, the manager of a local fast-food restaurant would like to know the dispersion of customer service times in relation to their average value for the facility's drive-up window. The table below provides summary measures for the customer service times (in minutes) for a sample of 50 customers collected over the past week. Explain why the mean is slightly lower than the median in this case.

The data are slightly skewed to the left. This causes the mean to be slightly lower than the median. It is important to understand that service times are bounded on the lower end by zero (it is impossible for the service time to be negative). However, there is no boundary on the maximum service time. Therefore, the smaller service times cause the mean to be somewhat lower than the median.

Suppose that an analysis of a set of test scores reveals that:

The fact that Q2-Q1 = 40 is greater than Q3-Q2 = 20 indicates that the distribution is skewed to the left
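The comparison of the two halves of the box can be written as a small check. The quartile values below are an assumption pieced together from the cards in this set (Q1 = 45 and Q3 = 105 from the IQR card, Q2 = 85 from Q2 - Q1 = 40), and the function name is illustrative. This is only a rule of thumb based on the middle 50% of the data.

```python
def quartile_skew(q1, q2, q3):
    """Classify skew from the distances between the quartiles."""
    lower, upper = q2 - q1, q3 - q2
    if lower > upper:
        return "skewed to the left"
    if lower < upper:
        return "skewed to the right"
    return "roughly symmetric"

q1, q2, q3 = 45, 85, 105          # assumed values, see the note above
print(q3 - q1)                    # 60
print(quartile_skew(q1, q2, q3))  # skewed to the left
```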

The data below represents monthly sales for two years of beanbag animals at a local retail store (Month 1 represents January and Month 12 represents December). Given the time series plot below, do you see any obvious patterns in the data? Explain.

This is a representation of seasonal data. There seems to be a small increase in months 3, 4, and 5 and a large increase at the end of the year. The sales of this item seem to peak in December and have a significant drop-off in January.

Age, height, and weight are examples of numerical data.

True

In the term "frequency table," frequency refers to the counts of observations in specified categories.

True

The core purpose of time series graphs is to detect historical patterns in the data.

True

The count of categories is the only meaningful way to summarize categorical data.

True

Categorizing age variables as "young," "middle-aged," and "elderly" is an example of

binning
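As a rough illustration of binning, a numerical age can be mapped to one of the three categories. The cut-offs (35 and 60) and the function name are hypothetical; the source does not define them.

```python
def age_bin(age):
    """Map a numerical age to a category. Cut-offs are hypothetical."""
    if age < 35:
        return "young"
    if age < 60:
        return "middle-aged"
    return "elderly"

ages = [22, 41, 67, 35, 59]
print([age_bin(a) for a in ages])
# ['young', 'middle-aged', 'elderly', 'middle-aged', 'middle-aged']
```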

In a generic box plot, the x inside the box indicates the location of the

mean

Excel stores dates as

numbers
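A sketch of what "stored as numbers" means, assuming Excel's default 1900 date system: each date is a serial day count, and for serial numbers from 61 (1 March 1900) onward the count equals the number of days since 30 December 1899. The helper name is illustrative.

```python
from datetime import date, timedelta

# Assumption: 1900 date system, serial numbers of 61 or greater.
EXCEL_EPOCH = date(1899, 12, 30)

def excel_serial_to_date(serial):
    """Convert an Excel serial day number to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)

print(excel_serial_to_date(44197))  # 2021-01-01
```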

In order for the characteristics of a sample to be generalized to the entire population, it should be:

representative of the population

Researchers may gain insight into the characteristics of a population by examining a

sample of the population

A histogram that is positively skewed is also called

skewed to the right

The following data represent the number of children in a sample of 10 families from Chicago: 4, 2, 1, 1, 5, 3, 0, 1, 0, and 2. What is the mean of these data?

1.90
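The value 1.90 matches the sample mean of the ten listed values, as a quick check shows:

```python
from statistics import mean

children = [4, 2, 1, 1, 5, 3, 0, 1, 0, 2]
print(sum(children), len(children))  # 19 10
print(mean(children))                # 1.9
```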

The histogram below represents scores achieved by 250 job applicants on a personality profile. What percentage of the job applicants scored between 30 and 40?

10%

If the mean is 75 and two observations have values of 65 and 85, what is the squared deviation of each?

100
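Both observations lie 10 units from the mean, one below and one above, so squaring removes the sign and gives the same result for each:

```python
mean_value = 75
observations = [65, 85]
squared_devs = [(x - mean_value) ** 2 for x in observations]
print(squared_devs)  # [100, 100]
```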

The histogram below represents scores achieved by 250 job applicants on a personality profile.

100

The histogram below represents scores achieved by 250 job applicants on a personality profile. Seventy percent of the job applicants scored above what value?

20

Expressed in percentiles, the interquartile range is the difference between the

25th and 75th percentiles

The histogram below represents scores achieved by 250 job applicants on a personality profile. Half of the job applicants scored below what value?

30

A sample of 20 observations has a standard deviation of 4. The sum of the squared deviations from the sample mean is

304
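The result uses the identity that the sum of squared deviations equals (n - 1) times the squared sample standard deviation. The sketch below first confirms the identity on a small hypothetical data set, then applies it to n = 20 and s = 4.

```python
import math
from statistics import mean, stdev

data = [2, 4, 4, 6, 9]  # hypothetical values, used only to check the identity
m = mean(data)
ss_direct = sum((x - m) ** 2 for x in data)     # computed from the definition
ss_from_s = (len(data) - 1) * stdev(data) ** 2  # computed from the sample std dev
print(math.isclose(ss_direct, ss_from_s))       # True

print((20 - 1) * 4 ** 2)  # 304
```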

Below you will find summary measures on starting salaries for classroom teachers across the United States. You will also find a list of selected states and their average starting teacher salary. All values are in thousands of dollars. What salary amount represents the second quartile?

35,000 (median)

The histogram below represents scores achieved by 250 job applicants on a personality profile. What percentage of the job applicants scored below 60?

90%

If a value represents the 95th percentile, this means that

95% of all values are below this value

A manager for Marko Manufacturing, Inc. has recently been hearing some complaints that women are being paid less than men for the same type of work in one of their manufacturing plants. The box plots shown below represent the annual salaries for all salaried workers in that facility (40 men and 34 women). How large must a person's salary be to qualify as an outlier on the high side? How many outliers are there in these data?

A person's salary should be somewhere above $70,000. There is one male salary that would be considered an outlier (at approximately $80,000).

Below you will find summary measures on starting salaries for classroom teachers across the United States. You will also find a list of selected states and their average starting teacher salary. All values are in thousands of dollars. Which of the states listed paid their teachers average salaries that are below 75% of all average salaries?

AL, CO, NE, NV, NH, NM, SC, SD, TN, TX, UT, VT, VA, & WY

Below you will find summary measures on starting salaries for classroom teachers across the United States. You will also find a list of selected states and their average starting teacher salary. All values are in thousands of dollars. Which of the states listed paid their teachers average salaries that exceed at least 75% of all average salaries?

CT, DE, NJ

Gender and State are examples of which type of data?

Categorical data

In an effort to provide more consistent customer service, the manager of a local fast-food restaurant would like to know the dispersion of customer service times in relation to their average value for the facility's drive-up window. The table below provides summary measures for the customer service times (in minutes) for a sample of 50 customers collected over the past week. Are the empirical rules applicable in this case? If so, apply them and interpret your results. If not, explain why the empirical rules are not applicable here.

Considering that this distribution is only very slightly skewed to the left, it is acceptable to apply the empirical rules as follows:
Approximately 68% of the customer service times will fall within 0.873 ± 0.432, that is, between 0.441 and 1.305 minutes.
Approximately 95% of the customer service times will fall within 0.873 ± 2(0.432), that is, between 0.009 and 1.737 minutes.
Approximately 99.7% of the customer service times will fall within 0.873 ± 3(0.432), that is, between 0 and 2.169 minutes (the lower end is set to zero because service times cannot assume negative values).
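The three intervals can be reproduced from the mean (0.873) and standard deviation (0.432) quoted in the answer. This is a minimal sketch with illustrative variable names; the lower bound is clipped at zero, as the answer does.

```python
mean_time, sd_time = 0.873, 0.432  # summary measures quoted in the answer

intervals = []
for k, coverage in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    low = max(0.0, mean_time - k * sd_time)  # service times cannot be negative
    high = mean_time + k * sd_time
    intervals.append((low, high))
    print(f"about {coverage}: {low:.3f} to {high:.3f} minutes")
# about 68%: 0.441 to 1.305 minutes
# about 95%: 0.009 to 1.737 minutes
# about 99.7%: 0.000 to 2.169 minutes
```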

Because they represent such extreme values, outliers should be eliminated from statistical analyses.

False

Categorical variables can be classified as either discrete or continuous.

False

In an extremely right-skewed distribution, the mean is much smaller than the median.

False

Mean absolute deviation (MAD) is the average of the squared deviations.

False
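The statement is false because MAD averages the absolute deviations, not the squared ones. The sketch below compares the two on a hypothetical data set. Averaging the squared deviations over n, as here, gives the population variance; the sample variance would divide by n - 1.

```python
data = [1, 3, 5, 7, 9]  # hypothetical values
m = sum(data) / len(data)  # 5.0

mad = sum(abs(x - m) for x in data) / len(data)          # mean absolute deviation
avg_squared = sum((x - m) ** 2 for x in data) / len(data)  # average squared deviation

print(mad)          # 2.4
print(avg_squared)  # 8.0
```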

Suppose that a sample of 8 observations has a standard deviation of 2.50, then the sum of the squared deviations from the sample mean is 17.50.

False

The median is one of the most frequently used measures of variability.

False

A statistics professor has just given a final examination in his statistical inference course. He is particularly interested in learning how his class of 40 students performed on this exam. The scores are shown below. What are the mean and median scores on this exam?

Mean = 80.40, median = 79.50

A financial analyst collected useful information for 30 employees at Gamma Technologies, Inc. These data include each selected employee's gender, age, number of years of relevant work experience prior to employment at Gamma, number of years of employment at Gamma, number of years of post-secondary education, and annual salary. Based on the histogram shown below, how would you describe the age distribution for these data?

The age distribution is slightly skewed to the right. The largest grouping is in the 30-40 range. This means that most workers are above the age of 30 years.

A financial analyst collected useful information for 30 employees at Gamma Technologies, Inc. These data include each selected employee's gender, age, number of years of relevant work experience prior to employment at Gamma, number of years of employment at Gamma, number of years of post-secondary education, and annual salary. Based on the histogram shown below, how would you describe the salary distribution for these data?

The salary distribution is skewed to the right. There appear to be several workers who are being paid substantially more than the others. If you eliminate those above $80,000, the salaries are fairly normally distributed around $35,000.

In an effort to provide more consistent customer service, the manager of a local fast-food restaurant would like to know the dispersion of customer service times in relation to their average value for the facility's drive-up window. The table below provides summary measures for the customer service times (in minutes) for a sample of 50 customers collected over the past week. Interpret the variance and standard deviation of this sample.

The variance is 0.187 (in squared minutes) and the standard deviation, its square root, is 0.432 minutes.

A statistics professor has just given a final examination in his statistical inference course. He is particularly interested in learning how his class of 40 students performed on this exam. The scores are shown below. Explain why the mean and median are different.

There are a few higher exam scores that tend to pull the mean away from the middle of the distribution.

A question of great interest to economists is how the distribution of family income has changed in the United States during the last 20 years. The summary measures and histograms shown below are generated for a sample of 500 family incomes, using the 1985 and 2005 income for each family in the sample. Based on these results, discuss as completely as possible how the distribution of family income in the United States changed from 1985 to 2005.

These summary measures say quite a lot. The mean has increased for 2005 when compared with 1985, although the median has decreased. There is also more variation. In fact, the 5th percentile has decreased slightly for 2005 when compared with 1985, whereas the 95th percentile is much larger, indicating that the rich are getting richer. This behavior is also evident in the two histograms, which use the same categories for ease of comparison.

A manager for Marko Manufacturing, Inc. has recently been hearing some complaints that women are being paid less than men for the same type of work in one of their manufacturing plants. The box plots shown below represent the annual salaries for all salaried workers in that facility (40 men and 34 women). What can you say about the shape of the distributions given the accompanying box plots?

They both appear to be slightly skewed to the right (both have a mean > median). The total variation seems to be close for both distributions (with one outlier for the male salaries), but there seems to be more variation in the middle 50% for the women than for the men. The men's salaries seem to be clustered more closely around the mean than the women's.

A distribution of a numerical variable with no skewness is said to be symmetric.

True

A frequency table indicates how many observations fall within each category, and a histogram is its graphical analog.

True

A histogram is based on binning the variable, which means putting the variable into discrete categories.

True

A population includes all elements or objects of interest in a study, whereas a sample is a subset of the population used to gain insights into the characteristics of the population.

True

A variable (or field or attribute) is a characteristic of members of a population, whereas an observation (or case or record) is a list of all variable values for a single member of a population.

True

Abby has been keeping track of what she spends to rent movies. The last seven weeks' expenditures, in dollars, were 6, 4, 8, 9, 6, 12, and 4. The mean amount Abby spends on renting movies is $7.

True

As a graphical tool, the histogram is ideal for showing whether the distribution of a numerical variable is symmetric or skewed.

True

Assume that the histogram of a data set is symmetric and bell shaped, with a mean of 75 and standard deviation of 10. Then, approximately 95% of the data values were between 55 and 95.

True
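The interval 55 to 95 is the mean plus or minus two standard deviations (75 ± 2 × 10). As a rough check, assuming the bell shape is modeled by an exact normal distribution (an idealization, not something the card states), the share of values in that interval is close to 95%:

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=10)  # idealized bell-shaped model
share = scores.cdf(95) - scores.cdf(55)
print(round(share, 4))  # 0.9545
```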

Both ordinal and nominal variables are categorical.

True

Cross-sectional data are data on a population at a distinct point in time, whereas time series data are data collected over time.

True

Data can be categorized as cross-sectional or time series.

True

The mean is a measure of central tendency

True

The median of a data set with 30 values would be the average of the 15th and the 16th values when the data values are arranged in ascending order.

True
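With an even number of values there is no single middle observation, so the two central ones are averaged. A short sketch using a hypothetical list of 30 sorted values:

```python
from statistics import median

values = list(range(1, 31))  # 30 sorted values: 1, 2, ..., 30 (hypothetical)
middle_pair = (values[14], values[15])  # 15th and 16th values (0-based indices 14 and 15)

print(middle_pair)           # (15, 16)
print(sum(middle_pair) / 2)  # 15.5
print(median(values))        # 15.5
```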

The number of car insurance policy holders is an example of a discrete numerical variable

True

Below you will find summary measures on starting salaries for classroom teachers across the United States. You will also find a list of selected states and their average starting teacher salary. All values are in thousands of dollars. How would you describe the salary of Virginia's teachers compared to those across the entire United States? Justify your answer.

Virginia's teacher salary = $35,000, which is also the median. Virginia is at the 50th percentile, meaning that 50% of the teachers' salaries across the U.S. are below the Virginia teacher salary and 50% of the salaries are above.

A manager for Marko Manufacturing, Inc. has recently been hearing some complaints that women are being paid less than men for the same type of work in one of their manufacturing plants. The box plots shown below represent the annual salaries for all salaried workers in that facility (40 men and 34 women). Would you conclude that there is a difference between the salaries of women and men in this plant? Justify your answer.

Yes. The men seem to have higher salaries than the women do in many cases. We can see from the box plots that the mean and median values for the men are both higher than for the women. You can also see from the box plots that the middle 50% of salaries for men is above the median for women. This means that if you were in the 25th percentile for men, you would be above the 50th percentile for women. You can also see that the mean and median salaries for the men are about $10,000 above those for the women.

A sample of a population taken at one particular point in time is categorized as

cross-sectional

Data that arise from counts are called

discrete data

Coding males as 1 and females as 0 in a data set illustrates the use of

dummy variable
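A minimal sketch of the 0/1 coding described in the question, using a hypothetical list of gender labels. One convenient property of a dummy variable is that its mean is the share of observations coded 1.

```python
genders = ["male", "female", "female", "male"]  # hypothetical sample
is_male = [1 if g == "male" else 0 for g in genders]

print(is_male)                      # [1, 0, 0, 1]
print(sum(is_male) / len(is_male))  # 0.5
```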

The difference between the first and third quartile is called the

interquartile range

The length of the box in the box plot portrays the

interquartile range

In a generic box plot, the vertical line inside the box indicates the location of the

median

The interquartile range (IQR) represents what percent of the observations?

middle 50%

The median can also be described as the

middle observation when the data values are arranged in ascending order

The mode is best described as the

most frequently occurring value

How is the median defined if the number of observations is even?

the average of the two middle observations

A variable is classified as ordinal if

there is a natural ordering of categories

The daily closing values of the Dow Jones Industrial Average are examples of

time-series data

