AP Statistics Semester 1 Quiz/Checkpoint Questions


The newspaper uses a line graph to show the performance of stocks over the last month. This is an example of:

descriptive statistics (The data were gathered and organized into a graph, which is an example of descriptive statistics.)

Since the distribution of housing prices in a community is usually skewed right, which measure of center should you use for housing prices?

median

What is the only measure of center that can be used with categorical, or non-numeric, data?

mode

A class has the following distribution of eye colors: 10 blue, 18 brown, 5 green. Which measure of central tendency should you use to find the eye color of the typical class member? What do you get when you use this measure?

mode; brown
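
As a quick check, the mode of a categorical variable can be found just by counting. The sketch below uses Python's standard `collections.Counter`; the variable names are illustrative and not part of the original quiz.

```python
from collections import Counter

# Eye-color counts taken from the question above.
eye_colors = Counter({"blue": 10, "brown": 18, "green": 5})

# most_common(1) returns a one-item list holding the (category, count)
# pair with the highest count, which is the mode.
modal_color, modal_count = eye_colors.most_common(1)[0]
print(modal_color, modal_count)
```

No mean or median is computed here because eye color has no numeric order to average or sort.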

A standard deviation calculated from data from an entire population would be called a(n):

parameter

We call any numerical fact about a population a:

parameter

The entire group we're interested in is called a:

population

All the following statements about the sample standard deviation are true, except:

the standard deviation is negative when there are extreme values in the sample. (A standard deviation can never be negative.)

If you list and graph the dates of coins in people's pockets and purses you'd probably find that the graph's distribution is skewed left, because more recent dates are more common. If you want to express the average date of a coin you'd use:

the median

For this question, refer to the histogram you created for this Self Check. (The data you need for creating this histogram is at the end of your Study Guide.) Change your histogram so that $5,000 is added to the Xmin and Xmax: change Xmin to $45,000, change Xmax to $185,000, and keep Xscl at $10,000. You now have different classes that have no frequencies. What are their class limits?

$105,000 ≤ x < $115,000; $135,000 ≤ x < $145,000; $145,000 ≤ x < $155,000; $165,000 ≤ x < $175,000

When your class participates in an Internet game show and counts the votes for door #1 and door #2, the counts are examples of what kind of data?

Counted numerical

Which of the following represents a plot or graph of the cumulative counts across each of the intervals or midpoints?

Cumulative frequency plot (This indicates the cumulative count of observations across each of the intervals.)

Which of the following would most likely be graphed as a bar chart rather than a histogram? - the number of students that use Windows laptops vs. Macintosh ones - the number of cars in each color in a parking lot - (a third option, not recorded here, that was also clearly categorical)

all of the above (Each of these is an example of categorical data. They're counts of the members of a category rather than measured values of a numeric variable.)

What is the term for the width of a building in a histogram?

class interval

Most statisticians use statistics instead of parameters because:

data from an entire population is almost always very difficult to obtain.

A basketball fan thinks the large salaries of NBA players will force the NBA to raise ticket prices. Here's how she came to this conclusion: since salaries are part of the total NBA operating costs, she used the NBA average salary to infer the total. Which measure of center did she use?

the mean

For a mound-shaped and symmetrical distribution, what measure of center and measure of variation should be used?

the mean and the standard deviation

Why is the interquartile range (IQR) considered to be a resistant statistic?

Adding a new extreme observation has little effect on it. (The IQR is based on the quartiles, which, like the median, are resistant. That is, a new extreme value added to the data set will have a much larger effect on the mean than on the median or the quartiles.)

Which of the following is a sample of the population of all high school students? (HINT: A sample is not always a simple random sample.) - High school students taking chemistry - Your math class - All high school students in your state - High school students in Cook County, Illinois - All of the above

All of the above (Each of these is a subset of the population, which is all high school students. Any subset of a population is a sample of that population, though it may not be random.)

Your class is participating in an Internet game show and must choose whether a prize is behind door #1 or door #2. You take a vote. The numbers for the doors are an example of which kind of data?

Categorical

Which phase of inferential statistics is sometimes considered to be the most crucial because errors in this phase are the most difficult to correct?

Data gathering (Data gathering is often considered the most critical phase of inferential statistics. It's crucial to have an unbiased and representative sample for a statistical study. It's also usually the most time-consuming phase.)

A family's phone number qualifies as what kinds of data?

Discrete and categorical

Mr. Thompson wants to curve students' exam scores based on the highest score in the class. He takes the highest score (which happens to be an outlier) and treats it as the perfect score. He then computes everyone else's score as a percentage of this perfect score. You're smart and complain that his method is not resistant. What would be a more resistant method of grading these exams?

Grading scores relative to the median score

What is the term for organizing and summarizing data without a particular question in mind?

Exploratory data analysis (In the term exploratory data analysis the word exploratory implies that researchers are looking at the data but not expecting to find a particular pattern.)

Which of the following indicates how many times every value in a distribution appears?

Frequency table (The phrase how many times implies the count for each data value, which is best shown through a frequency table.)

Besides phone numbers, what are the other categorical variables in your survey? (The survey asks for family size, the kind of pets they have, the grade of the youngest child in the family, the family's annual income in dollars, what the dad does for a living, whether the mom works, and their phone number.)

Grade of youngest child, dad's occupation, whether mom works, and kind of pets

The blood pressure reading for any age group is normally distributed. The normal distribution is symmetric and mound-shaped. Therefore, the correct measure of central tendency to use for blood pressure is:

the mean.

For a skewed distribution, what measure of center and measure of variation should be used?

the median and the IQR (When the data are numeric and the distribution is skewed either to the right or the left, the median is usually the best choice for the measure of central tendency. This is because the mean is too heavily influenced by extreme observations that fall on only one side of a skewed distribution.) & (The IQR gives the values of the observations at the 25th and 75th percentiles (or the average of the two values closest to these percentiles if there is not a single value), Q1 and Q3. These values are less influenced by outliers than the standard deviation and so are best used with skewed data.)

If you have a data set of 40 whole numbers, which of the following could be true about the five-number summary?

The upper quartile does not have to be a whole number.

In a sample's distribution of income, the modal income (mode, or most frequently occurring observation of income) is $27,000 a year, the median income is $35,000 a year, and the mean income is $45,000 a year. Which statistic do you think is the best estimate of average income, and would you say that income is normal, skewed to the left, or skewed to the right?

$35,000, skewed right (The median value, $35,000, is the best average to use in a skewed distribution. You can tell the distribution is skewed right because the order of the statistics from left to right is mode, median, and mean.)

Consider a complete table of relative frequencies. The sum of the relative frequency column in such a table must be:

1

How many degrees of freedom does a sample of 12 have when you calculate a standard deviation?

11 (n - 1, n = 12, 12 - 1 = 11)

In a distribution with many values, which of the percentiles is also known as Q1?

25th percentile

The midpoint of the interval whose boundaries are 27.5 and 38.5 is:

33

Estimate, to the nearest whole number, the sample standard deviation of this data set, which is a sample from a larger population: {71, 75, 65, 73, 69, 77, and 67}. The mean of the data in this sample is 71.

4
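
The estimate can be checked by hand. The following is a minimal sketch of the sample standard deviation formula, dividing by the degrees of freedom n - 1; the variable names are illustrative.

```python
import math

sample = [71, 75, 65, 73, 69, 77, 67]
n = len(sample)
mean = sum(sample) / n                          # 71.0, as given in the question

# Sum of squared deviations from the mean.
sum_sq = sum((x - mean) ** 2 for x in sample)   # 112.0

# A sample standard deviation divides by the degrees of freedom, n - 1 = 6.
s = math.sqrt(sum_sq / (n - 1))
print(round(s, 2), round(s))
```

The result is about 4.32, which rounds to the answer of 4.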

Consider these eight observations: {11, 6, 2, 5, 8, 4, 4, 9}. What is the median?

5.5

In a distribution with many values, which of the following percentiles is equal to the median?

50th percentile

Look at the data set below. How many possible values are there for this variable? What kind of variable is it? red, yellow, blue, yellow, white, red, green, blue, green, green, blue, yellow, red, white, yellow, orange

6, categorical

Consider these eight observations: {11, 6, 2, 5, 8, 4, 4, 9}. What is the mean?

6.125
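
Both the mean here and the median asked about earlier for the same eight observations can be verified with a few lines. This is a sketch with illustrative variable names; `statistics.median` from the standard library is used only as a cross-check.

```python
import statistics

observations = [11, 6, 2, 5, 8, 4, 4, 9]

ordered = sorted(observations)      # [2, 4, 4, 5, 6, 8, 9, 11]
mid = len(ordered) // 2

# With an even number of observations, the median is the average
# of the two middle values in sorted order.
median = (ordered[mid - 1] + ordered[mid]) / 2
mean = sum(observations) / len(observations)

# Cross-check the hand-rolled median against the standard library.
assert median == statistics.median(observations)
print(median, mean)
```

The mean (6.125) sits above the median (5.5) because the largest value, 11, pulls the mean upward.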

Which of the following is a sample of the population of bookstores on the West Coast of the United States? I. All bookstores in California II. Randomly selected children's bookstores III. Internet bookstores

I only (Only bookstores in California count as a sample of bookstores on the West Coast of the U.S. The others may overlap with the population of West Coast bookstores, but these groups probably have some bookstores that aren't on the West Coast.)

Each of the following data sets has a mean of 40. I {38, 43, 47, 27, and 45} II {41, 40, 39, 42, and 38} III {59, 41, 53, 17, and 30} Estimate their population standard deviations (represented by sigma: σ) and list them from smallest to largest according to standard deviation size.

II (1.41), I (7.16), III (15.23)
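
The three values can be reproduced with the population formula, which divides by N rather than n - 1. The sketch below relies on `statistics.pstdev` from the Python standard library; the dictionary layout is just one convenient way to hold the three sets.

```python
import statistics

data_sets = {
    "I": [38, 43, 47, 27, 45],
    "II": [41, 40, 39, 42, 38],
    "III": [59, 41, 53, 17, 30],
}

# pstdev uses the population formula (divide by N),
# unlike stdev, which divides by n - 1.
sigmas = {name: statistics.pstdev(values) for name, values in data_sets.items()}

# Order the set labels from smallest to largest standard deviation.
ranking = sorted(sigmas, key=sigmas.get)
for name in ranking:
    print(name, round(sigmas[name], 2))
```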

Which of the following would be an example of the use of inferential statistics? I. You have your entire class's math grades, and you calculate the average math grade for your class. II. You have your entire class's math grades and you use the grades to find the average math grade for everyone in your school who's taken the same math course. III. You have your entire class's math grades, and you use these grades to estimate the average math grade for the same math course at another school.

II and III

Which of the following statements is true for numerical data?

It can be measured.

What measure of central tendency do you use with standard deviation?

Mean

Which one of the following activities is not an example of data gathering?

Reaching a conclusion about the results of a reading program

A scientist is testing certain sampling strategies to see which one is best. She's gathered data from an entire population and calculates a population mean µ (mu) = 14 and a population standard deviation σ (sigma) = 5. She draws five different samples from this population using five different strategies and gets the following sample means and standard deviations: Sample 1: (x-bar) = 16.9, s = 6; Sample 2: (x-bar) = 14.5, s = 4.7; Sample 3: (x-bar) = 10.5, s = 3.3; Sample 4: (x-bar) = 17, s = 4.9; Sample 5: (x-bar) = 14.1, s = 8.4. Which sampling strategy does she conclude provides the best sample?

Sample 2 (The statistics from Sample 2 are the best estimates of both population parameters, indicating that this is probably the best sampling strategy.)

Which of these are categorical data?

The different types of anteaters. (other options were weight, length, etc) (The different types of anteaters cannot be expressed as a number, so this is an example of qualitative data.)

What can you do with a calculator- or computer-generated histogram that you can't do with a hand-drawn stem-and-leaf plot?

You can change the class interval.

You're measuring the weights of a group of dogs. You get a mean weight of x̄ (x-bar) = 48.8 pounds and a standard deviation of s = 17.5 pounds. The dogs are:

a sample. (The symbols used for the mean and standard deviation are symbols for statistics, not parameters. This is a clue that the group of dogs is a sample from some population of interest.)

True or False: A randomly selected sample is made up of any group of population members that's easy to find.

false

The kind of sampling strategy least likely to produce statistics that are good estimates of population parameters is a:

haphazard sample

What is the maximum length of a whisker in a modified box plot where the median = 120, Q1 = 100, Q3 = 150, the minimum = 20, and maximum = 270?

75
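
The answer follows from the 1.5 × IQR rule used for modified box plots. Here is a short sketch of the arithmetic, with illustrative variable names:

```python
q1, q3 = 100, 150
minimum, maximum = 20, 270

iqr = q3 - q1                    # 50
max_whisker = 1.5 * iqr          # longest a whisker can be
lower_fence = q1 - max_whisker   # 25.0
upper_fence = q3 + max_whisker   # 225.0

# Values beyond the fences are plotted as individual outlier points,
# so the whiskers stop at or inside the fences.
min_is_outlier = minimum < lower_fence
max_is_outlier = maximum > upper_fence
print(max_whisker, min_is_outlier, max_is_outlier)
```

In this data set both the minimum (20) and the maximum (270) fall outside the fences, so neither whisker reaches them.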

If we wanted to gather a sample representing all residents in a town, what is the problem with drawing a simple random sample from the local phone book?

A phone book isn't a complete listing of all population members.

Which measure of central tendency and which measure of variation should be used with a heavily skewed distribution?

The median and inter-quartile range (The median and inter-quartile range are used because they're less likely to be influenced by outliers in a skewed distribution.)

A hockey team has completed 35 games. The team's median goals per game is 2. Which of the following must be true about the team's goal total so far?

The median doesn't allow us to infer the exact goal total, AND it is at least 36.
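
The lower bound of 36 can be seen by building the smallest goal total that still has a median of 2. The sketch below constructs that extreme case; the list is a hypothetical season, not real data.

```python
import statistics

games = 35
median_goals = 2

# With 35 games, the median is the 18th value in sorted order, so at
# least 18 games must have 2 or more goals. The smallest possible total
# puts 0 goals in the other 17 games and exactly 2 goals in those 18.
games_at_or_above_median = games // 2 + 1      # 18
lowest_case = ([0] * (games - games_at_or_above_median)
               + [median_goals] * games_at_or_above_median)

assert statistics.median(lowest_case) == median_goals
min_total = sum(lowest_case)
print(min_total)
```

Any other season with a median of 2 has a total at least this large, but the median alone does not say how much larger.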

A histogram class is a collection of all the observations that fall between two:

class limits

True or False: Degrees of freedom are used to calculate both the population standard deviation and sample standard deviation formulas.

false

True or False: The sample size you need to estimate the population distribution should always be at least 10% of the population size.

false (The sample size you need isn't dependent on population size. A sample of 1,000 to 1,500 observations is usually enough to give a reliable estimate of the distribution of a variable in the population, no matter how big the population is. With as few as 50 observations you can start to get a general idea of the shape, the mean, and the standard deviation.)

True or False: When you decrease the Xscl value, you decrease the class interval. By doing this, you increase the number of classes or "buildings."

true

True or False: If the total area of all the bars in a histogram is 1, the area of each bar is proportional to the total number of data values.

true (Think about each bar as representing a proportion of the total area. A histogram can be thought of as an "area-picture" of a frequency table; the areas of the bars represent the frequencies, and a large area indicates a large frequency.)

The following hypothetical data set shows the purchase prices (in thousands) for a sample of 3-bedroom, 2-bathroom homes in Essex County, MA, over the past year. Compute the five-number summary and create a modified box-and-whisker plot. How many outliers are present in this distribution? 250, 254, 320, 342, 221, 235, 210, 426, 210, 298, 231, 254, 278, 234, 236, 235, 300, 401, 129, 234, 235, 235, 245

In this distribution Q1 = 234, Q3 = 298, IQR = 64, and 1.5 × IQR = 96. Therefore, the threshold values for outliers are 138 and 394. You can see that three houses (129, 401, and 426) fall outside the threshold values.
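
These figures can be reproduced in a few lines. The sketch below assumes the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the overall median when the count is odd; other quartile conventions can give slightly different values.

```python
import statistics

prices = [250, 254, 320, 342, 221, 235, 210, 426, 210, 298, 231, 254,
          278, 234, 236, 235, 300, 401, 129, 234, 235, 235, 245]

ordered = sorted(prices)
n = len(ordered)

# Assumed convention: quartiles are medians of the lower and upper
# halves, excluding the overall median when n is odd.
median = statistics.median(ordered)
q1 = statistics.median(ordered[: n // 2])
q3 = statistics.median(ordered[(n + 1) // 2:])

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in ordered if x < lower_fence or x > upper_fence]

five_number = (ordered[0], q1, median, q3, ordered[-1])
print(five_number, outliers)
```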

From left to right (smallest to largest), what is the order of the different measures of central tendency in a negatively (left) skewed distribution?

Mean, median, mode (The mode is the peak of the curve, left of that is the median, which is the middle value of the distribution, and furthest over on the left is the mean.)

Every ten years the United States takes a census, which is a survey of every person in the country. If you took the census data that told you the number of people in the United States, and if from all of those numbers you calculated the mean age, what symbols would you use to represent these numeric facts?

N and µ. (A census counts every member of a given population (though in practice it isn't always successful at reaching everyone). The symbol for the parameter population size is N, and the symbol for the parameter population mean is µ (mu). n and x-bar are the symbols for the sample statistics.)

You want to know something about your neighbors, so you give them a survey. The survey collects the following data about each family on your block: family size, the kind of pets they have, the grade of the youngest child in the family, the family's annual income in dollars, what the dad does for a living, whether the mom works, and their phone number. Each kind of data you collect about a family is a variable. Which of the variables you collect are continuous data?

Only annual income

Which of the following combinations of data types is not possible? - discrete and categorical - continuous and categorical - discrete and numeric - continuous and numeric - all combinations of data types are possible

continuous and categorical (this combination is not possible)

Let's say that a researcher administers a new type of vitamin supplement to a sample of 30 rats. Thirty other rats didn't receive the supplement. Later, he compares the weights of the supplement group with the non-supplement control group. In this case the rats' weights are an example of which type of data?

continuous data (Weights are measured, so the data are continuous.)

Inferential statistics is used in each of the following except:

creating a pictograph of the number of people struck by lightning each year. (In creating a pictograph, you haven't attempted to predict or compare anything. You've used descriptive statistics.)

True or False: In a histogram, a single building or class contains all the values of the data set.

false

True or False: We usually don't have to sample because we can always gather data from every population member.

false

True or False: You perform a study to see how long the grass in your yard will live if you don't water it all summer. You conclude that no one's yard can live for more than three weeks without water. This is an example of descriptive statistics.

false

True or False: If the population of interest is all day care centers in the United States, a sample of day care centers could be either all day care centers in New York City or a randomly selected group of day care centers throughout the United States. Either sample is equally good.

false (A simple random sample of day care centers in the U.S., rather than a sample that comes from only one city, is more likely to produce statistics that accurately estimate the population parameters you're interested in. This is because in an SRS each population member is equally likely to be chosen, regardless of its characteristics.)

True or False: If your sample is made up of power tools randomly selected from one hardware store, your population of interest is all power tools sold in hardware stores.

false (If you randomly select the tools, but only from one hardware store, the relevant population is all power tools in that hardware store alone. You can't assume that all hardware stores carry the same kinds of power tools.)

True or False: A population contains 60% women and 40% men. To reflect the population group, we make sure that our sample also contains 60% women and 40% men. This is an example of a simple random sample.

false (In a simple random sample, any possible combination of people must be equally likely. This statement describes a sample in which there are restrictions.)

True or False: A large randomly selected sample always gives a better estimate of the population than a small randomly selected sample.

false (Remember though, smaller samples can still work very well if the sample is representative of the population. It's even possible for a very large simple random sample to give a less accurate estimate than a smaller simple random sample, since the sample members are drawn randomly and you never know exactly what you'll get.)

True or False: For large populations, 1,000 is the best sample size.

false (The best size for your sample depends on many factors, including the shape of the distribution and acceptable margin of error in your study. Sometimes you may need a sample size of fewer than 1,000, and sometimes you may need a size greater than 1,000.)

True or False: The mean and standard deviation are usually not used together because of outliers.

false (The mean and standard deviation should be used together since the standard deviation measures deviation from the mean. Keep in mind, however, that the mean is sensitive to the effect of outliers.)

True or False: You want to know how often residents of cold climates vacation in warm destinations. You randomly sample 50 residents and find out how many annual trips to warm destinations they've taken during their adult lives. This is an example of discrete data.

true (The number of trips is counted and therefore discrete.)

Based on the data in the table below, what is the smallest number of births (in thousands) a month could possibly have and still be an upper outlier?

358 (The calculated upper threshold is 357.7, but for a data set like births the outlier cutoffs apparently need to be whole numbers, so 358 is the expected answer.)

You claim that you're healthier than your friends. To support your claim, you randomly select some of your friends and track their meals for a month. You also track your meals during the same month. What you are doing is:

inferential statistics (This is an example of inferential statistics. The data you collected is a sample used to infer whether you're healthier than your friends.)

Histograms are most useful in displaying:

large numeric data sets (Histograms are used to display frequencies across intervals of numeric data.)

In inferential statistics, variation is an essential measurement for:

making predictions.

n (for the size of the group), x̄ (for the mean), and s (for the standard deviation) are all measures calculated from what group?

sample

A synonym for variation is:

spread. (While distance is a component of calculating variation, it's not a synonym.)

You've drawn a simple random sample from a population. The standard deviation of this sample is a(n):

statistic

Six radio listeners are surveyed. Their favorite FM stations are: 89.1, 89.1, 89.1, 94.7, 94.7, and 104.3. Based on these data, you want to name the favorite station of a typical listener. You should name:

the mode, which is 89.1

You have a distribution summarizing the number of days in the past two months (60 days) that an individual watched TV. The median number of days = 25, the lower quartile = 21, and upper quartile = 42. Given this information, which of the following statements are true?

there are no outliers in this distribution
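
The claim can be checked with the 1.5 × IQR rule. The sketch below assumes that rule is the intended outlier test and uses the fact that the variable can only take values from 0 to 60 days.

```python
q1, q3 = 21, 42
iqr = q3 - q1                      # 21
lower_fence = q1 - 1.5 * iqr       # -10.5
upper_fence = q3 + 1.5 * iqr       # 73.5

# The variable counts days out of 60, so every value lies in 0..60.
possible_values = range(0, 61)
possible_outliers = [d for d in possible_values
                     if d < lower_fence or d > upper_fence]
print(lower_fence, upper_fence, possible_outliers)
```

Because both fences lie outside the 0 to 60 range, no observation in this distribution can be an outlier.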

True or False: In a survey of your neighbors (asking for family size, the kind of pets they have, the grade of the youngest child in the family, the family's annual income in dollars, what the dad does for a living, whether the mom works, and their phone number), the only discrete, numerical data you're collecting about your neighbors is family size.

true

True or False: A simple random sample is not just a sample where every population member has an equal chance of being drawn.

true (A simple random sample also has the requirement that all possible samples are equally likely, meaning that every possible combination of population members has the same chance of occurring.)

True or False: For a symmetric, mound-shaped distribution, the mean, median, and mode are all the same.

true (A symmetric, mound-shaped distribution will have its mean as the most common value. Half of the values will be above the mean and half will be below it.)

True or False: Your population of interest is whatever you decide it is. A population can be anything as long as it's defined as a population.

true (If you're interested in trees in general, but elm trees in particular, especially those close to where you live, you could say that your population is not trees in general, but elm trees in the park next to your house. Then an appropriate sample would be a sample selected from the elm trees in the park.)

True or False: It's possible to determine the frequencies (counts) within each interval from a cumulative frequency plot.

true (In a cumulative frequency table, the difference between successive entries in a column is equal to the frequency of the lower entry. All frequencies can be "recaptured" by computing all such differences.)
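
The differencing step can be sketched as follows. The cumulative counts are hypothetical values chosen only for illustration; they do not come from the quiz.

```python
from itertools import accumulate

# Hypothetical cumulative counts for four consecutive intervals.
cumulative = [3, 10, 18, 20]

# Each interval's frequency is its cumulative count minus the previous
# cumulative count; the first interval is compared against 0.
previous = [0] + cumulative[:-1]
frequencies = [curr - prev for prev, curr in zip(previous, cumulative)]

# Summing the recovered frequencies back up reproduces the cumulative column.
assert list(accumulate(frequencies)) == cumulative
print(frequencies)
```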

True or False: The shape and standard deviation of a population distribution of a variable (such as income) can be estimated with a distribution of a sample of sufficient size.

true (Just as sample statistics are used to estimate population parameters, distributions of samples can be used to estimate the shapes and standard deviations of population distributions.)

You're using a ruler to measure lengths of stick-bugs. The ruler is marked for every centimeter. You end up taking all lengths to the nearest centimeter. (In other words, you can have 5 centimeters and 6 centimeters, but not 5.5 centimeters.) True or False: These measured data can be called discrete or continuous, depending on how you think of the data.

true (Sometimes the definitions get fuzzy. Measured data are usually thought of as continuous, but in this case the measured data are also discrete because they can only have certain values that are whole numbers. But you can also think of the data as rounded estimates of the true lengths, so in a sense the data are specific points along a continuous number line. Most people would call these lengths continuous data, since they're more measurements than counts.)

True or False: One reason to use a sample to estimate the shape of a population distribution is to determine which statistics are appropriate to use for that variable, since different statistics have different characteristics.

true (There are several different statistics you could use to measure central tendency, but different ones work better with different shapes of distributions.)

