E370 Exam 1
primary data
data that you have collected for your own use
correlation coefficient range
-1 to +1
mean > median
right skewed
Suppose the dealer incentive per vehicle for Honda's Acura brand is thought to be bell-shaped and symmetrical with a mean of $2,500 and a standard deviation of $300. Based on this information, what interval of dealer incentives would we expect approximately 99.7% of vehicles to fall within? $1,600 to $3,400 $1,900 to $3,100 $2,200 to $2,800 $1,300 to $3,700
$1,600 to $3,400
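A minimal Python sketch of the empirical-rule arithmetic behind this answer, assuming the stated mean and standard deviation. The variable names are illustrative only.

```python
# Empirical rule: for a bell-shaped, symmetrical distribution,
# about 99.7% of values fall within 3 standard deviations of the mean.
mean = 2500   # mean dealer incentive in dollars (from the question)
sd = 300      # standard deviation in dollars (from the question)
k = 3         # number of standard deviations for ~99.7%

lower = mean - k * sd
upper = mean + k * sd

print(f"${lower:,} to ${upper:,}")  # $1,600 to $3,400
```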
bar charts
- can be arranged in a vertical or horizontal orientation - on one axis (usually horizontal), we specify the labels that are used for each of the classes - a frequency or relative frequency scale can be used for the other axis (usually vertical) - using a bar of fixed width drawn above each class label, we extend the height appropriately
disadvantages of primary data
- can be expensive and time consuming to gather
advantages of primary data
- collected by the person or organization who uses the data
What type of relationship is indicated in the scatterplot? It is hard to tell because the variables are not identified A negative relationship No relationship A positive relationship
A negative relationship
Which of the following is a quantitative variable? House age All of the suggested choices House price House size
All of the suggested choices
descriptive statistics
Collecting, summarizing, and displaying data (reported based on observations)
Advantages of range
Easy to calculate
Conditional Probability Formula
P(A | B) = P(A and B) / P(B) or P (A | B) = P( A ∩ B) / P(B)
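A short Python sketch of the formula, using a hypothetical fair six-sided die (not from the course material): A is "the roll is even" and B is "the roll is greater than 3".

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die.
sample_space = set(range(1, 7))
A = {x for x in sample_space if x % 2 == 0}   # even roll: {2, 4, 6}
B = {x for x in sample_space if x > 3}        # roll above 3: {4, 5, 6}

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

# P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 2/3
```

Here P(A | B) = 2/3 differs from P(A) = 1/2, so these two events are not independent.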
correlation coefficient
The sample correlation coefficient, r_xy, indicates both the strength and the direction of the linear relationship between the independent and dependent variables.
positive z score
above the mean
estimated class width formula
(max data value - min data value) / k
z scores below -3 or above +3
outliers
wide classes result in few class intervals
- can obscure important patterns - gives a blocky distribution graph - tells little about the distribution shape
primary data collection methods
- direct observations - experiments - surveys
disadvantages of secondary data
- no control over how the data was collected - less reliable unless collected and recorded accurately
advantages of secondary data
- readily available - less expensive to collect
too many narrow classes in a histogram also have consequences
- results in a jagged histogram - some classes may be empty
the need for sampling
- too expensive to gather info on the entire pop - too time consuming to gather info on the entire pop - often impossible to gather info on the entire pop
Cross-section data
- values collected from a number of subjects during a single time period - subjects might include individuals, households, firms, industries, regions, countries
time series data
- values that correspond to specific measurements taken over a range of time periods - data can include hourly, weekly, monthly, quarterly, or annual observations
class width
- The width is the range of numbers to put into each class - Round this estimate to a useful whole number that makes the frequency distribution more readable - There is no one correct answer for the class width - The goal is to create a histogram that clearly and usefully shows the pattern in the data - Often there is more than one acceptable way to accomplish this
In a standard normal distribution, the probability that z is greater than zero is 1 1.96 0.5 at least 0.5
.5
probability of discrete probability distributions must be between
0 and 1
ideal number of classes in a frequency distribution
4-20 - Some data sets, particularly those with continuous data, require several values to be grouped together in a single class - This grouping prevents having too many classes in the frequency distribution, which can make it difficult to detect patterns
a binomial random variable
A binomial random variable is defined as the number of successes denoted by x achieved in the n trials and is a result of a Binomial Experiment Characteristics of a Binomial Experiment: 1. The experiment consists of a fixed number of (Bernoulli) trials, denoted by n 2. Each trial has only two possible outcomes, a success and a failure 3. The probability of a success p and the probability of a failure q are constant throughout the experiment 4. Each trial is independent of the other trials in the experiment
Scatterplot
A scatter plot is a graphical tool used to determine if two variables are related. Each point represents a pair of known values of the two variables for one observation In the relationship, we usually distinguish between dependent and independent variables
complement of an event
All possible outcomes that are not in the event P(A) + P(A' ) = 1 or P(A) = 1 - P(A' )
empirical method of probability
Assigning probabilities based on experimentation or historical data P(A) = (Frequency in which Event A occurs) / (Total number of observations)
subjective method of probability
Assigning probabilities based on judgment or experience
classical method of probability
Assigning probabilities based on the assumption of equally likely outcomes P(A) = (Number of possible outcomes that constitute Event A) / (Total number of possible outcomes in the sample space)
addition rule for mutually exclusive events formula
If A and B are mutually exclusive, P(A or B) = P(A) + P(B) or P(A ∪ B) = P(A) + P(B)
multiplication rule for dependent events
P(A | B) = P(A ∩ B) / P(B) or P(A ∩ B) = P(B) P(A | B)
Addition Rule for Probability formula
P(A or B) = P(A) + P(B) - P(A and B) or P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
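A small Python sketch of the addition rule, using a hypothetical standard 52-card deck (an assumption for illustration): A is "the card is a heart" and B is "the card is a face card". Subtracting P(A and B) avoids counting the three heart face cards twice.

```python
from fractions import Fraction
from itertools import product

# Hypothetical experiment: draw one card from a standard 52-card deck.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))

hearts = {card for card in deck if card[1] == "hearts"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}

def prob(event):
    return Fraction(len(event), len(deck))

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(hearts) + prob(faces) - prob(hearts & faces)

# Cross-check by counting the union directly.
p_union_direct = prob(hearts | faces)
print(p_union, p_union_direct)  # 11/26 11/26
```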
pie charts
Pie charts are a tool for comparing proportions for qualitative data Each segment of the pie represents the relative frequency of one category - All categories in the data set must be included in the pie - Use a pie chart to compare the relative sizes of all possible categories - Bar charts are more useful when you want to highlight the actual data values
the intersection of events
The intersection of events A and B is the set of all outcomes that are in both A and B The intersection of events A and B is denoted by A ∩ B
Mean of a discrete probability distribution
The mean, μ, of a discrete probability distribution is the weighted average of ALL values of the random variable • The weights are the probabilities • The mean does not have to be a value the random variable can assume • The mean is also known as the expected value, E(x)
the multiplication rule
The multiplication rule is used to determine the probability of the intersection (joint probability) of two events occurring, or P(A and B)
independent events
The outcome of one event does not affect the outcome of the second event P (A | B) = P(A)
Conditional Probability
The probability of an event given (or knowing) that another event has occurred is called a conditional probability. The conditional probability of A given B is denoted by P(A|B) and computed as: P(A|B) = P(A ∩ B) / P(B)
Percentile
The pth percentile divides a data set into two parts: • Approximately p percent of the observations have values less than the pth percentile • Approximately (100 − p) percent of the observations have values greater than the pth percentile
Sample Mean Formula
The sample mean formula is: x̄ = ( Σ xi ) / n x̄ just stands for the "sample mean" Σ means "add up" xi "all of the x-values" n means "the number of items in the sample" Finding the mean To find the mean, sum all the numbers and then divide by the number of items in the set. For example, to find the mean of the following set of numbers: 21, 23, 24, 26, 28, 29, 30, 31, 33 First add them all together: 21 + 23 + 24 + 26 + 28 + 29 + 30 + 31 + 33 = 245 Then divide your answer by the number of items in your set. There are 9 numbers, so: 245 / 9 = 27.222
the union of events
The union of events A and B is the event containing all outcomes that are in A or B or both. The union of events A and B is denoted by A ∪ B
mutually exclusive events
Two events are said to be mutually exclusive if the events have no outcomes in common Two events are mutually exclusive if, when one event occurs, the other cannot occur
Parameter
a described characteristic about a population
Statistic
a described characteristic about a sample
random variable
a numerical description of the outcome of an experiment
data set
all the data collected in a particular study
Elements
are the entities on which data are collected
negative z-score
below the mean
if data has exactly 2 modes
bimodal
The standard deviation of a sample of 100 elements taken from a very large population is determined to be 60. The variance of the population must be at least 100 can be any value greater than zero cannot be larger than 3600 cannot be larger than 60
can be any value greater than zero
qualitative data
classified by descriptive terms (labels or names used to identify an attribute of elements) examples: - hair color - eye color - political party - marital status
Measures of Relative Position
compare the position of one value in relation to other values in the data set
secondary data
data collected by someone else
information
data that are transformed into useful facts that can be used for a specific purpose, such as making a decision it is more than just data
quantitative data
described by numerical values (how many or how much) 1. counted: - number of children - defects per hour 2. measured: - weight - voltage
The Department of Transportation of a city has noted that on the average there are 17 accidents per day. The average number of accidents is an example of descriptive or inferential statistic?
descriptive
probability distributions can be
discrete or continuous
relative frequency distribution
displays the proportion of observations of each class relative to the total number of observations - shows the fraction of observations in each class - found by dividing each frequency by the total number of observations - fractions in a relative frequency distribution add up to 1
tabular data
frequency distribution, relative frequency distribution, and cumulative relative frequency distribution
graph data
histogram
mutually exclusive events does not equal
independent events • If one mutually exclusive event is known to occur, the other cannot occur. thus, the probability of the other event occurring is reduced to zero (and they are therefore dependent)
Based on a survey, it was concluded that households with children under the age of 18 are more likely to have access to the Internet than family households with no children. Is it an example of descriptive or inferential statistic?
inferential
2 types of quantitative data
interval and ratio
a variable
is a characteristic of interest for the elements
Histogram
is a graph showing the number or % of observations in each class - it is a graphical representation of a frequency distribution or the relative frequency distribution - classes of a variable of interest are placed on the horizontal axis - a rectangle is drawn above each class interval with its height corresponding to the frequency or relative frequency
classes letter
k
mean < median
left skewed
inferential statistics
making claims or conclusions about the population based on a sample of data; a sample statistic is calculated from the sample data and is used to make inferences about the unknown population parameter
a continuous random variable
may assume any numerical value in an interval or collection of intervals • Often measured, fractional values are possible • time required to complete a task; height; distance
a discrete random variable
may assume either a finite number of values or an infinite sequence of values • Values are whole numbers (integers), usually counted • number of complaints per day; number of TVs in a household
Which of the following is the most influenced by outliers? Median 75th percentile 1st quartile Mean
mean
central tendency
mean, median, mode
coefficient of variation
measures the SD in terms of its percentage of the mean and indicates how large the SD is in relation to the mean • A high CV indicates high variability relative to the mean • A low CV indicates low variability relative to the mean
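A brief Python sketch of the coefficient of variation, CV = (s / x̄) × 100. Both data sets are hypothetical and chosen so that the means differ greatly; the function name is illustrative.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

# Hypothetical data sets with very different means.
small_values = [10, 12, 14]         # mean 12, sample SD 2
large_values = [1000, 1020, 1040]   # mean 1020, sample SD 20

cv_small = coefficient_of_variation(small_values)
cv_large = coefficient_of_variation(large_values)

# The second set has the larger SD but the smaller CV,
# so it is less variable relative to its mean.
print(f"{cv_small:.2f}% vs {cv_large:.2f}%")
```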
When outliers are present in the data set, which measure is the best to describe central tendency in the data? weighted mean median mean standard deviation
median
The median is defined as the most frequent value of a data set middle point in a data set average of a data set closest value to the mean
middle point in a data set
if the data has more than 2 modes
multimodal
The normal distribution can well approximate the binomial distribution as long as: nq ≥ 5; np ≥ 5 and nq ≥ 5; np ≥ 5; we do not need any additional requirements because it is always the case
np ≥ 5 and nq ≥ 5
negative sample covariance
negative linear relationship
2 types of qualitative data
nominal and ordinal
2 measures of relative position
percentiles and quartiles
positive sample covariance
positive linear relationship
Addition Rule of Probability
provides a way to compute the probability of event A, or B, or both A and B occurring
nominal data
qualitative arbitrary labels for data. no ranking allowed examples: - eye color - zip code
ordinal data
qualitative ranking allowed. no measurable meaning to the number of differences example: - education level
interval data
quantitative meaningful differences. no true point zero (zero does not mean absence) example: - calendar year
ratio data
quantitative meaningful differences. true zero point (zero means absence) example: - income (48,000 or 0)
measures of variability
range, variance, standard deviation
sample
refers to a portion of the population that is representative of the population from which it was selected
class boundaries
represent min and max values for each class
Population
represents all possible subjects that are of interest in a particular study
Sample Variance Formula
s^2 = ∑(x - x̄)^2 / (n - 1) s^2 is the sample variance x is the value of each observation in the sample x̄ is the mean of the sample n is the number of observations in the sample
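A minimal Python sketch of the sample variance formula, applied to the nine values used in the Sample Mean Formula card (21, 23, ..., 33). The hand-written calculation is checked against the standard library's statistics.variance, which also divides by n - 1.

```python
import statistics

data = [21, 23, 24, 26, 28, 29, 30, 31, 33]

n = len(data)
x_bar = sum(data) / n                           # sample mean, 245 / 9
squared_devs = [(x - x_bar) ** 2 for x in data]
s2 = sum(squared_devs) / (n - 1)                # divide by n - 1, not n
s = s2 ** 0.5                                   # sample standard deviation

print(f"mean = {x_bar:.3f}, variance = {s2:.3f}, sd = {s:.3f}")
```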
zero z score
same as mean
measures of association between 2 variables
sample covariance and sample correlation coefficient
frequency distribution
shows the number of data observations that fall into specific intervals
Quartiles
split the ranked data into 4 equal groups:• The first quartile (Q1) is the value that constitutes the 25th percentile • The second quartile (Q2) is the value that constitutes the 50th percentile • Second quartile (the 50th percentile) = Median • The third quartile (Q3) is the value that constitutes the 75th percentile
Sample Covariance
sxy , measures the direction of the linear relationship between two variables
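A Python sketch of the sample covariance and the sample correlation coefficient, assuming the usual definitions s_xy = Σ(x - x̄)(y - ȳ) / (n - 1) and r_xy = s_xy / (s_x s_y). The data and function names are hypothetical.

```python
import math

def sample_covariance(x, y):
    """s_xy = sum((x - x_bar) * (y - y_bar)) / (n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

def sample_correlation(x, y):
    """r_xy = s_xy / (s_x * s_y); always between -1 and +1."""
    s_x = math.sqrt(sample_covariance(x, x))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

# Hypothetical paired observations: y tends to fall as x rises.
x = [1, 2, 3, 4, 5]
y = [9, 8, 6, 5, 2]

cov = sample_covariance(x, y)
r = sample_correlation(x, y)
print(f"covariance = {cov:.2f}, correlation = {r:.3f}")
```

The covariance is negative, which gives the direction only; the correlation is close to -1, which also shows that the linear relationship is strong.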
mean = median
symmetrical distribution
sample variance
s²
Z Scores are ...
the # of standard deviations above or below the mean
the sample variance is
the sum of the squared differences between each data value and the mean, divided by n - 1 (roughly the average squared difference)
Statistics
the mathematical science that deals with the collection, analysis, and presentation of data, which can then be used as a basis for inference and induction
If no data value or category repeats more than once, then we say that
the mode does not exist
observation
the set of measurements obtained for a particular element "a data set with n elements contains n observations"
if the sample covariance is 0
there is no linear relationship
cumulative relative frequency distribution
totals the proportion of observations that fall below the upper limit of each class - shows the accumulated proportion as class values vary from low to high - cumulative relative frequency for the highest class is equal to 1
Data
values assigned to observations or measurements
Weighted Mean Formula
x bar = ∑ (w • x) / ∑w
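A short Python sketch of the weighted mean formula. The scores and weights are hypothetical (for example, three course components weighted 50%, 30%, and 20%).

```python
# Hypothetical component scores and their weights.
scores = [80, 90, 70]
weights = [5, 3, 2]   # relative weights; they need not sum to 1

# Weighted mean: sum(w * x) / sum(w)
weighted_mean = sum(w * x for w, x in zip(weights, scores)) / sum(weights)

# Ordinary (unweighted) mean for comparison.
simple_mean = sum(scores) / len(scores)

print(weighted_mean, simple_mean)  # 81.0 80.0
```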
Population Mean Formula
μ = ( Σ X ) / N
Which of the following symbols represents the standard deviation of the population? μ σ σ² x̄
σ
disadvantages of range
• Based on two numbers in the data set and ignores the way in which the data are distributed • Sensitive to outliers Change in one value causes a dramatic change in the range The range does not accurately reflect the overall variability of the data
advantages of weighted mean
• Simple to calculate • Summarizes the data with a single value
disadvantages of weighted mean
• With only a summary value we lose information about the original data • The value of the mean is sensitive to outliers (values that are much higher or lower than most of the data)
a discrete probability distribution is
• a listing of all the possible outcomes of an experiment for a discrete random variable • along with the relative frequency of each outcome
discrete probability distribution
• describes how probabilities are distributed over the values of the random variable • can be represented by a table, a graph, or a formula • a formula used to describe discrete probability distribution is called a probability function, denoted by P(x) (sometimes f(x)), which provides the probability for each value of the random variable x
multiplication rule for independent events
P(A ∩ B) = P(A) P(B)
Which one of the following is NOT a valid number for a probability? 0 1.5 0.63 17%
1.5
x is a random variable with the probability function P(x) = x/6 for x = 1, 2 or 3. The expected value of x is 0.5 2 2.333 0.333
2.333
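A minimal Python sketch of the expected-value calculation for this question, E(x) = Σ x · P(x), using exact fractions. Only the variable names are assumptions.

```python
from fractions import Fraction

# Probability function from the question: P(x) = x / 6 for x = 1, 2, 3.
values = [1, 2, 3]
probabilities = {x: Fraction(x, 6) for x in values}

# A valid discrete distribution has probabilities that sum to 1.
total_probability = sum(probabilities.values())

# Expected value: each value weighted by its probability.
expected_value = sum(x * p for x, p in probabilities.items())

print(total_probability, expected_value)  # 1 7/3  (7/3 is about 2.333)
```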
Consider the following frequency distribution (class: frequency): 12 up to 15: 3; 15 up to 18: 6; 18 up to 21: 3; 21 up to 24: 4; 24 up to 27: 4. The total number of observations to construct this frequency distribution is 20 24 6 4
20
Twenty percent of the students in a class of 100 are planning to go to graduate school. The standard deviation of this binomial distribution is 20 4 2 16
4
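A Python sketch of the binomial mean and standard deviation for this question, assuming the standard formulas μ = np and σ = √(npq) with q = 1 - p. It also checks the np ≥ 5 and nq ≥ 5 condition for the normal approximation.

```python
import math

n = 100   # number of students (trials)
p = 0.20  # probability a student plans to attend graduate school
q = 1 - p

mean = n * p                # expected number of students, about 20
sd = math.sqrt(n * p * q)   # standard deviation, about 4

# Rule of thumb for using the normal approximation to the binomial.
normal_approx_ok = (n * p >= 5) and (n * q >= 5)

print(f"mean = {mean:.1f}, sd = {sd:.1f}, normal approximation ok: {normal_approx_ok}")
```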
The statistics professor has kept attendance records and recorded the number of absent students per class. The recorded data is displayed in the following bar chart with the frequency of each number of absent students shown above the bars. How many statistics classes had three or more students absent? 43 22 8 13
43
Assume that you have a binomial experiment with p = 0.5 and a sample size of 100. The expected value for this distribution is 50 100 5 0.5
50
Suppose the wait through immigration at JFK Airport in New York is thought to be bell-shaped and symmetrical distributed with a mean of 22 minutes. It is known that 68 percent of travelers will spend between 16 and 28 minutes waiting to pass through immigration. The standard deviation for the wait time through immigration is 10 minutes 12 minutes 8 minutes 6 minutes
6 minutes
An analyst constructed the following frequency distribution on the monthly returns for 50 selected stocks (class in percent: frequency): -10 up to 0: 8; 0 up to 10: 25; 10 up to 20: 15; 20 up to 30: 2. The cumulative relative frequency for the class "0 up to 10" is 0.66. This means that 66% of stocks have returns above 10% 66% of stocks have returns below 10% 66% of stocks have returns below 0% 66% of stocks have returns between 0% and 10%
66% of stocks have returns below 10%
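A Python sketch that rebuilds the relative and cumulative relative frequencies for the stock-return table in this question. The class labels and counts come from the question; everything else is illustrative.

```python
from fractions import Fraction
from itertools import accumulate

classes = ["-10 up to 0", "0 up to 10", "10 up to 20", "20 up to 30"]
frequencies = [8, 25, 15, 2]
total = sum(frequencies)   # 50 stocks

# Relative frequency: each class count divided by the total.
relative = [Fraction(f, total) for f in frequencies]

# Cumulative relative frequency: running total of the relative frequencies.
cumulative = list(accumulate(relative))

for label, rel, cum in zip(classes, relative, cumulative):
    print(f"{label:>12}: relative = {float(rel):.2f}, cumulative = {float(cum):.2f}")
```

The cumulative value for "0 up to 10" is (8 + 25) / 50 = 0.66, the share of stocks with returns below the upper limit of 10%.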
The following data represent the recent sales price (in $1,000s) of 24 homes in a Midwestern city: 187 125 165 170 230 139 195 229 239 135 188 210 228 172 127 139 122 181 196 237 115 199 170 239. Suppose the data are grouped into five classes, and one of them will be "165 up to 190." The frequency of this class is 7 6 6/24 7/24
7
The following data represent the recent sales price (in $1,000s) of 24 homes in a Midwestern city: 187 125 165 170 230 139 195 229 239 135 188 210 228 172 127 139 122 181 196 237 115 199 170 239. Suppose the data are grouped into five classes, and one of them will be "115 up to 140." The relative frequency of this class is 6/24 7/24 6 7
7/24
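A Python sketch that builds the five-class frequency distribution for the 24 home prices. It assumes the estimated class width (max - min) / k is rounded up to the whole number 25 and that the first class starts at the minimum value, which reproduces the "115 up to 140" and "165 up to 190" classes named in the questions.

```python
import math
from fractions import Fraction

prices = [187, 125, 165, 170, 230, 139, 195, 229, 239, 135, 188, 210,
          228, 172, 127, 139, 122, 181, 196, 237, 115, 199, 170, 239]

k = 5                                                # number of classes
estimated_width = (max(prices) - min(prices)) / k    # (239 - 115) / 5 = 24.8
width = math.ceil(estimated_width)                   # rounded up to 25 (assumption)
start = min(prices)                                  # first class starts at 115

# Each class includes its lower limit and excludes its upper limit ("up to").
frequencies = [0] * k
for price in prices:
    frequencies[(price - start) // width] += 1

for i, freq in enumerate(frequencies):
    low = start + i * width
    print(f"{low} up to {low + width}: frequency {freq}, "
          f"relative frequency {freq}/{len(prices)}")

relative_first_class = Fraction(frequencies[0], len(prices))
```

Note that the class "140 up to 165" is empty for this data set.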
What is probability? A numerical value assigned to an event that measures the number of its occurrences A value between 0 and 1 assigned to an event that measures the likelihood of its occurrence A value between 0 and 1 assigned to an event that measures the unlikelihood of its occurrence Any value between 0 and 1 randomly assigned to an event
A value between 0 and 1 assigned to an event that measures the likelihood of its occurrence
The accompanying chart shows the numbers of books written by each author in a collection of cookbooks. What type of chart is this? Histogram for qualitative data Bar chart for quantitative data Bar chart for qualitative data Histogram for quantitative data
Bar chart for qualitative data
How do we find the median if the number of observations in a data set is odd? By taking the middle value in the sorted data set after eliminating outliers By averaging the first and the third quartiles By taking the middle value in the sorted data set By averaging the minimum and maximum values
By taking the middle value in the sorted data set
Which of the following variables is qualitative? Weight Gender Temperature Height
Gender
Which one of the following statements about probability is NOT true? If Events A and B are mutually exclusive, then Event A and Event B must occur at the same time during the experiment If P(A) = 1, then with certainty, Event A must occur The probability of any event must range between 0 and 1 If P(A) = 0, then with certainty, Event A will not occur
If Events A and B are mutually exclusive, then Event A and Event B must occur at the same time during the experiment
Suppose that you conduct a study in which you observe parents and their children interacting at home. You find that the more supportive parents are, the less aggressive their children are. What conclusion can you make? Level of support and aggression are negatively correlated. Children being aggressive causes parents to be less supportive. Parents being more supportive causes children to be less aggressive. Level of support and aggression are positively correlated.
Level of support and aggression are negatively correlated.
The Boom company has recently decided to raise the salaries of all employees by 10%. Which of the following is(are) expected to be affected by this raise? Mode and median only Mean, median, and mode Mean and mode only Mean and median only
Mean, median, and mode
Frequency distributions may be used to describe which of the following types of data? Nominal, ordinal, interval, and ratio data Nominal, ordinal, and interval data only Nominal and ordinal data only Nominal and interval data only
Nominal, ordinal, interval, and ratio data
If two groups of numbers have the same mean, then their standard deviations must also be equal None of the suggested alternatives is correct their modes must also be equal their medians must also be equal
None of the suggested alternatives is correct
The mean of the sample can never be negative is always smaller than the mean of the population from which the sample was taken can never be zero None of the suggested alternatives is correct
None of the suggested alternatives is correct
It is known that the length of a certain product x is normally distributed with µ = 20 inches. How is the probability P(x > 16) related to P(x < 16)? P(x > 16) is smaller than P(x < 16) No comparison can be made with the given information P(x > 16) is greater than P(x < 16) P(x > 16) is the same as P(x < 16)
P(x > 16) is greater than P(x < 16)
If x has a normal distribution with µ = 100 and σ = 5, then the probability P ( 90 ≤ x ≤ 95 ) can be expressed in terms of a standard normal variable z as P ( − 2 ≤ z ≤ − 1 ) P ( − 2 ≤ z ≤ − 2 ) P ( − 2 ≤ z ≤ 1 ) P ( 2 ≤ z ≤ 1 )
P ( − 2 ≤ z ≤ − 1 )
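A Python sketch of the standardization step for this question, z = (x - µ) / σ. It also uses the standard library's statistics.NormalDist to confirm that the probability is the same on the x scale and the z scale; the function name z_score is illustrative.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

mu, sigma = 100, 5

z_low = z_score(90, mu, sigma)    # -2.0
z_high = z_score(95, mu, sigma)   # -1.0

# Probability computed on the original scale and on the standard normal scale.
x_dist = NormalDist(mu, sigma)
z_dist = NormalDist()             # mean 0, standard deviation 1
p_x = x_dist.cdf(95) - x_dist.cdf(90)
p_z = z_dist.cdf(z_high) - z_dist.cdf(z_low)

print(z_low, z_high, round(p_x, 4), round(p_z, 4))
```

Both probabilities come out to roughly 0.136, matching P(-2 ≤ z ≤ -1).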
The accompanying chart shows the numbers of books written by each author in a collection of cookbooks. What type of data is being represented? Qualitative, ordinal Qualitative, nominal Quantitative, interval Quantitative, ratio
Qualitative, nominal
Which of the following is an example of time series data? Quarterly housing starts collected over the last 60 years Starting salaries of recent business graduates at IU Results of market research conducted in 2012 testing consumer preferences for soda The sale prices of townhouses sold last year
Quarterly housing starts collected over the last 60 years
Which of the following represents a population and a sample from that population? Freshmen at IU and basketball players at IU Teachers of a high school and members of the parent-teacher group Fans at a concert who purchase T-shirts, and fans at a concert who purchase soda Residents of Albany, New York, and registered voters in Albany, New York
Residents of Albany, New York, and registered voters in Albany, New York
Which of the following is an example of cross-sectional data? Quarterly housing starts collected over the last 60 years Daily price of DuPont stock during the first quarter Results of market research conducted in 2012 testing consumer preferences for soda GDP of the United States from 1990-2010
Results of market research conducted in 2012 testing consumer preferences for soda
Which of the following can be represented by a discrete random variable? The average outside temperature taken every day for two weeks The height of college students The number of obtained spots when rolling a six-sided die The finishing time of participants in a cross-country meet
The number of obtained spots when rolling a six-sided die
Which of the following is NOT a characteristic of a binomial experiment? The probability of a success must exceed the probability of a failure. Each trial is independent of the other trials in the experiment. Each trial has only two possible outcomes—a success or a failure. The experiment consists of a fixed number of trials.
The probability of a success must exceed the probability of a failure.
Which of the following is NOT a characteristic of the normal probability distribution? The mean of the distribution can be negative, zero, or positive The distribution is symmetrical The standard deviation must be 1 The mean, median, and the mode are equal
The standard deviation must be 1
Which of the following can be represented by a continuous random variable? The score of a randomly selected student on a five-question multiple-choice quiz The number of defective light bulbs in a sample of five The number of arrivals to a drive-through bank window in a four-hour period The time of a flight between Chicago and New York
The time of a flight between Chicago and New York
Is it possible for a data set to have more than one mode? No, there must always be a single mode, or else there is no mode. Yes, if there are at least two different values in a data set, there is always more than one mode. Yes, if two or more values in a data set occur the same number of times. Yes, if two or more values in a data set occur with the most frequency.
Yes, if two or more values in a data set occur with the most frequency.
Positive values of the correlation coefficient indicate a positive variance of the x values a positive relation between the independent and the dependent variables a positive variance of the y values that the standard deviations of both x and y are positive
a positive relation between the independent and the dependent variables
A population consists of a subject of interest in a sample all items of interest in a sample a subject of interest in a study all items of interest in a study
all items of interest in a study
A survey asked randomly selected adults 18-29 years old to indicate their working status. The results are shown in the following table (working status: frequency): Working full-time: 103; Working part-time: 60; Student, not working: 31; Unemployed: 56. The probability that a randomly selected adult 18-29 years old is unemployed calculated using this data would be an empirical probability a classical probability a subjective probability a simple probability
an empirical probability
A continuous random variable may assume only the positive integer values in an interval only integer values in an interval or collection of intervals any value in an interval or collection of intervals only fractional values in an interval or collection of intervals
any value in an interval or collection of intervals
Population parameters are difficult to calculate due to the infeasibility of collecting data on the entire population the fact that samples are difficult to draw due to the nature of the data cost prohibitions on data collection both cost prohibitions on data collection and the infeasibility of collecting data on the entire population
both cost prohibitions on data collection and the infeasibility of collecting data on the entire population
Comparing the consistency (variability) between two data sets when their means are very different is best done with the coefficient of variation z-score standard deviation median
coefficient of variation
Since the population is always larger than the sample, the value of the sample mean is always larger than the true value of the population mean could be larger, equal to, or smaller than the true value of the population mean is always smaller than the true value of the population mean is always equal to the true value of the population mean
could be larger, equal to, or smaller than the true value of the population mean
Which of the following measures the direction of the linear relationship between two variables but does not measure the strength of the relationship? variance coefficient of variation covariance correlation coefficient
covariance
Suppose that you obtained the following scatter plot for the relationship between variables x and y. What would you conclude about sample covariance for this data? covariance is negative covariance is negative and close to -1 covariance can take any value covariance is negative and close to 0
covariance is negative
Your business statistics class had a test last week. The average score for the class is an example of secondary data descriptive statistics qualitative data inferential statistics
descriptive statistics
The Law of Large Numbers states that when an experiment is conducted a large number of times, the [ Select ] ["empirical", "subjective", "classical"] probabilities of the process will converge to the [ Select ] ["subjective", "empirical", "classical"] probabilities.
empirical classical
In inferential statistics, we calculate sample statistics to Neither of suggested choices is correct summarize the information about the sample make conclusions about the sample estimate unknown population parameters
estimate unknown population parameters
A qualitative variable assumes meaningful numerical values.
false
Cross-sectional data contain values of a characteristic of one subject collected over time.
false
Nominal data has all the features of interval data with the added benefit of having a true zero point.
false
Population parameters are used to estimate corresponding sample statistics.
false
The branch of statistical studies called inferential statistics refers to drawing conclusions about sample data by analyzing the corresponding population.
false
The mathematical operation of addition can be performed on nominal data.
false
Typically, it is possible to examine every member of the population.
false
Which scales of data measurement are associated with quantitative data? Interval and ratio Ratio and nominal Nominal and ordinal Ordinal and interval
interval and ratio
The Fahrenheit scale for measuring temperature would be classified as a(n) nominal scale ratio scale ordinal scale interval scale
interval scale
If a data set has an even number of observations, the median is the average value of the two middle items is the average value of the two middle items when all items are arranged in ascending order cannot be determined must be equal to the mean
is the average value of the two middle items when all items are arranged in ascending order
The center of a normal curve is always equal to zero cannot be negative is the mean of the distribution is the standard deviation
is the mean of the distribution
The covariance ranges between minus infinity and plus infinity 0 and 1 0 and 100 -1 and +1
minus infinity and plus infinity
Which measure would you use to describe central tendency for qualitative data? mean median mode weighted mean
mode
During a cold winter, the temperature stayed below zero for ten days (ranging from -20 to -5). The variance of the temperatures of the ten-day period is negative since all the numbers are negative cannot be computed since all the numbers are negative must be positive can be either negative or positive
must be positive
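To see why the variance stays positive, a quick sketch with ten hypothetical temperatures between -20 and -5 (the card does not give the actual readings). Squared deviations are never negative, so their average cannot be negative either.

```python
import statistics

# Hypothetical readings; every value is below zero.
temps = [-20, -18, -15, -12, -10, -9, -8, -7, -6, -5]

mean_temp = statistics.mean(temps)
squared_deviations = [(t - mean_temp) ** 2 for t in temps]  # each term is >= 0
sample_var = statistics.variance(temps)  # divides by n - 1

print(mean_temp, sample_var)
```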
A recent survey of 200 small firms (annual revenue less than $10 million) asked whether an increase in the minimum wage would cause the firm to decrease capital spending. Possible responses to the survey question were: "Yes," "No," or "Don't Know." The scale of measurement for the data generated by this survey question is best classified as interval ordinal nominal ratio
nominal
The covariance between the returns on two assets is negative. This occurs when on average, the return on one asset increases while the return on the other asset decreases the variance of one asset has a negative linear relationship with the variance of the other asset on average, the return on one asset decreases while the return on the other asset also decreases the standard deviation of one asset has a positive linear relationship with the standard deviation of the other asset
on average, the return on one asset increases while the return on the other asset decreases
The sum of the relative frequencies for all classes will always equal to the number of classes can be any value one the sample size
one
What kind of data assumes that the differences between categories are meaningless? ratio interval ordinal measured
ordinal
μ is an example of a population parameter mode sample statistic population variance
population parameter
Which of the following indicates the strongest relationship between two variables? r = 0.09 r = -0.6 r = 0.5 r = 2
r = -0.6
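A small sketch of the reasoning: strength is judged by the absolute value of r, and any value outside the range -1 to +1 (such as r = 2) is not a valid correlation coefficient. The function name is hypothetical.

```python
def strongest_correlation(candidates):
    """Return the valid r with the largest absolute value."""
    valid = [r for r in candidates if -1 <= r <= 1]  # r = 2 is discarded here
    return max(valid, key=abs)

choices = [0.09, -0.6, 0.5, 2]
print(strongest_correlation(choices))
```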
What is the scale of measurement of the distance between any two locations? Ordinal Interval Nominal Ratio
ratio
The diagram below is an example of a scatter plot illustrating a lack of correlation between tobacco and alcohol scatter plot illustrating a positive correlation between tobacco and alcohol scatter plot illustrating a perfect correlation between tobacco and alcohol histogram illustrating a positive correlation between tobacco and alcohol
scatter plot illustrating a positive correlation between tobacco and alcohol
The following graph shows the curb weight of seven cars, in pounds, along with their corresponding highway miles per gallon: This graph is an example of a line chart, and miles per gallon is the dependent variable in the graph scatter plot, and curb weight is the dependent variable in the graph line chart, and curb weight is the independent variable in the graph scatter plot, and miles per gallon is the dependent variable in the graph
scatter plot, and miles per gallon is the dependent variable in the graph
A smaller standard deviation for the normal probability distribution results in a fatter curve that is more spread out around the mean and not as tall skinnier curve that is more spread out around the mean and not as tall skinnier curve that is tighter and taller around the mean fatter curve that is tighter and taller around the mean
skinnier curve that is tighter and taller around the mean
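One way to check this numerically, sketched with two assumed standard deviations (0.5 and 2.0) and a mean of 0. `statistics.NormalDist` is part of the Python standard library (3.8+).

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=0.5)  # smaller standard deviation
wide = NormalDist(mu=0, sigma=2.0)    # larger standard deviation

# Height of each curve at the mean: the narrow curve is taller.
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)

# Share of the area within 1 unit of the mean: the narrow curve is tighter.
within_narrow = narrow.cdf(1) - narrow.cdf(-1)
within_wide = wide.cdf(1) - wide.cdf(-1)

print(peak_narrow, peak_wide, within_narrow, within_wide)
```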
Data that describe a characteristic about a sample is known as a population survey parameter statistic
statistic
If the variance of a data set is correctly computed with the formula using n - 1 in the denominator, which of the following is true? the data set could be either a sample or a population the data set includes all subjects of interest in the study the data set is a population the data set is a sample
the data set is a sample
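A sketch contrasting the two denominators on a made-up data set. The helper `variance` and its `sample` flag are hypothetical; `statistics.variance` and `statistics.pvariance` are used only to cross-check.

```python
import statistics

def variance(values, sample=True):
    """Sum of squared deviations divided by n - 1 (sample) or n (population)."""
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values)
    return total / (n - 1) if sample else total / n

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean = 5

print(variance(data, sample=True))   # n - 1 in the denominator
print(variance(data, sample=False))  # n in the denominator
```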
For a continuous random variable x, the probability density function f(x) represents the probability at a given value of x the area under the curve at x the area under the curve to the right of x the height of the function at x
the height of the function at x
Which one of the following statements is true for a right-skewed distribution? The mean is roughly equal to the median. The mean is less than the median. The mean is greater than the mode. the mean is greater than the median.
the mean is greater than the median.
An analyst constructed the following frequency distribution on the monthly returns for 50 selected stocks: Class (in percent) Frequency -10 up to 0 8 0 up to 10 25 10 up to 20 15 20 up to 30 2 The frequency for the class "0 up to 10" is 25. This means that the number of stocks with returns of less than 10% is 25. the number of stocks with returns of 10% is 25. the percentage of stocks with returns between 0% and 10% is 25%. the number of stocks with returns between 0% and 10% is 25.
the number of stocks with returns between 0% and 10% is 25.
Which of the following characteristics does the interval scale NOT have? There is a true zero point. Numerical values are used to measure various characteristics. Values can be ranked. The differences between values are valid.
there is a true zero point
The data represents the stock price for Google at the end of the past four quarters. Which of the following types of data best describes these values? Time series Nominal Cross-sectional Ordinal
time series
A population is a larger data set than its corresponding sample.
true
Mathematical operations can be performed on ratio-scaled data.
true
The branch of statistical studies called descriptive statistics summarizes important aspects of a data set.
true
Which of the following symbols represents the mean of the sample? σ μ n x̄
x̄
Which of the following statements is NOT true concerning the attributes of z-scores? z-scores can be positive or negative for data values above the mean of the distribution z-scores are equal to zero for data values equal to the mean of the distribution z-scores are negative for data values below the mean of the distribution z-scores are positive for data values above the mean of the distribution
z-scores can be positive or negative for data values above the mean of the distribution
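The sign rule follows directly from the z-score formula, z = (x - mean) / standard deviation. A sketch with assumed values (mean 2,500 and standard deviation 300 are illustrative only):

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

mean, std_dev = 2500, 300  # assumed for illustration

above = z_score(2800, mean, std_dev)    # value above the mean
at_mean = z_score(2500, mean, std_dev)  # value equal to the mean
below = z_score(2200, mean, std_dev)    # value below the mean
print(above, at_mean, below)
```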
For an experiment in which a single die is rolled, the sample space is {1, 1, 3, 4, 5, 6} All suggested alternatives can be viewed as the sample space {1, 2, 3, 4, 4, 5} {2, 1, 3, 6, 5, 4}
{2, 1, 3, 6, 5, 4}
Consider the sample space of an experimental outcome denoting days of the week: S = {Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday}. Let's define an event: A = {Days when I have my stats classes} = {Tuesday, Thursday}. What is the complement of the event A? {Monday, Wednesday, Friday} {Tuesday, Thursday} {Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday} {Monday, Wednesday, Friday, Saturday, Sunday}
{Monday, Wednesday, Friday, Saturday, Sunday}
Consider two events: A = {apple pie, peach pie, pumpkin pie}; B = {cherry pie, blueberry pie, pumpkin pie}. The union of these events is {pumpkin pie} {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie} {apple pie, peach pie, cherry pie, blueberry pie} {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie, pumpkin pie}
{apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie}
Consider two events: A = {apple pie, peach pie, pumpkin pie}; B = {cherry pie, blueberry pie, pumpkin pie}. The intersection of these events is {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie, pumpkin pie} {apple pie, peach pie, cherry pie, blueberry pie} {pumpkin pie} {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie}
{pumpkin pie}
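The complement, union, and intersection from the three cards above map directly onto Python's built-in set operations. A sketch using the same sets:

```python
# Sample space and event from the days-of-the-week card.
S = {"Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"}
A_days = {"Tuesday", "Thursday"}
complement = S - A_days  # everything in S that is not in A

# Events from the pie cards.
A = {"apple pie", "peach pie", "pumpkin pie"}
B = {"cherry pie", "blueberry pie", "pumpkin pie"}
union = A | B         # in A or B; a shared element is listed once
intersection = A & B  # in both A and B

print(complement, union, intersection)
```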
The following table shows the probability distribution for the number of boats sold daily at Boats Unlimited. What is the probability that exactly 3 boats are sold? x: 0, 1, 2, 3, 4, 5; f(x): 0.20, 0.30, 0.32, ?, 0.05, 0.02. Can be any value between 0 and 1 0.62 0.11 Need to know the distribution function to calculate the probability
.11
If P(A) = 0.79, find P(A'). 0.105 0.21 0.395 0.79
.21
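Both of the last two cards rest on the same fact: probabilities over the whole sample space sum to 1. A sketch of the arithmetic (results are rounded to two decimals to avoid floating-point noise):

```python
# Boats card: the known probabilities for x = 0, 1, 2, 4, 5.
known = [0.20, 0.30, 0.32, 0.05, 0.02]
p_three_boats = round(1 - sum(known), 2)  # the missing f(3)

# Complement card: P(A') = 1 - P(A).
p_a = 0.79
p_a_complement = round(1 - p_a, 2)

print(p_three_boats, p_a_complement)
```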
Consider the following frequency distribution: Class Frequency 12 up to 15 3 15 up to 18 6 18 up to 21 3 21 up to 24 4 24 up to 27 4 What is the relative frequency of the class "15 up to 18"? 0.3 0.2 0.25 0.35
.3
For the standard normal probability distribution, the area under the probability density function to the left of the mean is 0.5 any value between 0 to 1 1 Cannot say exactly without knowing the value of the mean
.5
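A quick numerical check of the symmetry argument, using `statistics.NormalDist` from the standard library. The second distribution (mean 10, standard deviation 3) is an assumed example showing that the same holds for any normal curve, not only the standard one.

```python
from statistics import NormalDist

standard = NormalDist()               # mean 0, standard deviation 1
left_of_mean = standard.cdf(0)        # area under the curve to the left of the mean

other = NormalDist(mu=10, sigma=3)    # assumed example
left_of_mean_other = other.cdf(10)

print(left_of_mean, left_of_mean_other)
```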
Consider the following frequency distribution: Class Frequency 12 up to 15 3 15 up to 18 6 18 up to 21 3 21 up to 24 4 24 up to 27 4 What is the cumulative relative frequency of the class "18 up to 21"? 0.9 0.6 0.3 1.00
.6
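The relative and cumulative relative frequencies for this distribution can be computed as follows. This is a sketch; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["12 up to 15", "15 up to 18", "18 up to 21", "21 up to 24", "24 up to 27"]
frequencies = [3, 6, 3, 4, 4]
total = sum(frequencies)  # 20 observations in all

# Relative frequency: class count divided by the total count.
relative = [f / total for f in frequencies]

# Cumulative relative frequency: running count divided by the total count.
cumulative = [c / total for c in accumulate(frequencies)]

for row in zip(classes, frequencies, relative, cumulative):
    print(row)
```

The relative frequencies sum to one, and the last cumulative relative frequency is always 1.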
For any continuous random variable, the probability that the random variable takes on exactly a specific value is 0.5 1 any value between 0 to 1 0
0
Consider the following probability distribution for the random variable x: x = -2, -1, 0, 1 with P(x) = 0.2, 0.1, 0.3, 0.4, respectively. The expected value of this random variable is: 1 0.1 -0.1 -1
-.1
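The expected value is the probability-weighted sum of the outcomes, E(x) = Σ x · P(x). A sketch using the distribution from this card:

```python
outcomes = [-2, -1, 0, 1]
probabilities = [0.2, 0.1, 0.3, 0.4]

# Multiply each outcome by its probability, then add the products:
# (-2)(0.2) + (-1)(0.1) + (0)(0.3) + (1)(0.4)
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))

print(round(expected_value, 2))
```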
The covariance between the returns of stocks A and B is -0.112. The standard deviation of the rates of return is 0.26 for stock A and 0.81 for stock B. The correlation of the rates of return between A and B is the closest to -2.52 -0.53 -0.24 0.53
-.53
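The correlation is the covariance divided by the product of the two standard deviations. A sketch of that calculation with the numbers from this card (the function name is hypothetical):

```python
def correlation_from_covariance(cov_ab, std_a, std_b):
    """r = cov(A, B) / (std_a * std_b)."""
    return cov_ab / (std_a * std_b)

r = correlation_from_covariance(-0.112, 0.26, 0.81)
print(round(r, 2))
```

Note that -2.52 could be ruled out without any arithmetic, since a correlation must lie between -1 and +1.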
A survey of adults who typically work full-time from home recorded their current education levels. The results are shown in the table below: Education Level | Frequency: Bachelor's degree or higher | 37; Some college | 13; High school diploma only | 7; Less than high school diploma | 3. The probability that a randomly selected adult who works from home has less than a high school diploma is 0.2 0.05 0.02 0.17
.05
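This is an empirical probability: the frequency of the category divided by the total number of respondents. A sketch using the survey counts from this card:

```python
survey = {
    "Bachelor's degree or higher": 37,
    "Some college": 13,
    "High school diploma only": 7,
    "Less than high school diploma": 3,
}

total_respondents = sum(survey.values())
p_less_than_hs = survey["Less than high school diploma"] / total_respondents

print(total_respondents, p_less_than_hs)
```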