ISDS
The measure of central location that can BEST be labeled as the midpoint of the data set is the _______.
Median
This is the symbol for the population size.
N
This scale represents the least sophisticated level of measurement
Nominal
Chebyshev's theorem results in _____ bounds for the percentage of observations falling in a particular interval
conservative
A _____ variable is characterized by uncountable values within an interval.
continuous
Data that are collected about many subjects at the same point in time or without regard to differences in time are known as ______ data.
cross-sectional
Every day, consumers and businesses use data from various sources to help make ____
decisions
The branch of statistics that summarizes important aspects of a data set is often referred to as ______ statistics.
descriptive
Consider the following variable: the number of people in a household. This variable is best categorized as a ______ variable.
discrete
Chebyshev's theorem provides the proportion of observations that lie within k standard deviations of the mean. The value k must be ______
greater than 1
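A minimal Python sketch of the bound behind these Chebyshev cards, assuming the standard statement that at least 1 - 1/k^2 of observations lie within k standard deviations of the mean. The function name chebyshev_lower_bound is hypothetical.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        # The theorem only gives a useful bound for k greater than 1.
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


print(chebyshev_lower_bound(2))  # at least 75% within two standard deviations
print(chebyshev_lower_bound(3))  # at least about 89% within three
```

The bound holds for any distribution shape, which is why it is conservative compared with the empirical rule.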
The ______ strategy recommends that the missing values be replaced with some reasonable imputed values
imputation
Samples are primarily used to make _________ about population parameters
inferences
The branch of statistics that draws conclusions about a large set of data based on a smaller set of data is often referred to as ______ statistics.
inferential
The average of the absolute differences between the values of the data set and the mean is the ______.
mean absolute deviation
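As a rough illustration, the mean absolute deviation can be computed in Python as follows. The function name and the sample values are made up for this sketch.

```python
from statistics import mean


def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    center = mean(values)
    return mean(abs(x - center) for x in values)


# Hypothetical data: the mean is 6 and the absolute deviations are 4, 2, 0, 2, 4.
data = [2, 4, 6, 8, 10]
mad = mean_absolute_deviation(data)
print(mad)
```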
The _____ and the _____ are the most extensively used measures of central location and dispersion, respectively
mean; standard deviation
_____ data allows us to review the range of values for each variable.
sorting
The square root of the average of the squared deviations from the mean is known as the ______.
standard deviation
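A short Python sketch of that definition, using made-up data. It assumes the usual distinction between the population version (divide by N) and the sample version (divide by n - 1). The card itself does not say which one is meant.

```python
from math import sqrt

# Hypothetical data with mean 6.
data = [2, 4, 6, 8, 10]
center = sum(data) / len(data)

# Squared deviations from the mean: 16, 4, 0, 4, 16.
squared_deviations = [(x - center) ** 2 for x in data]

# Population standard deviation: average the squared deviations over N.
population_sd = sqrt(sum(squared_deviations) / len(data))

# Sample standard deviation: divide by n - 1 instead of n.
sample_sd = sqrt(sum(squared_deviations) / (len(data) - 1))

print(population_sd, sample_sd)
```

The variance is the same quantity before the square root is taken.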
______ is the process of extracting a portion of a data set that is relevant for subsequent statistical analysis.
subsetting
A distribution is ____ if one side of the histogram is a mirror image of the other side.
symmetric
The empirical rule is appropriate when the distribution of a variable is ___ and ____ shaped.
symmetric; bell
The function to find the mean of a subset in R is ____
tapply
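In R the call takes the form tapply(values, groups, mean). The Python sketch below is only a rough analogue for readers who want to see what such a grouped mean does; the function name group_means and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_means(values, groups):
    """Mean of the values within each group label."""
    buckets = defaultdict(list)
    for value, group in zip(values, groups):
        buckets[group].append(value)
    return {group: mean(members) for group, members in buckets.items()}


# Hypothetical salaries labeled by department.
salaries = [10, 20, 30, 40]
departments = ["a", "b", "a", "b"]
means_by_department = group_means(salaries, departments)
print(means_by_department)
```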
A company wants to estimate the mean price of oil over the past 10 years. What type of data does the company need?
time series
Data that are collected by recording a characteristic of a subject over several time periods are referred to as ______ data.
time series
________ data can include hourly, daily, weekly, monthly, quarterly, or annual observations.
time series
This type of data does not conform to a predefined, row-column format.
unstructured
A characteristic of interest that differs among various observations is referred to as a ______.
variable
Which of the following is not a measure of central location?
variance
Two widely used measures of dispersion are
variance and standard deviation
A _____ mean is used to calculate the mean of a frequency distribution.
weighted
The term ________ relates to the way data tend to cluster around some middle or central value
central location
The weakness with this scale of data is that we cannot interpret the difference between the ranked values because the actual numbers used are arbitrary.
ordinal
With this scale of data, we can only categorize and rank the observations with respect to some characteristic or trait.
ordinal
The ______ measures the difference between the largest and smallest values in a data set.
range
A ______ includes all items of interest in a statistical problem.
population
The empirical rule states that approximately ____ % of observations will fall within three standard deviations of the mean.
100 (more precisely, about 99.7)
The mode's usefulness as a measure of central location tends to diminish with variables that have more than ___ modes
3
This is the symbol for the sample mean.
x̄ (x-bar)
Many experts believe that _____% of the data in the world today were created in the last two years alone.
90
The empirical rule states that approximately ___% of observations will fall within two standard deviations of the mean
95
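The empirical-rule percentages can be checked against an exactly normal distribution. This Python sketch assumes a standard normal variable; the helper name share_within is hypothetical.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Roughly 68%, 95%, and 99.7% for k = 1, 2, 3.
for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

For real data the percentages are only approximate, and only when the distribution is roughly symmetric and bell-shaped.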
If Fund A has a coefficient of variation of 1.1, and Fund B has a coefficient of variation of 0.9, Fund ______ has the greater relative dispersion.
A
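The coefficient of variation is the standard deviation divided by the mean, so it compares dispersion relative to the size of the typical value. A small Python sketch follows; the function name and the two return series are invented for illustration and are not the funds from the card.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return stdev(values) / mean(values)


# Hypothetical annual returns (in percent) with the same mean of 6.
steady_fund = [5, 6, 7]
volatile_fund = [1, 6, 11]

cv_steady = coefficient_of_variation(steady_fund)
cv_volatile = coefficient_of_variation(volatile_fund)

# The larger coefficient of variation signals greater relative dispersion.
print(cv_steady, cv_volatile)
```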
This is a catch-phrase, meaning a massive amount of both structured and unstructured data.
Big Data
What is the most widely-used measure of central location?
Mean
_________ is the science that deals with the collection, preparation, analysis, interpretation, and presentation of data
Statistics
The formula for the weighted mean is x̄ = Σ w_i x_i. Using this formula, what is the restriction on the weights?
They must sum to one
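A minimal Python sketch of the weighted mean for a frequency distribution, assuming the weights are the relative frequencies. The function name and the numbers are hypothetical.

```python
from math import isclose


def weighted_mean(values, weights):
    """Sum of w_i * x_i, with the weights required to sum to one."""
    if not isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to one")
    return sum(w * x for w, x in zip(weights, values))


# Hypothetical frequency distribution: value 1 occurs 2 times, 2 occurs 5 times, 3 occurs 3 times.
values = [1, 2, 3]
frequencies = [2, 5, 3]
total = sum(frequencies)

# Relative frequencies serve as weights and sum to one by construction.
weights = [f / total for f in frequencies]
result = weighted_mean(values, weights)
print(result)
```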
Which characteristic of big data does the following describe? Organizations must develop a methodical plan for formulating business questions, curating the right data, and unlocking the hidden potential in big data.
Value
Chebyshev's theorem is applicable to distributions of ____ shape.
any
The main drawback of interval-scaled data is that the value of zero is _______ chosen
arbitrarily
For numerical variables, it is common to replace the missing values with the ____ values across relevant observations.
average
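A small Python sketch of mean imputation, assuming missing values are recorded as None. The function name and the data are hypothetical.

```python
from statistics import mean


def impute_with_mean(values):
    """Replace each missing entry (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical numerical variable with two missing observations.
incomes = [10, None, 30, None, 50]
completed = impute_with_mean(incomes)
print(completed)
```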
The function to find the mean of a subset in Excel is
AVERAGEIF
Generally, the _____ is the best measure of central location when outliers are present.
median
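A quick Python check of why the median is preferred when outliers are present, using made-up salary figures with one extreme value.

```python
from statistics import mean, median

# Hypothetical salaries (in thousands); 500 is an outlier.
salaries = [30, 32, 35, 38, 40, 500]

average = mean(salaries)    # pulled upward by the outlier
midpoint = median(salaries)  # average of the two middle values, 35 and 38

print(average, midpoint)
```

The mean lands well above five of the six values, while the median stays near the bulk of the data.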
An owner of a grocery store wants to determine the brands of soda that customers purchase at the store. When summarizing the data about soda brand purchases, the meaningful measure of central location is the ______.
mode
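For categorical data such as brand names, the mode is the only measure of central location that can be computed. A brief Python sketch with invented purchase records:

```python
from statistics import mode, multimode

# Hypothetical soda purchases recorded at the store.
purchases = ["Brand A", "Brand B", "Brand A", "Brand C", "Brand A", "Brand B"]

most_common = mode(purchases)      # the single most frequent brand
all_modes = multimode(purchases)   # every brand tied for most frequent

print(most_common, all_modes)
```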
This is the symbol for the sample size
n