ISDS 2000
An analyst assigns a sample of bond issues to one of the following credit ratings, given in descending order of credit quality (increasing probability of default): AAA, AA, BBB, BB, CC, D.
ordinal
Rating products from one to five stars generates ___________ data.
ordinal
A restaurant surveys its customers about the quality of its waiting staff on a scale of 1 to 4, where 1 is poor and 4 is excellent.
ordinal
When creating a bar chart or a histogram, each bar/rectangle should be of the _____________ width.
same
Sampling, rather than surveying an entire population, can offer some substantial benefits. Some of those benefits include
saving money and time
Data that are collected by recording a characteristic of a subject over several time periods are referred to as ______ data.
time series
An ogive is a graph that plots the cumulative frequency, or cumulative relative frequency, against the
upper limit of the corresponding class.
Scale of Measurement: Nominal
uses tags or labels that carry no rank or order; differentiates items only by the categories they belong to
Which characteristic of big data does the following describe? Organizations must develop a methodical plan for formulating business questions, curating the right data, and unlocking the hidden potential in big data.
value
Data that are collected about many subjects at the same point in time or without regard to differences in time are known as ______ data.
cross-sectional
A _______________ _________________ distribution shows the number of observations that fall below the upper limit of a particular interval.
cumulative frequency
When a researcher examines quantitative data and wants to know the number of observations that fall below the upper limit of a particular class, the researcher is BEST served by creating a ______.
cumulative frequency distribution
Scale of Measurement: Ordinal
shows the order/rank of observations but not the size of the difference between them; often used for quality ratings
The branch of statistics that summarizes important aspects of a data set is often referred to as ______ statistics.
descriptive
Two branches of the study of statistics:
descriptive and inferential
Consider the following variable: the number of people in a household. This variable is best categorized as a ______ variable.
discrete
In general, data are compilations of
facts, figures, or other contents
A ______ is a way to organize qualitative data into categories and record the number of observations in each category.
frequency distribution
In order to summarize qualitative data, a useful tool is a(n) ______.
frequency distribution
In descriptive statistics, a polygon is best described as a
graph that connects the midpoints of each class and its associated frequency or relative frequency.
A stem-and-leaf diagram has two parts: The stem and the leaf. The stem consists of the ______ and the leaf consists of the ______.
leftmost digits; last digit
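A minimal sketch of the stem/leaf split described above, assuming non-negative whole numbers. The function name `stem_and_leaf` and the sample values are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative integers by stem (leftmost digits) and leaf (last digit)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # stem = all digits but the last, leaf = last digit
        groups[stem].append(leaf)
    return dict(groups)

ages = [23, 25, 31, 38, 38, 42, 107]  # hypothetical sample data
diagram = stem_and_leaf(ages)

# Print one row per stem, with its leaves in ascending order.
for stem, leaves in diagram.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Note that 107 gets the stem 10 and the leaf 7, since the stem is every digit except the last.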
Scale of Measurement: Interval
measured on numerical values with equal distances between adjacent values; differences are meaningful, but the zero point is arbitrary
The dean of the business school at a local university categorizes students by major (i.e., accounting, finance, marketing, etc.) to help in determining class offerings in the future.
nominal
A kindergarten teacher marks whether a student is a girl or a boy.
nominal
Key word for cross-sectional data
now or current
population parameter
number that describes something about an entire group or population
A cumulative frequency distribution identifies the number of ___________________ that fall below the upper limit of a particular interval.
observations
In general, we use sample data because
obtaining data from the population is often an expensive process
In order to approximate the class width for a frequency distribution of quantitative data, we calculate:
(Largest value−Smallest value)/Number of classes
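The class width formula above can be sketched as follows. The function name `approx_class_width` and the price data are assumptions for illustration only.

```python
def approx_class_width(values, num_classes):
    """Approximate class width: (largest value - smallest value) / number of classes."""
    return (max(values) - min(values)) / num_classes

prices = [12, 15, 21, 28, 33, 47, 52]  # hypothetical sample data
width = approx_class_width(prices, 5)  # (52 - 12) / 5
print(width)
```

The result is only an approximation; in practice the width is usually adjusted to a convenient number.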
_____________ data can include hourly, daily, weekly, monthly, quarterly, or annual observations.
Time series
Many experts believe that _____ of the data in the world today were created in the last two years alone.
90%
How does an ogive differ from a polygon?
An ogive is a graph of a cumulative (relative) frequency distribution, while a polygon is a graph of a (relative) frequency distribution.
Numerical Variable
Have values that describe a measurable quantity as a number
Continuous v. Discrete
Discrete data is counted while continuous data can take any value in a range
Difference between descriptive and inferential statistics
Descriptive statistics summarizes the data at hand; inferential statistics uses a sample to draw conclusions about ALL of a population (key word for inferential is ALL)
With data that is measured on this scale, we are able to categorize and rank the data as well as find meaningful differences between observations.
Interval scale
Rank order these measurement scales from weakest to strongest.
Nominal -> Ordinal -> Interval -> Ratio
This represents the least sophisticated level of measurement.
Nominal scale
Interval
Observations can be categorized and ranked, and differences between observations are meaningful.
Ordinal
Observations can be categorized and ranked; however, differences between the ranked observations are meaningless.
Nominal
Observations differ merely by name or label
Ratio
Observations have all the characteristics of interval-scaled data as well as a true zero point.
Cross-sectional data has to do with
One point in time
With these data, we can only categorize and rank the observations with respect to some characteristic or trait.
Ordinal scale
When constructing a histogram, what values/labels go on the horizontal (x) and vertical (y) axes?
Quantitative class limits on the horizontal axis; frequency or relative frequency on the vertical axis.
A _____ variable is characterized by uncountable values within an interval.
continuous
_____ data often consist of numerical information that is objective and is not open to interpretation.
Structured
Categorical Variable
Take on values that are names or labels
Relative Frequency Distribution Formula
Relative frequency = frequency count of the category (or class) / total number of observations
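A short sketch of a frequency distribution and its relative frequencies, using `collections.Counter` from the Python standard library. The list of majors is made-up sample data.

```python
from collections import Counter

majors = ["accounting", "finance", "finance", "marketing", "finance"]  # hypothetical data

# Frequency distribution: number of observations in each category.
freq = Counter(majors)

# Relative frequency: each category's count divided by the total number of observations.
total = sum(freq.values())
rel_freq = {category: count / total for category, count in freq.items()}

for category, share in rel_freq.items():
    print(f"{category}: {freq[category]} ({share:.0%})")
```

The relative frequencies always sum to 1, which is what makes them useful for comparing data sets of different sizes.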
This type of data does not conform to a predefined, row-column format.
Unstructured
Which characteristic of big data does the following describe? Data from a variety of sources get generated at a rapid speed.
Velocity
The interval scale of measurement
allows for the use of negative values
The main drawback of interval-scaled data is that the value of zero is __________ chosen.
arbitrarily
A qualitative variable is also known as a ________ variable.
categorical
Relative frequency distributions are generally more useful than frequency distributions when
comparing data sets of different sizes.
The branch of statistics that draws conclusions about a large set of data based on a smaller set of data is often referred to as ______ statistics.
inferential
It is important to note that numerical results are not very useful unless they are accompanied by clearly stated actionable business
insights
A ski resort records the daily temperature during the month of January.
interval
A sociologist notes the birth year of 50 individuals.
interval
One method of graphical presentation for qualitative data is a(n) ______.
pie chart
A ______________ includes all items of interest in a statistical problem
population
Inferential statistics refers to drawing conclusions about a large set of data, called a ___________, based on a smaller set of data called a __________.
population, sample
Scale of Measurement: Ratio
shows the order of observations and makes the differences between them meaningful; has a true zero point, where zero means the quantity is absent
A meteorologist records the amount of monthly rainfall over the past year.
ratio
A research analyst collects data on the weekly closing price of gold throughout the year. The scale for these data is ______.
ratio
An investor collects data on the weekly closing price of gold throughout a year.
ratio
An investor monitors the daily stock price of BP following the 2010 oil disaster in the Gulf of Mexico.
ratio
This scale has all the characteristics of the interval scale as well as a true zero point, which allows us to interpret the ratios between observations.
ratio scale
In a given cumulative frequency distribution, the "cumulative frequency" column value for the third class represents
the sum of observations in the first, second and third classes.
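A small sketch of how the cumulative frequency column is built, using `itertools.accumulate` from the Python standard library. The class limits and frequencies are made up for illustration.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40]  # hypothetical upper limit of each class
frequencies = [4, 7, 6, 3]       # hypothetical number of observations in each class

# Each cumulative value is the running total of all frequencies up to that class.
cumulative = list(accumulate(frequencies))

# Third class: observations in the first, second and third classes combined.
third_class_total = cumulative[2]

# Points an ogive would plot: (upper limit, cumulative frequency).
ogive_points = list(zip(upper_limits, cumulative))
print(ogive_points)
```

The last cumulative value equals the total number of observations.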