CBAD - Statistics
Chapter 4: In which of the following data sets would the arithmetic mean NOT be a good measure of central location
0, 8, 8, 9, 10.... (0 is considered an outlier for this set of data)
Chapter 2: A __________________ is a subset of a population
sample
Chapter 2: Which type of error is unavoidable when sampling from a population?
sampling error
Chapter 3: A __________ ____________ is a type of display that allows researchers to investigate the relationship between two variables.
scatter plot
Chapter 3: Which of the following graphical depictions allows you to examine the relationship between two variables?
scatter plot
Chapter 4: Place the steps for using the method of medians in finding quartiles in the proper order
sort the observations, find the median of the entire data set Q2, find the median of the data values that lie below Q2
Chapter 4: Variability
spread of data values or dispersion
Chapter 4: The population standard deviation of the data set 3, 4, 5, 6, and 7 is ________________. (round your final answer to 1 decimal place)
1.4
Chapter 4: The mean absolute deviation for the sample data set: 3, 4, 5, and 8 is
1.5
Chapter 4: The sample standard deviation of the data set 3, 4, 5, 6, and 7 is ____________.
1.6
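A minimal sketch checking the two standard deviation answers above with Python's standard library. The only difference between the two calls is the divisor (N for a population, n - 1 for a sample); the variable names are illustrative.

```python
import statistics

data = [3, 4, 5, 6, 7]

# Population standard deviation: sum of squared deviations divided by N.
pop_sd = statistics.pstdev(data)

# Sample standard deviation: sum of squared deviations divided by n - 1.
sample_sd = statistics.stdev(data)

print(round(pop_sd, 1))     # 1.4
print(round(sample_sd, 1))  # 1.6
```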
Chapter 4: A certain value has a standardized score = 1.75. How many standard deviations from the mean does this value fall? Is this value greater than or less than the mean?
1.75 greater than the mean
Chapter 4: For a given distribution, the range is 60. Assuming the distribution is bell-shaped, the estimated standard deviation =
10 (since the range is 60, xmax - xmin, you divide that number, which is 60, by 6)
Chapter 4: A data set has 60 data points sorted from lowest to highest value. The 20th percentile value will be the ___________th data point, starting from the lowest value.
12
Chapter 4: Descriptive measures derived from a sample (n items) are
statistics
Chapter 2: Identify which of the sampling techniques listed are random
stratified, cluster, simple random, and systematic
Chapter 4: Shape
symmetrical, skewed, sharply peaked, flat, bimodal
Chapter 2: The __________ population contains all the individuals in which one is interested.
target
Chapter 3: Tables are frequently used to display data because
they are the simplest form of data display, a well-designed table can communicate the meaning of data at a glance
Chapter 2: ______________ _______________ data are quantities that represent or track the values taken by a variable over equally spaced periods
time series
Chapter 3: Column or bar charts can be used to display ________ __________ data using the time periods as the _________ __________.
time series, category labels
Chapter 4: Center
typical or middle values, where the data values are concentrated
Chapter 2: Data
usually are entered into a spreadsheet or database as an n × m matrix. Specifically, each column is a variable (m columns) and each row is an observation (n rows).
Chapter 3: The length of a bar or height of a column in a bar or column chart represents the ______________ of a category.
value
Chapter 3: Column
vertical display of data
Chapter 4: Multiplying data values by a fraction (where the fractions add to 1) and summing results in a ______________ mean
weighted
Chapter 4: Coefficient of variation
is a unit-free measure of dispersion: CV = 100 × s / x̄ (the standard deviation expressed as a percent of the mean)
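As a rough illustration, the coefficient of variation can be computed as the sample standard deviation divided by the mean, times 100. The function name and the data set (reused from the standard deviation cards) are assumptions for the example.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample CV in percent (standard deviation relative to the mean)."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

cv = coefficient_of_variation([3, 4, 5, 6, 7])
print(round(cv, 1))  # 31.6
```

Because the units cancel, the result can be compared across data sets measured in different units.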
Chapter 4: Median < Midhinge ⇒ Skewed right (longer right tail) Median ≅ Midhinge ⇒ Symmetric (tails roughly equal) Median > Midhinge ⇒ Skewed left (longer left tail)
Median < Midhinge ⇒ Skewed right (longer right tail) Median ≅ Midhinge ⇒ Symmetric (tails roughly equal) Median > Midhinge ⇒ Skewed left (longer left tail)
Chapter 4: The numerical measure, σxy, is used frequently by financial portfolio managers. This measure is called the
covariance
Chapter 3: Because the intent of the analysis is to study the S&P 500 companies at a point in time, these are ________________ data.
cross-sectional
Chapter 4: The sample correlation coefficient
describes the degree of linearity between paired observations on two quantitative variables X and Y
Chapter 1: Collecting, organizing and summarizing a particular data set are known as __________ statistics.
descriptive
Chapter 3: Stem-and-leaf displays can be used to
determine central tendency and dispersion, analyze the small samples of integer data
Chapter 3: A line chart can be used to...
display time series data, spot trends
Chapter 1: Surveys of corporate recruiters show that ______ and _________________ rank high on their list of hiring criteria.
ethics, personal integrity
Chapter 4: The correlation coefficient values
fall between -1 and +1, inclusive
Chapter 4: True or False: The geometric mean does not mitigate the effect of outliers
false
Chapter 1: Risk assessment of an investment
finance
Chapter 3: A log scale is useful for
financial data that is expected to grow rapidly
Chapter 4: Standard deviations can be compared
for data sets with the same measurement units and similar magnitude, for data sets with the same measurement units
Chapter 4: Quartiles divide the data into __________ equal parts.
four
Chapter 3: Cumulative relative
frequencies accumulate relative frequency values as the bin limits increase.
Chapter 2: Categorical data (also called qualitative data)
have values that are described by words rather than numbers. For example... Diners at a restaurant are asked to rate the food based on the following scale, excellent, good, average, below average, poor.
Chapter 4: Trimmed mean
is calculated like any other mean, except that the highest and lowest k percent of the observations in the sorted data array are removed. The trimmed mean mitigates the effects of extreme values on either end. For a 5 percent trimmed mean, the Excel function is =TRIMMEAN(Data, 0.10) because .05 + .05 = .10. For the J.D.
Chapter 4: Midhinge
is the average of the first and third quartiles. The midhinge is always exactly halfway between Q1 and Q3, while the median Q2 can be anywhere within the "box," which suggests a new way to describe skewness:
Chapter 4: Range
is the difference between the largest and smallest observations
Chapter 4: Midrange
is the point halfway between the lowest and highest values of X. It is easy to calculate but is not a robust measure of central tendency because it is sensitive to extreme data values.
Chapter 4: An additional measure of dispersion is the ____________ ___________ deviation (MAD). This statistic reveals the average distance from the center. Absolute values must be used; otherwise the deviations around the mean would sum to zero
mean absolute
Chapter 4: The average of absolute differences between the values of the data set and the mean is the
mean absolute deviation
Chapter 4: When monitoring a process distribution, both the __________ and the __________ must be tracked.
mean, variation
Chapter 4: covariance
measures the degree to which the values of X and Y change together.
Chapter 4: Generally, the ______________ is the best measure of center when outliers are present.
median
Chapter 4: In a neighborhood there are five houses listed for sale for the following amounts: $250,000, $275,000, $280,000, $295,000, and $515,000. What is the best measure of center for the price of a house in that neighborhood?
median
Chapter 4: The measure of center where half the values of the data set lie above this measure and half the values of the data set lie below this measure is known as the
median
Chapter 4: The second quartile is also the
median, 50th percentile
Chapter 4: The __________ is the measure of center that identifies the most frequently occurring value in the data set
mode
Chapter 4: The best measure of central location when using qualitative data set is the
mode
Chapter 4: The owner of a grocery store wanted to determine the brands of soda that customers purchase at the store. When summarizing the data about soda brand purchase the meaningful measure of center is the
mode
Chapter 3: Identify the characteristic below that does NOT describe a Pareto chart.
most common categories appear to the far right of the graph
Chapter 3: Histograms can be used to...
observe the spread or variability of the data, determine the shape of the data
Chapter 4: Accuracy of grouped estimates depends on
the number of bins, the distribution of data within bins, and the bin frequencies.
Chapter 4: The summary measures for grouped data are
only approximate values
Chapter 4: Standardized data
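are obtained by transforming each value of the observed data into a standardized value (called a z-score): z = (x − μ) / σ for a population, or z = (x − x̄) / s for a sample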
Chapter 4: Geometric mean
(denoted G) is a multiplicative average, obtained by multiplying the data values and then taking the nth root of the product. This is a measure of central tendency used when all the data values are positive (greater than zero).
Chapter 4: Median
(denoted M) is the 50th percentile or midpoint of the sorted sample data set x1, x2, ..., xn. It separates the upper and lower halves of the sorted observations
Chapter 4: Population variance
(denoted σ², where σ is the lowercase Greek letter "sigma") is defined as the sum of squared deviations from the mean divided by the population size: σ² = Σ(xᵢ − μ)² / N
Chapter 4: Which of the following correlation coefficients indicate the strongest inverse relationship between two variables?
-0.87
Chapter 4: Calculate the standardized score for the following data value. Assume the mean = 100 and the standard deviation = 25: x = 60, z = ?
-1.6 (z = (x − mean) / standard deviation = (60 − 100) / 25)
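A short sketch of the standardized score calculation for this card: subtract the mean from the value, then divide by the standard deviation. The function name is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean (negative = below)."""
    return (x - mean) / sd

z = z_score(60, 100, 25)
print(z)  # -1.6
```

The negative sign shows that 60 lies below the mean of 100.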
Chapter 4: The maximum value of a data set is 200 and the minimum value is 80. The midrange is equal to...
140 ((80 + 200) / 2 = 140; equivalently, half the range is (200 − 80) / 2 = 60, and 80 + 60 = 200 − 60 = 140)
Chapter 4: Pat's time in the 1600 meter run placed Pat in the 85th percentile in the school. What percentage of students are faster than Pat?
15
Chapter 4: A company sold 1000 units in its first year of operation, 1400 units in its second year of operation, and 1680 units in the third year of operation. The average growth rate of the company's sales from years 1 to 3 is _______ %
29.61 (take the square root of 1680 / 1000, then subtract 1: √1.68 − 1 = 0.2961)
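The average growth rate here is a geometric mean: with three years of sales there are two growth periods, so the root taken is the square root. A minimal sketch, with an illustrative function name:

```python
def average_growth_rate(first, last, periods):
    """Geometric mean growth rate over the given number of growth periods."""
    return (last / first) ** (1 / periods) - 1

rate = average_growth_rate(1000, 1680, 2)  # years 1 to 3 span 2 periods
print(round(rate * 100, 2))  # 29.61
```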
Chapter 4: The mode(s) for the data set: 4, 4, 5, 6, 9, 9 is
4 and 9
Chapter 4: If a company sold 1000 units in its first year of operation, and 1400 units in its second year of operation, then the growth rate of the company's sales is _______ %
40
Chapter 4: For the data set 4, 5 , 6, and 9 the arithmetic mean is
6 (add the values together and divide by 4: 24 / 4 = 6)
Chapter 4: The range for the data set: 2, 5, 5, 7, and 10 is:
8
Chapter 4: Suppose a data set has 80 data points. A 5% trimmed mean would be calculated by removing the ______________ highest values and the ______________ lowest values.
4 and 4 (80 × 5% = 4 values removed from each end)
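A rough sketch of a k-percent trimmed mean, assuming the number of values cut from each end is n × k rounded down. The function name and the 80-point data set with one extreme value are made up for illustration.

```python
def trimmed_mean(values, k):
    """Mean after removing the lowest and highest fraction k of the sorted values."""
    data = sorted(values)
    cut = int(len(data) * k)  # values removed from EACH end (assumed: round down)
    trimmed = data[cut:len(data) - cut]
    return sum(trimmed) / len(trimmed)

# Hypothetical data: 79 ordinary values plus one extreme high value.
data = list(range(1, 80)) + [1000]

print(sum(data) / len(data))     # 52.0 (ordinary mean, pulled up by 1000)
print(trimmed_mean(data, 0.05))  # 40.5 (4 lowest and 4 highest removed)
```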
Chapter 4: Using Chebyshev's theorem at least ________________ % of observations should fall within 2.5 standard deviations of the mean.
84
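Chebyshev's theorem gives a lower bound of 1 − 1/k² on the proportion of observations within k standard deviations of the mean, for any k > 1. A small sketch (function name is illustrative):

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is informative only for k > 1")
    return 1 - 1 / k**2

print(round(chebyshev_lower_bound(2.5) * 100))  # 84
```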
Chapter 4: The empirical rule states that approximately _______% of observations will fall within 3 standard deviations of the mean
99.7 (about 68% within 1 standard deviation and about 95% within 2)
Chapter 4: If Fund A has a coefficient of variation of 1.1, and Fund B has a coefficient of variation of 0.9, Fund ______ has a greater
A
Chapter 4: Place the following steps in order, from beginning to end, to create a box plot
calculate the five-number summary values; plot the five-number summary values in numerical order on a horizontal or vertical axis; draw a box from Q1 to Q3; then add lines from Q1 to the minimum value and from Q3 to the maximum value
Chapter 2: Census or Sample A) Budget constraints can make this necessary. B) Legal requirements sometimes mandate this. C) An examination of all items in a population D) Looking at only selected items in a population
Census: B and C Sample: A and D
Chapter 1: True or False: Business managers want to see detailed numerical explanations in technical reports.
False
Chapter 4: True or False: the standard deviation can be a negative value
False
Chapter 3: Excel's pivot table features
allow interactive analysis, summarize categorical data, categorize discrete numerical data
Chapter 4: mean
It is the sum of the data values divided by the number of data items. For a population we denote it μ, while for a sample we call it x̄.
Chapter 4: Skewed left (negative skewness)
Long tail of histogram points left (a few low values but most data on right)
Chapter 4: Skewed right (positive skewness)
Long tail of histogram points right (most data on left but a few high values)
Chapter 4: For small data sets, you can find the quartiles using the method of medians:
Step 1: Sort the observations. Step 2: Find the median Q2. Step 3: Find the median of the data values that lie below Q2. Step 4: Find the median of the data values that lie above Q2.
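The four steps above can be sketched as follows. One assumption is labeled in the code: when n is odd, the median observation itself is left out of both halves. The function name and sample data are illustrative.

```python
import statistics

def quartiles_by_medians(values):
    """Return (Q1, Q2, Q3) using the method of medians."""
    data = sorted(values)                 # Step 1: sort the observations
    n = len(data)
    q2 = statistics.median(data)          # Step 2: median of the whole data set
    half = n // 2
    lower = data[:half]                   # values below Q2
    # Assumption: for odd n, skip the middle observation in both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    q1 = statistics.median(lower)         # Step 3: median of the lower half
    q3 = statistics.median(upper)         # Step 4: median of the upper half
    return q1, q2, q3

print(quartiles_by_medians([15, 1, 9, 3, 13, 5, 11, 7]))  # (4.0, 8.0, 12.0)
```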
Chapter 4: Symmetric
Tails of histogram are balanced (low/high values offset)
Chapter 4: Which of the items below describes the usefulness of a standard deviation?
To gauge the relative position of data values within the data set
Chapter 4: True or False: The empirical rule should be applied to data sets that are normally distributed or nearly normally distributed.
True
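For reference, the empirical rule percentages can be reproduced from the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name is illustrative.

```python
from statistics import NormalDist

def normal_coverage(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(normal_coverage(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```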
Chapter 4: To calculate the arithmetic mean
all the data points must be added together, then divided by the number of data points.
Chapter 4: Quartiles
are scale points that divide the sorted data into four groups of approximately equal size, that is, the 25th, 50th, and 75th percentiles, respectively.
Chapter 4: The owner of BevaMart wants to study the relationship between the temperature and hot chocolate sales. The owner computed the covariance between temperature and hot chocolate sales to be -81.46. Based on the covariance, which option best describes the linear relationship between temperature and hot chocolate sales?
as the temperature increases, hot chocolate sales decrease
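A small sketch of sample covariance and correlation. The temperature and sales figures below are hypothetical (they are not BevaMart's data); they only show how a negative covariance arises when one variable falls as the other rises.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-products of deviations divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """Covariance rescaled by both standard deviations; always between -1 and +1."""
    sd_x = math.sqrt(sample_covariance(xs, xs))
    sd_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical data: daily temperature and cups of hot chocolate sold.
temperature = [30, 40, 50, 60, 70]
cups_sold = [90, 80, 60, 50, 20]

print(sample_covariance(temperature, cups_sold))             # -425.0
print(round(sample_correlation(temperature, cups_sold), 2))  # -0.98
```

The covariance depends on the units of both variables, while the correlation is unit-free.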
Chapter 4: When calculating a percentile, the first step is to arrange the data set in
ascending order (from least to greatest)
Chapter 1: A sample of errors from invoice statements
auditing
Chapter 4: The arithmetic mean is the "__________" with which most of us are familiar.
average
Chapter 4: A useful tool of exploratory data analysis (EDA) is the ______ ________ (also called a box-and-whisker plot) based on the five-number summary:
box plot
Chapter 4: Place in order, from beginning to end, the steps to calculate the mean absolute deviation
calculate the arithmetic mean for the data set, find the absolute difference between each value and the mean, sum the absolute differences, divide by the sample (or the population) size
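The steps above translate directly into code. A minimal sketch, using the sample data set from the earlier mean absolute deviation card (the function name is illustrative):

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their arithmetic mean."""
    mean = sum(values) / len(values)                    # arithmetic mean
    abs_deviations = [abs(x - mean) for x in values]    # absolute differences
    return sum(abs_deviations) / len(values)            # sum, then divide by n

mad = mean_absolute_deviation([3, 4, 5, 8])
print(mad)  # 1.5
```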
Chapter 4: The first quartile and third quartile are also the
25th percentile, 75th percentile
Chapter 4: For a sample of numerical data, we are interested in three key characteristics:
center, variability, and shape.
Chapter 4: Which of the following characteristics can be seen on a boxplot?
center, variability, shape
Chapter 4: When comparing two data sets with different units of measurement, what is the relative measure of dispersion?
coefficient of variation (CV)
Chapter 4: The skewness coefficient can be used to
compare two samples with different measurement units, compare one sample to a known reference distribution
Chapter 4: Which of the following situations are valid reasons for removing an outlier from a data set?
if the data point was typed incorrectly into the spreadsheet, if the observed value was taken from a population different from the one under study
Chapter 4: If covariance is positive, then as one variable increases the other variable will generally
increase
Chapter 4: Standard deviation
is a single number that helps us understand how individual values in a data set vary from the mean. Because the square root has been taken, its units of measurement are the same as X (e.g., dollars, kilograms, miles).
Chapter 4: Weighted mean
is a sum that assigns each data value a weight wj that represents a fraction of the total (i.e., the k weights must sum to 1)
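A brief sketch of a weighted mean, including a check that the weights sum to 1. The scores and weights are hypothetical, as is the function name.

```python
def weighted_mean(values, weights):
    """Sum of each value times its weight; the weights must sum to 1."""
    if abs(sum(weights) - 1) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * x for w, x in zip(weights, values))

# Hypothetical course grade: exam 50%, project 30%, homework 20%.
scores = [80, 90, 70]
weights = [0.5, 0.3, 0.2]

print(round(weighted_mean(scores, weights), 2))  # 81.0
```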
Chapter 4: For a population (N items or infinite) they are
parameters
Chapter 4: The first step to determine the median is to
place the data in numerical order
Chapter 4: The ___________ measures the difference between the smallest and largest values in a data set.
range
Chapter 4: The interquartile range of a data set
represents the middle 50% of the data, is calculated by subtracting the first quartile from the third quartile
Chapter 4: Which of the following can be used to determine the proportion of data points that fall within a specified number of standard deviation from the mean?
the empirical rule (assuming a normal distribution), Chebyshev's theorem
Chapter 3: The pictured graph from the New York Times op-ed piece has some misleading components to it: which of the following elements are misleading?
the higher the position of the category icon, the lower the rating; all the rating arrows end at the same point regardless of the actual rating
Chapter 4: When the data are skewed right (or positively skewed)...
the mean exceeds the median.
Chapter 4: When the data are skewed left (or negatively skewed)...
the mean is below the median.
Chapter 4: When calculating a mean for grouped data
the midpoint of each bin is used to approximate the individual values in that bin
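As a sketch of the grouped mean, each bin's midpoint stands in for every value in that bin and is weighted by the bin frequency. The bins below are hypothetical and the function name is illustrative.

```python
def grouped_mean(bins):
    """Approximate mean from (lower limit, upper limit, frequency) triples."""
    total = sum(freq for _, _, freq in bins)
    weighted_sum = sum(((lower + upper) / 2) * freq for lower, upper, freq in bins)
    return weighted_sum / total

# Hypothetical frequency distribution with three bins.
bins = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

print(grouped_mean(bins))  # 16.0
```

The result is only an approximation, since the true values inside each bin are unknown.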
Chapter 4: Mode
the most frequently occurring data value.
Chapter 4: A box plot is constructed using several different values. Which of the following values are included in a box plot?
the second quartile, the third quartile, the smallest value
Chapter 3: Identify the problem with the pictured graph
the vertical axis limit is too high, there is no 0 value on the vertical axis, the time range needs to be specified
Chapter 2: Which of the following characteristics of interest is a variable?
The number of pizzas ordered from Pizza Hut per day.
Chapter 1: True or False: There are generally accepted statistical methods for dealing with missing data and unusual data.
True
Chapter 3: Relative frequencies
are calculated as the absolute frequency for a bin divided by the total number of data values.
Chapter 3: Pie chart
because of their visual appeal, pie charts appear daily in company annual reports and the popular press
Chapter 3: Stacked Dot Plot
can be used to compare two or more groups.
Chapter 2: The nominal scale of measurement is used to...
categorize unranked data
Chapter 1: Some experts prefer to call statistics...
data science
Chapter 2: Continuous (example)
A numerical variable that can have any value within an interval (the finishing times for running the 100 meter dash)
Chapter 2: Discrete (example)
A variable with a countable number of distinct values (the number of dots face up on a roll of a pair of dice)
Chapter 2: A __________________ includes all members of the group of interest.
population
Chapter 1: Descriptive statistics
refers to the collection, organization, presentation, and summary of data (either using charts and graphs or using a numerical summary).
Chapter 3: The vertical (y axis) for an ogive can be labeled as...
relative cumulative frequency, cumulative frequency
Chapter 2: Coding (Example)
On occasion the values of the categorical variable might be represented using numbers (a database might code payment methods using numbers: 1 = cash 2 = check 3 = credit/debit card 4 = gift card).
Chapter 2: Stratified Sample
Select randomly within defined strata (by age, occupation, gender).
Chapter 2: Data Set
consists of all the values of all of the variables for all of the observations we have chosen to observe.
Chapter 3: Ogive
is a line graph of the cumulative frequencies. It is useful for finding percentiles or in comparing the shape of the sample with a known benchmark such as the normal distribution (that you will be seeing in the next chapter).
Chapter 1: Which of the following are rules for a data analyst?
maintain data integrity, know and follow accepted procedures, protect confidential information
Chapter 2: Which of the following are examples of the interval scale of measurement?
many Likert scales, a golfer's score relative to par
Chapter 2: Parameter vs. statistic examples: A) The average age of all students currently enrolled at the Leeds School of Business (population of students is only current students). B) The average starting salary for 25 students from this year's Leeds MBA graduating class of 110 students. C) The average GPA for a sample of 40 students from this year's graduating class at the Leeds School of Business.
parameter: A statistic: B and C
Chapter 2: Which of the following are examples of the nominal scale?
social security numbers, specialty sandwich names at a fast food restaurant, designating males as 1 and females as 2 to compare gender performance on an aptitude test.
Chapter 2: Which of the following are examples of time series data?
the monthly Consumer Confidence Index for the past three years. average annual credit card debt over the past decade.
Chapter 3: A relative frequency distribution for quantitative data identifies...
the proportion of observations that occur in each bin
Chapter 2: The sample size is determined by the _____________ in the population of interest and the desired ______________ of the parameter being estimated.
variability, precision
Chapter 3: Sorting data is helpful because
we can see the range of values, we can see the frequency of each data value
Chapter 3: Place the following steps in order to explain how to construct a polygon.
1. Construct a frequency distribution 2. Find the midpoint for each class of the frequency distribution 3. The midpoints are plotted based on the frequency for the respective class 4. Neighboring midpoints are connected together by a straight line
Chapter 2: Match the survey type to an issue for consideration. A) Mail B) Telephone C) Interviews D) Web 1) Expect low response rates due to a poorly targeted list of people. 2) Growing in popularity but need to be well-targeted. 3) Can be expensive but results are often high-quality. 4) Requires a well-targeted, current list of people.
A) 4 B) 1 C) 3 D) 2
Chapter 3: Which variables could be displayed on a log scale?
GDP, real estate values in a fast moving market
Chapter 3: If the data were collected from a random sample we must allow for __________ error.
Sampling
Chapter 2: Systematic Sample
Select every kth item from a list or sequence (restaurant customers).
Chapter 3: Which characteristic below is NOT a rule of thumb for displaying categorical data on a column chart?
The height of each column should be the same
Chapter 1: True or False: Statistics support critical thinking by helping one identify illogical conclusions or to see holes in another's argument.
True
Chapter 1: True or False: Successful businesses expect their employees to have some knowledge of statistics.
True
Chapter 2: Simple Random Sample (and example)
Use random numbers to select items from a list (Visa cardholders).
Chapter 2: XYZ Corporation made a profit of $3 million last year. ABC Corporation made a profit of $6 million last year. Based on the ratio scale, which of the following is an accurate statement about the relationship between ABC's profit and XYZ's profit?
XYZ was half as profitable as ABC.
Chapter 1: Data science
a trilogy of tasks involving data modeling, analysis, and decision making.
Chapter 3: Cumulative frequency distributions show...
accumulated counts up to and including the current bin as the bin limits increase
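A short sketch tying together absolute, relative, cumulative, and cumulative relative frequencies. The bin counts are hypothetical.

```python
from itertools import accumulate

# Hypothetical bin frequencies, ordered by increasing bin limits.
frequencies = [2, 5, 3]
total = sum(frequencies)

relative = [f / total for f in frequencies]          # share of data in each bin
cumulative = list(accumulate(frequencies))           # counts up to and including each bin
cumulative_relative = [c / total for c in cumulative]

print(relative)             # [0.2, 0.5, 0.3]
print(cumulative)           # [2, 7, 10]
print(cumulative_relative)  # [0.2, 0.7, 1.0]
```

The last cumulative value always equals the number of observations, and the last cumulative relative value equals 1.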
Chapter 1: Which of the following are responsibilities of a data analyst?
accurately reporting information, identifying degrees of uncertainty
Chapter 3: An outlier is defined as...
an extreme value that is located at the tail of the histogram (Mean < Median), an extreme value that might have arisen from a different cause (Mean ≈ Median), an extreme value that might have arisen from measurement error (Mean > Median).
Chapter 2: Numerical data (also called quantitative data)... example
arise from counting, measuring something, or some kind of mathematical operation. For example, we could count the number of auto insurance claims filed in March (e.g., 114 claims) or sales for last quarter (e.g., $4,920), or we could measure the amount of snowfall over the last 24 hours (e.g., 3.4 inches).
Chapter 3: There are several guidelines one should follow when creating graphs. Which of the following describe these guidelines?
axes should be clearly labeled, axes that are numerical should be to the appropriate scale, novelty graphs such as pyramids chart introduce ambiguity
Chapter 2: In sampling, ____________ refers to a tendency to over- or underestimate a population parameter of interest.
bias
Chapter 3: One of the primary goals of constructing a frequency distribution of quantitative data is to summarize the data...
by showing frequency of values that lie within a class or bin
Chapter 3: Dot plots can show which features of a data set?
center, variability, shape
Chapter 2: Identify which of the sampling techniques listed are non-random.
convenience, focus group, and judgment
Chapter 3: Characteristics of a bar chart include...
display horizontal bars when the axis labels are long or there are many categories, length or height of bar reflects frequency of a category
Chapter 3: A pie chart is never used to
display time series data.
Chapter 3: Bar
horizontal display of data
Chapter 2: Variable
is a characteristic of the subject or individual, such as an employee's income or an invoice amount.
Chapter 3: histogram
is a graphical representation of a frequency distribution... appearance in identical
Chapter 1: Statistics
is a set of tools that helps in organizing and presenting information and in extracting meaning from raw data.
Chapter 1: Statistic
is a single measure, reported as a number, used to summarize a sample data set
Chapter 2: Observation
is a single member of a collection of items that we want to study, such as a person, firm, or region. An example of an observation is an employee or an invoice mailed last month.
Chapter 3: Frequency distribution
is a table formed by classifying n data values into k classes called bins (we adopt this terminology from Excel). The bin limits define the values to be included in each bin.
Chapter 3: Dot Plot
is another simple graphical display of n individual values of numerical data. The basic steps in making a dot plot are to (1) make a scale that covers the data range, (2) mark axis demarcations and label them, and (3) plot each data value as a dot above the scale at its approximate location.
Chapter 3: Line chart
is used to display a time series, to spot trends, or to compare time periods. Line charts can be used to display several variables at once. If two variables are displayed, the right and left scales can differ, using the right scale for one variable and the left scale for the other.
Chapter 1: The specialized vocabulary of statistics crosses __________ barriers to improve problem solving for multinational businesses.
language
Chapter 3: In general, the ___________ limit is included in the bin, while the __________ limit is excluded.
lower, upper
Chapter 1: Identifying repeat customers
marketing
Chapter 2: A significant weakness of the ordinal scale is...
no clear meaning to differences between the ranked values
Chapter 1: Distribution of inventory items in a big box store
operations management
Chapter 1: Inferential statistics
refers to generalizing from a sample to a population, estimating unknown population parameters, drawing conclusions, and making decisions.
Chapter 1: A company code of ethics addresses things such as (check all that apply)
sources of data inaccuracy, conflicts of interest, policies on confidentiality
Chapter 3: Pareto chart
special type of column chart used in business. displays categorical data, with categories displayed in descending order of frequency, so that the most common categories appear first.
Chapter 2: A ______________ is a numerical summary of a sample whereas ______________ is a numerical summary that describes a population.
statistic parameter
Chapter 3: Stacked column chart
the bar height is the sum of several subtotals. Areas may be compared by color to show patterns in the subgroups, as well as showing the total. Stacked column charts can be effective for any number of groups but work best when you have only a few. Use numerical labels if exact data values are of importance.
Chapter 3: When constructing bins for a frequency distribution of quantitative data, which of the following principles should generally be followed?
the bins should be exhaustive, bins should be mutually exclusive, bins should have the same width
Chapter 3: Which of the following is NOT a common error one should beware of?
unembellished charts that do not contain sound or animation
Chapter 2: Binary Variable (example)
Some categorical variables have only two values (employment status: employed or unemployed; mutual fund type: load or no-load; marital status: currently married or not currently married). Binary variables are often coded using 1 or 0 (a variable such as gender could be coded as 1 = female, 0 = male).
Chapter 1: Which statistical pitfall does the following statement match? People who belong to health clubs tend to have college degrees therefore exercise increases your IQ
Assuming A Causal Link
Chapter 3: Set bin limits
Just as choosing the number of bins requires judgment, setting the bin limits also requires judgment. For guidance, find the approximate width of each bin by dividing the data range by the number of bins: Bin width ≈ (xmax − xmin) / k
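A minimal sketch of the bin width guideline: divide the data range by the chosen number of bins. The data values and the choice of k = 6 are hypothetical, and in practice the result would usually be rounded to a convenient number.

```python
def approximate_bin_width(values, k):
    """Data range (max minus min) divided by the number of bins k."""
    return (max(values) - min(values)) / k

# Hypothetical data with minimum 80 and maximum 200, split into 6 bins.
values = [80, 95, 120, 133, 150, 171, 200]

print(approximate_bin_width(values, 6))  # 20.0
```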
Chapter 2: If each observation represents a different individual unit (like a person, firm, geographic area) at the same point in time, we have
cross-sectional data
Chapter 1: Which of the following are examples of inferential statistics?
Prof. Stats randomly selects 50 female students at State University to estimate the average height of all female students at State. A manufacturer of light bulbs randomly selects 100 light bulbs to test the longevity of all light bulbs that the company produces.
Chapter 2: Cluster Sample
Select random geographical regions (e.g., zip codes) that represent the population.
Chapter 3: Stem-and-Leaf Plot
The stem-and-leaf plot is a tool of exploratory data analysis (EDA) that seeks to reveal essential data features in an intuitive way. A stem-and-leaf plot is basically a frequency tally, except that we use digits instead of tally marks
Chapter 3: Frequency polygon
is a line graph that connects the midpoints of the histogram bin intervals, plus extra intervals at the beginning and end so that the line will touch the X-axis.
Chapter 3: The rectangles of a histogram...
represent grouped data, represent the class width and frequency/relative frequency of the respective class, are drawn with no space gaps between them except when there is no data in a particular bin