Analytics Test 2
The number of minutes that Samantha waits to catch the bus is uniformly distributed between 0 and 15 minutes. What is the probability that Samantha has to wait less than 4.5 minutes to catch the bus?
30%
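Worked check (a minimal sketch, not part of the original card set): for a uniform distribution on [a, b], P(X < x) = (x - a) / (b - a). The helper name uniform_cdf is an assumption chosen for illustration.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for X uniformly distributed on the interval [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# Samantha's wait is uniform between 0 and 15 minutes.
p_wait = uniform_cdf(4.5, 0, 15)
print(f"P(wait < 4.5 minutes) = {p_wait:.0%}")
```

With a = 0, b = 15, and x = 4.5 this is 4.5 / 15 = 0.30, which matches the 30% answer.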
stacked-column chart
allows the reader to compare the relative values of quantitative variables for the same category in a bar chart - used with multiple variables
clustered-column (or bar) chart
an alternative to the stacked-column chart for comparing quantitative variables - used with multiple variables
key performance indicators (KPIs)
automobile dashboard: current speed, fuel level, and oil pressure; business dashboard: financial position, inventory on hand, and customer service metrics
The charts that are helpful in making comparisons between categorical variables are
bar charts and column charts
In order to visualize three variables in a two-dimensional graph, we use a
bubble chart
parallel-coordinates plot
chart for examining data with more than two variables - includes a different vertical axis for each variable - each observation is represented by drawing a line on the parallel-coordinates plot connecting each vertical axis - the height of the line on each vertical axis represents the value taken by that observation for the variable corresponding to the vertical axis
A PivotChart, in few instances, is the same as a
clustered-column chart
pie chart
common form of chart used to compare categorical data
An experiment consists of determining the speed of automobiles on a highway by the use of radar equipment. The random variable in this experiment is a
continuous random variable
PivotTable
crosstabulation in Microsoft Excel
A two-dimensional graph representing the data using different shades of color to indicate magnitude is called a
heat map
An effective display of trend and magnitude is achieved by using a combination of a
heat map and sparklines
Data-ink is the ink used in a table or chart that
is necessary to convey the meaning of the data to the audience
The best way to differentiate chart elements is using
labels
A time series plot is also known as a
line chart
line chart
line that connects the points in the chart - useful for time series data collected over a period of time (minutes, hours, days, years, etc.)
variance
measure of variability
data-ink ratio
measures the proportion of what Tufte terms "data-ink" to the total amount of ink used in a table or chart
Probability is the
numerical measure of the likelihood that an event will occur
verification
process of determining that the computer procedure that performs the simulation calculations is logically correct - not complete until the user develops a high degree of confidence that the computer procedure is error free
validation
process of ensuring that the simulation model provides an accurate representation of a real system
random experiment
process that generates well-defined outcomes
risk analysis
quantifying the likelihood and magnitude of an undesirable outcome
In many cases, white space in a chart can improve
readability
A simulation model extends spreadsheet modeling by
replacing the use of single values for parameters with a range of possible values
The process of evaluating a decision in the face of uncertainty by quantifying the likelihood and magnitude of an undesirable outcome is known as
risk analysis
The event containing the outcomes belonging to A or B or both is the __________ of A and B.
union
bar charts
use horizontal bars to display the magnitude of the quantitative variable
column charts
use vertical bars to display the magnitude of the quantitative variable
exponential probability distribution
used for random variables like: - time between patient arrivals at an ER - distance between defects in a highway
crosstabulation
useful type of table for describing data of two variables
conditional probability
probability of one event given that some related event has already occurred
To summarize and analyze data with both a crosstabulation and charting, Excel typically pairs
PivotCharts with PivotTables
Which one of the following statements is not true concerning PivotTables in Excel?
PivotTables can be built using data arrayed in rows
The __________ probability distribution can be used to estimate the number of vehicles that go through an intersection during the lunch hour.
Poisson
bubble chart
graphical means of visualizing 3 variables in a 2D graph that sometimes is a preferred alternative to a 3D graph
scatter chart
graphical presentation of the relationship between two quantitative variables
D and M are independent events
if the probability of event D is not changed by the occurrence of event M
Deleting the grid lines in a table and the horizontal lines in a chart
increases the data-ink ratio
data-ink
ink used in a table or chart that is necessary to convey the meaning of the data to the audience
non-data-ink
ink used in a table or chart that serves no useful purpose in conveying the data to the audience
The data dashboard for a marketing manager may have KPIs related to
current sales measures and sales by region
A disadvantage of stacked-column charts and stacked-bar charts is that
it can be difficult to perceive small differences in areas
In a business, the values indicating the business's current operating characteristics, such as its financial position, the inventory on hand, and customer service metrics, are typically known as
key performance indicators (KPIs)
expected value
mean of a random variable
A _____________ is a graphical presentation of the relationship between two quantitative variables.
scatter chart
A line chart that has no axes but is used to provide information on overall trends for time series data is called a
sparkline
To avoid problems in interpreting the differences in color in a heat map, ____________ can be added.
sparklines
sparkline
special type of line chart - minimalist line chart that can be placed directly into a cell in Excel - contains no axes and displays only the line for the data - takes up very little space and can be used effectively to provide information on overall trends for time series data
Sample space is
the collection of all possible outcomes
All the events in the sample space that are not part of the specified event are called
the complement of the event
The center of a normal curve is
the mean of the distribution
Tables should be used instead of charts when
the values being displayed have different units or very different magnitudes
PivotChart
chart that Excel pairs with a PivotTable to summarize and analyze data with both a crosstabulation and charting
A __________ is useful for visualizing hierarchical data along multiple dimensions.
treemap
A _____________ is a line that provides an approximation of the relationship between the variables.
trendline
_________ are visual methods of displaying data.
Charts
DJ needs to display data over time. Which of the following charts should he use?
Line chart
A ___________ uses repeated random sampling to represent uncertainty in a model representing a real system and that computes the values of model outputs.
Monte Carlo Simulation
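Illustration (a minimal sketch under stated assumptions, not the textbook's procedure): repeated random sampling can estimate the bus-wait probability from the first card in this set. The function name, trial count, and seed are assumptions chosen for illustration.

```python
import random


def estimate_wait_probability(threshold=4.5, low=0.0, high=15.0,
                              trials=100_000, seed=0):
    """Estimate P(wait < threshold) by sampling uniform waits on [low, high]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.uniform(low, high) < threshold for _ in range(trials))
    return hits / trials


estimate = estimate_wait_probability()
print(f"Simulated P(wait < 4.5 minutes) is about {estimate:.3f}")
```

The estimate should land close to the exact value of 0.30, but each run is only a sample, so it varies slightly with the seed and the number of trials.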
In a normal distribution, which is greater, the mean or the median?
Neither the mean nor the median (they are equal)
Which statement is true about mutually exclusive events?
If events A and B cannot occur at the same time, they are called mutually exclusive
Two events are independent if
P(A | B) = P(A) or P(B | A) = P(B)
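Illustration (a minimal sketch; the fair-die example and the names prob and conditional are assumptions, not from the original cards): the independence condition can be checked by counting outcomes.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # event: the roll is even
B = {1, 2, 3, 4}   # event: the roll is at most 4


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)


print(conditional(A, B), prob(A))  # both are 1/2
print(conditional(B, A), prob(B))  # both are 2/3
```

Because P(A | B) = P(A) and P(B | A) = P(B), these two events are independent.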
geographic information system (GIS)
- a system that merges maps and statistics to present data collected over different geographical areas - helps in interpreting data and observing patterns
table design principles
- avoid using vertical lines in a table unless they are necessary for clarity - horizontal lines are generally necessary only for separating column titles from data values or when indicating that a calculation has taken place
data visualization involves
- creating a summary table for the data - generating charts to help interpret, analyze, and learn from the data
uses of data visualization
- helpful for identifying data errors - reduces the size of your data set by highlighting important relationships and trends in the data
tables should be used when
- the reader needs to refer to specific numerical values - the reader needs to make precise comparisons between different values and not just relative comparisons - the values being displayed have different units or very different magnitudes
What is the total area under the normal distribution curve?
1
A survey of 100 random high school students finds that 85 students watched the Super Bowl, 25 students watched the Stanley Cup Finals, and 20 students watched both games. How many students did not watch either game?
10
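Worked check (a minimal sketch; the variable names are assumptions chosen for illustration): the count follows from the addition law, |A or B| = |A| + |B| - |A and B|.

```python
total_students = 100
watched_super_bowl = 85
watched_stanley_cup = 25
watched_both = 20

# Students who watched at least one game (union of the two events).
watched_either = watched_super_bowl + watched_stanley_cup - watched_both

# Students in the complement of the union watched neither game.
watched_neither = total_students - watched_either
print(watched_either, watched_neither)
```

Here 85 + 25 - 20 = 90 students watched at least one game, so 100 - 90 = 10 watched neither.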
A health conscious student faithfully wears a device that tracks his steps. Suppose that the distribution of the number of steps he takes in a day is normally distributed with a mean of 10,000 and a standard deviation of 1,500 steps. What percent of the days does he exceed 13,000 steps?
2.28%
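Worked check (a minimal sketch using Python's standard-library statistics.NormalDist, available in Python 3.8 and later; the variable names are assumptions): 13,000 steps is two standard deviations above the mean, so the answer is the upper-tail area beyond z = 2.

```python
from statistics import NormalDist

mean_steps = 10_000
sd_steps = 1_500
threshold = 13_000

# Standardize: how many standard deviations above the mean is the threshold?
z = (threshold - mean_steps) / sd_steps

# Upper-tail probability P(X > threshold) = 1 - P(X <= threshold).
steps = NormalDist(mu=mean_steps, sigma=sd_steps)
p_exceed = 1 - steps.cdf(threshold)
print(f"z = {z:.1f}, P(X > 13,000) = {p_exceed:.2%}")  # about 2.28%
```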
heat map
2D graphical representation of data that uses different shades of color to indicate magnitude
The random variable X is known to be uniformly distributed between 2 and 12. Compute E(X), the expected value of the distribution.
7
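Worked derivation (a sketch that assumes X is continuous and uniform on [a, b] with a = 2 and b = 12): the expected value is the midpoint of the interval.

```latex
\[
E(X) = \int_{a}^{b} x \cdot \frac{1}{b-a}\,dx
     = \frac{b^{2}-a^{2}}{2(b-a)}
     = \frac{a+b}{2},
\qquad
E(X) = \frac{2+12}{2} = 7 .
\]
```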
Which of the following is a disadvantage of using simulation?
Each simulation run provides only a sample of how the real system will operate
The software package most commonly used for creating simple charts is
Excel
Which of the following graphs cannot be used to display categorical data?
Scatter chart
__________ merges maps and statistics to present data collected over different geographies.
The geographic information system
Which of the following is not a characteristic of the normal probability distribution?
The standard deviation must be 1
__________ is the process of determining that a simulation model provides an accurate representation of a real system.
Validation
normal probability distribution
a continuous probability distribution used to model quantities such as: - test scores - heights and weights of people
Poisson probability distribution
a discrete probability distribution that is often useful in estimating the number of occurrences of an event over a specified interval of time or space
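Illustration (a minimal sketch; the mean of 4 vehicles per minute is a hypothetical value, and the function names are assumptions): a Poisson probability is P(X = k) = e^(-mean) * mean^k / k!.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def poisson_cdf(k, mean):
    """P(X <= k), the sum of the probabilities for 0, 1, ..., k."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))


mean_per_minute = 4.0  # hypothetical arrival rate, not from the cards
p_at_most_two = poisson_cdf(2, mean_per_minute)
print(f"P(at most 2 vehicles in a minute) = {p_at_most_two:.4f}")
```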
trendline
a line that provides an approximation of the relationship between two variables
probability
a numerical measure of the likelihood that an event will occur
discrete uniform probability distribution
a probability distribution for which each possible value of the random variable has the same probability
A chart that is recommended as an alternative to a pie chart is a
bar chart
A data visualization tool that updates in real time and gives multiple outputs is called
data dashboard
data dashboard
data visualization tool that illustrates multiple metrics and automatically updates these metrics as new data become available
A variable that can only take on specific numeric values is called a
discrete random variable
uniform probability distribution
continuous probability distribution in which every interval of a given length is equally likely
Making visual comparisons between categorical variables may be difficult in a
pie chart
A __________ describes the range and relative likelihood of all possible values for a random variable.
probability distribution for a random variable
The outcome of a simulation experiment is a(n)
probability distribution for one or more output measures
realization
observed value of a random variable, generated from its probability distribution
scatter-chart matrix
useful chart for displaying multiple variables
treemap
useful for visualizing hierarchical data along multiple dimensions
triangular probability distribution
useful when only subjective probability estimates are available
charts / graphs
visual methods of displaying data