Quantitative Analysis-WTAMU-1st-Midterm
When we look at a time series plot, we usually look for which two things?
"Is there an observable trend?" and "Is there a seasonal pattern?"
We are usually on the lookout for large correlations near
+1 or -1
If events A and B are mutually exclusive, then the probability of both events occurring simultaneously is equal to
0.0
If two events are mutually exclusive, what is the probability that both occur at the same time?
0.00
The probability of an event and the probability of its complement always sum to
1
The median can also be described as:
1. the middle observation when the data is arranged in ascending order 2. the second quartile 3. the 50th percentile
We can infer that there is a strong relationship between two numerical variables when
1. the points on a scatterplot cluster tightly around an upward sloping straight line 2. the points on a scatterplot cluster tightly around a downward sloping straight line
A scatterplot allows one to see
1. what type of relationship there is between two variables 2. whether there is any relationship between two variables
Expressed in percentiles, the interquartile range is the difference between the
25th and 75th percentiles
For a boxplot, the box itself represents _______ percent of the observations.
50%
Suppose that a histogram of a data set is approximately symmetric and "bell shaped". Approximately _______ percent of the observations are within one standard deviation of the mean.
68%
Suppose that a histogram of a data set is approximately symmetric and "bell shaped". Approximately _______ percent of the observations are within two standard deviations of the mean.
95%
If a value represents the 95th percentile, this means:
95% of all values are below this point
Suppose that a histogram of a data set is approximately symmetric and "bell shaped". Approximately _______ percent of the observations are within three standard deviations of the mean.
99.7%
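The 68%, 95%, and 99.7% figures on the three "bell shaped" cards can be checked numerically. The following is a minimal sketch, assuming the data follow an exact normal distribution; the helper name share_within is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
z = NormalDist(mu=0.0, sigma=1.0)

def share_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

Real data that are only approximately bell shaped will match these percentages only approximately.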
Which of the following best describes a data warehouse?
A database specifically structured to enable data mining
The Excel function that allows you to count using more than one criterion is
COUNTIFS
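In Excel, COUNTIFS takes pairs of a criteria range and a criterion and counts the rows that satisfy all of them. A rough Python analog is sketched below; the records and the helper name count_ifs are invented for this example and are not part of the course material.

```python
# Hypothetical records: (gender, state) pairs invented for this sketch.
records = [("F", "TX"), ("M", "TX"), ("F", "OK"), ("F", "TX"), ("M", "NM")]

def count_ifs(rows, *criteria):
    """Count the rows for which every criterion function returns True."""
    return sum(1 for row in rows if all(test(row) for test in criteria))

# Two criteria at once: gender is "F" and state is "TX".
female_texans = count_ifs(
    records,
    lambda row: row[0] == "F",
    lambda row: row[1] == "TX",
)
print(female_texans)
```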
Gender and State are examples of which type of data?
Categorical
If we are interested only in determining whether a relationship exists, we employ_______.
Correlation Analysis
Which of the following are considered measures of association?
Covariance and correlation
Numerical variables can be subdivided into which two types?
Discrete and continuous
Conditional probability is the probability that an event will occur, with no other events taken into consideration.
F
Mean absolute deviation (MAD) is the average of the squared deviations
F
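The statement above is false because averaging the squared deviations gives the variance, while MAD averages the absolute deviations. A small sketch of the difference, assuming the population form (dividing by n) and a made-up data set:

```python
from statistics import mean

def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean."""
    m = mean(values)
    return mean(abs(x - m) for x in values)

def population_variance(values):
    """Average of the squared deviations from the mean (divides by n)."""
    m = mean(values)
    return mean((x - m) ** 2 for x in values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean is 5
print(mean_absolute_deviation(data))  # 1.5
print(population_variance(data))      # 4
```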
Time series graphs chart the values of one or more time series, using time on the vertical axis.
F
Two events A and B are said to be independent if P(A and B) = P(A) + P(B)
F
All nominal data may be treated as ordinal data.
False
Because they represent such extreme values, outliers should be eliminated from statistical analyses.
False
Categorical variables can be classified as either discrete or continuous.
False
Correlation and covariance can be used to examine relationships between numerical variables as well as for categorical variables that have been coded numerically.
False
Correlation can be affected by the measurement scales applied to X and Y variables.
False
In an extremely right-skewed distribution, the mean is much smaller than the median.
False
Phone numbers, Social Security numbers, and zip codes are examples of numerical variables.
False
The number of cars produced by GM during a given quarter is a continuous random variable.
False
If data is stored in a database package, which of the following terms are typically used?
Fields and records
Mutually Exclusive Events
If one event occurs, the other cannot.
An opinion variable expressed numerically on a 1-5 scale is a(n):
Likert scale
Which of the following are the three most common measures of central location?
Mean, median and mode
Which of the following are the three most common measures of central tendency?
Mean, median, and mode
Standard Deviation
Measures how far the observations are from the mean. It's influenced by outliers.
Mean < Median
Skewed Left (Tail Left)
Mean > Median
Skewed Right (Tail Right)
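The two skewness cards above can be illustrated with a small made-up data set: one large value pulls the mean above the median, and mirroring the data reverses the effect. This is only a sketch; the numbers are invented.

```python
from statistics import mean, median

# Right-skewed: a single large value stretches the right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image of the same data, so the long tail is on the left.
left_skewed = [21 - x for x in right_skewed]

print(mean(right_skewed), median(right_skewed))  # mean above median
print(mean(left_skewed), median(left_skewed))    # mean below median
```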
Football teams toss a coin to see who will get their choice of kicking or receiving to begin a game. The probability that a given team will win the toss three games in a row is 0.125.
T
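The 0.125 figure follows from the multiplication rule for independent events: 0.5 x 0.5 x 0.5. The sketch below checks it two ways, assuming a fair coin and independent tosses; "W" and "L" are just labels chosen for this example.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 8 equally likely win/loss sequences over three tosses.
outcomes = list(product("WL", repeat=3))
favorable = [o for o in outcomes if o == ("W", "W", "W")]
p_enumerated = Fraction(len(favorable), len(outcomes))

# Multiplication rule for independent events: P(W) * P(W) * P(W).
p_product = Fraction(1, 2) ** 3

print(p_enumerated, p_product, float(p_product))
```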
In the term "frequency table," frequency refers to the counts of observations in specified categories.
T
Probability is a number between 0 and 1, inclusive, which measures the likelihood that some event will occur.
T
Suppose that after graduation you will either buy a new car (event A) or take a trip to Europe (event B). Events A and B are mutually exclusive.
T
The advantage that the coefficient of correlation has over the covariance is that the former has a set lower and upper limit.
T
The correlation between two variables is unitless and is always between -1 and +1.
T
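To see why correlation is unitless while covariance is not, the sketch below computes both for the same heights expressed in meters and in centimeters. The data are invented, and the sample form (dividing by n - 1) is an assumption made for this example.

```python
from math import sqrt
from statistics import mean

def covariance(xs, ys):
    """Sample covariance: average product of deviations, dividing by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]   # hypothetical heights in meters
weights_kg = [55, 65, 70, 72, 85]            # hypothetical weights in kilograms
heights_cm = [h * 100 for h in heights_m]    # same heights, different unit

# Covariance grows by a factor of 100; correlation does not change.
print(covariance(heights_m, weights_kg), covariance(heights_cm, weights_kg))
print(correlation(heights_m, weights_kg), correlation(heights_cm, weights_kg))
```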
The count of categories is the only meaningful way to summarize categorical data.
T
The number of car insurance policy holders is an example of a discrete random variable.
T
Two events A and B are said to be mutually exclusive if P(A and B) = 0.
T
Using dummy variables is an efficient way of determining counts of categorical variables.
T
A frequency table indicates how many observations fall within each category, and a histogram is its graphical analog.
TRUE
Exhaustive Events
They exhaust all possibilities: one of these events must occur.
A population includes all elements or objects of interest in a study, whereas a sample is a subset of the population used to gain insights into the characteristics of the population.
True
A trend line on a scatterplot is a line or a curve that "fits" the scatter as well as possible.
True
A variable (or field or attribute) is a characteristic of members of a population, whereas an observation (or case or record) is a list of all variable values for a single member of a population.
True
Age, height, and weight are examples of numerical data.
True
Both ordinal and nominal variables are categorical.
True
Cross-sectional data are data on a population at a distinct point in time, whereas time series data are data collected over time.
True
The core purpose of time series graphs is to detect historical patterns in the data.
True
The number of car insurance policy holders is an example of a discrete numerical variable.
True
The number of people entering a shopping mall on a given day is an example of a discrete random variable.
True
The probability that event A will not occur is denoted as P(A^).
True
The scatterplot is a graphical technique used to make apparent the relationship between two numerical variables.
True
The temperature of the room in which you are writing this test is a continuous random variable.
True
The time students spend in a computer lab during one day is an example of a continuous random variable.
True
Mean = Median
Symmetric (No Skew)
Which of the following are the two most commonly used measures of variability?
Variance and standard deviation
Scatterplots are also referred to as
X-Y charts
If the correlation of variables is close to 0, then we expect to see
a cluster of points with no apparent relationship on the scatterplot
The Literary Digest fiasco of 1936 is an example of:
a sample that is not representative of its population
We study relationships among numerical variables using
a. correlation b. covariance c. scatterplot charts
A population includes:
all objects of interest in a particular study
The correlation coefficient is always:
between -1 and +1
A histogram that has exactly two peaks is called a _______ distribution.
bimodal
Boxplots are probably most useful for _______.
comparing two populations graphically
Which of the following are considered numerical summary measures?
correlation and covariance
To examine relationships between two categorical variables, we can use
counts and corresponding charts of the counts
A sample of a population taken at one particular point in time is categorized as:
cross-sectional
The tables that result from pivot tables are called:
crosstabs
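A crosstab is a table of counts for each combination of two categorical variables. The sketch below builds one with a counter instead of a pivot table; the (gender, state) records are invented for illustration.

```python
from collections import Counter

# Hypothetical records: (gender, state) pairs invented for this sketch.
records = [("F", "TX"), ("M", "TX"), ("F", "OK"), ("F", "TX"), ("M", "NM")]

# Count how often each (gender, state) combination occurs.
crosstab = Counter(records)

genders = sorted({g for g, _ in records})
states = sorted({s for _, s in records})

# Print one row per gender and one column per state.
print("     " + "  ".join(states))
for g in genders:
    print(f"{g}    " + "   ".join(str(crosstab[(g, s)]) for s in states))
```

Combinations that never occur are reported as 0, because a Counter returns 0 for missing keys.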
Data that arise from counts are called:
discrete
The two types of random variables are
discrete and continuous
Coding males as 1 and females as 0 in a data set illustrates the use of
dummy variables
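Because a dummy variable is 1 for one category and 0 otherwise, summing it gives a count and averaging it gives a proportion. A minimal sketch with made-up data:

```python
# Hypothetical gender column; the values are invented for this sketch.
genders = ["M", "F", "F", "M", "F"]

# Dummy coding: 1 for male, 0 for female.
is_male = [1 if g == "M" else 0 for g in genders]

male_count = sum(is_male)                 # the sum of the dummy is the count
male_share = male_count / len(is_male)    # the mean of the dummy is the proportion

print(is_male, male_count, male_share)
```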
A _________________ indicates how many observations fall into various categories.
frequency table
Continuous Random Variable
has a continuum of possible values.
Discrete Random Variable
has a finite (or countable) number of possible values
A _________________ is the graphical analog of a frequency table.
histogram
What is the most common type of chart for showing the distribution of a numerical variable?
histogram
The difference between the first and third quartiles is called the _______.
interquartile range
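As a worked example, the interquartile range can be computed as the third quartile minus the first quartile. The sketch below uses Python's statistics.quantiles with the "inclusive" method; software packages use slightly different quartile conventions, so other tools may give somewhat different quartiles for the same data. The data are invented.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative values

# n=4 returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3, iqr)
```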
Probability
is a number between 0 and 1 that measures the likelihood that some event will occur.
The limitation of covariance as a descriptive measure of association is that it
is very sensitive to the units of the variables
What measure of distribution relates to extreme events, such as a stock market crash?
kurtosis
A discrete probability distribution:
lists all of the possible values of the random variable and their corresponding probabilities
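A discrete probability distribution can be written as a table of values and probabilities. The sketch below uses a fair six-sided die as an assumed example and checks the two basic requirements: no probability is negative, and the probabilities sum to 1.

```python
from fractions import Fraction

# Distribution of a fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Basic validity checks for any discrete distribution.
all_nonnegative = all(p >= 0 for p in die.values())
total = sum(die.values())

# Probability of an event is the sum over the values it contains.
p_at_least_five = sum(p for face, p in die.items() if face >= 5)

print(all_nonnegative, total, p_at_least_five)
```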
For a boxplot, the point inside the box indicates the location of the __________.
mean
Correlation is useful only for
measuring the strength of a linear relationship
For a boxplot, the vertical line inside the box indicates the location of the______.
median
An observation is a:
member of a population or sample
The interquartile range (IQR) represents what percent of the observations?
middle 50%
The median can also be described as the
middle observation when the data values are arranged in ascending order
The mode is best described as the
most frequently occurring value
Let A and B be the events of the FDA approving and rejecting a new drug to treat hypertension, respectively. The events A and B are
mutually exclusive
A function that associates a numerical value with each possible outcome of an uncertain event is called a
random variable
As a measure of variability, what is defined as the maximum value minus the minimum value?
range
In order for the characteristics of a sample to be generalized to the entire population, it should be ________________ of the population.
representative
P(A^) = 1 - P(A)
rule of complements
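The rule of complements is often the easiest route to an "at least one" probability. As a sketch, assuming a fair coin and independent tosses, the chance of winning at least one of three coin tosses is 1 minus the chance of losing all three:

```python
# Probability of losing a single fair coin toss.
p_lose = 0.5

# Event A: lose all three independent tosses.
p_lose_all_three = p_lose ** 3

# Complement of A: win at least one of the three tosses.
p_win_at_least_one = 1 - p_lose_all_three

print(p_lose_all_three, p_win_at_least_one)
```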
Researchers try to gain insight into the characteristics of a population by examining a __________ of the population.
sample
Researchers may gain insight into the characteristics of a population by examining a
sample of the population
A time series plot is essentially a:
scatterplot
A histogram that is positively skewed is also called
skewed to the right
A histogram that is positively skewed is:
skewed to the right
A histogram that has a single peak and looks approximately the same to the left and right of the peak is:
symmetric
The commonly observed shapes of histograms are:
symmetric, bimodal, positively skewed, negatively skewed
How is the median defined if the number of observations is even?
the average of the two middle observations
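The even and odd cases of the median can be sketched as follows. The function name median_of is chosen for this example; Python's built-in statistics.median follows the same rule.

```python
def median_of(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([7, 1, 3]))     # odd count: the middle observation
print(median_of([7, 1, 3, 9]))  # even count: average of 3 and 7
```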
A bimodal histogram is often an indication that:
the data come from two or more distinct populations
The mode is best described as:
the most frequently occurring value
The tool that provides useful information about a data set by breaking it down into subpopulations is:
the pivot table
Correlation and covariance measure
the strength and direction of a linear relationship between two numerical variables
A variable is classified as ordinal if
there is a natural ordering of categories
A variable is classified as ordinal if:
there is a natural ordering of categories
The daily closing values of the Dow Jones Industrial Average are examples of
time-series data
A line or curve superimposed on a scatterplot to quantify an apparent relationship is known as a(n)
trend line
A scatterplot allows one to see:
whether there is any relationship between two variables & what type of relationship there is between two variables