Analysis Quiz 1
What is the principal connection between a sample and a population?
A random sample drawn from a population seeks to describe the characteristics of that population.
Some hotels ask their guests to rate the hotel's services as excellent, very good, good, and poor. This is an example of the:
Ordinal Scale.
The largest experimental statistical study ever conducted is believed to be for:
Polio.
The difference between the lower class limits of adjacent classes provides the:
class width.
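A minimal sketch of that subtraction, assuming hypothetical classes 10-19, 20-29, and 30-39 (these limits are illustrative and do not come from the quiz):

```python
# Hypothetical lower class limits for the classes 10-19, 20-29, 30-39.
lower_limits = [10, 20, 30]

# Class width = difference between the lower limits of adjacent classes.
widths = [b - a for a, b in zip(lower_limits, lower_limits[1:])]
print(widths)  # [10, 10]
```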
The major applications of data mining have been made by companies with a strong _____ focus.
consumer
What is the principal difference between time series data and cross-sectional data?
Cross-sectional data are collected at the same or approximately the same point in time, while time series data are collected over several time periods.
The Department of Homeland Security has noted that on average 1120 suspicious vehicles are stopped and searched each day in the United States. This number is used to estimate the number of cars stopped in an average yearly period. The average number of cars stopped is not an example of:
descriptive statistics.
The Department of Transportation of a city has noted that, on average, there are 17 accidents per day. The average number of accidents is an example of:
descriptive statistics.
The owner of a factory regularly requests a graphical summary of all employees' salaries. The graphical summary of salaries is an example of:
descriptive statistics.
The summaries of data, which may be tabular, graphical, or numerical, are referred to as:
descriptive statistics.
Statistical Inference:
is the process of drawing conclusions about the population based on the evidence taken from the sample.
Which of the following statements is correct? [Stem-and-leaf display not shown]
It would also be appropriate to have displayed this data in a histogram.
A student asked their classmates how tall they are, in which month they were born, and whether they are in a relationship. How many of these variables are quantitative and how many of these variables are categorical?
One is quantitative. Two are categorical.
The reversal of conclusions based on aggregate and unaggregated data is called:
Simpson's paradox.
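The reversal can be illustrated with a small sketch. The counts below are invented purely to produce the effect (they are not from the quiz or from any real study), and the names `data`, `rate`, and `totals` are hypothetical:

```python
# Hypothetical (successes, trials) counts, invented to show the reversal.
data = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    """Success rate as a fraction."""
    return successes / trials

# Unaggregated view: within each group, option A has the higher rate.
a_wins_each_group = all(
    rate(*counts["A"]) > rate(*counts["B"]) for counts in data.values()
)

# Aggregated view: pool the counts across groups for each option.
totals = {}
for option in ("A", "B"):
    successes = sum(counts[option][0] for counts in data.values())
    trials = sum(counts[option][1] for counts in data.values())
    totals[option] = (successes, trials)

b_wins_overall = rate(*totals["B"]) > rate(*totals["A"])

print(a_wins_each_group)  # True
print(b_wins_overall)     # True
```

Option A looks better in every group, yet option B looks better once the groups are pooled, because the two options were tried in very different proportions across the groups.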
Which of the following would likely display a negative relationship when creating a scatter diagram?
The number of classes a student misses during a semester and the grade obtained in the course
Numerical values that indicate how much or how many are known as
quantitative data.
Data mining deals with methods for developing useful decision-making information from large databases. It performs all actions except the:
reselling of the packaged data.
A graphical presentation of the relationship between two quantitative variables is called a:
scatter diagram.
A graphical display for depicting multiple bar charts on the same display is called a:
side-by-side bar chart.
A display used to compare the frequency, relative frequency, or percent frequency of two categorical variables is a:
stacked bar chart.
The relative frequency of a class is computed by:
dividing the frequency of the class by n.
The relative frequency of a class is computed by:
dividing the frequency of the class by the sample size.
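As a rough sketch of that calculation, assuming a hypothetical list of ten guest ratings (the data are made up for illustration):

```python
from collections import Counter

# Hypothetical guest ratings (not from the quiz).
ratings = ["good", "excellent", "good", "poor", "very good",
           "good", "excellent", "very good", "good", "excellent"]

n = len(ratings)         # sample size
freq = Counter(ratings)  # frequency of each class

# Relative frequency = class frequency divided by n.
rel_freq = {label: count / n for label, count in freq.items()}

for label, count in freq.items():
    print(label, count, rel_freq[label])
```

The frequencies add up to n, so the relative frequencies add up to one (up to floating-point rounding).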
A company wants to enhance the benefits for its yearly healthcare package offering. It observes the set of employees who smoke on their break times at 10 a.m., 12 p.m., and 2 p.m. It records the number of cigarettes smoked by each individual. This is an example of:
an observational study.
The proper way to construct a stem-and-leaf display for the data set {62, 67, 68, 73, 73, 79, 91, 94, 95, 97} is to:
include a stem labeled "8" and enter no leaves on the stem.
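One way to sketch this display in code, using the data set from the question; the tens digit is taken as the stem and the ones digit as the leaf, and the `stem | leaves` layout is just one common convention:

```python
data = [62, 67, 68, 73, 73, 79, 91, 94, 95, 97]

# Every stem from the smallest to the largest is listed, even if it has no leaves.
stems = range(min(data) // 10, max(data) // 10 + 1)
leaves = {stem: [] for stem in stems}
for value in sorted(data):
    leaves[value // 10].append(value % 10)

lines = [
    f"{stem} | {' '.join(str(leaf) for leaf in leaves[stem])}".rstrip()
    for stem in stems
]
print("\n".join(lines))
# 6 | 2 7 8
# 7 | 3 3 9
# 8 |
# 9 | 1 4 5 7
```

Keeping the empty stem 8 preserves the gap between the 70s and the 90s, which would be hidden if the stem were dropped.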
Before drawing any conclusions about the relationship between two variables shown in a crosstabulation, you should:
investigate whether any hidden variables could affect the conclusions.
The sample size:
is always smaller than the population size.
Data collected through a survey attached to this month's pay stub:
is considered an observational study because no control is imposed.
In a sample of 200 students in a university, 40, or 20%, are Communications majors. Based on the above information, the school's paper reported that "an estimated 20% of all the students at the university are Communications majors." This report is an example of:
statistical inference.
Which of the following is least useful in making comparisons or showing the relationships of two variables?
stem-and-leaf display
All of the following are examples of observational studies except:
the behavior of Walmart shoppers after they are given a $20 gift card from the store.
Examples of cross-sectional data include all except:
the comparison of sales output of all 10 salespeople in the Western Sales Region for the 3rd quarter.
The sum of frequencies for all classes will always equal:
the number of observations in a data set.
The collection of all elements of interest in a particular study is:
the population of interest.
A cumulative relative frequency distribution shows:
the proportion of data items with values less than or equal to the upper limit of each class.
Data mining is the analysis step of the "knowledge discovery in databases" process. It has become an emerging field in today's business world because:
the sheer amount of useful data has grown exponentially.
In a cumulative frequency distribution, the last class will always have a cumulative frequency equal to:
the total number of observations in a data set.
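A short sketch of how the running totals behave, assuming four hypothetical class frequencies (illustrative values only):

```python
from itertools import accumulate

# Hypothetical class frequencies, ordered from lowest to highest class.
frequencies = [4, 7, 6, 3]
n = sum(frequencies)  # total number of observations

# Cumulative frequency: running total of the class frequencies.
cum_freq = list(accumulate(frequencies))

# Cumulative relative frequency: each running total divided by n.
cum_rel_freq = [c / n for c in cum_freq]

print(cum_freq)          # [4, 11, 17, 20]
print(cum_freq[-1] == n)  # True
print(cum_rel_freq[-1])  # 1.0
```

The last cumulative frequency equals n, so the last cumulative relative frequency equals one.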
A time series is a sequence of data points, typically consisting of successive measurements made over a time interval. Examples of time series include all except:
the volume of shares traded today in the stock market.
In a random sample of 200 items, 5 items were defective. An estimate of the percentage of defective items in the population is:
2.5%.
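The point estimate is simply the sample proportion expressed as a percentage. A minimal sketch, where `percent` is a hypothetical helper name:

```python
def percent(count, n):
    """Sample proportion expressed as a percentage."""
    return 100 * count / n

print(percent(5, 200))     # 2.5
print(percent(40, 200))    # 20.0
print(percent(912, 1600))  # 57.0
```

The second and third calls reuse figures quoted in other questions of this quiz (40 of 200 students, 912 of 1600 voters).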
The number of observations in a complete data set having 20 elements and 3 variables is:
20
Examples of categorical data include all of the following except:
228 lbs.
An experiment is conducted to study the effects of a new blood pressure medicine on subjects. A random sample of 60 people is placed into three groups of 20 according to age (young, middle-aged, and senior). Each group of 20 people is randomly assigned to two treatment groups (new medicine and placebo). How many total treatment groups are there?
6
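The count follows from pairing each age group with each treatment. A small sketch of that enumeration, with hypothetical variable names:

```python
from itertools import product

age_groups = ["young", "middle-aged", "senior"]
treatments = ["new medicine", "placebo"]

# Each (age group, treatment) pair is one treatment group.
cells = list(product(age_groups, treatments))

for cell in cells:
    print(cell)
print(len(cells))  # 6
```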
Which of the following is not a type of data acquisition error?
A particularly extreme value is verified, and is still included in the data set.
Which of the following graphical displays should not be used for quantitative data?
Stacked Bar Chart
In a crosstabulation:
both variables must be categorical.
Data obtained from a nominal scale:
can be either numeric or nonnumeric.
In a sample of 1600 registered voters, 912, or 57%, approve of the way the President is doing his job. The 57% approval rating is an example of:
descriptive statistics.
The entities on which data are collected are:
elements
Examples of quantitative data include all of the following except:
your zip code.
The American Statistical Association describes eight general topic areas and specifies important ethical considerations under each topic. One area is "Responsibilities to Research Subjects". It describes the requirements for protecting the interests of human and animal subjects of research not only during data collection but also in the analysis, interpretation, and publication of results of the findings. An example of this would include:
Avoiding or minimizing the use of deception. Where it is necessary and provides significant knowledge (as in some psychological, sociological, and other research), ensure prior independent ethical review of the protocol and continued monitoring of the research. Avoiding excessive risk to research subjects and excessive imposition on their time and privacy. Know about and adhere to appropriate animal welfare guidelines in research involving animals. Ensure that a competent understanding of the subject matter is combined with credible statistical validity.
A frequency distribution is a tabular summary of data showing the:
number of items in several classes.
Statistical studies in which researchers do not control variables of interest are:
observational studies.
Which of these is a continuous and quantitative variable?
The time it takes to grill a steak
The American Statistical Association describes eight general topic areas and specifies important ethical considerations under each topic. One area is "Professionalism". Professionalism points out the need for competence, judgment, diligence, self-respect, and worthiness of the respect of other people. Which of the following adheres to upholding the ethical guidelines for statistical practice?
Use only statistical methodologies suitable to the data and to obtaining valid results. For example, address the multiple potentially confounding factors in observational studies and use due caution in drawing causal inferences. Guard against the possibility that a predisposition (bias) by investigators or data providers might predetermine the analytic result. Account for all data considered in a study and explain the sample(s) actually used.
Which of the following is not a recommended guideline for creating an effective graphical display?
Use three dimensions whenever possible to give the display depth.
A histogram is:
a graphical presentation of a frequency or relative frequency distribution.
A set of visual displays that organizes and presents information that is used to monitor the performance of a company or organization in a manner that is easy to read, understand, and interpret is called a:
data dashboard.
In a cumulative relative frequency distribution, the last class will have a cumulative relative frequency equal to:
one.
In data mining, statistical models play an important role in developing _____.
predictive models
Anyone who wants to use the data and statistical analysis as aids to decision making must be aware of the time and cost issues. If important data are not readily available, it would be best to:
use a cross-sectional data set.