Ch 1 Test Prep

1.5 The collection of all elements of interest in a particular study is:

the population

1.5 The sample size:

is always smaller than the population size. WHY: A sample is a subset of the population. To support inference, a sample must be randomly drawn from the population and, as a rule of thumb, should not exceed 10% of the population of interest. See Section 1.5, Statistical Inference.

1.8 Big data is often defined according to the four v's of data: volume, variety, veracity, and ___________.

velocity

1.5 In a random sample of 200 items, 5 items were defective. An estimate of the percentage of defective items in the population is:

2.5% WHY: The proportion of defective items in the sample is a good estimator of the proportion of defective items in the population. 5/200 = 0.025 = 2.5%
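The arithmetic behind this estimate can be sketched in a few lines of Python. The function name `sample_proportion` is a hypothetical helper chosen for illustration, not something from the textbook.

```python
def sample_proportion(successes: int, sample_size: int) -> float:
    """Return the sample proportion, used as a point estimate of the population proportion."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return successes / sample_size


# 5 defective items in a random sample of 200
defective_share = sample_proportion(5, 200)
print(f"{defective_share:.1%}")  # prints 2.5%
```

The same helper applies to any card that reports a count out of a sample size.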

1.2 The number of observations in a complete data set having 20 elements and 3 variables is:

20. WHY: The number of elements is equal to the number of observations. There are 20 elements in the data set; therefore, there are 20 observations.
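As a rough illustration, a data set can be pictured as a table with one row per element and one column per variable. The variable names below are placeholders invented for this sketch; only the counts (20 elements, 3 variables) come from the question.

```python
# Hypothetical variable names; only the counts come from the question.
variables = ["variable_1", "variable_2", "variable_3"]

# One row (observation) per element; values are left empty because
# only the shape of the data set matters here.
data = [{name: None for name in variables} for _ in range(20)]

n_observations = len(data)                      # one observation per element
n_variables = len(variables)                    # measurements per observation
n_data_values = n_observations * n_variables    # total individual data values

print(n_observations, n_variables, n_data_values)  # prints 20 3 60
```

The number of observations tracks the number of rows, so it stays at 20 no matter how many variables are recorded.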

Examples of categorical data include all of the following except:

228 lbs. WHY: Weight measured in pounds is continuous and numerical; therefore, it is quantitative.

1.4 What percentage of countries have tax rates above the mean tax rate for the dataset? [Figure: histogram showing the frequency of tax rates. Note: GDP per capita compares GDP on a purchasing power parity (PPP) basis divided by population as of 1 July for the year 2014.]

50% WHY: The mean of the data set is 30.4%. Half of the countries (50%) have tax rates above the mean.

1.3 Which of the following is not a type of data acquisition error?

A particularly extreme value is verified, and is still included in the data set. WHY: Although the value is an outlier, it has been properly verified, so it should be included in the data set.

1.5 What is the principal connection between a sample and a population?

A random sample drawn from a population seeks to describe the characteristics of that population.

1.3 An experiment is conducted to study the effects of a new blood pressure medicine on subjects. A random sample of 60 people is placed into three groups of 20 according to age (young, middle-aged, and senior). The people in each group of 20 are randomly assigned to one of two treatments (new medicine or placebo). How many total treatment groups are there?

6 WHY: (Three age groups)*(two treatments) = 6 treatment groups.
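One way to see the count is to enumerate every age-group and treatment pairing. This is a minimal sketch using Python's standard `itertools.product`; the list names are chosen here for illustration.

```python
from itertools import product

age_groups = ["young", "middle-aged", "senior"]
treatments = ["new medicine", "placebo"]

# Each (age group, treatment) pair is one treatment group.
treatment_groups = list(product(age_groups, treatments))

for group in treatment_groups:
    print(group)

print(len(treatment_groups))  # prints 6
```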

1.2 Data obtained from a nominal scale:

can be either numeric or nonnumeric

1.8 The major applications of data mining have been made by companies with a strong _____ focus.

consumer WHY: such as retail businesses, financial organizations, and communications companies

1.2 What is the principal difference between time series data and cross-sectional data?

Cross-sectional data are collected at the same or approximately the same point in time, while time series data are collected over several time periods.

1.4 In a sample of 1600 registered voters, 912, or 57%, approve of the way the President is doing his job. The 57% approval rating is an example of:

descriptive statistics.

1.4 The Department of Transportation of a city has noted that on average there are 17 accidents per day. The average number of accidents is an example of:

descriptive statistics.

1.4 The owner of a factory regularly requests a graphical summary of all employees' salaries. The graphical summary of salaries is an example of

descriptive statistics.

1.5 Statistical Inference:

is the process of drawing conclusions about the population based on the evidence taken from the sample.

1.8 Data mining deals with methods for developing useful decision-making information from large databases. It performs all actions except the:

reselling of the packaged data. WHY: Data mining includes collection, extraction, warehousing, and analysis of predictive information.

1.5 The Department of Homeland Security has noted that on average 1120 suspicious vehicles are stopped and searched each day in the United States. This number is used to estimate the number of cars stopped in an average yearly period. The average number of cars stopped is not an example of:

descriptive statistics. WHY: The statistic mentioned is gathered daily and used to describe the population of the United States.

1.4 The summaries of data, which may be tabular, graphical, or numerical, are referred to as:

descriptive statistics. WHY: this set of analytical techniques is used to define and describe statistical data.

1.2 The entities on which data are collected are:

elements

Data collected through a survey attached to this month's pay stub:

is considered an observational study because no control is imposed. WHY: Management just wants a response from the employees and communicates in the most convenient way.

1.5 In a sample of 200 students in a university, 40, or 20%, are communications majors. Based on the above information, the school's paper reported that "an estimated 20% of all the students at the university are communications majors." This report is an example of:

statistical inference. WHY: The report uses a statistic to describe a characteristic of a population.

1.3 All of the following are examples of observational studies except:

the behavior of Walmart shoppers after they are given a $20 gift card from the store. WHY: Behaviors will vary because a treatment is introduced, which will elicit a response. Because the researcher imposes that treatment (the gift card), it cannot be called an observational study.

1.2 Examples of cross-sectional data include all of the following except:

the comparison of sales output of all 10 salespeople in the Western Sales Region for the 3rd quarter. WHY: The sales data is examined over a 3-month time interval.

1.5 The population of countries (n = 60) with the top GDPs in 2014 is shown above. Assume for the purposes of this problem that the mean and median GDP of this population are unknown. In reality, they are $51,195 and $44,600, respectively. A random sample of 10 countries (GDPs) was selected from this unknown population. What range would the sample mean most likely fall into?

$50K to $59.9K WHY: The average is sensitive to outliers. The outlier to the far right pulls the mean toward it. Income distributions are commonly skewed right because there are far fewer super-rich people than impoverished in the world's population.
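The pull that a single large value exerts on the mean can be illustrated with a small sketch. The ten numbers below are made up for illustration and are not the GDP data from the question; the point is only that one high outlier moves the mean well above the median.

```python
from statistics import mean, median

# Hypothetical right-skewed values (in thousands), NOT the actual GDP data.
values = [30, 35, 40, 42, 45, 48, 50, 55, 60, 150]

print(mean(values))    # 55.5
print(median(values))  # 46.5

# Dropping the single outlier brings the mean back near the median.
without_outlier = values[:-1]
print(mean(without_outlier))    # 45
print(median(without_outlier))  # 45
```

In a right-skewed distribution like this, the mean sits above the median, which matches the population figures quoted in the question.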

1.9 The American Statistical Association describes eight general topic areas and specifies important ethical considerations under each topic. One area is "Responsibilities to Research Subjects." It describes the requirements for protecting the interests of human and animal subjects of research not only during data collection but also in the analysis, interpretation, and publication of results of the findings. Which of the following is not an example of this?

Assuming legal privacy and confidentiality protections where they may not apply

1.9 The American Statistical Association describes eight general topic areas and specifies important ethical considerations under each topic. One area is "Professionalism." Professionalism points out the need for competence, judgment, diligence, self-respect, and worthiness of the respect of other people. Which of the following does not adhere to upholding the ethical guidelines for statistical practice?

Engage in discrimination based on personal characteristics.

1.3 Statistical studies in which researchers do not control variables of interest are:

observational studies. WHY: An observational study simply gathers information on phenomena without imposing any control over the elements being observed

1.2 Which of these is a continuous and quantitative variable?

The time it takes to grill a steak

1.2 A time series is a sequence of data points, typically consisting of successive measurements made over a time interval. Examples of the time series include all of the following except:

The volume of shares traded today in the stock market WHY: it is a data point captured for today only, not over a period of time.

1.3 A company wants to enhance the benefits for its yearly healthcare package offering. It observes the set of employees who smoke on their break times at 10 a.m., 12 p.m., and 2 p.m. It records the number of cigarettes smoked by each individual. This is an example of:

an observational study. WHY: Studies of smokers and nonsmokers are observational studies because researchers do not determine or control who will smoke and who will not smoke

1.7 Which of the following is used for data-driven decision making?

analytics WHY: Analytics is the scientific process of transforming data into insights for making better decisions and thus is used for data-driven decision making

1.2 Some hotels ask their guests to rate the hotel's services as excellent, very good, good, and poor. This is an example of the:

ordinal scale. WHY: Ordinal data is data in which the order or rank is meaningful.

1.3 The largest experimental statistical study ever conducted is believed to be for:

polio WHY: 1954 Public Health Service experiment for the Salk polio vaccine. Nearly 2 million children in grades 1, 2, and 3 were selected from throughout the United States

1.7 A group of employees at a major grocery chain is tasked with analyzing point of sale data to predict the success of a marketing campaign they are considering launching. This is an example of:

predictive analysis. WHY: Predictive analysis uses models constructed from past data to predict the future.

1.8 In data mining, statistical models play an important role in developing _____.

predictive models

1.3 Anyone who wants to use the data and statistical analysis as aids to decision making must be aware of the time and cost issues. If important data are not readily available, it would be best to:

use a cross-sectional data set. WHY: This option allows for data to be taken in a short window of time. This would also be best in terms of cost, and advantageous because existing data could be used.

1.7 The main difference between prescriptive analysis and descriptive/predictive analysis is that prescriptive analysis:

yields a course of action. WHY: Prescriptive analysis is the only category of analytics of the three that yields a course of action.

1.2 Examples of quantitative data include all of the following except:

your zip code. WHY: Although a zip code is a number, it identifies a location and is therefore categorical.

