BUAL Exam 1

Secondary Sources

data collected by existing sources

Time Series Data

data collected over different time periods

Experimental and observational studies

data we collect ourselves for a specific purpose

Existing sources

data already gathered by public or private sources

Cross-Sectional Data

data collected at the same or approximately the same point in time

Descriptive Statistics

"describe", the science of describing the important aspects of a set of measurements

Calculating Class Length

(largest number - smallest number)/number of classes
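
As a rough illustration of the class length formula, the sketch below applies it to the weekly sales-call data from a question later in this set, assuming 5 classes. The function name `class_length` is made up for this example. Many textbooks round the result up to a convenient value, but that step is not shown here.

```python
def class_length(values, num_classes):
    """(largest number - smallest number) / number of classes."""
    return (max(values) - min(values)) / num_classes

# Weekly sales calls for 25 salespersons (data from a later card).
sales_calls = [24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46,
               31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, 50]

# (56 - 24) / 5 classes
length = class_length(sales_calls, 5)
print(length)
```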

z score equation

(x-mean)/standard deviation
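
A minimal sketch of the z-score equation, checked against the two IQ questions in this set (scores of 92 and 118, mean 100, standard deviation 15). The function name `z_score` is only illustrative.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

z_92 = z_score(92, 100, 15)    # below the mean, so negative
z_118 = z_score(118, 100, 15)  # above the mean, so positive

print(round(z_92, 2), round(z_118, 2))
```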

Find the z-score for an IQ test score of 92 when the mean is 100 and the standard deviation is 15. .53 .77 −.77 −.53 −8.00

-.53

What are the differences between a histogram and a bar graph?

-bars on the histogram touch to represent continuous data while the bars do not touch on a bar graph -one axis of a bar graph is categories (qualitative) where the axis on the histogram is quantitative data grouped into classes -frequencies on a bar graph represent the counts from categories (qualitative data) while frequencies on histograms present counts of quantitative data values grouped into classes

Measures of Variation

-knowing the measures of central tendency is not enough -range, variance, standard deviation

The following is a partial relative frequency distribution of grades in an introductory statistics course. grade and relative frequency: A (.22) B (?) C (.18) D (.17) F (.06) Find the relative frequency for the B grade.

.37
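
The answer follows from the fact that relative frequencies must add up to 1. A short sketch of the arithmetic, with variable names chosen only for illustration:

```python
# Known relative frequencies from the question; B is the missing one.
known = {"A": 0.22, "C": 0.18, "D": 0.17, "F": 0.06}

# All relative frequencies sum to 1, so B is whatever is left over.
rel_freq_b = round(1 - sum(known.values()), 2)
print(rel_freq_b)
```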

Example of interval variable not being "0"

0 degrees means cold, not no heat

Find the z-score for an IQ test score of 118 when the mean is 100 and the standard deviation is 15. 1.2 1.0 18.0 −1.03 −1.2

1.2

In a statistics class, 10 scores were randomly selected with the following results (mean = 71.5): 74, 73, 77, 77, 71, 68, 65, 77, 67, 66. What is the range? 22.72 12.00 4.77 516.20 144.00

12.00

The following is a relative frequency distribution of grades in an introductory statistics course. Grade and Relative Frequency: A (.22) B (.37) C (.18) D (.17) F (.06) If we wish to depict these data using a pie chart, find how many degrees (out of 360 degrees) should be assigned to grade B 133.2 degrees 37 degrees 79.2 degrees 140 degrees

133.2 degrees
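
Each slice of a pie chart takes its relative frequency times 360 degrees. The sketch below works that out for every grade in the question; the dictionary names are illustrative.

```python
rel_freq = {"A": 0.22, "B": 0.37, "C": 0.18, "D": 0.17, "F": 0.06}

# Slice angle = relative frequency * 360 degrees.
degrees = {grade: round(rf * 360, 1) for grade, rf in rel_freq.items()}

print(degrees["B"])           # angle for grade B
print(sum(degrees.values()))  # the slices together fill the circle
```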

Which percentile describes the first quartile, Q1? 25th 50th 75th 100th

25th

A normal population has 99.73 percent of the population measurements within ________ standard deviation(s) of the mean. 1 2 3 4

3

The following is a relative frequency distribution of grades in an introductory statistics course. Grade and Relative Frequency: A (.22) B (.37) C (.18) D (.17) F (.06) If this was the distribution of 200 students, give the frequency distribution for grade A. 44 22 200 22

44
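
Converting a relative frequency back to a frequency is just multiplication by the total count. A small sketch for all five grades, assuming the 200 students stated in the question:

```python
rel_freq = {"A": 0.22, "B": 0.37, "C": 0.18, "D": 0.17, "F": 0.06}
total_students = 200

# Frequency = relative frequency * total number of observations.
counts = {grade: round(rf * total_students) for grade, rf in rel_freq.items()}

print(counts["A"])
print(sum(counts.values()))  # should equal the total number of students
```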

The number of weekly sales calls by a sample of 25 pharmaceutical salespersons is below. 24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46, 31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, 50 How many classes should be used in the construction of a histogram? 4 6 10 5 2

5

In a statistics class, the following 10 scores were randomly selected: 74, 73, 77, 77, 71, 68, 65, 77, 67, 66. What is the mean? 71.5 72.0 77.0 71.0

71.5

In a statistics class, the following 10 scores were randomly selected: 74, 73, 77, 77, 71, 68, 65, 77, 67, 66. What is the median? 71.5 72.0 77.0 71.0

72

Which percentile describes the third quartile, Q3? 25th 50th 75th 100th

75th

In a statistics class, the following 10 scores were randomly selected: 74, 73, 77, 77, 71, 68, 65, 77, 67, 66. What is the mode? 71.5 72.0 77.0 71.0

77
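
The mean, median, mode, and range questions for these 10 scores can be double-checked with Python's standard `statistics` module. This is only a verification sketch, not part of the original cards.

```python
import statistics

scores = [74, 73, 77, 77, 71, 68, 65, 77, 67, 66]

mean = statistics.mean(scores)           # sum of the scores / number of scores
median = statistics.median(scores)       # average of the two middle values when sorted
mode = statistics.mode(scores)           # most frequent value
score_range = max(scores) - min(scores)  # largest minus smallest

print(mean, median, mode, score_range)
```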

If there are 130 values in a data set, how many classes should be created for a frequency histogram? 4 5 6 7 8

8

Which of the following is a type of question used in survey research? dichotomous open-ended multiple-choice All of the other answers are correct

All of the other answers are correct

In ________ we select elements because they are easy to sample. random sampling probability sampling convenience sampling judgment sampling

Convenience sampling

A(n) ________ is a graphical presentation of the current status and historical trends of a business's key performance indicators. frequency distribution histogram Pareto chart Dashboard

Dashboard

Which of the following are quantitative variables? Nominative Ordinal Interval Ratio

Interval and ratio

Describing central tendency

a measure of central tendency represents the center or middle of the data

_______ uses traditional or newer graphics to present visual summaries of business information. Nonparametric predictive analytic Parametric predictive analytics Prescriptive analytics Graphical descriptive analytics

Graphical descriptive analytics

Which variables are qualitative? Nominative Ordinal Interval Ratio

Nominative and Ordinal

Primary sources

data collected by an individual or business directly through planned experimentation

Notation for the size of the population and the sample

Population: N Sample: n

_______ sampling is where we know the chance that each element will be included in the sample, which allows us to make statistical inferences about the sample population. Convenience Voluntary Probability Judgment

Probability

Equation for variance

$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
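
A minimal sketch of the population variance formula above, treating the 10 statistics scores used elsewhere in this set as if they were an entire population (an assumption made only for illustration). The standard deviation is the square root of the result.

```python
import math

def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

scores = [74, 73, 77, 77, 71, 68, 65, 77, 67, 66]

variance = population_variance(scores)
std_dev = math.sqrt(variance)  # standard deviation = square root of the variance

print(round(variance, 2), round(std_dev, 2))
```

Note that a sample variance would divide by n - 1 instead of N, which gives a slightly larger value for the same data.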

Examples of descriptive statistics

Salaries: high, low, mean, median, graph

________ is the difference between a numerical descriptor of the population and the corresponding descriptor of the sample. Nonresponse Sampling error Observation error Non observation error

Sampling error

You want to select a simple random sample of 100 employees of Company X. You assign a number to every employee in the company database from 1 to 1000 and use a random number generator to select 100 numbers. Simple random sample Cluster random sample Stratified random sample Analytics random sample

Simple random sample
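
The procedure in this question can be sketched with Python's `random` module. The seed and variable names are arbitrary choices for the example.

```python
import random

rng = random.Random(0)          # fixed seed so the example is repeatable
employee_ids = range(1, 1001)   # employees numbered 1 to 1000

# Draw 100 distinct IDs; every employee has the same chance of selection.
srs = rng.sample(employee_ids, 100)

print(len(srs), len(set(srs)))
```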

The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people. What type of sampling is this?

Stratified sampling
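
A rough sketch of the stratified procedure described in the question: split the population into strata, then take a simple random sample of the same fraction (10 percent here) from each one. The employee labels and the helper name `stratified_sample` are invented for the example.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Randomly sample the same fraction from every stratum."""
    return {
        name: rng.sample(members, round(len(members) * fraction))
        for name, members in strata.items()
    }

# Hypothetical employee labels: 800 women and 200 men.
strata = {
    "female": [f"F{i}" for i in range(800)],
    "male": [f"M{i}" for i in range(200)],
}

stratified = stratified_sample(strata, 0.10, random.Random(0))
print({name: len(chosen) for name, chosen in stratified.items()})
```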

All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people Cluster random sample Stratified random sample Analytics random sample Systematic sampling

Systematic sampling
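
Systematic sampling as described in the question can be sketched with list slicing: pick a random starting point among the first 10 people, then take every 10th person after that. The function name is illustrative.

```python
import random

def systematic_sample(population, step, rng):
    """Random start within the first `step` elements, then every `step`-th element."""
    start = rng.randrange(step)
    return population[start::step]

employees = list(range(1, 1001))  # positions in the alphabetical list
systematic = systematic_sample(employees, 10, random.Random(0))

print(len(systematic), systematic[:4])
```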

Skewed to the right

The right tail of the histogram is longer than the left tail

Ogive

a graph of a cumulative distribution -plot a point above each upper class boundary at a height equal to the cumulative frequency -connect the points with line segments -can also be drawn using cumulative relative frequencies or cumulative percent frequencies

Examples of factors

Will you get accepted into Auburn? the things that are taken into consideration are GPA, HS, rank, test scores, extracurricular activities

According to a survey of the top 10 employers in a major city in the Midwest, a worker spends an average of 413 minutes a day on the job. Suppose the standard deviation is 26.8 minutes, and the time spent is approximately a normal distribution. What are the times within which approximately 68.26 percent of all workers will fall? [394.8, 431.2] [386.2, 439.8] [372.8, 453.2] [359.4, 466.6] [332.6, 493.4]

[386.2, 439.8]
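
The answer is the mean plus or minus one standard deviation. The sketch below computes the one, two, and three standard deviation intervals for the figures given in the question; the function name is illustrative.

```python
def empirical_intervals(mean, std_dev):
    """Intervals of mean +/- k standard deviations for k = 1, 2, 3."""
    return {
        k: (round(mean - k * std_dev, 1), round(mean + k * std_dev, 1))
        for k in (1, 2, 3)
    }

intervals = empirical_intervals(413, 26.8)

print(intervals[1])  # about 68.26 percent of workers
print(intervals[2])  # about 95.44 percent
print(intervals[3])  # about 99.73 percent
```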

Example of judgment sampling

a class and only looking at student doctors

frequency distribution

a list of data classes with the count of values that belong to each class "classify and count" table

population parameter

a number calculated from all the population measurements that describes some aspect of the population -standard deviation -variance -population mean

Sample statistic

a number calculated using the sample measurements that describes some aspect of the sample

Finite population

a population of limited size

Data Warehousing

a process of centralized data management and retrieval (its objective is the creation and maintenance of a central repository for all of an organization's data)

Ordinal Variables

a qualitative variable for which there is meaningful ordering, or ranking, of the categories

Nominative Variables

a qualitative variable for which there is no meaningful ordering, or ranking, of the categories

Process

a sequence of operations that takes inputs and turns them into outputs

Population

a set of all elements about which we wish to draw conclusions, it is everything of a group

Sample

a subset of the elements of a population

Contingency table

a table consisting of rows and columns that is used to classify data on two dimensions

measurement

a way to assign a value to an element

How to calculate mean

add up all the values, then divide by the number of values you added

Example of population

all netflix customers, all amazon customers, all mastercard customers

Interval Variable

all of the characteristics of ordinal plus -measurements are on a numerical scale with an arbitrary zero point -the zero is assigned: it is nonphysical and not meaningful -zero does not mean the absence of the quantity that we are trying to measure

Ratio Variable

all the characteristics of interval plus -measurements are on a numerical scale with a meaningful zero point (zero means none or nothing) -values can be compared by their intervals and ratios -in business and finance, most quantitative variables are ratio variables, such as anything related to money

Multiple choice questions

allow more than two responses, usually analyzed with averages

Census

an examination of all the population measurements, 100%

A measurement located outside the upper limits of a box-and-whiskers display is ________. always in the first quartile an outlier always the largest value in the dataset within the lower limits

an outlier

Example of sample

analytics students in the business school

variable

any characteristic of an element

Variance

average of the squared deviations of the individual measurements from the mean

Example of primary sources

banks, credit cards, etc

The general term for a graphical display of categorical data made up of vertical or horizontal bars is called a(n) ________. pie chart Pareto chart bar chart ogive plot

bar chart

Which of the following graphs is for qualitative data? histogram bar chart ogive plot stem-and-leaf

bar chart

Pareto Chart

bar chart having different kinds of defects listed on the horizontal scale -bar height represents frequency -arranged in decreasing height from left to right

A ________ displays the frequency of each class with qualitative data and a ________ displays the frequency of each class with quantitative data. histogram, stem-and-leaf display bar chart, histogram scatter plot, bar chart stem-and-leaf, pie chart

bar chart, histogram

Qualitative data is typically in the form of a

bar graph, pie chart, or pareto chart

What kinds of visualizations are associated with qualitative data?

bar graphs, pie charts

Improper sampling

biased, convenience, voluntary, and judgment

Transactional data are now used by businesses as part of experimental studies survey analysis big data descriptive statistics

big data

Two types of modes

bimodal, multimodal

As a business owner, I have requested my staff to develop a set of dashboards that can be used by the public to show wait time at each of my four local coffee shops at peak times during the day and whether the time is short, medium, or long. Which of the following graphical displays would be the best choice? bullet graph sparkline treemap gauges

bullet graph

Web surveys

cheaper still, same problems as mail surveys

Pie chart

circle divided into slices where the size of each slice represents its relative frequency and or percent frequency

Dichotomous questions

clearly stated, easy to answer, easy to analyze, limited information

A quantity that measures the variation of a population or a sample relative to its mean is called the ________. range standard deviation coefficient of variation variance

coefficient of variation
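
The card does not state the formula. The usual definition is the standard deviation divided by the mean, expressed as a percentage. A minimal sketch under that assumption, using the job-time figures from another question in this set (mean 413 minutes, standard deviation 26.8 minutes):

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean (assumed standard definition)."""
    return std_dev / mean * 100

cv = coefficient_of_variation(26.8, 413)
print(round(cv, 2))  # percent
```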

A quality control worker at a factory selects the first 10 items she sees as her sample for the day. What is this an example of?

convenience sampling

A restaurant leaves comment cards on all of its tables and encourages customers to participate in a brief survey to learn about their overall experience. What is this an example of?

voluntary response sampling (customers self-select by choosing whether to fill out a card)

In ________ we select elements because they are easy to sample. random sampling convenience sampling judgment sampling probability sampling

convenience sampling

Which of the following is a measure of the strength of the linear relationship between x and y that is dependent on the units in which x and y are measured? covariance correlation coefficient slope least squares line

covariance
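
To see why covariance depends on units while the correlation coefficient does not, the sketch below computes both for a small made-up data set, then again after converting x from meters to centimeters. The data and function names are hypothetical.

```python
import math

def sample_covariance(xs, ys):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(xs, ys) / math.sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys)
    )

x_m = [1, 2, 3, 4, 5]           # made-up x values in meters
y = [2, 4, 5, 4, 5]             # made-up y values
x_cm = [100 * v for v in x_m]   # same x values in centimeters

cov_m, cov_cm = sample_covariance(x_m, y), sample_covariance(x_cm, y)
r_m, r_cm = correlation(x_m, y), correlation(x_cm, y)

print(cov_m, cov_cm)                 # covariance changes with the units
print(round(r_m, 4), round(r_cm, 4)) # correlation stays the same
```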

A Yes or No question is ________. systematic evaluative dichotomous open-ended

dichotomous

A stem-and-leaf display is best used to ________. provide a point estimate of the variability of the data set provide a point estimate of the central tendency of the data set display the shape of the distribution display a two-variable treemap.

display the shape of the distribution

Multistage cluster sampling

divide population into clusters and then randomly select clusters to sample

Stratified random sample

divide population into non-overlapping groups (strata) then select a random sample from each stratum

Examples of quantitative measurements

dollar amount, miles, gallons, feet, percentages, etc, selling price of a home, payment of bill, how many apples did you buy

Random Sample

equal chance of getting selected

Example of big data

facebook: pictures, videos, text, like and dislike

Which of the following is not a supervised learning technique in predictive analytics? linear regression factor analysis decision trees neural networks

factor analysis

Definition of data

facts and figures from which conclusions can be drawn

How to calculate mode

find the number that appears the most

A population that consists of all the customers who will use the drive-thru of the local fast food restaurant is called a(n) ________. infinite population random sample population statistical population

finite population (because you can count how many customers came to the drive-thru)

Examples of nominative variables

gender, car color

Stem and leaf display

graphical portrayal of a data set that shows the data set distributions by using stems consisting of leading digits and leaves consisting of trailing digits

Histogram

graphically displays a frequency distribution, relative frequency distribution, or percent frequency distribution. It divides the measurements into classes and graphs the frequency, relative frequency, or percent for each class

Which of the following divides quantitative measurements into classes and graphs the frequency, relative frequency, or percentage frequency for each class? histogram dot plot stem-and-leaf display scatter plot

histogram

What kinds of visualizations are associated with quantitative data?

histograms, dot plots, line graph, scatterplots

The empirical rule for normal populations

if a population has a mean and standard deviation and is described by a normal curve, then -68.26% of the population measurements lie within one standard deviation of the mean -95.44% lie within two standard deviations of the mean -99.73% lie within three standard deviations of the mean

Multimodal

if there are more than two modes

Bimodal

if there are two modes

As the coefficient of variation ________, risk ________. increases, decreases decreases, increases increases, increases remains constant, increases

increases, increases

Phone surveys

inexpensive, low response rate

Mail surveys

inexpensive, low response rates (20-30 percent) Requires multiple mailings

Population mean

is the average of the population measurements

Range

largest measurement minus the smallest measurement

Sample frame

list from which the sample was selected

Systematic sampling

list population, select at a random starting point, sample each "nth" element

Prescriptive analytics

looks at variables and constraints, along with predictions from predictive analytics, to recommend courses of action

Big Data

massive amount of data, often collected in real time in different forms, sometimes needing quick analysis

examples of measures of central tendency

mean, median, mode

If a population distribution is skewed to the right, then, given a random sample from that population, one would expect that the ________. median would be greater than the mean mode would be equal to the mean median would be less than the mean median would be equal to the mean

median would be less than the mean

Sampling designs

methods for obtaining a sample

Predictive analytics

methods used to find anomalies, patterns, and associations in data sets to predict future outcomes

Which of the following is a quantitative variable? a person's gender the manufacturer of a cell phone mileage of a car whether a person is a college graduate

mileage of a car

Which of the following is a quantitative variable? the manufacturer of a cell phone a person's gender mileage of a car whether a person is a college graduate whether a person has a charge account

mileage of a car

Personal interviews

more expensive, more control, and higher response rate

Open-ended questions

yield the most honest and complete information, but cannot be readily summarized

Number of classes equation

choose the smallest whole number k such that 2^k > n, where n = the number of values in the data set
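
A short sketch of the 2^k > n rule, checked against the two histogram questions in this set (25 values and 130 values). The function name is illustrative.

```python
def number_of_classes(n):
    """Smallest whole number k with 2**k greater than n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

print(number_of_classes(25))   # 2**5 = 32 is the first power of 2 above 25
print(number_of_classes(130))  # 2**8 = 256 is the first power of 2 above 130
```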

When developing a frequency distribution, the class (group) intervals must be ________. large small integer nonoverlapping

nonoverlapping

Examples of infinite population

numbers of stars in the sky, number of red blood cells in human body

Observational study

observes individuals and measures variables of interest but does not attempt to influence the responses

A(n) ________ is a graph of a cumulative distribution. histogram scatter plot ogive pie chart

ogive

An identification of police officers by rank would represent a(n) ________ level of measurement. nominative ordinal interval ratio

ordinal

Factors

other variables related to the response variable

A(n) ________ can be used to differentiate the "vital few" causes of quality problems from the "trivial many" causes of quality problems. histogram scatter plot pareto chart ogive plot stem-and-leaf display

pareto chart

Examples of qualitative measurements

phone number, zip code, social security number, address

Types of surveys

phone, mail, web, personal interviews

All of the following are used to describe quantitative data except the ________. histogram stem-and-leaf chart dot plot pie chart

pie chart

Data that are collected by an individual through personally planned experimentation or observation are ________. secondary data quantitative data primary data variables

primary data

One method of being sure a sample being studied can be used to make statistical inferences about the population is to select a ________. convenience sample voluntary response sample judgment sample probability sample

probability sample

A sequence of operations that takes inputs and turns them into outputs is a ________. statistical inference random sampling process runs plot

process

Cross tabulation

process that classifies data into two dimensions

Dashboard

provides a graphical presentation of the current status and historical trends of key performance indicators

How to calculate median

put all the numbers in numerical order, find the middle number

what are two types of measurements?

qualitative and quantitative

All of the following are measures of central tendency except the ________. range mode mean median

range

Cumulative Distribution

rather than the count for each class, we record the number of measurements that are less than the upper boundary of that class (a "running total")
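
As an illustration of the running total, the sketch below builds a cumulative distribution for the weekly sales-call data from another card. The five classes of width 7 starting at 24 are an assumption made for this example, not something stated in the cards.

```python
sales_calls = [24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46,
               31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, 50]

# Assumed upper class boundaries: classes 24-<31, 31-<38, 38-<45, 45-<52, 52-<59.
upper_boundaries = [31, 38, 45, 52, 59]

# Cumulative frequency = number of measurements below each upper boundary.
cumulative = [sum(1 for v in sales_calls if v < ub) for ub in upper_boundaries]

print(cumulative)
```

Plotting these cumulative frequencies above their upper class boundaries and connecting the points gives the ogive.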

The ________ is the positive square root of the sample variance. sample mean sample standard deviation range median

sample standard deviation

Judgment sampling

samples in which a person who is extremely knowledgeable about the population selects the population elements he or she feels are most representative

Voluntary response sampling

samples in which participants self-select -frequently used by radio and television -overrepresents people with strong opinions

Probability Sampling

sampling where we know the chance that each element in the population will be included in the sample -required for statistical inference -random sample

Convenience sampling

sampling where we select elements because they are convenient to sample -easy and convenient -not a probability sample

A ________ shows the relationship between two variables. stem-and-leaf bar chart histogram scatter plot pie chart

scatter plot

Which of the following graphical tools is not used to study the shapes of distributions? stem-and-leaf display scatter plot histogram Bar graph

scatter plot

If the mean is greater than the median, then the relative frequency curve is most likely to be ________. skewed right skewed left symmetrical bimodal

skewed right

mode<median<mean

skewed to the right

mode>median>mean

skewed to the left

Mean=30.25 Median=31 Mode=32 How is this skewed?

skewed to the left mean<median<mode

A relative frequency histogram having a longer tail to the right than to the left is said to be ________. skewed to the left normal a scatter plot skewed to the right

skewed to the right

The number of weekly sales calls by a sample of 25 pharmaceutical salespersons is below. 24, 56, 43, 35, 37, 27, 29, 44, 34, 28, 33, 28, 46, 31, 38, 41, 48, 38, 27, 29, 37, 33, 31, 40, 50 What is the shape of the distribution of the data? skewed to the right skewed to the left normal bimodal

skewed to the right

In the least squares line, ________ is defined as rise/run. correlation coefficient predicted value of y y-intercept slope

slope

Nonresponse

some of the individuals who were supposed to be included in the sample are not

standard deviation equation

square root of the variance

Linear scatterplots

straight line relationship between two variables

Nonoverlapping groups of similar elements in a population are called ________. clusters frames strata stages

strata

Alternatives to random sampling

stratified random sample, multistage cluster sampling, systematic sampling

Example of a finite population

students in a class, number of cars in parking lot, number of births per year

Relative frequency

summarizes proportion of items in each class

Frequency distribution table

summarizes the number of items in each of the several non overlapping classes

Mean=median=mode

symmetrical distribution

Examples of ordinal variables

teaching effectiveness

Example of response variable

the "y" in the example y=5x+6x

Data set

the data that are collected for a particular study

Sampling error

the difference between a numerical descriptor of the population and the corresponding descriptor of the sample

Target population

the entire population of interest

Examples of existing sources

the internet, library, us government, data collection agency

Skewed to the left

the left tail of the histogram is longer than the right tail

Qualitative Measurement

the possible measurements fall into several categories; they describe rather than count, cannot be meaningfully analyzed with arithmetic, and can be numerical or non-numerical

quantitative measurement

the possible measurements of the values of a variable are numbers that represent quantities

Experimental Study

the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables

Symmetrical

the right and left tails of the histogram appear to be mirror images of each other

sample survey

the sample we take

statistical inference

the science of using a sample of measurements to make generalizations about the important aspects of a population of measurements, "drawing conclusions"

Composite score

the total number of scores added up to get another given number

Data mining

the use of predictive analytics, algorithms, and IS techniques to extract useful knowledge from huge amounts of data

Descriptive analytics

the use of traditional and newer graphics to present easy-to-understand visual summaries of up-to-the-minute data

Business Analytics

the use of traditional and newly developed statistical methods, advances in IS, and techniques from management science to explore and investigate past performance

No linear relationship

there is no coordinated linear movement between the two variables

The purpose of stem and leaf display

to see the overall pattern of the data by grouping it into classes; best for small to moderately sized data distributions

Scatterplots

used to study the relationship between two variables -place one variable on the x axis -place a second variable on the y axis -place a dot at each pair of coordinates

Response variable

variable of interest, it is the dependent variable

Bar chart

vertical or horizontal rectangle represents frequency of each category

Supervised learning

we observe values of a response variable and corresponding predictor variables 1. linear regression 2. logistic regression 3. neural networks 4. decision trees

Unsupervised learning

we observe values of variables but not a response variable 1. cluster analysis 2. factor analysis 3. association rules

Errors of observation

when data values are recorded incorrectly

Recording error

when either the respondent or interviewer incorrectly marks an answer

Negative linear scatterplots

when one variable goes up the other variable goes down

Positive linear scatterplots

when one variable goes up, the other variable goes up

Response bias

when respondents do not tell the truth (also occurs when biased questions are used)

Undercoverage

when some population elements are excluded from the process of selecting the sample

Selection bias

when the opinions of those who complete a survey vary dramatically from those who do not

