DSC 210 Final Exam

In the textile industry, a manufacturer is interested in the number of blemishes or flaws occurring in each 100 feet of material. The probability distribution that has the greatest chance of applying to this situation is the _____.

Poisson distribution

If one wanted to find the probability of 10 customer arrivals in an hour at a service station, one would generally use the _____.

Poisson probability distribution

It is desired to take a random sample of individuals who are frequent fliers on Delta Airlines. Frequent fliers are classified based upon their frequent flier status (50% silver level, 30% blue level, 20% red level) and a simple random sample reflecting these percentages is taken from each of these groups. Which type of sampling method has been employed?

Stratified Random Sampling

T/F: The level of significance α (alpha) is the probability that the confidence interval does not contain the value of the parameter being estimated.

True

T/F: The median is used in creating a boxplot.

True

T/F: The number of defective items in a shipment is a discrete random variable.

True

T/F: The sample mean provides a point estimate for the population mean.

True

T/F: The symbol for the sample variance is s^2

True

Find z-score/value of x corresponding to a percentile: excel

=NORM.INV(percentile, mean, stdev) - for a z-score, use mean = 0 and stdev = 1
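
A rough Python counterpart, offered only as a sketch: the standard library's `statistics.NormalDist.inv_cdf` plays the same role as NORM.INV. The helper name `value_at_percentile` and the mean of 100 and stdev of 15 are made up for illustration.

```python
from statistics import NormalDist

def value_at_percentile(percentile, mean=0.0, stdev=1.0):
    """Return the x-value that has `percentile` (as a decimal) to its left."""
    return NormalDist(mu=mean, sigma=stdev).inv_cdf(percentile)

# With the defaults (mean 0, stdev 1) the result is a z-score.
z_95 = value_at_percentile(0.95)

# Hypothetical population: mean 100, stdev 15.
x_95 = value_at_percentile(0.95, 100, 15)
```

The second result is just the first one rescaled: x = mean + z * stdev.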

Percentiles: excel

=PERCENTILE.EXC(data set, percentile as decimal)

Standard Deviation: excel

=STDEV.S(number 1, [number 2], ...)

Sample variance: excel

=VAR.S(number 1, [number 2], ...)
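
For comparison, a small Python sketch of the same two sample statistics. `statistics.variance` and `statistics.stdev` both divide by n - 1, as STDEV.S and VAR.S do. The data values are invented.

```python
import statistics

data = [4, 8, 6, 5, 3]  # hypothetical sample

# Sample variance s^2: squared deviations from the mean, divided by n - 1.
sample_variance = statistics.variance(data)

# Sample standard deviation s: the square root of the sample variance.
sample_stdev = statistics.stdev(data)
```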

Events A and B are mutually exclusive. Which of the following statements is also true?

P(A ∪ B) = P(A) + P(B)

Which of the following is a point estimator? A. σ B. t C. s D. p

C. s

T/F: A histogram is appropriate to be used for categorical data.

False

T/F: A representative sample means that you select every member of the population to be used in your sample.

False

T/F: Every confidence interval constructed will contain the value of the parameter being estimated.

False

T/F: Highway patrol officers measure the speed of automobiles on a highway using radar equipment. The random variable in this experiment is speed, measured in miles per hour. This random variable is a discrete random variable.

False

T/F: In combinations, the order of selection is important.

False

T/F: Numerical calculations are always appropriate for nominal data.

False

T/F: The Central Limit Theorem can only be used when the population being sampled is normally distributed.

False

T/F: The RAND() function in Excel helps in calculating the margin of error for a confidence interval.

False

T/F: The binomial probability distribution is appropriate to use for "without replacement" type problems.

False

T/F: The correlation coefficient can never be negative.

False

T/F: The exponential distribution is a discrete probability distribution.

False

T/F: The exponential probability distribution is a symmetric distribution.

False

T/F: The mean is a better measure of center than the median if there is an outlier in the data set.

False

T/F: The mean of a standard normal random variable is one.

False

T/F: The standard deviation is used in creating a boxplot.

False

The weight of items produced by a machine is normally distributed with a mean of 8 ounces and a standard deviation of 2 ounces. What is the probability that a randomly selected item weighs exactly 8 ounces?

0

The assembly time for a product is uniformly distributed between 6 and 10 minutes. The probability of assembling the product in less than 8 minutes is ___________.

0.50
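
The 0.50 comes from the ratio of interval lengths, (8 - 6) / (10 - 6). A minimal Python sketch of that calculation, with a hypothetical helper name:

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for a continuous uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Assembly time is uniform between 6 and 10 minutes.
p_less_than_8 = uniform_cdf(8, 6, 10)
```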

When the population has a normal distribution, the sampling distribution of x-bar is normally distributed _____.

for any sample size

The expected value of a random variable is the _____.

mean value

After the data have been arranged from smallest value to largest value, the value in the middle is called the _____.

median

Which of the following is not a graph that is used for quantitative data?

pie chart

Excel's __________ can be used to construct a frequency distribution for categorical data.

pivot table tool

A single numerical value used as an estimate of a population parameter is known as a _____.

point estimate

The general form of an interval estimate of a population mean is the _____ plus or minus the _____.

point estimate, margin of error

The key difference between binomial and hypergeometric distributions is that with the hypergeometric distribution the _____.

probability of success changes from trial to trial

The interquartile range is used as a measure of variability to overcome what difficulty of the range?

the range is influenced too much by extreme values

Quarterly sales data is an example of what type of data?

time-series

Which of the following descriptive statistics is NOT measured in the same units as the data?

variance

Which of the following is NOT a probability-based sampling method?

volunteer sampling

If we change a 95% confidence interval estimate to a 99% confidence interval estimate, we can expect the _____.

width of the confidence interval to increase

Sample mean

x-bar

The _____ denotes the number of standard deviations a data value is away from the mean of the data set.

z-score

In a post office, the mailboxes are numbered from 1 to 5,000. These numbers represent ______.

categorical data

Poisson Probability Distribution: excel

= POISSON.DIST(x, mu, T/F)
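
To see what the formula computes, here is a sketch in Python that mimics it with the standard library. The function name `poisson_dist` is hypothetical, and the mean of 10 arrivals per hour is an assumed value.

```python
import math

def poisson_dist(x, mu, cumulative):
    """Mimic POISSON.DIST: P(X = x) if cumulative is False, else P(X <= x)."""
    def pmf(k):
        return math.exp(-mu) * mu ** k / math.factorial(k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

# Assumed mean of 10 arrivals per hour.
p_exactly_10 = poisson_dist(10, 10, False)
p_at_most_10 = poisson_dist(10, 10, True)
```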

Standard Error of x-bar: excel

= stdev/SQRT(random sample n)

Binomial Distribution: excel

=BINOM.DIST(x, n, p, TRUE or FALSE)
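
A Python sketch of the same calculation, assuming n = 5 trials and p = 0.2 for illustration. The function name `binom_dist` is made up, and `math.comb` supplies the number of combinations.

```python
import math

def binom_dist(x, n, p, cumulative):
    """Mimic BINOM.DIST: P(X = x) if cumulative is False, else P(X <= x)."""
    def pmf(k):
        return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

p_exactly_1 = binom_dist(1, 5, 0.2, False)  # FALSE: exactly 1 success
p_at_most_1 = binom_dist(1, 5, 0.2, True)   # TRUE: 0 or 1 successes
```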

Sampling distribution of p-bar: excel

=NORM.DIST(p-bar value, p, SQRT(p*(1-p)/n), TRUE)

T/F: The Poisson random variable is an example of a discrete random variable.

True

T/F: The area under the curve for a normal distribution must always equal 1.

True

positively skewed distribution

- skewed to the right (long tail on the right)

negatively skewed distribution

- skewed to the left (long tail on the left)

elements

- the entities on which data are collected - ex: persons, mutual funds, companies, products

observational study

- the researcher has no control over the variables and merely records the data - ex: surveys

population

- the set of all elements of interest in a particular study

Sampling Distribution of the sample proportion (p-bar)

- 0 <= p-bar <= 1 - mean: the mean of all possible p-bar values equals the population proportion p - stdev: sigma of p-bar = square root of (p(1-p)/n) (a.k.a. standard error)

Process for computation of Binomial Probability Distributions

- 1) define x - 2) specify the probability distribution of x --> ex: x~Binomial(n = _____, p = ______) - 3) write the probability statement in terms of values --> ex: P(x<= #)

Conditions for using the normal distribution of p-bar

- 1) n*p >= 5 - 2) n(1-p) >= 5
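
As a sketch of how the two conditions and the standard error of p-bar fit together, the Python below checks the conditions and then builds the approximating normal distribution. The helper name and the values n = 100, p = 0.4 are assumptions for illustration.

```python
import math
from statistics import NormalDist

def p_bar_distribution(n, p):
    """Normal approximation to the sampling distribution of p-bar."""
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("need n*p >= 5 and n*(1-p) >= 5")
    standard_error = math.sqrt(p * (1 - p) / n)
    return NormalDist(mu=p, sigma=standard_error)

# Hypothetical case: n = 100, population proportion p = 0.4.
p_bar = p_bar_distribution(100, 0.4)
prob_at_most_45_percent = p_bar.cdf(0.45)  # P(p-bar <= 0.45)
```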

Normal Probability Distribution: probability that x is greater than a number

- = 1 - NORM.DIST(x-value, mu, sigma, TRUE)
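
The same complement rule, sketched in Python with `statistics.NormalDist`. The mean of 8 and stdev of 2 are borrowed from the item-weight question in this set; the cutoff of 10 is made up.

```python
from statistics import NormalDist

weight = NormalDist(mu=8, sigma=2)

# cdf gives the area to the left, so the area to the right is 1 minus it.
p_greater_than_10 = 1 - weight.cdf(10)
```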

Binomial distribution - "at least"

- = 1 - BINOM.DIST(x - 1, n, p, TRUE), where x is the "at least" number - the cumulative probability statement uses one less than the "at least" number: P(X >= x) = 1 - P(X <= x - 1)
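
A short Python sketch of the "one less" idea, assuming n = 5, p = 0.2 and "at least 2" as made-up inputs. The function name is hypothetical.

```python
import math

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    # range(k) runs over 0, 1, ..., k - 1: one less than the "at least" number.
    p_at_most_k_minus_1 = sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k)
    )
    return 1 - p_at_most_k_minus_1

p_at_least_2 = prob_at_least(2, 5, 0.2)
```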

What value of z puts # probability? (use excel)

- =NORM.S.INV(probability to the left of the z-value) - returns the z-value with given probability to its left - interpret: P(z < #)

Excel: TRUE vs FALSE?

- FALSE when P(x=x) - TRUE when P(x<=x)

situations in which ethical issues in statistics may arise

- use of inappropriate statistics to summarize data - biased interpretation of the statistical results - use of misleading graphs - poor sampling methods (not a representative sample or fishing for supportive data)

variables

- a characteristic or property of an individual element

scatter diagram

- a graph that shows the degree and direction of relationship between two variables

class frequency

- a raw count of how many observations fell into a category

representative sample

- a sample whose members display characteristics of the target population - analyzing this data helps draw conclusions (make inferences) about the population

sample

- a subset selected from the population

crosstabulation

- a tabular summary of data for two variables. - the classes for one variable are represented by the rows; the classes for the other variable are represented by the columns

observations

- a value of something of interest you're measuring or counting during a study or experiment

Expected value of a Discrete random variable

- average of the random variable - E(x) = sum of x*f(x)
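
A worked example of E(x) = sum of x*f(x), sketched in Python. The probability distribution is invented for illustration.

```python
def expected_value(distribution):
    """Sum of x * f(x) over every value x of a discrete random variable."""
    return sum(x * fx for x, fx in distribution.items())

# Hypothetical distribution: keys are values of x, entries are f(x).
f = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}
mean = expected_value(f)  # 0(0.1) + 1(0.3) + 2(0.4) + 3(0.2)
```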

bar chart

- categorical data graph

pie chart

- categorical data graph

experiment

- certain variables are controlled by the researcher so that data can be obtained about how they influence the variable of interest - ex: a study to test the effectiveness of a new medication vs a placebo

class percentage

- compute: class relative frequency x 100

interval measurement

- data can be ordered and there is a fixed distance between values - ex: temperature, IQ, grade marking

ratio measurement

- data can be ordered with a fixed distance between values and there is a meaningful zero point which allows ratios to be useful - ex: cost of purchasing a share of stock, height

cross-sectional data

- data collected at the same or approximately the same point in time - like a snapshot in time

time-series data

- data collected over several time periods (daily, weekly, monthly, yearly) - seen in many businesses when trying to predict future values based on trends observed over time

symmetric distribution

- data that is evenly distributed between the left and the right side

nominal measurement

- data values are labels or categories with no logical ordering of values - ex: social security number, eye color, gender

ordinal measurement

- data values can be arranged, but difference between values cannot be computed mathematically - ex: survey responses on a scale of very poor, poor, average, good, very good

Discrete Uniform Probability Distribution: probability function

- f(x) = 1/n, where n = total number of sample points

When is a sampling distribution of x-bar considered a normal distribution?

- if n > 30, the Central Limit Theorem (CLT) tells us that x-bar is approximately normally distributed

qualitative (categorical) variables

- information that can be classified into different categories based on a nonnumeric characteristic - scales of measurement: nominal and ordinal

quantitative variables

- information with numerical values that indicate how much or how many - scales of measurement: interval and ratio

Sampling Distribution of x-bar

- mean: the mean of the x-bar values equals the mean of the population x values - stdev: sigma of x-bar = sigma/SQRT(n) (a.k.a. standard error, or SE, of the mean)

inferential statistics

- methods used to draw conclusions about the population based upon the sample info - conclusions include estimates or predictions about the population

If x ~ Binomial (n = ___, p = ____), then...

- mu = E(x) = n*p - sigma^2 = var(x) = n*p(1-p)

descriptive statistics

- numerical and graphical summaries of the data which help show any patterns in a set of data - describe the data to show any patterns in the data set - ex: a histogram

class relative frequency

- proportion of the total for a class - compute: class frequency/total

dot plot

- quantitative data graph

histogram

- quantitative data graph

cumulative percentage

- records the percentage of cases at or below any given value of the variable

how can graphs be misleading?

- scale not starting at zero - scale made very small to make graph look bigger - scale values/labels missing from graph - incorrect scale placed on graph - pieces of a Pie Chart are not the correct sizes - oversized volumes of objects that are too big for the vertical scale differences they represent - size of images used in Pictographs being different for the different categories being graphed - graph being a non-standard size or shape

stem-and-leaf display

- shows quantitative data values in a way that sketches the distribution of the data

Ex: A research organization surveyed 250 women between the ages of 35 and 50 who work in the state of Ohio and asked them about the amount of time they spend commuting to their jobs each week. The average amount of time these women spent commuting to work each week was 190 minutes. Identify the sample that was taken.

250 women in the state of Ohio who are between ages 35 and 50

Normal Probability Distribution: when x is given (excel)

= NORM.DIST(x-value, mu, sigma, TRUE)

Normal Probability Distribution: when the z-value is known (excel)

= NORM.S.DIST(z-value, TRUE)

The use of the normal probability distribution as an approximation of the sampling distribution of p-bar is based on the condition that both np and n(1 - p) equal or exceed _____.

5

Hypergeometric Probability Distribution: excel

= HYPGEOM.DIST(x, n, r, N, T/F), where - x = number of successes - n = number of trials - N = population size - r = total number of successes in the population
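
For the non-cumulative case, the probability can be sketched in Python from combinations, using the same letters as above. The function name and the numbers (N = 10, r = 4, n = 3, x = 1) are assumptions for illustration.

```python
import math

def hypgeom_pmf(x, n, r, N):
    """P(exactly x successes in n draws without replacement).

    r = successes in the population, N = population size.
    """
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)

# Hypothetical: 4 successes among 10 items, draw 3, want exactly 1 success.
p_exactly_1 = hypgeom_pmf(1, 3, 4, 10)
```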

Normal Probability Distribution: probability that x is between two values

= NORM.DIST(larger x-value, mu, sigma, TRUE) - NORM.DIST(smaller x-value, mu, sigma, TRUE)
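
The subtraction of two cumulative probabilities can be sketched in Python as follows. The helper name is hypothetical, and the inputs (mean 8, stdev 2, bounds 6 and 10) are chosen only so that the interval is one standard deviation either side of the mean.

```python
from statistics import NormalDist

def prob_between(lower, upper, mu, sigma):
    """P(lower <= X <= upper) as area left of upper minus area left of lower."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

p_within_one_stdev = prob_between(6, 10, 8, 2)
```

The result is close to 0.68, in line with the Empirical Rule for bell-shaped data.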

Sampling Distributions of x-bar: excel

= NORM.DIST(x, mean, stdev/SQRT(n), T/F)
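
A Python sketch of the same idea: the sampling distribution of x-bar keeps the population mean but uses stdev/SQRT(n) as its standard deviation. The helper name and the values (mean 8, stdev 2, n = 16) are assumptions for illustration.

```python
import math
from statistics import NormalDist

def x_bar_distribution(mu, sigma, n):
    """Sampling distribution of x-bar: mean mu, standard error sigma / sqrt(n)."""
    return NormalDist(mu, sigma / math.sqrt(n))

# Hypothetical: population mean 8, stdev 2, samples of size 16.
x_bar = x_bar_distribution(8, 2, 16)
standard_error = x_bar.stdev          # 2 / sqrt(16)
p_mean_at_most_9 = x_bar.cdf(9)       # P(x-bar <= 9)
```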

The margin of error in an interval estimate of the population mean is a function of all of the following EXCEPT _____. A. sample mean B. level of significance C. variability of the population D. sample size

A. sample mean

All of the following are true about the standard error of the mean EXCEPT _____. A. it measures the variability in sample means B. it is larger than the standard deviation of the population C. it decreases as the sample size increases D. its value is influenced by the standard deviation of the population

B. it is larger than the standard deviation of the population

Posterior probabilities are computed using _____

Bayes' Theorem
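
A minimal sketch of a posterior probability calculation for two events, written in Python. The function name, argument names, and all three input probabilities are invented for illustration.

```python
def posterior(prior, p_b_given_a, p_b_given_not_a):
    """P(A | B) by Bayes' Theorem for an event A and its complement."""
    # Total probability of B, summed over A and not-A.
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / p_b

# Hypothetical: P(A) = 0.01, P(B | A) = 0.90, P(B | not A) = 0.05.
p_a_given_b = posterior(0.01, 0.90, 0.05)
```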

Which of the following is NOT a characteristic of an experiment where the binomial probability distribution is applicable? A. The experiment has a sequence of n identical trials. B. Exactly two outcomes are possible on each trial. C. The trials are dependent. D. The probabilities of the outcomes do not change from one trial to another.

C. The trials are dependent

All of the following are examples of observational studies except: A. an online survey to record your satisfaction with a company's service. B. the number of cars running a stop sign in a residential area during rush hour. C. the behavior of Walmart shoppers after they are given a $20 gift card from the store. D. a Gallup poll measuring the approval rating of the president.

C. the behavior of Walmart shoppers after they are given a $20 gift card from the store.

The fact that the sampling distribution of the sample mean can be approximated by a normal probability distribution whenever the sample size is large is based on the _____.

Central Limit Theorem

T/F: Taking repeated samples until you obtain the desired result is not an ethically acceptable statistical practice.

True

T/F: The Empirical Rule can only be applied if the data distribution is bell-shaped.

True

T/F: A pie chart is appropriate for categorical data

True

T/F: Cluster sampling is a probability-based sampling method.

True

T/F: If the random variable X = number of occurrences in a certain interval of time has a Poisson distribution, then the random variable Y = time between occurrences has an exponential distribution.

True

T/F: If two events are mutually exclusive, this means they cannot happen at the same time.

True

T/F: In a crosstabulation, the two variables can be either categorical or quantitative.

True

T/F: Statistical studies in which researchers do not control variables of interest are not of any value.

False

Ex: A research organization surveyed 250 women between the ages of 35 and 50 who work in the state of Ohio and asked them about the amount of time they spend commuting to their jobs each week. The average amount of time these women spent commuting to work each week was 190 minutes. Identify the variable of interest in this problem and indicate if it is categorical or quantitative.

Variable = how many minutes each week a woman commutes to work. The variable is quantitative.

Ex: A research organization surveyed 250 women between the ages of 35 and 50 who work in the state of Ohio and asked them about the amount of time they spend commuting to their jobs each week. The average amount of time these women spent commuting to work each week was 190 minutes. What is the population in this problem?

all women who work in the state of Ohio and are between ages 35 and 50

A graphical summary of data that is based on a five-number summary is a _____.

boxplot

A graphical device for depicting categorical data that have been summarized in a frequency distribution, relative frequency distribution, or percent frequency distribution is a(n) _____.

bar chart

The t distribution is a family of similar probability distributions, with each individual distribution depending on a parameter known as the _____.

degrees of freedom

The summaries of data, which may be tabular, graphical, or numerical, are referred to as:

descriptive statistics

The height and weight are recorded by the school nurse for every student in a school. What type of graph would best display the relationship between height and weight?

scatter diagram

Which of the following symbols represents the standard deviation of a population?

sigma

An important numerical measure related to the shape of a distribution is the _____.

skewness

The standard deviation of all possible x-bar values is called the _____.

standard error of the mean

The process of analyzing sample data to draw conclusions about the characteristics of a population is called _____.

statistical inference

Whenever the population standard deviation is unknown, which distribution is used in developing an interval estimate for a population mean?

t distribution

A dot plot can be used to display _____

the distribution of one quantitative variable

Which of the following would likely display a negative relationship when creating a scatter diagram?

the number of classes a student misses during a semester and the grade obtained in the course

The probability that Pete is late to work on a given day is 0.2. Pete has 5 days of work next week. The random variable in this problem is _____.

the number of days out of 5 that Pete is late to work

