Intro to Stats - Test 1

What are the two requirements for a discrete probability distribution?

Each probability must be between 0 and 1, inclusive, and the sum of all the probabilities must equal 1.
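As a minimal sketch, the two requirements (every probability lies between 0 and 1, and the probabilities sum to 1) can be checked in Python. The function name and the tolerance are illustrative choices, not part of the course material.

```python
import math

def is_valid_discrete_distribution(probs, tol=1e-9):
    """Return True when every probability is in [0, 1] and they sum to 1."""
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a small tolerance because floating-point sums are inexact.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

fair_die = [1 / 6] * 6          # valid: six equal probabilities
negative_entry = [0.5, 0.6, -0.1]  # invalid: one probability is below 0
too_small = [0.5, 0.4]          # invalid: the probabilities sum to 0.9

print(is_valid_discrete_distribution(fair_die))
print(is_valid_discrete_distribution(negative_entry))
print(is_valid_discrete_distribution(too_small))
```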

What is the probability of an event that is impossible?

0

What is a closed question? What is an open question?

A closed question has fixed choices for answers, whereas an open question is a free-response question.

What is a confounding variable?

A confounding variable is an explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study.

A(n) _______ is a numerical summary of a sample.

statistic

A(n) _______ is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.

stratified sample

Which sampling method does not require a frame?

systematic

What is a designed experiment?

In a designed experiment, a researcher assigns individuals to a certain group, intentionally changes the value of an explanatory variable, and then records the value of the response variable for each group.

What is a frame?

A frame is a list of the individuals in the population being studied.

What does it mean when sampling is done without replacement?

Once an individual is selected, the individual cannot be selected again.

If E and F are disjoint events, then P(E or F) = ________

P(E) + P(F)
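The addition rule for disjoint events can be illustrated with one roll of a fair die. This is only a sketch: the events E and F and the helper `prob` are made up for the example.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

E = {1, 2}  # hypothetical event: roll a 1 or a 2
F = {5, 6}  # hypothetical event: roll a 5 or a 6

# The rule P(E or F) = P(E) + P(F) applies only when E and F share no outcomes.
disjoint = E.isdisjoint(F)
p_union = prob(E | F)
p_sum = prob(E) + prob(F)
print(disjoint, p_union, p_sum)
```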

_______ divide data sets into fourths.

Quartiles
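A small sketch of computing quartiles with Python's standard library, using made-up data. Note the assumption: `statistics.quantiles` uses its "exclusive" method by default, and textbooks and software differ in how they interpolate, so results will not match a hand method for every data set.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # made-up data, already sorted

# n=4 asks for the three cut points that split the data into fourths.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # interquartile range: spread of the middle half of the data

print(q1, q2, q3, iqr)
```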

Define statistics.

Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw a conclusion and answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.

Define confounding.

The effects of two factors (explanatory variables) on the response variable cannot be distinguished.

Describe the difference between classical and empirical probability.

The empirical method obtains an approximate empirical probability of an event by conducting a probability experiment. The classical method of computing probabilities does not require that a probability experiment actually be performed. Rather, it relies on counting techniques, and requires equally likely outcomes.
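The contrast can be sketched with a fair die and the event "roll a 4". The classical value comes from counting equally likely outcomes, while the empirical value comes from simulated rolls. The seed and the number of rolls are arbitrary choices for the illustration.

```python
import random
from fractions import Fraction

# Classical method: count equally likely outcomes; no experiment is performed.
outcomes = [1, 2, 3, 4, 5, 6]
favorable = [o for o in outcomes if o == 4]
classical = Fraction(len(favorable), len(outcomes))

# Empirical method: repeat the experiment and use the relative frequency.
rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(60_000)]
empirical = rolls.count(4) / len(rolls)

print(classical, empirical)
```

The empirical value is only an approximation, and it will generally differ a little from 1/6 on any given run.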

Explain the circumstances for which the interquartile range is the preferred measure of dispersion. What is an advantage that the standard deviation has over the interquartile range?

The interquartile range is preferred when the data are skewed or have outliers. An advantage of the standard deviation is that it uses all the observations in its computation.

Why is the median resistant, but the mean is not?

The mean is not resistant because when data are skewed, there are extreme values in the tail, which tend to pull the mean in the direction of the tail. The median is resistant because the median of a variable is the value that lies in the middle of the data when arranged in ascending order and does not depend on the extreme values of the data.

A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be larger, the mean or the median? Why?

The mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.
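A quick numerical illustration with made-up, right-skewed data: one large value in the right tail pulls the mean upward while the median barely notices it.

```python
import statistics

# Made-up data: five similar values plus one extreme value in the right tail.
values = [20, 22, 25, 27, 30, 250]

mean = statistics.mean(values)
median = statistics.median(values)

# Replace the extreme value with a typical one to see how each measure reacts.
trimmed = [20, 22, 25, 27, 30, 32]
mean_trimmed = statistics.mean(trimmed)
median_trimmed = statistics.median(trimmed)

print(mean, median)
print(mean_trimmed, median_trimmed)
```

The median is the same for both lists, while the mean changes substantially, which is what resistance means in practice.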

Define response variable.

The quantitative or qualitative variable for which the experimenter wishes to determine how its value is affected by the explanatory variable

What are the advantages of having a presurvey with open questions to assist in constructing a questionnaire that has closed questions?

The researcher can learn common answers.

The ______ class limit is the smallest value within the class and the ______ class limit is the largest value within the class.

lower; upper

The standard deviation is used in conjunction with the ______ to numerically describe distributions that are bell shaped. The ______ measures the center of the distribution, while the standard deviation measures the ______ of the distribution.

mean; mean; spread

The factorial symbol, n!, is defined as n! = _______ and 0! = _______.

n(n-1) · ... · 3 · 2 · 1; 1
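The definition translates directly into a loop. This is a sketch for illustration (the function name is our own); in practice Python's built-in `math.factorial` does the same job.

```python
import math

def factorial(n):
    """Multiply n * (n-1) * ... * 2 * 1; with nothing to multiply, 0! = 1."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

print(factorial(0), factorial(5), math.factorial(5))
```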

Suppose that a probability is approximated to be zero based on empirical results. Does this mean that the event is impossible?

no

A frequency distribution lists _______ of occurrences of each category of data, while a relative frequency distribution lists the _______ of occurrences of each category of data.

number; proportion
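As a sketch with made-up categorical data, both distributions can be built from a simple tally:

```python
from collections import Counter

responses = ["A", "B", "A", "C", "A", "B"]  # made-up categorical data

# Frequency distribution: number of occurrences of each category.
frequency = Counter(responses)

# Relative frequency distribution: proportion of occurrences of each category.
n = len(responses)
relative_frequency = {category: count / n for category, count in frequency.items()}

print(dict(frequency))
print(relative_frequency)
```

The relative frequencies always add up to 1 (up to floating-point rounding).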

A(n) _______ is a numerical summary of a population.

parameter

A(n) _______ is an ordered arrangement of r objects chosen from n distinct objects without repetition.

permutation
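The number of permutations of r objects chosen from n is n!/(n - r)!. The sketch below checks that formula against a brute-force listing for the arbitrary values n = 5 and r = 3.

```python
import math
from itertools import permutations

n, r = 5, 3  # arbitrary example values

# Formula: nPr = n! / (n - r)!
by_formula = math.factorial(n) // math.factorial(n - r)

# Brute force: list every ordered arrangement of r distinct objects.
by_listing = len(list(permutations(range(n), r)))

print(by_formula, by_listing, math.perm(n, r))
```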

If r = ______, then a perfect negative linear relation exists between the two quantitative variables.

-1

In the binomial probability distribution function, (n)C(x) represents the number of ways of obtaining x successes in n trials.

true

The _______ represents the number of standard deviations an observation is from the mean.

z-score
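The z-score is computed as z = (x - mean) / standard deviation. A minimal sketch, with made-up numbers:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Made-up example: a score of 85 where the mean is 70 and the standard deviation is 10.
z = z_score(85, 70, 10)
print(z)
```

A positive z-score means the observation is above the mean, and a negative z-score means it is below the mean.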

What is a lurking variable?

A lurking variable is an explanatory variable that was not considered in a study, but that affects the value of the response variable in the study. In addition, lurking variables are typically related to explanatory variables in the study.

In a relative frequency distribution, what should the relative frequencies add up to?

The relative frequencies add up to 1.

What does it mean when a part of the population is under-represented?

A part of the population is under-represented when it is proportionally smaller in a sample than in its population.

Define experimental unit.

A person, object, or some other well-defined item upon which a treatment is applied

What is a random variable?

A random variable is a numerical measure of the outcome of a probability experiment.

What is a residual? What does it mean when a residual is positive?

A residual is the difference between an observed value of the response variable y and the predicted value of y. If it is positive, then the observed value is greater than the predicted value.
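In symbols, residual = observed y - predicted y. The sketch below assumes a hypothetical regression line, y-hat = 2x + 1, chosen only for illustration.

```python
def predict(x):
    """Hypothetical least-squares line, y-hat = 2x + 1 (an assumption for this example)."""
    return 2 * x + 1

def residual(x, observed_y):
    """Observed value minus predicted value."""
    return observed_y - predict(x)

above = residual(3, 8)  # observed 8, predicted 7: point lies above the line
below = residual(3, 5)  # observed 5, predicted 7: point lies below the line
print(above, below)
```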

Define simple random sampling.

A sample of size n from a population of size N is obtained through simple random sampling if every possible sample of size n is equally likely to occur. The sample is then called a simple random sample.

Define factor.

A variable whose effect on the response variable is to be assessed by the experimenter

Describe what an unusual event is. Should the same cutoff always be used to identify unusual events? Why or why not?

An event is unusual if it has a low probability of occurring. The same cutoff should not always be used to identify unusual events. Selecting a cutoff is subjective and should take into account the consequences of incorrectly identifying an event as unusual.

What is an observational study?

An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.

Define treatment.

Any combination of the values of the factors (explanatory variables)

Describe how the value of n affects the shape of the binomial probability histogram.

As n increases, the binomial distribution becomes more bell shaped.

What is a case-control study?

Case-control studies are observational studies that are retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records.

Discuss the advantages and disadvantages of each type of question.

Closed questions are easier to analyze, but limit the responses. Open questions allow respondents to state exactly how they feel, but are harder to analyze due to the variety of answers and possible misinterpretation of answers.

What is meant by confounding?

Confounding in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study.

What is a cross-sectional study?

Cross-sectional studies are observational studies that collect information about individuals at a specific point in time or over a very short period of time.

_______ statistics consists of organizing and summarizing information collected, while _______ statistics uses methods that generalize results obtained from a sample to the population and measure the reliability of the results.

Descriptive; inferential

Which allows the researcher to claim causation between an explanatory variable and a response variable?

Designed experiment

What is the formula for the expected number of successes in a binomial experiment with n trials and probability of success p?

E(X) = np
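As a check on the formula, the expected value can also be computed from the definition, the sum of x · P(x), using the binomial probability function P(x) = nCx · p^x · (1 - p)^(n - x). The values n = 20 and p = 0.25 are arbitrary choices for this sketch.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x): nCx ways to place x successes, times the probability of each way."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 20, 0.25  # arbitrary example values

# Expected value from the definition: sum of x * P(x) over all possible x.
expected_from_definition = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
expected_from_formula = n * p

print(expected_from_definition, expected_from_formula)
```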

Explain what each point on the least-squares regression line represents.

Each point on the least-squares regression line represents the predicted y-value at the corresponding value of x

A binomial experiment is performed a fixed number of times. What is each repetition of the experiment called?

Each repetition of the experiment is called a trial.

What does it mean if a statistic is resistant?

Extreme values (very large or small) relative to the data do not affect its value substantially.

The notation P(F | E) means the probability of event _______.

F given event E
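One common way to compute a conditional probability is P(F | E) = P(E and F) / P(E). The sketch below applies it to one roll of a fair die; the events chosen are made up for the example.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

E = {2, 4, 6}  # hypothetical event: the roll is even
F = {4, 5, 6}  # hypothetical event: the roll is greater than 3

# Restrict attention to outcomes in E, then ask how many are also in F.
p_F_given_E = prob(E & F) / prob(E)
print(p_F_given_E)
```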

True or false: Correlation implies causation.

False

True or False: When obtaining a stratified sample, the number of individuals included within each stratum must be equal.

False. Within stratified samples, the number of individuals sampled from each stratum should be proportional to the size of the stratum in the population.

The U.S. Department of Housing and Urban Development (HUD) uses the median to report the average price of a home in the United States. Why do you think HUD uses the median?

HUD uses the median because the data are skewed right.

Explain the difference between a single-blind and a double-blind experiment.

In a single-blind experiment, the subject does not know which treatment is received. In a double-blind experiment, neither the subject nor the researcher in contact with the subject knows which treatment is received.

Which is the superior observational study?

Neither type of study is always superior to the other. Both have advantages and disadvantages that depend on the situation.

What does it mean if r = 0?

No linear relationship exists between the variables.

Suppose that two variables, X and Y, are negatively associated. Does this mean that above-average values of X will always be associated with below-average values of Y? Explain.

No, because association does not mean that every point fits the trend. The negative association only means that above-average values of X are generally associated with below-average values of Y.

What does it mean to say that two variables are negatively associated?

There is a linear relationship between the variables, and when the value of one variable increases, the value of the other variable generally decreases.

What does it mean to say that two variables are positively associated?

There is a linear relationship between the variables, and when the value of one variable increases, the value of the other variable generally increases.

True or False: Generally, the goal of an experiment is to determine the effect that the treatment will have on the response variable.

True

True or False: In a probability model, the sum of the probabilities of all outcomes must equal 1.

True

True or False: Probability is a measure of the likelihood of a random phenomenon or chance behavior.

True

True or False: Inferences based on voluntary response samples are generally not reliable.

True, because it is often the case that the individuals who volunteer do not accurately represent the population.

True or False: When comparing two populations, the larger the standard deviation, the more dispersion the distribution has, provided that the variable of interest from the two populations has the same unit of measure.

True, because the standard deviation describes how far, on average, each observation is from the typical value. A larger standard deviation means that observations are more distant from the typical value, and therefore, more dispersed.

When are events mutually exclusive?

When the events have no outcomes in common

What does it mean to say that the linear correlation coefficient between two variables equals 1? What would the scatter diagram look like?

When the linear correlation coefficient is 1, there is a perfect positive linear relation between the two variables. The scatter diagram would contain points that all lie on a line with a positive slope.

The _________________ is the difference between consecutive lower class limits.

class width

_______ are the categories by which data are grouped.

classes

A(n) _______ is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.

cluster sample

The _______, R^2, measures the proportion of total variation in the response variable that is explained by the least squares regression line.

coefficient of determination
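A sketch of the calculation on made-up data: fit the least-squares line, then compare the unexplained variation (sum of squared residuals) with the total variation in y. The variable names are illustrative.

```python
# Made-up paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Least-squares slope and intercept.
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

# Unexplained variation: squared residuals around the fitted line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
# Total variation: squared deviations of y around its mean.
sst = sum((y - y_bar) ** 2 for y in ys)

r_squared = 1 - sse / sst
print(slope, intercept, r_squared)
```

Here about 60% of the variation in y is explained by the line.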

A ________ is an arrangement of r objects chosen from n distinct objects without repetition and without regard to order.

combination
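The number of combinations of r objects chosen from n is n!/(r!(n - r)!). The sketch below checks the formula against a brute-force listing for the arbitrary values n = 5 and r = 3.

```python
import math
from itertools import combinations

n, r = 5, 3  # arbitrary example values

# Formula: nCr = n! / (r! * (n - r)!)
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# Brute force: list every selection of r distinct objects, ignoring order.
by_listing = len(list(combinations(range(n), r)))

print(by_formula, by_listing, math.comb(n, r))
```

Because order does not matter, each combination corresponds to r! ordered arrangements, which is why the count is smaller than the number of ordered arrangements of the same objects.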

In probability, a(n) ________ is any process that can be repeated in which the results are uncertain.

experiment

True or False: A data set will always have exactly one mode.

false

Two events E and F are ________ if the occurrence of event E in a probability experiment does not affect the probability of event F.

independent

A(n) _________ is a person or object that is a member of the population being studied.

individual

