Statistics Exam 1 Definitions (Ch 1, 2, 3)


The cumulative relative frequency for the last class must always be 1.​ Why?

All the observations are less than or equal to the last class.

Which allows the researcher to claim causation between an explanatory variable and a response​ variable?

A designed experiment allows the researcher to claim causation between an explanatory variable and a response variable.

Define placebo. Choose the correct answer below.

An innocuous​ medication, such as a sugar​ tablet, that​ looks, tastes, and smells like the experimental medication

Define treatment. Choose the correct answer below.

Any combination of the values of the factors​ (explanatory variables)

Why​ shouldn't classes overlap when summarizing continuous data in a frequency or relative frequency​ distribution?

Classes​ shouldn't overlap so there is no confusion as to which class an observation belongs.

What is a​ cross-sectional study? Choose the correct answer below.

Cross-sectional studies are observational studies that collect information about individuals at a specific point in time or over a very short period of time.

What does it mean if a statistic is​ resistant?

Extreme values​ (very large or​ small) relative to the data do not affect its value substantially.

Identify the given statement as either true or false. The standard deviation can be negative.

False

Explain the difference between a​ single-blind and a​ double-blind experiment.

In a​ single-blind experiment, the subject does not know which treatment is received. In a​ double-blind experiment, neither the subject nor the researcher in contact with the subject knows which treatment is received.

Why is it rare for frames to be completely​ accurate?

It is rare for frames to be accurate because frames are obtained​ periodically, whereas populations are constantly changing.

Which is the superior observational​ study? Why? Choose the correct answer below.

Neither study is always superior to the other. Both have advantages and disadvantages that depend on the situation.

Distinguish between nonsampling error and sampling error.

Nonsampling error is the error that results from​ undercoverage, nonresponse​ bias, response​ bias, or​ data-entry errors. Sampling error is the error that results because a sample is being used to estimate information about a population.

What does it mean when sampling is done without​ replacement?

Once an individual is​ selected, the individual cannot be selected again.

What is replication in an​ experiment?

Replication is applying each treatment to more than one experimental unit.

Define statistics.

Statistics is the science of​ collecting, organizing,​ summarizing, and analyzing information to draw a conclusion and answer questions. In​ addition, statistics is about providing a measure of confidence in any conclusions.

Which sampling method does not require a​ frame?

Systematic

Define confounding. Choose the correct answer below.

The effects of two factors (explanatory variables) on the response variable cannot be distinguished.

Explain the circumstances for which the interquartile range is the preferred measure of dispersion. What is an advantage that the standard deviation has over the interquartile​ range?

The interquartile range is preferred when the data are skewed or have outliers. An advantage of the standard deviation is that it uses all the observations in its computation.

A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be​ larger, the mean or the​ median? Why?

The mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.
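
A small numeric illustration of this pull, as a sketch only: the data set below is hypothetical and chosen so that one value sits far out in the right tail.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, one is far out in the right tail.
data = [1, 2, 2, 3, 3, 4, 50]

center_mean = mean(data)      # 65 / 7, pulled toward the tail
center_median = median(data)  # middle value of the sorted data

print(center_mean, center_median)
print(center_mean > center_median)
```

The single large value raises the mean well above the median, while the median stays at the middle observation.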

Define response variable. Choose the correct answer below.

The quantitative or qualitative variable for which the experimenter wishes to determine how its value is affected by the explanatory variable

What makes the range less desirable than the standard deviation as a measure of​ dispersion?

The range does not use all the observations.

In a relative frequency​ distribution, what should the relative frequencies add up​ to?

The relative frequencies add up to 1.

What are the advantages of having a presurvey with open questions to assist in constructing a questionnaire that has closed​ questions?

The researcher can learn common answers.

Is the following statement true or​ false? When plotting an​ ogive, the plotted points have​ x-coordinates that are equal to the upper limits of each class.

True

Determine whether the following statement is true or false. Explain. Inferences based on voluntary response samples are generally not reliable.

True, because it is often the case that the individuals who volunteer do not accurately represent the population.

_________ are the characteristics of the individuals of the population being studied

Variables

The _____________ of a variable is computed by adding all the values of the variable in the data set and dividing by the number of observations.

arithmetic mean

The​ _________________ is the difference between consecutive lower class limits.

class width

__________are the categories by which data are grouped.

classes

A(n)________is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.

cluster sample

A ________________ is one in which each experimental unit is randomly assigned to a treatment.

completely randomized design
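
One way random assignment might be carried out is sketched below. The function name `completely_randomized`, the unit labels, and the two treatment names are hypothetical, and equal group sizes are an assumption rather than a requirement of the design.

```python
import random

def completely_randomized(units, treatments, rng=random):
    """Randomly assign each experimental unit to one treatment group."""
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order, so assignment does not depend on the unit
    # Deal the shuffled units out to the treatments in turn.
    return {t: shuffled[i::len(treatments)] for i, t in enumerate(treatments)}

# Hypothetical experimental units and treatments.
units = [f"unit_{i}" for i in range(1, 13)]
groups = completely_randomized(units, ["placebo", "drug"], random.Random(42))

for treatment, members in groups.items():
    print(treatment, members)
```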

A _______________ distribution displays the aggregate frequency of the category. In other words, it displays the total number of observations less than or equal to the upper class limit of the class.

cumulative frequency

A _______________ distribution displays the proportion (or percentage) of observations less than or equal to the upper class limit of the class.

cumulative relative frequency
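
The frequency, relative frequency, cumulative frequency, and cumulative relative frequency distributions can be computed side by side. This is a minimal sketch using hypothetical class frequencies; it also shows why the last cumulative relative frequency is 1.

```python
from itertools import accumulate

# Hypothetical frequencies for five classes (illustrative data only).
frequencies = [4, 9, 12, 3, 2]
n = sum(frequencies)

# Relative frequency: proportion of observations in each class.
relative = [f / n for f in frequencies]

# Cumulative frequency: running total of observations up to and including each class.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: running total divided by the number of observations.
cumulative_relative = [c / n for c in cumulative]

print(cumulative)
print(cumulative_relative)
```

Because every observation is less than or equal to the upper limit of the last class, the final cumulative frequency equals n and the final cumulative relative frequency equals n / n = 1.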

For a distribution that is skewed​ left, the left whisker is ___________ the right whisker.

longer than

The​ _______________ is the smallest value within the class and the​ _______________ is the largest value within the class.

lower class limit; upper class limit

A ________________ is an experimental design in which the experimental units are paired up. The pairs are selected so that they are related in some way (that is, the same person before and after a treatment, twins, husband and wife, same geographical location, and so on). There are only two levels of treatment in a matched-pairs design.

matched-pairs design

For a distribution that is skewed​ left, which of the following is​ true?

mean<median

A(n)_________is a numerical summary of a sample.

statistic

The sum of the deviations about the mean always equals

zero
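
A quick check of this property on a hypothetical data set. With data whose mean is not exactly representable, floating-point rounding can leave a tiny nonzero remainder, so the comparison below uses a tolerance.

```python
from statistics import mean

# Hypothetical data set (illustrative only).
data = [2, 4, 4, 5, 10]

x_bar = mean(data)                      # arithmetic mean of the data
deviations = [x - x_bar for x in data]  # each value's deviation about the mean
total = sum(deviations)

print(deviations)
print(abs(total) < 1e-9)
```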

Define experimental unit. Choose the correct answer below.

A​ person, object, or some other​ well-defined item upon which a treatment is applied

What is a​ case-control study? Choose the correct answer below.

Case-control studies are observational studies that are​ retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records.

Discuss the advantages and disadvantages of each type of question (closed and open).

Closed questions are easier to​ analyze, but limit the responses. Open questions allow respondents to state exactly how they​ feel, but are harder to analyze due to the variety of answers and possible misinterpretation of answers.

What is meant by​ confounding? Choose the correct answer below.

Confounding in a study occurs when the effects of two or more explanatory variables are not separated.​ Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study.

Determine whether the following statements are true or false. ​(a) When a factor is controlled by setting it to three​ levels, the particular factor is of no interest to the researcher. ​(b) Randomization is used so that those factors not controlled in the experiment​ "average out" their effect on the response variable.

(a) The statement is false because a factor that is controlled and set at various levels is a factor of interest to the researcher. (b) The statement is true.

The _____________, IQR, is the range of the middle 50% of the observations in a data set. That is, the IQR is the difference between the third and first quartiles, Q3 − Q1.

interquartile range
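
The sketch below computes the IQR and compares it with the range when an extreme value is introduced. The quartile convention used here (the median of the lower half and the median of the upper half, leaving out the middle value when n is odd) is an assumption; other conventions exist and can give slightly different quartiles. The helper name `quartiles` and the data are hypothetical, and the helper assumes at least two observations.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves convention (needs n >= 2)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]                 # values below the middle
    upper = ordered[len(ordered) - half:]  # values above the middle
    return median(lower), median(ordered), median(upper)

# Hypothetical data, then the same data with the largest value made extreme.
data = [1, 3, 5, 7, 9, 11, 13]
with_outlier = [1, 3, 5, 7, 9, 11, 1000]

q1, m, q3 = quartiles(data)
iqr = q3 - q1
q1_out, m_out, q3_out = quartiles(with_outlier)
iqr_outlier = q3_out - q1_out

range_plain = max(data) - min(data)
range_outlier = max(with_outlier) - min(with_outlier)

print(iqr, iqr_outlier)            # the IQR is unchanged by the extreme value
print(range_plain, range_outlier)  # the range changes drastically
```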

For a distribution that is skewed​ right, the median is ____________ of the box.

left of center

The standard deviation is used in conjunction with the​ ______ to numerically describe distributions that are bell shaped. The​ ______ measures the center of the​ distribution, while the standard deviation measures the​ ______ of the distribution.

mean; mean; spread

For a distribution that is​ symmetric, which of the following is​ true?

mean=median

For a distribution that is skewed​ right, which of the following is​ true?

mean>median

The ____________ of a variable is the value that lies in the middle of the data when arranged in ascending order. We use M to represent the median.

median

The ____________ of a variable is the observation of the variable that occurs most frequently in the data set.

mode

A frequency distribution lists the_________of occurrences of each category of​ data, while a relative frequency distribution lists the________of occurrences of each category of data.

number; proportion

What are some solutions to​ nonresponse?

Offer rewards and incentives; attempt callbacks.

​A(n)_________is a numerical summary of a population.

parameter

A _______________ is a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.

pie chart

The _____________, μ (pronounced "mew"), is a parameter that is computed using data from all the individuals in a population.

population arithmetic mean

The ______________ of a variable is the square root of the sum of squared deviations about the population mean divided by the number of observations in the population, N.

population standard deviation

The __________, R, of a variable is the difference between the largest and smallest data value.

range

A numerical summary of data is said to be _________ if values that are extreme (very large or small) relative to the data do not affect its value substantially.

resistant

In a​ boxplot, if the median is to the left of the center of the box and the right whisker is substantially longer than the left​ whisker, the distribution is skewed

right

For a distribution that is​ symmetric, the left whisker is the ____________ as the right whisker.

same length

The ____________, x̄ (pronounced "x-bar"), is a statistic that is computed using data from individuals in a sample.

sample arithmetic mean

The _______________, s, of a variable is the square root of the sum of squared deviations about the sample mean divided by n−1, where n is the sample size.

sample standard deviation
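
The two standard deviation definitions differ only in the divisor: N for a population, n − 1 for a sample. The sketch below writes both out directly and cross-checks them against `pstdev` and `stdev` from Python's `statistics` module. The data and the function names `population_sd` and `sample_sd` are hypothetical.

```python
from math import sqrt
from statistics import pstdev, stdev

def population_sd(values):
    """Square root of the sum of squared deviations about mu, divided by N."""
    N = len(values)
    mu = sum(values) / N
    return sqrt(sum((x - mu) ** 2 for x in values) / N)

def sample_sd(values):
    """Square root of the sum of squared deviations about x-bar, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

# Hypothetical data: the mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = population_sd(data)  # sqrt(32 / 8)
s = sample_sd(data)          # sqrt(32 / 7)

print(sigma, pstdev(data))
print(s, stdev(data))
```

Squaring either result gives the corresponding variance, σ^2 or s^2.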

​A(n)________is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.

stratified sample

A ___________ is obtained by selecting every kth individual from the population. The first individual selected corresponds to a random number between 1 and k.

systematic sample
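
A minimal sketch of the selection rule, assuming the individuals can be visited in some order. Here the population is a hypothetical numbered list of 100 individuals, and the function name `systematic_sample` is illustrative.

```python
import random

def systematic_sample(population, k, rng=random):
    """Select every kth individual, starting at a random position between 1 and k."""
    start = rng.randint(1, k)       # 1-based starting position, inclusive of both ends
    return population[start - 1::k]

# Hypothetical population: individuals numbered 1 through 100.
population = list(range(1, 101))
sample = systematic_sample(population, 10, random.Random(0))

print(sample)
```

Only k different samples are possible under this scheme (one per starting position), so most groups of n individuals can never be selected together.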

A _______________ is obtained by plotting the time in which a variable is measured on the horizontal axis and the corresponding value of the variable on the vertical axis. Line segments are then drawn connecting the points.

time-series plot

The _____________ of a variable is the square of the standard deviation. The ____________ is σ^2, and the ______________ is s^2.

variance; population variance; sample variance

The _____________ represents the distance that a data value is from the mean in terms of the number of standard deviations. We find it by subtracting the mean from the data value and dividing this result by the standard deviation.

z-score
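
The computation described above, written out as a short sketch. The exam scores, mean, and standard deviation are hypothetical values.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10.
above = z_score(85, 70, 10)  # 1.5 standard deviations above the mean
below = z_score(55, 70, 10)  # 1.5 standard deviations below the mean

print(above, below)
```

A positive z-score means the data value is above the mean, and a negative z-score means it is below the mean.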

What is a Pareto​ chart?

A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.

What is a bar​ graph?

A bar graph is a horizontal or vertical representation of the frequency or relative frequency of the categories. The height of each rectangle represents the​ category's frequency or relative frequency.

What is a closed​ question? What is an open​ question?

A closed question has fixed choices for​ answers, whereas an open question is a​ free-response question.

What is a designed​ experiment?

In a designed experiment, a researcher assigns individuals to a certain group, intentionally changes the value of an explanatory variable, and then records the value of the response variable for each group.

What is a​ frame?

A frame is a list of the individuals in the population being studied.

What is an​ ogive?

A graph that represents the cumulative frequency or cumulative relative frequency for the class

What is a lurking​ variable? Choose the correct answer below.

A lurking variable is an explanatory variable that was not considered in a​ study, but that affects the value of the response variable in the study. In​ addition, lurking variables are typically related to explanatory variables in the study.

What does it mean when a part of the population is​ under-represented?

A part of the population is​ under-represented when it is proportionally smaller in a sample than in its population.

What does it mean when an observational study is​ prospective?

A prospective study collects the data over time.

What does it mean when an observational study is​ retrospective?

A retrospective study requires that individuals look back in time or requires the researcher to look at existing records.

Define factor. Choose the correct answer below.

A variable whose effect on the response variable is to be assessed by the experimenter

What can be said about a set of data with a standard deviation of​ 0?

All the observations are the same value.

What is an observational​ study?

An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.

Determine if the statement is true or false. When an observation that is much larger than the rest of the data is added to a data​ set, the value of the median will increase substantially.

False

Identify the given statement as either true or false. The standard deviation is a resistant measure of spread.

False

True or​ False: A data set will always have exactly one mode.

False

Determine whether the following statement is true or false. Explain. When taking a systematic random sample of size​ n, every group of size n from the population has the same chance of being selected.

False, because certain groups would never be selected.

Determine whether the following statement is true or false. Explain. A simple random sample is always preferred because it obtains the same information as other sampling plans but requires a smaller sample size.

False, because other sampling techniques may provide more information for less cost than a simple random sample.

Determine whether the following statement is true or false. Explain. When obtaining a stratified​ sample, the number of individuals included within each stratum must be equal.

False. Within stratified​ samples, the number of individuals sampled from each stratum should be proportional to the size of the strata in the population.
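
Proportional allocation can be sketched as follows. The strata names, their sizes, and the function name `stratified_sample` are hypothetical, and the rounding step is an assumption: with other sizes the rounded counts may not add up to exactly n and would need adjusting.

```python
import random

def stratified_sample(strata, n, rng=random):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        size = round(n * len(members) / total)  # proportional share of the n draws
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical population of 1,000 individuals in three strata.
strata = {
    "freshmen": list(range(0, 600)),
    "sophomores": list(range(600, 900)),
    "juniors": list(range(900, 1000)),
}
chosen = stratified_sample(strata, 50, random.Random(1))

print({name: len(members) for name, members in chosen.items()})
```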

Determine whether the following statement is true or false. ​Generally, the goal of an experiment is to determine the effect that the treatment will have on the response variable.

True

Determine whether the following statement is true or false. Explain. When conducting a cluster​ sample, it is better to have fewer clusters with more individuals when the clusters are heterogeneous.

True, because when the clusters are​ heterogeneous, they are scaled down versions of the population.

A _______________ is a graph that uses points, connected by line segments, to represent the frequencies for the classes. It is constructed by plotting a point above each _______________ (the sum of consecutive lower class limits divided by 2) on a horizontal axis at a height equal to the frequency of the class.

frequency polygon; class midpoint
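
Class midpoints, and the points that would be plotted for a frequency polygon, can be computed as in this sketch. The lower class limits and frequencies are hypothetical, and a constant class width is assumed so that the last class also has a "next" lower limit.

```python
# Hypothetical lower class limits and class frequencies.
lower_limits = [10, 20, 30, 40]
frequencies = [3, 8, 6, 2]

# Class width: difference between consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Pair each lower limit with the next one; the last class uses lower limit + width.
next_limits = lower_limits[1:] + [lower_limits[-1] + class_width]

# Class midpoint: sum of consecutive lower class limits divided by 2.
midpoints = [(a + b) / 2 for a, b in zip(lower_limits, next_limits)]

# Points of the frequency polygon: (class midpoint, class frequency).
points = list(zip(midpoints, frequencies))

print(points)
```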

A _______________ is constructed by drawing rectangles for each class of data. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same, and the rectangles touch each other.

histogram

When an observation that is much larger than the rest of the data is added to a data​ set, the value of the mean will​ __________________.

increase

A(n) _________ is a person or object that is a member of the population being studied.

individual

