STATS Final

Variables

​_________ are the characteristics of the individuals of the population being studied.

Interpretation of a Confidence Interval

A (1-alpha)*100% confidence interval indicates that (1-alpha)*100% of all simple random samples of size n from the population whose parameter is unknown will result in an interval that contains the parameter.

designed experiment

A ___________________ allows the researcher to claim causation between an explanatory variable and a response variable

Trial

A binomial experiment is performed a fixed number of times. What is each repetition of the experiment​ called?

Normally distributed; normal probability distribution

A continuous random variable is ___ or has a ___, if its relative frequency histogram has the shape of a normal curve.

a finite number of

A discrete random variable has _____values.

Binomial Probability Distribution

A discrete probability distribution that describes probabilities for experiments in which there are two mutually exclusive (disjoint) outcomes

Random Variable

A numerical measure of the outcome of a probability experiment; so its value is determined by chance

Population Arithmetic Mean

A parameter that is computed using data from all the individuals in a population

Subjective Probability

A probability that is determined based on personal judgement

1. The probability of two or more successes in any sufficiently small subinterval is 0. For example, the fixed interval might be any time between 0 and 5 minutes. A subinterval could be any time between 1 and 2 seconds. 2. The probability of success is the same for any two intervals of equal length. 3. The number of successes in any interval is independent of the number of successes in any other interval provided the intervals are not overlapping

A random variable X, the number of successes in a fixed interval, follows a Poisson process provided the following conditions are met.

Sample Arithmetic Mean

A statistic that is computed using data from individuals in a sample

Simulation

A technique used to recreate a random event

Kth percentile

A value such that k percent of the observations are less than or equal to the value

Individual

A(n) _________ is a person or object that is a member of the population being studied.

Relative Frequency Distribution

lists each category of data together with the relative frequency.

Unusual Event

An event that has a low probability of occurring

Subjective Probability

According to a sports​ analyst, the probability that a football team will win the next game is 0.31.

68%; 95%; 99.7%

According to the Empirical​ Rule, if a distribution is​ bell-shaped, then approximately _____ of the data will lie within 1 standard deviation of the​ mean; approximately _____ of the data will lie within 2 standard deviations of the​ mean; approximately ____ of the data will lie within 3 standard deviations of the mean
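
The three percentages can be checked against the standard normal curve. The following is a minimal sketch using Python's standard library; the helper name `area_within` is an illustrative assumption, not part of the original card.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Prints the proportion of data within k standard deviations of the mean.
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas round to the 68%, 95%, and 99.7% figures quoted in the rule.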

Event

Any collection of outcomes from a probability experiment

Law of Large Numbers

As the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the probability of the outcome.
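
A small simulation can illustrate this behavior. The sketch below assumes a fair coin (probability of heads 0.5) and a fixed seed so that runs are repeatable; the function name `proportion_of_heads` is hypothetical.

```python
import random

def proportion_of_heads(num_flips: int, seed: int = 0) -> float:
    """Simulate num_flips fair coin flips and return the observed proportion of heads."""
    rng = random.Random(seed)  # seeded generator so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

for n in (10, 1_000, 100_000):
    # With more repetitions the proportion tends to settle near 0.5.
    print(n, proportion_of_heads(n))
```

With few flips the proportion can be far from 0.5; with many flips it tends to land close to it.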

0.95

As the number of samples​ increases, the proportion of​ 95% confidence intervals that include the population proportion approaches​ ______.

The standard error of the mean decreases.

As the sample size n​ increases, what happens to the standard error of the​ mean?
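
The decrease follows from the formula for the standard error of the mean, sigma divided by the square root of n. A minimal sketch, with sigma = 15 chosen only for illustration:

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

for n in (10, 40, 160):
    # Quadrupling n halves the standard error.
    print(n, round(standard_error(15.0, n), 3))
```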

​No; since the variance is based on the squared deviations from the mean and​ N, it cannot be negative.

Can the variance of a data set ever be​ negative? Explain.

The results might differ because there is always a chance that the sample surveyed is unlike the population.

Contact a local hospital and ask them the percentage of the population that is blood type O. Why might the results differ?

Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw a conclusion and answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.

Define Statistics

Ordinal

Determine the level of measurement of the variable. Position of runners in a race

Ratio

Determine the level of measurement of the variable. Weight of a child

Interval

Determine the level of measurement of the variable. Years of election

The variable is continuous because it is not countable.

Determine whether the quantitative variable is discrete or continuous. Volume of a sound

The value is a statistic because the respondents who were full-time college students aged 18 to 22 are a sample.

Determine whether the underlined value is a parameter or a statistic. In a national survey on substance abuse, 66.4% of respondents who were full-time college students aged 18 to 22 reported using alcohol within the past month.

The value is a statistic because the 1,502 adults 18 years of age or older are a sample

Determine whether the underlined value is a parameter or a statistic. A telephone interview of 1,502 adults 18 years of age or older found that only 69% could identify the current vice president.

Quantitative; it is a numerical value

Determine whether the variable is qualitative or quantitative. Grams of carbohydrates in a donut

Class width= (largest value-smallest value)/ number of classes

Determining Class Width Formula

A population is the entire group that is being studied while a sample is a subset of the population that is being studied.

Explain the difference between a population and a sample.

The word​ "average" is ambiguous and can refer to any measure of center. It is better to use the specific measure of center you intend​ (mean, median, or​ mode).

Explain why it is misleading to use the term​ "average" to describe your typical bowling score.

mean= median

For a distribution that is symmetric...

Mean < Median

For a distribution that is skewed left...

Mean > median

For a distribution that is skewed right...

longer than

For a distribution that is skewed​ left, the left whisker is _____ the right whisker.

Left of Center

For a distribution that is skewed​ right, the median is _____ of the box.

the same length

For a distribution that is​ symmetric, the left whisker is ______ as the right whisker.

The General Addition Rule

For any two events E and F, P(E or F) = P(E) + P(F) - P(E and F)
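
The rule can be verified on a small sample space. The sketch below assumes one roll of a fair six-sided die, with E = "even" and F = "greater than 3"; these events are illustrative choices, not from the original card.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed example)
E = {2, 4, 6}                      # even outcome
F = {4, 5, 6}                      # outcome greater than 3

def prob(event: set) -> Fraction:
    """Classical probability: favorable outcomes / equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

direct = prob(E | F)                         # count the union directly
by_rule = prob(E) + prob(F) - prob(E & F)    # General Addition Rule
print(direct, by_rule)
```

Subtracting P(E and F) removes the outcomes 4 and 6, which would otherwise be counted twice.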

10

For the shape of the distribution of the sample proportion to be approximately​ normal, it is required that ​np(1−​p)≥​______.
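
The condition is a one-line check. A minimal sketch, with a hypothetical function name and example values:

```python
def normality_condition_met(n: int, p: float) -> bool:
    """Return True when n * p * (1 - p) is at least 10."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return n * p * (1 - p) >= 10

print(normality_condition_met(100, 0.5))  # 100 * 0.5 * 0.5 = 25
print(normality_condition_met(30, 0.9))   # 30 * 0.9 * 0.1 is about 2.7
```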

A random sample of 100 adults aged 18 years or older were given a list of ice cream flavors and were asked to list which flavors they liked. The responses are given below. Chocolate 45 Strawberry 42 Vanilla 23 Mocha 19 Which of the following graphs would be most appropriate for visually displaying the​ results?

Frequency Bar Graph

P(E) + P(F)

If E and F are disjoint events, then P(E or F) = ___

Inferential; makes a prediction

If a polling organization claimed that the results of the survey indicate that 9% of adults in the country believe that the action is acceptable in certain situations, would you say this statement is descriptive or inferential? Why? The statement is _____ because it _____.

Normal

If a random variable X is normally​ distributed, what will be the shape of the distribution of the sample​ mean?

1.96

If a 95% confidence interval built from a sample proportion does not include the population proportion, then the sample proportion is more than ______ standard errors from the population proportion.

Independent

If n ≤ 0.05N, treat the events as ___.

Less than

If the normality requirement is not satisfied​ (that is, ​np(1−​p) is not at least​ 10), then a​ 95% confidence interval about the population proportion will include the population proportion in​ ________ 95% of the intervals.

1

In a relative frequency​ distribution, what should the relative frequencies add up​ to?

Interquartile Range (IQR)

In a typical box plot, the length of the box indicates which measure of spread?

Experiment

In​ probability, a(n)​ ________ is any process that can be repeated in which the results are uncertain.

False. In statistics, results are not reported with 100% certainty. Because statistical studies draw on samples, and because there is variation within groups, results cannot be reported with 100% certainty.

In​ statistics, results are always reported with​ 100% certainty. Choose the correct answer below.

Empirical Method

On the basis of a survey of 1000 families with eight ​children, the probability of a family having eight girls is 0.0064.

Empirical Method

On the basis of clinical​ trials, the probability of efficacy of a new drug is 0.82.

Multiplication Rule for Independent Events

P(E and F) = P(E) * P(F)

Complement Rule

P(E^c) = 1 - P(E)

Conditional Probability

P(F|E) is read "the probability of event F given event E." It is the probability that event F occurs, given that event E has occurred.

Conditional Probability Rule

P(F|E) = P(E and F) / P(E)
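
A short worked example of the rule, assuming one roll of a fair six-sided die with E = "even" and F = "at most 2" (illustrative events, not from the original card):

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die (assumed example)
E = {2, 4, 6}                      # even outcome
F = {1, 2}                         # outcome at most 2

def prob(event: set) -> Fraction:
    """Classical probability over equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

# Conditional Probability Rule: P(F given E) = P(E and F) / P(E)
p_f_given_e = prob(E & F) / prob(E)
print(p_f_given_e)  # (1/6) / (1/2)
```

In this particular example the conditional probability equals P(F) itself, so knowing that E occurred does not change the probability of F.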

Probability Distribution

Provides the possible values of the random variable and their corresponding probabilities.

B, C, A

Put the following in order for the most area in the tails of the distribution. ​(a) Standard Normal Distribution ​(b) Student's​ t-Distribution with 25 degrees of freedom. ​(c) Student's​ t-Distribution with 45 degrees of freedom.

Practical Significance

Refers to the idea that although small differences between the statistic and parameter stated in the null hypothesis are statistically significant, the difference may not be large enough to cause concern or be considered important.

Random Process

Represents scenarios where the outcome of any particular trial of an experiment is unknown, but the proportion (or relative frequency) a particular outcome is observed approaches a specific value.

False. Statistical studies are concerned with both describing the variability in the data and understanding the sources of variability in data. Understanding the sources allows researchers to control it and reach better conclusions.

Statistical studies are not concerned with understanding the sources of variability in​ data, only with describing the variability in the data. Choose the correct answer below.

Since the sample size is large enough, the population distribution does not need to be normal.

Suppose a simple random sample of size n=47 is obtained from a population with μ=62 and σ=15. (a) What must be true regarding the distribution of the population in order to use the normal model to compute probabilities regarding the sample mean? Assuming the normal model can be used, describe the sampling distribution of x̄.

0.95

Suppose the proportion of a population that has a certain characteristic is 0.95. The mean of the sampling distribution of p̂ from this population is μp̂=______.

True. Statistical studies typically look at samples rather than entire populations. Since each study is likely to draw different samples, it is quite possible that each study ends up with different results, due to variability in the data.

Suppose three different individuals conduct the same statistical​ study, such as estimating the average commute time of students at a college. It is possible that all three studies end up with different results. Choose the correct answer below.

Null

The ___ ​hypothesis, denoted H0​, is a statement to be​ tested, and is a statement of no​ change, no​ effect, or no difference.

True

True or False: Two events E and F are independent if P(E|F) = P(E)

1. The area corresponds to the proportion of the population with the characteristic. 2. The area corresponds to the probability that a randomly selected individual from the population has the characteristic.

The area under a normal curve corresponding to a certain characteristic of the normal random variable may be interpreted in any of the following ways.

1/2

The area under the normal curve to the right of μ equals​ _______.

Sample Space

The collection of all possible outcomes for that experiment

1. Identify the research objective 2. Collect the Data needed to Answer the Research Question(s) 3. Describe the Data 4. Perform Inference

The methods of statistics follow a process. Place the steps in the correct order.

Mode

The observation that occurs most frequently in the data set

Classical Method

The probability of having eight girls in an eight​-child family is 0.00390625.

Level of Significance

The probability of making a Type I error

Variance

The square of the standard deviation

Zero

The sum of the deviations about the mean always equals...

Point Estimate

The value of a statistic that estimates the value of a parameter

Median

The value that lies in the middle of the data when arranged in ascending order.

Random

The word _____ suggests an unpredictable result or outcome.

symmetric about 0.

The​ Student's t-distribution is ___

Sample Proportion; x/n

The _____ _____, denoted p̂, is given by the formula p̂=_____, where x is the number of individuals with a specified characteristic in a sample of n individuals.

Lower; upper

The​ ______ class limit is the smallest value within the class and the​ ______ class limit is the largest value within the class.

Level of Confidence; (1-alpha)*100%

The​ _______ represents the expected proportion of intervals that will contain the parameter if a large number of different samples of size n is obtained. It is denoted​ _______.

False

True or False: A​ 95% confidence interval may be interpreted by saying there is a​ 95% probability that the interval includes the unknown parameter.

False

True or False: The standard deviation can be negative.

False

True or False: The standard deviation is a resistant measure of spread.

True. A relative frequency histogram will have a different scale on the​ y-axis but the same shape as a regular histogram.

True or​ false? A histogram and a relative frequency​ histogram, constructed from the same​ data, always have the same basic shape.

Independent

Two events E and F are​ ________ if the occurrence of event E in a probability experiment does not affect the probability of event F.

All the observations have the same value.

What can be said about a set of data with a standard deviation of​ 0?

An event is unusual if it has a low probability of occurring. The choice of a cutoff should consider the context of the problem.

What does it mean for an event to be​ unusual? Why should the cutoff for identifying unusual events not always be​ 0.05?

A prospective study collects the data over time.

What does it mean when an observational study is​ prospective?

A retrospective study requires that individuals look back in time or requires the researcher to look at existing records.

What does it mean when an observational study is​ retrospective?

The graph of the normal curve slides right.

What happens to the graph of the normal curve as the mean​ increases?

The graph of the normal curve compresses and becomes steeper.

What happens to the graph of the normal curve as the standard deviation​ decreases?

Case-control studies are observational studies that are​ retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records.

What is a case-control study?

A confounding variable is an explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study.

What is a confounding​ variable?

In a designed experiment, a researcher assigns individuals to a certain group, intentionally changes the value of an explanatory variable, and then records the value of the response variable for each group.

What is a designed​ experiment?

A lurking variable is an explanatory variable that was not considered in a study, but that affects the value of the response variable in the study. In addition, lurking variables are typically related to explanatory variables in the study.

What is a lurking​ variable?

An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.

What is an observational​ study?

Make an assumption about​ reality, and collect sample evidence to determine whether it contradicts the assumption.

What is at the​ "heart" of hypothesis testing in​ statistics?

Cross-sectional studies are observational studies that collect information about individuals at a specific point in time or over a very short period of time.

What is a​ cross-sectional study?

Confounding in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study.

What is meant by​ confounding?

0.95

When constructing​ 95% confidence intervals for the mean when the parent population is right skewed and the sample size is​ small, the proportion of intervals that include the population mean approaches​ _____ as the sample​ size, n, increases.

Below

When constructing​ 95% confidence intervals for the mean when the parent population is right skewed and the sample size is​ small, the proportion of intervals that include the population mean is​ (above, below, equal​ to) 0.95.

Two-Tailed Hypothesis Testing Using Confidence Intervals

When testing H0: p=p0 versus H1: p≠p0, if a (1−α)⋅100% confidence interval contains p0, we do not reject the null hypothesis. However, if the confidence interval does not contain p0, we conclude that p≠p0 at the level of significance α.
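
The decision rule can be sketched in code. This is an illustration only: it assumes the usual large-sample interval p̂ ± z·sqrt(p̂(1−p̂)/n), and the function names and the counts (520 or 560 successes out of 1000) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(x: int, n: int, alpha: float = 0.05):
    """(1 - alpha) * 100% interval for a proportion: p_hat +/- z * standard error."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def reject_null(x: int, n: int, p0: float, alpha: float = 0.05) -> bool:
    """Reject H0: p = p0 exactly when the interval does not contain p0."""
    low, high = proportion_confidence_interval(x, n, alpha)
    return not (low <= p0 <= high)

print(reject_null(520, 1000, 0.5))  # interval contains 0.5, so do not reject
print(reject_null(560, 1000, 0.5))  # interval excludes 0.5, so reject
```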

Neither study is always superior to the other. Both have advantages and disadvantages that depend on the situation.

Which is the superior observational​ study?

Min. Value, Q1, Median, Q3, Max. Value

Which measures are used in the five-number summary?
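
A minimal sketch of computing the five values. Quartile conventions differ between textbooks and software; this version assumes the "median of each half" convention (the overall median is left out of both halves when the number of observations is odd), and the data set is made up for illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = xs[:half]             # values below the middle position
    upper = xs[half + (n % 2):]   # values above the middle position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

summary = five_number_summary([1, 3, 5, 7, 9, 11, 13, 15])
print(summary)
print("IQR =", summary[3] - summary[1])  # Q3 - Q1
```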

1. The area under the normal curve to the right of the mean is 0.5. 2. The graph of a normal curve is symmetric. 3. The high point is located at the value of the mean.

Which of the following are properties of the normal​ curve?

If the differences were not​ squared, then the sum of all deviations from the mean would always be zero since the positive deviations are balanced by the negative deviations.

Why does the formula for calculating the sample variance, s² = Σ(x − x̄)² / (n − 1), involve squaring the difference between each value and the mean?

If the formula involved division by​ n, the sample variance would be biased and consistently underestimate the population variance

Why does the formula for calculating the sample variance, s² = Σ(x − x̄)² / (n − 1), involve division by n−1 instead of n?
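
Both points about the sample variance formula can be seen on a small data set. The sketch below uses made-up data and hypothetical function names; it shows that the unsquared deviations sum to zero and that dividing by n gives a smaller value than dividing by n−1.

```python
def sum_of_squared_deviations(data):
    """Sum of (x - mean)^2 over all observations."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)

def variance_divide_by_n(data):
    """Divides by n; applied to a sample, this tends to underestimate."""
    return sum_of_squared_deviations(data) / len(data)

def sample_variance(data):
    """Divides by n - 1, as in the sample variance formula."""
    return sum_of_squared_deviations(data) / (len(data) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative sample with mean 5
mean = sum(data) / len(data)

print(sum(x - mean for x in data))   # unsquared deviations cancel out
print(variance_divide_by_n(data))    # 32 / 8
print(sample_variance(data))         # 32 / 7
```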

Use the results of the sample to conjecture the percentage of the population that has type O blood. Is this an example of descriptive or inferential​ statistics? Select the correct choice below and fill in the answer box to complete your choice.

___% inferential

Descriptive; Inferential

____________ statistics consists of organizing and summarizing information collected, while _________________ statistics uses methods that generalize results obtained from a sample to the population and measure the reliability of the results.

Pareto Chart

a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.

Pie Chart

a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.

Parameter

​A(n) ___________ is a numerical summary of a population.

Classes

are the categories by which data are grouped

Histogram

constructed by drawing rectangles for each class of data. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same, and the rectangles touch each other.

Resistant

if the observations that are extreme relative to the data do not affect its value substantially

Arithmetic Mean

is computed by adding all the values of the variable in the data set and dividing by the number of observations.

Bar Graph

is constructed by labeling each category of data on either the horizontal or vertical axis and the frequency or relative frequency of the category on the other axis. Rectangles of equal width are drawn for each category. The height of each rectangle represents the category's frequency or relative frequency.

Frequency Distribution

lists each category of data and the number of occurrences for each category of data.

Z-Score

represents the distance that a data value is from the mean in terms of the number of standard deviations. z= (value-mean)/ std. dev.
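
A minimal sketch of the z-score formula; the function name and the example numbers (mean 62, standard deviation 15) are chosen only for illustration.

```python
def z_score(value: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations that value lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev

print(z_score(77, 62, 15))  # one standard deviation above the mean
print(z_score(47, 62, 15))  # one standard deviation below the mean
```

A positive z-score means the value is above the mean; a negative z-score means it is below.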

Mean of a Discrete Random Variable

sum of x*p(x)
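
A worked example of this sum, assuming X is the number of heads in two flips of a fair coin (an illustrative distribution, not from the original card):

```python
def discrete_mean(distribution: dict) -> float:
    """Mean of a discrete random variable: the sum of x * P(x) over all values x."""
    total = sum(distribution.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in distribution.items())

# X = number of heads in two fair coin flips (assumed example).
heads = {0: 0.25, 1: 0.5, 2: 0.25}
print(discrete_mean(heads))  # 0(0.25) + 1(0.5) + 2(0.25)
```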

Class Width

the difference between consecutive lower class limits.

Range

the difference between the largest and smallest data value.

Relative Frequency

the proportion (or percent) of observations within a category and is found using the formula: relative frequency = frequency / sum of all frequencies

Interquartile Range (IQR)

the range of the middle 50% of the observations in a data set. IQR= Q3-Q1

Population Standard Deviation

the square root of the sum of squared deviations about the population mean divided by the number of observations in the population.

Sample Standard Deviation

the square root of the sum of squared deviations about the sample mean divided by n-1

Statistic

​A(n) _________ is a numerical summary of a sample.

