AP Stats Cumulative Review


What does it mean for results to be statistically significant?

The results are so out of the ordinary that they are very unlikely to have happened by chance alone... so we take them as evidence of a real effect!

How do we check for independence when sampling?

The sample must be less than 10% of the population.

What does it mean to call a result a biased estimator?

The statistic consistently misses the parameter: the center (mean) of its sampling distribution is not equal to the parameter (i.e. the population is not well represented).

To find P(X>4) for a binomial random variable, what should you use for X in the calculator function?

4, because P(X > 4) = 1 - P(X ≤ 4) = 1 - binomcdf(n, p, 4)
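
A minimal sketch of the complement idea behind this calculator entry, written in Python rather than on a calculator. The helper `binom_cdf` and the values n = 10 and p = 0.5 are assumptions made up for illustration; they are not from the original card.

```python
from math import comb

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Hypothetical setting: 10 trials, success probability 0.5
n, p = 10, 0.5

# P(X > 4) is the complement of P(X <= 4), so 4 is the value we plug in
p_more_than_4 = 1 - binom_cdf(n, p, 4)
print(round(p_more_than_4, 4))
```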

If you want to cut the standard deviation in 1/2 from one sample to another, how much larger should the new sample be?

4 times as large (dividing by the square root of n)
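
A quick check of the "4 times as large" answer, assuming we are talking about the standard deviation of a sample mean, sigma / sqrt(n). The population SD of 12 and the starting sample size of 25 are made-up values.

```python
from math import sqrt

sigma = 12.0  # hypothetical population standard deviation
n = 25        # hypothetical original sample size

sd_original = sigma / sqrt(n)          # SD of the sample mean with n
sd_quadrupled = sigma / sqrt(4 * n)    # SD of the sample mean with 4n

print(sd_original, sd_quadrupled)
```

Quadrupling n doubles sqrt(n), which is why the standard deviation is cut in half.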

What is t* for a sample of 6 with 99% confidence level?

4.032 - whoa?!? That's a lot of STDs!

If you are in the top 4% of the class, what percentile are you?

96th %ile

What happens to the shape of the t distributions as we increase sample size?

As sample size increases, variability decreases - the distribution gets narrower and more Normal

How do you find the probability of AT LEAST ONE?

At least one is the complement of none... so find the probability of no successes and then subtract from 1.
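
A short sketch of the complement rule for "at least one", assuming independent trials with the same success probability. The values p = 0.2 and n = 5 are hypothetical.

```python
p = 0.2  # hypothetical probability of success on one trial
n = 5    # hypothetical number of independent trials

p_none = (1 - p) ** n          # probability of no successes at all
p_at_least_one = 1 - p_none    # complement of "none"

print(round(p_at_least_one, 5))
```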

How do you find expected counts for a two-way table?

(column total)(row total)/(total sample)
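
As an illustration, the formula can be applied to every cell of a small table. The observed counts below are invented for the example.

```python
# Hypothetical 2x2 table of observed counts (rows x columns)
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: (row total)(column total) / (total sample)
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

print(expected)
```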

What are the conditions for a Binomial setting?

B: Binary, I: Independent, N: Number of trials is fixed, S: Success probability is fixed

What are the conditions for a Geometric setting?

B: Binary, I: Independent, T: Trials until first success, S: Success probability is fixed

What is the center of the hypothesized distribution for comparing two means or two proportions?

0

What alpha level should you use when you aren't given one?

0.05

What alpha level corresponds to a 90% confidence interval?

0.10

What is the "worst case scenario" p-hat value you can use for finding the sample size for a desired margin of error?

0.5

If I want to find the top 3% from a distribution, what area will I use for invNorm?

0.97
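
A sketch of the same lookup in Python, using `statistics.NormalDist.inv_cdf` as a stand-in for the calculator's invNorm. The mean of 100 and standard deviation of 15 are assumed values, not from the card.

```python
from statistics import NormalDist

# invNorm wants the area to the LEFT, so the top 3% starts at area 0.97
z_cutoff = NormalDist().inv_cdf(0.97)

# Same idea on a hypothetical Normal distribution with mean 100 and SD 15
value_cutoff = NormalDist(mu=100, sigma=15).inv_cdf(0.97)

print(round(z_cutoff, 2), round(value_cutoff, 1))
```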

When asked to "describe the sampling distribution" what information should you give?

1 - the center (mean), 2 - the spread (ideally STD... but if you can't calculate then the range will suffice), 3 - the shape - either you know it's Normal because the population is Normal or you use one of the checks for Normality to show it's Normal!

How is a population distribution different from a sampling distribution?

1 - the population distribution shows every actual value in the population, 2- the sampling distribution is the result of taking many many samples and has a smaller standard deviation

How is an experiment different from an observational study?

1. An experiment applies treatments and an observational study (should) have no influence on those being observed, 2. An experiment can extend to cause-effect where an observational study can only extend to the population sampled from

What is a matched pairs design?

1. It's a type of BLOCKING in experiments, 2. You apply a different treatment to each pair to compair (haha) the results. The BEST matched pair is something against itself!

What is a cluster sample?

1. It's not a SRS!, 2. All different types of people are represented in the cluster... then you randomly select a cluster

What is a stratified sample?

1. It's not a SRS!, 2. Group persons by similar characteristics and randomly select person(s) from each group (strata)... like mentor groups

What is a systematic sample?

1. It's not a SRS!, 2. Randomly select a starting point then select the following using a preestablished pattern (like every 5)

What is the 5 number summary and how do you get it?

1. Minimum - lowest value, 2. Q1 - quartile one (25th percentile), 3. Median - middle value (50th percentile), 4. Q3 - quartile three (75th percentile), 5. Maximum - highest value

What is the difference between P(A U B) and P(A and B)?

1. P(A U B) is the probability of A or B or both, 2. P(A U B) = P(A) + P(B) - P(A and B), 3. P(A and B) is the probability of A and B at the same time; on a Venn diagram this is the intersection, 4. P(A and B) = P(A) * P(B) iff A and B are independent!

What are the conditions for a chi-square test?

1. Random selection, 2. Individual observations are independent AND, if sampling, 10% condition, 3. All expected counts are at least 5

What is the difference between stratifying and blocking?

1. Stratifying is grouping by similar characteristics when SAMPLING, 2. Blocking is grouping by similar characteristic when EXPERIMENTING; NEVER say "stratified" when discussing experiments and NEVER say "blocked" when discussing sampling!!!

What is the difference between marginal and conditional distribution?

1. The marginal distribution is the row or column total out of the GRAND total, 2. A conditional distribution is the proportion of a group under a given category

What is the complement probability?

1. The probability of something NOT happening. 2. P(A') = 1 - P(A)

What are the differences between experimental UNITS, FACTORS, and LEVELS?

1. Treatments are imposed on the experimental units (or subjects if human), 2. Each treatment is a combination of the factors or explanatory variables, 3. Levels are the specific value of the factors

What is the 1.5 IQR rule and how do you use it?

1. find the IQR: Q3 - Q1, 2. multiply the IQR by 1.5, 3. add the number in (2) to Q3... numbers above are outliers, 4. subtract the number in (2) from Q1... numbers below are outliers
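
The four steps can be sketched as a small function. Q1 and Q3 are typed in by hand here because software packages compute quartiles with slightly different conventions; the data set and the name `outlier_fences` are made up for illustration.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs from the 1.5 IQR rule."""
    iqr = q3 - q1          # step 1
    step = 1.5 * iqr       # step 2
    return q1 - step, q3 + step   # steps 4 and 3

# Hypothetical sorted data; by the usual split-at-the-median method Q1 = 20, Q3 = 30
data = [3, 20, 22, 25, 28, 30, 50]
low, high = outlier_fences(20, 30)

outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)
```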

How do you increase power?

1. increase the alpha level (because the rejection region will be larger), 2. increase the sample size (because the STD of the sampling distribution will get smaller and it will take less to get to the rejection region)

What is the alpha level?

1. it's the threshold we have for rejecting the null hypothesis, 2. it's the probability of making a Type I error

What is a residual?

1. observed (actual) - expected (predicted from LSRL), 2. positive = above the LSRL and an underprediction, 3. negative = below the LSRL and an overprediction
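
A small worked example of a residual. The line y-hat = 2 + 3x and the data point (4, 15) are hypothetical.

```python
def predict(x, intercept=2.0, slope=3.0):
    """Predicted y from a hypothetical LSRL: y-hat = 2 + 3x."""
    return intercept + slope * x

x, observed_y = 4.0, 15.0        # hypothetical data point
predicted_y = predict(x)         # 2 + 3(4) = 14
residual = observed_y - predicted_y

# Positive residual: the point sits above the line, so the line underpredicted
print(residual, "underprediction" if residual > 0 else "overprediction")
```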

How do we linearize a power model?

1. take log of both x and y, 2. use a root

When can we reject the null hypothesis?

1. the P-value is lower than our alpha level or 2. the null hypothesized value is outside the confidence interval (for two-sided tests)

What is the test statistic?

1. z score for proportions, 2. t score for means or slope, 3. sum of chi-square

What is z* for 90% confidence level?

1.645
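
Where 1.645 comes from, sketched with Python's `statistics.NormalDist` in place of a table. The helper name `z_star` is an assumption for this example.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Critical value that leaves the middle `confidence_level` area under the standard Normal curve."""
    # A 90% interval leaves 5% in each tail, so look up the area 0.95 to the left
    return NormalDist().inv_cdf((1 + confidence_level) / 2)

print(round(z_star(0.90), 3))
```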

How do you find the mean for a Geometric random variable?

1/p

What does the notation μ + 2σ signify?

2 standard deviations above the mean

What is the difference between a distribution of sample data and a sampling distribution?

A distribution of sample data is for 1 sample. A sampling distribution is every possible result from a population with a given n.

What needs to be included in a probability model?

A sample space listing all possible outcomes, along with the probability of each outcome in the sample space.

If you don't select participants _______________________, then you can't make an inference for the population!

RANDOMLY!!!

If we take samples of size n = 50 from a population 100,000 times and plot the resulting mean of each sample, what do we call the distribution of this data?

An approximate sampling distribution - remember: a true sampling distribution has every possible result from a population where we are sampling 50 at a time.
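
A rough simulation of this idea. The population is invented (10,000 random values), and only 2,000 samples are drawn instead of 100,000 to keep the sketch fast; both choices are assumptions for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 values
population = [rng.random() for _ in range(10_000)]

n = 50               # sample size from the card
num_samples = 2_000  # far fewer than 100,000, just for speed

# Take many samples of size n and record the mean of each one
sample_means = [
    statistics.mean(rng.sample(population, n)) for _ in range(num_samples)
]

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

print("population mean:     ", round(pop_mean, 3))
print("mean of sample means:", round(statistics.mean(sample_means), 3))
print("pop SD / sqrt(n):    ", round(pop_sd / n ** 0.5, 3))
print("SD of sample means:  ", round(statistics.stdev(sample_means), 3))
```

The sample means should center near the population mean, with a spread close to the population SD divided by the square root of n.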

Why do the sampling distribution standard deviation formulas divide by n (under the square root)?

Because as we sample larger groups the standard deviation will decrease.

Why doesn't P(A U B) = P(A) + P(B)?

Because events that AREN'T mutually exclusive have an overlap... so if you add P(A) and P(B) you are including the events that are common to both twice! You MUST subtract P(A and B).

Why does P(A|B) = P(A) prove independence?

Because if A and B are independent, then knowing that B occurred has NO BEARING on the probability of A... so the probability of A under the condition of B is just the probability of A.

What does it mean to finish all tests with a CRAP?

C: conclusion in context (there is/isn't significant evidence of...), R: reject or fail to reject, AP: compare the alpha level and your P-value

If I multiply or divide all the data by a constant, what stays the same and what changes?

Changes: mean and STD, Same: shape

If I add a constant to (or subtract a constant from) all the data, what stays the same and what changes?

Changes: mean, Same: STD and shape

What must you include in the "Plan" step of the four-step process?

Checking conditions - Random, Independent, and Normal.

Is a segmented bar graph more like a conditional distribution or a marginal distribution?

Conditional distribution

If we are measuring blood pressure for the population of the US, is the variable discrete or continuous?

Continuous; we can take all values between two extremes.

Does the margin of error help account for sampling issues like nonresponse, voluntary response, response bias, etc.?

No - it just accounts for the difference in sampling results due to variability

What does it mean to find P(A|B)?

Find the probability of A given B... how likely is the event A under the condition of B... like, how likely is it that I draw a heart from a deck of cards if I already drew a heart?

How do you interpret slope in context?

For each unit increase in the (explanatory variable), the (response variable) changes by (insert slope)

How do you determine the degrees of freedom for a chi-square test?

For goodness of fit: number of categories - 1; For a two-way table (homogeneity or association/independence): (rows - 1)(columns - 1)

How is finding the STD for one variable taken many times different from finding the STD for two different variables combined?

For two different variables you combine the variance for each and then square root. For one variable many times, you combine the variance many times and then square root.

If we are counting number of siblings in different families, is our variable discrete or continuous?

Discrete; we can only take countable values.

What is a synonym for "mutually exclusive?"

Disjoint

How do we check for Normality with means?

Either the sample size is at least 30 or a graph of the sample data doesn't show any extreme skew or outliers

Does the confidence LEVEL tell us the chance that an interval has to capture the population parameter?

No - not a probability

What is another name for the mean of a probability distribution?

Expected Value

How do you find expected counts for a chi-square goodness of fit test?

Expected proportion or % of the total sample!

What features describe a scatterplot?

F:form (is it linear), O:outliers (any possible points away from the trend), D:direction (positive or negative slope), S:strength (how closely does it follow a line)

Suppose you have a probability distribution for number of apples and get a mean of 3.2. Should we round down to 3?

No. It's an average and can be a decimal.

Are mutually exclusive events independent?

Heck no teckno... when mutually exclusive you CAN'T be in the other event... so they are dependent. Example: If you are a man, you can't get pregnant. Man and pregnant are mutually exclusive. They aren't independent because if you are man, that dictates that you can't be pregnant!

What is the difference between a chi-square homogeneity test and an association/independence test?

Homogeneity is for separate samples (like with an experiment). Association/Independence is comparing groups within one sample.

What is the 68-95-99.7 rule (a.k.a. Empirical rule)?

In a NORMAL distribution 68% of the population is captured within +/- 1 std, 95% within +/- 2 stds, and 99.7% within +/- 3 stds.
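
The three percentages can be checked against the standard Normal curve. This sketch uses `statistics.NormalDist`; the rule's 68, 95, and 99.7 are rounded versions of what it prints.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

# Area between -k and +k standard deviations, for k = 1, 2, 3
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```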

What is the difference between a paired t-test and a two sample t-test?

In a paired t test you are finding ONE sample mean and STD from a number of pairs that are dependent! In a two sample test you are comparing two different independent groups!

If we sample from a population that we know is Normal, do we still have to meet the conditions for Normality?

Nope!

What is a contribution?

Individual chi-square values... the largest contribution is the one furthest from the expected!
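
A sketch of how the contributions add up to the chi-square statistic, using the usual (observed - expected)^2 / expected for each cell. The counts are invented for the example.

```python
# Hypothetical observed and expected counts for three categories
observed = [30, 50, 20]
expected = [25, 50, 25]

# Each cell's contribution: (observed - expected)^2 / expected
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# The test statistic is the sum of the contributions
chi_square = sum(contributions)

print(contributions, chi_square)
```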

What is the confidence LEVEL?

It tells us the percent of samples of a given size that will result in an interval that will capture the population parameter.

What do we know about the chi-square distribution?

It's always positive, it's skewed right, and the skew decreases as the degrees of freedom increase.

What is "scope of inference"?

It's how we can extend the results... can we apply the results to just those in the experiment, to the population, or for cause & effect?

What is a P-value?

It's the probability of the sample result or more extreme if the null hypothesis is true.

On the computer output for the LSRL, what is "s"?

It's the standard deviation of the residuals... in other words, the average prediction error!

What are the conditions for a Lin-Reg t-test or t-interval?

L:linear (residuals show no pattern), I:independent observations AND 10% if sampling, N:normal... boxplot of residuals is roughly symmetric with no outliers, E:equal variance above and below 0 on the graph of the residuals, R:random (duh!)

What is the confounded problem with lurking variables? (stats jokes)

Lurking variables make it hard for us to determine if the treatment was actually what resulted in a response - these can usually be accounted for by better experimental design. Confounding variables have the same effect but are harder to separate out.

What is a Standard Normal Distribution?

Mean is 0 and std is 1

If I add AND multiply the data by two constants, how do the measures of center and spread change?

Mean: adds and multiplies, STD: only multiplies!!!!!!
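
A quick check with made-up data, transforming each value as new = a + b * x. The data set and the constants a = 10 and b = 2 are assumptions for the example.

```python
import statistics

data = [1, 2, 3, 4, 5]  # hypothetical data
a, b = 10, 2            # add 10, multiply by 2

transformed = [a + b * x for x in data]

old_mean, new_mean = statistics.mean(data), statistics.mean(transformed)
old_sd, new_sd = statistics.stdev(data), statistics.stdev(transformed)

print(old_mean, new_mean)                  # mean is multiplied AND shifted
print(round(old_sd, 4), round(new_sd, 4))  # SD is only multiplied
```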

When should you write "normalcdf" on a test?

NEVER NEVER NEVER NEVER NEVER NEVER NEVER

Does an association show cause and effect?

NO!!!!

When you combine two different, independent, variables can you just combine the STDs?

NO!!!!!!!!!!!! Only variances can combine... silly!

In the concluding step of the four-step process, can you say your simulation results prove a claim is true?

NO!!!!!!!!!!!! Simulations give us an idea about the likelihood of an outcome but we can't say it PROVES something is true.

What operation(s) affect the shape of a distribution?

NONE!!!

What is the difference between P(x ≥ 290) and P(x > 290) on a Normal distribution?

NOTHING!

How are conditions different for an experiment?

Need random assignment!

When you combine two different, independent, variables by adding or subtracting, how do you find the combined mean and combined STD?

New mean: just add or subtract the means; New STD: find the variances, then add, then square root!
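
A worked sketch with hypothetical numbers: X has mean 10 and SD 3, Y has mean 4 and SD 4, and the two are assumed independent.

```python
from math import sqrt

mean_x, sd_x = 10.0, 3.0  # hypothetical variable X
mean_y, sd_y = 4.0, 4.0   # hypothetical variable Y, independent of X

mean_sum = mean_x + mean_y    # means add
mean_diff = mean_x - mean_y   # or subtract

# Variances ADD whether we add or subtract the variables
sd_combined = sqrt(sd_x ** 2 + sd_y ** 2)

print(mean_sum, mean_diff, sd_combined)
```

Note that the combined SD is 5, not 3 + 4 = 7.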

What is the difference between population and sample?

Population is who you are trying to represent... sample is who actually responded. Suppose you have 1200 people in your population and you send surveys to 100 of them... then 42 respond... your sample is NOT the 100... it's the 42... don't get suckered!

What is power?

Power is the strength of a test to reject the null for some center that is different than the null.

How does the Law of Large Numbers relate to probability?

Probability is what happens in the LONG RUN... so you need a lot of trials to get close to the true probability; hence, large numbers.

What function on the calculator will find the STD for the probability distribution of a discrete random variable?

One-Var Stats

What is the formula for finding P(A|B)?

P(A and B)/P(B)
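
A small example of the formula with a standard 52-card deck, where A is "the card is a heart" and B is "the card is red". The choice of events is an assumption for illustration.

```python
from fractions import Fraction

deck_size = 52
hearts = 13          # event A: the card is a heart
red_cards = 26       # event B: the card is red
hearts_and_red = 13  # every heart is red, so A and B overlap in all 13 hearts

p_a = Fraction(hearts, deck_size)
p_b = Fraction(red_cards, deck_size)
p_a_and_b = Fraction(hearts_and_red, deck_size)

p_a_given_b = p_a_and_b / p_b

# Independence check: is P(A|B) the same as P(A)?
independent = (p_a_given_b == p_a)

print(p_a_given_b, p_a, independent)
```

Here P(A|B) = 1/2 but P(A) = 1/4, so these two events are not independent.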

If you are using Table B, and the degrees of freedom isn't on the table, should you round down or up?

Round down!!!

If you are solving for the sample size for a desired margin of error and you get n = 94.12, should you round up or down?

Round up!!!
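
A sketch of the sample-size calculation for a proportion, assuming the usual formula n = (z*/ME)^2 * p(1 - p). The 95% confidence level (z* = 1.96), the 3% margin of error, and the function name are hypothetical choices.

```python
from math import ceil

def sample_size_for_proportion(margin_of_error, z_star=1.96, p_guess=0.5):
    """Smallest whole n that keeps the margin of error at or below the target."""
    n_exact = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return ceil(n_exact)  # always round UP, never down

# Hypothetical target: margin of error of 0.03 at 95% confidence, worst-case p-hat = 0.5
n_needed = sample_size_for_proportion(0.03)
print(n_needed)
```

Rounding down would leave the margin of error slightly larger than the target, which is why 94.12 becomes 95.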

What is an SRS?

Simple Random Sample... every person/thing in the population has an equal chance of being selected (names in a hat, random digits, etc.)

What is sampling variability?

Simply that we know results will vary from sample to sample!

If two events are DEPENDENT, how do you find P(A and B)?

Since P(A|B) = P(A and B)/P(B), then you can rearrange this so P(A and B) = P(A|B)*P(B)!

What does the 10% condition pertain to?

The 10% condition lets us sample without replacement so long as we don't sample more than 10% of the population; when we stay within the 10% condition we can calculate the standard deviation of the sampling distribution.

What does it mean if a test is one-sided?

The alternative hypothesis is either less than or greater than.

What does it mean for two or more events to be mutually exclusive?

They have no outcomes in common.

What is a z-score?

The number of stds from the mean!

What is the relationship between power and Type II error?

The probability of a Type II error is 1 - power.

What is a Type II error?

We failed to reject the null but it was false!

What are the three components to good experimental design?

The three Rs: 1. Random Assignment, 2. Replication - use enough units in the experiment that the results can't be just by chance or, if possible, repeat the experiment more than once, 3. ContRol

What is a completely randomized design?

The experimental units are assigned to the treatment groups completely at random... NO organizing or separating of units beforehand!

What does it mean if P(A and B) = 0?

There are no outcomes in common so A and B are mutually exclusive or disjoint.

What are some forms of NONsampling bias?

These errors happen regardless of how good your sampling method is... 1. Nonresponse: some people just don't respond, 2. Response bias: some people respond the way they think you want them to respond, 3. Question wording: no matter how careful you are, questions are interpreted differently by different groups

What is the Central Limit Theorem?

We know that the sampling distribution of means will approach Normality as we increase sample size - the ideal being a sample size of at least 30!

What is a Type I error?

We rejected the null but it was true... oops.

What features "describe a distribution?"

Think: GSOCS - G: Gaps, S: Shape (symmetric, skew, etc.), O: Outliers, C: Center (median or mean), S: Spread (range, IQR, or standard deviation)

What is the purpose of a control group?

To provide a baseline for comparison... it's difficult to argue that a treatment has an effect if you don't have something to compare to!

What does it mean to standardize a score?

Turn it into a z-score!

When you are given a P-value on a computer output, what alternative is it based on?

Two sided

If you're not sure of what symbol to use in a free response question, what should you do?

USE WORDS!

What does it mean to increase power?

We are making it more probable that we reject the null hypothesis!

What does it mean to say results are "statistically significant?"

We were able to reject the null hypothesis.

Under what circumstance does P(A and B) = P(A)*P(B)?

When A and B are INDEPENDENT... you can't just multiply probabilities if one depends on the other!

Under what condition can you find the combined variance of two different variables?

When the outcomes are independent.

When can we generalize to the population?

When the sample comes from the population of interest AND the sample is obtained randomly (doesn't have to be a straight SRS)

When can we establish a cause and effect relationship?

When we conduct an experiment... preferably there is a control group for a baseline comparison!

When can we NOT calculate the standard deviation of the sampling distribution?

When we sample more than 10% of the population!

When do you use a chi-square goodness of fit test?

When you have one categorical variable!

When do I use InvNorm?

When you know the area/proportion/percent/probability and need to work backwards to find a specific value that corresponds to that area/proportion/percent/probability.

How is the random condition different when doing a matched pairs EXPERIMENT?

You don't have to randomly sample but you DO have to randomly assign treatments.

What is a false positive?

You get a positive test result but you shouldn't have. For instance, a woman could do an at-home pregnancy test that shows she is pregnant but then later go to the doctor and find out she isn't pregnant. A false negative would be the opposite: a woman takes a pregnancy test at home that comes back negative but she's actually pregnant.

What does it mean to have paired data?

You have matched pairs, applied treatments to each pair, then are measuring the differences.

How do you prove two variables are independent?

You must show that the likelihood of an event is the same with or without a condition (i.e. the condition doesn't matter)... show P(A|B) = P(A)

What will happen if you give a naked probability answer?

You will get NO credit. Always show where calculations come from.

What are some different forms of SAMPLING bias?

Your method of sampling is bad because... 1. Undercoverage: you don't get a certain group of the population of interest, 2. Voluntary response: people choose to respond, 3. Convenience: not random

What does it mean (hahaha) to be "resistant to outliers?"

a value is not influenced by extremes/outliers

What does the standard deviation describe?

the typical (roughly average) distance of the values from the mean

Can you use a histogram or a bar graph for categorical data?

bar graph

What calculator function will determine cumulative binomial probabilities?

binomcdf

How do you determine the value of p-hat?

count of successes out of size of sample

How do you find degrees of freedom with a Linear Regression t-test?

df = number of pairs - 2

What is the difference between the explanatory and response variables?

eXplanatory: the independent variable that we are altering (graphed on x axis); response: the dependent variable that we are measuring as a result of altering the explanatory variable (graphed on y axis)

What is "standard error"?

estimating the standard deviation of the sampling distribution from the data

How can we assess Normality?

make a graph

If a distribution is skewed right, how do the mean and median compare?

mean > median

What is the difference between x bar and mu?

mu is for population mean and x-bar is for sample mean

What operation(s) affect the mean AND std?

multiplying or dividing b/c the distribution is stretched or compressed and moves

How do you find the mean for a Binomial random variable?

n times p

How do we check for Normality with proportions?

n times p-hat and n times (1 minus p-hat) are both at least 10

How do we test for Normality with the sampling distribution of p-hat?

np and n(1-p) have to be at least 10 where n is the sample size and p is the population proportion!
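
The check is simple enough to sketch as a function. The name `large_counts_ok` and the example values are hypothetical.

```python
def large_counts_ok(n, p, minimum=10):
    """True when both the expected successes and expected failures reach the minimum."""
    return n * p >= minimum and n * (1 - p) >= minimum

# Hypothetical checks with p = 0.1
small_sample = large_counts_ok(50, 0.1)    # only 5 expected successes
large_sample = large_counts_ok(200, 0.1)   # 20 expected successes, 180 failures

print(small_sample, large_sample)
```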

What is the null hypothesis for matched pairs?

the mean difference is equal to zero! (i.e. there is no difference between the two values)

Do you use the combined sample proportion with a confidence interval or a significance test (or both)?

only in a significance test

What does normalcdf find?

proportion, area, percent, or probability on a standard Normal curve based on a z-score

When P is low...

reject the Ho

How do you determine degrees of freedom for a one-sample t-test?

sample size - 1

When asked to describe the sampling distribution what features do you report?

shape, center, spread

How can you find the equation of the LSRL without data?

slope = r(sy/sx)... then use x bar and y bar as a point on the line
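
A sketch of building the line from summary statistics only. All five numbers below are made up for the example.

```python
# Hypothetical summary statistics
r = 0.8                    # correlation
s_x, s_y = 2.0, 5.0        # standard deviations of x and y
x_bar, y_bar = 10.0, 50.0  # means of x and y

slope = r * (s_y / s_x)

# The LSRL passes through (x_bar, y_bar), so solve y_bar = intercept + slope * x_bar
intercept = y_bar - slope * x_bar

print(f"y-hat = {intercept} + {slope}x")
```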

How do you find the conservative degrees of freedom for two samples?

smallest sample - 1

A student survey from a random sample of EC students last year found that 42% had committed an act of academic dishonesty on a test. Is 0.42 the parameter or the statistic?

statistic

What is the name of the test for means?

t test

How do we linearize an exponential model?

take the log of y

Where is the mean on a density curve?

the balance point - where would you put a fulcrum?

What does it mean to interpret r squared?

the coefficient of determination tells us the percent of the variation in the response variable that can be explained by the linear relationship with the explanatory variable... this can be fairly small because we know there are MANY outside factors that influence the response variable!

What does it mean to interpret r?

the correlation coefficient tells us the strength and direction of a linear relationship

Where is the median on a density curve?

the equal areas point (50th %ile).

What is a critical value?

the number of STDs above and below the point estimate for a confidence interval

What is a confidence INTERVAL?

the point estimate plus or minus the margin of error
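
One way to see "point estimate plus or minus margin of error" is a one-sample z interval for a proportion. This is a sketch under assumptions: the sample proportion 0.42, the sample size of 200, and the 95% confidence level (z* = 1.96) are all hypothetical choices.

```python
from math import sqrt

p_hat = 0.42   # point estimate (a sample proportion)
n = 200        # hypothetical sample size
z_star = 1.96  # critical value for 95% confidence

standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * standard_error

# Interval = point estimate +/- margin of error
interval = (p_hat - margin_of_error, p_hat + margin_of_error)

print(round(interval[0], 3), round(interval[1], 3))
```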

What is a point estimate?

the sample result used to estimate the population parameter

What is variance?

the square of the standard deviation

What is a percentile?

the value with p percent of the observations less than or equal to it

When will you need Table B?

to find z* and t* with confidence intervals

When do you have to double the P-value that you calculate?

when your alternative is two-sided (not equal)

What is the name of the test for proportions?

z test

What is the difference between z* and t*?

z* is used for proportions, or for means if we know the population STD; t* is used for means when we don't know the population STD

