BBSS-E

T/F A large P-value in a test will favor a rejection of the null hypothesis.

False. A small P-value in a test will favor a rejection of the null hypothesis. The smaller the P-value of the test, the more evidence there is to reject the null hypothesis.

T/F In a hypothesis test, you assume the alternative hypothesis is true.

False. In a hypothesis test, you assume the null hypothesis is true.

Hypothesis Testing

method for testing a claim or hypothesis about a parameter in a population using data measured in a sample (likelihood it would be true)

Null hypothesis (H0)

contains a statement of equality (greater than or equal to, less than or equal to, equal to)

alpha

denotes the level of significance of a hypothesis test, and it is the probability of committing a type I error

Power

probability of rejecting a false null

null distribution

the sampling distribution of outcomes for a test statistic under the assumption that the null hypothesis is true

Use the confidence interval to find the margin of error and the sample mean. (1.66,2.04)

(2.04-1.66)/2 = .19 Margin of error = .19 2.04-.19 = 1.85 Sample mean = 1.85
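
The arithmetic above can be sketched in a few lines of Python. The helper name `interval_to_mean_and_margin` is an illustrative assumption, not something from the course material; it only assumes the interval is symmetric around the sample mean.

```python
def interval_to_mean_and_margin(lower, upper):
    """Recover the center and half-width of a symmetric confidence interval."""
    margin = (upper - lower) / 2  # half of the interval width
    mean = (upper + lower) / 2    # midpoint of the interval
    return mean, margin


mean, margin = interval_to_mean_and_margin(1.66, 2.04)
print(round(mean, 2), round(margin, 2))
```

The same helper applies to any interval of the form center ± margin of error.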

T-stat

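(x̄ − μ) / (s / √n), where x̄ is the sample mean, μ is the population mean stated in the null hypothesis, s is the sample standard deviation, and n is the sample size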
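
A minimal sketch of this formula in Python. The function name and the example numbers (sample mean 52, hypothesized mean 50, standard deviation 4, n = 16) are made up for illustration.

```python
import math


def t_statistic(sample_mean, null_mean, sample_sd, n):
    """Number of standard errors between the sample mean and the null mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Hypothetical numbers: the standard error is 4 / sqrt(16) = 1
t = t_statistic(sample_mean=52, null_mean=50, sample_sd=4, n=16)
print(t)
```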

null hypothesis (H0)

(stated as the null) a statement about a population parameter, such as the population mean, that is assumed to be true

-P-values are probabilities, so they are always a number between .... -The order of ... of the P-value matters more than its exact numerical value.

0 and 1 magnitude

100!/98!

100x99 (everything else cancels out)

t distribution

The sampling distribution of the test statistic

probability rule: the probability that an event A does not occur equals 1 ... the probability that it does occur

minus

alternative, Ha

more general statement that complements yet is mutually exclusive with null hypothesis

use the values population mean = 6.39 sample mean = 5.1 to find the sampling error

sample mean (x-bar) - population mean (mu) = sampling error -1.29

2 biases that can occur

1. Sample selection bias (e.g., survivorship bias): excluding situations that haven't survived. 2. Time-period bias: sensitivity to the starting/ending dates of the sample.

Playing the game of roulette, where the wheel consists of slots numbered 00, 0, 1, 2, ..., 41 To play the game, a metal ball is spun around the wheel and is allowed to fall into one of the numbered slots.

sample space = {00, 0, 1, 2, ..., 41}. outcomes= 43

Identify the sample space of the probability experiment and determine the number of outcomes in the sample space. -Guessing the last digit in the price of a TV

sample space= 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 outcomes= 10

A lower confidence level C produces a ... margin of error m (more ... less accuracy).

smaller precision

p-value

the smallest level of significance at which the null can be rejected. If the P-value is low, reject the null.

significance tests

someone makes a claim about the unknown value of a population parameter; the test uses sample data to decide on the validity of that specific hypothesis

The null hypothesis, H0, is a very ... statement about a parameter of the population(s).

specific

List 4 steps to hypothesis testing

state hypothesis, set criteria for decision, compute test stat, make decision

r measures the ... and ... of a linear association

strength direction

Wording effects can influence _____________________________.

survey results

P(A or B) = P(A) + P(B)

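True only if A and B are disjoint (mutually exclusive). In general, P(A or B) = P(A) + P(B) − P(A and B).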

one sided vs two sided

two-sided: symmetric, ≠ sign; one-sided: asymmetric and specific, > or < sign

For ... tests, the P-value is 2P(Z ≥ |z|) because of the symmetry of the normal curve

two-sided
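
As a rough illustration, the two-sided P-value 2P(Z ≥ |z|) can be computed with the standard library's `statistics.NormalDist`. The function name `two_sided_p_value` and the example z of 1.96 are assumptions for this sketch.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """P-value for a two-sided z test: twice the upper-tail area beyond |z|."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail


p = two_sided_p_value(1.96)
print(round(p, 3))
```

Because the standard normal curve is symmetric, a z of −1.96 gives the same P-value.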

summarize data about two categorical variables or factors collected on the same set of individuals

two-way tables

null hypothesis

designated H0, is the hypothesis that the researcher wants to Reject

r^2, the coefficient of ..., is the square of the correlation coefficient

determination

for continuous probabilities events are defined over the ... of values

intervals

... normal calculations: used when you are seeking the range of values that corresponds to a given proportion/area under the curve. Find the desired area/proportion in the body of the table, then read the corresponding z-value from the left column and top row.

inverse

Because ... have small chance variation, very small population effects can be highly significant if the sample is large.

large random samples

Higher confidence C implies a ... margin of error m (less precision more ...).

larger accuracy

to calculate the area between two z-values, first get the area under N(0, 1) to the left for each z-value from the table and then subtract the smaller area from the ...

larger area

significance level: alpha

largest P-value tolerated for rejecting H0, decided arbitrarily before conducting the test. When P ≤ α, we reject the null; when P > α, we fail to reject the null.

as the number of randomly drawn observations (n) in a sample increases: -the mean of the sample gets closer and closer to the population mean -the sample proportion gets closer and closer to the population proportion p

law of large numbers

the ... describes what would happen if we took samples of increasing size n

law of large numbers

the area between z1 and z2 = area ... of z1 - area ... of z2

left left
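
A small sketch of the "left area minus left area" rule, using `statistics.NormalDist` in place of a printed table. The helper name `area_between` is hypothetical; it sorts the two z-values so the smaller left area is always subtracted from the larger one.

```python
from statistics import NormalDist


def area_between(z1, z2):
    """Area under N(0, 1) between two z-values."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    low, high = sorted((z1, z2))
    return standard_normal.cdf(high) - standard_normal.cdf(low)


# Areas within 1, 2, and 3 standard deviations of the mean
for k in (1, 2, 3):
    print(k, round(area_between(-k, k), 4))
```

The three printed areas line up with the 68-95-99.7 rule.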

How do you Carry out a hypothesis test

-You assume the null hypothesis is true -Then consider how likely the observed value of the test statistic was to occur -If the likelihood is less than a given threshold, then you reject the null hypothesis

steps in hypothesis testing

1. Stating the hypotheses. 2. Identifying the appropriate test statistic and its probability distribution. 3. Specifying the significance level. 4. Stating the decision rule. 5. Collecting the data and calculating the test statistic. 6. Making the statistical decision. 7. Making the economic or investment decision.

List the four survey challenges.

1. Undercoverage or selection bias 2. Nonresponse 3. Wording effects 4. Response bias

Name the three sampling processes.

1. Voluntary response sampling 2. Convenience sampling 3. Probability sampling

Level of confidence 90% = critical value ___

1.645

Level of confidence 99% = critical value ___

2.575
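
These critical values can be reproduced, as a sketch, from the inverse of the standard normal CDF. The function name `z_critical` is an assumption. Exact computation gives about 2.576 for 99%, which printed tables often list as 2.575.

```python
from statistics import NormalDist


def z_critical(confidence):
    """z* that leaves (1 - confidence) / 2 in each tail of N(0, 1)."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```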

A probability experiment consists of rolling a 6-sided die. Find the probability of the event below. rolling a number less than 3

2/6= 0.333

almost all 99.7% of observations are within ... of the mean

3 standard deviations

The access code for a car's security system consists of four digits. The first digit cannot be 1 and the last digit must be odd. How many different codes are available?

4,500 = 9 × 10 × 10 × 5 (0 is allowed as a digit)

a sample size of ... or more will typically be good enough to overcome an extremely skewed population and mild outliers in the sample

the Poisson distribution is skewed when μ < ...

how many different groups of 3 can be selected from 5 ppl

5!/(3!(5−3)!) = 10

A certain lottery has 35 numbers. In how many different ways can 4 of the numbers be selected? (Assume that order of selection is not important.)

52,360. nCr = n!/((n−r)! r!) = 35!/((35−4)! 4!) = 1,256,640/4! = 52,360

There are 50 members on the board of directors for a certain non-profit institution. If they must elect a chairperson, first vice chairperson, second vice chairperson, and secretary, how many different slates of candidates are possible?

5527200 50x49x48x47
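
The counting answers in the cards above can be checked with Python's `math.comb` (order does not matter) and `math.perm` (order matters), both available from Python 3.8. This is only a verification sketch; the variable names are illustrative.

```python
import math

# Combinations: groups of 3 chosen from 5 people
groups_of_three = math.comb(5, 3)

# Combinations: 4 lottery numbers chosen from 35, order not important
lottery_selections = math.comb(35, 4)

# Permutations: 4 distinct officer positions filled from 50 board members
officer_slates = math.perm(50, 4)

# Factorial ratio: 100!/98! reduces to 100 x 99
factorial_ratio = math.factorial(100) // math.factorial(98)

print(groups_of_three, lottery_selections, officer_slates, factorial_ratio)
```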

Critical region

A region of the probability distribution which, if the test statistic falls within it, would cause you to reject the null hypothesis

Hypothesis =

A statement made about the value of a population parameter

One-tailed hypothesis

Alternate Hypothesis; H(1): p<... and H(1): p>...

Two-tailed hypothesis

Alternate Hypothesis; H(1): p≠...

What does the symbol H1 stand for?

Alternative Hypothesis

Interval Estimate

An interval, or range of values, used to estimate a population parameter

Sixth Step

Compare the observed value (OV) to the critical value (CV)

Fifth Step

Compute the CV using the appropriate table

Fourth Step

Compute the test statistic value to get the OV

Statistical significance only says whether the effect observed is likely to be due to chance alone because of random sampling

It doesn't tell about the magnitude of the effect, and the effect may not be practically important. With a large sample size, a small effect can be significant.

T/F You toss a fair coin nine times and it lands tails up each time. The probability it will land heads up on the tenth flip is greater than 0.5.

False- You toss a fair coin nine times and it lands tails up each time. The probability it will land heads up on the tenth flip is exactly 0.5.

The probability that event A or event B will occur is P(A or B)=P(A)+P(B)−P(A or B).

False. P(A or B) = P(A) + P(B) − P(A and B).

One- tailed test

Hypothesis tests with alternative hypotheses in the form H1: p < ... and H1: p > ...

Two-tailed tests

Hypothesis tests with an alternative hypothesis in the form H1: p ≠ ...

Null Hypothesis (Ho)

Hypothesis that you assume is correct.

Seventh Step

If the OV is greater than the CV, reject the null. If the OV is less than the CV, fail to reject the null.

Error rates

Occur when we've made a mistake in drawing our statistical conclusion. There are Type I and Type II errors.

general addition rule: P(A or B)

P(A) + P(B) - P(A and B)

multiplication rule for independent events: if A and B are independent then: P(A and B) =

P(A)P(B)

general multiplication rule: the probability that any two events, A and B, both occur is: P(A and B) =

P(A)P(B|A)

Bayes' theorem: if we know the conditional probability P(B|A) and the individual probability P(A), we can use Bayes' theorem to find the conditional probability of ...

P(A|B)

when two events A and B are independent, P(B|A) = ... because no information is gained from the knowledge of event A

P(B)

for example, P(X ≤ 2) =

P(X = 0) + P(X = 1) + P(X = 2)

A ... quantifies how strong the evidence is against the H0. But if you reject H0, it doesn't provide any information about the true population mean µ.

P-value

The probability, if H0 was true, of obtaining a sample statistic at least as extreme (in the direction of Ha) as the one obtained.

P-value

Decide if the situation involves permutations, combinations, or neither. Explain your reasoning. The number of ways 19 people can line up in a row for concert tickets. Does the situation involve permutations, combinations, or neither?

Permutations. The order of the 19 people in line matters.

general decision rule for a two-tailed test is

Reject H0 if: test statistic > upper critical value or test statistic < lower critical value

parametric test

t-test, z-test, chi-square test, F-test. 1. Concerned with parameters (mean/variance). 2. The validity depends on a definite set of assumptions concerning the parameters of the distribution.

Construct the indicated confidence interval for the population mean μ using the t-distribution. c=0.95, x bar=13.1, s=3.0, n=5

Tc= 2.776 margin of error= 3.7 xbar +- 3.7 (9.4, 16.8)
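
The interval above can be reproduced with a short sketch. The critical value 2.776 (t with n − 1 = 4 degrees of freedom at c = 0.95) is taken from the card rather than computed, and the function name `t_interval` is an assumption.

```python
import math


def t_interval(sample_mean, sample_sd, n, t_critical):
    """Confidence interval x-bar +/- t_c * s / sqrt(n)."""
    margin = t_critical * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


low, high, margin = t_interval(13.1, 3.0, 5, t_critical=2.776)
print(round(margin, 1), round(low, 1), round(high, 1))
```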

Alternative Hypothesis H1

Tells you about the parameter if your assumption is shown to be wrong

What is the alternative hypothesis, H₁?

Tells you about the parameter if your assumption is shown to be wrong

Significance level =

The probability of rejecting the null hypothesis when it's true. E.g., a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference.

Critical value

The first value to fall inside of the critical region

Null Hypothesis H0

The hypothesis that is assumed to be correct

What is the null hypothesis, H₀?

The hypothesis that you assume to be correct

Write a statement that represents the complement of the given probability. The probability of randomly choosing a tea drinker who has a college degree (Assume that you are choosing from the population of all tea drinkers.)

The probability of choosing a tea drinker who does not have a college degree

What is the actual significance level of a hypothesis test?

The probability of incorrectly rejecting the null hypothesis

Actual significance level

The probability of incorrectly rejecting the null hypothesis.

In the general population, one woman in ten will develop breast cancer. Research has shown that 1 woman in 650 carries a mutation of the BRCA gene. Seven out of 10 women with this mutation develop breast cancer.

The probability that a randomly selected woman will develop breast cancer given that she has a mutation of the BRCA gene = 0.7. The probability that a randomly selected woman will carry the gene mutation and develop breast cancer = 0.7 × (1/650) ≈ 0.0011. The events are dependent.
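
A sketch of the same numbers in Python, using the general multiplication rule and, as an extra step not asked for in the card, Bayes' theorem to get the probability of carrying the mutation given breast cancer. The variable names are illustrative.

```python
p_cancer = 1 / 10               # P(cancer) in the general population
p_mutation = 1 / 650            # P(mutation)
p_cancer_given_mutation = 0.7   # P(cancer | mutation)

# General multiplication rule: P(mutation and cancer)
p_both = p_cancer_given_mutation * p_mutation

# Bayes' theorem: P(mutation | cancer)
p_mutation_given_cancer = p_both / p_cancer

# Under independence, P(mutation and cancer) would equal P(mutation) * P(cancer)
independent = abs(p_both - p_mutation * p_cancer) < 1e-12

print(round(p_both, 4), round(p_mutation_given_cancer, 4), independent)
```

Since the joint probability differs from the product of the two individual probabilities, the events are dependent.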

Test statistic =

The result of the experiment, or the statistic that is calculated from the sample

What is the test statistic?

The result of the experiment or the statistic that is calculated from the sample (eg the number of heads in 8 tosses)

A combination is an ordered arrangement of objects.

The statement is false. A true statement would be "A permutation is an ordered arrangement of objects." The number of different permutations of n distinct objects is n!. A combination, on the other hand, is a selection of r objects from a group of n objects without regard to order and is denoted by nCr.

Why do we standardize to a t-scale?

We standardize to a t-scale, allowing us to use 1 test for every situation

Point estimate

a single value estimate for a population parameter

hypothesis

a statement or proposed explanation for an observation, a phenomenon, or a scientific problem that can be tested using the research method (a hypothesis is often a statement about the value for a parameter in a population)

alternate hypothesis (H1)

a statement that directly contradicts a null hypothesis by stating that the actual value of a population is less than, greater than, or not equal to the value stated in the null hypothesis

effect size

a statistical measure of the size of an effect in a population, which allows researchers to describe how far scores shifted in the population, or the percent of variance that can be explained by a given variable

one-sample z test

a statistical procedure used to test hypotheses concerning the mean in a single population with a known variance

Type III error

a type of error possible with one-tailed tests in which a decision would have been to reject the null hypothesis, but the researcher decides to retain the null hypothesis because the rejection region was located in the wrong tail (the "wrong tail" refers to the opposite tail from where a difference was observed and would have otherwise been significant)

rejection point/ critical value for the test statistic

a value with which the computed test statistic is compared to decide whether to reject or not reject the null hypothesis

A standard deck of cards contains 52 cards. One card is selected from the deck. (a) Compute the probability of randomly selecting a heart or diamond (b) Compute the probability of randomly selecting a heart or diamond or spade. (c) Compute the probability of randomly selecting an eight or spade

a.) 26/52 = 0.5 b.) 39/52 = 0.75 c.) (4 + 13 − 1)/52 ≈ 0.308
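
One way to check these answers is to enumerate the deck and count favorable cards. The sketch below uses exact fractions; the `probability` helper and the rank and suit labels are illustrative choices.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "spades", "clubs"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def probability(event):
    """Fraction of cards in the deck for which event(card) is true."""
    favorable = sum(1 for card in deck if event(card))
    return Fraction(favorable, len(deck))


p_a = probability(lambda card: card[1] in {"hearts", "diamonds"})
p_b = probability(lambda card: card[1] in {"hearts", "diamonds", "spades"})
# "Eight or spade" overlaps at the eight of spades, which is counted only once
p_c = probability(lambda card: card[0] == "8" or card[1] == "spades")

print(p_a, p_b, p_c, round(float(p_c), 3))
```

Parts (a) and (b) involve disjoint events, so the probabilities simply add; part (c) needs the general addition rule.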

The odds of an event occurring are 2:6. Find (a) the probability that the event will occur and (b) the probability that the event will not occur.

a.) 2/(2 + 6) = 2/8 = 0.25 b.) 6/8 = 0.75
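
The odds-to-probability conversion can be sketched as a one-line function. The name `odds_to_probability` is hypothetical; odds of a:b are read as a outcomes in favor for every b against.

```python
from fractions import Fraction


def odds_to_probability(in_favor, against):
    """Convert odds of in_favor:against into the probability of the event."""
    return Fraction(in_favor, in_favor + against)


p_occurs = odds_to_probability(2, 6)
p_not = 1 - p_occurs  # complement rule
print(float(p_occurs), float(p_not))
```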

z statistic

an inferential statistic used to determine the number of standard deviations in a standard normal distribution that a sample mean deviates from the population mean stated in the null hypothesis

Experiments compare the response to a given treatment versus _____________________, _______________________, and or _____________________.

another treatment; the absence of treatment, a control; a placebo, a fake treatment

central limit theorem components: -the larger the sample size n, the better the ... of normality -many statistical tests assume normality for the sampling distribution and the central limit theorem tells us that, if the sample size is large enough, we can safely make this assumption even if the raw data appear ...

approximation non-normal

probabilities are computed as ... under the corresponding portion of the density curve for the chosen interval

areas

A one-tail or one-sided alternative is ... and ...: -Ha: µ < [a specific value or another parameter] OR -Ha: µ > [a specific value or another parameter]

asymmetric specific

chi-square distributions

an asymmetrical family of distributions; a different distribution exists for each possible value of degrees of freedom. Bounded below by 0 (only positive values, because the statistic is χ²). Used for tests concerning the variance of a single normally distributed population. Sensitive to violations of its assumptions: if the sample is not random or does not come from a normally distributed population, inferences might be faulty.

The P-value is the area under N(µ0, σ/√n) for values of x̅ ... in the direction of Ha as that of our random sample.

at least as extreme

What determines the choice of a one-sided versus two-sided test is the question we are asking and what we know about the problem ... performing the test. If the question or problem is asymmetric, then Ha should be one-sided. If not, Ha should be two-sided.

before

What's a Type II error also called?

beta

the ... counts the number of ways in which k successes can be arranged among n observations

binomial coefficient

the number of ways of arranging k successes in a series of n observations (with constant probability p of success) is the number of possible combinations (unordered sequences)

binomial coefficient

the center and spread of the ... for a count X are defined by the mean (μ) and standard deviation (σ)

binomial distribution

... are models for some categorical variables, typically representing the number of successes in a series of n independent trials

binomial distributions

P(X = k)

binomial probability
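
A minimal sketch of the binomial probability P(X = k) = (n choose k) p^k (1 − p)^(n − k), plus a cumulative version that sums terms such as P(X = 0) + P(X = 1) + P(X = 2). The function names and the example (8 tosses of a fair coin) are assumptions chosen for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k): sum of P(X = 0) through P(X = k)."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Hypothetical example: number of heads in 8 tosses of a fair coin
print(binomial_pmf(4, 8, 0.5))  # exactly 4 heads
print(binomial_cdf(2, 8, 0.5))  # at most 2 heads
```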

ex. we can conclude that there is a correlation between bear lengths and weights but we cannot conclude that greater lengths cause more weight

causality

association, however strong, does not imply ...

causation

confidence interval equation

center ± margin of error (m): x̄ ± z*σ/√n. The confidence level C represents an area of corresponding size C under the sampling distribution.

Use the given statement to represent a claim. Write its complement and state which is H0 and which is Ha. mu greater than or equals 568

complement of the claim: mu < 568 H0: mu greater than or equal to 568 Ha: mu less than 568

alternative hypothesis (Ha)

complement of the null hypothesis statement that must be true if H0 is false and it contains a statement of strict inequality (greater than, less than, not equal to)

the distribution of one factor for each level of the other factor

conditional distribution

... reflect how the probability of an event can be different if we know that some other event has occurred or is true

conditional probabilities

A ... gives a black and white answer: Reject or don't reject H0. But it also estimates a range of likely values for the true population mean µ.

confidence interval

The ... C determines the value of z* (in Table C). The ... also depends on z*.

confidence level margin of error

we say that two variables are ... when their effects on a response variable cannot be distinguished from each other

confounded

Match the level of confidence, c=0.98, with its representation on the number line, given x bar =56.7, σ=8.9, and n=55.

To construct a confidence interval for μ: find zc for c = 0.98 (about 2.33), or on the calculator use STAT → TESTS → 7 (ZInterval), entering the data and a C-Level of .98. The margin of error is about 2.8, so the interval is x̄ ± 2.8, roughly (53.9, 59.5).

think of the poisson distribution as describing the number of items in ...

containers

.... contain an infinite number of events

continuous sample spaces

What range is a small Cohen's d?

d < .2

What's a large Cohen's d?

d > .8

caution

The data must be a probability sample or come from a randomized experiment. The sampling distribution must be approximately normal. To use a z procedure, we must know σ.

test statistic

degrees of freedom: n-1

we use ... to model continuous probability distributions because they assign probabilities over the range of values making up the sample space

density curves

If you picked different samples from a population, you would probably get ... sample means (x̄) and virtually none of them would actually equal the true population mean, μ.

different

Case-control studies start with two random samples of individuals with ____________________________, and look for exposure factors in the subjects' past.

different outcomes

have a sample space that is made up of a list of individual outcomes

discrete probability models

discrete variables that can take on only certain values (a whole number or a descriptor)

discrete sample space

positive predictive value: P(... | ...)

disease; positive test

two events are ... or ... if they can never happen together or have no common outcome

disjoint mutually exclusive

A _____________________ experiment is one in which neither the subjects nor the experimenter know which individuals received which treatment until the experiment is completed.

double-blind

What should you do in step 4?

draw a normal dist

The individuals in an experiment are the _________________________. If they are human, we call them _________________.

experimental units; subjects

Cross-sectional studies measure the ________________ and the ________________ at the same time.

exposure; outcome

prediction outside the range of the data is ..., which you should avoid

extrapolation

the binomial coefficient "n choose k" uses the ... notation "!"

factorial

The explanatory variables in an experiment are often called _____________.

factors

Step 3

find test statistic( z score) and alpha

directional tests, one-tailed tests

hypothesis tests in which the alternative hypothesis is stated as greater than (>) or less than (<) a value stated in the null hypothesis (hence the researcher is interested in a specific alternative to the null hypothesis)

uncertainty and confidence

if you picked different samples from a population, you would get different sample means (x̄) and virtually none of them would actually equal the true population mean, μ

One way to increase the precision of a confidence interval without decreasing the level of confidence is to ___

increase the sample size

What's the relationship between effect size, sample size, and power?

as effect size or sample size increases, power increases

two events are ... if knowing that one event is true or has happened does not change the probability of the other event

independent

test statistic for a test of differences between 2 populations (normal distributions; variances unknown but assumed equal, or variances not assumed equal)

independent random samples; pool the variances when they are assumed equal

an observation that markedly changes the regression if removed; this is often an isolated point

influential individual

an association may exist between x and y even when there is no significant linear correlation; could be a nonlinear association

linearity

... is a variable that is not among the explanatory or response variables in a study, and yet may influence the relationship between the variables studied

lurking variable

Observational studies often fail to yield clear causal conclusions, because the explanatory variable is confounded with ___________________.

lurking variables

confidence level and margin of error

m = z*σ/√n. Higher C = larger margin of error (less precision, more accuracy); lower C = smaller margin of error (more precision, less accuracy). z* values: C = .90, z* = 1.645; C = .95, z* = 1.96; C = .99, z* = 2.575

Statistical significance doesn't tell about the ... of the effect.

magnitude

A confidence interval ("CI") can be expressed as: -a center ± a ... m: μ within x̅ ± m -an ...: μ within (x̅ − m) to (x̅ + m)

margin of error interval

we can examine each factor in a two-way table separately by studying the row totals and column totals because they represent the ... expressed in percents

marginal distributions

Test Statistic

math formula that identifies how far and how many standard deviations a sample outcome is from the value stated in a null hypothesis

Define Cohen's d

measures the number of standard deviations an effect shifted; as the z value goes up, the population effect goes up

-You may need a certain margin of error (e.g., drug trial, manufacturing specs). In many cases, the population variability (σ) is fixed, but we can choose the number of measurements (...). -Using simple algebra, you can find what sample size is needed to obtain a desired margin of error.

Binomial distribution conditions: -the total number of observations ... is fixed in advance -each observation falls into just one of two categories: ... and ... -the outcomes of all n observations are statistically ... -all n observations have the same probability p of ...

n success failure independent "success"

find what sample size is needed to obtain desired margin of error

n = (z*sigma/m)^2
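
A short sketch of this sample-size formula, rounding up to the next whole number. The function name and the example values (z* = 1.96, σ = 10, desired margin m = 2) are made up for illustration.

```python
import math


def required_sample_size(z_star, sigma, margin):
    """Smallest whole n with z* * sigma / sqrt(n) no larger than the margin."""
    n = (z_star * sigma / margin) ** 2
    return math.ceil(n)  # always round up, never down


# Hypothetical inputs: (1.96 * 10 / 2)^2 = 96.04, which rounds up to 97
print(required_sample_size(z_star=1.96, sigma=10, margin=2))
```

Rounding down would give a margin of error slightly larger than the one requested.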

n over k =

n!/(k!(n − k)!)

For the same confidence level, ... confidence intervals can be achieved by using ... sample sizes.

narrower large

when x is smaller than the mean, the z is ...

negative

differences in the means

no paired observations from the 2 samples; 2 independent samples

sometimes we are just told that a variable has an approximately ... distribution

normal

the sample size depends on the population distribution and more observations are required if the population distribution is far from ...

normal

... are used to model many biological variables and they can describe a population distribution or a probability distribution

normal curves

To test H0: µ = µ0 using a random sample of size n from a Normal population with known standard deviation σ, we use the ... N(µ0, σ/√n).

null sampling distribution

the standard deviation of the sampling distribution of means is ...

σ/√n

binomial distributions describe the possible number of times that a particular event will occur in a sequence of ...

observations

The Hawthorne Effect also known as the ___________________________ is a term used to describe a type of bias that may occur due to behavior modification because of study enrollment.

observer effect

type II error

occurs if the NULL hypothesis is not rejected when it's false

Type I error

occurs if the NULL hypothesis is rejected when it's true

odds v. probability

odds of 2:3 (2/3) means probability of success is 2/5

If you obtain a different t value between two examples, what might the difference be from?

one tailed versus two tailed, assuming alpha is the same.

For ... tests, the P-value is P(Z ≥ z) or P(Z ≤ z)

one-sided

A matched pairs design is a repeated measures design if the experiment involves only _____________________________________________.

only one individual undergoing two treatments

You randomly select one card from a standard deck. Event A is selecting a king. Determine the number of outcomes in event A. Then decide whether the event is a simple event or not.

outcomes= 4 simple event= no

an observation that lies outside the overall pattern

outlier

... have unusually large residuals (in absolute value)

outliers

A statistic is unbiased if it does not ___

overestimate or underestimate the population parameter

c-confidence interval for a population proportion p

p̂ − E < p < p̂ + E, where E = zc √(p̂q̂/n)
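
A minimal sketch of this interval in Python. The function name and the example inputs (p hat = 0.5, n = 100, critical value 1.96 for c = 0.95) are hypothetical.

```python
import math


def proportion_interval(p_hat, n, z_critical):
    """Confidence interval p_hat +/- E with E = z_c * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat  # point estimate for the proportion of failures
    margin = z_critical * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin, margin


# Hypothetical sample: 50 successes out of 100 trials
low, high, margin = proportion_interval(p_hat=0.5, n=100, z_critical=1.96)
print(round(low, 3), round(high, 3), round(margin, 3))
```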

DEPENDENT test concerning mean differences

paired-observation comparison tests: statistical tests for differences in dependent items; use a single t-test based on just the mean and standard deviation of the differences

mean of differences

paired observations from the 2 samples; not independent samples

simple linear regression: data comes in ... (xi, yi) where xi, is the ith observation for variable x and yi is the ith observation for variable y

pairs

Undercoverage or selection bias occurs when ________________________________________.

parts of the population are systematically left out

Nonresponse occurs when ___________________________________________.

people choose not to participate

Response bias occurs when _________________________.

people lie

-When we take a random sample, we can compute the sample mean and an interval of size ... around the mean. -Based on the ~68-95-99.7% rule, we can expect that: ... of all intervals computed with this method capture the parameter μ.

plus-or-minus 2σ/√n ~95%

q hat

point estimate for population proportion of *failures*

p hat

point estimate for the population proportion of successes (the sample proportion)

Point estimate for p

population proportion of successes

... is a sample statistic representing the population correlation coefficient ρ

the value of ... is always in between -1 and 1 or -1 </= r </= 1

each factor can have any number of levels and if the row factor has "r" levels and the column factor has "c" levels, we say that the two-way table is an "..." table

r by c

the value of ... is the proportion of variation in y that is explained by x

r^2

in a ... event, outcomes are uncertain, but there is nonetheless a regular distribution of outcomes in a large number of repetitions

random

Requirements for making inferences about p, using r: 1. Paired data (x,y) must be a ... 2. A scatterplot must confirm that the points approximate a ... pattern 3. Outliers should be removed if they are known to be ...

random sample straight-line errors

in addition to x, there may be a variety of other factors affecting y, such as ... or other factors not included in the study

random variation

confidence interval

range of values with an associated probability -quantifies chance that interval contains unknown population parameter

When P-value ≤ α, we ... H0.

reject

When p ≤ .05, what do you do?

reject

What are the two decisions that you can make from performing a hypothesis test?

reject the null hypothesis fail to reject the null hypothesis

When the z score falls within the ... region (shaded area on the tail-side), the p-value is smaller than α and you have shown ....

rejection statistical significance

Experiments use _____________________: several or many individuals are studied.

replication

the vertical distances from each point to the least-squares regression line are called ... and the sum of all the residuals is by definition 0

residuals

When p > .05 what do you do?

retain (fail to reach significance)

Step 4

retain or reject null hypothesis

Case-control studies -> ________________________

retrospective

If n is not a whole number, then ___

round n up to the next whole number

Determine the number of outcomes in the event. Decide whether the event is a simple event or not. - A computer is used to select randomly a number between 1 and 9, inclusive. Event C is selecting selecting a number greater than 4.

The sample space has 9 outcomes. Event C has 9 − 4 = 5 outcomes. It is not a simple event, because C has more than 1 outcome.

A ________________________ is an observational study that relies on a random sample drawn from the entire population.

sample survey

different random ... taken from the same population will give different ... but there is a predictable pattern in the long run

samples statistics

a ... describes what would happen if we took all possible random samples of a fixed size n

sampling distribution

you should begin any investigation into the association between 2 variables by constructing a ...; that can have a positive, negative or no correlation

scatterplot

The ..., is the largest P-value tolerated for rejecting H0 (how much evidence against H0 we require). This value is decided arbitrarily before conducting the test.

significance level, α

Someone makes a claim about the unknown value of a population parameter. We check whether or not this claim makes sense in light of the "evidence" gathered (sample data).

significance tests

Define hypothesis

a statement or explanation for an observation from a population that can be tested with a research method

if the data have approximately a normal distribution, the normal quantile plot will have a roughly ... pattern

straight line

A ________________________________ has percentages of individuals of certain types.

stratified random sample

Cross-sectional studies -> ______________________

surveys

A two-tail or two-sided alternative is ...: Ha: µ ... [a specific value or another parameter]

symmetric ≠

Use the given confidence interval to find the margin of error and the sample mean. (13.7,23.1)

Take the average of the endpoints (add both values and divide by 2) to find the mean, then subtract the mean from the upper endpoint to find the margin of error. Mean = 18.4, margin of error = 4.7

Use the given confidence interval to find the margin of error and the sample proportion. (0.772,0.798)

Take the average of the endpoints to find p hat: (0.772+0.798)/2 = 0.785. Subtract the endpoints and divide by 2 to find the margin of error: (0.798-0.772)/2 = 0.013. Check: left endpoint + margin of error = 0.772+0.013 = 0.785 = p hat
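
The two confidence-interval cards above use the same arithmetic, which can be sketched as follows. This is only an illustration; the function name `center_and_margin` is made up for this example.

```python
def center_and_margin(lower, upper):
    """Return (point estimate, margin of error) for a symmetric interval."""
    center = (lower + upper) / 2   # midpoint = sample mean or p hat
    margin = (upper - lower) / 2   # half the width = margin of error
    return center, margin

# Intervals taken from the two cards above
mean, e_mean = center_and_margin(13.7, 23.1)
p_hat, e_prop = center_and_margin(0.772, 0.798)
print(round(mean, 3), round(e_mean, 3))
print(round(p_hat, 3), round(e_prop, 3))
```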

Sampling Error

the difference between the point estimate and the actual parameter value

permutations

ordered arrangements, e.g. 5P3 = 5!/(5-3)! = 5!/2! = 60

alpha level

the level of significance or criterion for a hypothesis test; is the largest probability of committing a Type I error that we will allow and still decide to reject the null hypothesis

p value

the probability of obtaining a sample outcome, given that the value stated in the null hypothesis is true (the p value for obtaining a sample outcome is compared to the level of significance)

Type I error

the probability of rejecting a null hypothesis that is actually true (researchers directly control for the probability of committing this type of error)

Type II error (beta error)

the probability of retaining a null hypothesis that is actually false

Population Proportion

the probability of success in a single trial of a binomial experiment

Level of Confidence

the probability that the interval estimate contains the population parameter, assuming that the estimation process is repeated a large number of times

The point estimate for p is given by ___

the proportion of successes in a sample and is denoted by p hat = x/n where x is the number of successes in the sample and n is the sample size.

obtained value

the value of a test statistic (often compared to the critical value(s) of a hypothesis test to make a decision, when the obtained value exceeds a critical value, we decide to reject the null hypothesis, otherwise we retain the null hypothesis)

use a z test when

the variance is KNOWN

binomial parameters: -the parameter n is the ... number of observations -the parameter p is the probability of ... on each observation -the count of successes X can be any whole number between ... and ...

total successes 0 n

In a completely randomized experimental design, individuals are randomly assigned to groups, and then the groups are randomly assigned to ___________________.

treatments

... are used to represent probabilities graphically and facilitate computations

tree diagrams

What should you do in step 1?

use both words and symbols

f-test

used for tests concerning the inequality of 2 variances; based on a family of asymmetrical distributions bounded from below by 0 (like chi-squared); defined by 2 degrees of freedom; right-skewed, and rejection is always on the right side

the standard deviation of the sampling distribution measures how much the statistic x-bar ... from sample to sample

varies

null hypothesis, H0

very specific statement about parameter of populations

permutation

ways in which things are ordered - fit 7 people into 3 chairs: 7x6x5 - how many ways can we fit 3 balls into 2 cups? 3x2
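
As a quick check of the two counts on this card, here is a minimal sketch using Python's standard `math.perm` (available from Python 3.8). The variable names are illustrative only.

```python
import math

# Ordered arrangements of r items chosen from n distinct items: n! / (n - r)!
seat_7_people_in_3_chairs = math.perm(7, 3)   # 7 * 6 * 5
place_3_balls_in_2_cups = math.perm(3, 2)     # 3 * 2

print(seat_7_people_in_3_chairs, place_3_balls_in_2_cups)  # 210 6
```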

use of sampling distribution

we take one random sample of size n and rely on known properties of the sampling distribution - remember the 68, 95, 99.7% rule

lower and higher confidence levels present a ... situation

win/lose

variable ... is the independent, predictor, or explanatory variable

x

p hat =

x/n

Find the margin of error for the given values of c, s, and n. c=0.90, s=5, n=24

Tc = 1.714 (t-distribution table, c = 0.90, n-1 = 23 degrees of freedom); E = 1.714 times (5/sqrt24) = 1.7

Level of confidence 95% = critical value ___

1.96

the ... is the count multiplied by the probability of any specific arrangement of the k successes

binomial probability
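
The sentence on this card ("count multiplied by the probability of any specific arrangement") can be written out directly. A minimal sketch; the function name `binomial_pmf` and the example values n = 10, k = 3, p = 0.5 are assumptions chosen for illustration.

```python
import math

def binomial_pmf(n, k, p):
    """P(X = k) for X ~ B(n, p)."""
    arrangements = math.comb(n, k)               # the count of arrangements
    one_arrangement = p**k * (1 - p)**(n - k)    # probability of one specific arrangement
    return arrangements * one_arrangement

# Hypothetical example: exactly 3 successes in 10 trials with p = 0.5
prob = binomial_pmf(10, 3, 0.5)
print(prob)
```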

F-stat

(sample standard deviation 1)^2 / (sample standard deviation 2)^2, i.e. s1^2 / s2^2

The most unbiased point estimate of the population mean is the ___

sample mean

Find the critical value Tc for the confidence level c=0.99 and sample size n=14.

3.012 T-distribution table: look up 0.99 and n-1 (13) and find corresponding number

Let p be the population proportion for the following condition. Find the point estimates for p and q. In a survey of 1487 adults from country A, 738 said that they were not confident that the food they eat in country A is safe. The point estimate for p hat is... q hat...

738/1487= 0.496 q hat= 1- Ans= 0.504

Hypothesis

A statement made about the value of a population parameter

P(A|B) not equal to P(B|A)

Bayes' theorem

(T/F) To estimate the value of p, the population proportion of successes, use the point estimate x.

False, to estimate the value of p, use the point estimate p hat = x/n

3 things into 2 spaces

P (3,2)

the conditional probability of event B, given event A is: P(B|A) =

P(A and B)/P(A)

addition rule for disjoint events: P(A or B) =

P(A) + P(B)

a list or description of all possible outcomes of a random process

S or sample space

pooling

Sp^2 (pooled variance): use when you assume the population variances are equal; an estimate drawn from the combination of two different samples

Third Step

Select the appropriate test statistic

Second Step

Set the level of risk (alpha level)

Find the P-value for the indicated hypothesis test with the given standardized test statistic, z. Decide whether to reject H0 for the given level of significance α. Two-tailed test with test statistic z=−2.18 and α=0.02

Since this is a two-tailed test and the test statistic is left of center, the P-value is twice the area to the left of the test statistic: P = 2 x normalcdf(-10000,-2.18,0,1) = 0.0292. Since P = 0.0292 > α = 0.02, fail to reject H0.
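
The same calculation can be sketched without a calculator, using `statistics.NormalDist` from the Python standard library in place of normalcdf. The function name `two_tailed_p_value` is made up for this example.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Twice the tail area beyond |z| under the standard normal curve."""
    tail = NormalDist().cdf(-abs(z))   # area to the left of -|z|
    return 2 * tail

z, alpha = -2.18, 0.02                 # values from the card above
p_value = two_tailed_p_value(z)
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(round(p_value, 4), decision)
```

The P-value comes out near 0.029, which is larger than α = 0.02, so the sketch reports "fail to reject H0".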

... P-values are strong evidence AGAINST H0 and we reject H0. The findings are "statistically significant."

Small

First Step

State the null and research hypotheses

Alternative Hypothesis (Ha)

The claim about the population that we are trying to find evidence for.

Polling Organisations

Use small samples to make inferences about a population

Why don't we use the sample mean during hypothesis testing?

We do not use the sample mean because we are using inferential statistics, not descriptive statistics

When do we reject the null hypothesis?

We reject the null hypothesis when the sample mean falls within the critical region

factorials (!)

used without replacement

When taking a random sample from a Normal population with known standard deviation σ, a level C confidence interval for µ is: x-bar +/- .../sqrt(n) or x-bar +/- m -m is the margin of error for this level C confidence interval. It is calculated using a ... (z*) and the standard deviation -σ/√n is the standard deviation of the ... distribution -C is the area under the ... between −z* and z*

z* z critical value sampling N(0,1)

a ... measures the number of standard deviations that a data value x is from the mean u

z-score

we can standardize data by computing a ...

z-score

for a normal quantile plot: the data points are ranked and the percentile ranks are converted to ...; the z-scores are then used for the ... axis and the actual data values are used for the ... axis; use technology to obtain normal quantile plots

z-scores horizontal vertical

if r is close to ..., we conclude that there is no significant linear correlation between x and y

zero

the area under N(0,1) for a single value of z is ...

zero

hypothesis testing

Comparing sample mean to the null hypothesis - the hypothesized/population value. If your data is unlikely, the null is rejected and if your data is likely, you fail to reject the null. As well, the NULL CAN NEVER BE ACCEPTED.

Find the critical value(s) and rejection region(s) for the type of z-test with level of significance α. Include a graph with your answer. Right-tailed test, α=0.10

The critical value is z=1.28 (1-alpha = 0.9; invNorm(0.9,0,1) = 1.28). The rejection region is z > 1.28. Pick the right-tailed graph.

degrees of freedom

The number of individual scores that can vary without changing the sample mean. Statistically written as 'N-1' where N represents the number of subjects.

p-value

The probability of observing a value as extreme as your data or greater under the assumption that the null hypothesis is true

Critical Value of a Test Statistic (tcrit)

The value of a test statistic that corresponds to a specified level of chance probability. Can determine with qt() function or from a table. You need to say how much that area is and the degrees of freedom.

What is the purpose of the null hypothesis?

To state that there is no difference within the experiment regarding intervention/manipulation

What is the purpose of the alternative hypothesis?

To state that there will be a significant difference somewhere based on intervention/manipulation

Hypothesis

a statement or proposed explanation for an observation, phenomenon, or scientific problem that can be tested using the research method. Often a statement about the value for a parameter in a population

Null Hypothesis

assuming something is true

We have ... that μ falls within the interval computed.

confidence C

Alpha

is always towards the tail, it is the cutoff point, it reveals a surprising value that would reject the null

a standardized sampling distribution is a

t distribution or standard normal distribution

P-value

the probability of obtaining a sample outcome , given that the value stated in the null hypothesis is true

Type 1 Error

the probability of rejecting a null hypothesis that is actually true

Z-stat

(X-mean)/(SD/root(n))

Find the minimum sample size n needed to estimate μ for the given values of c, σ, and E. c=0.98, σ=8.2, and E=2 E= margin of error c- confidence level

n = (Zc x sigma / E) squared. To find Zc: 1-0.98 = 0.02; 1-0.02/2 = 0.99; invNorm(0.99) = 2.33. n = (2.33 x 8.2 / 2) squared = 91.26; round up to n = 92
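
A minimal sketch of the sample-size formula n = (Zc x sigma / E) squared, using the values stated in the question (σ = 8.2, E = 2) and the rounded table value Zc = 2.33. The function name `min_sample_size` is hypothetical.

```python
import math

def min_sample_size(z_c, sigma, margin):
    """Smallest whole n with z_c * sigma / sqrt(n) <= margin."""
    raw = (z_c * sigma / margin) ** 2
    return math.ceil(raw)              # always round n up to the next whole number

n = min_sample_size(2.33, 8.2, 2)
print(n)  # 92
```

Passing an unrounded critical value instead of the table value 2.33 can change the result by one, so it helps to state which value was used.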

power

(in hypothesis testing) the probability of rejecting a false null hypothesis (specifically, the probability that a randomly selected sample will show that the null hypothesis is false when the hypothesis is indeed false)

Determine whether the events E and F are independent or dependent. Justify your answer. - E. A person living at least 70 years. F: The same person regularly handling venomous snakes - E: A randomly selected person finding cheese revolting F: Another randomly selected person finding cheese delicious - E: The unusually foggy weather in London on May 8 F: The number of car accidents in London on May 8

-E and F are dependent because regularly handling venomous snakes can affect the probability of a person living at least 70 years - E cannot affect F and vice versa because the people were randomly selected, so the events are independent. -The unusually foggy weather in London on May 8 could affect the number of car accidents in London on May 8, so E and F are dependent.

Based on Cohen's d what is a small effect?

.2 - .5

What's a medium Cohen's d?

.2 < d < .8

Based on Cohen's d what is a medium effect?

.5 - .8

binomial distributions are skewed when p is close to ... or close to ... especially if the sample is small

0 1

probability rule: probabilities range from...

0 to 1

if x has the N(μ, σ) distribution then z has the N(...) distribution

0, 1

You toss a coin and randomly select a number from 0-9. What is the probability of getting tails and selecting a 9?

0.05 (1/20)

A probability experiment consists of rolling a 20-sided die. Find the probability of the event below. rolling a prime number

0.4

Nine of the 50 digital video recorders (DVRs) in an inventory are known to be defective. What is the probability you randomly select an item that is not defective?

0.82 (50-9=41) (41/50)

probability rule: the probability of the complete sample space S must equal ...

1

the closer r^2 gets to ..., the better the model explains the data

1

the total area under a density curve represents the whole population (sample space) and equals ... (100%)

1

when x is ... standard deviation larger than the mean then z = 1

1

68% of all observations are within ... of the mean

1 standard deviation

Steps of Hypothesis testing

1) State the hypothesis 2)Set the criteria for a decision 3)Compute the test statistic 4)Make decision

Assuming that no questions are left unanswered, in how many ways can a ten-question true/false quiz be answered?

1,024 2x2x2x2x2x2x2x2x2x2=1024

Find the critical value zc necessary to form a confidence interval at the level of confidence shown below. c = 0.81

1-0.81 = .19 .19/2 = .095 use technology or a standard normal (z) table to find the z-score with area .095 to its left, then take its absolute value: 1.31

Find the critical value Zc necessary to form a confidence interval at the level of confidence shown below. c=0.89

(1-0.89)/2 = 0.055; invNorm(0.055) = -1.60, so Zc = 1.60

q hat

1-p hat

t-tests examples

1. Chi-Squared 2.Single sample 3. Paired 4. Two-sampled

Constructing a Confidence Interval for a Population Proportion

1. Identify the sample statistics n and x 2. Find the point estimate p hat 3. Verify that the sampling distribution of p hat can be approximated by a normal distribution 4. Find the critical value, that corresponds to the given level of confidence c 5. Find the margin of error E 6. Find the left and right endpoints and form the confidence interval
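
The six steps above can be sketched as one function. This is an illustration under stated assumptions: the name `proportion_interval` is made up, the counts x = 738 and n = 1487 are borrowed from the survey card elsewhere in this set, the 95% level is chosen only for the example, and the normality check uses the np, nq ≥ 10 rule given in this set (some textbooks use 5 instead).

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, c, min_count=10):
    """Confidence interval for a population proportion (normal approximation)."""
    p_hat = x / n                                   # step 2: point estimate
    q_hat = 1 - p_hat
    if n * p_hat < min_count or n * q_hat < min_count:
        raise ValueError("normal approximation is not justified")  # step 3
    z_c = NormalDist().inv_cdf((1 + c) / 2)         # step 4: critical value
    margin = z_c * sqrt(p_hat * q_hat / n)          # step 5: margin of error E
    return p_hat - margin, p_hat + margin           # step 6: endpoints

low, high = proportion_interval(738, 1487, 0.95)
print(round(low, 3), round(high, 3))
```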

The Belmont Report was created partly in response to the Tuskegee Syphilis Study. Name the three main aims of the report.

1. Respect for persons 2. Beneficence 3. Justice

Hypothesis testing procedure

1. State hypothesis 2. Select appropriate test statistic 3. Specify level of significance 4. State the decision rule regarding the hypothesis 5. Collect the sample and calculate the sample statistics 6. Make a decision regarding the hypothesis 7. Make a decision based on the results of the test

Constructing a Confidence Interval for a Population Mean

1. Verify that standard deviation is known, and either the population is normally distributed or n is greater than or equal to 30 2. Find the sample statistics n and x bar 3. Find the critical value that corresponds to the given level of confidence 4. Find the margin of error E 5. Find the left and right endpoints and form the confidence interval

nonparametric test

1. not concerned with parameters 2. makes minimal assumptions about the populations from which the sample comes from use when: 1. when the data we use does not meet distribution assumptions 2. the data are given in ranks 3. hypothesis we are addressing does not concern a parameter

2 types of Hypotheses

1. null- what we want to reject. what we are testing for (Ho) 2. alternative- what we accept when the null hypothesis is rejected (Ha)

4 possible outcomes to testing a null hypothesis

1. we reject a false null hypothesis- this is correct 2. we do not reject a true null hypothesis- this is correct 3. we reject a true null hypothesis- type 1 error 4. we do not reject a false null hypothesis- type 2 error

level of confidence 90% 95% 98% 99%

1.645 1.96 2.33 2.575
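
These table values can be reproduced approximately with the standard library. A minimal sketch; `z_critical` is a hypothetical helper name, and the printed values are rounded to three decimals, so they may differ from the table in the last digit.

```python
from statistics import NormalDist

def z_critical(c):
    """Critical value z_c that leaves area c between -z_c and z_c under N(0,1)."""
    return NormalDist().inv_cdf((1 + c) / 2)   # area to the left of z_c is (1 + c) / 2

for c in (0.90, 0.95, 0.98, 0.99):
    print(c, round(z_critical(c), 3))
```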

You have 13 different video games. How many different ways can you arrange the games side by side on a shelf?

13!

In a random sample of 50 refrigerators, the mean repair cost was $139.00 and the population standard deviation is $15.70. A 90% confidence interval for the population mean repair cost is (135.35,142.65). Change the sample size to n=100. Construct a 90% confidence interval for the population mean repair cost. Which confidence interval is wider? Explain.

15.7 divided by sqrt of 100 times 1.645 plus and minus the mean to get (136.42, 141.58) The n=50 confidence interval is wider because a smaller sample is taken, giving less information about the population.
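
The refrigerator example can be checked with a short sketch of the known-σ interval, x-bar ± Zc(σ/√n). The function name `z_interval` is made up for this illustration; the inputs are the ones given in the question.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, c):
    """Confidence interval for a mean when the population sigma is known."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)   # critical value for level c
    margin = z_c * sigma / sqrt(n)            # margin of error E
    return x_bar - margin, x_bar + margin

ci_50 = z_interval(139.00, 15.70, 50, 0.90)
ci_100 = z_interval(139.00, 15.70, 100, 0.90)
width_50 = ci_50[1] - ci_50[0]
width_100 = ci_100[1] - ci_100[0]
print([round(v, 2) for v in ci_50], [round(v, 2) for v in ci_100])
print(width_50 > width_100)  # the smaller sample gives the wider interval
```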

when x is ... standard deviations larger than the mean then z = 2

2

Researchers found that people with depression are five times more likely to have a breathing-related sleep disorder than people who are not depressed. Identify the two events described in the study. Do the results indicate that the events are independent or dependent?

2 events= depressions and breathing-related sleep disorder dependent

about 95% of all observations are within ... of the mean

2 standard deviations

a sample size of ... or more is generally enough to obtain a normal sampling distribution from a skewed population, even with mild outliers in the sample

Outside a home, there is a 6-key keypad with letters A, B, C, D, E and F that can be used to open the garage if the correct six-letter code is entered. Each key may be used only once. How many codes are possible?

6!= 720

A restaurant offers a $12 dinner special that has 7 choices for an appetizer, 11 choices for an entrée, and 4 choices for a dessert. How many different meals are available when you select an appetizer, an entrée, and a dessert?

7x11x4=308

Space shuttle astronauts each consume an average of 3000 calories per day. One meal normally consists of a main dish, a vegetable dish, and two different desserts. The astronauts can choose from 10 main dishes, 9 vegetable dishes, and 14 desserts. How many different meals are possible?

8190 10x9x (desserts) 2 desserts (14x13/2)
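
Both meal-counting cards apply the fundamental counting principle; the shuttle example also needs a combination because the two desserts are unordered. A minimal sketch with illustrative variable names:

```python
import math

# Restaurant special: one appetizer, one entree, one dessert
restaurant_meals = 7 * 11 * 4

# Shuttle meal: one main dish, one vegetable dish, and 2 different desserts
dessert_pairs = math.comb(14, 2)          # 14 * 13 / 2 unordered pairs
shuttle_meals = 10 * 9 * dessert_pairs

print(restaurant_meals, dessert_pairs, shuttle_meals)  # 308 91 8190
```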

Margin of error for the given values of c, σ, and n. c = .9 σ = 5.1 n = 121

90% = 1.645 E = zc (σ/√n) E = 1.645 (5.1/ √121) = .763

You are given the sample mean and the population standard deviation. Use this information to construct the 90% and 95% confidence intervals for the population mean. Interpret the results and compare the widths of the confidence intervals. A random sample of 40 home theater systems has a mean price of $145.00. Assume the population standard deviation is $15.50. n= 40 x-bar= 145 sigma= 15.50

90%: Zc = 1.645, so the margin of error is 1.645(15.5/sqrt40)= 4.03; left endpoint (145-4.03)=140.97, right endpoint (145+4.03)=149.03. 95%: Zc = 1.96, so the margin of error is 1.96(15.5/sqrt40)= 4.80; left endpoint (145-4.80)=140.20, right endpoint (145+4.80)=149.80. With 90% confidence, it can be said that the population mean price lies in the first interval (140.97, 149.03). With 95% confidence, it can be said that the population mean price lies in the second interval (140.20, 149.80). The 95% confidence interval is wider than the 90%.

For the same sample statistics, which level of confidence would produce the widest confidence interval?

99%, because as the level of confidence increases, zc increases.

Based on Cohen's d what is a large effect?

> .8

use the normal approximation for binomial when both np and nq are ...

≥ 10

Significance level

A critical probability associated with a statistical hypothesis test that indicates how likely an inference supporting a difference between an observed value and some statistical expectation is true.

When you calculate the number of permutations of n distinct objects taken r at a time, what are you counting?

A permutation is an ordered arrangement of objects. The number of different permutations of n distinct objects is n!. The number of ordered arrangements of n objects taken r at a time.

How many different 10-letter words (real or imaginary) can be formed from the following letters? B, B, Z, Z, N, N, J, A, K, C

A permutation of nondistinct items without replacement is the number of ways n objects can be arranged (order matters) in which there are n1 of one kind, n2 of a second kind, ..., and nk of a kth kind, where n=n1+n2+...+nk. The number of such permutations is n!/(n1! x n2! x ... x nk!). Here: 10!/(2! x 2! x 2!) = 453600
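
The formula n!/(n1! x n2! x ... x nk!) can be sketched for any string of letters. The function name `distinguishable_arrangements` is hypothetical.

```python
from collections import Counter
from math import factorial

def distinguishable_arrangements(letters):
    """n! divided by the factorial of each repeated letter's count."""
    total = factorial(len(letters))
    for count in Counter(letters).values():
        total //= factorial(count)     # divide out orderings of identical letters
    return total

words = distinguishable_arrangements("BBZZNNJAKC")
print(words)  # 453600
```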

paired samples t-test

A special type of single-sample t-test. The sampling unit gets both treatments and within it, we take the difference between each pair of measurements and compare whether the difference between pairs of data is different from a mean of 0. Ho: There is no difference between the two groups (mean difference=0) Ha: There is a difference between the two groups (doesn't = 0)
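
Because the paired test is a single-sample t-test on the differences, the test statistic can be sketched as mean(d) / (SD(d)/root(n)), with n-1 degrees of freedom. The function name and the before/after numbers below are made up purely for illustration, and the sketch stops at the statistic; it does not compute a P-value.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic for H0: mean difference = 0, plus its degrees of freedom."""
    diffs = [a - b for a, b in zip(after, before)]   # one difference per sampling unit
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)          # sample SD of differences / root(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical measurements on three sampling units
t_obs, df = paired_t_statistic(before=[10, 12, 9], after=[11, 14, 12])
print(round(t_obs, 3), df)
```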

For the same sample statistics, which level of confidence would produce the widest confidence interval?

As the level of confidence increases, Zc increases causing wider intervals. 99%, because as the level of confidence increases, Zc increases.

we express a binomial distribution for the count X of successes among n observations as a function of the parameters n and p:

B(n, p)

Hypothesis Testing

Can help polling organisations to assess the accuracy of their predictions

Classify the statement as an example of classical probability, empirical probability, or subjective probability. Explain your reasoning. The probability of choosing five numbers from 1 to 36 to match five numbers drawn by the lottery is 1/376,992 ≈ 0.0000027.

Classical because each outcome in the sample space is equally likely

Give at least one difference and one similarity between "hypothesis testing" and "estimation" for a population mean.

Differences: 1) Hypothesis testing leads to a yes/no (reject/fail to reject) decision while estimation produces a numeric value (e.g. estimate of the population mean), or a pair of values (.e.g confidence interval for the population mean), but does not result in an immediate yes/no decision. 2) Hypothesis testing answers the question "is a pre-determined population mean likely, based on what we saw in the sample?" Estimation of the mean provides an estimate of the population mean, working from the information in the sample. Similarities: 1) Both hypothesis testing and estimation are done using one sample from the population to make an inference about the population. 2) Both hypothesis testing and estimation rely on the sample being randomly selected from the population. 3) Both hypothesis testing and estimation make use of the sample mean and the standard error from the sample as part of their calculations. They just finish up with different calculations afterwards. 4) Both hypothesis testing and estimation involve the concept of the sampling distribution, and relating the probability of a 'rare event' in the tails of that sampling distribution to what was observed in the sample.

Explain how the complement can be used to find the probability of getting at least one item of a particular type.

Getting "none of the items" is the set of all outcomes in the sample space that are not included in "at least one item." Using the definition of the complement of an event and the fact that the sum of the probabilities of all outcomes is 1, the following formula is obtained. P(at least one item) = 1−P(none of the items)
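
A small sketch of the complement rule, under the added assumption that the trials are independent with the same probability each time. The function name and the coin-flip example are hypothetical.

```python
def p_at_least_one(p_single, trials):
    """P(at least one success) = 1 - P(no successes), for independent trials."""
    p_none = (1 - p_single) ** trials
    return 1 - p_none

# Hypothetical example: at least one head in 3 fair coin flips
p = p_at_least_one(0.5, 3)
print(p)  # 0.875
```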

Finding a minimum sample size to estimate mu

Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate the population mean is n = (Zc x sigma / E) squared

What's the symbol for the null hypothesis?

H0

What hypotheses are there?

H0 - the null hypothesis - the hypothesis that u assume to be correct H1- the alternative hypothesis - tells you about the parameter if the H0 is shown to be wrong

What's the symbol for the alternative hypothesis?

Why might you want to have a narrower confidence interval when doing statistical inference?

If you have a narrower confidence interval for a variable, then if your null hypothesis is not true, you are more likely to collect a sample that leads to the null hypothesis mean being outside the confidence interval, meaning that you would correctly reject the null hypothesis. With a wider confidence interval, the null hypothesis mean is in principle more likely to fall in the interval, and so you might fail to reject the null, even though it isn't true.

If your sample mean is towards the center of the null distribution, what does that tell you, and what would the hypothesis test result be? E.g.

If you see a sample mean that is close to the center of the standardized null distribution, as shown, this indicates that the data and the null distribution/null hypothesis are not in conflict: there is a high probability that your sample could have been observed if the null hypothesis were true. In probabilistic terms, we say that the probability of your observed value or something more extreme is high under the null distribution. This is indicated by the large area in the tail of the distribution up to the tobs value. In hypothesis testing terms, you would fail to reject the null hypothesis, because there is no substantial evidence that your sample isn't from the null distribution.

If your sample mean is towards the tails of the null distribution, what does that tell you, and what would the hypothesis test result be? E.g.

If you see a sample mean that is far from the center of the standardized null distribution, as shown, this indicates that the data and the null distribution/null hypothesis are in disagreement at some level: there is only a low probability that your sample could have been observed if the null hypothesis were true. In probabilistic terms, we say that the probability of your observed value or something more extreme is low under the null distribution. This is indicated by the small area in the tail of the distribution up to the tobs value. In hypothesis testing terms, you would reject the null hypothesis, because either (a) your sample is just a very strange/unexpected sample from the null (but that has low probability, as indicated by the very small tail), or (b) the null hypothesis actually isn't true, and your sample came from a different population. Since the probability of (a) is very small, it makes sense to say that there is strong evidence for (b).

Explain the difference between the z-test for μ using rejection region(s) and the z-test for μ using a P-value.

In the z-test using rejection region(s), the test statistic is compared with critical values. The z-test using a P-value compares the P-value with the level of significance α. A rejection region (or critical region) of the sampling distribution is the range of values for which the null hypothesis is not probable. A critical value z0 separates the rejection region from the nonrejection region. To use a rejection region to conduct a z-test, calculate the standardized test statistic z. If the standardized test statistic is in the rejection region, then reject H0. If the standardized test statistic is not in the rejection region, then fail to reject H0. To use a P-value to make a conclusion in a hypothesis test, compare the P-value with α. If P ≤ α, then reject H0. If P > α, then fail to reject H0.
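
The two approaches lead to the same decision, which a short sketch can show for a right-tailed test. Everything named here is an assumption for illustration: the function `right_tailed_z_test` and the inputs (x-bar = 52, μ0 = 50, σ = 6, n = 36, α = 0.05) do not come from the cards.

```python
from math import sqrt
from statistics import NormalDist

def right_tailed_z_test(x_bar, mu_0, sigma, n, alpha):
    """Run a right-tailed z-test both ways and return both decisions."""
    std_normal = NormalDist()
    z = (x_bar - mu_0) / (sigma / sqrt(n))     # standardized test statistic
    z_0 = std_normal.inv_cdf(1 - alpha)        # critical value
    p_value = 1 - std_normal.cdf(z)            # area to the right of z
    reject_by_region = z >= z_0                # test statistic in rejection region
    reject_by_p_value = p_value <= alpha       # P-value compared with alpha
    return z, z_0, p_value, reject_by_region, reject_by_p_value

# Hypothetical inputs
z, z_0, p_value, by_region, by_p = right_tailed_z_test(52, 50, 6, 36, 0.05)
print(round(z, 2), round(z_0, 3), round(p_value, 4), by_region, by_p)
```

With these inputs z = 2.0 lies beyond the critical value of about 1.645 and the P-value is below 0.05, so both rules say to reject H0.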

When it comes to the process of hypothesis testing, what is the specific type of statistics we use?

Inferential Statistics

independent two-sample t-test

Involves more than 1 group and is used to evaluate whether two populations have different means. H0: mean 1= mean 2 HA: mean 1 doesn't =mean2

correlations are calculated using means and standard deviations and so they are ... resistant outliers

NOT

Decide if the events are mutually exclusive. Event A: Electing a president of the United States Event B: Electing a female candidate

No, because someone who is elected to be President can be female.

What does the symbol Ho stand for?

Null Hypothesis

Paired and independent two-sampled tests are similar but how do they differ in comparing means between 2 groups?

Paired sample designs reduce variation among sampling units from other factors while independent sample designs have greater statistical degrees of freedom for the same effort and can discern differences

describes the count X occurrences of an event in fixed, finite intervals of time or space when -occurrences are all ... -and the probability of an occurrence is the ... over all possible intervals

Poisson distribution independent same

What is the rejection region?

Region that is the representation of whether or not we should reject the null hypothesis - aka critical region

Steps in Hypothesis Testing

Steps: 1.State null (Ho) and alternative (Ha) hypothesis 2. Establish your null distribution and test statistic i)Change x-axis to mean values. Insert sampling and null distributions; can compare how far away your data is. ii)Standardize to t-score (refer to equation) iii)Set alpha (significance level). Usually 5%. If your data is further from this, it's significant iv) calculate tobs v) calculate p-value. 3. Conduct statistical test Compare data to null hypothesis via statistical test. P>α, fail to reject the null hypothesis. P≤α, reject null hypothesis. 4. Draw conclusions 1)P≤α reveals that the mean is significantly less than null hypothesis and that the data provide strong evidence that the sample is not from the null hypothesis 2)P>α, reveals that the mean is NOT significantly less than null hypothesis and that the data do not provide strong evidence that the sample is not from the null hypothesis

chi squared test of independence

Tests for independence among categorical variables. Ho: Categorical variables are independent and HA: Categorical variables are not independent For our example of colour blindness, a Chi-squared test hypothesis would be: Ho: There is no difference in the degree of colour blindness between males and females HA: There is a difference in the degree of colour blindness between males and females

What assumption is made when thinking about the test statistic, tobs, and the null distribution?

The assumption for the null distribution is that the null hypothesis is true. Since the null hypothesis is essentially "everything is as expected here, nothing interesting is happening", the null distribution gives you the distribution of sample means you would expect to see, just due to sampling variation

Use the values on the number line to find the sampling error. x bar= 3.8 mu= 4.25

The difference between the point estimate and the actual parameter value is called the sampling error. x bar - mu 3.8-4.25= -0.45

game 1: 1/10 game 2: 1:10 which is better to play?

The probability of winning the first game is 1/10. The probability of winning the second game is number of wins/ number of outcomes= 1/11 Since the second probability is smaller, it would be wiser to play the first game.

Test Statistic

The result of the experiment or the statistic that's calculated from the sample

(T/F) The point estimate for the population proportion of failures is 1 - p hat

The statement is true

single-sample T-tests

This is used to compare a single obtained sample mean to a known or hypothesized population mean. Can be 1-tailed or 2-tailed. For example (2-tailed) Ho: There is no difference between the mean number of eggs per fish in the sample and the threshold of 1100. Ha: There is a difference between the mean number of eggs per fish in the sample and the minimum threshold of 1100. For example (1-tailed) Ho: The mean number of eggs per fish in the sample is not less than the threshold of 1100. Ha: The mean number of eggs per fish in the sample is less than the threshold of 1100.

If two events are mutually exclusive, they have no outcomes in common.

True

T/F If two events are independent, P(A|B) = P(B).

False. Two events A and B are independent if P(B|A)=P(B) or if P(A|B)=P(A).

Find the margin of error for the given values of c, σ, and n. c=0.95, σ=2.9, n=64

Zc x (sigma/sqrt of n) c= .95 (Zc of 1.96) 0.711

null hypothesis

a claim about a population parameter (i.e. mean) that takes a skeptical viewpoint (Ho). For example, Ho: The flu vaccine has no effect

alternative hypothesis

a claim about a population parameter that represents everything not included in the null hypothesis (Ha). For example, Ha: The flu vaccine has an effect

level of significance, significance level

a criterion of judgement upon which a decision is made regarding the value stated in a null hypothesis; the criterion is based on the probability of obtaining a statistic measured in a sample if the value stated in the null hypothesis were true (usually at 5%; less than 5% = reject the null)

Significance Level

a criterion of judgement upon which a decision is made regarding the value stated that the actual value of pop. parameter is less than, greater than, or not equal to the value stated in the null hypothesis

Critical Value

a cutoff value that defines the boundaries beyond which less than 5% of sample means can be obtained if null is true (sample means beyond this are rejected)

critical value

a cutoff value that defines the boundaries beyond which less than 5% of sample means can be obtained if the null hypothesis is true (sample means obtained beyond a critical value will result in a decision to reject the null hypothesis)

one-tailed test

a directional test, reflecting a directional hypothesis. For example, we are expecting "the mean to be less than [mean value]".

two-tailed test

a hypothesis test in which the research hypothesis does not indicate a direction of the mean difference or change in the dependent variable, but merely indicates that there will be a mean difference

test statistic

a mathematical formula that identifies how far or how many standard deviations a sample outcome is from the value stated in a null hypothesis; allows researchers to determine the likelihood of obtaining sample outcomes if the null hypothesis were true (value is used to make a decision regarding a null hypothesis)

Cohen's d

a measure of effect size in terms of the number of standard deviations that mean scores shifted above or below the population mean stated by the null hypothesis (larger the value of the d, larger the effect in the population)

hypothesis testing, significance testing

a method for testing a claim or hypothesis about a parameter in a population, using data measured in a sample; in this method, we test a hypothesis by determining the likelihood that a sample statistic would be selected if the hypothesis regarding the population parameter were true

Population Parameter

a numerically valued attribute of a model for a population

Alternative Hypothesis

a statement that directly contradicts the null and offers all other possible solutions

test statistic (t-test)

a statistic whose value helps determine whether a null hypothesis should be rejected and is used to compare the means of two groups.

One Sample Z-Test

a statistical procedure used to test hypotheses concerning the mean in a single population with a known variance

t-test

a statistical test used to evaluate the size and significance of the difference between two means

State whether the standardized test statistic z indicates that you should reject the null hypothesis. (left-tailed) (a) z=1.208 (b) z=−1.364 (c) z=−1.467 (d) z=- 1.189 a) For z=1.208, should you reject or fail to reject the null hypothesis?

a) Fail to reject H0 because z > −1.285. b) Reject H0 because z < −1.285. c) Reject H0 because z < −1.285. d) Fail to reject H0 because z > −1.285.

A light bulb manufacturer guarantees that the mean life of a certain type of light bulb is at least 750 hours. A random sample of 24 light bulbs has a mean life of 728 hours. Assume the population is normally distributed and the population standard deviation is 65 hours. At α=0.02, do you have enough evidence to reject the manufacturer's claim? This is left-tailed. a)) Identify the null hypothesis and alternative hypothesis. b) Identify the critical value(s). C) Identify the standardized test statistic d) Decide whether to reject or fail to reject the null hypothesis (e) interpret the decision in the context of the original claim.

a) H0: μ ≥ 750 (claim), Ha: μ < 750 b) critical value z0 = invNorm(α, 0, 1) = invNorm(0.02, 0, 1) ≈ −2.05 c) z = (728 − 750)/(65/√24) ≈ −1.66 (used STAT tests menu, p. 368) d) Fail to reject H0 because z > −2.05. e) There is not sufficient evidence to reject the claim that mean bulb life is at least 750 hours.
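
The light bulb test can be reproduced with a minimal sketch. It assumes Python's `statistics.NormalDist().inv_cdf` in place of the calculator's invNorm; the variable names are illustrative and not from the original card.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement
mu0, xbar, sigma, n, alpha = 750, 728, 65, 24, 0.02

z = (xbar - mu0) / (sigma / sqrt(n))   # standardized test statistic
z_crit = NormalDist().inv_cdf(alpha)   # left-tailed critical value
reject = z < z_crit                    # rejection region lies to the left of z_crit

print(round(z, 2), round(z_crit, 2), reject)  # -1.66 -2.05 False
```

Since z is not in the rejection region, the decision is to fail to reject H0.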

Determine whether to reject or fail to reject H0 at the level of significance of a)α=0.07 and b) α=0.02. H0: μ=123, Ha: μ≠123, and P=0.0396

a) Reject H0 because P<0.07 b) Fail to reject H0 because P>0.02

During a 52-week period, a company paid overtime wages for 16 weeks and hired temporary help for 7 weeks. During 4 weeks, the company paid overtime and hired temporary help. Complete parts (a) and (b) below. (a) Are the events "selecting a week that contained overtime wages" and "selecting a week that contained temporary help wages" mutually exclusive? (b) If an auditor randomly examined the payroll records for only one week, what is the probability that the payroll for that week contained overtime wages or temporary help wages?

a.) No b.) 0.365 (16/52 + 7/52 − 4/52 = 19/52)
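
The addition rule P(A or B) = P(A) + P(B) − P(A and B) used here can be sketched as follows. The helper name `p_a_or_b` is hypothetical, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def p_a_or_b(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

weeks = 52
# 16 overtime weeks, 7 temporary-help weeks, 4 weeks with both
p = p_a_or_b(Fraction(16, weeks), Fraction(7, weeks), Fraction(4, weeks))
print(p, round(float(p), 3))  # 19/52 0.365
```

The events are not mutually exclusive because P(A and B) = 4/52 is not zero.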

A company that makes cartons finds that the probability of producing a carton with a puncture is 0.03, the probability that a carton has a smashed corner is 0.08, and the probability that a carton has a puncture and has a smashed corner is 0.002 a.) mutually exclusive? b.) If a quality inspector randomly selects a carton, find the probability that the carton has a puncture or has a smashed corner.

a.) no b.)0.108 (.03+.08-.002)

(a) List an example of two events that are independent. (b) List an example of two events that are dependent.

a.) rolling a die twice b.) Drawing one card from a standard deck, not replacing it, and then selecting another card

estimation

allows us to describe the distribution of the population parameters (How large is the effect)

Whats another name for a type 1 error?

alpha

Because a two-sided test is symmetric, you can easily use a confidence interval to test a two-sided hypothesis. C = 1-... You just have to do 1- C and divide by ...

alpha 2

The probability that the test statistic will fall inside rejection region due to chance alone is equal to:

alpha; one minus the confidence level; the significance level

A null and alternative hypothesis are given. Determine whether the hypothesis test is left-tailed, right-tailed, or two-tailed. H0:σ ≥ 66 Ha: σ < 6

left-tailed test. The tail is always determined by Ha: less than (left tail), greater than (right tail), not equal to (split between both tails)

The confidence level C (in %) represents an ... of corresponding size C under the sampling distribution.

area

Find the P-value for a left-tailed hypothesis test with a test statistic of Z=−1.15. Decide whether to reject H0 if the level of significance is α=0.05

area to the left of z = normalcdf(−10000, −1.15, 0, 1) = 0.1251, so P-value = 0.1251. To use a P-value to make a conclusion in a hypothesis test, compare the P-value with α. If P ≤ α, then reject H0. If P > α, then fail to reject H0. Since P > α, fail to reject H0.
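
A small sketch of the P-value decision rule, assuming Python's `statistics.NormalDist().cdf` in place of the calculator's normalcdf. The function name `z_p_value` and its `tail` argument are illustrative; a right-tailed test uses the upper-tail area instead.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """P-value for a z statistic; tail is 'left' or 'right'."""
    area_left = NormalDist().cdf(z)
    return area_left if tail == "left" else 1 - area_left

alpha = 0.05
p_left = z_p_value(-1.15, "left")
decision = "reject H0" if p_left <= alpha else "fail to reject H0"
print(round(p_left, 4), decision)  # 0.1251 fail to reject H0
```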

error of interpreting r: data based on ...

averages

can be analyzed to determine if there is an association between the two variables

bivariate (paired) data

Determine which numbers could not be used to represent the probability of an event.

can't be less than 0 or greater than 1 -can be % -can be fraction -can be any decimal places (not just two)

If two events are mutually exclusive, why is P(A and B)=0?

cannot occur at the same time Two events are said to be mutually exclusive if they cannot occur simultaneously.

error of interpreting r: concluding that correlation implies ...

causality

example: even though the population of a is strongly skewed, the sampling distribution of x-bar when n=25 is approximately normal, as expected from the ...

central limit theorem

when randomly sampling from any population with mean μ and standard deviation σ, when n is large enough, the sampling distribution of x-bar is approximately normal

central limit theorem

Statistical significance only says whether the effect observed is likely to be due to ... because of random sampling.

chance alone

The number of ways a five- member committee can be chosen from 10 people.

combination (order doesn't matter, all positions are equal): 10C5 = 252

Cohort studies enlist individuals of _______________________________, and keep track of them over a long period of time.

common demographic

With significance tests, you should plot your results, ... them with a baseline or similar studies.

compare

Use the given statement to represent a claim. Write its complement and state which is H0 and which is Ha. sigma = 3

complement: sigma does not equal 3. H0: sigma = 3, Ha: sigma does not equal 3

continuous variables that can take on any one of an infinite number of possible values over an interval

continuous sample space

exists between two variables when one of them is linearly related to the other in some way

correlation

averages may suppress individual variation and may inflate the ...

correlation coefficient

significance, statistical significance

describes a decision made concerning a value stated in the null hypothesis (when the null hypothesis is rejected, we reach significance; when the null hypothesis is retained, we fail to reach significance)

alternative hypothesis

designated Ha, is what is concluded if there is sufficient evidence to reject the null hypothesis -The alternative hypothesis can be one-sided or two-sided. A one-sided test is referred to as a one-tailed test, and a two-sided test is referred to as a two-tailed test.

Step 1

determine null and alternative hypothesis

What's an effect size?

diff btw hypothesized parameter values

Effect

difference between a sample mean and the population mean stated in the null (significant to reject the null)

combinations

division 5C3= (5P3)/3!
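
The relation nCr = nPr / r! can be checked with Python's standard `math.comb`, `math.perm`, and `math.factorial` (available in Python 3.8+); this is a quick sketch, not part of the original card.

```python
from math import comb, factorial, perm

# Dividing by r! removes the orderings that a permutation counts separately
five_choose_three = perm(5, 3) // factorial(3)
print(five_choose_three, comb(5, 3))  # 10 10

# Five-member committee chosen from 10 people
committees = comb(10, 5)
print(committees)  # 252
```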

What's the difference between effect size and statistical significance?

effect size is practical

if we conclude that there is a linear correlation between x and y, we can find a linear equation that expresses y in terms of x and that ... can be used to predict the values of y for given values of x (simple linear regression)

equation

Use the confidence interval to find the estimated margin of error. Then find the sample mean. A biologist reports a confidence interval of(2.3,3.5) when estimating the mean height (in centimeters) of a sample of seedlings.

estimated margin of error= 3.5-2.3 (all divided by 2)= 0.6 sample mean=2.3+0.6 (left margin plus) OR 3.5-0.6 (right margin/upper limit MINUS) =2.9
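
Recovering the margin of error and sample mean from the endpoints can be sketched as below; the function name `margin_and_mean` is hypothetical.

```python
def margin_and_mean(lower, upper):
    """Return (margin of error, sample mean) from confidence interval endpoints."""
    margin = (upper - lower) / 2   # half the interval width
    mean = (upper + lower) / 2     # the midpoint of the interval
    return margin, mean

margin, mean = margin_and_mean(2.3, 3.5)
print(round(margin, 2), round(mean, 2))  # 0.6 2.9
```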

an ... is a subset of the sample space

event

When P-value > α, we ... H0.

fail to reject

T/F If you want to support a claim, write it as your null hypothesis.

false. If you want to support a claim, write it as your alternative hypothesis. A hypothesis test can only reject or fail to reject the null hypothesis. Failing to reject the null hypothesis does not mean that the null hypothesis is true. So to support a claim, the desired result of the test would be to reject the opposite of that claim. Thus, the opposite of the claim should be stated as the null hypothesis, and the claim should be the alternative hypothesis.

Step 2

find critical value, how many tails?, p-value

Construct the confidence interval for the population mean μ. c=0.98, x bar=4.3, σ=0.6, and n=50

find the margin of error, then subtract it from xbar to find the left value (lower limit) and add it to xbar to find the right value (upper limit). For c = 0.98 each tail has area 0.01, so Zc = invNorm(0.99) = 2.326. E = 2.326(0.6/sqrt50) = 0.197. xbar minus and plus E = (4.10, 4.50)
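
As a check on the critical value: for a two-sided interval at confidence level c, z_c is the (1 + c)/2 quantile of the standard normal curve, not the c quantile. A minimal sketch, assuming `statistics.NormalDist().inv_cdf` plays the role of invNorm; `z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, c):
    """Confidence interval for a mean when sigma is known."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)  # leaves (1 - c) / 2 in each tail
    e = z_c * sigma / sqrt(n)
    return xbar - e, xbar + e

lower, upper = z_interval(4.3, 0.6, 50, 0.98)
print(round(lower, 2), round(upper, 2))  # 4.1 4.5
```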

central limit theorem

for any given distribution with a mean and variance, the sampling distribution of the mean approaches a normal distribution as sample size increases (with standard deviation sd/sqrt(n))

r^2 represents the ... of the variation in y that is explained by the regression model

fraction

The alternative hypothesis, Ha, is a more ... statement that complements yet is mutually exclusive with the null hypothesis.

general

a conditional percent is computed using the counts within a single row or a single column and the denominator is the corresponding row or column total rather than the table ... total

grand

most of the time we just don't know if the population is normal and all we have is sample data so: -we can summarize the data with a ... and describe its shape -if the sample is ..., the shape of the histogram should be similar to the shape of the population distribution -the ... can help guess whether the sampling distribution should look roughly normal or not

histogram random central limit theorem

List 2 ways to calc effect size

how far scores shifted in pop, percent of variance that can be explained by given variable

A test of statistical significance tests a specific ... using sample data to decide on the validity of the hypothesis.

hypothesis

nondirectional tests, two-tailed tests

hypothesis tests in which the alternative hypothesis is stated as not equal to a value stated in the null hypothesis; hence, the researcher is interested in any alternative to the null hypothesis

the ... is a necessary mathematical descriptor of the regression line and it does not describe a specific property of the data

intercept

you can use the equation of the ... to predict y for any value of x within the range studied

least-squares regression

the ... is the unique line such that the sum of the ... between the data points and the line is zero, and the sum of the squared vertical distances is the smallest possible

least-squares regression line vertical distances

averages are ... variable than individual observations

less

When the standard error of a sample mean is decreased by increasing n, it becomes ___

less variable

If an experiment has several factors, a treatment is a combination of specific _____________ of each factor.

levels

if r is close to -1 or 1, we conclude that there is significant ... correlation

linear

least squares regression is only for ... so always plot the raw data to confirm

linear associations

... measures the strength of the linear association between paired x and y quantitative values in a sample

linear correlation coefficient r

t-test v normal test

both have mean = 0; the t distribution has sd > 1 while the standard normal has sd = 1; the t distribution has fatter tails

Cohen's D

measure of effect size in terms of # of SD's that the mean scores shifted above or below the pop. mean stated by the null

For the statement below, write the claim as a mathematical statement. State the null and alternative hypotheses and identify which represents the claim. A laptop manufacturer claims that the mean life of the battery for a certain model of laptop is less than 9 hours.

mu < 9 H0: mu greater than or equal to 9 Ha: mu < 9 The alternative hypothesis Ha: μ < 9 is the claim.

mu <3 hours H0: mu > or equal to 3 hours (equality is always null) Ha: mu < 3hours The alternative hypothesis Ha: μ<3 is the claim.

In a random sample of 24 people, the mean commute time to work was 32.8 minutes and the standard deviation was 7.3 minutes. Assume the population is normally distributed and use a t-distribution to construct a 90% confidence interval for the population mean μ. What is the margin of error of μ? Interpret the results.

n = 24, mean = 32.8, s = 7.3 (sample standard deviation), Tc = 1.714 (d.f. = 23), margin of error = 1.714 × 7.3/sqrt(24) = 2.6, confidence interval = (30.2, 35.4). With 90% confidence, it can be said that the population mean commute time is between the bounds of the confidence interval.
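
The commute-time interval can be reproduced with a short sketch. Python's standard library has no t quantile function, so the critical value 1.714 is taken from the card (a t table entry for 90% confidence and 23 degrees of freedom) rather than computed.

```python
from math import sqrt

n, xbar, s = 24, 32.8, 7.3
t_c = 1.714  # table value from the card: 90% confidence, d.f. = n - 1 = 23

e = t_c * s / sqrt(n)  # margin of error uses the sample standard deviation s
print(round(e, 1), round(xbar - e, 1), round(xbar + e, 1))  # 2.6 30.2 35.4
```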

A ______________________ is a control in which the outcome is expected to stay the same or no response is expected.

negative control

when the sampling distribution is ..., we can standardize the value of a sample mean x-bar to obtain a z-score and this z-score can then be used to find areas under the sampling distribution from the normal probability table

normal

one way to assess if a data set has an approximately normal distribution is to plot the data on a ....

normal quantile plot

a family of symmetrical, bell-shaped curves defined by a mean (μ) and a standard deviation (σ); N(μ, σ)

normal/ Gaussian distribution

Find the P-value for the indicated hypothesis test with the given standardized test statistic, z. Decide whether to reject H0 for the given level of significance α. Right-tailed test with test statistic z=1.29 and α=0.04

P-value = normalcdf(1.29, 10000, 0, 1) = 0.0985. Fail to reject H0 because the P-value is greater than alpha

when a variable in a population is normally distributed, the sampling distribution of the sample mean x-bar is also ...

normally distributed

P-values that are ... don't give enough evidence against H0 and we fail to reject H0. Beware: We can never "prove H0."

not small

What are the two types of hypotheses used in a hypothesis test? How are they related?

null and alternative The null hypothesis H0 is a statistical hypothesis that contains a statement of equality, such as ≤, =, or ≥. The alternative hypothesis Ha is the complement of the null hypothesis. It is a statement that must be true if H0 is false and it contains a statement of strict inequality, such as >, ≠, or <. They are complements

A ________________ is a number summarizing a characteristic of the population while a _________________ is a number summarizing a characteristic of a sample.

parameter; statistic

when x is larger than the mean the z is ...

positive

A _________________________ is a control in which the outcome is expected to change.

positive control

Type 3 Error

possible with directional tests in which a decision would have been to reject the null, but the researcher decides to retain the null because the rejection region was located in the wrong tail

Statistical significance may not be ... important.

practically

What's the p value?

prob of getting extreme results given that null is true

Whats a type 2 error?

prob of keeping false null (false negative)

What's a type 1 error?

prob of rejecting h0 when its true (false positive)

Define power

prob of rejecting null assuming alternative is true

a ... is assigned for each possible simple event in the sample space S

probability

we define the ... of any outcome of a random phenomenon as the proportion of times the outcome would occur in a very long series of repetitions

probability

mathematically describe the outcome of random processes

probability models

Type II error

failing to reject the null hypothesis when it is false (a false negative). It is something that we have no direct control over. For example, the null hypothesis "Display Ad A is not effective in driving conversions" is false but is accepted as true.

Type I error

probability of rejecting null hypothesis when it is true/false positive. It is something that we decide. For example, a person is judged as guilty when the person actually did not commit the crime

A study found that 34% of the assisted reproductive technology (ART) cycles resulted in pregnancies. Twenty-five percent of the ART pregnancies resulted in multiple births.

probability that a randomly selected ART cycle resulted in a pregnancy and produced a multiple birth= (.34x0.25)= 0.085 The probability that a randomly selected ART cycle that resulted in a pregnancy did not produce a multiple birth= 0.750 unusual? No, this is not unusual because the probability is not less than or equal to 0.05

P-value

probability, if H0 was true, of obtaining a sample statistic at least as extreme as one obtained -small: statistically significant, reject null -large: don't give enough evidence against H0, fail to reject...we can NEVER prove H0

A probability experiment consists of rolling an eight-sided die and spinning the spinner shown at the right (4 colors). The spinner is equally likely to land on each color. Use a tree diagram to find the probability of the given event. Then tell whether the event can be considered unusual. -Event: rolling a number less than 4 and the spinner landing on red

probability= 0.094 (3/32) unusual? No, bc it's not close enough to 0. (An event that occurs with a probability of 0.05 or less is typically considered unusual.)
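
The tree diagram can be mimicked by enumerating every (die, spinner) outcome. In this sketch only "red" comes from the card; the other three color names are placeholders, since the original spinner image is not available.

```python
from fractions import Fraction
from itertools import product

die = range(1, 9)                            # eight-sided die: faces 1..8
colors = ["red", "blue", "green", "yellow"]  # only "red" is given; the rest are hypothetical

outcomes = list(product(die, colors))        # 8 * 4 = 32 equally likely outcomes
favorable = [(d, c) for d, c in outcomes if d < 4 and c == "red"]

p = Fraction(len(favorable), len(outcomes))
unusual = p <= Fraction(5, 100)              # the card's 0.05 rule of thumb
print(p, float(p), unusual)  # 3/32 0.09375 False
```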

Cohort studies -> _____________________

prospective

The point estimate for the population proportion of failures is

q hat = 1 - p hat

The margin of error does not cover all errors: The margin of error in a confidence interval covers only ... Undercoverage, nonresponse or other forms of bias are often more ... than random sampling error (e.g., our elections polls). The margin of error does not take these into account at all.

random sampling error serious

Experiments ____________________ the assignment of subjects to treatments.

randomize

Caution About Z Procedures for a Mean: -The data must be a probability sample or come from a .... Statistical inference cannot remedy basic design flaws, such as voluntary response samples or uncontrolled experiments. -The sampling distribution must be approximately .... This is not true in all instances (if the population is skewed, you will need a large enough sample size to apply the central limit theorem). -To use a z procedure for a population mean, we must know ..., the population standard deviation. This is often an ... requisite.

randomized experiment Normal σ unrealistic

A _____________________________ design gives two or more treatments to each subject over time, in random order.

repeated measures

Use of Sampling Distributions: -If the population is N(μ,σ), the ... is N(μ,σ/√n). -If not, the sampling distribution is ~N(μ,σ/√n) if n is .... -We take one random sample of size n, and rely on the ... of the sampling distribution.

sampling distribution large enough known properties

the ... is the probability distribution of that statistic for samples of a given size n taken from a given population

sampling distribution of a statistic

the value of r does not change if all values of either variable are converted to a different ...

scale

this regression equation expresses an association between x and y

simple linear regression

a linear regression model with one predictor variable is a .... model

simple linear regression (SLR)

A ________________________________ is made of randomly selected individuals.

simple random sample, SRS

the ... of the regression line describes how much we expect y to change, on average for every unit change in x

slope

With a large sample size, even a ... effect could be significant.

small

Because ... have a lot of chance variation, even large population effects can fail to be significant if the sample is small.

small random samples

don't subtract the z-value because normal curves are not ...

square

establishing causation from an observed association can be done if: 1. the association is ... 2. the association is ... 3. higher doses are associated with ... responses 4. the alleged cause precedes the ... 5. the alleged cause is ...

strong consistent stronger effect plausible

the probability that a binomial random variable takes any range of values is the ... of each probability for getting exactly that many successes in n observations

sum

effect

the difference between a sample mean and the population mean stated in the null hypothesis (an effect is not significant when we retain the null hypothesis; an effect is significant when we reject the null hypothesis)

the probability that the confidence interval contains p is c, assuming that ___

the estimation process is repeated a large number of times

Type II error

the failure to reject the null hypothesis when it is actually false.

Margin of Error

the greatest possible distance between the point estimate and the value of the parameter it is estimating

confidence interval

the range of values within which a population parameter is estimated to lie

rejection region

the region beyond a critical value in a hypothesis test (when the value of a test statistic is in the rejection region, we decide to reject the null hypothesis, otherwise we retain the null hypothesis)

Type I error

the rejection of the null hypothesis when it is actually true

A ____________________ is any specific experimental condition applied to the subjects.

treatment

A matched pairs design chooses pairs of subjects that are closely matched, like twins, and each pair is randomly assigned ____________________.

treatments

Bayes' theorem can be extended to events with more than ... outcomes

two

the mean of the sampling distribution x-bar is ...

there is no tendency for a sample average to fall systematically above or below ..., even if the population distribution is ...

u skewed

Voluntary response sampling and convenience sampling are biased while probability sampling is __________________.

unbiased

x-bar is an ... estimate of the population mean u

unbiased

A confidence interval is a range of values with an associated probability, or confidence level, C. This probability quantifies the chance that the interval contains the ....

unknown population parameter

t test

Use when a test concerns the value of an underlying or population mean. The test uses a statistic (t-stat) that follows a t-distribution. The t-distribution is a probability distribution defined by a single parameter, degrees of freedom; each number of degrees of freedom gives one distribution in the family of distributions. It has mean = 0 and sd > 1, so there is more probability for outcomes distant from the mean (fatter tails). As the # of degrees of freedom increases with sample size, the t-distribution approaches the standard normal distribution. Use for tests of the population mean of a normally distributed population with unknown variance (unknown population variance means unknown population sd).

chi-square test

used for hypothesis tests concerning the variance of a normally distributed population: chi-square = (n − 1)s^2/σ0^2, where s is the sample sd and σ0 is the hypothesized population sd

Obtained Value

value of a test statistic

Critical Values

values that separate sample statistics that are probable from sample statistics that are improbable, or unusual

a statistic computed from a random sample is a random ...

variable

spearman rank correlation coefficient Rs

use when the data for a t test on the correlation coefficient of 2 variables meaningfully depart from the distribution assumptions. Essentially the same thing as the correlation coefficient, but calculated on the ranks of the two variables. Gives a number from -1 to 1. -1 = perfectly inverse (decreasing) monotonic relationship, 1 = perfectly increasing monotonic relationship, 0 = no correlation

test for population mean

with known standard deviation sigma: z = (xbar - u0)/(sigma/sqrt(n)), then use the chart to see where the p-value falls relative to alpha; this doesn't provide any information about the true population mean u

the value of r is not affected by the choice of ... and ...

x y

In a survey of 608 males ages 18-64, 396 say they have gone to the dentist in the past year. Construct 90% and 95% confidence intervals for the population proportion. Interpret the results and compare the widths of the confidence intervals. If convenient, use technology to construct the confidence intervals.

x = 396, n = 608, p hat = 0.651, q hat = 0.349. For 90%, Zc = 1.645, margin of error E = 1.645 times sqrt(p hat times q hat/n) = 0.032, endpoint L = p hat - E = 0.619, endpoint R = p hat + E = 0.683. With the given confidence, it can be said that the population proportion of males ages 18-64 who say they have gone to the dentist in the past year is between the endpoints of the given confidence interval. The 95% confidence interval is wider.
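
Both intervals can be computed with a minimal sketch, assuming `statistics.NormalDist().inv_cdf` supplies z_c; the helper name `proportion_interval` is hypothetical. Because it carries unrounded values of p hat and z_c, its endpoints can differ from hand-rounded answers by about 0.001.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, c):
    """Confidence interval for a population proportion: p_hat +/- z_c * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    e = z_c * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - e, p_hat + e

ci90 = proportion_interval(396, 608, 0.90)
ci95 = proportion_interval(396, 608, 0.95)
print([round(v, 3) for v in ci90], [round(v, 3) for v in ci95])
```

A higher confidence level uses a larger z_c, which is why the 95% interval is wider than the 90% interval.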

p hat= x/n

x= number of successes in survey n= sample size

variable ... is the dependent or response variable

... = intercept + slope x

y-hat

... is the predicted value of y for a given value of x

y-hat

the probability of an event being equal to a single numerical value is ... when the sample space is continuous

zero

In hypothesis testing, does choosing between the critical value method or the P-value method affect your conclusion?

No, because both involve comparing the test statistic's probability with the level of significance The P-value method converts the standardized test statistic to a probability (P-value) and compares this with the level of significance, whereas the critical value method converts the level of significance to a z-score and compares this with the standardized test statistic. Thus, both methods will result in the same conclusion.
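
The agreement between the two methods can be illustrated for a left-tailed z-test. This is a sketch over a few arbitrarily chosen z and α values (they are not from the card), again assuming `statistics.NormalDist` as the normal-curve calculator.

```python
from statistics import NormalDist

std_normal = NormalDist()

def left_tailed_decisions(z, alpha):
    """Return (reject by P-value method, reject by critical value method)."""
    p_value = std_normal.cdf(z)         # convert the statistic to a probability
    z_crit = std_normal.inv_cdf(alpha)  # convert alpha to a z-score
    return p_value <= alpha, z <= z_crit

results = [
    left_tailed_decisions(z, alpha)
    for z in (-2.5, -1.66, -0.3, 1.2)
    for alpha in (0.01, 0.02, 0.05, 0.10)
]
print(all(by_p == by_crit for by_p, by_crit in results))  # True
```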
