Module 3 Stats

p-value

P-value: the probability of obtaining the observed sample statistic (e.g., the mean), or one more extreme, if the sample came from the original population. By convention, if p is less than or equal to .05 (p <= .05), the result is considered "statistically significant."

What term refers to the likelihood of a certain outcome occurring?

Probability

If a sample mean falls in the critical region, it is sufficiently unlikely to have come from the untreated population. What action should be taken?

Reject Ho

What is sampling error?

Sampling error is the difference between the population parameter and the sample statistic. It is impossible to eliminate sampling error but there are ways to reduce it such as a larger sample size.

What is another name for the standard deviation of a distribution of sample means?

Standard error

Identify the 4 steps in hypothesis testing.

1. State the null (H0) and alternative (Ha) hypotheses. 2. Establish the cutoff (critical) values against which the observed result is measured to determine statistical significance. 3. Calculate the test statistic. 4. Compare the test statistic to the critical values, and decide whether the null hypothesis (H0) should be rejected.

Combating Problems with Hypothesis Testing

Supplement hypothesis testing with additional measures such as effect size and confidence intervals

biased sample

a sample that is not representative of the population

Which of the following terms describes a claim that is made about a particular population parameter?

hypothesis

relative frequency method

involves performing numerous observations of a given situation and recording the number of times the event occurs - for finding probability of events where you are not 100% sure of the outcome

cluster sampling

Divide the population into clusters (e.g., U.S. voters grouped by state), then randomly select clusters (e.g., CA, NY, KY). Everyone in the selected clusters is studied. -great for very large populations

stratified sampling

Divide the population into strata (typically based on age, gender, race, or socioeconomic status), then randomly select people to sample from each stratum (not everyone in a stratum is studied). -great for populations with subsections/different characteristics

response bias

when participants respond in a way that is inaccurate or untruthful

selection bias

when the sample is not representative of the population (e.g., Literary Digest predicted Alf Landon would win, but its sample was upper-class people, who tend to vote Republican)

Statistical significance

-A result that is unlikely to have come from the original population -For instance, the sample mean is so far into the extreme that the probability is low that it could come from the original population. By convention, if p is less than or equal to .05 (p <= .05), the result is considered "statistically significant." This means that the probability of getting this sample mean, or something more extreme, from this population is 5% or less. It's still possible, but it is unlikely.

standard error

-Is the average difference between each sample mean and the population mean -indicates how much variability exists between samples -provides a measure of the SAMPLING ERROR (the larger the standard error the larger the sampling error) -We want the standard error to be small!

simple random sampling

-every member of the population has an equal chance of being selected for the sample -the probability that each member is selected is independent of one another -computers are often used to generate random numbers -best for small sample sizes

What is the general value that is used to determine statistical significance?

0.05

What are the four steps of hypothesis testing?

1. State the hypotheses (H0 and Ha). H0: There is no relationship between test scores and tutoring (M = 75). Ha: There is a relationship between test scores and tutoring (M ≠ 75, or M > 75 for a one-tailed test). 2. Establish the decision criteria (set the alpha level, or significance level, at 0.05). 3. Collect data and compute sample statistics, then convert the sample statistic into a test statistic (e.g., a z-score). 4. Make a decision: if p-value <= alpha level, reject the null hypothesis (the result is statistically significant); if p-value > alpha level, fail to reject the null hypothesis (there is no significant treatment effect).
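
The four steps can be sketched as a two-tailed one-sample z-test. This is a minimal illustration, not part of the original card: the function name `one_sample_z_test` and the numbers (a tutored sample of n = 36 with mean 79 and a known population standard deviation of 10) are assumptions chosen to match the M = 75 example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test; returns (z, p_value, reject_h0)."""
    # Step 3: convert the sample mean into a test statistic (z-score).
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Probability of a result this extreme or more, in either tail, if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: reject H0 when the p-value is at or below alpha.
    return z, p_value, p_value <= alpha

# Step 1: H0 says M = 75; Ha says M != 75 (hypothetical numbers).
# Step 2: alpha is set to 0.05 through the default argument.
z, p, reject = one_sample_z_test(sample_mean=79, mu0=75, sigma=10, n=36)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

With these assumed numbers the z-score is 2.40 and the p-value falls below 0.05, so the sketch rejects H0.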

Problems with Hypothesis Testing

1. Conventional levels of alpha (.05, .01, or .001) are arbitrary. They were chosen by convention, but probabilities exist on a continuum. 2. A larger sample size is more likely to achieve statistical significance than a smaller sample size. Statistical significance does not always imply practical significance; even small effects can be statistically significant with a large enough sample size. 3. There is too much reliance on the p-value as the sole measure upon which conclusions are made. Conclusions should also draw on past evidence, the validity of the assumptions made, and so forth. 4. People (including researchers!) misinterpret the p-value and the results of hypothesis testing. 5. There is a bias towards publishing statistically significant results over non-significant results. This leads to an incomplete and biased picture of the findings.
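
Problem 2 can be illustrated with a small sketch. The numbers are hypothetical (a sample mean of 75.5 against a population mean of 75 with a standard deviation of 10), and Cohen's d is used here as one common effect-size measure; the card itself does not name a specific one.

```python
from math import sqrt
from statistics import NormalDist

def z_and_p(sample_mean, mu0, sigma, n):
    """Two-tailed z-test: returns the z-score and its p-value."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

def cohens_d(sample_mean, mu0, sigma):
    """Effect size in standard-deviation units; does not depend on n."""
    return (sample_mean - mu0) / sigma

# The same half-point difference, tested with a small and a very large sample.
for n in (25, 10_000):
    z, p = z_and_p(75.5, 75, 10, n)
    print(f"n = {n}: z = {z:.2f}, p = {p:.3g}, d = {cohens_d(75.5, 75, 10):.2f}")
```

The effect size stays at 0.05 standard deviations in both cases, but only the large sample produces a p-value below 0.05, which is why effect size is worth reporting alongside the p-value.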

Central Limit Theorem

1. The distribution of sample means is approximately normal as long as the sample size is large (n = 30 or more). This is true REGARDLESS of the shape of the original population distribution. 2. The mean of all the sample means is equal to the population mean! 3. The standard deviation of the distribution of sample means is known as the STANDARD ERROR and is calculated as the population standard deviation / square root of n (σ/√n).
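
Points 2 and 3 can be checked with a quick simulation. This is a sketch under stated assumptions: the population is a skewed exponential distribution with mean 1 and standard deviation 1, and the function name, the seed, and the number of samples are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw many samples of size n and return the mean of each one."""
    rng = random.Random(seed)
    # Exponential population with rate 1: strongly skewed, mean = 1, sd = 1.
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(num_samples)
    ]

sample_means = simulate_sample_means(n=30)
print("mean of sample means:", round(mean(sample_means), 3))   # close to 1
print("sd of sample means:  ", round(pstdev(sample_means), 3))
print("sigma / sqrt(n):     ", round(1 / sqrt(30), 3))
```

The simulated values only approximate the theoretical ones, because a finite number of samples is drawn rather than every possible sample.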

What is the probability that in a standard deck of 52 cards you get a heart?

13/52= 0.25
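
The same answer can be approached with the relative frequency method by drawing a card many times and counting hearts. A minimal sketch, where the function name, the seed, and the number of draws are assumptions for illustration:

```python
import random

def relative_frequency_of_heart(num_draws, seed=0):
    """Draw one card (with replacement) num_draws times; return the share of hearts."""
    rng = random.Random(seed)
    suits = ["hearts", "diamonds", "clubs", "spades"]
    deck = [(rank, suit) for suit in suits for rank in range(1, 14)]  # 52 cards
    hearts = sum(1 for _ in range(num_draws) if rng.choice(deck)[1] == "hearts")
    return hearts / num_draws

classical = 13 / 52  # 13 hearts out of 52 equally likely cards
estimate = relative_frequency_of_heart(100_000)
print(classical, round(estimate, 3))
```

The estimate should land near 0.25 and generally gets closer as the number of draws grows.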

Calculate the standard error of a distribution of sample means for a sample size of 50 and a population standard deviation (σ) of 2.5.

2.5 / sqrt(50) = 0.35
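
The calculation for this card, σ/√n with σ = 2.5 and n = 50, can be verified in a few lines. The helper name `standard_error` is an assumption, and the second sample size (200) is added only to show how a larger n shrinks the result.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means."""
    return sigma / sqrt(n)

print(round(standard_error(sigma=2.5, n=50), 2))   # 0.35
print(round(standard_error(sigma=2.5, n=200), 2))  # 0.18, smaller with a larger n
```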

Sample size is considered large if it contains at least ____ members.

30

What is the difference between a parameter and a statistic?

A parameter is a numerical value that describes a population, such as the population mean. A statistic is a numerical value that describes a sample, such as a sample mean.

What theory identifies several key characteristics of a distribution of sample means?

Central Limit Theorem

If a sample mean does not fall in the critical region, it is likely to have come from the untreated population. What action should be taken?

Fail to reject Ho

How to reduce sampling error?

Increase the sample size or use STRATIFIED sampling (which increases the likelihood that the sample is representative of the population) - the larger the sample size, the smaller the sampling error

How to reduce the standard error?

Increase the sample size! (with a larger n, the standard error is smaller)

Why is it important to have a representative sample?

It is important to have a representative sample because we're trying to make inferences about a certain population. In order to make correct inferences about this population, the sample must reflect the population. For instance, not having a representative sample is what caused Literary Digest to predict that Alf Landon would win, when FDR actually won in a landslide.

Which of the following is a disadvantage of simple random sampling?

It is most practical for small populations.

Which type of sampling involves a rule being established for how sample members will be selected?

Systematic sampling

T/F: Hypothesis testing is based on the logic of falsification. T/F: Hypothesis testing can prove claims.

True. It is based on the logic of falsification (trying to reject the null hypothesis). False. Never say "prove" in hypothesis testing.

probability

The likelihood of a certain outcome occurring -must be between 0 and 1 (0 = no chance, 1 = will always happen) -reported as a DECIMAL -may be described as a "PROPORTION" -In research, probabilities are used to determine the likelihood that the result of a sample came from the original population.

What is the difference between the null and alternative hypotheses?

The null hypothesis (H0) is the beginning assumption of any hypothesis test that makes a particular claim about the value of a population parameter and, in essence, claims that an observed effect or difference does not exist. The alternative hypothesis (Ha) states that the value of the population parameter differs from the value claimed in the null hypothesis and, in essence, claims that an observed result or difference does exist.

Which of the following statements is NOT true regarding a representative sample?

The results from a representative sample favor particular results.

What is the purpose of a significance level?

The significance level (α) is the value that the P-value of a test statistic can be compared to in order to determine statistical significance. This value is typically 0.05.

Describe some of the major problems with hypothesis testing.

- The significance level is an arbitrary value. - Hypothesis testing is sensitive to sample size. - P-values do not reflect the size of an effect. - Hypothesis testing is often used as the only means to determine significance. - The results of hypothesis testing are frequently misinterpreted and misunderstood. - There is a publishing bias towards statistically significant results.

What is the difference between Type I and Type II errors?

Type I errors are false positives, whereas Type II errors are false negatives. A Type I error is when you reject the null hypothesis when the null hypothesis is true. In other words, you are saying a treatment has an effect when it is useless. It is generally considered the more costly error. A Type II error is when you fail to reject the null hypothesis when the null hypothesis is false. In other words, you are saying a treatment has no effect when it is useful. This error is generally considered less costly.

What is a sampling distribution?

a frequency distribution of a statistic from every possible sample of a given size n from the population.

What is a distribution of sample means?

a frequency distribution of all possible sample means.

Representative sample

consists of members that possess the same characteristics as those of the population (e.g. age distribution)

The area in the distribution of sample means where a low probability exists

critical region

Critical values

cutoff values, serve as the boundaries for the critical regions. If the test statistic lies in the critical region (region of low probability), then there is reason to reject the null hypothesis.
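
For a z-test, the critical values can be looked up from the standard normal distribution. A minimal sketch, assuming a z-test and the conventional alpha of 0.05; the function names are hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha=0.05, two_tailed=True):
    """Positive cutoff z-score that bounds the critical region."""
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)

def in_critical_region(z, alpha=0.05):
    """True when a two-tailed test statistic lies at or beyond the cutoff."""
    return abs(z) >= critical_z(alpha)

print(round(critical_z(), 2))                  # 1.96 (two-tailed boundaries at -1.96 and +1.96)
print(round(critical_z(two_tailed=False), 3))  # 1.645
print(in_critical_region(2.4), in_critical_region(1.2))
```

A test statistic of 2.4 lies beyond 1.96, so it is in the critical region; 1.2 does not.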

What are non-sampling errors? Also known as? Examples?

Errors that are not the result of random sampling; also known as sampling bias. Examples: measurement bias, response bias, selection bias.

measurement bias

may result from a mistake during the measurement process or poorly worded questions (e.g., a scale placed on carpet overestimates weight)

systematic sampling

random sampling with a system in which the starting point is random and each subsequent member is selected at a fixed interval (e.g., every 3rd person) -the probability that each member is selected is not independent of one another -random samples can be achieved in a way that is more efficient than simple random sampling
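
The four probability sampling methods in this set (simple random, systematic, stratified, cluster) can be compared side by side. This is a sketch under assumptions: the population of 120 people spread evenly over four states is made up, state is used as both the stratum and the cluster, and all function names are hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 120 people spread evenly over 4 states.
states = ["CA", "NY", "KY", "TX"]
population = [{"id": i, "state": states[i % 4]} for i in range(120)]

def group_by(pop, key):
    """Split the population into groups that share the same value for key."""
    groups = {}
    for person in pop:
        groups.setdefault(person[key], []).append(person)
    return groups

def simple_random_sample(pop, k):
    # Every member has an equal chance of selection.
    return rng.sample(pop, k)

def systematic_sample(pop, interval):
    # Random starting point, then every interval-th member.
    start = rng.randrange(interval)
    return pop[start::interval]

def stratified_sample(pop, key, k_per_stratum):
    # Randomly select some members from every stratum.
    sample = []
    for members in group_by(pop, key).values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def cluster_sample(pop, key, num_clusters):
    # Randomly select whole clusters, then study everyone in them.
    groups = group_by(pop, key)
    chosen = rng.sample(sorted(groups), num_clusters)
    return [person for name in chosen for person in groups[name]]

print(len(simple_random_sample(population, 12)))        # 12
print(len(systematic_sample(population, 3)))            # 40
print(len(stratified_sample(population, "state", 3)))   # 12
print(len(cluster_sample(population, "state", 2)))      # 60
```

Note the difference between the last two: the stratified sample contains a few people from every state, while the cluster sample contains everyone from only two states.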

convenience sampling

selecting a sample that is convenient and easy to access (e.g., asking everyone in a lecture hall in order to determine the rate of left-handedness at your school) -does NOT result in a representative sample -generally is more prone to bias than other sampling methods

What term is used to delineate a sample statistic that deviates from a population parameter to a degree that appears to be beyond chance?

statistically significant

Hypothesis testing

the formal process by which sample data is used to evaluate a statistical hypothesis or a claim about a population (make inferences about a population) -based on probability (it's not black and white), making a best guess based on evidence from our sample

Alpha level (α) / significance level

the maximum probability that one would make a Type I error if the null hypothesis is true
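
This definition can be checked by simulation: when the null hypothesis is true, a two-tailed z-test at alpha = 0.05 should reject in roughly 5% of repeated samples. A minimal sketch, where the population values (mean 75, standard deviation 10), the sample size, the number of trials, and the seed are all assumptions for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, mu0=75.0, sigma=10.0, n=25, trials=4000, seed=1):
    """Share of samples drawn under a true H0 that are (wrongly) rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed cutoff
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Sample from the untreated population, so H0 really is true.
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        if abs((sample_mean - mu0) / se) >= z_crit:
            rejections += 1  # a false positive (Type I error)
    return rejections / trials

print(type_i_error_rate())  # expected to be near 0.05
```

The simulated rate varies a little from run to run with different seeds, but it should stay close to the chosen alpha.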
