psych stats final

What are the degrees of freedom for a 3 x 3 chi-square test of independence?

(r-1)(c-1) = (3-1)(3-1) = 4

validity

- can you trust the results of the study? -refers to the extent that the instrument measures what it was designed to measure

nominal level of measurement

-categorical -always *discrete*

interval/ratio level of measurement

-clear ordering of values -consistent (equal) spacing between consecutive values -ex: pages read, response times, exam scores - can be *discrete* or *continuous*

ordinal level of measurement

-clear ordering of values -does not contain useful information about distance between variables -always *discrete* -ex: award won (gold/silver/bronze), military rank, economic class, likert scales

internal validity

-degree to which you can draw correct conclusions about the causal relationships between variables -refers to the relationships between things "inside" the study -threatened by *confounding* *variables*

external validity

-how generalizable and applicable are your findings -threatened by *artificial* *Variables*

"significant" meaning

-if the data allow us to reject the null hypothesis, the result is "statistically significant" -doesn't mean the result is important in the real world -"significant" here is closer to "indicated" than to the modern meaning "important"

Reliability

-is the degree of consistency of a measure. - can the measure produce the same results under the same conditions

outcome variable

-measured, not manipulated -proposed effect

Predictor variable

-the manipulated variable -independent variable -proposed cause -has 2+ levels --the levels are the specific groups, categories, or values of the predictor variable that the researcher used in an experiment

1 vs 2 tailed tests

two-tailed: -alternative hypothesis covers both "sides" of the distribution -critical region of the test covers both tails of the sampling distribution one-tailed: -critical region covers only one tail of the sampling distribution -might encourage cheating (fishing for results to match expectations)

appropriate files for jasp

.jasp .csv

If an event is impossible, then the probability of that event will be

0

According to the law of probability, the probabilities of the elementary events must add up to __________ .

1

What is the formula for "not B"?

1 - P(B)

misconceptions around p values

1. A significant result means that the effect is important -significance depends on sample size -statistical significance doesn't mean practical significance 2. A significant result means that the null hypothesis is false -based on probabilities-> unlikely (not false) 3. a non-significant result means the null hypothesis is true -non-significant results only indicate effect is not big enough to be found (given sample size), not that the effect size is 0

If difference between sample means is larger than we would expect (based on standard error), there are 2 possibilities:

1. No effect; by chance we got two samples that are atypical of the population they came from 2. The samples are typical of their populations; they come from different populations

Steps for ANOVA

1. pick a test 2. list and check assumptions 3.state hypothesis -ANOVA ALWAYS has nondirectional hypothesis -nondirectional, between subjects, one way anova H0: μ1 = μ2 = μ3 H1:At least one population mean is different from the others 4. decision rule -If F ≥ Fcv , reject H0 -If F < Fcv , fail to reject H0 -ANOVAs have one critical value. You find the critical value by looking at the F table in the back of your book.

assumptions of independent samples t-test

1. random samples 2. independence of observation 3. the dependent variable is normally distributed in each population 4. Homogeneity of variance

If alpha is set at 5%, the critical value of z should be _____ for a one-tailed test and ______ for a two-tailed test.

1.65; 1.96
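The two critical values can be checked with Python's standard library. This is a minimal sketch; the variable names are illustrative, and the card's 1.65 is the usual table rounding of roughly 1.645.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist(mu=0, sigma=1)  # standard normal curve

# One-tailed test: all of alpha sits in a single tail.
z_one_tailed = std_normal.inv_cdf(1 - alpha)

# Two-tailed test: alpha is split evenly between the two tails.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(round(z_one_tailed, 3), round(z_two_tailed, 3))
```

The two-tailed value is larger because only 2.5% of scores lie beyond it in each tail.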

You roll a die with 20 sides (numbered 1 through 20). On a single roll, what is the probability of rolling a 13? Report as a decimal rounded to 2 decimal places.

1/20= .05

What is the critical value for a 3 x 3 chi-square test of independence if alpha = 0.01? Report your answer to 3 decimal places.

13.277 check table

F(6,30) = 2.81, p < .05. For the results in APA format above, what is Fcv?

2.421

To determine which z-score marks off the middle 95% of scores under a normal curve, we find the z-score with ____ % of scores above it.

2.5%

If the population mean is 50, the population standard deviation is 20, and sample size is 100, what is the standard error?

20/√100 = 20/10 = 2
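The same arithmetic as a small helper, in case it is useful for checking other practice problems. The function name is an assumption for illustration.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

# Values from the card: standard deviation 20, sample size 100.
se = standard_error(20, 100)
print(se)
```

Note that the population mean (50) is not needed for this calculation.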

If SSWithin = 160 and SSTotal = 250, what is SSBetween?

250-160=90

In a research article, you come across the following result for a chi-square goodness-of-fit test: Which of the following statements are correct? chi-square (2, N=100) = 6.21, p< 0.05

There were 3 categories (df = k - 1 = 2); the null hypothesis was a poor fit to the observed data

Interquartile range (IQR)

Quartiles are the 3 values that split sorted data into 4 equal parts; the IQR (Q3 - Q1) spans the middle half of the data

F(6,30) = 2.81, p < .05. For the results in APA format above, what is N?

37

One hundred students took a K300 exam, and half of them studied beforehand. Out of the 50 students who studied, 40 got an A on their K300 exam. If there is no relationship between studying and exam grade, how many of 50 students who did not study should we expect to get an A on their K300 exam?

40

You are interested in the average salary an IU graduate makes 10 years after graduating from IU. You manage to get data from 2500 alumni, and the mean salary of your sample is $56,000, with a standard deviation of $1000. What is the best estimate of the population mean in this example?

56,000 for large samples, the sample mean is the best guess for the population mean

In a normal distribution, what percentage of cases fall within -1 and +1 standard deviations of the mean?

68

F(6,30) = 2.81, p < .05. For the results in APA format above, how many groups were tested?

7
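The F(6,30) cards all rely on the same bookkeeping for a one-way between-subjects ANOVA: df between = k - 1 and df within = N - k. A minimal sketch, with a hypothetical helper name:

```python
def decode_f_report(df_between, df_within):
    """Recover the number of groups (k) and total sample size (N)
    from the degrees of freedom in a one-way between-subjects ANOVA."""
    groups = df_between + 1       # df_between = k - 1
    total_n = df_within + groups  # df_within = N - k
    return groups, total_n

groups, total_n = decode_f_report(6, 30)
print(groups, total_n)
```

For F(6,30) this gives 7 groups and N = 37, matching the cards.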

If MSWithin = 764.55 and MSBetween = 898.00, what is F?

898/764.55= 1.17

In the Evans, Barston, & Pollard (1983) study, what percentage of people thought that an invalid argument was valid when the conclusion felt true?

92%

APA style CI

95% CI [xxx, xxx]

Cohen's d

d = (mean1 - mean2) / standard deviation
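A rough sketch of the calculation follows. The card does not say which standard deviation to use, so the pooled standard deviation of the two groups is an assumption here, and the scores are made up for illustration.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (an assumed choice)."""
    n1, n2 = len(group_a), len(group_b)
    pooled_var = ((n1 - 1) * variance(group_a) +
                  (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical scores, not taken from the notes.
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
d = cohens_d(treatment, control)
print(round(d, 2))
```

Because d is expressed in standard deviation units, it can be compared across studies that used different measurement scales.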

The f ratio

F = variation between / variation within -small between / large within: F is small -large between / small within: F is large

addition rule

A rule of probability stating that the probability of any one of two or more mutually exclusive events occurring can be determined by adding their individual probabilities. use with MUTUALLY EXCLUSIVE outcomes use when finding probabilities of event A OR event B

What is the difference between a normal distribution and other symmetrical distributions?

A specific percentage of scores must fall within a given area of the curve

type 2 error

Failing to reject the null hypothesis when it is actually false (you should have rejected it)

advantages of effect sizes

Although there are researcher degrees of freedom (not related to sample size) that researchers could use to maximize (or minimize) effect sizes, there is less incentive to do so because effect sizes are not tied to a decision rule in which effects on either side of a certain threshold have qualitatively opposite interpretations. Interpretation of effect sizes is not confounded by sample size.

elementary event

An event which contains only a single outcome in the sample space

ANOVA

Analysis of variance: a single hypothesis test that compares the means of three or more samples. If you find a significant effect in your ANOVA, you can then run a post-hoc test.

benefits of bayesian approaches

Bayesian methods focus on estimating parameter values (which quantify effects) or evaluating relative evidence for the alternative hypothesis (Bayes factors) -Behavior such as p-hacking is circumvented because there's no all-or-nothing thinking, only estimation and interpretation. unlike NHST, you can draw conclusions about the likelihood that the null hypothesis is true

why not run multiple t tests?

Because the risk of committing a Type I error (rejecting a true null hypothesis) grows with every additional test, a.k.a. the "problem of multiple comparisons"
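How fast the risk grows can be sketched with the familywise error rate, 1 - (1 - alpha)^m. This assumes the m tests are independent, which is a simplification; the function name is illustrative.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one Type I error across num_tests independent tests."""
    return 1 - (1 - alpha) ** num_tests

# Three groups need three pairwise t tests.
rate_three_tests = familywise_error_rate(0.05, 3)
print(round(rate_three_tests, 3))
```

With three tests at alpha = .05, the overall chance of a false positive is already about 14% rather than 5%.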

reasons to learn statistics

Conduct or understand research To understand research design and analysis Stats analysis is expensive

Pearson's r cross product

Cross product = (Xi − X̄)(Yi − Ȳ), used to define the correlation coefficient between two variables. Sum of cross products: SP = Σ (Xi − X̄)(Yi − Ȳ), summed over i = 1 to N
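The sum of cross products feeds into Pearson's r as r = SP / √(SSx · SSy). A minimal sketch with made-up data (the x and y values are assumptions for illustration):

```python
import math
from statistics import mean

# Hypothetical paired scores, not from the notes.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Sum of cross products: each X deviation times the matching Y deviation.
sp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sums of squares for each variable.
ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)

r = sp / math.sqrt(ss_x * ss_y)
print(sp, round(r, 3))
```

A positive SP means above-average X values tend to pair with above-average Y values, which is what gives r its sign.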

Test-statistics: signal-to-noise ratios

Difference between means / standard error of the difference = effect/error, or signal/noise

The new statistics EMBER model

Effect sizes Meta-analysis Bayesian Estimation Registration

independent events

Events for which the outcome of one event does not affect the probability of the other

How ANOVA uses variability

F ratio: to decide whether the amount of between-group variability is large or small, ANOVA compares it to within-group variability = between-group variability / within-group variability = (treatment effect + individual differences) / individual differences. As the treatment effect grows, the numerator becomes larger than the denominator. As the F ratio increases, results are more likely to be statistically significant.

f ratio apa format

F(df between, df within) = F-ratio, p value

Central Limit Theorem

For large samples: -sampling distribution will be normally distributed -mean of sampling distribution = mean of population -can calculate standard error

How would you write the null hypothesis for an independent samples t test using symbolic notation? Choose all that apply.

H0: μ1 = μ2 H0: μ1 - μ2 = 0

A researcher thinks that the mean starting salary for San Jose State University graduates is under $100,000 per year.

H0: μ ≥ 100,000; H1: μ < 100,000

A researcher thinks that the mean starting salary for San Jose State University graduates is under $100,000 per year. Choose the correct null and alternate hypotheses below for this one-tailed example.

H0: μ ≥ 100,000; H1: μ < 100,000

How would you write the null hypothesis for an independent samples t test using symbolic notation?

H0: μ1 = μ2 H0: μ1 - μ2 = 0

A desperate statistician believes that using corny stats jokes as pick-up lines will increase the number of dates he goes on. Statisticians, as a group, typically go on 1.3 dates per year, σ = 0.4. If μJ represents the average number of dates for statisticians who tell corny stats jokes and μNJ represents the average number of dates for statisticians who don't tell corny stats jokes, which of the following is the null hypothesis this statistician should use?

H0: μJ ≤ μNJ

hypothesis of independent sample t-tests (stated same manner as single sample t-test)

H1 (alternative): population represented by sample 1 ≠ population represented by sample 2 (μS ≠ μNS) H0: population represented by sample 1 = population represented by sample 2 (μS = μNS)

SSM

How well did our model explain variance? how much better was it than using just the overall mean

Which of the following best describes the relationship between sample size and significance testing?

In large samples even small effects can be deemed 'significant'.

variation and significant results

In order to have a significant result, the differences due to your experiment must be large compared to the variation in the population Differences due to experiment/ variation in population "large compared to the individual differences of your population" - this is why we don't just look at the difference between the two mean values. When we are dividing by the standard deviation/standard error, we are "taking into account" the individual differences of the population.

experiment

In probability theory, refers to any procedure that can be repeated and that has a well-defined outcome or set of outcomes

Why is it useful to know the standard error of the mean?

It provides a measure of sampling error. It tells you how much variability you should expect in the sample means.

standardized normal curve

It is the distribution that occurs when a normal random variable has a mean of zero and a standard deviation of one.

red flags when viewing graphs

-lack of clear axis labeling -truncation of an axis so it ends or starts at a misleading place -numbers not adding up correctly, especially in pie graphs -poor use of graph type -visual deceptions, like inappropriate use of white space or comparative elements that don't seem to correspond to the data they represent -cherry-picking data, like including limited years or responses chosen to support a specific argument -bad graphic choices, like low contrast or a distorting picture -no title or no key for information in the graph

Why is it important for psychologists to understand the normal curve?

Many psychological variables are assumed to be normally distributed. Many statistical analyses assume that data are normally distributed. Understanding the frequencies allows researchers to determine the likelihood of observing individuals with a specific characteristic.

__________ evaluate(s) the probability of a test statistic given the null hypothesis, while __________ evaluate(s) the probability of a hypothesis given the data.

NHST; bayesian approaches

p value

Neyman: -defined to be the smallest Type I error rate (alpha) that you have to be willing to tolerate if you want to reject the null hypothesis -summary of all possible hypothesis tests that you could have run, taken across all possible alpha values Fisher: -the probability that we would have observed a test statistic at least as extreme as the one we actually got, if the null hypothesis were true -if the data are extremely implausible according to the null hypothesis, then the null hypothesis is probably wrong

median and percentiles

Nominal: No Ordinal: yes Interval/ratio: yes

Mean; add/subtract

Nominal: no Ordinal: no Interval/ratio: yes

frequency distribution

Nominal: yes Ordinal: yes Interval/ratio: yes

null vs alternative hypothesis

Null: -H0 -goal to show that null hypothesis is (probably) false -Think of criminal trial, null is the defendant -goal of researcher (you) is to prove beyond a reasonable doubt that null is probably false alternative: -H1 -goal to show alternative hypothesis is (probably) true -show no direction, show just an effect is present

equations for null and alternative hypothesis for 2 tailed tests

Null: H0: mean group A = mean group B Alternative: H1: mean group A ≠ mean group B

Post -hoc tests

-only use if the ANOVA result is significant -compare each mean against all others -use a stricter criterion to accept an effect as significant, controlling the familywise error rate -example: Bonferroni method: Pcrit = alpha / k, where k is the number of comparisons being made
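A minimal sketch of the Bonferroni correction, assuming the divisor is the number of comparisons and that every pair of group means is compared. The helper name and the three-group example are illustrative.

```python
from math import comb

def bonferroni_alpha(alpha, num_comparisons):
    """Per-comparison criterion that keeps the familywise rate at or below alpha."""
    return alpha / num_comparisons

num_groups = 3
num_pairwise = comb(num_groups, 2)  # every pair of groups compared once
p_crit = bonferroni_alpha(0.05, num_pairwise)
print(num_pairwise, round(p_crit, 4))
```

With three groups there are three pairwise comparisons, so each one is tested against roughly .0167 instead of .05. The criterion shrinks quickly as groups are added, which is why the method can be too strict with many tests.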

preregistration

Open science: A movement to make the process, data and outcomes of research freely available to everyone. Pre-registration of research: -making all aspects of your research process (rationale, hypotheses, design, data processing strategy, data analysis strategy) publicly available before data collection begins. -registered reports in an academic journal --If protocol considered rigorous enough & research question novel enough, protocol is accepted by journal typically with a guarantee to publish findings no matter what they are -public websites (e.g., the Open Science Framework).

Which of the following are criticisms or limitations of NHST?

Opposite conclusions are often made for p-values of 0.0501 versus 0.0499, despite those values being very similar A non-significant effect doesn't tell us that there is no effect, instead it merely indicates that there is no effect large enough to be detected by the sample that was taken. The p-value does not indicate whether the effect is important or meaningful in the real world.

critical value and decision rule for independent t-test

Reject the null hypothesis if t ≤ -tcv or t ≥ tcv (if the observed t statistic falls in a rare zone, reject the null). Fail to reject the null hypothesis if -tcv < t < tcv (if the observed t statistic falls in the common zone, fail to reject the null)

type 1 error

Rejecting the null hypothesis when it is true -depends on the alpha (significance level) -a hypothesis test is said to have significance level alpha if its Type I error rate is no larger than alpha

What are the similarities and differences between sum of squares (SS), variance (s2) and standard deviation?

Similarities: all represent the same thing: -the "fit" of the mean to the data -the variability in the data -how well the mean represents the observed data -error Differences: -SS: total dispersion/error -Variance: average error in units^2 -Standard deviation: average error in original units

equations

SS = ΣX² - ((ΣX)² / N)
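The computational formula should give the same answer as summing squared deviations from the mean. A quick check with made-up scores (the data and variable names are assumptions):

```python
import math
from statistics import mean

# Hypothetical scores, not from the notes.
scores = [2, 4, 6, 8]
n = len(scores)

# Computational formula: sum of X squared minus (sum of X) squared over N.
ss_computational = sum(x ** 2 for x in scores) - (sum(scores) ** 2) / n

# Definitional formula: sum of squared deviations from the mean.
x_bar = mean(scores)
ss_definitional = sum((x - x_bar) ** 2 for x in scores)

sample_variance = ss_definitional / (n - 1)  # average error in units squared
sample_sd = math.sqrt(sample_variance)       # back in the original units
print(ss_computational, ss_definitional)
```

Both routes give SS = 20 here. Dividing by N - 1 turns it into the sample variance, and the square root of that is the standard deviation.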

Pearson's correlation coefficient

Statistical test that measures the degree of linear relationship between two interval/ratio-level variables

symbols

Symbol | What is it? | Do we know what it is?
X̄ | Sample mean | Yes, calculated from the raw data
μ | True population mean | Almost never known for sure
μ̂ | Estimate of the population mean | Yes, identical to the sample mean in simple random samples

t-distribution

The t distribution is a continuous distribution that looks very similar to a normal distribution. Note that the "tails" of the t distribution are "heavier" (i.e., extend further outwards) than the tails of the normal distribution. That's the important difference between the two. This distribution tends to arise in situations where you think that the data actually follow a normal distribution, but you don't know the mean or standard deviation. Different from the normal distribution because: -normal requires mean and standard deviation

why use follow-up tests

The F-statistic tells us only that the experiment was successful i.e. group means were different It does not tell us specifically which group means differ from which.

The mean of the sampling distribution of the range is:

The mean of the sampling distribution of the range is less than the range in the population. The range in the population is the largest possible range for a sample. The mean range in a sample will be less than this.

experimenter sample

The sample where the independent variable is introduced or manipulated by the experimenter. (given a drug or some kind of treatment/intervention)

control sample

The sample where the independent variable was not introduced or manipulated by the experimenter. (placebo/no treatment)

What is the standard error of the mean?

The standard deviation of the sampling distribution of the mean

Imagine you managed to measure every case in the population that you were interested in. Which statements below would be true?

The standard error would be zero. All confidence intervals would contain only the true mean and no other values.

regression equations

The symbol X represents the independent variable. The symbol a represents the Y intercept, that is, the value that Y takes when X is zero. The symbol b describes the slope of a line. Y′=a+bX
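A short sketch of how a and b might be found and used. The card does not say how the line is fitted, so ordinary least squares (b = SP / SSx, a = Ȳ - bX̄) is an assumption here, as are the data.

```python
from statistics import mean

# Hypothetical paired scores, not from the notes.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Slope: sum of cross products divided by the sum of squares for X.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))

# Intercept: the fitted line passes through the point (x_bar, y_bar).
a = y_bar - b * x_bar

def predict(new_x):
    """Predicted Y' = a + bX."""
    return a + b * new_x

print(round(a, 2), round(b, 2), round(predict(6), 2))
```

Setting X to zero in predict returns a, which is the Y intercept described on the card.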

The samples are typical of their population; they come from different populations

The tail of a t distribution is taller when sample size is smaller. As a result, the critical value of t is farther away from zero when the sample size is small, making it more difficult to reject H0

multiplication rule

To determine the probability, we multiply the probability of one event by the probability of another. use to find joint probability of independent events

what is "the model"

To test whether means are different, need a way to test whether predicting recall from group means results in better prediction than predicting from mean of all scores (grand mean). model: using group mean to predict individual scores in the group (alternative hypothesis)

p hacking

Trying multiple analyses/measuring multiple outcomes but only reporting significant results -Stopping data collection at a point other than when the pre-determined sample size is reached -Including (or not) data based on the effect they have on the p-value. -Including (or excluding) variables in an analysis based on how those variables affect the p-value -Merging groups of variables or scores to yield significant results -Transforming, or otherwise manipulating scores to yield significant p-values

mutually exclusive events

Two events that cannot occur at the same time flipping a coin, cannot be heads and tails, only either

why are counterbalancing and randomization useful

Use counterbalancing to deal with practice effects and fatigue Randomization minimizes possibility of systematic differences (ex: motivation, intelligence, SES, sex, handedness) between groups

z score

Value of an observation expressed in standard deviation units -transformation: applying a mathematical function to all observations in a data set -standardization: converting a variable into a standard unit of measurement (usually standard deviation units); makes data comparisons possible -sign indicates whether the score is above or below the mean -magnitude indicates how many standard deviations away from the mean the score is

belief bias effect

When deciding whether an argument is valid, we are influenced by how believable its conclusion is

APA format

χ²(degrees of freedom, N = sample size) = test statistic, p > .05 (fail to reject the null) or p < .05 (reject the null)

We can convert a raw score to a z-score in order to compare scores that originally had different units of measurement. Can we convert z-scores back to raw scores?

Yup, just rearrange the equation to solve for the raw score.
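The rearrangement can be sketched as a pair of helpers: z = (X - mean) / SD, so X = mean + z · SD. The function names and the example numbers are illustrative.

```python
def to_z(raw, mean, sd):
    """Convert a raw score to a z-score."""
    return (raw - mean) / sd

def to_raw(z, mean, sd):
    """Rearranged equation: convert a z-score back to a raw score."""
    return mean + z * sd

# Hypothetical distribution with mean 50 and standard deviation 20.
z = to_z(70, 50, 20)
raw = to_raw(z, 50, 20)
print(z, raw)
```

A raw score of 70 sits one standard deviation above the mean (z = 1), and converting back recovers 70.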

t vs z statistic

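Z: -(sample mean - population mean) / (population standard deviation / √sample size) T: -use t when the population standard deviation is unknown and has to be estimated from the sample (matters most with small samples)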

Normal distribution

a bell-shaped curve continuous distribution Described using 2 parameters: -The mean of distribution -standard deviation of the distribution

Null hypothesis significance testing (NHST)

a framework for establishing whether a hypothesis is true by working out the probability of observing a statistic at least as large as the one observed if the null hypothesis were true

duck

a graphic is taken over by decorative forms or computer debris, when the data measures and structures become Design Elements, when the overall design purveys Graphic style rather than quantitative information

parameters

a numerical characteristic of a population, as distinct from a statistic of a sample.

statistic

a numerical measurement describing some characteristic of a sample

sample

a set of observations drawn from a population of interest; used since we can't measure every member of a population

effect sizes

a standardized measure of the size of an effect -standardized = comparable across studies -not (as) reliant on the sample size There are several effect size measures that can be used -Cohen's d -Pearson's r -odds ratio/risk rates --example: (odds of heart attack) high cholesterol / low cholesterol

experimental research design

a variable is manipulated to measure its effect on another variable Independent (between-subject): -different entities in experimental conditions Repeated measures (within-subject): -same entities take part in all experimental conditions -use counterbalancing to deal with practice effects and fatigue

A negative z score indicates that:

an individual's raw score fell below the mean.

confounding variable

any variable that varies along with the predictor variable in such a way that you cannot tell which variable is really having an effect

sources of variation and the test statistic

any variation that can be measured in a study. In the one-way between-subjects ANOVA, there are 2 sources of variation -between-groups variation: variance of group means -within-group (error) variation: variation attributed to error

construct validity

are you measuring what you set out to measure

Why don't we make the alpha level as small as possible to avoid Type I errors?

as researchers we need to compromise between 2 things: 1. we need to avoid type 1 error 2. we are aiming to reject the null hypothesis and avoid type 2 error

A percentile rank tells the percentage of cases whose scores are _______ a given level.

at or below

Variance

average error in unit^2 average dispersion

Why is the sampling distribution of the mean normally-distributed for large enough Ns?

because as the sample size gets larger, more of the sample means fall near the population mean, which gives the sampling distribution a normal shape (central limit theorem)

If you transform all the scores in a distribution to z-scores, what happens to the relative position of the scores (the order the scores are in)?

because you do the same thing to each score, there's no change in relative position of scores

As sample size increases, the t-distribution

becomes more normally-distributed

demand effects

behavior changes due to being knowingly observed/studied -Placebo effect: simply being treated leads to improvement

posterior probability

belief in hypothesis/model *after* considering the data

prior probability

belief in hypothesis/model *before* considering the data

The _____ line is the line that best represents the overall pattern of association between the values graphed on a scatterplot.

best fit

In class, we discussed the ________________, which is a correction that is applied to the alpha-level to control the overall Type I error rate when multiple significance tests are carried out. It is simple and effective, but can be too strict when lots of tests are performed.

bonferroni

Which type of graph shows the median and interquartile range?

box plot

theory of ANOVA

-calculate how much variability there is between scores: total sum of squares -then calculate how much of that variability can be explained by the model we fit to the data, i.e. how much is due to the experimental manipulation: model sum of squares (SSmodel or SSbetween) -and how much cannot be explained, i.e. how much is due to individual differences in performance: residual sum of squares (SSresidual or SSwithin) -compare the amount of variability explained by the model (experiment) to the error in the model (individual differences): this is the F-ratio. It's a signal-to-noise ratio
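The partition described above can be sketched end to end for a one-way between-subjects design. The three groups of scores are made up for illustration, and the variable names are assumptions.

```python
from statistics import mean

# Hypothetical scores for three groups (illustrative data, not from the notes).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_scores = [score for group in groups for score in group]
grand_mean = mean(all_scores)

# Total variability: every score against the grand mean.
ss_total = sum((score - grand_mean) ** 2 for score in all_scores)

# Model (between) variability: group means against the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# Residual (within) variability: scores against their own group mean.
ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

# Mean squares are SS / df; F is the ratio of the two mean squares.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within
print(round(ss_between, 2), round(ss_within, 2), round(f_ratio, 2))
```

SSbetween and SSwithin add up to SStotal, which is the same relationship used in the SSTotal - SSWithin card earlier in the set.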

running the hypothesis test in practice

called binomial test

Bayesian view of probability

-called the subjectivist view -defines the probability of an event as the degree of belief that an intelligent/rational agent assigns to the truth of that event -states that probabilities don't exist in the world but rather in the thoughts and assumptions of people Pros: -can assign probabilities to any event you want to Cons: -can't be purely objective -different observers can attribute different probabilities to the same event

why is it useful to know the area under the curve

it tells us how likely a given value is, so we can see which values we should reasonably expect by chance

For matched samples, we're interested in the mean of differences, which is symbolically represented as

capital D with line over it

The _____ describes the relationship between the sampling distribution of means and the population those samples came from.

central limit theorem

Calculate central tendency and variability statistics

central tendency: -open Descriptive Statistics, click the Statistics tab, select mean, median, mode

An instructor wants to know whether there is a relationship between whether students read the text before/after a lecture and whether they pass/fail an exam. Which test should she use?

chi-square test of independence

paired sample

choosing one sample influences the measurements of the other sample -example: measuring subjects in a psychological study before and after an experiment

_______ probability involves the theoretical probability of an event, while _______ probability is based on relative frequencies from actual observations.

classical; empirical

For t distributions, when the sample size gets large, the rare zone gets _________ and it is ________ to reject the null hypothesis.

closer to zero; easier

single-sample t test

compares a single sample mean with a known or hypothesized population mean

Independent samples t-test

comparing 2 independent samples that represent different populations for example: population 1: people who eat sugar before watching commercials population 2: people who do not eat sugar before watching commercials

A range of values thought to contain the true mean of a population is called a

confidence interval

Population

consists of all possible people, animals, observations, etc. that we'd like to know something about

A _______ sample is easily gathered but not likely to be representative.

convenience

correlation and causation

correlation does not equal causation often involve influence from a third variable

critical region

corresponds to the values of X that would lead us to reject the null hypothesis; consists of the most extreme values, known as the tails of the distribution

As N approaches infinity, standard error _________ .

decreases

As the sample size_____ , the t distribution becomes flatter, with fewer cases in the middle and more cases in the tails.

decreases

critical values

-define the edges of the critical region -how extreme the test statistic needs to be for us to consider it unlikely enough to reject the null -cut off the distribution's "tail(s)"

Calculating the mean to summarize your sample is an example of

descriptive statistics.

chi square Gof

determines whether a set of categorical data comes from a claimed distribution 1 way table used to analyze the relative frequencies of different categories within a single variable Null: the characteristic is distributed in the sample in exactly the same way as in the population alternative: the distribution of the characteristic in the sample is different from what is specified

With paired-samples t-tests, the sampling distribution consists of:

difference scores

unsystematic variation

differences created by unknown factors -age, gender, IQ, time of day, measurement error

systematic variation

differences in performance created by a specific experimental manipulation Systematic: done or acting according to a fixed plan; methodical

If a researcher is doing a one-tailed test, he/she should predict the ________ of the results before collecting any data.

direction

The sign of a correlation coefficient indicates the _____ of the association while the magnitude indicates _____ .

direction; strength

Discrete vs. Continuous

discrete: -variables that can only take a set of fixed values -Often, but not always, integral values continuous: -are variables that can take on any value within some specified range -The only limitation to measuring continuous variables is the accuracy of the measurements device

homogeneous attrition

dropout rate is the same for all study groups -can lead to an unrepresentative sample and external validity issues

heterogenous (differential) attrition

dropout rate is different for different groups -a kind of selection bias caused by the study itself -can threaten internal validity

The change in our outcome variable that is due to our experimental manipulation is called the _____ .

effect

If we use a non-directional hypothesis, we are open to extreme differences found in ______ tail of the distribution.

either

I'm interested in whether there is a relationship between letter grades in K300 and handedness. It turns out I have 1 individual in my sample who is ambidextrous... what should I do? (Hint: what are assumptions of the chi-square test?)

either exclude them from the study or collect a larger sample so that every cell has an expected frequency of 5 or greater

standard error

estimated as the sample standard deviation divided by the square root of the sample size. n is the size (number of observations) of the sample. Gets *smaller* as the sample size *increases*

confidence intervals

estimates the population parameter with a specific level of confidence and precision.

The _____ frequencies represent what we should see if the pattern does not depart from chance.

expected

A statistician uses a contingency table to do the following calculation: the row total multiplied by the column total, divided by total number of observations. What are they calculating?

expected frequencies for chi-square test of independence
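The calculation described in this card can be sketched in Python as follows. The function name `expected_frequencies` and the 2 x 2 table of counts are made up for illustration.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical contingency table (rows and columns are two categorical variables).
observed = [
    [10, 20],
    [30, 40],
]
expected = expected_frequencies(observed)
print(expected)
```

Each expected value is what the cell would hold if the two variables were unrelated, which is the model the chi-square test of independence compares against.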

experimenter bias

experimenter influences results by subtly communicating the "right answer" or "desired behavior" to participants -double-blind studies are ideal, but difficult to achieve

The _____ distribution is utilized in an analysis of variance (ANOVA) test for the multiple comparisons of population means.

F

The chi-square test statistic can be negative.

false

True/False: You can entirely eliminate sampling error if you use random sampling and get a sample that is representative of the population.

false

The alpha-level we choose represents the probability of ______ that we are willing to tolerate.

false positives (Type I errors)

The ability of a test to reject a _____ null hypothesis is known as ______ .

false; power

Can you interpret a p value as the probability the null is true? (Why/why not?)

No. The p value is computed assuming the null hypothesis is true, so it cannot also be the probability that the null is true. Fisher: the p value is the probability of observing a value of the test statistic that is as or more extreme than what was observed in the sample, assuming the null hypothesis is true. Neyman: the p value didn't directly measure the probability of the data (or data more extreme) under the null; it was more of an abstract description of which "possible tests" were telling you to accept the null, and which "possible tests" were telling you to accept the alternative

ANOVA summary table

for both the between-groups and within-groups rows: -MS = SS / df (MS = mean square) -the F ratio is MS between / MS within

Who is more likely to describe an 80% probability of the robot team Arduino Arsenal winning as follows: "They're robot teams so I can make them play over and over again, and if I did that Arduino Arsenal would win 8 out of every 10 games on average."

frequentist

It is important that a sample is representative, because then we can _____ the results from the sample to the _____ .

generalize; population

Degrees of freedom is k-1 for the chi-square ___________ and (r-1)(c-1) for the chi-square ___________ .

goodness-of-fit test; test of independence

If a z-score is positive then it is...

greater than the mean

Why should we care about levels of measurement?

helps you decide how to interpret the data from that variable.

SSR

how much error there is in the model, the diff between predictions and actual observations

A(n) _________ can indicate whether or not the effect you observed is real (and not just due to chance), while _________ indicates how large (and possibly meaningful) the effect is.

hypothesis test; effect size

When is a post-hoc test used? (Select all that apply.)

-if the F ratio is greater than the critical value for F
-if the ANOVA is statistically significant

As sample size _____ , the sampling distribution for repeated random samples will approach a normal distribution.

increases

All other factors being equal, as an effect grows larger, the probability of rejecting a null hypothesis _____ .

increases

As n ___ , the sampling distribution approaches a normal distribution.

increases

Assuming sample size stays the same, as the width of a confidence interval increases (including more values in the interval), our confidence that the interval contains the true population value _____ .

increases

As the difference between observed and predicted values grows, the chi-square test statistic ________ and its associated p-value _________ .

increases; decreases

In a(n) _____ samples t test, the test focuses on the difference between the mean of one sample and the mean of the other sample.

independent

one-way between subjects ANOVA

-independent (not repeated measures) design
-"way" = factor: refers to the independent variable, meaning we can use it to test 1 independent variable (but it can have many levels)

Effect sizes are used to quantify the impact of the ____ variable on the ____ variable.

independent; dependent

Which measurement level(s) can data be measured at for tests that compare means (like t tests & ANOVA)

interval/ ratio

test statistic

is a random variable that is calculated from sample data and used in a hypothesis test. You can use test statistics to determine whether to reject the null hypothesis. The test statistic compares your data with what is expected under the null hypothesis.

CI t-equation/means:

is a range of values that you can be 95% certain contains the true mean of the population.

scatter plot

is a set of points plotted on horizontal and vertical axes. Scatter plots are important in statistics because they can show the extent of correlation, if any, between the values of observed quantities or phenomena (called variables).
positive (direct) relationship:
-high scores on X are associated with high scores on Y
-positive r
negative (inverse) relationship:
-high scores on X are associated with low scores on Y

Explain what it means to say the normal curve is a probabilistic distribution.

is a statistical function that describes all the possible values and likelihoods that a random variable can take within a given range. ... These factors include the distribution's mean (average), standard deviation, skewness, and kurtosis

linear association

is a statistical term used to describe a straight-line relationship between 2 variables -very rare calculated by y=mx+b

parameter

is any numerical quantity that characterizes a given population or some aspect of it. tells us something about the whole population.

Unlike the correlation procedure, regression analysis _____ symmetrical.

is not

With Pearson's chi-square test, the model we use to calculate expected frequencies assumes that there _______ a relationship between variables.

is not

sampling distribution

is the probability distribution of a given random-sample-based statistic. importance: -estimate parameters of the population distribution Considered a plot of error: -because sampling distribution is an estimate of population

Conditional Probability

is the probability of one event occurring given that one or more other events have occurred, written P(A | B)

coefficient of determination

is the proportion of the variance in the dependent variable that is predictable from the independent variable. R^2

Which below is an undesirable characteristic of frequentist probability?

it doesn't provide a way to assign probability to single non-repeatable events

A one-way ANOVA is called "one-way" because

it predicts the outcome variable using one predictor variable

For questions that you can answer using probability theory, the truth of the world is ________ . For statistical questions, the truth of the world is ________ .

known; unknown

When the between-groups variance is a lot larger than the within-groups variance, the F-value is ____ and the likelihood of such a result occurring due to sampling error alone is _____

large; low

The _____ the difference between observed and expected frequencies, the _____ the chi-square test statistic will be.

larger; larger

A Bayes factor _______ than 1 supports the null hypothesis.

less

For one-sample tests, as the difference between the sample mean and population mean gets closer to zero, it becomes _______ likely that the null will be rejected.

less

The relationship between confidence level and interval width is such that as our confidence increases (from 95% to 99%, for example), the estimate will become _____ precise.

less

One generates a sampling distribution of means by

making a frequency distribution of means computed from multiple samples of a specified size from a specified population

Which measure(s) of central tendency is/are at the midpoint of a normal distribution?

mean median mode

negative direction

mean less than median less than mode -tail starts low on the left, then the distribution peaks on the right side

outcome

measurable result of some process -grade on exam

test-retest reliability

measure produces *consistent* results when same entities tested at 2 different time points

standard deviation

measures the dispersion of a dataset relative to its mean and is calculated as the square root of the variance

Which of these statistics has a sampling distribution?

all of them (median, mean, range, standard deviation): every statistic has a sampling distribution

In a skewed distribution, the ______ would be closer to the body of the distribution, while the ______ would be pulled towards the tail (outliers).

median; mean

A ______ calculates the average effect size across several comparable studies.

meta analysis

median

-middle score when scores are ordered
-useful for ordinal data or for distributions that are skewed or have outliers
-measurement levels: ordinal, interval/ratio

positive direction

mode less than median less than mean -peak is on the left, then the tail (skew) stretches out to the right

frequentist view of probability

-more dominant in stats
-defines probability as a long-run frequency
-states that as the number of trials increases (approaches infinity), the observed proportion converges on the true probability
pros:
-objective: probability of an event is grounded in the world
-unambiguous
cons:
-infinite sequences don't exist in the physical world
-events must be repeatable to estimate probabilities

mode

-most frequent score
-intuitively, it is the value associated with the highest point on the histogram
-can be used for nominal data and is also useful for strongly bimodal or multimodal distributions
-measurement levels: nominal, ordinal, interval/ratio

Which below are requirements of the nominal level of measurement?

mutually exclusive categories collectively exhaustive categories

degrees of freedom for R

n-2

As sample size increases, the sample statistic becomes a better estimate of the population parameter (because more cases are included in the estimate). As the estimate gets better, the width of the sampling distribution (its standard deviation) will get _____ .

narrower

As sample size increases, the confidence interval becomes _____ and _____ precise.

narrower; more

correlational research design

naturally observing without directly interfering

Assuming our sample size does not include the entire population, can we be absolutely certain our specific sample mean is the same as the population mean?

no, there's always the possibility of sampling error

The assumptions for the single-sample z test that are robust are random samples and ______

normally distributed

The chi-square distribution describes the probability of chi-square values when the ______ is true.

null hypothesis

test statistic representation

null hypothesis

r null hypothesis

null: ρ = 0 (rho, the population correlation, is zero)

A desperate statistician believes that using corny stats jokes as pick-up lines will increase the number of dates he goes on. Statisticians, as a group, typically go on 1.3 dates per year, σ = 0.4 To test whether the statistician is correct, would you use a one-tailed test or two-tailed?

one-tailed

Turning a meaningful but somewhat vague concept into a specific measurement is known as [term].

operationalization

whether a student was a freshman, sophomore, junior, or senior. What is the measurement level for these responses

ordinal

trial

particular instance of the experiment -an exam

percentile rank

percentage of cases with score at or below a given level in a frequency distribution can be computed from z scores
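One way to compute a percentile rank from a z score, sketched in Python. This assumes the scores are normally distributed (the card does not say so explicitly); `percentile_rank_from_z` is a hypothetical helper name.

```python
from statistics import NormalDist


def percentile_rank_from_z(z):
    """Percentage of cases at or below z, ASSUMING a normal distribution."""
    return NormalDist().cdf(z) * 100  # standard normal: mean 0, SD 1


print(percentile_rank_from_z(0.0))   # a score exactly at the mean
print(percentile_rank_from_z(1.0))   # one SD above the mean
print(percentile_rank_from_z(-1.0))  # one SD below the mean
```

A z score of 0 lands at the 50th percentile, and positive z scores land above it.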

The symbol for the mean of a _____ is μ and the symbol for the mean of a _____ is X̄ .

population; sample

power

-probability of rejecting H0 (the null) when H0 is false
-ability of the test to detect an effect that really exists
-related to Type II error: power = 1 - beta
depends on many things:
-size of the effect
-how strict the criterion for rejecting the null is (alpha level)
-sample size
-whether the test is one- or two-tailed

Probability vs statistics

probability: -"the doctrine of chances" -tells you how often different kinds of events will happen -starts with a model of the world stats: -work other way around than probability -DO NOT know the truth of the world -only data, data being used to LEARN the truth of the world

effect sizes

quantifies how "similar" the true state of the world is to the null hypothesis -example: difference between true population parameters (x) and the parameter values assumed by null hypothesis (x0) -(x-x0) why is it useful? -tells you whether or not you should care about the effect size -big effect size: difference is real, and of practical importance -small effect size: difference is real, but might not be interesting

Why effect sizes are useful:

quantify strength of relationships to null

quantitative vs. qualitative

quantitative: -testing theories using numbers -consists of numerical data Qualitative: -testing theories using language -magazine articles/interviews, conversations, newspapers, media broadcast

_____ indicates the strength and direction of an association while _____ shows the amount of variance one variable explains in the other.

r; r^2
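A small Python sketch of r and r^2 computed by hand from deviation scores. The function name `pearson_r` and the five (x, y) pairs are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: sum of cross-products over sqrt(SS_x * SS_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(ss_x * ss_y)


# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2   # proportion of variance in y explained by x
df = len(x) - 2      # degrees of freedom for r is n - 2

print(r, r_squared, df)
```

The sign of `r` gives the direction, its size gives the strength, and `r_squared` is the share of variance explained.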

In a _____ sample, all cases have an equal chance of being selected, and all combinations are possible. Also, the selection of one case doesn't affect the selection of other cases; they are independent.

random

A ____________ score has not been transformed, so it is in its original units of measurement.

raw

Consider a study using a between-groups design with between-groups df = 3 and within-groups df = 4. Given a p level of 0.05, the researcher should:

reject the null hypothesis if F > 6.591.

When cases are not selected independently of one another and share certain characteristics, those samples are:

related

Precision and uncertainty

relationship: every data set leaves us with some uncertainty, so our estimates are never perfectly accurate; confidence intervals express that uncertainty by showing how precise an estimate is

A sample that is ____ mirrors the population in important respects.

representative

Binomial Distribution

represents the probability for x successes in n trials, given a success probability p for each trial. discrete distribution
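The binomial probability can be written out directly with the standard formula C(n, x) * p^x * (1 - p)^(n - x). A short Python sketch, where `binomial_pmf` is a hypothetical helper name and the coin-flip numbers are only an example:

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


# Example: 10 fair coin flips (n = 10, p = 0.5).
probabilities = [binomial_pmf(x, 10, 0.5) for x in range(11)]

print(binomial_pmf(5, 10, 0.5))  # probability of exactly 5 heads
print(sum(probabilities))        # probabilities over all x add up to 1
```

Because x can only be a whole number from 0 to n, the distribution is discrete.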

Research vs. Statistical Hypotheses

research: making a substantive, testable scientific claim statistical: Must be mathematically precise and must correspond to specific claims about characteristics of the data generating mechanism (i.e. the population)

quasi-experimental research

researcher doesn't control the predictor, but can use it to assign to groups (ex: cannabis users & nonusers)

biased or unbiased?

sample mean: unbiased
sample standard deviation: biased (tends to underestimate the population value)

Suppose there is a jar containing many gumballs, each with a unique number on it. The numbers range from 0 to 32 and there is an equal number of gumballs with each number. A student set out running an experiment with the following procedure: Pick five gumballs from the jar, calculate the mean of the numbers on the gumballs, write down the result on a piece of paper, and put the gumballs back to the jar. Repeat the process 499 times so altogether there are 500 means recorded. Then draw a frequency distribution of the 500 means. In this case, the sample size is

sample size: 5 (number of samples: 500)
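The gumball procedure can be simulated in Python. This is a sketch under stated assumptions: the seed is arbitrary, and the draws use `random.choices` (sampling with replacement) as an approximation of a jar that holds many gumballs of each number.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

numbers = list(range(33))  # gumball numbers 0 to 32
sample_size = 5            # gumballs picked per sample
number_of_samples = 500    # how many means get written down

sample_means = []
for _ in range(number_of_samples):
    # ASSUMPTION: the jar is large, so each pick is treated as independent.
    sample = random.choices(numbers, k=sample_size)
    sample_means.append(statistics.mean(sample))

print(len(sample_means))               # 500 recorded means
print(statistics.mean(sample_means))   # close to the population mean of 16
print(statistics.stdev(sample_means))  # spread of the means (the standard error)
```

The list `sample_means` is the sampling distribution of the mean for samples of size 5; its spread is noticeably smaller than the spread of the individual gumball numbers.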

Sampling error occurs because

samples only include parts of the population, so they differ and have some error involved in estimating parameters

A ___ is essentially a plot of error.

sampling distribution

Samples vary, so it's not uncommon to get a statistic that is somewhat different from the parameter it is supposed to estimate. This difference between the statistic and parameter is called:

sampling error

Writing up the results of a hypothesis test

say something about the p value and whether or not the outcome was significant
result in critical region:
-p < alpha, for a significance level that you chose in advance
-means you reject the null; statistically significant
result in common zone:
-our data are within the range we'd expect if the null were true
-p > alpha
-fail to reject the null; not statistically significant
significant p values:
-significant when p < .05
-p < .05: null rejected
-p < .01: null rejected
-p < .001: null rejected

If the t-statistic observed is _______, then it is unlikely that the value we observed could occur from two random samples; it is likely to be due to the experimental manipulation.

large

If the null hypothesis predicts the data well, the difference between observed and expected will be _____ and the chi-square test statistic will be _____ (relative to if the null was a bad predictor)

small; small

Range

smallest score subtracted from largest

theoretical construct

something you're trying to measure that you can't directly observe

In what units do z scores express raw scores?

standard deviation units

The _____ provides an overall average of how far different sample means deviate from the true population mean.

standard error of the mean

A _____ is a characteristic of a sample, while a ______ is a characteristic of a population.

statistic; parameter

Mean

-sum of scores divided by number of scores
-sensitive to outliers
-uses all the cases, and in that sense it provides the most information, but it's not appropriate for some measurement levels and distribution shapes
-measurement level: interval/ratio

calculate test statistic

t = (M1 - M2) / s_(M1 - M2)
t = the difference between the 2 sample means / the standard error of the difference between the two means

meta analysis

taking a weighted average of effect sizes for series of studies that investigated same research question Each effect size is weighted by its precision (i.e., how good an estimate of the population it is) -Large studies, which will yield effect sizes that are more likely to closely approximate the population, are given more 'weight' than smaller studies, which should have yielded imprecise effect size estimates. Aim of meta-analysis is to estimate the population effect (not 'significance'). -Therefore, it overcomes the same problems of NHST that we discussed for effect sizes --Quantifying effects on continuum rather than interpreting with an arbitrary threshold

chi-square test of independence

-tests whether there is an association between 2 variables
-used to determine if the relative frequencies within the categories of one variable are associated with the relative frequencies within the categories of a second variable
-df = (r-1)(c-1)

You are interested in whether political party affiliation would influence voting on a particular issue. You gather data on whether Republicans, Democrats, and Independents voted "yes" or "no" on the issue. You're not sure what probabilities to compare to (you don't know whether people will be more likely to vote "yes" or "no"), so you use a chi-square test of independence. Your contingency table includes political party (R/D/I) and vote ("yes"/"no"). The null hypothesis for this example would state:

that there is NOT a relationship between political party and voting for this issue.

Variance

the expectation of the squared deviation of a random variable from its mean
-divide by N when using a population
-divide by n-1 when using a sample
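The two divisors can be compared with Python's `statistics` module, where `pvariance` divides by N and `variance` divides by n - 1. The scores below are hypothetical.

```python
import statistics

# Hypothetical scores; the mean is 5 and the sum of squared deviations is 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = statistics.pvariance(scores)  # SS / N
sample_variance = statistics.variance(scores)       # SS / (n - 1)

print(population_variance)
print(sample_variance)  # larger, because the divisor is smaller
```

Dividing by n - 1 gives the larger value, which is the correction that keeps a sample from underestimating the population variance.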

With one-way ANOVAs, the null hypothesis is represented by:

the grand mean

How does standard deviation change the shape of a distribution?

-the higher the standard deviation, the more spread out the data
-a standard deviation close to zero indicates that data points are close to the mean

As alpha increases, what happens to the likelihood of rejecting the null hypothesis?

the likelihood increases

As the alpha-level increases, what happens to the likelihood of rejecting the null hypothesis?

the likelihood increases

HARKing

the practice in a research article of presenting a hypothesis that was made after data collection as though it were made before data collection

statistics

the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

marginal likelihood

the probability of the observed data

sample space

the set of all possible outcomes Grade: A, B, C,...

logic of confidence intervals

the size of the interval around the sample mean is determined by:
-the expected amount of sampling error (SEM): more error, *wider* interval
-the specific level of confidence (either 95% or 99%): trade-off between confidence and precision
as N (sample size) increases, SE and the interval get smaller: lower error, more precise estimate

The standard deviation of a sampling distribution of sample means is known as

the standard error

The t test statistic for related samples is a ratio of mean differences relative to

the standard error of the mean difference.

calculate test statistic

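chi-square = sum over all categories (cells) of (OF - EF)^2 / EF, where OF = observed frequency and EF = expected frequency -compute the value for each cell, then add the numbers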
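The chi-square formula can be sketched in Python as a one-line sum. The function name `chi_square_statistic` and the observed counts are hypothetical; the example is a goodness-of-fit setup with 5 categories and equal expected counts, and 9.488 is the standard table value for alpha = .05 with df = 4.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical goodness-of-fit data: 50 people across 5 categories.
observed = [6, 9, 10, 12, 13]
expected = [10, 10, 10, 10, 10]  # null model: equal counts in every category

chi_square = chi_square_statistic(observed, expected)
df = len(observed) - 1           # k - 1 for a goodness-of-fit test
critical_value = 9.488           # chi-square table, alpha = .05, df = 4

print(chi_square, df)
print("reject null" if chi_square > critical_value else "fail to reject null")
```

Every term is a squared difference divided by a positive count, which is also why the statistic can never be negative.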

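Why do we use μD̄ when we write hypotheses about mean differences?

because hypotheses are claims about the population: μ is the symbol for a population mean, and the D̄ subscript marks it as the mean of the difference scores.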

For 2x2 ChiSquare tests, what does a significant result mean?

there's predictability in the relationship between the variables there is an association (relationship) between the variables

Effect sizes are useful because...

they provide a way to compare across different studies that have measured different variables, since they provide an objective and typically standardized measure of observed effect. they help determine whether an effect is meaningful or important in the real world.

For a goodness of fit test you have 5 groups with 10 people in each group. What is your degrees of freedom? (Report the number, rounded to the nearest whole number.)

k = 5 groups, so df = k - 1 = 4

SST (sum of squares total)

total variation that needs to be explained (diff observed compared to null: no diff)

ANOVA is always a two-tailed test.

true

The sum of a set of z scores is zero.

true
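A quick Python check of this card, using hypothetical scores. Each z score is (score - mean) / SD, so the positive and negative deviations cancel out.

```python
import statistics

# Hypothetical scores with mean 5 and population SD 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # SD of this set of scores (divides by N)

z_scores = [(x - mean) / sd for x in scores]

print(z_scores)
print(sum(z_scores))  # the deviations above and below the mean cancel
```

The same cancellation happens whichever SD formula is used, because the deviations from the mean always sum to zero.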

With hypothesis testing, whether a conclusion is actually true or not is unknown, though we can calculate how likely it is that a conclusion is true

true

With hypothesis testing, whether a conclusion is actually true or not is unknown, though we can estimate how likely it is that a conclusion is true.

true

effect size can be negative

true

A ___ - tailed hypothesis is also known as a non-directional hypothesis.

two

When a researcher rejects the null hypothesis and the null hypothesis is actually true, what kind of error is this?

type 1 error

Dave's one-way ANOVA resulted in a non-significant F-ratio. He then looked at the SPSS output for model parameter estimates to interpret individual parameters. Because Dave's ANOVA did not find a significant result, if he interprets individual parameters he is at risk of making which type of error?

type 1

Dr. Spencer Reed conducted a study to see whether listening to Wu Tang Clan albums improves vocabulary. He compared Wu Tang listeners to the rest of the population, using vocabulary measures that had a known population mean and standard deviation. Based on his results, Dr. Reed failed to reject the null hypothesis, which was unfortunate because in this case the null hypothesis should have been rejected. What type of error did Dr. Reed make?

type 2 error

Without the N-1 correction, sample variance:

underestimates population variance

between-group variability

variability in scores that is primarily due to the different treatments the groups received (or the variability across the different levels). (treatment effect)

within-group variability

variability within a sample who have all received the same treatment (or are in the same level). Mostly reflects individual differences.

Case studies:

very detailed, but unique, isolated examples that don't generalize well

In a scatterplot, the more dispersion there is in the points, the _____ the association.

weaker

Simpson's Paradox

when a trend appears in several different groups of data but disappears or reverses when these groups are combined

complementary probabilities

-when one event occurs if and only if the other does not
-both events cannot occur at the same time
-solve: P(not A) = 1 - P(A)

floor effect

when some constraint (limitation, restriction) prevents your data values from dropping below a particular value example: math test given to incoming freshman to test their math capabilities

ceiling effect

when some constraint prevents your data from exceeding a particular value ex: a test where the highest score possible is 100 points

When should you use post hoc tests?

when the F-ratio exceeds the critical value

descriptive statistics

when we use statistics to describe our sample

inferential statistics

when we use the data from a sample to draw a conclusion about a population

repeated measures

when you have one sample of cases and you collect data on them at 2 different time points

A ChiSquare test is used to determine _______ there is an association between variables.

whether

Which of the following study designs control for individual differences?

-within-subjects design
-repeated-measures design

Decision rule for Gof

-chi-square < critical value: fail to reject the null
-chi-square > critical value: reject the null
-the critical value is looked up using df = number of categories - 1

independent measures

you cannot remove the individual differences - creating higher population variance estimates - so you need a bigger experimental effect to be convinced that the results are significant. So the estimated variance based on the difference scores is quite low, but the estimated variance based on the data from the two experimental groups is quite high.

repeated measures

you use the same people and subtract their before and after scores, so you basically remove the individual differences. This process (mostly) isolates the experimental difference.

decision rule for 2 tailed tests (z and t tests)

-z <= -critical value or z >= +critical value: reject the null (statistically significant)
-z between -critical value and +critical value: fail to reject the null
-start by assuming the null is true
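The two-tailed decision rule for a z test can be sketched in Python. The function name `two_tailed_z_decision` and the default alpha of .05 are assumptions for illustration; the critical value comes from the standard normal distribution.

```python
from statistics import NormalDist


def two_tailed_z_decision(z, alpha=0.05):
    """Return (critical value, True if the null is rejected)."""
    # Split alpha across both tails, so each tail holds alpha / 2.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    reject = z <= -critical or z >= critical
    return critical, reject


print(two_tailed_z_decision(2.5))   # beyond the upper critical value
print(two_tailed_z_decision(-2.5))  # beyond the lower critical value
print(two_tailed_z_decision(1.0))   # inside the common zone
```

With alpha = .05 the critical value is about 1.96, so the first two calls reject the null and the third does not.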

A sampling distribution of mean differences that represents the null hypothesis will be centered on ____ . (Or in other words, what value will be at the center/middle of the sampling distribution of means.)

zero

confidence interval equation

lower bound: X̄ - (1.96 x SEM)
upper bound: X̄ + (1.96 x SEM)
CI(95) = X̄ (sample mean) +/- (1.96 x SD / square root of n), where n is the sample size
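A short Python sketch of the 95% interval, built around the sample mean with the 1.96 multiplier from this card. The function name `ci_95` and the scores are hypothetical; for a sample this small a t-based multiplier would normally replace 1.96, so treat the numbers as an illustration of the arithmetic only.

```python
import math
import statistics


def ci_95(sample):
    """95% confidence interval: mean +/- 1.96 * SD / sqrt(n)."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error
    margin = 1.96 * sem  # z-based multiplier, as on the card
    return mean - margin, mean + margin


# Hypothetical scores.
sample = [4, 8, 6, 5, 7]
lower, upper = ci_95(sample)

print(lower, upper)
```

The interval is symmetric around the sample mean, and it narrows as n grows because the standard error shrinks.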

You are not feeling so confident about how you'll do on the quantitative portion of the GRE. To decide whether a GRE prep course is worth the outrageous cost, you want to make sure that scores of people who took the course are significantly different from those who didn't. (For now we won't worry about higher/lower, just whether the scores are different.) If the mean quantitative GRE score of all GRE test takers is 151.3 with a standard deviation of 150.8, what will your null hypothesis for a single-sample z test be?

μ=151.3 (mu equals 151.3)

Which below is the symbol for an estimate of the population mean?

μ^ (mu hat)

