Stats Exam 5, PSY 3801 Exam 5, PSY 3801: Final Exam, Terms, PSY 3801 Exam #3, PSY3801 EXAM 6, Psychology (Statistics)- Exam 1, Psychology 3801, PSY 3801 Stats Final, PSY 3801 Test 3, Research Methods Final

Central Tendency

"central" score or representative score in a set of scores

sum of squares

(SS) the sum of the squared deviations of a set of scores around the mean of those scores

sample variance

(S^2x) the average of the squared deviations of scores around the sample mean

estimated population variance

(s^2x) the unbiased estimate of the population VARIANCE calculated from sample data using N-1

sample standard deviation

(Sx) the square root of the sample variance; interpreted as somewhat like the 'average' deviation

estimated population standard deviation

(sx) the unbiased estimate of the population STANDARD DEVIATION calculated from sample data using N-1
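
The difference between dividing by N and dividing by N-1 is easier to see with numbers. The following is a minimal sketch, not taken from the course materials; the function name `variability` and the example scores are made up for illustration.

```python
import math

def variability(scores):
    """Return SS plus the N-based and (N - 1)-based variance and standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    # Sum of squares: squared deviations of each score around the mean
    ss = sum((x - mean) ** 2 for x in scores)
    return {
        "SS": ss,
        "sample_variance": ss / n,                 # describes the sample (divide by N)
        "sample_sd": math.sqrt(ss / n),
        "est_pop_variance": ss / (n - 1),          # estimates the population (divide by N - 1)
        "est_pop_sd": math.sqrt(ss / (n - 1)),
    }

# Hypothetical scores: mean = 5, SS = 32
result = variability([2, 4, 4, 4, 5, 5, 7, 9])
print(result)
```

With these scores the sample variance is 32/8 = 4, while the estimated population variance is 32/7, which is slightly larger because of the smaller divisor.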

probability

(p) the likelihood of an event when a population is randomly sampled; equal to the event's relative frequency in the population

The equation for degrees of freedom for the chi-square test of independence is ____________

(r-1) x (c-1)

proportion of variance accounted for

(r^2) in a correlational design, the proportion of the differences in Y scores associated with changes in X

Snowball sampling

(referral sampling)-researcher asks participants to refer others to them

The equation for expected frequency for the chi-square test of independence is __________

(row total x column total)/n
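
As a rough illustration of the two chi-square test of independence formulas on these cards, the sketch below computes expected frequencies as (row total x column total)/n and degrees of freedom as (r-1) x (c-1). The function names and the observed counts are hypothetical.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total x column total) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def df_independence(n_rows, n_cols):
    """Degrees of freedom for the test of independence: (r - 1) x (c - 1)."""
    return (n_rows - 1) * (n_cols - 1)

# Hypothetical 2x2 table of observed counts
observed = [[30, 20],
            [10, 40]]
expected = expected_frequencies(observed)
print(expected)               # expected counts for each cell
print(df_independence(2, 2))  # degrees of freedom for a 2x2 table
```

The same `df_independence` helper reproduces the answers on the contingency-table cards, for example a 4x6 table has (4-1) x (6-1) = 15 degrees of freedom.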

The sum of deviations from the mean

Σ(x - x̄) = 0, ALWAYS EQUAL TO ZERO

population variance

(σ^2x) the average squared deviation of scores around the population mean

population standard deviation

(σx) the square root of the population variance, or the square root of the average squared deviation of scores around the population mean

*To find the MS values in ANOVA, you divide the SS by the _____*

*DEGREES OF FREEDOM*

*central limit theorem*

*a statistical principle that defines the mean, standard deviation, and shape of a sampling distribution:* 1) a sampling distribution is always approximately normal 2) the mean of the sampling distribution equals the mean of the underlying raw score population used to create the sampling distribution 3) the std. deviation of the sampling distribution is related to the std. deviation of the raw score population /// -a statistical theory that states that, given a sufficiently large sample size from a population with a finite level of variance, the mean of all sample means from the same population will be approximately equal to the mean of the population -the mean of the sample means will equal mu /// The central limit theorem states that: Given a population with a finite mean (μ) and a finite non-zero variance (σ^2), the sampling distribution of the mean approaches a normal distribution as N, the sample size, increases

Betty computes a χ2 goodness-of-fit test and finds that all of the observed values equal their corresponding expected values. What should she conclude?

*fail to reject* the null hypothesis
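
To see why Betty fails to reject the null hypothesis, it may help to compute the statistic directly. This is a minimal sketch assuming the usual formula, the sum over categories of (observed - expected)^2 / expected; the function name and the counts are made up for illustration.

```python
def chi_square(observed, expected):
    """Chi-square obtained: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed counts match expected counts exactly, so every term is zero
perfect_fit = chi_square([25, 25, 25, 25], [25, 25, 25, 25])

# Hypothetical mismatch: each category is off by 5
some_misfit = chi_square([30, 20], [25, 25])

print(perfect_fit, some_misfit)
```

Because every term is a squared difference divided by a positive expected count, the statistic can never be negative, and a value of 0 is the smallest possible, so it cannot exceed any critical value.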

What assumption for independent samples t test assumes the population *variances* are equal?

*homogeneity of variance*

The *independent samples t test* assumes that the independent variable affects the __________ of the populations.

*means*

The χ2 test most often is used with __________ scaled variables as long as the categories are mutually exclusive.

*nominal*

one-way chi square vs. two-way chi square

*one-way chi square:* has ONE categorical variable (e.g. type of shoe) with several levels (adidas, nike, etc.) and we want to know whether the frequency of observations differs between the groups *two-way chi square:* has TWO categorical variables (e.g. gender and support for new stadium) and we want to know if these categories are related

It is possible to compute the Pearson product-moment correlation if one is given ____ sets of _____ measurements on the *same* individuals

*two* sets of *interval/ratio* measurements

Interrupted time series design

- Multiple measurements taken before and after some intervention - Useful for evaluating overall trends

Interrupted time series design with control group design

- Multiple measurements taken before and after some intervention compared to a control group that does not receive intervention: before and after AND control group

Interrupted time series with switching replications

- Same treatment implemented in two different locations at two different times - sometimes used for classroom studies - each group acts as a control for the other at different times

What is archival research?

- archival research (past programs, laws)

Why do we use factorial designs?

- evaluate multiple main effects at once (overall effect of a single IV) - can see interaction effects ( effect of one IV depends on the levels of another IV)

Small- N design: Changing Criterion design

- inspired by shaping - target behavior is too difficult or complex to achieve at once, so incremental target behaviors are established e.g. exercise program, smoking cessation programs -> start small: walking ( reward incremental behaviors)

Survey Research (characteristics, when you want to use it and disadvantages)

- measurement procedures that involve asking questions of a sample of respondents - used to gather descriptive info about self-described attitudes, feelings, opinions and behaviors - can be qualitative or quantitative, requires careful sampling procedure - disadvantages: return rate (>85% is amazing, 70-85% is very good, 60-70% acceptable, <60% is not representative), social desirability bias

Quasi-experimental designs

- no random assignment - at least one IV is a subject variable - used to test cause-effect relationships when true experiment is not ethical or practical

What are some wording issues in survey design?

- open-ended vs. closed questions, open ended questions are easier to write, but harder to score. May have closed question followed by open-ended elaboration. Open-ended questions may provide guidelines for later closed questions. - overlapping options: 18-24, 24-30 (how do you pick?) - linguistic ambiguity: visiting relatives can be fun.. "visiting" as a verb or "visiting relatives" - double-barreled questions: asking about more than one construct in a single question (higher taxes or more jobs) - leading questions: do you prefer hamburgers flame-broiled or fried.

What are the basic elements of Single- subject designs within ABA?

- operationally define behavior - establish baseline - begin treatment and monitor behavior

Ex post facto

- subjects are placed into groups "after the fact" of their pre-existing subject characteristics (eg. gender, personality, ethnicity) - try to match groups on other relevant characteristics (e.g. income level, age, ed level)

Non-equivalent control group design (without pretest)

- used when pretesting is impractical (or when unforeseen opportunity for research presents itself) - e.g. IV: distance from SF earthquake (exp: North Cali, nonequivalent: AZ, DV: nightmare frequencies)

Small- N Designs/ Single-Case Research Designs

- used when subjects are difficult to find (e.g. specific type of brain damage or psychological disorders) - emphasize inductive reasoning (reasoning from specific cases to general laws of behavior)

Applied Behavior Analysis (ABA)

- using behavioral principles to solve real-world problems - ABA programs ideally have social validity: value for improving society, value perceived by participants, extent to which participants actually use program

What's return rate? What do you do to get the best chance for return?

-more than 85 percent return rate → excellent -70-85 percent return rate → good -60-70 percent return rate → acceptable -less than 60 percent is probably not representative, and probably can't be used The best chance for return: -surveys need to be brief, and easy to respond to -start with interesting questions and leave "boring" demographic questions until the end -respondents are notified in advance that they'll receive the survey -nonresponse triggers reminders -professional packaging -return postage included -gifts provide incentive -the person can't have any reason to believe the survey is a sales pitch

nondirectional vs. directional hypothesis testing

-A nondirectional alternative hypothesis states that the null hypothesis is wrong. A nondirectional alternative hypothesis does not predict whether the parameter of interest is larger or smaller than the reference value specified in the null hypothesis. -A directional alternative hypothesis states that the null hypothesis is wrong, and also specifies whether the true value of the parameter is greater than or less than the reference value specified in null hypothesis. /// non-directional hypothesis: predicts that the independent variable will have an effect on the dependent variable, but the direction of the effect is not specified (there "will be a difference") directional hypothesis: predicts that the independent variable will have an effect on the dependent variable, the direction of the effect IS specified (there "will be less" or "will be more", etc.)

Linear Regression

-Allows us to describe the relationship between two variables (find the slope and intercept) -Allows us to make predictions of one variable from the other -Process of finding the "best-fitting" line to a scatterplot of data
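
Finding the slope and intercept of the "best-fitting" line can be sketched as follows. This assumes the ordinary least-squares formulas (slope = sum of cross-products / sum of squares of X, intercept = mean of Y minus slope times mean of X); the function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting Y from X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squares for X
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that fall exactly on the line Y = 2X + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 2.5 + intercept  # prediction for an X inside the observed range
print(slope, intercept, predicted)
```

The prediction is made for X = 2.5, which lies inside the range of X values used to fit the line.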

Correlation

-Describes the strength of the relationship between two variables -Direction of the relationship between two variables

two-way chi square

-The procedure for testing whether category membership on one variable is independent of category membership on the other variable; also called the "test of independence" -has TWO categorical variables (e.g. gender and support for new stadium) and we want to know if these categories are related

one-way chi square

-The procedure for testing whether the frequencies of category membership on one variable represent the predicted distribution in the population; also called the goodness-of-fit test; tests how good the fit is between the data and the null hypothesis -has ONE categorical variable (e.g. type of shoe) with several levels (adidas, nike, etc.) and we want to know whether the frequency of observations differs between the groups

Standard error of estimate

-When we DO know the relationship between 𝑋 and 𝑌, we can use the regression line to make predictions of 𝑌. -Used to describe the variability of the points in a distribution about the regression line.

determining significance of main effects and interactions

-You can often get a quick idea of the pattern of results from a study by looking at a figure. It's important to remember that figures cannot tell you whether a pattern is significant - for that, you need the results of a statistical test. -The less parallel the lines on a graph are, the more likely there is to be a significant interaction.

Fisher's Protected t test/LSD test

-a method for comparing treatment group means after The ANOVA null hypothesis of equal means has been rejected using the ANOVA test. If the test fails to reject the null hypothesis, this procedure should NOT be used. -'Protection' means that you only perform the calculations when the overall ANOVA resulted in a P value less than 0.05

confidence interval

-a range of values within which we are confident that the number is found -measures the probability that a population parameter will fall between two set values. The confidence interval can take any number of probabilities, with the most common being 95% or 99%.

Simple random sampling (pros and cons)

-each member of the population has equal chance of being selected -issues: we may want to represent systematic differences, such as grade or gender, the population may be too large

Stratified sampling (pros and cons)

-proportions of important subgroups are represented precisely -allows you to represent systematic differences in the population -requires researcher to determine what subgroups are relevant to the current question

What is the relationship between validity and reliability?

-reliability is a prerequisite for validity (square/rectangle relationship) valid= accurate reliable= consistent

Cluster sampling (pros and cons)

-researcher randomly selects cluster of people having some feature in common (ex: class in school) -each class is a cluster -randomly select one cluster and survey all units (students) in that class (one-stage cluster sampling) -randomly select one cluster then randomly select some students from that class (2-stage cluster sampling) -allows you to sample from large population -often includes other sampling procedures

Quota sampling

-researcher represents subgroups in a nonrandom fashion (ex: run participants until you get 60 percent females)

Purposive sampling

-specific type of person recruited for study (ex: self-employed people)

Non-equivalent control group design (with pretest)

-typically include pretests and posttests - groups are not equal at start of study AND they experience different events in the study itself - e.g. O1 T O2

What are the three main ways researchers falsify data, and why are those unethical?

1) Data invention: made up numbers 2) Data dropping: dropping outliers - have an a priori definition of inclusion if you foresee this being an issue (ex: you won't include data from participants that fall asleep) 3) Data exploitation: "p" hacking (p<.05) - p is the probability that what you found is due to chance (statistical significance) - you can't change numbers to reach statistical significance. For example, if p=.08, you can't report the result as significant. Another example: if p=.06, don't fudge the numbers to get below .05

What are the two most common errors in interpreting correlations?

1) Directionality problem: correlations are descriptive, they don't predict causes/effects 2) 3rd variable problem: an unmeasured variable may explain the relationship (e.g. ice cream sales and murders [heat])

What three main factors can affect the strength of a correlation, and how do they affect a correlation? (Be able to identify examples of each)

1) Linearity of data: zero correlation doesn't mean zero relationship 2) restricting the range 3) outliers: you can eliminate, but you need to have a good reason ahead of time

What are the Four Types of Measurement Scales?

1) Nominal 2) Ordinal 3) Interval 4) Ratio

What are the 3 types of quasi-experimental designs?

1) ex post facto 2) non- equivalent control group design - with pretest - without pretest 3) interrupted time series design - normal - with control group - with switching replications

To study behavior, we want to be able to:

1. Describe behavior and characteristics 2. Predict behavior 3. control and change behavior

How to find Average Deviation

1. Start with the raw scores 2. Find the mean 3. Make a table that subtracts the mean from each score (x - x̄) 4. Make a table with the absolute values of x - x̄ 5. Divide the sum of the absolute values by N (the number of subjects)
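
The steps on this card can be sketched in a few lines. This is an illustration only; the function name `average_deviation` and the scores are hypothetical.

```python
def average_deviation(scores):
    """Mean of the absolute deviations of the scores around their mean."""
    n = len(scores)
    mean = sum(scores) / n                          # find the mean
    deviations = [x - mean for x in scores]         # subtract the mean from each score
    abs_deviations = [abs(d) for d in deviations]   # take absolute values
    return sum(abs_deviations) / n                  # divide their sum by N

# Hypothetical scores with mean 5 and absolute deviations 3, 1, 1, 3
avg_dev = average_deviation([2, 4, 6, 8])
print(avg_dev)
```

The absolute values are needed because the signed deviations around the mean always sum to zero.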

Interval scale

10-12, 13-14 (e.g. age ranges)

There are _________ cells in a 3x4 contingency table

12

A 4x6 contingency table has _______ degrees of freedom

15

A two-way ANOVA is used to test the significance of the differences between sample means when you have (what number of...?) ________________ independent variables.

2

If there are 5 rows and 6 columns in your contingency table, how many degrees of freedom will there be in your chi-square test of independence? ________________

20

If there are 4 categories in your data table, how many degrees of freedom will there be in your chi-square goodness-of-fit test? ________________

3

In a 2x3x5 research design, how many independent variables are there? ________________

3

A study evaluated the effectiveness of three types of therapy in producing self-reliance. The researcher also assessed sex because she thought the therapy might affect males and females differently. An independent groups design was used with 5 participants per cell. What is the design of the study? (_____ x _____ _____)

3 x 2 factorial

a. When you perform a two-way ANOVA to test the significance of the main effect(s) and interaction(s) of the independent variables in your experiment, how many F-values would you compute? ________________ b. How many of those F-values test main effects? ________________ c. How many of those F-values test interactions? ________________

3, 2, 1

In two-way ANOVA, SST is partitioned into how many SS components?

4

There are ______ cells in a 2x2 contingency table

4

A 3x5 contingency table has ________ degrees of freedom

8

Under the normal distribution, approximately _____% of the area is contained between plus and minus 2 standard deviations from mean.

95 %

You have timed the rate at which participants can solve puzzles under three conditions of noise: high, medium, and low. In addition, the participants are under the influence of marijuana, cocaine, or alcohol. What kind of design do you have?

A 3 x 3 between-subjects factorial design

parameter

A characteristic of a population.

statistic

A characteristic of a sample.

contingency table

A classification tool that reveals the various possibilities (contingencies) in the comparison of variables; a table that presents data in terms of all combinations of two or more variables.

contingency table

A data matrix that displays the frequency of some combination of possible responses to multiple variables; cross tabulation results

bimodal distribution

A distribution with two modes.

Alternative Hypothesis

A hypothesis that stands in opposition to the null hypothesis. It may be directional or nondirectional.

ratio level of measurement

A level of measurement that has all the properties of the interval level of measurement, plus the presence (or possibility) of a true or legitimate zero (0) point.

ordinal level of measurement

A level of measurement that presumes the notion of order (greater than and lesser than).

data distribution

A listing of the values or responses associated with a particular variable in a data set.

observed value

A measure produced by an observation and measurement system. Observed values serve as the data that the researcher and others will interpret to form conclusions about an investigation.

perfect association

A pattern of association between variables in which there is perfect predictability; knowledge of the value of one variable allows a precise prediction of the value of the other variable.

negative (inverse) association

A pattern of association in which the variables track in opposite directions; as one variable increases in value, the other variable decreases in value.

positive (direct) association

A pattern of association in which the variables track in the same direction; as one variable increases in value, the other variable increases in value.

sampling frame

A physical representation of the population; a listing of all the elements in a population.

Z (Z score)

A point along the baseline of a standardized normal curve.

sample

A portion of a population

Predict

A primary goal of performing a regression analysis is to allow you to ________ the value of one variable when you know the value of the other.

correlation (Pearson's r)

A procedure designed to determine the strength and direction of an association between two interval/ratio level variables. Also known as Pearson's r.
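
A minimal sketch of the computation, assuming the deviation-score form of Pearson's r (sum of cross-products divided by the square root of the product of the two sums of squares). The function name and both data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two sets of interval/ratio scores on the same individuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
r_perfect = pearson_r(xs, [2, 4, 6, 8, 10])   # points fall exactly on a rising line
r_partial = pearson_r(xs, [1, 3, 2, 5, 4])    # hypothetical noisier scores
r_squared = r_partial ** 2                    # proportion of variance accounted for
print(r_perfect, r_partial, r_squared)
```

The sign of r gives the direction of the association and its absolute value gives the strength; squaring r gives the proportion of variance accounted for.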

Constant

A quantity that does not change its value within a certain context

two-tailed test scenario

A research situation in which the researcher is looking for an extreme difference that could be located at either end of the distribution.

one-tailed test situation

A research situation in which the researcher is looking for an extreme difference that is located on only one side of the distribution.

random sample

A sample selected in such a way that every unit has an equal chance of being selected, and the selection of any one unit in no way affects the selection of any other unit. In a random sample, all combinations are possible.

Representative sample

A sample that reflects the attributes of target population as a whole

Central Limit Theorem

A statement about the relationship between a population and a sampling distribution based on that population. If repeated random samples of size n are taken from a population with a mean or mu (μ) and a standard deviation (σ), the sampling distribution of sample means will have a mean equal to mu (μ) and a standard error equal to σ /√n . Moreover, as n increases the sampling distribution will approach a normal distribution.
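
A small simulation can make this statement concrete. The sketch below is an illustration under stated assumptions: the population is uniform on [0, 1) (so μ = 0.5 and σ = sqrt(1/12)), the sample size is 25, and the function name, seed, and number of samples are arbitrary choices.

```python
import random
import statistics

def simulate_sample_means(n, n_samples=5000, seed=0):
    """Draw repeated random samples of size n and return each sample's mean."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(n_samples)]

n = 25
mu = 0.5                      # mean of the uniform [0, 1) population
sigma = (1 / 12) ** 0.5       # standard deviation of that population
means = simulate_sample_means(n)

mean_of_means = statistics.mean(means)   # should be close to mu
sd_of_means = statistics.pstdev(means)   # should be close to sigma / sqrt(n)
expected_se = sigma / n ** 0.5

print(mean_of_means, sd_of_means, expected_se)
```

Even though the raw-score population here is flat rather than bell-shaped, the simulated sample means cluster around μ with a spread close to σ/√n.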

null hypothesis

A statement of equality; a statement of no difference; a statement of chance. In the case of a hypothesis test involving a single sample mean (that is compared to a known population mean), the null is typically a statement of the value of the population mean.

1-2-3- Rule

A statement of how much area under a normal curve is found between ±1, ±2, and ±3 standard deviations from the mean.
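
The three areas can be checked numerically. This sketch uses the standard identity that the area of a normal curve within ±k standard deviations of the mean is erf(k / sqrt(2)); the function name `area_within` is made up for illustration.

```python
import math

def area_within(k):
    """Proportion of a normal curve's area within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The results are roughly 68%, 95%, and 99.7% for ±1, ±2, and ±3 standard deviations.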

range

A statement of the difference between the highest and lowest scores or values in a distribution. As a measure of dispersion or variability, the range is simple to calculate, but it doesn't say much about the distribution.

one-way ANOVA

A statistical test used to analyze data from an experimental design with one independent variable that has three or more groups (levels).

interval level of measurement

A system of measurement based on an underlying scale of equal intervals.

t-test vs. z-test

A t-test is used for testing the mean of one population against a standard or comparing the means of two populations if you do not know the populations' standard deviation and when you have a limited sample (n < 30). If you know the populations' standard deviation, you may use a z-test.

table of areas under the normal curve

A table of values that tell you what proportion of the area under the normal curve is found between the mean and any Z value.

frequency distribution

A table or graph that indicates how many times a value or score appears in a set of values or scores.

regression analysis

A technique that allows the use of existing data to predict future values.

margin of error

A term used to express the width of a confidence interval for a proportion.

ANOVA

A test to determine if there is a significant difference among three or more groups or samples.

skewed distribution

A unimodal distribution that departs from symmetry, in the sense that most of the cases are concentrated at one end of the distribution.

normal curve

A unimodal, symmetrical curve that is mathematically defined on the basis of the mean and standard deviation of an underlying distribution.

standardized normal curve

A unimodal, symmetrical, theoretical distribution based on an infinite number of cases, having a mean of 0 and a standard deviation of 1.

scatter plot

A visual representation of the values of two variables on a case-by-case basis.

standard deviation

A widely used measure of dispersion or variability. The standard deviation is the square root of the variance.

variance

A widely used measure of dispersion or variability. The variance is equal to the standard deviation squared.

Small- N design: simplest design?

A-B design - A = baseline - B = treatment - weakness: you cannot test for extraneous variables

ANOVA

ANalysis Of VAriance - btwn means of 3 or more groups

What is validity?

Accuracy

Small- N design: Alternating Treatments Design

After baseline established, different treatments are alternated numerous times - a form of counterbalancing ( 2 conditions without carryover effects)

Small- N design: Withdrawal (or reversal) design

After treatment has been in effect, it is withdrawn - A- B- A (another baseline) - A-B-A- B(another baseline, posttest)

population

All possible cases; sometimes referred to as the universe. It is often thought of as the total collection of cases that you're interested in.

A(n) _____________ hypothesis is a hypothesis that stands in opposition to the null hypothesis

Alternative or research

non-directional hypothesis

An alternative or research hypothesis that does not specify the nature or direction of a hypothesized difference. It simply asserts that a difference will be present.

directional hypothesis

An alternative or research hypothesis that specifies the nature or direction of a hypothesized difference. It asserts that there will be a difference or a change in a particular direction (increase or decrease).

curvilinear association

An association between two variables that would, if represented in a scatter plot, conform to a general pattern of a curved line.

linear association

An association between two variables that would, if represented in a scatter plot, conform to a general pattern of a straight line.

What is the definition of the estimate of the standard error in the case of a t test for related sample means?

An estimate of the standard deviation of the sampling distribution of mean differences

estimate of the standard error of the mean

An estimate of the standard deviation of the sampling distribution of sample means; a function of the standard deviation of a sample.
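
A minimal sketch of how this estimate is typically computed and used, assuming the common formula s / sqrt(N) with s based on N - 1, and a one-sample t statistic of (sample mean - μ) / estimated standard error. The function names, scores, and comparison value are hypothetical.

```python
import math

def estimated_standard_error(scores):
    """Estimated standard error of the mean: s / sqrt(N), with s computed using N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    return s / math.sqrt(n)

def one_sample_t(scores, mu):
    """t obtained: distance of the sample mean from mu, in estimated standard errors."""
    mean = sum(scores) / len(scores)
    return (mean - mu) / estimated_standard_error(scores)

# Hypothetical scores: mean = 5, SS = 12, s = 2, so the estimated standard error is 1
scores = [2, 6, 6, 6]
se = estimated_standard_error(scores)
t_obt = one_sample_t(scores, mu=3)
print(se, t_obt)
```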

factor

An independent variable

between-subjects

An independent variable that is studied using independent samples in all conditions

within-subjects

An independent variable that is studied using related samples in all conditions

mean (average) deviation

An infrequently used measure of dispersion based, in part, on the absolute deviations from the mean of the distribution.

confidence interval for the mean

An interval/range of values within which the true mean of the population is believed to be located -constructed around the SAMPLE MEAN when the population SD is known or when it is not known
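
For the case where the population SD is known, the interval can be sketched as sample mean ± z critical x (σ / sqrt(n)). The sketch assumes a 95% interval with the two-tailed critical z of 1.96; the function name and the numbers are made up for illustration. When σ is not known, a critical t and the estimated standard error would be used instead.

```python
import math

def confidence_interval(sample_mean, sigma, n, z_crit=1.96):
    """Interval around the sample mean when the population SD (sigma) is known."""
    margin = z_crit * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical values: sample mean 100, sigma 15, n = 25, so the standard error is 3
low, high = confidence_interval(100, 15, 25)
print(low, high)
```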

standard error of the estimate

An overall measure of the difference between actual and predicted values of Y.

-1

An r value of ________ would be interpreted as a perfect negative association.

1

An r value of ________ would be interpreted as a perfect positive association.

Mean

Arithmetic average (Σx / n), the balancing point of a set of scores.

direct relationship

As X increases Y increases B>0

Inverse relationship

As one variable increases, the other variable decreases B<0

Increases

As r increases, r squared____

decreases

As r increases, standard error_____

homogeneity of variance

Assumption that the variances of the populations being represented are equal

If the results of an ANOVA are statistically significant, what can you conclude?

At least two (but possibly more) of the sample means are significantly different from each other

Parameter

Average for population

collapsing

Averaging together scores from all levels of one factor to calculate the means for the other factor
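
Collapsing can be illustrated with a table of cell means. This is a sketch that assumes equal numbers of participants in every cell (so averaging cell means is the same as averaging scores); the function name and the cell means are hypothetical.

```python
def marginal_means(cell_means):
    """Collapse a table of cell means: rows are levels of IV A, columns are levels of IV B."""
    # Averaging across columns (levels of B) gives one mean per level of A
    a_means = [sum(row) / len(row) for row in cell_means]
    # Averaging across rows (levels of A) gives one mean per level of B
    b_means = [sum(col) / len(col) for col in zip(*cell_means)]
    return a_means, b_means

# Hypothetical 2x2 design
cell_means = [[10, 20],
              [30, 40]]
a_means, b_means = marginal_means(cell_means)
print(a_means, b_means)
```

Comparing the values in `a_means` addresses the main effect of IV A, and comparing the values in `b_means` addresses the main effect of IV B.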

The F ratio is the ratio of the amount of variation _________ the groups to the amount of variation ________ the groups

Between; within

Continuous

Can assume an infinite number of values between any two points on the scale

Discrete

Can assume only a certain number of values between any two points on the measurement scale

Descriptive Method

Case study method, Observational method, survey method (concerned with population BUT you take a survey from a group of people and draw conclusions about general population)

Information obtained on variables measured at the nominal or ordinal level is said to be _______ data

Categorical

Nominal

Classification, putting observations into categories (sex, nationality, category names)

__________ and __________ are effect size measures used with *t tests*

Cohen's d; r^2 pb (MEMORY TOOL: YOU PUT THE D AND THE PB IN THE T; YOU PUT THE D AND THE PEANUT BUTTER IN THE TOAST?)

correlation

Computing a ____helps us answer questions about how much variability in one variable is associated with variability in another variable.

Y,X

Computing the linear regression of Y on X allows you to predict the value of ________ when you know the value of ________.

What is reliability?

Consistency over repeated trials (e.g. scale)

confidence interval for a proportion

Constructed in an effort to estimate the proportion in a population, based upon a proportion in a sample

A _______ table is a classification tool that reveals the various possibilities in the comparison of variables

Contingency

Strength and direction of the relationship

Correlation coefficient tells us 2 things:

combined

Correlation for _________ groups can be quite different from the correlations for the individual groups.

Predictive Methods

Correlational Method, Quasi-experimental Method

Correlational vs Experimental studies

Correlational: Finding a linear relationship between two variables Experimental: manipulating a variable/variables in an experiment

ANOVA vs. t-test

DIFFERENCES: t-test: hypothesis test that is used to compare the means of two samples ANOVA: statistical technique that is used to compare the means of MORE than two samples SIMILARITIES:

The between-groups sum of squares is transformed into an estimate of the between-groups variance by dividing the between-groups sum of squares by an appropriate number of _______

Degrees of freedom

The within-groups sum of squares is transformed into an estimate of the within-groups variance by dividing the within-groups sum of squares by an appropriate number of ________

Degrees of freedom
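
The path from sums of squares to the F ratio can be sketched for a one-way, between-subjects design. This assumes the usual degrees of freedom (k - 1 between groups, N - k within groups); the function name and the scores are made up for illustration.

```python
def one_way_anova(groups):
    """Return SS, df, MS, and F for a one-way between-subjects ANOVA."""
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    k = len(groups)
    grand_mean = sum(all_scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups SS: group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-groups SS: scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between   # MS = SS / df
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical scores for three groups
anova = one_way_anova([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(anova)
```

Here SS between is 54 on 2 df and SS within is 6 on 6 df, so the mean squares are 27 and 1 and F is their ratio.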

Range

Difference between the highest and lowest scores

SPSS

Different subjects in different rows Different variables in different columns

An alternative or research hypothesis that specifies the nature or direction of a hypothesized difference is considered a __________

Directional hypothesis

Y,X

Do not make predictions of____ for values of ____that are outside the range of 𝑋 over which you computed the original regression line because you have no evidence of the form of the relationship beyond that range.

Content validity

Do the test items measure all facets of the intended construct? (e.g. IQ, cumulative exams)

Face validity

Does the measure seem valid to those who are taking it -> not actually validity (e.g. direct questions about socially unacceptable behavior)

Between-subjects design

Each participant receives only one level of the IV - Required when using subject variables or when the study needs uninformed participants - disadvantages = greater likelihood of extraneous variables

External validity

Ecological validity, the extent to which a study's results generalize beyond the parameters of the laboratory. -> other participants, settings, interventions, outcomes

What is the definition of effect?

Effect is the change in measurement that is attributable to a treatment condition or stimulus of some sort

Interaction Effect

Effect of 1 IV across the levels of the other IV ( see graph, if it seems to cross each other, there is an interaction)

Simple Effect

Effect of 1 IV within one level of another

Main Effect

Effect of only 1 IV, disregarding other IVs

Cohen's d

Effect size measure that reflects the magnitude of the differences between the means of conditions
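
Several versions of Cohen's d exist. The sketch below shows one common version for two independent samples, the difference between the means divided by the pooled standard deviation, and may differ from the formula used in the course; the function name and the scores are hypothetical.

```python
import math

def cohens_d(group1, group2):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1 = sum(group1) / n1
    m2 = sum(group2) / n2
    ss1 = sum((x - m1) ** 2 for x in group1)
    ss2 = sum((x - m2) ** 2 for x in group2)
    pooled_variance = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_variance)

# Hypothetical scores: means 6 and 3, pooled SD = 2
d = cohens_d([4, 6, 8], [1, 3, 5])
print(d)
```

A d of 1.5 means the two condition means differ by one and a half pooled standard deviations.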

The _______ frequency is the frequency that would be expected to occur in a particular cell, based upon chance and the marginal distributions

Expected

Carryover Effects (4)

Experience in first condition that affects performance in later conditions Practice effects Fatigue effects Order effects Sequence effects

Extraneous vs. Confounding variables

Extraneous: anything that varies other than what you manipulate (ex: the time of day, temperature, extra noise, gender, etc.) Confounding: something that varies systematically with your IV (e.g. time of day)

extreme

Extreme scores have an ____ effect on the correlation coefficient

The calculated test statistic for ANOVA is known as the _______ ratio

F

A Type II error involves _________ a null hypothesis when it is ___________

Failing to reject; false

Type II error

Failure to reject the null hypothesis when the null is false.

Extreme scores don't affect the mean

False

In the test for the difference of means, independent sample design, the number of cases in each sample must be equal. True or false?

False

Median is affected by extreme scores

False

The post hoc comparison to use if the ns in all levels are not equal is _________

Fisher's protected t test

Floor effect vs ceiling effect

Floor: nobody gets higher than 10%, low results (closer to 0). Ceiling: everyone scores really high. This is why we have pilot studies

Inverse relationship, r squared=.49

For r = -.70, what two things does that specifically tell you about the relationship between X and Y?

Variable

General characteristics that can take on different values for different observations

Unimodal

Has 1 mode

Biased sample

Having a part of the population represented more than others (not reflecting all the attributes)

Oskar computes a χ2 goodness-of-fit test and finds a negative χ2 obtained value. What should he conclude?

He made a mistake somewhere in his computations

ANOVA null hypo

Ho: μ1=μ2=μ3

correlation coefficient is positive when there is a positive relationship and it is negative when there is a negative relationship

How is the sign of the correlation coefficient related to the direction of the relationship between two variables?

z-Score

How many standard deviations you are away from the mean

chi-square test of independence

Hypothesis-testing procedure for categorical variables that tests whether or not there is an association between two variables (not strength) OBJECT: to determine if the pattern reflected in a contingency table departs from chance in a significant manner

Your research design consists of independent variable A and independent variable B. If you average or combine the data across levels of IV B at each level of IV A, you would be evaluating the main effects of _________________.

IV A

If there is a significant interaction between IV A and IV B, then the effects of IV A depend upon the level of ________________.

IV B

less than

If P is ____ alpha, you reject the null

reject the null

If our observed value of p is less than alpha, you _________

when to reject t-test

If the absolute value of the t-value is greater than the critical value, you reject the null hypothesis

Small

If the standard error of estimate is ____, that means that the points in the scatterplot are close to the regression line and we can make good predictions.

1,-1

If the standard error of the estimate is 0 then the correlation needs to either be ____ or _____.

characteristics of a z-score distribution

If your Z-score distribution is based on the sample mean and sample standard deviation, then... 1) the mean will equal zero 2) the standard deviation will equal one
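
As a quick check of these two properties, here is a small sketch that converts raw scores to z-scores using the sample mean and the descriptive sample standard deviation (N in the denominator). The function name `z_scores` and the data are illustrative assumptions.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Convert each raw score to (score - mean) / standard deviation."""
    m = mean(scores)
    s = pstdev(scores)  # descriptive SD: divides the sum of squares by N
    return [(x - m) / s for x in scores]


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean = 5, SD = 2
z = z_scores(scores)
print(z)          # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(mean(z))    # the mean of the z-scores is 0
print(pstdev(z))  # the SD of the z-scores is 1
```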

linear regression

In a ____, we want to find the straight line that is closest to every point in a scatterplot

main effect

In a factorial design, the overall effect of one independent variable on the dependent variable, averaging over the levels of the other independent variable.

tail of the distribution

In a skewed distribution, the elongated portion of the curve.

does not equal

In linear regression, Y on X _______ X on Y

Experimental Method

Independent variable and dependent variable, random assignment, experimental group and control group

categorical data

Information obtained on variables measured at the nominal or ordinal level; responses that can be classified into categories.

If the independent variable has no effect on behavior, what does that say about the F-ratio?

It will equal approximately 1

In a 2x3x5 research design, the number 3 tells you the number of ________________ of one of the independent variables.

Levels

If you had a research problem appropriate for ANOVA and it was based on the results from three samples, what would be the null hypothesis?

μ1=μ2=μ3 (the null hypothesis is stated in terms of the population means)

What are the seven threats to internal validity? Which ones are participant-related (p) and which ones are measurement-related (m)?

MR SMITH. Maturation: developmental changes that occur with the passage of time (P). Regression: if a score at T1 is extreme, the score at T2 will most likely be closer to average, because statistically most scores cluster around the mean (M). Selection: nonequivalent groups if participants are not randomly assigned (P). Mortality (attrition): subjects drop out of a (long-term) study (P). Instrumentation: measurement changes from T1 to T2 (M). Testing/practice effects: exposure to the pretest affects performance on the posttest (M). History: an event occurs between T1 and T2 that is unrelated to the intervention but produces change (P).

In an ANOVA, increasing the differences between group means by adding a constant to all the scores in one group will increase:

MSb

mean square between groups

MSb, estimates of variance between groups =SSb/dfb

mean square within groups

MSw, estimates of variance within groups =SSw/dfw

Ordinal

Magnitude only (Ranking)

Ratio

Magnitude, Equal intervals, Absolute zero (salary, age, weight)

interval

Magnitude, equal intervals, no absolute zero (temp)

Manifest vs. latent variables

Manifest: observed variables that correspond to questionnaire items. Latent: unobserved variables that are measured by multiple observed variables (the observed variables are also called manifest variables, items, or indicators of the latent variables)

Manipulated vs. subject variables

Manipulated variables: variables you can actually manipulate (e.g. stress, anxiety, etc.) Subject variables: qualities in subjects that are inherent, can't change it (e.g. gender, race, etc.)

In the matched samples design (the design based on the mean difference), the sampling distribution at issue is the sampling distribution _____________

Mean differences

Another name for the between-groups estimate of variance is _______

Mean square between

Another name for the within-groups estimate of variance is ________

Mean square within

What is the relationship between extraneous and confounding variables ( in other words, are all extraneous variables confounding)?

No, a variable is confounding only when it systematically lines up with the IV. Extraneous: anything that varies in the experiment that you don't manipulate. Confounding: something that varies systematically with your IV (e.g. time of day)

Are sampling distributions always normally distributed?

No, sampling distributions are not *always* normally distributed

What is NOIR?

Nominal, Ordinal, Interval, Ratio

__________ frequencies are the frequencies presented in the cells of a table

Observed

A __________ tailed test scenario is appropriate when the alternative or research is directional in nature

One-tailed

When you analyze data based on a *population*, you calculate a ______

PARAMETER

Population

Parameter

nonparametric vs. parametric statistics

Parametric: to test group MEANS; use with a normal distribution; ratio or interval data. Nonparametric: to test group MEDIANS and MODES; use with any distribution; ordinal or nominal data

Data set

Particular set of information (e.g., the height of everyone in this room)

How do we use r and r^2 to interpret correlational results?

Pearson's r = coefficient of correlation, a descriptive statistic that ranges from -1.00 to 1.00 and requires interval or ratio scale data (use Spearman's rho for ordinal); it tells the strength and direction of the correlation. r^2 = coefficient of determination, the proportion of variability in one variable that is accounted for by variability in the other

Social desirability bias

People answer surveys/ questionnaires in a way that they think is socially desirable (because they don't want to be judged)

Quasi-experimental Method

People are CHOOSING what group they are going to be in

Self-selection bias

People who volunteer for surveys or experiments might differ from people who don't volunteer, and that difference isn't accounted for. Similar to nonresponse bias

Positive vs negative vs no correlation

Positive: as one variable increases, the other increases as well. Negative: as one variable increases, the other decreases. No correlation: no relationship between the two variables

What is the power of a test?

Power is the ability of a test to reject a false null hypothesis

Measurement

Process of assigning numbers or labels to observations

Scientific vs. pseudoscientific studies

Pseudoscientific: false science that appears to use scientific methods but ignores disconfirming studies; relies heavily on anecdotal "evidence" (evidence with subjective value); reduces complex phenomena to overly simplistic concepts; often involves belief perseverance, confirmation bias, and the availability heuristic; e.g. phrenology. Scientific: assumes determinism and discoverability; makes objective (ideally), systematic observations; can be challenged; produces data-based conclusions; asks answerable (empirical) questions (what can we answer?)

Regression Analysis

Purpose is to describe the general relationship between two variables.

X,Y

Put another way, it doesn't matter which variable you call ___ and which you call ___ when you compute 𝑟.

types of variables

QUANTITATIVE: indicates the AMOUNT of the variable that is present QUALITATIVE: indicates a CLASSIFICATION/CATEGORIZATION of the variable, NOT measured in amounts

What type of data is required in order to measure a correlation?

Quantitative data

proportion

R squared=

Measures of variability include:

Range

What does the value of t represent in the t test for related sample means?

Ratio that expresses how far the observed mean difference departs from the assumed mean difference of 0 in standard error units

Predicting Y, given our X values

Regress Y on X means

A Type I error involves ________ a null hypothesis when it is ___________

Rejecting; true

Type I error

Rejection of the null hypothesis when the null is true.

Within-subjects design

Repeated-measures designs. - Each participant is exposed to each level of the IV - Required/preferred when 1) conducting psychophysical experiments or 2) when population of interest (or available sample) is small. - advantages = eliminate the need for equivalent groups, reduction in error variance, and requires fewer subjects

Dehoaxing

Revealing the true purpose of the study

marginals

Row and column totals in a contingency table, which are shown in its margins.

between groups sum of squares

SSb, finding the deviation of each sample mean from the grand mean, squaring the deviations, weighting the squared deviations by the number of cases in each sample, summing across all the samples

within groups sum of squares

SSw, finding deviation of each score in a sample from sample means, squaring the deviations, adding the squared deviations for each sample, summing across all the samples

When you analyze data based on a *sample*, you calculate a _____

STATISTIC

ratio examples

Salary, weight, age

related samples

Samples selected in such a manner that cases included in one sample are somehow related or matched to cases in another sample. In some instances, the matching is achieved by using the same subjects tested in two situations (for example, in a before/after test situation). In other instances, the matching is achieved by matching subjects or cases on the basis of relevant criteria.

independent samples

Samples selected in such a manner that the selection of any case in no way affects the selection of any other case. If same group, but survey anonymous then INDEPENDENT SAMPLES

Measurement Scale

Set of possible numbers or labels that you can assign to your observations (Magnitude, Equal intervals, Absolute Zero, Nominal, Ordinal, Ratio, Interval)

match

Signs for correlation and slope will always ____

What are the six elements of informed consent?

Six elements in an informed consent form: 1) Fair and understandable explanation of the activity and its purpose. 2) Explanation of the discomforts or risks that may be reasonably expected to occur. 3) Explanation of benefits that may be reasonably expected to occur. 4) Disclosure of alternative procedures (where relevant). 5) Willingness to answer all questions. 6) Freedom to withdraw consent and participation at any time.

how far points are from the regression line

Standard error tells you

inferential statistics

Statistical procedures used to make statements or inferences about a population, based on sample statistics.

chi-square goodness-of-fit test

Statistical test used to evaluate how well a set of observed values fit the expected values. The probability associated with a calculated chi-square value is the probability that the differences between the observed and the expected values may be due to chance.

Sample

Statistics

Desensitizing

Stress reduction (telling them that the test was hard and it's okay if they struggled), making them feel better

r

Symbol for correlation coefficient

How would you best describe MSb?

Systematic variance plus error variance

Debriefing

Telling participants what they just did and why, involves dehoaxing and desensitizing, required when deception is involved in the experiment. Additionally, you need to tell them that their data can be removed from the data set if they wish

Y prime (Y')

The Y value that you are attempting to predict, based on a given value for X and the regression equation.

a term in the regression equation (Y' = a + bX )

The Y-intercept; the point at which the regression line crosses the Y-axis.

standard error of estimate

The ________ is to the regression line as the standard deviation is to the mean.

power

The ability of a test to reject a false null hypothesis.

how to interpret z scores

The absolute value of the z-score tells you how many standard deviations you are away from the mean. If a z-score is equal to 0, it is on the mean

level of confidence

The amount of confidence that can be placed in an estimate derived from the construction of a confidence interval. Level of confidence is mathematically defined as 1 minus the level of significance. The level of confidence is a statement of the percentage of times (99%, 95%, etc.) one would obtain a correct confidence interval if one repeatedly constructed confidence intervals for repeated samples from the same population.

effect size

The amount of influence that changing the conditions of the independent variable had on dependent scores

mean square between

The between-groups estimate of variance; calculated by dividing the between-groups sum of squares by the between-groups degrees of freedom.

central tendency

The center or typicality of a distribution. The three most common measures of central tendency are the mean, median, and mode.

effect

The change in a measurement that is attributable to a treatment condition or stimulus of some sort.

cell

The combination of one level of one factor with one level of the other factor

levels

The conditions of the independent variable

r squared

The correlation coefficient is represented by r. The proportion of variation in Y that is associated with or explained by variation in X is represented by

sampling error

The difference between a sample statistic and a population parameter that is due to chance.

In the independent samples design (the design based on the difference of means), the sampling distribution at issue is the sampling distribution of _____________

The difference between means

main effect

The effect on the dependent scores of changing the levels of one factor after collapsing over the other factor

regression equation

The equation that describes the path of the line of best fit. The regression equation is used to predict a value of Y (referred to as Y′ or Y-prime) on the basis of an X value (Y′ = a + bX).
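
The slope b and Y-intercept a can be found with the usual least-squares formulas. The sketch below is illustrative only: the helper names `fit_line` and `predict` and the data are assumptions, and the range check simply enforces the rule about not predicting outside the observed range of X.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    mx, my = mean(xs), mean(ys)
    # Slope: sum of cross-products of deviations divided by the sum of squares of X.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx  # the line always passes through (mean of X, mean of Y)
    return a, b


def predict(a, b, x, x_min, x_max):
    """Predict Y' for x, refusing to extrapolate beyond the observed X range."""
    if not x_min <= x <= x_max:
        raise ValueError("x is outside the range of X used to fit the line")
    return a + b * x


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)            # a is about 2.2, b is about 0.6
y_prime = predict(a, b, 4, min(xs), max(xs))
print(round(y_prime, 2))           # 4.6
```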

standard error of the difference

The estimated standard deviation of the sampling distribution of differences between the means

variability

The extent to which the scores in a distribution are spread around the mean value or throughout the distribution. The two most commonly used measures of dispersion are the variance and the standard deviation.

expected frequency

The frequency expected in a category if the data perfectly represent the distribution of frequencies described by the null hypothesis

expected frequency

The frequency that would be expected to occur in a particular cell, given the marginal distributions and the total number of cases in the table.
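
A short sketch of how expected frequencies and the chi-square statistic follow from the marginals, using (row total x column total)/n for each cell. The function names and the 2 x 2 table are hypothetical.

```python
def expected_frequencies(observed):
    """Expected cell counts from the row and column totals (the marginals)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


observed = [[30, 10],
            [20, 40]]
print(expected_frequencies(observed))    # [[20.0, 20.0], [30.0, 30.0]]
print(round(chi_square(observed), 3))    # 16.667
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r-1) x (c-1) = 1
```

Because every term is a squared difference divided by a positive expected count, the statistic can never be negative.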

observed frequency

The frequency with which participants fall into a category of a variable

Control group

The group in an experiment that does NOT receive the variable being tested.

Experimental Group

The group in an experiment that receives the variable being tested.

Control Group

The group that gets no treatment, no manipulation

Experimental Group

The group that gets treatment, the manipulation

regression line

The line that passes through a scatter plot in such a way that the square of the distance from each point in the plot to the line is at a minimum. Also known as 'line of best fit'

Statistic

The mean of a distribution of scores measured

mu

The mean of a population

In the matched/related sample t-test what does D (bar on top) represent?

The mean of the differences between the related samples. Basically, the mean difference

grand (overall) mean

The mean that would result if the values of all cases in an ANOVA application were added and the sum divided by the total number of cases.

between-groups degrees of freedom

The number of degrees of freedom associated with the estimate of between-groups variance; equivalent to the number of groups minus 1.

within-groups degrees of freedom

The number of degrees of freedom associated with the within-groups estimate of variance; equivalent to the number of cases minus the number of groups.

Nonresponse bias

The people who respond to surveys/questionnaires might differ from the people who don't, and that difference might be a variable that needs to be accounted for.

point of inflection

The point at which a normal curve begins to change direction. It is one standard deviation above or below the mean of the underlying distribution.

critical value

The point on a sampling distribution that marks the beginning of the critical region; the value that is used as a point of comparison when making a decision about a null hypothesis. If the calculated test statistic (e.g., Z or t) meets or exceeds the critical value, the null hypothesis can be rejected.

experiment-wise error rate

The probability of making a Type I error when comparing all means in an experiment
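
To see why this rate grows with the number of comparisons, here is a rough sketch using the common approximation 1 - (1 - alpha)^c. It assumes the c comparisons are independent, which is a simplification; the function name is hypothetical.

```python
def experimentwise_error_rate(alpha, n_comparisons):
    """Approximate chance of at least one Type I error across all comparisons.

    Assumes the comparisons are independent (a simplifying assumption).
    """
    return 1 - (1 - alpha) ** n_comparisons


# Three groups means three pairwise t tests, each run at alpha = .05.
print(round(experimentwise_error_rate(0.05, 3), 4))  # 0.1426
```

Under that assumption, running three separate t tests at .05 gives about a 14% chance of at least one Type I error, which is the inflation post hoc procedures are designed to control.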

level of significance

The probability of making a Type I error.

mixed design

The procedure performed with one within-subjects factor and one between-subjects factor

predict values of Y when you know values of X

The purpose of performing a linear regression of Y on X is to allow you to...

F ratio

The ratio of the between-groups estimate of variance to the within-groups estimate of variance. The F ratio is frequently referred to as the ratio of the mean square between to the mean square within.
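
Putting the pieces together (SSb, SSw, their degrees of freedom, MSb, MSw), a one-way ANOVA F ratio can be sketched as below. The function name `one_way_anova` and the three small groups are made up for illustration.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # SSb: squared deviation of each group mean from the grand mean, weighted by n.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSw: squared deviation of each score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_anova(groups)
print(round(f_ratio, 2), df_b, df_w)  # 13.0 2 6
```

In this example SSb = 26 and SSw = 6, so MSb = 13, MSw = 1, and F = 13.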

critical region

The region of rejection or area/portion in a sampling distribution that contains all the values that allow you to reject the null hypothesis

Predictor, criterion

The regression equation you find applies only for the _____ and _______ as you defined them; it usually will not apply if you want to make predictions in the other direction.

mode

The response or value that appears most frequently in a distribution. The mode is the only measure of central tendency that is appropriate for nominal level data.

calculated test statistic

The result of a hypothesis-testing procedure; the value that is compared to a critical value when testing the null hypothesis.

Z ratio

The result of finding the difference between a raw score and a mean, and dividing the difference by the standard deviation. This procedure converts a raw score into a Z score.

observed frequency

The result or frequency presented in each cell of a contingency table.

sampling distribution of sample means

The result you would get if you took repeated samples from a given population, calculated the mean for each sample, and plotted the sample means.

treatment effect

The results of changing the conditions of an independent variable so that different populations of scores having different μs are produced

median

The score that divides a distribution in half; the midpoint of a distribution, or the point above and below which one-half of the scores or values are located. The formula for the median is a positional formula; it will tell you the position of the median in the distribution, not its value.

positive skew

The shape of a distribution that includes some extremely high scores or values. A distribution is said to have a positive skew if the tail of the distribution points toward the right.

nominal level of measurement

The simplest level of measurement; a system of measurement based on categories that are mutually exclusive and collectively exhaustive.

regression constants

The slope and Y-intercept of the regression line are called the

b term in the regression equation (Y' = a + bX )

The slope of the regression line; the change in Y that accompanies a unit change in X.

Variability

The spread of scores

standard error of the mean difference

The standard deviation of a sampling distribution of mean differences between scores reflected in two samples. The sampling distribution, in this case, would be the result of repeated sampling each time looking at two related samples, and focusing on the difference between the individual scores in each sample. The individual differences would be treated as forming a distribution, and that distribution has a mean. The repeated samplings would result in repeated mean differences. The recording/plotting of those mean differences would constitute the sampling distribution. The standard error would be the standard deviation of the sampling distribution.

standard error of the mean

The standard deviation of a sampling distribution of sample means.

standard error of the difference of means

The standard deviation of a sampling distribution of the difference between two sample means. The sampling distribution, in this case, would be the result of repeated sampling—each time taking two samples, calculating the mean of each sample, calculating the difference between the means, and recording/plotting the differences. The standard error would be the standard deviation of the sampling distribution.

phi coefficient

The statistic that describes the strength of the relationship in 2 x 2 chi square design

contingency coefficient

The statistic that describes the strength of the relationship in chi square that is not a 2 x 2 design

Standard error of estimate

The subscript indicates that this is the standard error of predictions for 𝑌 when we know the value of 𝑋

between-groups sum of squares

The sum of the squared deviation of each sample mean from the grand mean, weighted by the number of cases in each sample, and summed across all samples.

within-groups sum of squares

The sum of the squared deviations of each score from its sample mean, summed across all samples.

f-ratio

The test statistic for analysis of variance; it compares the differences (variance) between groups with the differences (variance) that are expected within groups, i.e. (systematic variance + error variance) / error variance

-1 to 1

The value of r has a range from ________ to ________.

coefficient of determination

The value of r^2; it is the amount of variation in one variable (Y) that is attributable to variation in another variable (X)

correlation coefficient

The value of r; a measure of the strength and direction of an association between two interval/ratio level variables. The value of r can range from -1.0 to +1.0.
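
A minimal sketch of Pearson's r computed from deviation scores, with r^2 as the proportion of variance accounted for. The function name `pearson_r` and the data are illustrative assumptions.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Sum of cross-products divided by the square root of (SSx * SSy)."""
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of cross-products
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))       # 0.775
print(round(r ** 2, 2))  # 0.6
```

Swapping the arguments, `pearson_r(ys, xs)`, gives the same value, which is why it does not matter which variable is called X and which is called Y when computing r.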

dependent variable

The variable that's presumed to be influenced by another variable.

independent variable

The variable that's presumed to influence another variable.

mean square within

The within-groups estimate of variance; calculated by dividing the within-groups sum of squares by the within-groups degrees of freedom.

In multiple comparison procedures, post hoc tests are completed after the ANOVA. Why are post hoc tests preferred over running several t tests?

They decrease the probability of a *Type I* / Type ONE error

What was the purpose of the Hobbs committee? What five general principles did they develop?

They developed ethics (standards governing conduct) for the research and process of psychology. - From 1948-1953 the Hobbs committee developed 5 general principles of ethical conduct for practitioners and researchers. 1) Beneficence and nonmaleficence. You should strive to do good AND strive to not do ill. 2) Fidelity and Responsibility- researchers need to behave professionally. 3) Integrity- researchers must be scrupulously honest in all aspects of research 4) Justice- treat people well, and know enough about what you're studying that you do so 5) Respect for people's rights and dignity. This needs to be applied both during the experiment, and afterwards with the data.

What is the Nuremberg Code? What stipulations does it include?

This was developed after the Nuremberg trials (post-WWII) to require experiments to be conducted in an ethical manner; it set a moral standard for medical and psychological experiments. Stipulations: informed consent (participants must know enough that they can withdraw if they want to); right to withdraw; avoidance of harm; qualified investigators; based on previous knowledge that justifies the experiment; favorable risk/benefit ratio

Changing a score in a set of scores changes the mean

True

In a matched samples design (the test involving the mean difference) the number of cases in each sample must be equal. True or false?

True

True

True or False: The regression line is influenced by extreme scores in a scatterplot.

A _________ tailed test scenario is appropriate when the alternative or research hypothesis is non-directional in nature

Two-tailed

What is counterbalancing and when/why do we use it?

Use of more than one sequence of conditions to control for carryover effects and "nuisance variability" -> due to extraneous variables Complete counterbalancing: every possible sequence used at least once (x!) Partial/incomplete counterbalancing: subset of possible sequence used (sample from complete set)
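
The difference between complete and partial counterbalancing can be sketched with the standard library. The condition labels and the choice of three sampled sequences are hypothetical.

```python
from itertools import permutations
from math import factorial
from random import Random

conditions = ["A", "B", "C"]

# Complete counterbalancing: every possible sequence, x! in total.
all_orders = list(permutations(conditions))
print(len(all_orders), factorial(len(conditions)))  # 6 6

# Partial counterbalancing: a random subset of the complete set.
rng = Random(0)  # seeded only so the sketch is repeatable
partial_orders = rng.sample(all_orders, 3)
print(len(partial_orders))  # 3
```

With 3 conditions there are only 6 sequences, but with 6 conditions there are 720, which is why partial counterbalancing is often the practical choice.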

descriptive statistics

Used to summarize or describe data from samples and populations. Ex: the calculation of the mean, range

Research Methods

Using Statistics (and other tools) to do Psychology

What is systematic variance?

Variance between groups due to the effects of the independent variable

Y,X

We are using our regression equation to make predictions of ___ when we have values of ___

pooled variance

Weighted average of the sample variances in a two-sample t-test (S^2 pool)
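
A small sketch of the weighting, assuming each sample variance is the N - 1 estimate and is weighted by its degrees of freedom. The function name `pooled_variance` and the samples are illustrative.

```python
from statistics import variance


def pooled_variance(sample1, sample2):
    """Average of the two estimated variances, weighted by df = n - 1."""
    n1, n2 = len(sample1), len(sample2)
    return ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)


sample1 = [1, 2, 3]     # variance = 1
sample2 = [2, 4, 6, 8]  # variance = 20/3
print(round(pooled_variance(sample1, sample2), 2))  # 4.4
```

The larger sample pulls the pooled value toward its own variance, which is the point of weighting rather than taking a simple average.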

Dependent Variable

What we measure, depends on manipulation

What do you need to include when submitting to the IRB?

When submitting to the IRB, you need to include: -rationale for your study -detailed procedures + materials (the more info the better) -explanation of risks/benefits/what could happen -don't include jokes -copy of informed consent (what you WILL give subjects, not signed copies)

interaction effect

When the relationship between one factor and the dependent scores depends on the level of the other factor that is present

correlation coefficient

When we compute a correlation, we calculate a value called a

Outside

When you compute the regression of Y on X, you should not use it to make predictions for value of X that are ________ the range of X values that you used in performing the regression analysis.

Criterion validity (2 types)

Whether the measure can accurately predict another measure of behavior AND produces similar results to existing measures (related to an existing criterion) -> predictive (can it predict the results of construct) and concurrent (measures in existence)

Sample mean

X BAR

Predictor

X in regression, the variable used to predict the criterion variable

predictor

X=

Criterion

Y in regression, variable being predicted from the predictor variable (e.g. college grades are predicted from SAT scores)

Y predicted

Y'=

criterion

Y=

How can you create equivalent groups in a between-subjects design?

You can create equivalent groups in a between-subjects design by using blocked random assignment (ensuring one randomly assigned participant in each group before adding a new participant) or matching (grouping together participants on a relevant subject variable and then randomly assigning them to groups)

What's the difference between multiple IVs and multiple levels of one IV?

You can have only one IV, but you need to have at least 2 levels of IV: control group and experimental group

Cross-product

You multiply the x and the y together

when to reject z-test

You reject the null hypothesis if the absolute value of the z-score is large, which means that the p-value is small. If you reject a hypothesis at the 5% significance level, p < .05, hence you will also reject that hypothesis at the 10% significance level. If you fail to reject a hypothesis at the 5% significance level, p > .05, hence you will also fail to reject that hypothesis at the 1% significance level.

non-linear

You should not perform a linear regression for data that clearly show a ________ relationship.

Sum of squares

You take each deviation (x - x̄), square it, and add up the squared deviations; abbreviated SS
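
As a worked illustration, the sketch below computes SS and then the two variances built from it: the sample variance (SS/N) and the estimated population variance (SS/(N-1)). The function name and scores are made up.

```python
def sum_of_squares(scores):
    """Sum of the squared deviations of the scores around their mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean = 5
ss = sum_of_squares(scores)
n = len(scores)
print(ss)                      # 32.0
print(ss / n)                  # 4.0, the sample variance
print(round(ss / (n - 1), 3))  # 4.571, the estimated population variance
```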

Factorial Notation

[# of levels of IV 1] x [# of levels of IV 2]; the # of terms indicates the # of IVs; 2x2 = most common

factorial design

a design in which ALL levels of one factor are *combined* with ALL levels of the other factor

frequency distribution

a distribution showing the number of times each score occurs in the data

A researcher evaluated the effectiveness of three types of therapy in producing self-reliance. She also was interested in whether the types of therapy affected men and women differently. Volunteers were randomly assigned to a type of therapy. After the therapy was completed, each volunteer completed a questionnaire measuring self-reliance. What type of analysis should you run if you are interested in whether there are mean differences in self-reliance?

a factorial ANOVA

Correlational method

a form of research that includes "quasi-experimental" designs such as survey research or naturalistic observations, in which different groups are compared, but cause and effect between variables cannot be determined. These are different from True Experimental Designs because there is no control condition, nothing is manipulated, and there are many differences between the groups other than the independent variable(s).

Variable

a general characteristic we can measure for each person

two-way ANOVA

a hypothesis test that includes two nominal independent variables, regardless of their numbers of levels, and a scale dependent variable

Constant

a quantity that does not change its value within a certain context

normal distribution

a set of scores in which the MIDDLE score has the highest frequency and, proceeding toward higher or lower scores, the frequencies at first decrease slightly and then decrease drastically /// If a data distribution is approximately normal, then about 68% of data values are within *one* standard deviation of the mean, about 95% are within *two* standard deviations, and about 99.7% lie within *three* standard deviations
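
The 68 / 95 / 99.7 figures can be checked against the standard normal curve with Python's standard library. The helper name `proportion_within` is an assumption for this sketch.

```python
from statistics import NormalDist


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```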

how to designate designs in a two-way ANOVA

a specific design is described using the number of levels in each factor. If, for example, factor A has TWO levels and factor B has TWO levels, we have a "two-by-two"/2 x 2 ANOVA. each factor can involve any number of levels.

chi-square test of independence

a test to determine whether there is an association between two categorical variables

Statistics

a tool used by Psychologists ... the most important tool

Independent Variable

a variable (often denoted by x ) whose variation does not depend on that of another.

Dependent Variable

a variable (often denoted by y ) whose value depends on that of another.

If you find significance for a χ2 test of independence, you should compute an effect size measure. If you have a 2 x 2 design, you compute the ____(a)_____ coefficient; if you have any other design, you compute the _____(b)______ coefficient.

a- phi b- contingency
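
The cards name the two effect size measures but do not give their formulas. A minimal sketch, assuming the commonly taught textbook forms phi = sqrt(chi-square / N) and C = sqrt(chi-square / (chi-square + N)); the function names and the example numbers are hypothetical.

```python
import math

def phi_coefficient(chi_square, n):
    """Effect size for a 2 x 2 chi-square test of independence."""
    return math.sqrt(chi_square / n)

def contingency_coefficient(chi_square, n):
    """Effect size for designs larger than 2 x 2."""
    return math.sqrt(chi_square / (chi_square + n))

# Hypothetical result: observed chi-square of 4.0 with N = 100 participants
phi = phi_coefficient(4.0, 100)
c = contingency_coefficient(4.0, 100)
print(f"phi = {phi:.3f}, C = {c:.3f}")
```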

If you have a 2 x 2 design, you compute the ___(a)_____ coefficient WHEREAS if you have any other design than a 2 x 2 design, you compute the _____(b)_____ coefficient

a- phi b- contingency

power

ability of a test to reject a false null hypothesis, 1- probability of type II error

Describe three challenges to survey research and how researchers can overcome those challenges.

absence of control in naturalistic observation /// observer bias: use behavior checklists (operational definition of your construct) and multiple raters (all watching the same thing) /// participant reactivity: use unobtrusive measures

Order effects

effect of treatment position (e.g. first, third, last)

mean

all numbers added up, then divided by the number of total numbers (ex. 1, 2, 3 1+2+3= 6, divide by 3, =2)

positively skewed distribution

an asymmetrical distribution with low-frequency, extreme HIGH scores, but WITHOUT corresponding low-frequency, extreme LOW scores; its polygon has one PRONOUNCED tail OVER THE HIGH SCORES

negatively skewed distribution

an asymmetrical distribution with low-frequency, extreme LOW scores, but WITHOUT corresponding low-frequency, extreme HIGH scores; its polygon has one PRONOUNCED tail OVER THE LOW SCORES

estimated standard error of the mean

an estimate of the standard deviation of the sampling distribution of means

mean square

an estimated population variance

Program evaluation

applied research that attempts to assess the effectiveness and value of public policy or specially designed programs

The major difference between t tests and ANOVA is that t tests ______

are limited to two conditions

multimodal

has more than 2 modes

directional hypothesis

asserts change in a particular direction, increase or decrease

Factorial Design

at least 2 IVs, unlike one-factor design (1 IV)

r***

average amount for which the X & Y scores correspond; Pearson Correlation Coefficient /// use when you want to describe the linear relationship between two interval or ratio variables //// factors that affect r: 1) the amount of variability in the data 2) differences in the shapes of the 2 distributions 3) lack of linearity 4) the presence of 1 or more "outliers" 5) characteristics of the sample 6) measurement error

Population

based on parameters

sample

based on statistic

effect

change in measurement that is attributable to a treatment condition/stimulus of some sort

no difference in proportions observed in chi-squared

chi-squared will always be zero

frequency distribution (pt. 2/as opposed to other graphs)

comes in many different types -- simple, relative, cumulative, and cumulative percentage, etc.; table that displays the frequency of various outcomes in a sample

two-way ANOVA

compare group means when we have two independent variables

if you have any other design than a 2 x 2 design, you compute the ___________ coefficient

contingency

Partial correlation

the correlation between two variables after the influence of a third variable has been accounted for (statistically removed) - part of the original correlation may be due to the third variable

directionless

correlation coefficient is

Predictive (Relational) Methods

correlational method, quasi-experimental method

Descriptive Statistics

describe or summarize a set of data. Measures of central tendency and measures of dispersion are the two types

descriptive vs. inferential statistics

descriptive: procedures for organizing and summarizing SAMPLE data - answers from such data are often a single number that describes important info about the scores /// inferential: procedures for drawing inferences about the scores and relationship that would be found in a POPULATION - help us to decide whether or not our sample accurately represents the relationship

complex designs

designs that have more than one independent variable, aka factorial

Case Studies

detailed description and analysis of a single individual - typically in narrative form (qualitative study) - common in clinical work -> individual may illustrate factors that influence a disorder and treatment methods - weaknesses: generalizability, memory

main effect

differing means of the same independent variable on a graph illustrate this

one-tailed test scenario

directional, looking for an extreme value in one specified direction; the level of significance is focused on only one end (tail), depending on the direction

standard error of the estimate

directly measures the accuracy of predictions; somewhat like the 'average' difference between the actual Y scores that participants would obtain and the Y' scores we predict for them

Construct validity

does a test adequately and accurately measure the theoretical construct that it claims to measure? - can't be established or destroyed with single study - slowly built over time and many supporting studies

Probability sampling (pros and cons)

each member of the population has definable probability of being selected for the sample

Random assignment vs. random selection

each participant gets an equal chance to be in a group /// random selection: for the sample /// random assignment: for the condition

What kind of studies are exempted by the IRB?

educational studies, naturalistic observation, non-sensitive survey research (sensitive= eliciting negative emotions), working with a pre-existing data set

Sequence effects

effect from preceding treatment

Blocked random assignment

ensure one randomly assigned participant in each group

one-sample z-test

estimate the mean of a population and compare it to a target or reference value when you know the standard deviation of the population
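
A minimal sketch of the arithmetic behind a one-sample z-test, assuming the usual formulas: standard error of the mean = population SD / sqrt(N), and z = (sample mean - population mean) / standard error. The function name and all numbers are hypothetical.

```python
import math

def one_sample_z(sample_mean, population_mean, population_sd, n):
    """Return (standard error of the mean, obtained z)."""
    standard_error = population_sd / math.sqrt(n)
    z_obtained = (sample_mean - population_mean) / standard_error
    return standard_error, z_obtained

# Hypothetical data: population mean 100, population SD 15, sample of 25 with mean 105
sem, z = one_sample_z(105, 100, 15, 25)
print(f"standard error = {sem:.2f}, z = {z:.2f}")
```

With a two-tailed test at alpha = .05, the obtained z would be compared against a critical value of plus or minus 1.96.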

Participant Observation (characteristics, when you want to use it and disadvantages)

experimenter joins group being observed (Festinger's cult study, cognitive dissonance) d: deception, no control, observer bias (behavior checklist or what you define as something, multiple raters), consent and privacy issues

complex designs/factorial designs

experiments that involve 2 or more independent variables studied simultaneously

between groups variance

expression of amount of deviation of sample means from the grand mean

within groups variance

expression of amount of deviation of sample scores from sample means

Large

If the standard error of estimate is ____, the points in the scatterplot are far from the regression line and our predictions will not be so good.

What kind of studies are expedited by the IRB?

faster than full review, anything involving deception/manipulation of participants

error variance

general variance among people

You should reject the null hypothesis if your observed value of chi-square is ________________ than the critical value of chi-square.

greater

range*

the greatest value minus the least value / difference between greatest and least data values (ex. 1, 2, 3, 4, 5, 6 range is 6-1, therefore range is 5)

grand (sample) mean

group (sample) mean The mean of an individual sample in an ANOVA application.

Matching (in a between-subjects design)

grouped together on relevant subject variable and then randomly assigned to group

Bimodal

has 2 modes

multimodal

has more than two modes

pie chart (as opposed to other graphs)

has no axes and should be used with categorical data; circular statistical graphic which is divided into 'slices' to illustrate numerical proportion

unimodal

has one mode

Bimodal

has two modes

Characteristics

height, heart rate, favorite ice cream flavor, level of introversion, etc.

Example of Variable

height, introversion, vertical jump, IQ, ...

What is counterbalancing for?

helps counter and avoid order or sequence effects

behaviors

how high I can jump, how fast I can run, etc.

z-score

how many standard deviations away from the mean
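
As a quick illustration, a z-score is the raw score minus the mean, divided by the standard deviation. The numbers below are hypothetical.

```python
def z_score(score, mean, standard_deviation):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / standard_deviation

# Hypothetical test: mean 70, standard deviation 10
z_high = z_score(85, 70, 10)   # 1.5 standard deviations above the mean
z_low = z_score(60, 70, 10)    # 1 standard deviation below the mean
print(z_high, z_low)
```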

perform a linear regression

if data falls on a straight line you would..

reject the null

if the exact p-value for our observed value of t is less than alpha (.05), our observed value of t must be more extreme than the critical value so, we should ___________.

The first 10 students who arrived for a Thursday lecture filled out a questionnaire on their attitudes toward the instructor. The first 10 who were late for lecture were spotted and afterward filled out the same questionnaire. The appropriate design for testing the significance of the difference between the means is ______

independent samples t test

Experimental Method

independent variable, dependent variable, experimental group, control group, random assignment

nonparametric statistics

inferential procedures that *don't* require assumptions about the raw score population represented by the sample, used with the MEDIAN and MODE

parametric statistics

inferential procedures that require certain assumptions about the raw score population represented by the sample; used to compute the MEAN

Data

information (e.g., measurements)

no parallel lines

interaction

A

intercept, tells us the point at which the line crosses the Y axis

Linear regression is used to predict _______ from _______

it's used to predict unknown y scores from known x scores

non-directional hypothesis

just says a difference will be present, no specificity

The formula for the number of degrees of freedom for the between-groups estimate of variance is ________, where k equals the number of groups or samples under consideration

k-1

dfb

k-1, where k is the number of categories or samples in the study

The *median* is __________ sensitive to extreme scores than the mean.

less
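
A small sketch of why the median is less sensitive to extreme scores, using Python's standard `statistics` module. The scores reuse the median example from these cards (1, 3, 5, 6, 7); the outlier of 70 is a made-up value.

```python
import statistics

scores = [1, 3, 5, 6, 7]
with_outlier = [1, 3, 5, 6, 70]   # same scores, but the highest is replaced by an extreme value

mean_before = statistics.mean(scores)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(scores)
median_after = statistics.median(with_outlier)

# The mean jumps from 4.4 to 17, while the median stays at 5
print(mean_before, mean_after, median_before, median_after)
```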

In a 2x3x5 research design, the number 3 tells you the number of ________________ of one of the independent variables.

levels

When a *straight* line accurately describes the relationship between two variables, the relationship is called __________.

linear

difference between linear and nonlinear relationships

linear: straight line; y scores only move in one direction /// nonlinear/curvilinear: y scores change direction as x scores increase

In linear regression, if b is negative, higher values of X are associated with ______

lower values of Y
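
A minimal sketch of fitting a regression line Y' = A + bX, assuming the usual computational formulas b = (N·ΣXY - ΣX·ΣY) / (N·ΣX² - (ΣX)²) and A = mean of Y - b · mean of X. The data are made up so that the slope comes out negative; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (intercept A, slope b) of the least-squares regression line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * (sum_x / n)
    return a, b

# Hypothetical data with a perfectly linear, negative relationship
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 6, 4, 2]
a, b = fit_line(xs, ys)

# Because b is negative, a higher X predicts a lower Y
predicted_low_x = a + b * 2
predicted_high_x = a + b * 4
print(a, b, predicted_low_x, predicted_high_x)
```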

Ordinal

magnitude only

Ratio

magnitude, equal intervals, absolute zero

properties of measurement scales

magnitude, equal intervals, absolute zero

interval

magnitude, equal intervals, no absolute zero

difference in average tables

main reaction

In program evaluation and applied behavior analysis, what other factors may have led to similar results?

maturation, history

group mean

mean of each sample individually

data point

measurement for a particular person (e.g., Joe's height)

The amount of spread in scores is associated with what?

measures of variability

median

middle number when ranked lowest to highest (ex. 7, 5, 3, 6, 1 rearrange = 1, 3, 5, 6, 7 median is 5)

Formative Evaluation

monitoring of program in progress: determines if program is being implemented as planned, provides data on how program is being used, used as a pilot study to determine if program should be expanded

The *mean* is __________ sensitive to extreme scores than the median.

more

mode

most frequently occurring score in a set of scores, useful as a measure of central tendency for a nominal variable (ex: most common car color)

mode

most often to occur (ex. 1, 2, 2, 3, 4 mode is 2)

negative skew

mu < median The shape of a distribution that includes some extremely low scores or values. A distribution is said to have a negative skew if the tail of the distribution points toward the left.

ΣXY

multiply EACH INDIVIDUAL X SCORE times its corresponding Y SCORE and then add all sums together

To find the expected value for a cell in the χ2 test of independence, you must multiply the ________ for the cell and divide by _________.

multiply the *marginals* and divide by *N*
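
The expected-frequency rule (row total x column total, divided by N) and the resulting chi-square can be sketched as follows. The 2 x 2 table of counts is hypothetical, and the chi-square sum uses the standard formula Σ (observed - expected)² / expected.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical counts: rows = young / old drivers, columns = stopped / did not stop
observed = [[30, 10],
            [20, 40]]
expected = expected_frequencies(observed)
chi_sq = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (r-1) x (c-1)
print(expected, round(chi_sq, 3), df)
```

The obtained chi-square would then be compared with the critical value for the given degrees of freedom.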

dfw

n total - k, where n total is the total number of cases in the study and k is the number of groups

The formula for the number of degrees of freedom for the within-groups estimate of variance is _______, where n equals the total number of cases under consideration

n-k
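
Putting the two degrees-of-freedom formulas (k-1 and n-k) to work, here is a hedged sketch of a one-way ANOVA computed by hand: each mean square is a sum of squares divided by its degrees of freedom, and F is MS between divided by MS within. The three groups of scores are made up, and `one_way_anova` is a hypothetical helper.

```python
def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA summary table as a dict."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups: deviation of each group mean from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups: deviation of each score from its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "ms_between": ms_between, "ms_within": ms_within,
            "f": ms_between / ms_within}

# Hypothetical scores for three conditions
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```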

In the independent samples t test, what equation is used to compute degrees of freedom?

n1 + n2 - 2

Nominal scale

names (e.g. teams)

chi-squared can never be

negative

Can standard deviation be negative?

no

Can variance be negative?

no

parallel lines

no interaction

You would perform a chi-square test of independence when your data are measured on a ________________ scale of measurement.

nominal

Nominal

none of the three properties

When a *curved* line accurately describes the relationship between two variables, the relationship is called __________.

nonlinear; curvilinear

Ratio scale

numbers (e.g. GPA)

Descriptive Methods

observational methods (naturalistic observation), case study method, survey method (sample, population)

Naturalistic Observation (characteristics, when you want to use it and disadvantages)

observer is unobtrusive, habituation assumed (e.g. Goodall's animal studies) d: no control, observer bias (behavior checklist or what you define as something, multiple raters), consent and privacy issues

statistical interaction

occurs when the effect of one independent variable on the dependent variable changes depending on the *level* of another independent variable

Ordinal scale

ordered stuff (e.g. ranking)

Summative Evaluation

overall assessment of program effectiveness

grand mean

overall mean of multiple samples

Fatigue effects

performance declines due to tiredness (or boredom)

Practice Effects

performance improves with practice

If you have a 2 x 2 design, you compute the _________ coefficient

phi

samples vs. populations

population: the ENTIRE group of individuals to which a law of nature applies /// sample: a relatively small subset OF a population

A newspaper headline writer found that the more adjectives she put in the titles of her articles, the greater the number of newspapers that were sold that day. The relationship between numbers of adjectives and newspaper sales must be ________

positive

When you compute an observed value of chi-square, the sign will always be ________________.

positive

Tukey's honestly significant difference (HSD)

post-hoc test; compute all pair-wise comparisons, then insert each pair of means into the equation Q = |Xbar1 - Xbar2| / √(MSw/n)
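
The Q formula from this card can be sketched directly. The group means, MS within, and per-group n below are hypothetical, and the critical value of Q still has to be looked up in a studentized range table (not computed here).

```python
import math

def tukey_q(mean_1, mean_2, ms_within, n_per_group):
    """Q = |mean_1 - mean_2| / sqrt(MS_within / n), assuming equal n per group."""
    return abs(mean_1 - mean_2) / math.sqrt(ms_within / n_per_group)

# Hypothetical values: two group means, MS within = 1.0, n = 3 per group
q = tukey_q(2.0, 6.0, 1.0, 3)
print(f"Q = {q:.3f}")
```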

When you compute an observed value of chi-square, the sign will always be ________________.

positive

With all other factors held constant, as the effect of the independent variable decreases, power will __________ and the probability of a Type II error will __________.

power will decrease; probability of Type II error will increase

power

probability that we will detect a relationship and correctly reject a false null hypothesis; the probability of avoiding a type II error

decision rule

procedure that the researcher uses to decide whether to accept or reject the null hypothesis. ex. a researcher might hypothesize that a population mean is equal to 10. He/she might collect a random sample of observations to test this hypothesis.

inferential statistics

procedures for determining whether sample data accurately represent the relationship in the population

post hoc comparisons

procedures used to compare all pairs of means in a significant factor to determine which means differ significantly from each other; USE when Fobt is significant

Measurement

process of assigning numbers or labels to observations

In your research, you observe vehicles as they approach a stop sign. For each vehicle, you categorize the age of the driver as "young" or "old" and you record whether or not the driver comes to a complete stop. In your chi-square test of independence analyzing the data, the null hypothesis is that the _______________ of old and young drivers that come to a complete stop are equal.

proportions

The chi-square goodness-of-fit test compares an observed distribution of observations to a distribution of expected values based on a known distribution of ________________ in the population.

proportions

ordinal example

ranking

type of measurement scales

ratio, interval, ordinal, nominal (noir)

Regression vs multiple regression

regression analysis allows use of variable in a correlational study to predict another

chi-squared obs greater than chi-squared crit

reject ho

type one error

reject ho when ho is true

What kind of studies are given full review by the IRB?

requires more board members to review: research involving children, the elderly, mentally disabled, the ill, sexual topics, past criminal history, inducing high levels of manipulation/stress and/or pain. Showing graphic video clips, etc.

two tailed test scenario

researcher is looking for extreme difference at either end, non-directional

Discriminant validity

scores on measure of some construct should not be related to tests that are theoretically unrelated (e.g. LOC, self-efficacy, self-confidence)

Convergent validity

scores on measure of some construct should relate to scores on other tests that are theoretically related to that construct

measurement scale

set of possible numbers or labels that you can assign to your observations

Small- N design: Multiple baseline design

several baseline measures are established, e.g. different individuals, same behavior /// one individual, different behaviors /// one individual, one behavior, different settings /// treatment is introduced at different times; behavior is measured before and after

nominal example

sex, nationality, category names

regression coefficients

slope and y-intercepts are

(ΣX)^2

squared SUM OF X (find the sum of X, then square that total)

(ΣY)^2

squared SUM OF Y (find the sum of Y, then square that total)

statistic vs. parameter (hint: ss, pp)

statistic: number that describes an aspect of the scores in a SAMPLE /// parameter: a number that describes an aspect of the scores in a POPULATION - obtained when applying INFERENTIAL procedures

measures of variability

statistics that summarize the extent to which scores in a distribution differ from one another

measures of central tendency

statistics that summarize the location of a distribution on a variable by indicating where the center of the distribution tends to be located

linear regression line

straight line that summarizes a linear relationship by passing through the center of the scatterplot

Stratified sampling vs cluster sampling

stratified: select some units from each subgroup - proportions of important subgroups are represented precisely - allows you to represent systematic differences in population - requires researcher to determine what subgroups are relevant to current question /// cluster sampling: select all units in a subset of subgroups - researcher randomly selects a cluster of people having some feature in common (ex: the same class in school)

Psychology

study of behavior

deviation score*

subtract the mean from each score (ex. 1, 2, 3 mean= 2 therefore; deviation score = -1, 0, 1)
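
Using the same scores as the card's example (1, 2, 3), this sketch shows the deviation scores, confirms that they sum to zero, and builds the sum of squares and the two variances from them (dividing by N for the sample variance and by N-1 for the estimated population variance).

```python
scores = [1, 2, 3]
n = len(scores)
mean = sum(scores) / n

# Deviation score: each score minus the mean
deviations = [x - mean for x in scores]

# Sum of squares (SS): the squared deviations added together
sum_of_squares = sum(d ** 2 for d in deviations)

sample_variance = sum_of_squares / n                    # divides by N
estimated_population_variance = sum_of_squares / (n - 1)  # divides by N - 1

print(deviations, sum(deviations), sum_of_squares)
print(sample_variance, estimated_population_variance)
```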

ΣX^2

sum of THE SQUARED X SCORES (square each individual x score and then add them all up)

ΣY^2

sum of THE SQUARED Y SCORES (square each individual y score and then add them all up)

ΣX

sum of X scores

(ΣX)(ΣY)

sum of X times the sum of Y

ΣY

sum of y scores
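
The summation terms in these cards (ΣX, ΣY, ΣXY, ΣX², ΣY², (ΣX)², (ΣY)²) are the ingredients of the computational formula for the Pearson correlation coefficient. A minimal sketch, assuming the standard form r = (N·ΣXY - ΣX·ΣY) / sqrt([N·ΣX² - (ΣX)²][N·ΣY² - (ΣY)²]); the data and the function name are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from the raw-score (computational) formula."""
    n = len(xs)
    sum_x = sum(xs)                               # ΣX
    sum_y = sum(ys)                               # ΣY
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # ΣXY
    sum_x2 = sum(x ** 2 for x in xs)              # ΣX^2 (square each score, then add)
    sum_y2 = sum(y ** 2 for y in ys)              # ΣY^2
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) *   # sum_x ** 2 is (ΣX)^2
                            (n * sum_y2 - sum_y ** 2))    # sum_y ** 2 is (ΣY)^2
    return numerator / denominator

# Hypothetical paired scores
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
r_squared = r ** 2   # proportion of variance accounted for
print(f"r = {r:.4f}, r^2 = {r_squared:.2f}")
```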

Your IV is sex -- comparing men to women. The appropriate statistical test will be what kind of test?

t test

In the independent samples t test with unequal sample sizes, the unbiased estimate of the population variance is found by __________

taking a *weighted average* of the sample variances
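
A minimal sketch of that weighted average (the pooled variance), assuming the usual formula in which each sample variance is weighted by its degrees of freedom: [(n1-1)·s1² + (n2-1)·s2²] / (n1 + n2 - 2). The sample sizes and variances are hypothetical.

```python
def pooled_variance(n1, var1, n2, var2):
    """Weighted average of two sample variances, weighted by degrees of freedom."""
    df = n1 + n2 - 2
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / df

# Hypothetical samples: n1 = 5 with variance 4, n2 = 10 with variance 10
pooled = pooled_variance(5, 4.0, 10, 10.0)

# The larger sample pulls the pooled value toward its own variance
print(f"pooled variance = {pooled:.3f}")
```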

Marginal means

tell the main effects, you want a diff of 10% - you can't interpret the main effects without patterns

Correlation

tells us about how variables are related and how strongly they are related.

r squared

tells us the proportion of total variability in 𝑌 that is related to variability in 𝑋.

interval example

temperature

In two-way ANOVA, a significant interaction occurs when _____

the *combined* effects of the two variables yield an unexpected effect

chi square procedure

the NONPARAMETRIC procedure for testing whether the frequencies of category membership in the sample represent the predicted frequencies in the population

cumulative frequency

the NUMBER of scores in the data that are at or below a particular score

percentile

the PERCENTAGE of all scores in the sample that are below a particular score

expected value

the average of each possible outcome of a future event, weighted by its probability of occurring
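
As an illustration, the expected value of a single roll of a fair six-sided die weights each face by its probability of 1/6. `Fraction` is used here only to keep the arithmetic exact.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum of each outcome's value multiplied by its probability."""
    return sum(value * probability for value, probability in outcomes)

# Fair six-sided die: faces 1 through 6, each with probability 1/6
die = [(face, Fraction(1, 6)) for face in range(1, 7)]
ev = expected_value(die)
print(ev, float(ev))
```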

Split half (internal) reliability

the consistency across similar questions in different parts of the same test (spread similar questions through)

Test-retest reliability

the consistency of ratings between the first time tested and the second time tested

Inter-rater reliability

the consistency of ratings between two raters

Parallel forms reliability

the consistency of ratings in comparison to a similar test (measuring same construct)

The sampling distribution of the mean is _____

the distribution of sample means over repeated samples

In two-way ANOVA, a main effect for variable A means that _____ for variable B

the effect of variable A is significant when averaged over all levels of variable B

Internal validity

the extent to which the experimenter can control extraneous variables that might otherwise affect the outcome of the experiment - truth of inferences about causality among variables

The nondirectional alternative hypothesis asserts that ______

the independent variable has an effect of some kind

interaction

the interplay that occurs when the effect of one factor depends on another factor

degrees of freedom

the number of scores in a sample that reflect the variability in the population; determine the shape of the sampling distribution when estimating σx/population standard deviation

z-score

the number of standard deviations from the mean a data point is

*ANOVA*

the parametric procedure for hypothesis testing in an experiment containing two+ conditions

two-way ANOVA

the parametric procedure performed when an experiment contains TWO INDEPENDENT variables

related samples t-test

the parametric procedure used for testing sample means from two related samples (related samples are created by matching each participant in one condition with a participant in the other condition OR by repeatedly measuring the same participants under all conditions)

one-sample t-test

the parametric procedure used in a one-sample experiment when the standard deviation of the raw score population is estimated

independent samples t-test

the parametric procedure used to test sample means from two INDEPENDENT samples

percentile rank

the percentage of scores in its frequency distribution that are equal to or lower than it. For example, a test score that is greater than or equal to 75% of the scores of people taking the test is said to be at the 75th percentile, where 75 is the percentile rank.
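
Following this card's definition (the percentage of scores equal to or lower than a given score), a percentile rank can be sketched like this. The list of test scores is made up.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are equal to or lower than the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical test scores
test_scores = [55, 60, 65, 70, 75, 80, 85, 90]
rank = percentile_rank(test_scores, 80)   # 6 of the 8 scores are at or below 80
print(rank)
```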

P value

the probability of obtaining that value of t or a more extreme value of t (if the null hypothesis is true)

relative frequency

the proportion of time that a score occurs in a distribution -- THE FORMULA FOR RELATIVE FREQUENCY IS f/N (how often something happens divided by all outcomes)
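
A short sketch of simple, relative (f/N), and cumulative frequency for a made-up set of scores, using `collections.Counter` from the standard library.

```python
from collections import Counter

scores = [1, 2, 2, 3, 3, 3]
n = len(scores)

# Simple frequency: how many times each score occurs
frequency = Counter(scores)

# Relative frequency: f / N
relative_frequency = {score: f / n for score, f in frequency.items()}

# Cumulative frequency: number of scores at or below each score
cumulative_frequency = {}
running_total = 0
for score in sorted(frequency):
    running_total += frequency[score]
    cumulative_frequency[score] = running_total

print(dict(frequency), relative_frequency, cumulative_frequency)
```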

standard error of the mean

the standard deviation of the sampling distr. of means

Tukey's HSD test

the type of post hoc procedure performed with ANOVA to compare means from a factor when all levels have equal 'n's.

Independent Variable and what are the three types of IVs?

the variable that you manipulate 1) situational IV: something about the environment (e.g. bystanders) 2) task IV: what the participants do / the materials they work with 3) instructional IV: the instructions you give to the participants

The primary reason we use a scatterplot to view data for correlation and linear regression is

to determine if the relationship is linear or nonlinear

What is the purpose of a Balanced Latin Square?

to make sure that every subject is exposed to every condition for within subjects design, also so that you don't have to use complete counterbalancing. -> every condition precedes and follows every other condition exactly once (controlling for sequence effects), every condition occurs equally often in every position (controlling for order effects)
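
One common way to build a balanced Latin square for an even number of conditions is sketched below; it is an illustration, not necessarily the construction used in the course. The first row follows the pattern 0, 1, n-1, 2, n-2, ... and each later row adds 1 (mod n). Conditions are numbered from 0, and "precedes" here means immediately precedes.

```python
def balanced_latin_square(n):
    """Return n orders of n conditions (0..n-1); this construction needs an even n."""
    if n % 2 != 0:
        raise ValueError("this construction only works for an even number of conditions")
    first_row = [0]
    for j in range(1, n):
        # Alternate between counting up from 1 and counting down from n - 1
        first_row.append((j + 1) // 2 if j % 2 == 1 else n - j // 2)
    # Each later row shifts every condition in the first row by 1, 2, ... (mod n)
    return [[(c + shift) % n for c in first_row] for shift in range(n)]

square = balanced_latin_square(4)
for order in square:
    print(order)   # one row = the condition order for one participant (or group)
```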

equation

total variability in Y = variability in Y that is associated with changes in X + error

Fisher's LSD test is most useful when you have _____ (number of means)

two or three means

stem & leaf plot

type of GRAPH used with CONTINUOUS data; allows one to see exact value of the data; each data value is split into a 'leaf' (usually the last digit) and the 'stem' (the other digits)

Ordinal Scale (Scales of Measurement)

type of measurement scale in which scores indicate rank order (ex. '1' assigned to best student, '2' to second-place student, and so on) - indicate only a relative amount of who scored high or low -- NO score of ZERO

Ratio Scale (Scales of Measurement)

type of measurement scale in which scores measure actual amounts, but zero DOES mean nothing is present and negative scores ARE NOT possible (ex. dollars)

Interval Scales (Scales of Measurement)

type of measurement scale in which scores measure actual amounts, but zero doesn't mean zero amount is present, so NEGATIVE numbers ARE possible -- an equal amount separates any adjacent scores

Nominal Scale (Scales of Measurement)

type of measurement scale that DOESN'T require an amount, each score is used for IDENTIFICATION/CLASSIFICATION -- measures qualitatively

Central Tendency

typical value for a probability distribution. It may also be called a center or location of the distribution. ... The most common measures of are the arithmetic mean, the median and the mode.

line graph

typically used when graphing experimental results with 1+ independent variable; used to show how values change

Non-probability sampling

use a convenience sample - which is a group of people who meet general requirements of study and recruited in a variety of nonrandom ways (ex: subject pool)

one-way ANOVA

use to analyze data from an experiment in which there is one independent variable

box plot/box and whisker plot

used to display patterns of QUANTITATIVE data; two horizontal lines (whiskers) extend from both the front and the back of the box

Inferential Statistics

used to make generalizations from a sample to a population.

scatterplot

used to see the relationship between two QUANTITATIVE variables; each dot on it represents one observation from a data set

bar graph (as opposed to other graphs)

used with CATEGORICAL data; used with graphing frequencies or means of the categories; chart that uses UN-TOUCHING bars to show comparisons between categories of data

histogram (as opposed to other graphs)

used with CONTINUOUS data, used for graphing frequencies; plot used to discover/show underlying frequency distribution; uses TOUCHING bars to show continuity

systematic variance

variance that is due to the effects of the independent variable

What is a major limitation of a two-group design?

very often two groups are insufficient for a clear interpretation

Type II error

we retain (fail to reject) the null hypothesis when it is false; (the probability of avoiding a type II error is known as POWER)

Can we eliminate carryover effects? explain.

we cannot eliminate carryover effects, but we can control for them.

Type I error

we reject the null hypothesis when it is true

perform a chi-squared test of independence

when dependent variable is measured on a nominal scale

one-way ANOVA

when one INDEPENDENT VARIABLE is tested, this is used to determine whether there are any statistically significant differences between the means of two or more independent (unrelated) groups (although you tend to only see it used when there are a minimum of three, rather than two groups).

repeated-measures design

when the same participants are measured under all conditions of the independent variable

Do a linear regression when

when there appears to be a linear relationship between the two variables of interest

interaction

when the effect of one independent variable depends on the level of the other (parallel lines usually mean there is no interaction)

p value, t

when we do a t-test on a computer, it gives us the exact ______ for our observed value of _____

The advantage of a powerful experiment is that

you are more likely to detect the real effects of the independent variable, if there are any

__________ and __________ are effect size measures used with one-way ANOVA.

η2; ω2
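
The card names the two measures without formulas. A hedged sketch, assuming the commonly taught forms eta squared = SS between / SS total and omega squared = (SS between - df between · MS within) / (SS total + MS within); the input values are hypothetical.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variability associated with the independent variable."""
    return ss_between / ss_total

def omega_squared(ss_between, ss_total, df_between, ms_within):
    """A more conservative effect size estimate than eta squared."""
    return (ss_between - df_between * ms_within) / (ss_total + ms_within)

# Hypothetical ANOVA results: SS between = 26, SS within = 6, df between = 2, MS within = 1
ss_between, ss_within = 26.0, 6.0
ss_total = ss_between + ss_within
eta_sq = eta_squared(ss_between, ss_total)
omega_sq = omega_squared(ss_between, ss_total, 2, 1.0)
print(f"eta squared = {eta_sq:.4f}, omega squared = {omega_sq:.4f}")
```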

Population mean

μ

