PSY:2812 (Research Methods and Data Analysis in Psych II) Exam #1

What are the consequences of within-subject experimental studies?

1. "Order effects" are a major threat to internal validity 2. Practice/fatigue effects 3. Carry-over effects (proactive interference) 4. Counter-balance (order) is the default, but concurrent conditions should be done if possible

How do point estimation and confidence interval work together?

1. 95% of the area of the normal distribution is within 1.96 σ of the mean, μ; X̄ is our estimate of μᵪ, and sx̅ is our estimate of the standard deviation of the sample means, σᵪ/√N 2. So, if sx̅ is accurate (df is very high), there is a 95% chance that μᵪ is between X̄ - 1.96 sx̅ and X̄ + 1.96 sx̅, where X̄ - 1.96 sx̅ is the lower bound (for μᵪ) and X̄ + 1.96 sx̅ is the upper bound (for μᵪ) 3. The above only works when sx̅ is known to be perfectly accurate, which requires an infinite sample 4. When we are not sure about sx̅, we flatten the normal, which gives the t-distribution 5. How much we flatten the normal depends on the degrees of freedom: Lower bound = X̄ - tcrit sx̅, Upper bound = X̄ + tcrit sx̅ 6. tcrit depends on df, but is usually a bit more than 2.000
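
As a rough illustration of the bounds above, here is a minimal Python sketch. The scores are made up, and the critical value 2.262 (two-tailed, 95%, df = 9) is copied from a standard t table rather than computed, so treat both as assumptions.

```python
import math
import statistics

# Hypothetical sample of N = 10 scores (illustrative values only).
scores = [98, 102, 95, 110, 105, 99, 101, 97, 104, 109]

n = len(scores)
mean = statistics.mean(scores)      # best guess for the population mean
sd = statistics.stdev(scores)       # sample SD (N - 1 in the denominator)
se = sd / math.sqrt(n)              # standard error of the mean

# z-based critical value: only appropriate when df is very high.
z_crit = statistics.NormalDist().inv_cdf(0.975)   # about 1.96

# t-based critical value for df = N - 1 = 9, taken from a t table (assumption).
t_crit = 2.262

lower = mean - t_crit * se
upper = mean + t_crit * se
print(f"{lower:.2f} < mu < {upper:.2f}")
```

Because t_crit is larger than z_crit, the t-based interval is wider, which reflects the extra uncertainty about sx̅ in a small sample.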

What are the trade-offs in correlational studies?

1. At least two dependent variables 2. Easier to have high external validity

What are the consequences of between-subject and within-subject experimental studies?

1. Between-subjects: Different subjects in different conditions 2. Within-subjects: All subjects in all conditions

What are the tradeoffs of between-subject and within-subject experimental studies?

1. Between-subjects: Easier to have high construct validity 2. Within-subjects: Easier to have high stats conclusion validity

What are the threats to a pre-test/post-test, two-group design?

1. Biased attrition effects 2. Regression effects These threats are not addressed by adding a control group

What are the four validities?

1. Construct Validity (extent to which the test or measure accurately assesses what it is supposed to) 2. Statistical (Conclusion) Validity (extent to which statistical inferences about the population, based on a sample, are accurate) 3. External Validity (extent to which findings can be generalized to other situations, people, settings, and measures) 4. Internal Validity (extent to which the causal relationship being tested is not influenced by other factors or variables) 5. Ethics, practicality, and efficiency are also very important

What is part of step #2 in point estimation?

1. Convert to best guess and estimate of error 2. Best guess for population mean, μᵪ, is the sample mean, X̄ 3. Standard (single-value) estimate of the error is the "standard error (of the mean)" when using the sample mean to estimate the population mean Symbol: sₓ̄ sₓ̄ = sᵪ/√N → most important formula in I-stats

What are the three criteria for causality?

1. Covariation (association) 2. Temporal precedence 3. Internal validity

What is the "textbook" sequence for causality?

1. Covariation is usually observed before a causal claim is even proposed (this criterion is usually already met) 2. Temporal precedence is needed to show that the cause comes before the effect 3. Internal validity is needed to rule out alternative explanations

What is the difference between descriptive and inferential statistics?

1. Descriptive statistics summarize a set of data (rarely wrong unless a mathematical error has occurred) Ex: "The mean of the sample is 12.45" 2. Inferential statistics go beyond the data in hand to make a statement about the sampling population (almost always wrong) Ex: "Based on these data, my guess for the true (population) mean is 12.45"

What is said about empirical sciences and data?

1. Empirical sciences use both data and logic to test theories 2. A rational science only uses logic 3. It is relatively easy to summarize data 4. The logical rules can rarely be summarized simply Ex: "A correlation does not imply causation" is true, but highly misleading

What are the three types of claims?

1. Frequency 2. Association 3. Causal

What are the steps of point estimation?

1. Get the mean and standard deviation of the sample (D-stats) 2. Convert to best guess and estimate of error (I-stats) 3. Express findings in simplified/standard format

What does a "clean" experiment have?

1. Has a dependent variable, so there is some data to be analyzed 2. Has an independent variable manipulated/set by the experimenter 3. Has no other systematic differences between the conditions 4. All extraneous variables are held constant or are equal-on-average

How can a third variable or confound be tested in correlational studies?

1. Hold the third variable constant and re-test the original correlation (not used in correlational studies) 2. Remove the influence of the third variable from the correlation via partial regression and re-test the original correlation (used in correlational studies)

What is the purpose of the p-value in hypothesis testing?

1. H₀ is rejected when the p-value is very low (and the "p" in "p-value" is short for probability), but a p-value is not the probability that H₀ is true 2. It is the probability of getting the (sample) data, or data at least as extreme, on the assumption that H₀ is true 3. With a (fully specified) H₀ in hand and a selected sample size, one can calculate the probability (or likelihood) for all possible outcomes in advance 4. This is called the "sampling distribution"

What are the different possible outcomes with hypothesis testing?

1. H₀ is true (reality = no relationship), retain H₀ (conclusion = no relationship) → correct 2. H₀ is false (reality = relationship), retain H₀ (conclusion = no relationship) → Type II error "miss", "beta lack of power" 3. H₀ is true (reality = no relationship), reject H₀ (conclusion = relationship) → Type I error "false alarm", "alpha risk" 4. H₀ is false (reality = relationship), reject H₀ (conclusion = relationship) → correct "hit" Type I and Type II errors are threats to statistical validity
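
The four outcomes above can be written as a small lookup. This is only a sketch; the function name and the returned labels are chosen here for illustration.

```python
def classify_outcome(h0_is_true, h0_rejected):
    """Label one of the four possible outcomes of a hypothesis test."""
    if h0_is_true and h0_rejected:
        return "Type I error (false alarm)"
    if not h0_is_true and not h0_rejected:
        return "Type II error (miss)"
    if not h0_is_true and h0_rejected:
        return "correct (hit)"
    return "correct"  # H0 is true and was retained


# Example: a real relationship exists, but H0 was retained.
print(classify_outcome(h0_is_true=False, h0_rejected=False))
```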

What can go wrong during a cross-lagged analysis?

1. If any preliminary test is nonsignificant, then the data cannot be used (no answer, either way, to the directionality question) 2. If neither cross-lag is significant, then theory is not supported (either wrong or not enough subjects were run) 3. If both cross-lags are significant and very similar in magnitude, then theory is not supported (may be half-correct, mutual, bi-directional causation, or completely wrong if due to a third variable)

What is the logic behind the cross-lag?

1. If either variable (in a pair) is no good (P), then the correlation will not be significant (Q) 2. The correlation is significant (¬Q); therefore, neither variable (in the pair) is no good (∴ ¬P) 3. If all variables in a panel study exhibit at least one significant correlation, then all of the data are good

How are confounds dealt with when trying to establish internal validity?

1. If you believe that X causes Y and Z is the confound (potential third variable), collect measures of X, Y, and Z 2. Verify a significant rᵧᵪ 3. Next, test for a significant prᵧᵪᐧz 4. You need the results from two tests: rᵧᵪ = ___, N = ___, p = ___ prᵧᵪᐧz = ___, N = ___, p = ___ 5. If rᵧᵪ is not significant, then your X → Y theory is not supported 6. Usually you would stop here, but not always 7. If rᵧᵪ and prᵧᵪᐧz are both significant, then your X → Y theory survives (but you cannot conclude that it has been "proven") 8. If rᵧᵪ is significant but prᵧᵪᐧz is not, then your X → Y theory is not supported (but a more complicated version is still possible) 9. When nothing is being controlled for, all of X and all of Y contribute to the correlation 10. When you control for a third variable, Z, splitting each of X and Y into two parts (via regression), only the parts of X and Y that are not linked with Z contribute to the partial correlation
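
For readers who want to see the arithmetic, here is a minimal sketch of a partial correlation. It uses the standard first-order formula, which is not stated in these notes but gives the same value as correlating the regression residuals. The input correlations are hypothetical, and the significance test is left out.

```python
import math


def partial_r(r_yx, r_yz, r_xz):
    """First-order partial correlation between Y and X, controlling for Z."""
    numerator = r_yx - r_yz * r_xz
    denominator = math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))
    return numerator / denominator


# Hypothetical values: X and Y correlate .50, and each correlates .60 with Z.
pr = partial_r(r_yx=0.50, r_yz=0.60, r_xz=0.60)
print(f"r = 0.50, partial r = {pr:.3f}")
```

In this made-up case the correlation moves a long way towards zero once Z is controlled for, which is the pattern described in step 8.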

How do confounds and spurious relationships affect internal validity?

1. Internal validity can be defined as the extent to which the observed pattern of results (with respect to the variables of interest) is due to causal relationships (between the variables of interest only) 2. The alternative is that the pattern is due to one or more spurious relationships instead 3. All threats to internal validity are some sort of confound (also known as "third variable" for correlations) 4. In the context of correlational studies, we cannot prevent confounds, but we can remove their effects during the analysis (done via partial correlation)

Why is the above definition of "bias" preferred?

1. It does not conflict with the definition of "confound" 2. It maintains the distinction between objective and theoretical 3. It keeps internal and construct validity separate 4. It does not require complete details/specification 5. It makes it clear that it is not always a problem while providing a warning and a reminder

What are critical formulas for statistics?

1. Mean: X̄ = ΣXᵢ/N 2. Standard Deviation: s²ᵪ = ∑(Xᵢ - X̄)²/(N - 1), sᵪ = √s²ᵪ 3. Standard Error: sₓ̄ = sᵪ/√N
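
A minimal sketch of these three formulas in Python, using the N - 1 denominator for the variance as in step #1 of point estimation. The data values are hypothetical and were picked so the arithmetic is easy to check by hand.

```python
import math


def describe(values):
    """Return the mean, variance (N - 1 denominator), SD, and standard error."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)
    se = sd / math.sqrt(n)
    return mean, variance, sd, se


# Hypothetical data: sum = 40, N = 8, sum of squared deviations = 32.
mean, variance, sd, se = describe([2, 4, 4, 4, 5, 5, 7, 9])
print(f"mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.2f}")
```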

What are the consequences in experimental studies?

1. Need a clean method of manipulating the independent variable 2. A clean experiment only creates the desired difference that defines the conditions of the experiment (levels of the independent variable) 3. Does not create any unintended differences

What are the different hypothesis tests?

1. Null hypothesis statistical testing (NHST) 2. Bayesian inference (not very good at making inferences) 3. Mixed-effect modeling This is not a case of one being better; you match your choice to the situation If your theory concerns only a few variables and only claims that one variable (the cause) has some influence on another variable (the effect), but does not say how large this influence will be (just that it is not zero), then the most appropriate option is "traditional" null hypothesis statistical testing

What is the actual process for establishing causation using correlational data?

1. Observe covariation between cause and effect 2. (Start to) establish internal validity using partial correlations 3. Repeat until confident or find a nonsignificant partial 4. Establish temporal precedence of the cause (include the third variable in the cross-lag study) 5. Verify temporal precedence of the cause over both the effect and the third variable

What is the "textbook" process for establishing causation using correlational data?

1. Observe covariation between cause and effect (linear correlation is simplest form) 2. Establish temporal precedence of the cause via longitudinal panel study and cross-lagged analysis 3. (Start to) establish internal validity via partial correlations

What are the ways to present a point estimate?

1. Option 1: Values for a plot of the data Best guess ± how wrong we might be Mean ± standard error (of the mean) Ex: μ = 100.00 ± 5.00 In other words, just give the basic inferentials, separated by ± 2. Option 2: 95% confidence interval Range of values that will contain μ 95 times in 100 Lower bound = mean - half-width of the interval Upper bound = mean + half-width of the interval Written as lower bound < μ < upper bound Ex: 90.00 < μ < 110.00

What are the steps for a cross-lagged analysis?

1. Purpose: Establish that the proposed cause has temporal precedence over the proposed effect 2. Requirements: Measures of both the cause and the effect at two or more points in time (all from the same people) 3. Procedure: There are preliminary tests that must be passed or the set of data cannot be used to establish temporal precedence 4. Assuming that the preliminary tests are passed, a pair of critical tests are conducted last 5. C1-E2 should be significant and stronger than E1-C2 because causation only goes forward in time 6. Assuming that C = proposed cause, E = proposed effect, and C1 = variable measured at Time 1, the following preliminary tests must be conducted: 7. C1-E1 (Time 1 cross-sectional correlation) must be significant 8. C1-C2 ("cause" autocorrelation) must be significant 9. E1-E2 ("effect" autocorrelation) must be significant 10. Critical tests are conducted (only if all preliminary tests are passed) 11. Test and compare C1-E2 and E1-C2 12. C1-E2 should be significant and larger than E1-C2 13. Assuming that all preliminary tests are passed and C1-E2 is significant, how much larger than E1-C2 does it need to be? 14. The answer depends on both the specific values of the two correlations and how many subjects were run 15. Any positive difference can be treated as larger
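
The five correlations involved can be computed as follows. This is only a sketch: the scores are invented, the dictionary labels are chosen here for illustration, and the significance tests (which the procedure requires) are not included.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_lag_correlations(c1, e1, c2, e2):
    """Return the three preliminary and two critical correlations."""
    return {
        "prelim C1-E1": pearson_r(c1, e1),
        "prelim C1-C2": pearson_r(c1, c2),
        "prelim E1-E2": pearson_r(e1, e2),
        "critical C1-E2": pearson_r(c1, e2),
        "critical E1-C2": pearson_r(e1, c2),
    }


# Hypothetical scores for the same six people at Time 1 and Time 2.
c1 = [1, 2, 3, 4, 5, 6]
e1 = [2, 1, 4, 3, 6, 5]
c2 = [2, 3, 3, 5, 5, 7]
e2 = [1, 3, 3, 4, 6, 6]

results = cross_lag_correlations(c1, e1, c2, e2)
for label, r in results.items():
    print(f"{label}: r = {r:.2f}")
```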

What is the decision rule with hypothesis testing?

1. Reject H₀ (concluding that the relationship does exist) if the p-value is less than .05 2. Technically, H₀ is rejected if the p-value is less than ɑ, but ɑ is always .05 in psychology 3. A p-value less than .05 implies that the finding is "significant" 4. The decision is based (only) on the p-value

What are the steps to determining significance from Levene's Test?

1. Report the p-value from Levene's Test 2. Is the p-value significant (p < .050)? If yes (equal variance is not assumed) → Welch's t If no (equal variance is assumed) → Student's t
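
The decision rule above is simple enough to sketch as a function. The function name is hypothetical, and the p-value is assumed to come from a Levene's Test run elsewhere.

```python
def choose_t_test(levene_p, alpha=0.05):
    """Pick the version of the independent-samples t-test from Levene's p-value."""
    if levene_p < alpha:
        # Significant: equal variance is not assumed.
        return "Welch's t"
    # Nonsignificant: equal variance is assumed.
    return "Student's t"


print(choose_t_test(0.03))
print(choose_t_test(0.40))
```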

What is part of step #1 in point estimation?

1. Sample mean and standard deviation 2. Assuming a sample of N values: X₁ ... Xₙ: Mean = ∑(Xᵢ/N) Symbol: X̄ Variance = ∑(Xᵢ - X̄)²/(N - 1) Symbol: s²ᵪ Standard deviation = √variance Symbols: √s²ᵪ or sᵪ

What are the causes of Type I errors?

1. Sampling error Ex: Bad luck 2. Violating an assumption of the analysis Ex: Using Student's t-test when Levene's Test is significant

What are the different ways covariation (association) can be tested?

1. Standard correlation 2. Non-linear regression 3. Multiple regression 4. Point-biserial correlation 5. Phi coefficient or chi-square 6. T-test or ANOVA

What are the steps for null hypothesis statistical testing?

1. Start with the assumption (H₀, the "null hypothesis") that the relationship does not exist 2. Collect data relevant to the hypothesis 3. Calculate the probability of observing (some aspect of) the data (that was just collected), still assuming that the relationship does not exist 4. Reject the null hypothesis and conclude that some relationship does exist if the calculated probability is < .05 or retain the null hypothesis and conclude that a relationship does not exist if the calculated probability is ≥ .05

What are the consequences of between-subject experimental studies?

1. Subject confounds are a major threat to internal validity ("selection effects") 2. Random assignment is easier to do, but "matching" on pre-test/post-test covariates is more effective

What are the four main threats to pre-test/post-test, one-group designs?

1. Testing effects 2. Instrumentation effects 3. Maturation effects 4. History effects The standard solution to all four of these problems (testing, instrumentation, maturation, and history) is to add a control group We cannot avoid these four problems, but we can measure their effects on the data and then remove these effects (by subtraction)

What is the logic behind a pre-test/post-test, two-group design?

1. The change (𝚫) score for T = intervention + confounds 2. The change (𝚫) score for C = confounds only 3. Therefore, difference in change scores = intervention 4. Conduct independent-samples t-test on change scores H₀: μΔT = μΔC 5. A "control condition" is a condition that includes everything except the treatment 6. Used to get a measure of the effects of confounds
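
The subtraction logic in items 1 to 3 can be illustrated with a few lines of Python. All scores are hypothetical, and only the difference in mean change scores is computed here, not the t-test itself.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Hypothetical pre-test and post-test scores for each group.
treat_pre, treat_post = [10, 12, 11, 13], [15, 16, 15, 18]
ctrl_pre, ctrl_post = [10, 11, 12, 13], [12, 12, 14, 14]

# Change score = post-test minus pre-test, one per subject.
treat_change = [post - pre for pre, post in zip(treat_pre, treat_post)]
ctrl_change = [post - pre for pre, post in zip(ctrl_pre, ctrl_post)]

# Treatment change = intervention + confounds; control change = confounds only.
intervention_estimate = mean(treat_change) - mean(ctrl_change)
print(intervention_estimate)  # 4.5 - 1.5 = 3.0
```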

How is internal validity established?

1. The technical name for a correlation when at least one other variable is being controlled for (statistically) is partial 2. Assuming that X and Y are the cause and effect and that Z is the potential confound, the original association is YX (list effect first) 3. rᵧᵪ = correlation between X and Y (when controlling for Z, it is YXᐧZ) 4. prᵧᵪᐧz = partial correlation between X and Y with respect to Z 5. If the correlation between the cause and the effect remains significant when you control for the third variable (both rᵧᵪ and prᵧᵪᐧz are significant) → the original correlation was not (entirely) due to the confound and your cause-effect theory survives 6. If controlling for the third variable makes the original correlation move (a lot) towards zero and become nonsignificant (rᵧᵪ is significant, but prᵧᵪᐧz is not) → your cause-effect theory has no support (might be correct, but it is more complicated) 7. If controlling for the third variable makes the original correlation become nonsignificant and you do not want to give up on your theory, you need to determine the directionality of the relationship between the cause and third variable 8. If cause → third variable, then your cause-effect theory survives (and you also now know it is mediated) 9. If third variable → cause, then your cause-effect theory is dead (because the relationship is spurious)

What is the actual process of empirical science?

1. A theory generates predicted data, and the observed data are compared with the prediction to see if they match 2. Theory #1 predicts predicted data #1 3. Theory #2 predicts predicted data #2, which must be different than predicted data #1 4. Does the observed data match either predicted data #1 or predicted data #2?

What type of threat is the placebo effect?

1. Threat to construct validity because it is beliefs Where did the participants get their beliefs? 2. (Observable) clues: Ex: Talking to a therapist Does this now make it a confound (internal validity)? Not if the clues = the conditions of the experiment This is the same as demand-driven reactivity Ex: Demand characteristics from the conditions are not confounds In both cases, they open the door to alternative theoretical explanations, but not different observable root causes 3. (Observable) clues: Ex: Talking to a therapist vs. not talking to anyone Does this now make it a confound (internal validity)? Yes if there is more to therapy than mere talking Note that the key is no longer from where they got the beliefs, it is the confounding of therapy with talking

What are the causes of Type II errors?

1. Too much variability in the data 2. Not enough subjects Ex: Standard error (for univariate) = sₓ/√N 3. Too small of an effect Ex: t-value = effect/standard error 4. Floor or ceiling effect Ex: Constriction of observed values due to the "hard" limits of the measure being used and can reduce the difference between conditions

Standard Error

1. Univariate Case: sx̅ = sᵪ/√N = √(variance of X)/√N, so sx̅² = (variance of X)/N = s²/N 2. Independent Samples Case: s²x̅₁₋x̅₂ = (variance of X₁/N₁) + (variance of X₂/N₂) → via additive variance rule, so sx̅₁₋x̅₂ = √[(variance of X₁/N₁) + (variance of X₂/N₂)] = √[(s₁²/N₁) + (s₂²/N₂)] = √(sx̅₁² + sx̅₂²) → this value has df = (N₁ - 1) + (N₂ - 1)
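
A minimal sketch of the independent-samples case, following the formulas in these notes. The function names and the summary statistics are hypothetical. Statistical packages typically use a pooled variance for Student's t and an adjusted df for Welch's t, so their output can differ from this simple version when the group sizes are unequal.

```python
import math


def se_difference(var1, n1, var2, n2):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(var1 / n1 + var2 / n2)


def t_independent(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df), with t = effect / standard error as in the notes."""
    se = se_difference(var1, n1, var2, n2)
    df = (n1 - 1) + (n2 - 1)
    return (mean1 - mean2) / se, df


# Hypothetical summary statistics: 36/9 + 45/9 = 9, so the SE is 3.
t_value, df = t_independent(mean1=106, var1=36, n1=9, mean2=100, var2=45, n2=9)
print(f"t({df}) = {t_value:.2f}")
```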

What are the steps of a manipulation check?

1. Verify successful manipulation of target construct using a positive manipulation check 2. Verify selective manipulation of target construct using negative manipulation check(s) 3. Can be analyzed separately (pass/fail checks) 4. Can be included in fancy mediation analyses with dependent variable

What does the symbol and formula mean for standard error of the mean?

1. We use sᵪ for the standard deviation of a set of X values 2. So, if we have a set of X̄ values instead, we use sx̅ for their standard deviation 3. The Central Limit Theorem tells us that the standard deviation of the X̄ values will be σᵪ/√N 4. We can estimate σᵪ using sᵪ, so the formula for standard error is sx̅ = sᵪ/√N

History Effects

A change in behavior due to an external event (that occurs during an experiment) Threatens internal validity Without intervention = without event With intervention = with event

Testing Effects

A change in behavior due to previous testing Threatens internal validity Without intervention = first time subject tested With intervention = second time subject tested

Maturation Effects

A change in behavior that emerges over time (regardless of details of experiment) Threatens internal validity Without intervention = subject age = X With intervention = subject age = X + lag

Instrumentation Effects

A change in measurement due to previous use Threatens internal validity Without intervention = first time measure used With intervention = second time measure used

Biased Attrition Effects

A change in the data due to systematic (unequal) loss of participants The solution is to do (what you can) to avoid attrition Show that lost participants were typical Omit all of the lost participants' data

How do design confounds affect internal validity?

A design confound is an extraneous variable that co-varies with the independent variable without being caused by the independent variable If the design confound is "downstream", it could be a possible mediator of IV → DV and is not a threat to internal validity If the design confound is a parallel covariation (classic design confound), it is a serious threat to internal validity

Control Variable

A factor that is held constant to test the relative impact of an independent variable

Levene's Test

A hypothesis test used to test the hypothesis that two population variances are equal The null hypothesis for Levene's Test is that σ₁² = σ₂² If the p-value from Levene's Test is < .050 (significant), we cannot assume equal variance If the p-value from Levene's Test is ≥ .050 (nonsignificant), we can assume equal variance

The decision whether to use a between-subjects design or a within-subjects design involves ___ A. A trade-off between construct validity and statistical conclusion validity B. A trade-off between internal validity and external validity C. A trade-off between external validity and construct validity D. A coin-flip

A trade-off between construct validity and statistical conclusion validity

What is the half-width of the confidence interval?

About 65% of the time, μ is less than one sx̅ away from X̄ To raise this to 95%, we multiply sx̅ by tcrit to get the half-width tcrit is 1.96 for infinitely large samples (and 95% confidence) and increases as the samples get smaller, but is rarely higher than 2.50

The quality or usefulness of any given study or experiment depends on ___ A. Only one of the four types of validity, selected at random B. Two of the types of validity, selected by a committee of experts C. Three of the types of validity, with the researcher(s) choosing which to ignore D. All four of the types of validity

All four of the types of validity

Placebo effects ___ A. Are a potential problem for most tests of treatment efficacy B. Are probably best viewed as a threat to Construct Validity (even if some texts refer to them as threats to Internal Validity) C. Are a lot like the problem of demand-driven reactivity in other experiments D. All of the above

All of the above

The (main) threat to internal validity that is specific to between-subject designs is ___ A. Subject confounds (which were called "selection effects" in the text) B. The possibility that the subjects assigned to different conditions aren't the same (on average) C. Pre-existing differences in the subjects assigned to different conditions D. All of the above

All of the above

The pre-test/post-test, one-group design ___ A. Was sometimes used to test the efficacy (effectiveness) of an intervention B. Should actually be avoided (as it is terrible) C. Has four threats to Internal Validity due to the fact that pre-test vs post-test cannot be counter-balanced D. All of the above

All of the above

The standard error of the mean ___ A. Is equal to the sample standard deviation divided by the square-root of sample size B. Gets larger when the sample standard deviation increases (all else being equal) C. Gets smaller when the size of the sample is increased (all else being equal) D. All of the above

All of the above

Correlational Studies

All variables are measured, "dependent variables" Temporal precedence via cross-lagged analysis Internal validity via partial correlations Relatively easy to maintain high external validity, but relatively difficult to establish internal validity

A "design confound" is ___ A. An aspect of the experiment that covaries with the dependent variable B. An aspect of the experiment that covaries with the independent variable C. Whether one is using a within- or between-subjects design D. The same thing as a "subject confound"

An aspect of the experiment that covaries with the independent variable

A manipulation check is ___ A. An extra measure at the very beginning of the experiment, used to verify successful and/or selective influence of the target theoretical construct B. An extra measure in the middle of the experiment (after the IV has been set but before the DV has been measured), used to verify successful and/or selective influence of the target theoretical construct C. An extra measure at the very end of the experiment, used to verify successful and/or selective influence of the target theoretical construct

An extra measure in the middle of the experiment (after the IV has been set but before the DV has been measured), used to verify successful and/or selective influence of the target theoretical construct

By the strict definition (from lecture), a confound in an experiment is ___ A. Anything, including a belief, that covaries with dependent variable B. An objective (directly observable) thing that covaries with dependent variable C. Anything, including a belief, that covaries with independent variable D. An objective (directly observable) thing that covaries with independent variable

An objective (directly observable) thing that covaries with independent variable

Demand Characteristics

Any aspect of the experiment that provides subjects with information as to what is being studied and/or what results are expected Not a problem (on its own), but often causes reactivity, which is a huge threat to construct validity

Order Effect

Any change in the pattern of results due to when some data were collected relative to other data

Subject Confound

Any systematic, pre-existing difference between the subjects assigned to different conditions Opposite of equivalent groups (equivalent groups have the same mean value on all attributes that could influence the dependent variable)

Confound

Anything that covaries (in any way) with (at least) the proposed cause

The issues related to construct validity ___ A. Have nothing to do with experiments B. Only apply to measurement C. Only apply to manipulations D. Apply to both measurement and manipulations

Apply to both measurement and manipulations

Design Confounds

Aspects of the environment that covary with the independent variable Ex: A buzzing noise in dim-lighting conditions (only) Design confounds do not depend on the subjects They are built into the environment

The defining attribute of "experiment" (the thing that all true experiments have and non-experiments don't have) is ___ A. At least one dependent variable B. At least one independent variable C. At least one extraneous variable D. At least one potential confound

At least one independent variable

Experiments

At least one variable (the cause) is manipulated = independent variable (IV) This variable is the defining attribute of experiments At least one other variable (the effect) is measured = dependent variable (DV) Temporal precedence is established by setting the value of the independent variable before measuring the dependent variable

Observer Bias

Beliefs (held by the coder) that could cause the coder to record the (same) data differently, depending on the condition of an experiment

Experimenter Bias

Beliefs (held by the experimenter) that could cause the experimenter to behave differently when running the different conditions of an experiment

What are the basic inferentials for point estimation?

Best guess for μ (population mean) is X̄ (sample mean) Estimate of error is "standard error" of the mean Standard error = standard deviation / √N Uses the symbol sx̅

Which of the following is still a potential problem even when you use a two-group design? A. Biased attrition effects B. History effects C. Maturation effects D. All of the above

Biased attrition effects

Association Claim

Bivariate (or larger) statement of fact Ex: The correlation between BDI-II and HAM-A is .55

T-Test or ANOVA

Categorical → continuous

Find the confound in this example. Experiment: within-subject version of lighting → memory, Group 1 = normal light (1) then dim light (2), Group 2 = dim light (1) then normal light (2)

Confound is between dim and abnormal light/dim and normal light Going into dim conditions, light changes down Going into bright conditions, light changes up Any change will have a short-term effect on performance One of the conditions is confounded with change, while one is not

Point-Biserial Correlation

Continuous and dichotomous

Descriptive Statistics

D-stats summarize the data in the sample Usually the mean and standard deviation Do not forget to keep track of sample size (N), plus a verbal label for shape

What happens if any of the prelims fail in a cross-lagged analysis?

Data cannot be used to test for temporal precedence The causal claim is neither supported nor ruled out

When do sources of threats become actual threats for construct validity?

Demand characteristics are not a threat on their own They only become a threat if they cause the subject to change their behavior (react), therefore not providing measures of the intended construct

How does a positive manipulation check work?

Demonstrate that a large part of the independent variable's effect on the dependent variable is via the +MCV using mediation analysis

How does a negative manipulation check work?

Demonstrate that the independent variable's effect on the dependent variable is still significant when controlling for the -MCV using covariance analysis

Ceiling Effect

Experimental design problem in which independent variable groups score almost all the same on a dependent variable, such that all scores fall at the high end of their possible distribution Potential cause of a null result Result of too little variability within levels

Floor Effect

Experimental design problem in which independent variable groups score almost all the same on a dependent variable, such that all scores fall at the low end of their possible distribution Potential cause of a null result Result of too little variability within levels

Causal Claim

Explanation for an observed association Ex: Depression causes anxiety

What is part of step #3 in point estimation?

Express the findings in standard format

Which of the following is not one of the three criteria for establishing (providing evidence in support of) a causal claim? A. Covariation (association) B. External validity C. Internal validity D. Temporal precedence

External validity

Regression Effects

Extreme values are not likely to repeat Ex: When only low-scoring subjects are given intervention due to random assignment not being used Threatens internal validity Without intervention = score - regression With intervention = score + regression The solution is to ensure equivalent groups at pre-test

Sampling Theory

Focuses on the generalizability of descriptive findings to the population from which the sample was drawn Population: Mean = μₓ Standard Deviation = σₓ Shape = ? Sample: Mean = X̄ Standard Deviation = sᵪ Sample Size = N Sampling + descriptive statistics is how to get from population to sample Repeat this process an infinite number of times, always using the same N, keeping only X̄ Then, get the descriptive statistics for the X̄'s

What is the Central Limit Theorem?

Given infinite independent samples, all of size N and from the same continuous population (with mean μₓ and standard deviation σₓ): 1. The mean of sample means will be μₓ 2. The standard deviation of the sample means will be (σₓ/√N) 3. The shape of the sample means will be normal (as long as N ≥ 30 and/or the shape of the population is normal) If we only know (or assume) the value of μₓ: 1. The mean of sample means will be μₓ 2. The (estimated) standard deviation of the sample means will be sₓ/√N or sₓ̄ 3. Switch the shape of the sample means to being t-distributed with N-1 df
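
The first two claims can be checked approximately by simulation. This sketch draws a finite number of samples (5000, not infinitely many) from a Uniform(0, 1) population, which is an arbitrary choice made here for illustration, so the results only come close to the theoretical values.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN = 0.5               # mean of a Uniform(0, 1) population
POP_SD = math.sqrt(1 / 12)   # standard deviation of Uniform(0, 1)
N = 30                       # size of every sample
N_SAMPLES = 5000             # finite stand-in for "infinite" samples

# Draw many samples of size N and keep only each sample's mean.
sample_means = [
    statistics.fmean(random.random() for _ in range(N))
    for _ in range(N_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
predicted_sd = POP_SD / math.sqrt(N)   # sigma / sqrt(N)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POP_MEAN})")
print(f"SD of sample means:   {sd_of_means:.4f} (predicted {predicted_sd:.4f})")
```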

If you reject a null hypothesis that is actually true (which is known as a "false alarm"), you ___ A. Have made a Type I error B. Have made a Type II error C. Owe it an apology

Have made a Type I error

Inferential Statistics

I-stats provide estimates of the sampling population Usually an estimate of the population mean (μ) Conducts a test of a null hypothesis I-stats are always guesses, so they can be wrong We must always warn people about error, so the "basic inferentials" are pairs of a best guess and an estimate of how wrong it might be

What is an example of a two-condition, within-subjects design?

IV = bright (B) vs. dim (D) lighting DV = number of items recalled correctly All subjects in both conditions (counter-balanced order) Within-subject design is a comparison between conditions and occurs within every subject H₀: μB = μD can also be rewritten as H₀: μB - μD = 0 OR as H₀: μB-D = 0 It is a test of a single set of values (the differences) against zero

Why is the choice of time points for the cross-lag important?

If we are correct about the direction of causation, and the process of causation takes time, then the strongest correlations across the variables should be Cause at Time 1 and Effect at Time 2 To get the predicted pattern of results, you must match the timing of the measures to the "lag" or delay of causation Ex: Depression (BDI) and anxiety (HAM-A) are correlated and almost definitely causally related, but there is a huge debate on directionality If relatively short lags are used: Dep1 ↔ Anx2 > Anx1 ↔ Dep2 If relatively long lags are used: Anx1 ↔ Dep2 > Dep1 ↔ Anx2

What are the consequences of using the wrong version of the independent samples t-test?

If you assume that σ₁² = σ₂² when it is not true, the rate of Type I errors can increase, while the rate of Type II errors might decrease If you do not assume that σ₁² = σ₂² when it is true, the rate of Type I errors does not change, while the rate of Type II errors might increase Because Type I errors are much worse, we are careful about assuming that σ₁² = σ₂²

How is temporal precedence measured?

In correlational contexts, temporal precedence is usually established by conducting a panel study (often referred to as a "cross-lagged study") Measure the proposed cause (C) and effect (E) at two or more points in time (1, 2, etc) The data must pass three preliminary tests: 1. C1 → E1 (cross-sectional correlation) 2. C1 → C2 (autocorrelation) 3. E1 → E2 (autocorrelation) If (and only if) the preliminary tests are passed, test and compare 4. C1 → E2 5. E1 → C2

What does a theory do for hypothesis testing?

In order to be tested, a theory must make specific predictions Theories either "survive" or are "falsified" by a test We would like to provide evidence in favor of a theory that is often quite vague Ex: Depression causes anxiety In order to provide evidence in favor of a theory that is vague, we have to falsify the opposite (more specific) Ex: Depression does not cause anxiety Theories that say "no effect" are null hypotheses Even a very specific theory will not make a precise prediction for a sample Ex: 𝜋 (heads) = .50 The expected value (infinite, long-run average) is specific, but the predicted value for a sample is not, especially when the sample is small This is why hypothesis testing, based on samples, is subject to Type I and Type II errors

Assume that the goal of the lighting-recall experiment is to test whether the perceptual difficulty interferes with memory. What is wrong with the following criticism (besides tone)? "Your experiment sucks because making the room dark has more than one effect... you confounded perception with sleep-inducing"

Incorrect use of "confound"

How is internal validity established using experiments?

Internal validity is established by controlling or controlling for all potential confounds Relatively easy to establish high internal validity at the cost of lowering external validity In the context of experiments, internal validity is the extent to which the pattern observed in the dependent variable is due to the causal effects of the independent variable In the context of experiments, a confound is any extraneous variable that covaries with the independent variable When deciding to run an experiment (instead of a correlational study), internal validity is chosen to be stressed over external validity The first way in which internal validity might be lost is based on how the independent variable is chosen to be manipulated

Counter-Balancing

Involves systematically changing the order of treatments or tasks for participants in a 'balanced' way to 'counter' the unwanted effects on performance in any one order
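
A small Python sketch of counter-balancing two conditions. The function name `counterbalance` and the subject IDs are hypothetical, and an even number of subjects is assumed:

```python
import random

def counterbalance(subject_ids, conditions=("Bright", "Dim"), seed=None):
    """Give half of the subjects each of the two possible orders."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)  # randomize which subjects get which order
    orders = [tuple(conditions), tuple(reversed(conditions))]
    # Alternate the two orders down the shuffled list
    return {sid: orders[i % 2] for i, sid in enumerate(shuffled)}

assignment = counterbalance(range(1, 9), seed=1)
for subject, order in sorted(assignment.items()):
    print(subject, " then ".join(order))
```

Each condition is run first for half of the subjects and second for the other half, so order is equal-on-average across conditions.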

What is one of the defining traits of a scientific theory?

It can be disproved/falsified

What is the logic behind hypothesis testing?

It is difficult (impossible) to prove that something is always or even generally true, but it is not difficult to disprove the same claim Therefore, instead of trying to prove that a relationship exists (even if that is the goal), we try to disprove that it does not exist Ex: If you disprove that the correlation is zero, then the correlation must be non-zero

What is the answer to the following counterpoint: "If this is such a big deal, why did you use 'confound' (a lot) when talking about depression and anxiety?"

It is okay to use "confound" when you have good operational definitions

Why do we use a 95% confidence interval?

It is the complement of 5% "risk" (p-value)
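
One way to see what the 95% refers to is to build many intervals and count how many contain the true mean. This Python sketch assumes a normal population with made-up values and a sample large enough that 1.96 is a reasonable stand-in for tcrit:

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

MU, SIGMA = 100.0, 15.0  # hypothetical population values
N = 100                  # large sample, so 1.96 approximates tcrit
REPEATS = 2000

hits = 0
for _ in range(REPEATS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)  # estimated standard error
    if mean - 1.96 * se <= MU <= mean + 1.96 * se:
        hits += 1

coverage = hits / REPEATS
print(f"intervals containing the true mean: {coverage:.3f}")
```

The proportion should come out near .95, with the remaining intervals (about 5%) missing the true mean.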

Measurement Error

Logical reasoning test scores are affected by multiple sources of random error, such as item selection, participant's mood, fatigue, etc Result of too much variability within levels

Partial Correlation Coefficient

Measure of association between two variables after controlling for the effects of one or more additional variables

What situations produce a significant rᵧᵪ, and a nonsignificant prᵧᵪᐧz?

Mediated and spurious relationships The difference between these models is the directionality of the Z-X relationship Both will fail to produce a significant prᵧᵪᐧz
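
A short Python sketch using the standard first-order partial correlation formula. The function name `partial_r` and the correlation values are illustrative assumptions:

```python
import math

def partial_r(r_yx, r_yz, r_xz):
    """First-order partial correlation between X and Y, controlling for Z."""
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))

# All of the X-Y association runs through Z (mediated or spurious):
# here r_yx equals r_yz * r_xz
pr_through_z = partial_r(r_yx=0.30, r_yz=0.50, r_xz=0.60)

# Z is barely related to either variable
pr_unrelated_z = partial_r(r_yx=0.50, r_yz=0.10, r_xz=0.10)

print(f"association carried by Z: pr = {pr_through_z:.2f}")
print(f"Z nearly unrelated:       pr = {pr_unrelated_z:.2f}")
```

The formula cannot tell a mediated model from a spurious one; both give the same near-zero partial correlation, so the direction of the Z-X link has to be argued on other grounds.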

How does construct validity pertain to experiments?

Moving between the realms of theoretical constructs and observables always involves construct validity The standard definition of construct validity suggests that this is only an issue for measurement It is also an issue for manipulations The only (existing) label for this is "selective influence"

When is it okay to provide or accept a best guess about a population value (such as the true mean, μ) without an associated estimate of how wrong this guess might be? A. Always B. When the sample standard deviation is smaller than the sample mean C. On Tuesdays in March D. Never

Never

Are all confounds threats to internal validity?

No An indirect causal relationship is still causal

Affirming the Consequent

One of the most common logical errors in science (invalid argument) Assume: p → q (if p, then q) Observe: q (q is verified) Conclude: ∴p (therefore p) We learn little to nothing about p when q is verified People seem to read "if" as "if and only if" (invalid) Alternative explanations are often forgotten The data could have occurred for a different reason (alternative theory)

Modus Tollens

One of the most important logical rules used in science (valid argument) Assume: p → q (if p, then q) Observe: ¬q (not q) Conclude: ∴¬p (therefore not p) p = theory, q = data/results Way to potentially falsify a scientific theory
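
A brute-force Python check of both argument forms, offered as a sketch: an argument counts as valid here if the conclusion is true in every row of the truth table where all premises are true. The helper names are made up for illustration:

```python
from itertools import product

def implies(p, q):
    """Material conditional: p -> q."""
    return (not p) or q

def is_valid(premises, conclusion):
    """True if the conclusion holds in every row where all premises hold."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )

# Modus tollens: p -> q, not q, therefore not p
modus_tollens = is_valid([implies, lambda p, q: not q], lambda p, q: not p)

# Affirming the consequent: p -> q, q, therefore p
affirming_consequent = is_valid([implies, lambda p, q: q], lambda p, q: p)

print("modus tollens valid:", modus_tollens)
print("affirming the consequent valid:", affirming_consequent)
```

Affirming the consequent fails on the row where p is false and q is true, which is the case of the data occurring for a different reason.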

What are manipulation checks?

One or more measures Taken after the manipulation is applied but before the dependent variable is measured (for directionality) Used to verify the successful and/or selective change in (only) the target theoretical construct Note the involvement of both sides of construct validity

Pre-Test/Post-Test, One-Group Design

Participants → Pre-test DV → Intervention → Post-test DV Ex: Does playing video games help with enumeration? Participants (children) → Pre-test DV (numerical processing) → Intervention (action video game play) → Post-test DV (numerical processing) Assume that enumeration was higher after the intervention (post-scores were better than pre-scores) Does this provide good evidence in favor of the intervention? Four main threats all come from an inability to counter-balance the order of conditions (the without-intervention data must be collected before the with-intervention data) Uses paired samples t-test for a within-subjects experiment

What happens if neither cross-lag is significant?

Prima facie evidence against any causation

What happens if the cross-lags are equal and both are significant?

Prima facie evidence of mutual causation

The ___ of any study or experiment depends on how well it satisfies all four validities

Quality (usefulness)

What is the difference between random assignment and matching?

Random assignment places participants into control and experimental groups at random Matching places similar participants into control and experimental groups at random Whereas matching has to explicitly identify each confounding variable that it will control, random assignment takes care of all the confounding variables at the same time If a researcher fails to match groups on a critical confounding variable, the study is jeopardized

Individual Differences

Reading scores are affected by irrelevant individual differences in motivation and ability Result of too much variability within levels

According to the rule that is used in psychology (and most other empirical sciences), when the p-value is less than .05, we ___ A. Reject the null hypothesis (H₀ ) B. Retain the null hypothesis (H₀ ) C. Reject the alternative hypothesis (H₁) D. Retain the alternative hypothesis (H₁)

Reject the null hypothesis (H₀ )

Pre-Test/Post-Test, Two-Group Design

Subjects randomly assigned → pre-test → treatment → post-test OR Subjects randomly assigned → pre-test → no treatment → post-test Hopefully equal-on-average Convert to change scores and test (will be an independent samples t-test but with high power) Uses a control group to measure (and try to remove) the effects of the problems of a pre-test/post-test, one-group design Treatment Group (T) gets the intervention Control Group (C) gets everything except the intervention Uses independent samples t-test for a between-subjects experiment

Two-Group, Post-Test Only Design

Subjects → treatment or no treatment → post-test No way to remove subject differences, verify successful random assignment, or check for biased attrition Fewer demand characteristics

What happens if one cross-lag is significant and larger than the other?

Support for a particular direction of causation Cause at Time 1 and Effect at Time 2 should be significant and larger than Cause at Time 2 and Effect at Time 1

Univariate T-Test Vs. Zero

T-value = the distance between X̄ and 0, expressed in multiples of s/√N, so it is simply X̄/(s/√N) Has no units because they cancel

Simple Hypothesis Testing (T-Test)

T-values have an associated degree of freedom (df), which is N-1 for the simplest test From t and df, you can get the p-value
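
A minimal Python sketch of the t-value and df for a test against zero, using made-up scores. The p-value step is left out because it needs a t-distribution table or a statistics package:

```python
import math
import statistics

def t_vs_zero(values):
    """Return (t, df) for a one-sample t-test of the mean against zero."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)       # sample SD (N - 1 in the denominator)
    standard_error = s / math.sqrt(n)  # s / sqrt(N)
    return mean / standard_error, n - 1

scores = [2.0, -1.0, 3.0, 0.0, 1.0]  # made-up data
t, df = t_vs_zero(scores)
print(f"t({df}) = {t:.2f}")
```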

How can the four main threats to a pre-test/post-test, one group design be organized?

Testing = subject and order (first/second) Instrumentation = other and order (first/second) Maturation = subject and time (earlier/later) History = other and time (earlier/later)

The Normal (Gaussian) Distribution

The "standard normal" is when the mean = 0 and the standard deviation = 1 95% of the area of the normal is within 1.96 σ of the mean, μ

When do sources of threats become actual threats for internal validity?

The beliefs of the experimenters are not a threat on their own They only become a threat if they cause the experimenter to behave differently when running each condition

Assume that you believe that depression (D) causes anxiety (A). To test this claim, you have measured depression and anxiety (in a large number of participants) at two different times (1 & 2). You have already conducted the three preliminary tests (i.e., D1-A1, D1-D2, A1-A2) and all of these required correlations were large and significant. You are now about to conduct the two critical tests. In order for your data to support your claim that depression causes anxiety, which of the following needs to be true? A. The correlation between depression at Time 1 and anxiety at Time 2 (D1-A2) is significant and larger than the correlation between anxiety at Time 1 and depression at Time 2 (A1-D2) B. The correlation between anxiety at Time 1 and depression at Time 2 (A1-D2) is significant and larger than the correlation between depression at Time 1 and anxiety at Time 2 (D1-A2)

The correlation between depression at Time 1 and anxiety at Time 2 (D1-A2) is significant and larger than the correlation between anxiety at Time 1 and depression at Time 2 (A1-D2)

Construct Validity

The extent to which a set of (one or more) empirical measures provides an exhaustive and selective estimate of the target theoretical construct Often embodied in terms of an operational definition (of a theoretical construct) Ex: Depression (as a construct) is usually operationally defined as BDI score Once established, an operational definition is bidirectional Ex: Depression = BDI score Level of depression ⇄ BDI score

Ethics

The extent to which a study follows a system of moral principles for conducting studies and experiments

Practicality

The extent to which a study is used for the improvement of outcome assessment purposes

Efficiency

The extent to which a study or procedure is inexpensive and easy to use and takes only a small amount of time to administer and score

What is the goal of hypothesis testing?

The hypotheses that we test concern relationships in the sampling population, not just in the sample Hypothesis testing is inferential, not descriptive Since these are inferences, they can be incorrect, even when they are true for the sample and nothing was done wrong

What happens when the cross-lagged analysis prelims were passed, but the predicted cross-lagged analysis was not passed?

The original theory is not completely ruled out Directionality might be correct, but the timing might be wrong

Covariation

The proposed cause and effect must show some relationship Ex: Have a significant (linear) correlation

Temporal Precedence

The proposed cause must come before the effect

What is the purpose of cross-lag prelims?

The purpose of prelims is to verify the quality of the data (only)

What does a mediated relationship indicate?

The relationship between X and Y is mediated (causal, but not direct) You can get from X to Y by obeying the arrows, but cutting the Z links will eliminate rᵧᵪ If this diagram is accurate, then the partial correlation between X and Y with respect to Z (prᵧᵪᐧz) will be very small or zero and not significant

What does a spurious relationship indicate?

The relationship between X and Y is spurious You cannot get from X to Y by obeying the arrows, and cutting the links involving Z will eliminate rᵧᵪ If this diagram is accurate, then the partial correlation between X and Y with respect to Z (prᵧᵪᐧz) will be very small or zero and not significant

What does "significant" mean?

The results are unlikely to have happened by chance (very loose definition, technically wrong) If we assume that there is no relationship, then these data are very unlikely (better definition) How unlikely do the data need to be (assuming no relationship) in order for us to conclude that there is a relationship? Less than 5%, this is why the p-value must be < .05
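
To make the "better definition" concrete, here is a Python sketch that computes an exact two-tailed p-value for coin flips under H₀: 𝜋(heads) = .50. The function name and the 16-of-20 result are illustrative assumptions:

```python
from math import comb

def two_tailed_coin_p(heads, flips):
    """Exact two-tailed p-value for a fair coin (H0: pi = .50)."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every outcome at least as far from the expected value
    as_extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return as_extreme / 2 ** flips

p_value = two_tailed_coin_p(heads=16, flips=20)
reject_h0 = p_value < 0.05
print(f"p = {p_value:.4f}, reject H0: {reject_h0}")
```

The p-value is the probability of data at least this extreme if the coin is fair, not the probability that the coin is fair.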

Point Estimate

The simplest statistical inference is estimating the true mean, μ, of a population This is a "point estimate" because it is the best guess for a single value (point on a line) It is rarely all that we care about, but this process illustrates all of the important concepts The true correlation, ρ, is also a point, and the way that we estimate ρ uses a similar process The most important concept is to never provide (nor accept) a best guess without an associated estimate of how wrong it might be

Additive Variance Rule

The variance of the difference (or sum) of two independent random variables is equal to the sum of the variances
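
The rule can be checked exactly in Python by enumerating two independent fair dice (an illustrative choice, not from the card):

```python
from itertools import product
from statistics import pvariance

faces = [1, 2, 3, 4, 5, 6]
pairs = list(product(faces, repeat=2))  # 36 equally likely (x, y) outcomes

var_single = pvariance(faces)                        # variance of one die
var_diff = pvariance([x - y for x, y in pairs])      # variance of X - Y
var_sum = pvariance([x + y for x, y in pairs])       # variance of X + Y

print(f"Var(X) + Var(Y) = {2 * var_single:.4f}")
print(f"Var(X - Y)      = {var_diff:.4f}")
print(f"Var(X + Y)      = {var_sum:.4f}")
```

Subtracting does not subtract variance: the difference is just as variable as the sum.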

What does it mean if there are no other (observable) differences between two conditions?

There are no confounds Condition 1 (Bright) → Data 1 Condition 2 (Dim) → Data 2

Are there trade-offs between validities or is it all-or-none?

There are trade-offs Ex: Experiment (internal validity) vs. correlational study (external validity) There is no such thing as a perfect study or experiment

What does it mean if your method of manipulation causes a systematic (unintended) difference between the conditions?

There is a confound Condition 1 (Bright) → 72 Degrees → Data 1 Condition 2 (Dim) → 68 Degrees → Data 2

What does it mean if the groups assigned to the different conditions are not equal-on-average when using a between-subject design?

There is a confound Condition 1 (Bright) → Average Age = 20 → Data 1 Condition 2 (Dim) → Average Age = 19 → Data 2

What does it mean if the experimenter codes the data from different conditions using different rules/criteria?

There is a confound Condition 1 (Bright) → Experimenter codes "blurg" as correct → Data 1 Condition 2 (Dim) → Experimenter codes "blurg" as incorrect → Data 2

What does it mean if the experimenter behaves differently when running the two conditions?

There is a confound Condition 1 (Bright) → Experimenter says, "This is easy" → Data 1 Condition 2 (Dim) → Experimenter says, "This is hard" → Data 2

What does it mean if you fail to counter-balance the order of conditions when using a within-subject design?

There is a confound Condition 1 (Bright) → Run First → Data 1 Condition 2 (Dim) → Run Second → Data 2

What does it mean if the experimenter expects different results in the two conditions (but behaves and codes the same)?

There is not a confound Condition 1 (Bright) → Experimenter believes that bright is easier than dim → Data 1 Condition 2 (Dim) → Experimenter believes that bright is easier than dim → Data 2

What does it mean if there is mere variability or differences in variability of an extraneous variable (EV)?

There is not a confound Equal-on-average is sufficient Condition 1 (Bright) → 68-72 Degrees, Mean = 70 → Data 1 Condition 2 (Dim) → 66-74 Degrees, Mean = 70 → Data 2

What does it mean if you do counter-balance the order of conditions?

There is not a confound involving order Condition 1 (Bright) → ½ Run First, ½ Run Second → Data 1 Condition 2 (Dim) → ½ Run First, ½ Run Second → Data 2

The beliefs of an experimenter become a threat to internal validity when ___ A. The experimenter thinks about them B. The experimenter tries to hide them C. They cause the experimenter to behave differently when running the different conditions D. They cause the experimenter to avoid vaccines

They cause the experimenter to behave differently when running the different conditions

What are potential sources of bias?

Things that are not observable but might differ across the conditions of an experiment They may provide alternative theoretical explanations for the results

Two-Tailed Hypothesis Test

This can be rejected by a (large) positive or negative t-value (two-tailed test)

One-Tailed Hypothesis Test

This can only be rejected by a (large) positive t-value (one-tailed test) When the effect is in the predicted direction, the p-value for a one-tailed test is exactly half as large as that for a two-tailed test (and therefore more likely to be less than .05) However, when a one-tailed test is used, the extreme values that "obey" H₀ must be ignored

What is the answer to the following counterpoint: "It is a construct validity issue, not an internal validity issue, so why should I not have used the word 'confounded'?"

This matters a lot because the appropriate "fix" depends on the type of problem

Multiple Regression

Three or more continuous variables

What are the symbols for hypothesis testing?

To help clarify the distinction between samples and populations, Roman letters are used for sample values and Greek letters are used for population values X̄ is the mean of X in a sample μ is the mean of X in the population rᵧᵪ is the correlation between X and Y in a sample ρᵧᵪ is the correlation between X and Y in the population The name for no relationship is "null hypothesis", and it uses the symbol H₀ Ex: H₀ in a correlational study is that ρᵧᵪ = .00 If you disprove H₀, then the opposite, H₁, must be true Ex: H₁ in a correlational study is that ρᵧᵪ ≠ .00

Non-Linear Regression

Two continuous variables

Standard Correlation

Two continuous variables

Frequency Claim

Univariate statement of fact Ex: The mean BDI-II score is 5.14

The pre-test/post-test, two-group design ___ A. Is just as bad as the pre-test/post-test, one-group design B. Uses a control group to eliminate the problems of a pre-test/post-test, one-group design C. Uses a control group to measure the effects of the problems of a pre-test/post-test, one-group design D. All of the above

Uses a control group to measure the effects of the problems of a Pre-test/Post-test, One-group design

Mediator Variable

Variable that explains the relationship between two other variables

Moderator Variable

Variable that influences the strength and direction of the relationship between two other variables

Third Variable

Variable that is responsible for a correlation observed between two other variables of interest

If a mediated diagram is accurate, then the partial correlation between X and Y with respect to Z (prᵧᵪᐧz) will be ___ (If it matters, you may assume that the simple, bivariate correlation between X and Y [rᵧᵪ] is strong and significant) A. Very small or zero and not significant B. Just as large as the correlation between X and Y [rᵧᵪ] and also significant C. Much larger than the correlation between X and Y [rᵧᵪ] and even more significant

Very small or zero and not significant

If a spurious diagram is accurate, then the partial correlation between X and Y with respect to Z (prᵧᵪᐧz) will be ___ (If it matters, you may assume that the simple, bivariate correlation between X and Y [rᵧᵪ] is strong and significant) A. Very small or zero and not significant B. Just as large as the correlation between X and Y [rᵧᵪ] and also significant C. Much larger than the correlation between X and Y [rᵧᵪ] and even more significant

Very small or zero and not significant

Implied Causation

When a bivariate statement (association) is made, many people believe that the association is causal Ex: "Assertiveness and income are correlated" People may think that assertiveness causes a higher income or vice versa, even if there is no scientific support for one variable causing the other

Implied Association

When a conditional frequency claim is provided, many people infer that the value for some other subset is different Ex: "The mean BDI score for left-handers is 6.87" People may question whether the BDI score for right-handers is higher or lower, assuming that there is an association based on the stated frequency claim

Placebo Effect

When the beliefs (of the participants) concerning treatment efficacy affect the results

Degrees of Freedom (df)

When we have only one sample (as is usual), X̄ is used as an estimate (best guess) for μᵪ We use sx̅ as an estimate of how wrong we might be We use degrees of freedom as the indicator of how good this second estimate is df = N-1

There are trade-offs between the four types of validity (plus ethics, practicality, and efficiency). One of the most important implications of these trade-offs is that ___ A. While some studies or experiments might be better than others, there is no such thing as a "perfect" study or experiment B. The amount of each validity is either zero or 100% C. Any given study or experiment is either perfect or useless D. All research is useless

While some studies or experiments might be better than others, there is no such thing as a "perfect" study or experiment

Why is standard error smaller for within-subject designs?

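Within-subject designs calculate a difference score for each subject, which removes individual differences and leaves a single set of values In between-subject designs, individual differences remain and the variances of the two groups add together, so the standard error of the difference is larger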

Point Estimation (Mean ± SE)

X̄ is our estimate (best guess) for μᵪ sx̅ is our estimate of how wrong we might be df is our measure of quality of sx̅ μᵪ = X̄ ± sx̅ with N - 1 df The problem is that the relationship between sx̅ and the actual error is very complicated

What is the answer to the following counterpoint: "If I operationalize perceptual difficulty and/or sleepiness, I would be allowed to use the word 'confound' and also apply what I know about confounds to fix this mess?"

Yes

Assumption of Equal Variance

sx̅-x = √(sx̅₁² + sx̅₂²) → this value has df = N₁ - 1 + N₂ - 1 The above does not require that N₁ = N₂ or that s₁² = s₂² It does require that σ₁² = σ₂² If σ₁² ≠ σ₂², then several adjustments must be made, which will always reduce the value of df
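
A Python sketch of the standard error of the difference and its df, following the card's formula as written and assuming equal population variances. The function name and data are made up:

```python
import math
import statistics

def se_of_difference(group1, group2):
    """Return (standard error of the mean difference, df), assuming equal population variances."""
    se1_sq = statistics.variance(group1) / len(group1)  # squared SE of mean 1
    se2_sq = statistics.variance(group2) / len(group2)  # squared SE of mean 2
    df = (len(group1) - 1) + (len(group2) - 1)
    return math.sqrt(se1_sq + se2_sq), df

group1 = [5.0, 7.0, 9.0]       # made-up data, N1 = 3
group2 = [2.0, 4.0, 6.0, 8.0]  # made-up data, N2 = 4

se_diff, df = se_of_difference(group1, group2)
t = (statistics.fmean(group1) - statistics.fmean(group2)) / se_diff
print(f"SE of difference = {se_diff:.3f}, t({df}) = {t:.2f}")
```

The two groups have different sizes and different sample variances, which the formula allows; only the population variances are assumed equal.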

Standard Error of the Mean

sₓ̄ = sᵪ/√N = √(s²/N) Implications: 1. Error increases with sample standard deviation 2. Error decreases with sample size μᵪ = x̄ ± sₓ̄ (used for plotting data) For a difference between the two means: Within-subjects → pre-subtract (remove individual differences), which causes you to have only one set of values Between-subjects → cannot pre-subtract (keep individual differences), and the variances add together, so the standard error of the difference is much larger
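
A Python sketch comparing the two standard errors on the same made-up recall scores, first treated as within-subjects data (pre-subtracted) and then as if they came from two separate groups:

```python
import math
import statistics

# Made-up recall scores for five subjects under each lighting condition
bright = [12, 15, 11, 14, 13]
dim = [10, 14, 9, 13, 10]
n = len(bright)

# Within-subjects: pre-subtract, leaving one set of difference scores
differences = [b - d for b, d in zip(bright, dim)]
se_within = statistics.stdev(differences) / math.sqrt(n)

# Between-subjects: no pre-subtraction, so the two variances add
se_between = math.sqrt(
    statistics.variance(bright) / n + statistics.variance(dim) / n
)

print(f"mean difference:        {statistics.fmean(differences):.2f}")
print(f"SE, within-subjects:    {se_within:.3f}")
print(f"SE, between-subjects:   {se_between:.3f}")
```

The mean difference is the same either way; only the error term changes, because pre-subtracting removes the stable differences between subjects.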

T-Tests

t = (mean violation of the null hypothesis)/(standard error of mean violation) Ex: If the null hypothesis says that the difference between two means should be zero: t = (difference between the sample means)/(standard error of the difference) Note that t has an associated degrees of freedom (df), which comes from the denominator ("error term") When the data (from the two conditions) are independent samples (not linked in any way) t = (observed difference between the means)/(standard error of the difference) The observed difference is still easy to calculate and does not depend on assumptions/form of the test The standard error is much more complicated and does depend on assumptions/form of the test

Paired Samples T-Test

t = (observed mean violation of H₀)/(standard error of the violation of H₀) Used to test H₀: μA = μB for within-subject designs Actually done by (secretly) calculating the difference for each subject and then testing these values against zero Has the option of being one-tailed, but this will not be used The critical outputs from a paired samples t-test are: The t-value (with the associated degrees of freedom) The p-value (tells if H₀ should be rejected) Ex: t(x) = y.yy, p = .zzz, where x = degrees of freedom You can also request the results from point-estimation for the true mean difference (a 95% confidence interval or mean +/- the standard error) From the t and df, you get the p-value and make a decision Tests H₀: μA-B = 0, where each subject contributes one difference score It is really a test of a single set of values against zero t = (d̅)/(sd̅) and df = N-1

Independent Samples T-Test

t = (observed mean violation of H₀)/(standard error of the violation of H₀) t = (d̅)/(sd̅) Paired samples t-test (after within-pair subtraction) → H₀: μD = 0 Independent samples t-test → H₀: μ₁ = μ₂ In general, a standard error is the standard deviation of the sampling distribution So, if you take a sample of size N₁ from Population 1 and a sample of size N₂ from Population 2, calculate the difference between the two sample means (X̄₁ - X̄₂) and repeat this an infinite number of times, the standard deviation of the differences is sx̅-x

What is an example of the proper definition using confidence intervals?

"95% of all intervals created using this method will contain the true mean, μ"

What can be used as a category label of design and subject confounds to maintain distinction between objective and theoretical?

"Bias-driven confounds"

What are the trade-offs in experimental studies?

1. At least one independent variable and one dependent variable 2. Easier to have high internal validity

Reactivity

A change in behavior due to being studied

Spurious Relationship

A significant, replicable, and often robust relationship that is not causal in either direction

Bias

A theoretical construct (a belief) that has the potential to create a confound

Internal Validity

Alternative explanations of the covariation must be ruled out Ex: The covariation must still be found when other variables are controlled (either experimentally or statistically)

Phi Coefficient or Chi-Square

Two categoricals

