Exam 4 - Research Methods

Reasons for poor replicability

- Hesitation to publish the null
- Small samples and low power
- Unknown moderators
- Questionable research practices

Calculating effect size for MA

A common effect size metric is computed for every study, each study is weighted by its sample size, and effect sizes can also be calculated for subgroups
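The weighting idea can be sketched in a few lines of Python. This is a minimal illustration, assuming simple sample-size weights; the function name and the three studies are hypothetical, not taken from the course material.

```python
def weighted_mean_effect(effects, sample_sizes):
    """Sample-size-weighted mean of study effect sizes."""
    if not effects or len(effects) != len(sample_sizes):
        raise ValueError("need exactly one sample size per effect size")
    total_n = sum(sample_sizes)
    # Larger studies contribute more to the overall estimate.
    return sum(e * n for e, n in zip(effects, sample_sizes)) / total_n


# Hypothetical studies: effect sizes and their sample sizes.
effects = [0.20, 0.50, 0.35]
sizes = [50, 100, 50]
overall = weighted_mean_effect(effects, sizes)
print(round(overall, 4))
```

The same function can be applied to any subset of studies to get a subgroup effect size.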

Respect for persons

Confidentiality, right to withdraw, debriefing, voluntary participation, informed consent

Advantages of MA/VG

- Better estimates of population parameters
- Resolution of theoretical debates fueled by "mixed findings"
- Science progresses faster because evidence for or against theoretical propositions becomes clearer

Problems with MA

- "File drawer problem": not all studies are published or accessible, which biases effect sizes
- Power may still be inadequate if there are too few studies with small samples
- Can't overcome methodological flaws in the original studies

Distribution of effect sizes

- If all samples were drawn from the same population of effects, variability should be due to sampling error
- If homogeneity is supported (tests of homogeneity are nonsignificant), variability can be attributed to sampling error, and weighted effect sizes are calculated to estimate the overall population effect size
- If heterogeneity exists, studies are grouped by potential moderators and re-tested
- Heterogeneity may also be due to outliers, which can be trimmed
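One common way to test homogeneity is Cochran's Q statistic. The sketch below assumes inverse-variance weights (a different weighting from the sample-size weights mentioned elsewhere in this set); the study values and variances are hypothetical, and the critical value 5.991 is the chi-square table value for df = 2 at alpha = .05.

```python
def cochran_q(effects, variances):
    """Return (Q, df) for a set of study effect sizes and their sampling variances."""
    weights = [1.0 / v for v in variances]  # assumption: inverse-variance weights
    mean_effect = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - mean_effect) ** 2 for w, e in zip(weights, effects))
    return q, len(effects) - 1


# Hypothetical studies.
effects = [0.20, 0.50, 0.35]
variances = [0.04, 0.02, 0.04]
q, df = cochran_q(effects, variances)

CRITICAL_CHI2 = 5.991  # chi-square table value, df = 2, alpha = .05
heterogeneous = q > CRITICAL_CHI2
print(round(q, 3), df, heterogeneous)
```

If `heterogeneous` were true, the next step would be to split the studies by a coded moderator and recompute Q within each subgroup.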

Coding studies for key characteristics

Key characteristics are anything that could account for variance in effect sizes, i.e., potential moderators

Low power problem

- Many studies are greatly underpowered
- Failures to reject the null are often the result of low power and result in "mixed findings," which prevent further exploration

P values cont

- The significance level is not a sliding scale: choose alpha before the analyses and stick with it
- There is no such thing as a "trend towards significance"
- Significance is a rule-in vs. rule-out decision: accept the level of Type I error you are willing to commit and stick with it

Conducting the NHST

- Calculate the t statistic
- Compare it to the sampling distribution to see if it exceeds the critical value
- The critical value depends on df, the alpha level, and whether the test is one- or two-tailed
- If the statistic does not exceed the critical value, retain the null
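These steps can be sketched for an independent-groups t test. This is a minimal example assuming equal population variances (a pooled-variance test); the scores are made up, and the critical value 2.228 is the t-table value for df = 10, two-tailed, alpha = .05.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / standard_error, df


# Hypothetical scores.
treatment = [12, 14, 15, 13, 16, 14]
control = [10, 11, 12, 10, 13, 10]

t_stat, df = independent_t(treatment, control)
CRITICAL_T = 2.228  # t table: df = 10, two-tailed, alpha = .05
reject_null = abs(t_stat) > CRITICAL_T
print(round(t_stat, 3), df, reject_null)
```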

types of nonprobability sampling methods

- Convenience, accidental, or haphazard
- Purposive
- Snowball

Types of probability sampling

- simple random sampling, systematic, stratified, cluster

Problems with NHST

- The idea that there is exactly no difference or no correlation (i.e., that the null is true) is probably incorrect
- Given a large enough sample size, almost any effect will be significant
- We are more likely to commit Type II errors (saying there is no effect when there is one) with NHST, and NHST does not address this

Meta Analysis

- Examines effect sizes across a number of different studies looking at the same IV and DV
- Estimates population effect sizes by averaging effect sizes
- Attempts to overcome the low power problem by making use of many samples
- Validity generalization (VG) studies measure the validity of a certain measure across a number of studies

Two tailed tests

- Require larger samples because we test in both directions, splitting alpha between the two tails
- Usually choose two-tailed because we want to interpret an effect regardless of direction; choose one-tailed for intervention research

Three questions about relationship between IV and DV

1. What is the nature of the relationship between the independent and dependent variables?
2. Is the relationship real, or is it due to chance?
3. How big an effect did the independent variable have?

stratified random sampling

A form of probability sampling in which the researcher identifies particular demographic categories of interest and then randomly selects individuals within each category.
- Can oversample underrepresented populations
- E.g., sample within each racial group so that the strata add up to a representative sample
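Stratified random sampling can be sketched as "group the population by stratum, then draw a simple random sample inside each group." The function name, the population, and the stratum labels below are hypothetical; the per-stratum counts decide whether the allocation is proportionate or oversamples a smaller group.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Randomly draw n_per_stratum[name] units from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name, members in strata.items():
        k = min(n_per_stratum.get(name, 0), len(members))
        sample.extend(rng.sample(members, k))  # simple random sample within the stratum
    return sample


# Hypothetical population: 80 people in group A, 20 in group B.
population = [{"id": i, "group": "A" if i < 80 else "B"} for i in range(100)]
by_group = lambda unit: unit["group"]

# Allocation that mirrors the 80/20 split in the population.
proportionate = stratified_sample(population, by_group, {"A": 8, "B": 2}, seed=0)
# Allocation that oversamples the smaller group.
oversampled = stratified_sample(population, by_group, {"A": 5, "B": 5}, seed=0)
print(len(proportionate), len(oversampled))
```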

probability sampling

A type of sampling in which every element in the population being studied has a known chance of being selected for study

Goal of sampling

Obtain a sample that is representative of the population so you can generalize study findings. Different sampling techniques vary in how likely they are to achieve this goal.

Sources of Type II errors

- Bad research design: poor construct validity, weak manipulation or unreliable variables, failure to control extraneous variables, failure to test for curvilinear relationships, failure to test for moderators
- Low power

Confidence Intervals as an alternative to NHST

- A CI focuses on estimating the actual effect of interest and the degree of uncertainty about what it really is
- Researchers who see results presented as CIs are more likely to make correct inferences if they think in terms of estimating the effect size rather than NHST
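As a rough illustration, a confidence interval for a difference between two group means can be computed as the observed difference plus or minus a critical value times the standard error. The sketch assumes equal variances (pooled standard error); the data are hypothetical and the critical t of 2.228 is the table value for df = 10 at 95% confidence.

```python
import math
import statistics


def mean_difference_ci(group_a, group_b, critical_t):
    """CI for the difference between two independent group means (pooled SE)."""
    n_a, n_b = len(group_a), len(group_b)
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    margin = critical_t * math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return diff - margin, diff + margin


# Hypothetical scores.
treatment = [12, 14, 15, 13, 16, 14]
control = [10, 11, 12, 10, 13, 10]

low, high = mean_difference_ci(treatment, control, critical_t=2.228)
print(round(low, 2), round(high, 2))
```

The interval reports both the estimated effect (its midpoint) and the uncertainty around it (its width), which a bare reject/retain decision does not.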

Types of effect size

- Cohen's d: difference between the means divided by the SD
- Percentage of variance explained, e.g., R², eta squared, etc.
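Cohen's d can be sketched as follows, assuming the pooled standard deviation of the two groups is used as the denominator (one common choice; the card does not say which SD is meant). The data are hypothetical.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Difference between group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical scores.
treatment = [12, 14, 15, 13, 16, 14]
control = [10, 11, 12, 10, 13, 10]
d = cohens_d(treatment, control)
print(round(d, 3))
```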

proportionate stratified sampling

Each stratum is sampled in proportion to its share of the population

disproportionate stratified sampling

Strata are sampled at rates that differ from their shares of the population; necessary when oversampling certain populations, e.g., minorities

Exploring moderators in MA

If effect sizes differ significantly between subgroups, it is probable that something is moderating the effect

Pitfalls of sampling

If the sampling procedure is flawed, you could end up with a sample that is not representative of the general population

data torturing

If you analyze data enough, it will tell you what you want to hear
- Type I error and multiple testing
- Are subset differences real or chance findings?
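The link between multiple testing and Type I error can be made concrete with the familywise error rate. The sketch assumes the tests are independent, which real analyses of one data set usually are not, so treat the numbers as an illustration only.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across n independent tests when every null is true."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20):
    print(k, round(familywise_error(0.05, k), 3))
```

With 20 independent tests at alpha = .05, the chance of at least one spurious "significant" result is roughly 64%.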

Real vs. Chance association

Inferential statistics tell us how likely it is that results like ours would arise from chance (sampling error) alone rather than from the effect of the IV.

effect size (magnitude)

For a given difference between means, low within-group variability yields the largest standardized effect size; high within-group variability yields the smallest

Times when accepting the null is the goal

- Mediator analysis: no effect of the IV after controlling for the mediator (M)
- Ruling out confounds
- Discriminant validity
- Be careful: without enough power, accepting the null doesn't mean anything

Effect size uses

- Metric of practical significance: how much the IV is influencing the DV
- Effect sizes are standardized, so they can be used to compare across multiple studies
- Can be used to determine power for a study

How big is the effect?

NHST tells us nothing about how big/important the size of the effect is. Effect size is the magnitude or size of the association, how much impact the IV has on the DV

Determining power

Power is 1 - beta, where beta is the probability of a Type II error (concluding there is no effect when there is one). Higher power requires larger samples; researchers typically strive for 80%.

Ethical principles

Respect for persons, beneficence, justice

Beneficence

Risk-Benefit Analysis • Monitoring for Harms • Alleviating adverse effects • Debriefing • Confidentiality

Nonprobability sampling

Selection is systematic or haphazard, but not random.

Practical significance and small effects

- Small effects add up over time
- Weak manipulations sometimes result in small effects, which means a strong manipulation could produce a large effect

Null hypothesis testing

Statistical test to determine whether the results are due to sampling error.
- The goal is to reject the null, which is assumed to be true by default
- Null for a t or F test: there is no difference between groups
- Null for regression: the correlation coefficient is 0
- Accepting (retaining) the null means that you did not find evidence suggesting it was false

Guidelines of strong research

- Strong theoretical foundation
- Devise and stick to a priori data collection and analyses
- Decide on reasons for data trimming in advance
- Avoid HARKing (hypothesizing after the results are known)
- Report all the results, not just the significant ones
- Double-check results for accuracy

Consequences of low power

Studies that reach significance and make it into the literature yield distorted (typically inflated) effect sizes
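A small simulation can illustrate why this happens. The sketch below assumes normally distributed scores, a true standardized effect of 0.3, and 20 participants per group (all hypothetical settings); it keeps only the studies that pass a two-tailed test at alpha = .05 (critical t of about 2.024 for df = 38) and compares their average effect size with the average across all studies.

```python
import math
import random
import statistics


def simulate_published_effects(true_d=0.3, n=20, n_studies=2000,
                               critical_t=2.024, seed=1):
    """Return (mean d over all studies, mean d over significant studies only)."""
    rng = random.Random(seed)
    all_d, significant_d = [], []
    for _ in range(n_studies):
        treatment = [rng.gauss(true_d, 1) for _ in range(n)]
        control = [rng.gauss(0, 1) for _ in range(n)]
        pooled_sd = math.sqrt((statistics.variance(treatment)
                               + statistics.variance(control)) / 2)
        d = (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd
        t = d * math.sqrt(n / 2)  # t statistic for two equal-sized groups
        all_d.append(d)
        if abs(t) > critical_t:
            significant_d.append(d)  # only these would be "published"
    return statistics.mean(all_d), statistics.mean(significant_d)


mean_all, mean_significant = simulate_published_effects()
print(round(mean_all, 2), round(mean_significant, 2))
```

In an underpowered design, a study can only reach significance if its observed effect happens to be much larger than the true effect, so the significant subset overestimates it.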

Interrelationships of power

- Type II error is dependent on power
- Power of a test is dependent on alpha (i.e., the accepted rate of Type I error)
- As alpha becomes more conservative, power decreases
- As sample size increases, power increases
- As effect size increases, power increases
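These relationships can be checked with a rough power calculation. The sketch uses a normal approximation for a two-group, two-tailed comparison and ignores the negligible chance of rejecting in the wrong direction; the function name and example values are assumptions for illustration, not a substitute for a proper power analysis.

```python
import math
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed, two-group test (normal approximation)."""
    z = NormalDist()
    z_critical = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * math.sqrt(n_per_group / 2)
    return 1 - z.cdf(z_critical - noncentrality)


print(round(approx_power(0.5, 20), 2))              # baseline
print(round(approx_power(0.5, 64), 2))              # larger sample: more power
print(round(approx_power(0.8, 20), 2))              # larger effect: more power
print(round(approx_power(0.5, 20, alpha=0.01), 2))  # stricter alpha: less power
```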

data snooping

- Using multiple analyses to find significance and then publishing the results: "torturing the data until they confess"
- Looking at data before the experiment is complete

Practical Significance, effect size

Value judgment about how useful the information is for theory or clinical implications. The criterion for practical significance is the minimum impact considered to be important to research.

Justice

Voluntary participation, compensating control groups, informed consent (IC), equitable sharing of risks and benefits

Determining sample size and power

You need:
- The effect size you expect to find
- The Type I error rate you will set
- Whether the statistical test will be one-sided or two-sided
- The amount of power you want to detect the effect (i.e., 1 - Type II error rate)
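The four inputs map directly onto a sample-size formula. The sketch below uses the standard normal approximation for comparing two group means, n per group = 2 * ((z_alpha + z_power) / d)^2; exact t-based calculations give slightly larger numbers, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80, two_sided=True):
    """Approximate participants needed per group to detect a standardized effect."""
    z = NormalDist()
    tail_alpha = alpha / 2 if two_sided else alpha
    z_alpha = z.inv_cdf(1 - tail_alpha)  # critical value for the chosen Type I error rate
    z_power = z.inv_cdf(power)           # value corresponding to the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(n_per_group(0.5))                   # medium effect, two-sided
print(n_per_group(0.5, two_sided=False))  # one-sided needs fewer participants
print(n_per_group(0.2))                   # small effects need far larger samples
```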

purposive sampling

A biased sampling technique in which only certain kinds of people are included in a sample.
- Identify a typical case through the literature and experts
- Problem: the proportion of these cases in the population is unknown

nonprobability sampling

A sampling technique in which there is no way to calculate the likelihood that a specific element of the population being studied will be chosen. It is likely to misrepresent the population, with no way to tell whether it does.

data trimming

consists of changing data values so that they better fit the predictions made by the research hypothesis

Simple random sampling

every member of population has an equal chance of being selected

Snowball

People forward the survey on to people they know. Good for hard-to-reach populations.

Convenience & accidental sampling

People who are easily accessible, e.g., college students. Problem: no evidence of representativeness.

P values

- The probability of obtaining results at least as extreme as the ones you did through sampling error alone, assuming the null hypothesis is TRUE
- The size of p is not the size of the effect (e.g., .001 is not a bigger effect than .05)
- It is a measure of rarity; it doesn't say anything about how big or important the effects are

sampling distribution

The distribution of values taken by the statistic in all possible samples of the same size from the same population.
- If you repeatedly took two samples of size n from the same population and computed the difference between the two means divided by the SE, those values would form the sampling distribution of the t statistic
- Sampling distributions depend on the size of the sample
- A sampling distribution tells you what percentage of samples (or differences between two samples) will exceed any particular value

Data dredging

the inappropriate (sometimes deliberately so) use of data mining to uncover relationships in data that may be misleading.

