research methods test 2 dr. b
demand characteristics
(subject knows "the game") -a participant might pick up on some clue or bias from the researcher, the situation, or something about the experiment that gives the participant an idea of what type of response the researcher is looking for. doesn't mean that the participant is right, just that something makes them act in a way they think is what the researcher wants and not necessarily in their normal manner. similar to observer bias except that the bias is found in the participants and not the observers of the research.
1 factor analysis of variance (ANOVA)
-"1-factor"=1 IV -"2 factor"= 2 ivs (factorial design-chapter 8) -once overall significant effect is found, then post hoc testing. -comparing each level of iv against each other level
alternative hypothesis
-a relationship ("a difference") between variables in population is expected given our sample ex: there is indeed a statistically-significant relationship between what type of water the flower plant is fed and growth. -a researcher's prediction often specifies the direction of the relationship (e.g., a positive correlation between variables)
a hypothesis example
-a researcher is interested in the relationship between playing violent video games and aggression
Multilevel designs advantage #1
-ability to discover nonlinear effects. -RT study with 2 levels. (1 and 3 mg of caffeine) -adding levels (2 and 4 mg) -> possible nonlinear effects WITHIN-SUBJECTS, multilevel designs -nonlinear results: some variables have a curvilinear relationship with each other. increases in x initially produce increases in Y, but after a while subsequent increases in X produce declines in Y, e.g. -often think that variables will increase exponentially rather than arithmetically. For ex, each year of education may be worth an additional 5% income, rather than, say, $2,000. hence, for somebody who would otherwise make $20,000 a year, an additional year of education would raise their income $1,000. for those who would otherwise be expected to make $40,000, an additional year could be worth $2,000. note that the actual dollar amount of the increase is different, but the percentage increase is the same. -suppose we think that a variable has one linear effect within a certain range of its values, but a different linear effect at a different range. for example, we might think that each additional year of elementary school education is worth $5,000, and each year of college education is worth $8,000, i.e. all years of education are not equally valuable. allows for changes in slope, with the restriction that the line being estimated be continuous; that is, it consists of two or more straight line segments. the true model is continuous, with a structural break. at the point of the structural break, the slope becomes steeper, but the line remains continuous. -ebbinghaus forgetting curve designs-
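-a minimal python sketch of the two income examples above (percentage growth vs. two linear segments). the function names are made up for illustration; the 5%, $5,000 and $8,000 figures come from the notes, and the 6 + 4 years in the last line are an assumed example.

```python
def percent_raise(base_income, percent=5):
    """Dollar value of one extra year of education under a percentage model."""
    return base_income * percent / 100

def piecewise_gain(elementary_years, college_years,
                   per_elementary=5_000, per_college=8_000):
    """Total gain when the slope differs by range (two straight-line segments)."""
    return elementary_years * per_elementary + college_years * per_college

print(percent_raise(20_000))  # 1000.0
print(percent_raise(40_000))  # 2000.0 (same percentage, different dollar amount)
print(piecewise_gain(6, 4))   # 62000 (6 elementary years + 4 college years)
```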
Multilevel designs -advantage #2
-ability to rule out alternative explanations. ex 14: debunking the mozart effect -multilevel repeated measures -iv->listening to mozart, listening to gentle rainstorm -control-no listening -dv-> recall of digits in reverse -found out it was really just listening to soothing music or music that you might like which is creating the effect but wouldn't know this if the researchers didn't add the extra level to the IV. results=no mozart effect. -ruled out an alternative explanation.
subject variables
-already existing attributes of subjects in a study ex. gender, age, personality characteristic ex: effects of anxiety on memory -as a manipulated variable-> induce different degrees of anxiety in participants.
within subjects, single factor designs
-also called repeated measures designs -manipulated independent variable (all subjects participate in all levels of the independent variable) ex: the effects of exercise on memory
participant problems(subject selection)
-any bias in selecting and assigning participants to groups that results in systematic differences between the participants in each group. -the differences exist before one group is exposed to the experimental treatment. how to guard against this threat: e.g. randomly sampled participants
threats to internal validity(history)
-any external or historical event that occurred during the course of the study that may be responsible for the effects. -how to guard against the threat -repeat study -interpret results in reports -imagine that jose treats the subjects who are getting his new treatment in august of 2001 and the subjects getting the traditional treatment during september of 2001. on the last day of each month, he measures his patients' depression to see how much they have improved. on august 31, he measures the patients who got his new treatment, and overall, they improved quite a bit. but then on september 30, he measures the patients who received traditional treatment and their depression seems even worse than it was at the beginning of september.
confounds
-any uncontrolled extraneous variable -the temperature in the experimental classrooms is higher than the control. -the time of day differs in the control and experimental classrooms -the experimenter is nicer in one condition -the gender is different across therapists. -the experimental group gets more credit in SONA compared to the control group
measures of variability
-are statistics that describe the amount of difference and spread in a data set. -if the numbers corresponding to these statistics are high it means that the scores or values in our data set are widely spread out and not tightly centered around the mean -range: the difference between the lowest and highest scores in a distribution -standard deviation: the average distance of all of the scores in the distribution from the mean/center point. -measures of variability -how spread out or dispersed scores are in a distribution -range, standard deviation, variance, interquartile range.
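-a short python sketch of the four measures listed above, using the standard `statistics` module. the scores are made-up data. `pstdev`/`pvariance` treat the scores as the whole population; `stdev`/`variance` would be the sample versions.

```python
import statistics

def spread(scores):
    """Return the common measures of variability for a list of scores."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    return {
        "range": max(scores) - min(scores),
        "variance": statistics.pvariance(scores),
        "standard deviation": statistics.pstdev(scores),
        "interquartile range": q3 - q1,
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical quiz scores, mean = 5
for name, value in spread(scores).items():
    print(name, value)
```

-higher numbers on any of these mean the scores are more spread out around the center.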
with a manipulated iv
-assuming no confounds->iv causes changes in DV -increasing the intensity of a tone should increase the speed at which people respond to the tone. -increasing the number of pellets given to a rat for pressing a bar should increase the number of times the bar is pressed. -when a change in the level (amount) of an independent variable causes a change in behavior, we say that the behavior is under the control of the independent variable.
essential feature of experimental research
-at the very least, experiments must involve a comparison between two situations (groups or conditions) -1 hr of playing video games -->experimental group-group that receives treatment. 0 hrs of playing video games-->control group-group that does not receive treatment.
matched groups designs concerns:
-attrition (when you lose one, you lose the pair) -matching is difficult
manipulated vs subject variables
-cannot draw causal conclusions when using subject variables -all that can be said->the groups differ from each other. -using both manipulated and subject ivs -bandura's bobo study- studied children's behavior after they watched a human adult model act aggressively towards a bobo doll, a doll like toy with a rounded bottom and low center of mass that rocks back to an upright position after it has been knocked down. there are different variations of the experiment. the most notable experiment measured the children's behavior after seeing the human model get rewarded, get punishment, or experience no consequence for physically abusing the bobo doll. the experiments are empirical methods to test bandura's social learning theory. the social learning theory claims that people learn largely by observing, imitating, and modeling. it demonstrates that people learn not only by being rewarded or punished, but they can also learn from watching somebody else being rewarded or punished. -manipulated->type of exposure to violence -subject->gender
3 criteria for causation
-cause must be related to the effect e.g. does playing violent video games cause aggression -cause must precede the effect -no other explanations must exist for the effect.
power
-chance of rejecting a false null hypothesis -sample size an important factor -is the likelihood that a test will be able to detect an effect (during a research study) when one truly exists. -we test the opposite of our hypothesis, called the null hypothesis, by looking for enough evidence to say that it is false, and we should reject it. in rejecting the null hypothesis, we are in effect saying that the data support our hypothesis. -statistical power is the probability of correctly rejecting the null hypothesis when it is in fact false (meaning, the original hypothesis is true). -ex: say you want to find out if taking a vitamin supplement increases mental alertness. and let's say that, in this instance, the vitamin supplement is indeed effective in increasing alertness. your test would have statistical power if it is able to lead you to correctly reject the null hypothesis. if a test has high statistical power, then it will help you to conclude that the vitamin supplement has an effect. if a test lacks statistical power, you might end up wrongly concluding that the vitamin supplement is useless in increasing alertness, when in fact it is effective.
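-a rough python sketch of why sample size matters for power. this is not from the lecture: it assumes the simplest case, a one-sided, one-sample z-test with a known standard deviation, and the effect size (0.5 sd) is a made-up value.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test.

    effect_size: assumed true mean shift, in standard-deviation units.
    n: sample size. alpha: Type I error rate.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)  # cutoff for rejecting the null
    # Probability the test statistic lands past the cutoff when the effect is real.
    return 1 - NormalDist().cdf(z_crit - effect_size * n ** 0.5)

for n in (16, 64):
    print(n, round(z_test_power(0.5, n), 3))
```

-with the same true effect, the larger sample gives a higher chance of correctly rejecting the null. with no true effect (effect_size=0) the function just returns alpha.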
threats to internal validity(instrumentation)
-changes in the measurement instrument (or procedures) from pretest to posttest -instrument is faulty -instrumental bias (or instrumental decay) -takes place when the measuring instrument (e.g., a measuring device, a survey, interviews/participant observation) that is used in a study changes over time. -ex: when we think about a speedometer, we would expect it to be as accurate when recording a speed of 100 mph as 20 mph. however, for measurement devices, this is not always the case. they can be less precise when recording some values compared with others. -sometimes, we can think of the measurement device as the researcher collecting the data, since it is the researcher that is making the assessment of the measurement. -imagine that a researcher is using structured, participant observation, to assess social awkwardness in two different types of profession. imagine that the researcher monitors these two different groups of employees, and scores their level of social awkwardness on a scale of 1-10 (e.g., 10=extremely socially awkward). the way that the researcher scores may change during the course of an experiment for two reasons: first, the researcher can gain in experience (i.e., become more proficient) or become fatigued during the course of the experiment, which affects the way that observations are recorded. this can happen across groups, but also within a single group. second, a different researcher may be used for the pre-test and post-test measurement. in quantitative research using structured, participant observation, it is important to consider the ability/experience of the researchers, and how this, or other factors relating to the researcher's scoring, may change over time. however, this will only lead to instrumental bias if the way that the researcher scores is different for the groups that are being measured (e.g., the control group vs the treatment group).
how to guard against this threat: -piloting your instrument to ensure it is reliable -an equivalent control group helps to ensure your instrument is consistent -participant problems can also threaten internal validity
presenting the data(graph form)
-continuous variable- the variable exists on a continuum -unlimited intermediate values exist -line graph is preferred, but bar graph OK. -discrete variable-each level represents a distinct category -no intermediate values can occur. -use a bar graph, line graph inappropriate -analyzing single factor, multilevel designs
control techniques(random assignment)
-creating equivalent groups -random assignment -refers to how chosen participants are then assigned to experimental groups. -each subject has equal chance of being assigned to any group in the study. -spreads potential confounds equally through all groups -increases the likelihood that the two groups are the same at the outset, that way any changes that result from the application of the independent variable can be assumed to be the result of the treatment of interest. -helps eliminate possible sources of bias -makes it easier to generalize the results from a sample to a larger population. -goal: remove individual difference
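-a minimal python sketch of random assignment: shuffle the already-chosen participants, then deal them out so every person has an equal chance of landing in any group. the function name, group labels, and participant ids are made up for illustration.

```python
import random

def randomly_assign(participants, groups=("control", "experimental"), seed=None):
    """Shuffle participants and deal them into groups of (nearly) equal size."""
    rng = random.Random(seed)        # seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for i, person in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(person)
    return assignment

people = [f"P{i}" for i in range(1, 21)]  # hypothetical participant ids
for group, members in randomly_assign(people, seed=1).items():
    print(group, members)
```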
between subjects designs
-different sets of subjects in each level of an IV -comparison is between two different groups of subjects -control-baseline/"standard" comparison condition -experimental-some level of IV or treatment -necessary when... -subjects in each condition have to be naive, i.e., participating in one condition could affect performance in the other -subject variable will always be between subjects -ex: lou has two groups of participants, one in the 50 degree room and one in the 85 degree room. he is comparing the scores of the two groups to see if the cold room or the hot room will produce better test scores.
participant problems(attrition)
-dropping out of an experiment e.g. watch video with sensitive materials, keep from using commonly used drug, exposure therapy to treat phobias -comparisons between groups are no longer meaningful. how to guard against this threat: -an equivalent control group allows us to see whether there is differential attrition due to our manipulation
situational variables
-e.g. effect of the number of bystanders on the chances of help being offered -are factors in the environment that can unintentionally affect the results of a study. such variables include noise, temperature, odors, and lighting. for ex: let's say researchers are investigating the effects of caffeine on mood. one day the air conditioning breaks down in the lab. the participants who visit the lab that day to take part in the study get very hot and uncomfortable, and when filling out the questionnaire to measure their mood most of them report being in a bad mood. the researchers cannot be sure whether the caffeine or the heat caused the participants' bad mood. experimenters should try to control for situational variables so they don't throw off research results.
statistical analysis (effect size)
-emphasizes the size of difference between variables, not merely whether there is a difference or not -useful for meta-analysis -is a simple way of quantifying the difference between two groups that has many advantages over the use of tests of statistical significance alone. emphasizes the size of the difference rather than confounding this with sample size.
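-a python sketch of one common effect size measure, cohen's d (the notes don't name a specific measure, so this choice is an assumption). it expresses the difference between two group means in pooled standard deviation units. the scores are made-up data.

```python
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled (sample) standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

experimental = [6, 7, 8, 9, 10]  # hypothetical scores
control = [4, 5, 6, 7, 8]
print(round(cohens_d(experimental, control), 2))  # 1.26
```

-unlike a p value, d does not automatically grow just because the sample gets bigger, which is why it is handy for comparing studies in a meta-analysis.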
pre-test/post test control group design
-everything is the same, but you add a pre test. -the dv is measured twice for all people. -in the post-test only, we assume the groups are equivalent based on random assignment. -education, where researchers want to monitor the effect of a new teaching method upon groups of children. other areas include evaluating the effects of counseling, testing medical treatments, and measuring psychological constructs.
T-test
-examines the difference between the mean scores for the two samples and determines (with some probability) whether this difference is larger than would be expected by chance factors alone. -you hypothesize that students in the experimental group will display significantly greater learning gains compared to those in the control group.
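-a python sketch of the t statistic for two independent groups, assuming equal variances (pooled-variance version). the scores are made-up data, and the p value still has to be looked up in a t table or computed with a stats package.

```python
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / df
    std_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / std_error, df

experimental = [6, 7, 8, 9, 10]  # hypothetical learning gains
control = [4, 5, 6, 7, 8]
t, df = independent_t(experimental, control)
print(round(t, 2), df)  # 2.0 8
```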
how does experimentation involve control?
-experimental controls are mechanisms that eliminate extraneous factors that might otherwise affect the results of an experiment
threats to internal validity(regression to mean)
-extreme scores, upon retesting, tend to be less extreme (i.e., they move toward the mean) -some initially high (or low) scores may be due to chance/luck -how to guard against this threat -an equivalent control group helps to see whether changes in DV are due to regression to the mean -a golfer with a handicap of 2 averages a score of 73. this score represents the golfer's average score. on some days he goes wild and shoots a 63 which is awesome, but extreme. over time the golfer will have many more scores around his average than far away from it as the scores tend to regress toward the mean of 73.
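-a small python simulation of regression to the mean. the model is an assumption for illustration only: each observed score is a stable "true" score plus random luck, and nothing at all happens between test 1 and test 2.

```python
import random
from statistics import mean

def retest_extremes(n=10_000, top_fraction=0.10, seed=0):
    """Return (test 1 mean, test 2 mean) for the top scorers on test 1."""
    rng = random.Random(seed)
    # Hypothetical model: observed score = true score + chance/luck.
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    test1 = [t + rng.gauss(0, 10) for t in true_scores]
    test2 = [t + rng.gauss(0, 10) for t in true_scores]
    cutoff = sorted(test1)[int(n * (1 - top_fraction))]
    top = [i for i in range(n) if test1[i] >= cutoff]  # extreme scorers
    return mean(test1[i] for i in top), mean(test2[i] for i in top)

first, second = retest_extremes()
print(first > second)  # True: the extreme group scores closer to the mean on retest
```

-the drop shows up with no treatment, which is why an equivalent control group is needed to separate a real effect from regression.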
fatigue and practice effects
-fatigue: participants tire after one or two tests -practice: participants may perform better as they become used to it
with a subject IV
-groups may differ in several ways->iv cannot be said to cause dv -are the differing individual characteristics of participants in an experiment. ex: how much sleep the individual person got the night before, and many more. -participant variables can be considered extraneous variables because they are variables that can influence the results of an experiment but that the experimenter is not studying. these can challenge the validity of a study by influencing the results.
single-factor-two levels (between-subjects, single factor (IV) designs)
-independent groups designs -one independent variable with two separate groups random assignment to create equivalent groups example: question: last-minute cramming (LC) vs distributed studying (DS), which is best? -the simplest form of experiment has 1 IV or factor. -matched groups (or pairs) designs -manipulated independent variable (separate groups) -matched on an important variable (demographics) -one member of each matched pair must be randomly assigned to the experimental group and the other to the control group. -example: study to compare cramming vs distributed learning.
inferential statistics
-inferring general conclusions about the population from sample data -provide ways of testing the reliability of the findings of a study and "inferring" characteristics from a small group of participants or people (your sample) onto much larger groups of people (the population). ex: t-tests, ANOVAs
participant problems(diffusion of treatment)
-information "contamination" from other participants that completed the experiment already. e.g. participants talk with a friend about the experiment. how to guard against this threat: -large groups tested in short time frames -asking participants not to talk about the study in debriefing.
t test assumptions
-interval or ratio scale data -data normally distributed (or close) -homogeneity of variance(look at standard deviations)
cross sectional designs
-involves using different groups of people who differ in the variable of interest but share other characteristics, such as socioeconomic status, educational background, and ethnicity. -involves looking at people who differ on one key characteristic at one specific point in time. -people of different ages are examined at the same time -cross-sectional studies are observational in nature and are known as descriptive research, not causal or relational, meaning that you can't use them to determine the cause of something, such as disease. researchers record the information that is present in a population, but they do not manipulate variables. -it's often used to look at the prevailing characteristics in a given population. ex: determine if exposure to specific risk factors might correlate with particular outcomes. researcher might collect data on past smoking habits and current diagnoses of lung cancer. while this type of study cannot demonstrate cause-and-effect, it can provide a quick look at correlations that may exist at a particular point. -it can provide information about what is happening in a current population. -participants are usually separated into groups known as cohorts. ex: researchers might create cohorts of participants who are in their 20s, 30s, and 40s. problem=potential for cohort effects: groups can be affected by cohort differences that arise from the particular experiences of a unique group of people. individuals born during the same period may share important historical experiences, but people in that group who are born in a given geographic region may share experiences limited solely to their physical location. individuals who were alive during the attack on pearl harbor, vietnam, or 9/11 might have shared experiences that make them different from other age groups. -differences could be due to the environment they were reared in -worse with large age differences
measures of central tendency
-is a single value that describes the way in which a group of data cluster around a central value. it is a way to describe the center of a data set. -lets us know what is normal or 'average' for a set of data. -condenses the data set down to one representative value, which is useful when you are working with large amounts of data. -allows you to compare one data set to another. ex: let's say you have a sample of girls and a sample of boys, and you are interested in comparing their heights. by calculating the average height for each sample, you could easily draw comparisons between the girls and boys. -also useful when you want to compare one piece of data to the entire data set. say you received a 60% on your last psychology quiz, which is usually in the D range. you go around and talk to your classmates and find out that the average score on the quiz was a 43%. in this instance, your score was well above the class average. since your teacher grades on a curve, your 60% becomes an A. had you not known about the measures of central tendency, you probably would have been really upset by your grade and assumed that you bombed the test. -mode: the score or category that occurs with the greatest frequency (the most often). -median: the one score that divides the distribution exactly in half (i.e., the midpoint of the distribution) -mean: the arithmetic average of a distribution. ex: "how many pets do you have?"
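-a python sketch of the three measures using the standard `statistics` module, applied to made-up answers to the "how many pets do you have?" question.

```python
import statistics

pets = [0, 1, 1, 2, 3, 1, 0, 4]  # hypothetical answers from 8 people

def central_tendency(values):
    """Return the mean, median, and mode of a list of scores."""
    return {
        "mean": statistics.mean(values),      # arithmetic average
        "median": statistics.median(values),  # midpoint of the sorted scores
        "mode": statistics.mode(values),      # most frequent score
    }

print(central_tendency(pets))  # mean 1.5, median 1.0, mode 1
```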
continuous variable
-is a variable that can take any value between two other values, so unlimited intermediate values exist. ex: weight or height- a person doesn't have to be either 150 pounds or 151 pounds. they could be 150.6 or 150.99999 pounds.
ttest for dependent samples, for
-matched group designs -repeated measures designs/within subjects
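-a python sketch of the dependent (paired) samples t statistic: take the difference within each pair (or each person's two scores), then test whether the mean difference is far from zero. the before/after scores are made-up data, and the p value still comes from a t table or a stats package.

```python
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, degrees of freedom) for a dependent-samples t-test."""
    diffs = [b - a for a, b in zip(first, second)]  # one difference per pair
    n = len(diffs)
    std_error = stdev(diffs) / n ** 0.5
    return mean(diffs) / std_error, n - 1

before = [10, 12, 9, 11, 13]  # hypothetical memory scores, same 5 people
after = [12, 14, 10, 14, 15]
t, df = paired_t(before, after)
print(round(t, 2), df)  # 6.32 4
```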
descriptive statistics
-measures of central tendency mean, mode, median
post-test only control group design
-mutually exclusive groups -i.e, different people in control and experimental group -give level of IV (control or intervention) -take post test -can be done with one group (no comparison group) or two groups (with a comparison group) of participants. participants receive an intervention and are tested afterwards. -company wants to know whether its employees benefited from a recent employee training conference (listed below as "intervention"). following the implementation of the training conference, employees were given a short survey (posttest) to assess their satisfaction with the conference. the surveys were used to provide feedback and make changes to next year's conference.
hypothesis testing
-null hypothesis -no relationship ("no difference") between variables in the population expected, given our sample. -hypothesis that says there is no statistical significance between the two variables in the hypothesis. it is the hypothesis that the researcher is trying to disprove. ex, susie's null hypothesis would be something like this: there is no statistically significant relationship between the type of water I feed the flowers and growth of the flowers. a researcher is challenged by the null hypothesis and usually wants to disprove it, to demonstrate that there is a statistically-significant relationship between the two variables in the hypothesis.
null and alternative example relation
-null: if one plant is fed club soda for one month and another plant is fed plain water, there will be no difference in growth between the two plants. -alternative: if one plant is fed club soda for one month and another plant is fed plain water, the plant that is fed club soda will grow better than the plant that is fed plain water.
longitudinal design
-observational research method in which data is gathered for the same subjects repeatedly over a period of time. -can extend over years or even decades. -studies a single group over a period of time -within subjects approach -problem=attrition. subjects drop out.
experimenter bias
-occurs when a researcher unconsciously affects results, data, or a participant in an experiment due to subjective influence. it is difficult for humans to be entirely objective, that is, not influenced by personal emotions, desires, or biases. -experimenter expectations can influence subject behavior, especially if they want a certain result e.g. nicer, more positive feedback to experimental group
Special-purpose control group designs (placebo control groups)
-placebo-inactive substance-a placebo is a substance with no known medical effects, such as sterile water, saline solution, or a sugar pill. is a fake treatment that in some cases can produce a very real response. why do people experience real changes as a result of fake treatments? expectations of the patient play a significant role in the placebo effect; the more a person expects the treatment to work, the more likely they are to exhibit a placebo response. -placebo effect- important to note that a "placebo" and the "placebo effect" are different things. the term placebo refers to the inactive substance itself, while the term placebo effect refers to any effects of taking a medicine that cannot be attributed to the treatment itself. -when performance of placebo group=experimental group -subject expectations explain the effect of treatment: although a placebo has no effect on an illness, it can have a very real effect on how some people feel. just how strong this effect might be depends on a variety of factors, for instance 1. the nature of the illness 2. how strongly the patient believes the treatment will work. 3. the type of response the patient expects to see 4. the type of positive messages a doctor conveys about the treatment's effectiveness 5. genes may also influence how people respond to placebo treatments. ADVANTAGES -the advantage of using a placebo in medical and psychological studies is that it allows researchers to eliminate or minimize the effect that expectations can have on the outcome.
random selection
-process of gathering (in a truly random way) a representative sample for a particular study. -is important because the scientist wants to generalize his or her findings to the whole population without actually testing the whole population. to achieve this, the scientist identifies a population or group to study and randomly selects people (could also be items or animals) to be in the study. -random means the people are chosen by chance -when you have a truly random sample, you reduce the chance that the results are due to factors of the participants in the study. -choosing a representative sample is often accomplished by randomly picking people from the population to be participants in a study. -means that everyone in the group stands an equal chance of being chosen -refers to how participants are randomly chosen to represent the larger population
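-a minimal python sketch of random selection: draw a sample from a population list so that everyone has an equal chance of being chosen. the population of 500 student ids and the sample size of 30 are made-up values.

```python
import random

def select_sample(population, k, seed=None):
    """Draw k members from the population without replacement."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(population, k)

population = [f"student_{i}" for i in range(1, 501)]  # hypothetical population
sample = select_sample(population, 30, seed=42)
print(len(sample), sample[:3])
```

-selection decides who gets into the study; assignment to groups is a separate step that happens afterwards.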
partial counterbalancing
-random sample of all possible combinations is selected -latin square -every condition of the study occurs equally often in every sequential position -every condition precedes and follows every other condition exactly once.
experimental fatigue (testing)
-reflects general experiences that take place during the experiment that lead to physical and/or mental fatigue. could be due to a particular treatment, which may be physical and/or mentally demanding, or simply due to the fact that being part of a research project, which is unusual for most participants, can be tiring.
hypothesis testing-2 possible outcomes
-reject null hypothesis (with some probability) -conclude you found a significant relationship between variables p is less than .05 -fail to reject the null hypothesis -conclude you found no significant relationship between variables p is greater than .05 -because you are testing a sample and making inferences about the population, your statistical decisions have a probability of being wrong! -possible errors -type I-is the rejection of a true null hypothesis (also known as a "false positive" finding or conclusion) -type II-failure to reject a false null hypothesis (also known as a "false negative" finding or conclusion)
threats to internal validity (testing)
-repeated testing could lead to better or worse performance e.g., get better on test; fatigued from repeated testing how to guard against this threat -an equivalent control group helps to see relative benefits or fatigue. -familiarity with the test could influence the performance on the second testing. changes in the final scores may be a result of repeated testing. -only occur in experimental and quasi-experimental research designs that have more than one stage; that is, research designs that involve a pre-test and a post-test. -some of the reasons why this occurs include learning effects (practice and carry-over effects) and experimental fatigue.
how can a study have low statistical conclusion validity?
-researcher might do the wrong analysis or violate some of the statistical assumptions -researcher might selectively report some analyses -measure has low reliability
learning effects(testing)
-result in increased post-test performance (i.e., higher scores on the dependent variable) because participants have become familiar with some aspect of the experiment (e.g. its subject matter) from the pretest. as a result of these learning effects, during the post test, participants may: a. understand the format of the experiment b. understand the purpose of the experiment c. become familiar with the testing environment d. develop a strategy/approach to do better/worse in the experiment (or moderate their outcome) e. become less anxious about the experiment
presenting the data
-sentence and paragraph form -table form
construct validity
-the adequacy of the operational definitions for both the independent and dependent variables used in the study -well chosen and well-defined IVs and DVs ex: the effects of TV violence on children's aggression -operational definition
internal validity
-the degree to which an experiment is methodologically sound and confound-free. -does my study actually answer the research question I proposed and designed to answer? -confound
the latin square example
-the effects of music on memory 123 231 312
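-a python sketch that builds the 3 x 3 square above by rotating the condition order one step per row, then checks that every condition shows up once in every sequential position. the function names are made up for illustration.

```python
def cyclic_latin_square(n):
    """Row r is the condition order for participant (or group) r."""
    return [[(row + col) % n + 1 for col in range(n)] for row in range(n)]

def is_latin_square(square):
    """True if each condition appears once per row and once per position."""
    n = len(square)
    expected = set(range(1, n + 1))
    rows_ok = all(set(row) == expected for row in square)
    cols_ok = all({row[c] for row in square} == expected for c in range(n))
    return rows_ok and cols_ok

square = cyclic_latin_square(3)
for row in square:
    print("".join(map(str, row)))  # 123, 231, 312
print(is_latin_square(square))     # True
```

-note: simple rotation only balances sequential position. in this square 1 comes right before 2 twice and 2 never comes right before 1, so the "precedes and follows every other condition exactly once" rule needs a different (balanced) construction.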
the validity of experimental research external validity
-the extent to which research findings generalize to contexts other than those of the experiment -generalize
the validity of experimental research(statistical conclusion validity)
-the extent to which the researcher uses statistics properly and draws the appropriate conclusions from the statistical analysis
illusory correlation
-the perception that a relationship exists between two variables (which could include behaviors, events, items, or people) when in fact there is not a strong relationship between the two. -is created when two separate variables are paired together, which leads to an overestimation of how often they co-occur. -it is illusory in that the relationship between the two variables is not real; it is the result of our biased perception of the variables and a lack of information. -ex: sal is traveling to london, england for the first time. one of the first places that he stops is a souvenir shop. sal ends up with so many bags that his purchases take up all of the tiny counter space and the cashier has to place some of the bags on the floor. a few minutes after leaving the shop, sal turns back around having remembered that he left bags on the floor. upon his return, sal witnesses the cashier with one of the bags in his hand and concludes that the cashier was trying to steal his purchases. later that week, sal goes to a different shop to buy souvenirs, and the cashier does not enter the discount code for the clearance items, which results in sal being overcharged. after a conversation with another employee, sal is refunded his money. because of these two events, sal concludes that all of the cashiers in london's shops are thieves who try to take advantage of tourists.
wait list control groups
-to ensure equivalent groups in a study of program effectiveness. -is a group of participants who do not receive the experimental treatment, but who are put on a waiting list to receive the intervention after the active treatment group does. 2 purposes: -it provides an untreated comparison for the active experimental group to determine if the treatment had an effect. by serving as a comparison group, it lets researchers isolate the independent variable and look at the impact it had. -it allows the wait listed participants an opportunity to obtain the intervention at a later date. -when conducting an experiment, these people are randomly assigned to this group. they closely resemble the participants who are in the experimental group (the individuals who receive the treatment). -serves as a benchmark, allowing researchers to compare the experimental group to the wait list control group to see what sort of impact changes to the independent variable produced. it essentially allows researchers to assess the effect of the intervention against not receiving treatment during that same time period (while still providing all participants with treatment eventually) -the wait list group is later administered the treatment only if it is shown to be effective. example: miller and dipilato (1983) evaluated the effectiveness of two forms of therapy (relaxation and desensitization) to treat clients who suffered from nightmares.
longitudinal and cross sectional relation/comparison
-unlike longitudinal studies that look at a group of people over an extended period, cross-sectional studies are used to describe what is happening at the present moment. -longitudinal studies are more likely to be influenced by what is known as selective attrition, which means that some individuals are simply more likely to drop out of a study than others, which can influence the validity of the study. -one of the advantages of cross-sectional studies is that since data is collected all at once, it's less likely that participants will quit the study before data is fully collected.
control techniques (matching)
-used in experimental research so that different experimental conditions can be observed while controlling for individual differences by matching similar subjects or groups with each other. -participants are grouped together on some trait and then randomly assigned -deliberate control over a potential confound ex: a researcher wants to know which educational method is best for teaching students a new concept. a group of students is split into two different groups. the researchers would look at standardized test scores and grades and try to match each student with another student that has the same test scores and grades. so a student with a test score of 95 who made As would be in Group A while another student with the same scores would be placed in Group B. this process would be done for all the students in the experiment. then the experimenters would use one educational method on Group A and another method on Group B. they could then see how the differing methods influenced the students' learning of the concept. by using matched groups the researchers can see how the different conditions were influential and know that the results were not confounded by the students' individual differences because they had been evenly distributed across the two groups. individual differences can confound experimental results, so by controlling for them researchers can be more confident in the results of the different conditions.
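The match-then-randomly-assign procedure can be sketched in code. This is an illustrative sketch only: the participant ids and scores are hypothetical, and pairing neighbours after sorting on one score is just one simple way to form matched pairs.

```python
import random


def matched_random_assignment(scores, seed=None):
    """Pair participants with adjacent matching-variable scores, then
    randomly assign one member of each pair to Group A and the other to B.

    scores: dict mapping participant id -> score on the matching variable.
    With an odd number of participants, the highest scorer is left unpaired.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)  # ids ordered by score
    group_a, group_b = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # chance decides who goes to which group
        group_a.append(pair[0])
        group_b.append(pair[1])
    return group_a, group_b


# Hypothetical test scores used as the matching variable.
test_scores = {"s1": 95, "s2": 95, "s3": 80, "s4": 82, "s5": 70, "s6": 71}
group_a, group_b = matched_random_assignment(test_scores, seed=1)
print(group_a, group_b)
```

Because each pair is split across the two groups, the groups end up with nearly the same distribution of test scores, while chance still decides which member of a pair gets which teaching method.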
discrete variable
-uses a distinct label instead of a continuous one. for ex: eye color or gender can be considered a discrete variable because individuals are either part of a category or they aren't-there is no range of answers in between.
studies with high external validity
1. generalize to other populations -college sophomore problems 2. generalize to other environments -ecological validity 3. generalize to other times
studies with high internal validity
1. have valid operational definitions 2. have valid measurements 3. have little to no confounds
what is an experiment?
An experiment is a systematic research study in which.... the investigator directly varies some factor (or factors) -holds all other factors constant
subject bias
Hawthorne effect-effect of knowing one is in a study (i.e., participants change their behavior/performance) evaluation apprehension-participants tend to behave in ideal ways so as not to be evaluated negatively demand characteristics-subjects pick up cues giving away the true purpose and the study's hypothesis -to minimize these biases, researchers conduct what is known as a double-blind study. such studies involve both the experimenters and the participants being unaware of who is receiving the real treatment and who is receiving the false treatment. by minimizing the risk of subtle biases influencing the study, researchers are better able to compare the effects of the drug and the placebo. -manipulation check during debriefing could help
what if you have more than 2 levels of your IV
It's called a single factor, multilevel design -single factor means 1 independent variable. -multilevel means the IV has more than two levels.
-use matching when...
Small N might not yield equivalent groups -some matching variable correlates with the DV -Measuring the matching variable is feasible. Who's in the study? -ideal: random sampling -goal: ensure the sample is representative of the population about whom we are trying to generalize. What happens during the experiment? -All things should be equal during the experiment, except the manipulation. E.g. procedure, experimenter, environment. What if we do all of these things in our experiment? -The level of the IV is the only thing that is different across the groups -we can conclude that the IV causes changes in the DV.
t test for independent samples, for
between-subjects designs
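For reference, a minimal sketch of the independent-samples t statistic using only the standard library. It assumes the pooled-variance (equal variances) form of the test; the two groups of scores are made-up numbers, and the function name `independent_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent-samples t test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their df.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Made-up scores from two separate groups of participants.
t_value, df = independent_t([1, 2, 3], [2, 3, 4])
print(round(t_value, 3), df)
```

The resulting t is compared with the critical value for df = n_a + n_b - 2. Each participant contributes a score to only one group, which is why this test fits between-subjects designs.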
measuring dependent variables
dvs are any behaviors measured in an experiment. problems: -ceiling effects -floor effects. solution: -task of moderate difficulty, determined through pilot testing.
instructional variables
e.g. giving different strategies for how to memorize a list
task variables
e.g. giving people different kinds of intelligence tests
within subjects design
is a type of experimental design in which all participants are exposed to every treatment or condition. all of the subjects in the study are treated with the critical variable in question. -also called repeated measures design -used when comparisons within the same individual are essential (e.g. perception studies) -subjects are their own control group -ex: let's imagine that you are doing an experiment on exercise and memory. for your independent variable, you decide to try two different types of exercise: yoga and jogging. instead of breaking participants up into two groups, you have all the participants try yoga before taking a memory test. then, you have all the participants try jogging before taking a memory test. next, you compare the test scores to determine which type of exercise had the greatest effect on performance on the memory tests.
counterbalancing
is a type of experimental design in which all possible orders of presenting the variables are included. For ex, if you have two groups of participants (group 1 and group 2) and two levels of an independent variable (level 1 and level 2), you would present one possible order (group 1 gets level 1 while group 2 gets level 2) first and then present the opposite order (group 1 gets level 2 while group 2 gets level 1). this way you can measure the effects in all possible situations. obviously there are limitations with this procedure, as not all studies can be designed this way, and as you increase the number of variables, conditions, etc., it becomes logistically problematic. -altering the order of the experimental conditions -reduces influence of practice and fatigue -complete counterbalancing can be difficult with complex designs -test participants in every possible different order at least once -works well with only a few conditions
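Complete counterbalancing amounts to listing every permutation of the conditions. A short sketch, assuming conditions are plain labels; the helper names `complete_counterbalance` and `assign_orders` are invented for illustration:

```python
from itertools import permutations


def complete_counterbalance(conditions):
    """Return every possible order of the given conditions."""
    return [list(order) for order in permutations(conditions)]


def assign_orders(participants, conditions):
    """Cycle through the orders so each one is used about equally often."""
    orders = complete_counterbalance(conditions)
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}


print(complete_counterbalance(["level 1", "level 2"]))

# The number of orders grows as n factorial.
for n in range(2, 6):
    print(n, "conditions ->", len(complete_counterbalance(range(n))), "orders")
```

The count of orders is n!, which is why complete counterbalancing only works well with a few conditions. With many conditions, a partial scheme such as a Latin square is the usual fallback.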
threats to internal validity(maturation)
maturing physically, cognitively, emotionally during the course of the study. -an adult who loses a parent, for instance, learns to cope with a new emotional situation that will affect the way he or she deals with situations that follow -how to guard against this threat: -control group with equivalency -compare the control group's pre-test and post-test scores.
some benefits of using within-subjects designs
minimizing individual differences since the same people produce both scores. -help reduce errors associated with individual differences. -individuals are exposed to all levels of a treatment, so individual differences will not distort the results. each participant serves as his or her own baseline. -increase in statistical power(ability to find significant results) -does not require large pool of participants.
analyzing single factor, multilevel designs (is it appropriate to run multiple t tests?)
multiple t tests are inappropriate because the more tests you run, the more likely you are to find significance by chance, which increases the chance of a type I error. we call it fishing: the longer you stay out fishing, the more likely you are to catch something. so the more tests you run, the more likely you'll find significance when you shouldn't be finding significance
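The inflation can be put in numbers. A rough sketch, assuming each test uses alpha = .05 and the tests are independent (pairwise t tests on the same data are not strictly independent, so this is an approximation):

```python
from math import comb


def pairwise_comparisons(levels):
    """Number of t tests needed to compare every level with every other."""
    return comb(levels, 2)


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one type I error across num_tests independent tests."""
    return 1 - (1 - alpha) ** num_tests


for levels in (3, 4, 5):
    k = pairwise_comparisons(levels)
    rate = familywise_error_rate(k)
    print(f"{levels} levels -> {k} t tests -> familywise error {rate:.3f}")
```

A single ANOVA keeps the overall error rate at alpha. Post hoc comparisons are run only after the overall effect is significant.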
double blind experiment
neither the experimenters nor the participants know which condition each participant is in.
single blind experiment
only the experimenters know which condition each participant is in; the participants do not.
statistical analysis(confidence intervals)
range within which the population mean is likely to be found. -a good way to think about a CI is as a range of plausible values for the population mean (or another population parameter such as a correlation), calculated from our sample data. a CI with a 95 percent confidence level is often described as having a 95 percent chance of capturing the population mean. technically, this means that, if the experiment were repeated many times, 95 percent of the CIs would contain the true population mean. CIs are ideally shown in the units of measurement used by the researcher, such as proportion of participants or milligrams of nicotine in a smoking cessation study. this allows readers to assign practical meaning to the values. -also give information about the "precision" of an estimate. an uncertain estimate using a 95 percent CI would be quite wide, whereas a more certain estimate using a 95 percent CI will be much narrower and therefore more precise. -can be used directly in a meta-analysis and in meta-analytic thinking, which considers results across studies rather than focusing on just one result of one study. -allows a researcher to immediately compare a current result with CIs from previous studies.
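A minimal sketch of computing a CI for a mean from sample data. It assumes a large-sample normal approximation (a z critical value); with small samples a t critical value would normally be used instead. The sample values and the function name `mean_confidence_interval` are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds of a z-based CI for the population mean."""
    # Two-sided critical value, about 1.96 for 95 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(sample) / sqrt(len(sample))  # z * standard error
    center = mean(sample)
    return center - margin, center + margin


# Made-up scores, in the researcher's own units of measurement.
scores = [4.1, 5.0, 5.5, 6.2, 4.8, 5.3, 5.9, 4.6]
low, high = mean_confidence_interval(scores)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

The margin shrinks as the sample gets larger or less variable, which is the "precision" point above: a narrower interval means a more precise estimate.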
some concerns of using within-subjects designs(order effects)
refers to how the positioning of questions or tasks in a survey, test, etc., influences the outcome. studies of order effects are designed to measure whether the order of the questions makes a difference in the outcome of the survey -order effects can be caused by practically anything and so are notoriously difficult to control for. -fatigue: participants tire after one or two tests -practice: participants may perform better as they become more familiar with the testing environment -other testing conditions: participants may gradually improve (or decline) due to factors in the testing environment like heating, lighting, or ergonomics.
ceiling effects
task is too easy, all scores very high, disguising any differences -the term is used to describe a situation that occurs in both pharmacological and statistical research. -can be seen when a variable is no longer measured or estimated past a certain point. in a census there are categories for things such as age and income that have a limited number of choices. for the top ranges (examples would be income of $100,000 or higher, 65 years of age or older) there is an open-ended component to the selection that prevents measurement past the cutoff point. the data show no difference between someone who makes $100,000 and someone who makes $1 million, just as they show no difference between a 65 year old and an 80 year old.
floor effects
task too difficult, all scores very low, disguising any differences -seen in a study being conducted by a school on the prevalence of academic dishonesty. they ask their students in an anonymous questionnaire how many of them have stolen an exam from a teacher. the numbers would be very low, presuming that most people have never done this in their academic career. the results would suggest that academic dishonesty was very low in this student population. if they changed the question to ask whether a person has looked at another's paper, the number of "yes" answers would be higher and more representative of the prevalence of cheating.
some concerns of using within-subjects designs(carryover effects)
using counterbalancing helps prevent carryover effects -refers to any lingering effects of a previous experimental condition that are affecting a current experimental condition -essentially it is an effect that "carries over" from one experimental condition to another. this effect is seen when a subject performs in more than one condition, making it a common concern in within-subjects designs. for ex, a researcher wants to know the effects of a medication on memory. the subject is given a list of words to memorize in two different conditions: with a placebo and with the real medication. the researcher doesn't consider the possibility of carryover effects and uses the same list of words for both experimental conditions, so memory of the list from the first condition carries over into the second. exII: having participants take part in yoga might have an impact on their later performance in jogging and may even affect their performance on later memory tests.