PSY 300 Final Exam Study Guide
Interrogate the construct validity of an association claim, asking whether the measurement of each variable was reliable and valid.
-Does the measure have good reliability? -Is it measuring what it's intended to measure? -What is the evidence for its face validity, its concurrent validity, its discriminant and convergent validity?
Explain two reasons to conduct a factorial study.
-One reason researchers conduct studies with factorial designs is to test whether an independent variable affects different kinds of people, or people in different situations, in the same way. -When researchers test an independent variable in more than one group at once, they are testing whether the effect generalizes. -The process of using a factorial design to test limits is sometimes called testing for moderators. -Can test theories. The goal of most experiments in psychological science is to test hypotheses derived from theories. Indeed, many theories make statements about how variables interact with one another. The best way to study how variables interact is to combine them in a factorial design and measure whether the results are consistent with the theory.
Interpret different possible outcomes in cross-lag correlations, and make a causal inference from each pattern.
(from the book's example on narcissism): The results could have followed one of three patterns. The study did show that parental overpraise at an earlier time period was significantly correlated with child narcissism at later time periods. Such a pattern was consistent with the argument that overpraise leads to increases in narcissism over time. However, the study could have shown the opposite result- that narcissism at earlier time periods was significantly correlated with overpraise later. Such a pattern would have indicated that the childhood narcissistic tendency came first, leading parents to change their type of praise later. Finally, the study could have shown that both correlations are significant- that overpraise at time 1 predicted narcissism at time 2 and that narcissism at time 1 predicted overpraise at time 2. If that had been the result, it would mean excessive praise and narcissistic tendencies are mutually reinforcing. In other words, there is a cycle in which overpraise leads to narcissism, which leads parents to overpraise, and so on.
Distinguish an association claim, which requires that a study meet only one of the three rules for causation (covariance), from a causal claim, which requires that the study also establish temporal precedence and internal validity.
1. Covariance of cause and effect. The results must show a correlation, or association, between the cause variable and the effect variable. 2. Temporal precedence. The cause variable must precede the effect variable; it must come first in time. 3. Internal validity. There must be no plausible alternative explanations for the relationship between the two variables. -Third-variable problem: when we can come up with an alternative explanation for the association between two variables, that alternative is some lurking third variable.
Use the three causal criteria to analyze an experiment's ability to support a causal claim.
1. Covariance. Do the results show that the causal variable is related to the effect variable? Are distinct levels of the independent variable associated with different levels of the dependent variable? 2. Temporal precedence. Does the study design ensure that the causal variable comes before the outcome variable in time? 3. Internal validity. Does the study design rule out alternative explanations for the results?
Explain the value of pattern and parsimony in research.
A pattern of results best explained by a single, parsimonious causal theory. Parsimony: the degree to which a scientific theory provides the simplest explanation of some phenomenon; the simplest explanation of a pattern of data- the theory that requires making the fewest exceptions or qualifications. -Researchers use a variety of methods and many studies to explore the strength and limits of a particular research question. Another example comes from research on TV violence and aggression. Many studies have investigated the relationship between watching violence on TV and violent behavior. Some studies are correlational; some are experimental. Some are on children; others on adults. Some are longitudinal; others are not. But in general, the evidence all points to a single, parsimonious conclusion: that watching violence on TV causes people to behave aggressively.
Identify and interpret data from a multiple-regression table and explain, in a sentence, what each coefficient means.
A positive beta, like a positive r, indicates a positive relationship between that predictor variable and the criterion variable, when the other predictor variables are statistically controlled for. A negative beta, like a negative r, indicates a negative relationship between two variables (when the other predictors are controlled for). A beta that is zero, or not significantly different from zero, represents no relationship (when the other predictors are controlled for). The higher the beta is, the stronger the relationship is between that predictor variable and the criterion variable. The smaller the beta is, the weaker the relationship. The predictor variable "exposure to levels of sex on TV" has a beta of .25. This positive beta, like a positive r, means higher levels of sex on TV go with higher pregnancy risk (and lower levels of sex on TV go with lower pregnancy risk), even when we statistically control for the other predictor, age. The beta that is associated with a predictor variable represents the relationship between that predictor variable and the criterion variable, when the other predictors are controlled for. When p is less than .05, the beta is considered statistically significant. When p is greater than .05, the beta is considered not significant, meaning we cannot conclude beta is different from zero.
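A minimal sketch of where standardized betas come from, using made-up data loosely modeled on the sex-on-TV example (the variable names and coefficients here are hypothetical, not the study's actual values): standardize the criterion and the predictors, then regress.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
tv_sex = rng.normal(size=n)                 # predictor 1: exposure to sex on TV
age = rng.normal(size=n)                    # predictor 2: teen's age
risk = 0.25 * tv_sex + 0.10 * age + rng.normal(size=n)  # criterion: pregnancy risk

def z(x):
    return (x - x.mean()) / x.std()

# Regressing the standardized criterion on the standardized predictors
# yields betas: each predictor's relationship with the criterion when
# the other predictor is statistically controlled for.
X = np.column_stack([z(tv_sex), z(age)])
betas, *_ = np.linalg.lstsq(X, z(risk), rcond=None)
print(betas)  # first value is the beta for TV exposure, controlling for age
```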
Identify effect size d and statistical significance and explain what they mean for an experiment.
A statistically significant result suggests covariance exists between the variables in the population from which the sample was drawn. d- this measure represents how far apart two experimental groups are on the dependent variable. It indicates not only the distance between the means, but also how much the scores within the groups overlap. The standardized effect size, d, takes into account both the difference between means and the spread of scores within each group (the standard deviation). When d is larger, it usually means the independent variable caused the dependent variable to change for more of the participants in the study. When d is smaller, it usually means the scores of participants in the two experimental groups overlap more.
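A minimal sketch of computing d for an independent-groups experiment (the scores below are hypothetical):

```python
import numpy as np

def cohens_d(group1, group2):
    """Difference between group means in pooled-standard-deviation units."""
    g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    return (g1.mean() - g2.mean()) / np.sqrt(pooled_var)

# Hypothetical memory scores for two lecture conditions:
print(cohens_d([5, 6, 7, 8], [3, 4, 5, 6]))  # about 1.55: means far apart, little overlap
```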
Describe interactions in terms of "it depends."
Another way to describe interactions involves key phrases. Some interactions, like the crossover interaction between memory and expertise, can be described using the phrase "it depends" as in: "The memory capacity of children depends on their level of expertise."
Interrogate the construct validity of a manipulated variable in an experiment, and explain the role of manipulation checks and theory testing in establishing construct validity.
Ask how well the researchers manipulated (or operationalized) them. Manipulation check: an extra dependent variable that researchers can insert into an experiment to tell them whether their experimental manipulation worked. Likely to be used when the intention is to make participants think or feel certain ways. Example: Researchers were interested in investigating whether humor would improve students' memory of a college lecture. Students were randomly assigned to listen to a serious lecture or one punctuated by humorous examples. To ensure that students actually found the humorous lecture funnier than the serious one, students rated the lecture on how "funny" and "light" it was. These items were in addition to the key dependent variable, which was their memory for the material. As expected, the students in the humorous lecture condition rated the speaker as funnier and lighter than the students in the serious lecture condition. The researchers concluded that the manipulation worked as expected.
Describe random assignment and explain its role in establishing internal validity.
Assigning participants at random to different levels of the independent variable- by flipping a coin, rolling a die, or using a random number generator- controls for all sorts of potential selection effects.
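A minimal sketch of random assignment in code, using hypothetical participant IDs and the note-taking conditions as the example levels:

```python
import random

participants = [f"P{i}" for i in range(1, 21)]  # hypothetical IDs
random.shuffle(participants)                    # chance alone decides group membership
half = len(participants) // 2
laptop_group = participants[:half]              # level 1 of the independent variable
longhand_group = participants[half:]            # level 2 of the independent variable
```

Because only chance determines who lands in each condition, pre-existing differences (ability, motivation) should average out across groups, ruling out selection effects.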
Describe how the procedures for independent-groups and within-groups experiments are different. Explain the pros and cons of each type of design.
Both of these studies used an independent-groups design: different groups of participants are placed into different levels of the independent variable. Also called a between-subjects design or between-groups design. In the notetaking and pasta bowl studies, there were different participants at each level of the independent variable. In the notetaking study, some participants took notes on laptops and others took notes in longhand. In the pasta bowl study, some participants were in the large-bowl condition and others were in the medium-bowl condition. In a within-groups design, there is only one group of participants, and each person is presented with all levels of the independent variable. Also called a within-subjects design. For example, Mueller and Oppenheimer might have run their study as a within-groups design if they had asked each participant to take notes twice- once using a laptop and another time in longhand.
Distinguish measured from manipulated variables in a study.
Manipulated variables: controlled, such as when the researchers assign participants to a particular level (value) of the variable. Measured variables: take the form of records of behavior or attitudes, such as self-reports, behavioral observations, or physiological measures.
Describe counterbalancing, and explain its role in the internal validity of a within-groups design.
Counterbalancing: present the levels of the independent variable to participants in different sequences. With counterbalancing, any order effects should cancel each other out when all the data are collected. Order effects: when exposure to one level of the independent variable influences responses to the next level.
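A minimal sketch of full counterbalancing with two hypothetical conditions: generate every possible order, then rotate through the orders so each is used equally often and order effects cancel out across the sample.

```python
import itertools

conditions = ["laptop", "longhand"]                # hypothetical levels of the IV
orders = list(itertools.permutations(conditions))  # every possible sequence

# Assign each participant one of the orders in rotation.
for participant_id in range(6):
    order = orders[participant_id % len(orders)]
    print(f"P{participant_id + 1}: {' -> '.join(order)}")
```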
Identify three types of correlations in a longitudinal correlational design: cross-sectional correlations, autocorrelations, and cross-lag correlations.
Cross-sectional: test to see whether two variables, measured at the same point in time, are correlated. Autocorrelations: they determine the correlation of one variable with itself, measured on two different occasions. Same variable measured at two different times. Cross-lag: show whether the earlier measure of one variable is associated with the later measure of the other variable. Help establish temporal precedence.
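A minimal sketch of all three correlation types, using simulated two-wave data for the overpraise/narcissism example (the effect sizes are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
praise_t1 = rng.normal(size=n)
narc_t1 = rng.normal(size=n)
praise_t2 = 0.5 * praise_t1 + rng.normal(size=n)   # the variable is somewhat stable
narc_t2 = 0.4 * praise_t1 + rng.normal(size=n)     # a built-in lagged effect

def r(x, y):
    return np.corrcoef(x, y)[0, 1]

print("cross-sectional:", r(praise_t1, narc_t1))   # two variables, same wave
print("autocorrelation:", r(praise_t1, praise_t2)) # one variable, two waves
print("cross-lag:", r(praise_t1, narc_t2))         # earlier praise, later narcissism
```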
Explain why control variables can help an experimenter eliminate design confounds.
Design confound: an experimenter's mistake in designing the independent variable; it is a second variable that happens to vary systematically along with the intended independent variable and therefore is an alternative explanation for the results. Control variables: allow researchers to separate one potential cause from another and thus eliminate alternative explanations for results. Control variables are therefore important for establishing internal validity. When an experiment has a design confound, it has poor internal validity and cannot support a causal claim. Because the van Kleef study did not have any apparent design confounds, its internal validity is sound. The researchers carefully thought about confounds in advance and turned them into control variables instead. The researchers took steps to help them justify making a causal claim.
Review three threats to internal validity: design confounds, selection effects, and order effects.
Design confound: an experimenter's mistake in designing the independent variable; it is a second variable that happens to vary systematically along with the intended independent variable and therefore is an alternative explanation for the results. Selection effects: when the kinds of participants in one level of the independent variable are systematically different from those in the other. They can also happen when the experimenters let participants choose (select) which group they want to be in. A selection effect may result if the experimenters assign one type of person (e.g., all the women, or all who sign up early in the semester) to one condition, and another type of person (e.g., all the men, or all those who wait until later in the semester) to another condition. Order effects: when exposure to one level of the independent variable influences responses to the next level. Practice effects: a long sequence might lead participants to get better at the task, or to get tired or bored toward the end. Carryover effects: some form of contamination carries over from one condition to the next. For example, imagine sipping orange juice right after brushing your teeth; the first taste contaminates your experience of the second one.
Cronbach's alpha
Determine internal reliability when there are three or more relevant measures.
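A minimal sketch of the standard alpha formula, applied to a hypothetical participants-by-items score matrix:

```python
import numpy as np

def cronbach_alpha(scores):
    """scores: one row per participant, one column per item."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                              # number of items
    item_vars = scores.var(axis=0, ddof=1).sum()     # sum of the item variances
    total_var = scores.sum(axis=1).var(ddof=1)       # variance of participants' totals
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical scores: 5 participants answering 3 items.
print(cronbach_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]))
```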
Explain how the correlation coefficient, r, represents strength and direction of a relationship between two quantitative variables.
Direction refers to whether the association is positive, negative, or zero; strength refers to how closely related the two variables are - how close r is to 1 or -1.
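A minimal sketch with hypothetical height/weight data; scipy's pearsonr returns both r and its p value:

```python
from scipy import stats

height = [60, 62, 65, 68, 70, 72]        # hypothetical inches
weight = [115, 120, 140, 155, 165, 180]  # hypothetical pounds

r, p = stats.pearsonr(height, weight)
print(r)  # close to +1: a strong positive association
```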
Describe the differences among direct replication studies, conceptual replication studies, and replication-plus-extension studies.
Direct replication: researchers repeat an original study as closely as they can to see whether the effect is the same in the newly collected data. Conceptual replication: researchers explore the same research question but use different procedures. Replication-plus-extension: researchers replicate their original experiment and add variables to test additional questions.
Explain why experiments are superior to multiple-regression designs for controlling for third variables.
Even when a study takes place over time, another problem is that researchers cannot control for variables they don't measure. Even though multiple regression controls for any third variables the researchers do measure in the study, some other variable they didn't consider could account for the association. This unknown third-variable problem is one reason that a well-run experimental study is ultimately more convincing in establishing causation than a correlational study. An experimental study on TV, for example, would randomly assign a sample of people to watch either sexy TV shows or programs without sexual content. The power of random assignment would make the two groups likely to be equal on any third variables the researchers did not happen to measure, such as religiosity, social class, or parenting styles. A randomized experiment is the gold standard for determining causation. Multiple regression, in contrast, allows researchers to control for potential third variables, but only for the variables they choose to measure.
Articulate how a crossed factorial design works.
Factorial design: one in which there are two or more independent variables (also referred to as factors). In the most common factorial design, researchers cross the two independent variables; that is, they study each possible combination of the independent variables. Strayer and Drews created a factorial design to test whether the effect of driving while talking on a cell phone depended on the driver's age. They used two independent variables (cell phone use and driver age), creating a condition representing each possible combination of the two. To cross the two independent variables, they essentially overlaid one independent variable on top of another. This overlay process created four unique conditions, or cells: younger drivers using cell phones, younger drivers not using cell phones, older drivers using cell phones, and older drivers not using cell phones. There are two independent variables (two factors)- cell phone use and age- and each one has two levels (driving while using a cell phone or not; younger or older driver). This particular design is called a 2x2 (two-by-two) factorial design, meaning that two levels of one independent variable are crossed with two levels of another independent variable. Since 2x2=4, there are four cells in this design.
Always researched in generalization mode?
Frequency claims
Identify the following nine threats to internal validity: history, maturation, regression, attrition, testing, instrumentation, observer bias, demand characteristics, and placebo effects.
History threats: result from a "historical" or external factor that systematically affects most members of the treatment group at the same time as the treatment itself, making it unclear whether the change is caused by the treatment received. To be a history threat, the external factor must affect most people in the group in the same direction (systematically), not just a few people (unsystematically). A comparison group can help control for history threats. In the Go Green study, the students would need to measure the kilowatt usage in another, comparable dormitory during the same 2 months, but not give the students in the second dorm the Go Green campaign materials. If both groups decreased their electricity usage about the same over time, the decrease probably resulted from the change of seasons, not from the Go Green campaign. However, if the treatment group decreased its usage more than the comparison group did, you can rule out the history threat. Maturation: a change in behavior that emerges more or less spontaneously over time. People adapt to changed environments; children get better at walking and talking; plants grow taller- but not because of any outside intervention. It just happens. Regression: a statistical concept called regression to the mean. When a group average (mean) is unusually extreme at time 1, the next time that group is measured (time 2), it is likely to be less extreme- closer to its typical or average performance. The change would not occur because of the treatment, but simply because of regression to the mean, so in this case there would be an internal validity threat. Attrition: in studies that have a pretest and posttest, attrition is a reduction in participant numbers that occurs when people drop out before the end. Attrition can happen when a pretest and posttest are administered on separate days and some participants are not available on the second day. An attrition threat becomes a problem for internal validity when attrition is systematic; that is, when only a certain kind of participant drops out. Testing: refers to a change in the participants as a result of taking a test (dependent measure) more than once. People might become more practiced at taking the test, leading to improved scores, or they may become fatigued or bored, which could lead to worse scores over time. Instrumentation: occurs when a measuring instrument changes over time. In observational research, the people who are coding behaviors are the measuring instruments, and over a period of time, they might change their standards for judging behavior by becoming more strict or more lenient. Observer bias: occurs when researchers' expectations influence their interpretation of the results. Demand characteristics: a problem when participants guess what the study is supposed to be about and change their behavior in the expected direction. Placebo effects: occur when people receive a treatment and really improve- but only because the recipients believe they are receiving a valid treatment.
Describe an interaction as a "difference in differences."
In a factorial design with two independent variables, the first two results obtained are the main effects for each independent variable. The third result is the interaction effect. Whereas the main effects are simple differences, the interaction effect is the difference in differences.
Interrogate two aspects of external validity for an experiment (generalization to other populations and to other settings).
In interrogating external validity in the context of causal claims, you ask whether the causal relationship can generalize to other people, places, and times. You should ask how the experimenters recruited their participants. External validity also applies to the types of situations to which an experiment might generalize. For example, the van Kleef study used pasta, but other researchers in the same lab found that large serving containers also cause people to consume more soup, popcorn, and snack chips.
Identify an experiment's independent, dependent, and control variables.
Independent variable: the manipulated (causal) variable Dependent variable: the measured variable, or outcome variable Control variables: what the experimenter holds constant on purpose (important for establishing internal validity)
Explain ABA Designs
Involves establishing a baseline condition (the "A" phase), introducing a treatment or intervention to effect some sort of change (the "B" phase), and then removing the treatment to see if behavior returns to the baseline.
Explain how longitudinal designs are conducted.
Longitudinal designs can provide evidence for temporal precedence by measuring the same variables in the same people at several points in time.
Kappa
Looks at agreement between two raters, but for categorical data; used instead of a correlation coefficient when the variables are categorical.
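A minimal sketch of Cohen's kappa for two raters assigning categorical codes (the codes below are hypothetical); it compares observed agreement to the agreement expected by chance:

```python
def cohens_kappa(rater1, rater2):
    n = len(rater1)
    categories = set(rater1) | set(rater2)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: the product of each rater's base rate, summed per category.
    expected = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)

print(cohens_kappa(list("AABBC"), list("AABBB")))  # about .67
```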
Describe three causes of within-groups variance—measurement error, individual differences, and situation noise—and indicate how each might be reduced.
Measurement error: a human or instrument factor that can inflate or deflate a person's true score on the dependent variable. Solution 1: use reliable, precise tools. When researchers use measurement tools that have excellent reliability (internal, interrater, and test-retest), they can reduce measurement error. When such tools also have good construct validity, there will be a lower error rate as well. More precise and accurate measurements have less error. Solution 2: measure more instances. One solution to measuring badly is to take more measurements. When a tool potentially causes a great deal of random error, the researcher can cancel out many errors simply by including more people in the sample. The errors cancel each other out. Individual differences: they can be a problem in independent-groups designs. Solution 1: add more participants. The more people you measure, the less impact any single person will have on the group's average. Adding more participants reduces the influence of individual differences within groups, thereby enhancing the study's ability to detect differences between groups. Solution 2: change the design. One way to accommodate individual differences is to use a within-groups design instead of an independent-groups design. Within-groups designs control for irrelevant individual differences, and they require fewer participants than independent-groups designs. Situation noise: external distractions. Solution: carefully controlling the surroundings of an experiment.
Identify a mediation hypothesis and sketch a diagram of the hypothesized relationship. Describe the steps for testing a mediation hypothesis.
Mediator: We know something happens, but we want to know why it works. Example: We know conscientious people are more physically healthy than less conscientious people, but why? The mediator of this relationship might be the fact that conscientious people are more likely to follow medical advice and instructions, and that's why they're healthier. Following doctor's orders would be the mediator of the relationship between the trait, conscientiousness, and the outcome, better health. Example: We know there's a relationship between having deep conversations and feelings of well-being. Researchers might next propose a reason- a mediator of this relationship. One likely mediator could be social ties: deeper conversations might help build social connections, which in turn can lead to increased well-being. They would propose an overall relationship, c, between deep talk and well-being. However, this overall relationship exists only because there are two other relationships: a (between deep talk and social ties) and b (between social ties and well-being). In other words, social ties mediate the relationship between deep talk and well-being. The researchers could examine this mediation hypothesis by following five steps: 1. Test for relationship c. Is deep talk associated with well-being? (If it's not, there is no relationship to mediate.) 2. Test for relationship a. Is deep talk associated with the proposed mediator, strength of social ties? Do people who have deeper conversations actually have stronger ties than people who have more shallow conversations? 3. Test for relationship b. Do people who have stronger social ties have higher levels of well-being? 4. Run a regression test, using both strength of social ties and deep talk as predictor variables to predict well-being, to see whether relationship c goes away. (If social tie strength is the mediator of relationship c, the relationship between deep talk and well-being should drop when social tie strength is controlled for. Here we would be using regression as a tool to show that deep talk was associated with well-being in the first place because social tie strength was responsible.) 5. Mediation is definitively established only when the proposed causal variable is measured or manipulated first in a study, followed some time later by the mediating variable, followed by the proposed outcome variable. In other words, to establish mediation in this example, the researchers must conduct a study in which the amount of deep talk is measured (or manipulated) first, followed shortly afterward by a measure of social tie strength. They have to measure well-being last of all, to rule out the possibility that well-being led to having deeper conversations.
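A minimal sketch of steps 1-4 using simulated data for the deep talk example (paths a and b are built into the fake data; statsmodels provides the regressions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 300
deep_talk = rng.normal(size=n)
social_ties = 0.6 * deep_talk + rng.normal(size=n)   # path a
well_being = 0.5 * social_ties + rng.normal(size=n)  # path b

def slopes(y, *predictors):
    X = sm.add_constant(np.column_stack(predictors))
    return sm.OLS(y, X).fit().params[1:]             # drop the intercept

print("c  (deep talk -> well-being):", slopes(well_being, deep_talk))
print("a  (deep talk -> ties):      ", slopes(social_ties, deep_talk))
print("b  (ties -> well-being):     ", slopes(well_being, social_ties))
# Step 4: with ties controlled, the deep-talk coefficient should shrink toward 0.
print("c' (both predictors):        ", slopes(well_being, deep_talk, social_ties))
```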
Ceiling effect
Scores cluster at the high end of the scale, so there is not much variation.
Articulate the reasons that a study might result in null effects: weak manipulations, insensitive measures, ceiling and floor effects, not enough variance between groups, too much variance within groups, reverse confound, or a true null effect.
Null effect: a result in which the expected difference is absent; that is, the independent variable did not make a significant difference in the dependent variable. Weak manipulation: when you interrogate a null result, it's important to ask how the researchers operationalized the independent variable. In other words, you have to ask about construct validity. The researcher might have obtained a very different pattern of results if he had given $0.00, $5.00, and $150.00 to the three groups. The educational psychologist might have found reading games improve scores if done daily for 3 months rather than just a week. Insensitive measures: sometimes a study finds a null result because the researchers have not used an operationalization of the dependent variable with enough sensitivity. It would be like asking a friend who hates spicy food to taste your two bowls of salsa; he'd simply call both of them "way too spicy." When it comes to dependent measures, it's smart to use ones that have detailed, quantitative increments- not just two or three levels. Ceiling effect: all the scores are squeezed together at the high end. In a floor effect, all the scores cluster at the low end. Reverse confound: a study might be designed in such a way that a design confound actually counteracts, or reverses, some true effect of an independent variable. In the money and happiness study, for example, perhaps the students who received the most money happened to be given the money by a grumpy experimenter, while those who received the least money were exposed to a more cheerful person; this confound would have worked against any true effect of money on mood. -If, after interrogating these possible obscuring factors, you find that the experiment was conducted in ways that maximized its power and yet still yielded a nonsignificant result, you can probably conclude the independent variable truly does not affect the dependent variable.
Interrogate the statistical validity of an association claim, asking about features of the data that might distort the meaning of the correlation coefficient, such as outliers in the scatterplot, effect size, and the possibility of restricted range (for a lower-than-expected correlation). When the correlation coefficient is zero, inspect the scatterplot to see if the relationship is curvilinear.
Outliers: an extreme score- a single case (or a few cases) that stands out from the pack. Depending on where it sits in relation to the rest of the sample, a single outlier can have an effect on the correlation coefficient r. Can be problematic because even though they are only one or two data points, they may exert disproportionate influence. In a bivariate correlation, outliers are mainly problematic when they involve extreme scores on both of the variables. In evaluating the positive correlation between height and weight, for example, a person who is both extremely tall and extremely heavy would make the r appear stronger; a person who is extremely short and extremely heavy would make the r appear weaker. Effect size: all associations are not equal; some are stronger than others. Recall that the effect size describes the strength of a relationship between two or more variables. The conventions for labeling correlations are small, medium, or large in strength. Larger effect sizes allow more accurate predictions. Errors of prediction get larger when associations get weaker. Positive and negative associations can allow us to predict one variable from another, and the stronger the effect size, the more accurate, on average, our predictions will be. Larger effect sizes are usually more important. However, there are exceptions to this rule; a small effect size can be important, especially when it has life-or-death implications. Restricted range: in a correlational study, if there is not a full range of scores on one of the variables in the association, it can make the correlation appear smaller than it really is. Because restriction of range makes correlations appear smaller, we would ask about it primarily when the correlation is weak. Researchers could, if possible, recruit more people at the ends of the spectrum. Curvilinear: when a study reports that there is no relationship between two variables, the relationship might truly be zero. In rare cases, however, there might be a curvilinear association in which the relationship between two variables is not a straight line; it might be positive up to a point, and then become negative. For example, between age and the use of health care services. As people get older, their use of the health care system decreases up to a point. Then, as they approach age 60 and beyond, health care use increases again. However, when we compute a simple bivariate correlation coefficient r on these data, we get only r = .01 because r is designed to describe the slope of the best-fitting straight line through the scatterplot. When the slope of the scatterplot goes up and then down, r does not describe the pattern very well. The straight line that fits best through this set of points is flat and horizontal, with a slope of zero. Therefore, if we looked only at the r and not at the scatterplot, we might conclude there is no relationship between age and use of health care. When researchers suspect a curvilinear association, the statistically valid way to analyze it is to compute the correlation between one variable and the square of the other.
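A minimal sketch of the curvilinear case with simulated age/health-care data (a U-shape centered at midlife, not the textbook's exact numbers): the straight-line r is near zero, but correlating use with the squared, mean-centered age recovers the pattern.

```python
import numpy as np

rng = np.random.default_rng(3)
age = np.linspace(20, 80, 200)
use = (age - 50) ** 2 + rng.normal(scale=100, size=200)  # high when young, dips, high when old

r_linear = np.corrcoef(age, use)[0, 1]                      # near zero: a flat best-fit line
r_curved = np.corrcoef((age - age.mean()) ** 2, use)[0, 1]  # large and positive
print(round(r_linear, 2), round(r_curved, 2))
```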
Give examples of how external validity applies both to other participants and to other settings.
Participants: to assess a study's generalizability to other people, you would ask how the participants were obtained. If a study is intended to generalize to some population, the researchers must draw a probability sample from that population. If a study uses a convenience sample, you can't be sure of the study's generalizability to the population the researcher intends. -It's a population, not the population -External validity comes from how, not how many -Just because a sample comes from a population doesn't mean it generalizes to that population Other settings: sometimes you want to know whether a lab situation created for a study generalizes to real-world settings.
Confounds
Possible alternative explanations, which are threats to internal validity. When there is a confound, you are confused about what is causing the change in the dependent variable.
Identify posttest-only and pretest/posttest designs, and explain when researchers might use each one
Posttest-only design: participants are randomly assigned to independent variable groups and are tested on the dependent variable once. Pretest/posttest design: participants are randomly assigned to at least two different groups and are tested on the key dependent variable twice- once before and once after exposure to the independent variable. -Researchers might use a pretest/posttest design if they want to study improvement over time, or to be extra sure that two groups were equivalent at the start- as long as the pretest does not make the participants change their spontaneous behavior.
Explain the difference between concurrent-measures and repeated-measures designs.
Repeated-measures design: a type of within-groups design in which participants are measured on a dependent variable more than once, after exposure to each level of the independent variable. Concurrent-measures design: participants are exposed to all the levels of an independent variable at roughly the same time, and a single attitudinal or behavioral preference is the dependent variable.
Consider why journalists might prefer to report single studies, rather than parsimonious patterns of data.
Reporting on the latest study without giving the full context can make it seem as though scientists conduct unconnected studies on a whim. It might even give the impression that one study can reverse decades of previous research. In addition, skeptics who read such science stories might find it easy to criticize the results of a single, correlational study.
Be familiar with the first two steps of meta-analysis
Researchers collect all possible examples of a particular kind of study. They then average all the effect sizes to find an overall effect size.
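A minimal sketch of those two steps with invented effect sizes; weighting the average by sample size is one common choice:

```python
import numpy as np

# Step 1: collect the effect size (d) and sample size from every located study.
effect_sizes = np.array([0.31, 0.45, 0.12, 0.60, 0.25])  # hypothetical d values
sample_sizes = np.array([40, 120, 60, 200, 80])

# Step 2: average the effect sizes; larger studies get more weight.
overall_d = np.average(effect_sizes, weights=sample_sizes)
print(round(overall_d, 2))
```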
Explain P-hacking and give at least 2 examples of it
Researchers include multiple dependent variables in an experiment, and perhaps only one out of seven or eight variables shows a significant difference. A researcher might report the significant effect, dismissing all the variables that didn't work. Scientists might craft an after-the-fact hypothesis about a surprising result, making it appear as if they predicted it all along, a practice called HARKing, for "hypothesizing after the results are known." Or a researcher might peek at the study's results, and if they are not quite significant, run a few more participants, decide to remove certain outliers from the data, or run a different type of analysis. This is called p-hacking, in part because the goal is to find a p value of just under .05, the traditional cutoff for significance testing.
Explain why experimenters usually prioritize internal validity over external validity when it is difficult to achieve both.
Researchers usually prioritize experimental control- that is, internal validity. To get a clean, confound-free manipulation, they may have to conduct their study in an artificial environment like a university laboratory. Such locations may not represent situations in the real world. Although it's possible to achieve both internal and external validity in a single study, doing so can be difficult. Therefore, many experimenters decide to sacrifice real-world representativeness for internal validity.
Explain Preregistration
Scientists can preregister their study's method, hypotheses, or statistical analyses online, in advance of data collection. This process could discourage p-hacking because it values the full, honest theory-data cycle rather than only the significant findings.
Understand statistical significance
Statistically significant: the conclusion a researcher reaches regarding the likelihood of getting a correlation of that size just by chance. -Significance tests provide a probability estimate (p, sometimes abbreviated as sig for significance). The p value is the probability that the sample's association came from a population in which the association is zero. If the probability (p) associated with the result is very small- that is, less than 5%- we know that the result is very unlikely to have come from a zero-association population. -Sometimes we determine that the probability (p) of getting the sample's correlation just by chance would be relatively high (i.e., higher than p = .05) in a zero-association population. The result is then considered "nonsignificant" or "not statistically significant." This means we cannot rule out the possibility that the result came from a population in which the association is zero.
Define replication crisis in psychology
The OSC selected 100 studies from three major psychology journals and recruited researchers around the world to conduct replications. By one metric, only 39% of the studies clearly replicated the original effects. -In response, scientists have introduced changes. One is for research journals to require much larger sample sizes, both for original and replication studies. Researchers are also urged to report all of the variables and analyses they tested.
Define statistical power
The likelihood that a study will return a statistically significant result when the independent variable really has an effect. -A within-groups design, a strong manipulation, a large number of participants, and less situation noise are all things that will increase the power of an experiment. Of these, the easiest way to increase power is to add more participants.
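A minimal sketch of a power calculation using statsmodels: how many participants per group would be needed for an 80% chance of detecting a medium effect in an independent-groups design?

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the per-group sample size of an independent-groups t test,
# given a medium effect (d = .50), alpha = .05, and desired power = .80.
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n_per_group))  # roughly 64 participants per group
```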
Given a factorial notation (e.g., 2 x 4), identify the number of independent variables, the number of levels of each variable, the number of cells in the design, and the number of main effects.
The notation for factorial designs follows a simple pattern. Factorials are notated in the form "_x_." The quantity of numbers indicates the number of independent variables (a two-by-three design is represented with 2 numbers, 2 and 3). The value of each number indicates how many levels there are for each independent variable (2 levels for one and 3 levels for the other). When you multiply the two numbers, you get the total number of cells in the design. -In a factorial design, researchers test each independent variable to look for a main effect- the overall effect of one independent variable on the dependent variable, averaging over the levels of the other independent variable. In other words, a main effect is a simple difference. In a factorial design with two independent variables, there are two main effects.
What is not true of selection effects?
They are unimportant for interrogating internal validity.
Articulate the mission of cultural psychology: to encourage researchers to test their theories in other cultural contexts (that is, to generalize to other cultures).
They have challenged researchers who work exclusively in theory-testing mode by identifying several theories that were supported by data in one cultural context but not in any other.
Articulate the difference between mediators, third variables, and moderating variables.
Third variables: the proposed third variable is external to the two variables in the original bivariate correlation; it might even be seen as an accident- a problematic "lurking variable" that potentially detracts from the relationship of interest. Mediator: internal to the causal variable and often of direct interest to the researchers, rather than a nuisance. In the deep talk example, the researchers believe stronger social ties is the important aspect, or outcome, of deep talk that is responsible for increasing well-being. When researchers test for mediating variables, they ask: Why are these two variables linked? When they test for moderating variables, they ask: Are these two variables linked the same way for everyone, or in every situation? Mediators ask: Why? Moderators ask: Who is most vulnerable? For whom is the association strongest? A mediating variable comes in the middle of the other two variables. Moderators can inform external validity.
Describe matching, explain its role in establishing internal validity, and explain situations in which matching may be preferred to random assignment.
To create matched groups from a sample of 30, the researchers would first measure the participants on a particular variable that might matter to the dependent variable. Student ability, operationalized by GPA, for instance, might matter in a study of note taking. They would next match participants up in pairs, starting with the two having the highest GPAs, and within that matched pair, randomly assign one of them to each of the two note-taking conditions. They would then take the pair with the next-highest GPAs and within that pair again assign randomly to the two groups. They would continue this process until they reach the participants with the lowest GPAs and assign them at random too. This method ensures that the groups are equal on some important variable, such as GPA, before the manipulation of the independent variable.
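A minimal sketch of matched-groups assignment with hypothetical GPAs: sort, pair off, then randomize within each pair.

```python
import random

# Hypothetical sample of 30 (participant ID, GPA) pairs.
sample = [(f"P{i}", round(random.uniform(2.0, 4.0), 2)) for i in range(1, 31)]

sample.sort(key=lambda p: p[1], reverse=True)  # order by GPA, highest first
laptop_group, longhand_group = [], []
for pair in zip(sample[0::2], sample[1::2]):   # successive pairs are matched on GPA
    pair = list(pair)
    random.shuffle(pair)                       # random assignment within each pair
    laptop_group.append(pair[0])
    longhand_group.append(pair[1])
```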
Identify Type 1 and Type 2 errors
Type I error, also known as a "false positive": the error of rejecting a null hypothesis when it is actually true. In other words, this is the error of accepting an alternative hypothesis (the real hypothesis of interest) when the results can be attributed to chance Type II error, also known as a "false negative": the error of not rejecting a null hypothesis when the alternative hypothesis is the true state of nature. In other words, this is the error of failing to accept an alternative hypothesis when you don't have adequate power
Interrogate the external validity of an association claim by asking to whom the association can generalize.
When interrogating the external validity of an association claim, you ask whether the association can generalize to other people, places, and times.
Define dependent variables and predictor variables in the context of multiple-regression data.
When researchers use multiple regression, they are studying three or more variables. The first step is to choose the variable they are most interested in understanding or predicting; this is known as the criterion variable, or dependent variable. The Chandra team was primarily interested in predicting pregnancy, so they chose that as their criterion variable. The criterion (dependent) variable is almost always specified in either the top row or the title of a regression table. The rest of the variables measured in a regression analysis are called predictor variables, or independent variables. In the sexy TV/pregnancy study, the predictor variables are the amount of sexual content teenagers reported viewing on TV and the age of each teen.
Describe the difference between generalization mode, in which external validity is essential, and theory-testing mode, in which external validity is less important than internal validity and may not be important at all.
When researchers work in theory-testing mode, they are usually designing correlational or experimental research to investigate support for a theory. The theory-data cycle is the process of designing studies to test a theory and using the data from the studies to reject, refine, or support the theory. In theory-testing mode, external validity often matters less than internal validity. Although much of the research in psychology is conducted in theory-testing mode, at certain times research takes place in generalization mode, when researchers want to generalize the findings from the sample in a previous study to a larger population. They are careful, therefore, to use probability samples with appropriate diversity of gender, age, ethnicity, and so on. In other words, researchers in generalization mode are concerned about external validity. -Frequency claims are always in generalization mode. -Association and causal claims are sometimes in generalization mode.
Identify interaction effects in a line graph.
While it's possible to compute interactions from a table, it is sometimes easier to notice them on a graph. When results from a factorial design are plotted as a line graph and the lines are not parallel, there may be an interaction, something you would confirm with a significance test. If the lines are parallel, there probably is no interaction. Notice that lines don't have to cross to indicate an interaction; they simply have to be non-parallel.
Explain what a meta-analysis does.
A way of mathematically averaging the results of all the studies (both published and unpublished) that have tested the same variables to see what conclusion the whole body of evidence supports.
Identify strength of Cohen's effect size d based on Table 10.2
d = .20 -> small, or weak -> r = .10
d = .50 -> medium, or moderate -> r = .30
d = .80 -> large, or strong -> r = .50
Explain file drawer problem
the idea that a meta-analysis might be overestimating the true size of an effect because null effects, or even opposite effects, have not been included in the collection process.