TEST #1
Choosing how to measure something
Which of the four scales of measurement to use, based on the type of item you are measuring, whether it is measurable by numbers or by physical descriptions.
Demand characteristics (and ways to control)
A feature of the experimental design or procedure that increases the chances that participants will detect the true purpose of the study; Participants can be biased and behave in ways they think they should behave - Good subject role: Tries to act in a way that confirms hypothesis - Bad subject role: Tries to act in a way that disconfirms hypothesis - Social desirability bias: Gives socially desirable answers • Some strategies - Make subjects blind to the condition they are in - Mask the experimental manipulation by using a placebo
Pseudoscience
A field of inquiry that attempts to associate with true science, relies exclusively on selective anecdotal evidence, and is deliberately too vague to be adequately tested.
Directionality problem and third-variable problem
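A high correlation indicates that the level of one variable predicts the level of another variable (very useful), but you cannot infer a cause-effect relationship. Directionality problem: you cannot tell whether X causes Y or Y causes X. Third-variable problem: an unmeasured variable may influence both X and Y, so the correlation can be "spurious."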
Samples and populations
Sample: a portion or subset of a population; the subset of the population who actually participate. Population: all of the members of an identifiable group; the entire collection of individuals being considered (the people you are trying to generalize your results to).
Stratified random sample
A probability sample that is random, with the restriction that important subgroups are proportionately represented within it; A type of random sample in which subgroups are represented in the same proportions as the population
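The proportional idea above can be sketched in a few lines of Python. This is only an illustration: the function name `stratified_sample`, the class-year population, and the fixed seed are all hypothetical, and rounding can make the total differ slightly from `n` for less tidy proportions.

```python
import random

def stratified_sample(population, stratum_of, n, seed=0):
    """Draw about n members so each stratum keeps its population proportion."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        # Proportional allocation, rounded to the nearest whole person.
        k = round(n * len(members) / len(population))
        # Simple random sample within the stratum.
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 first-years and 40 seniors.
population = [("first-year", i) for i in range(60)] + [("senior", i) for i in range(40)]
sample = stratified_sample(population, lambda m: m[0], n=10)

counts = {}
for year, _ in sample:
    counts[year] = counts.get(year, 0) + 1
print(counts)
```

With a 60/40 population and n = 10, the sample holds 6 first-years and 4 seniors, matching the population proportions.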
Blocked randomization
A procedure used in within-subjects design as a counterbalancing procedure to ensure that when participants are tested in each condition more than once, they experience each condition once before experiencing any condition again; Each block of trials involves a random order of each of the levels of the IV - Useful when you have many items/observations per condition
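A minimal sketch of blocked randomization, assuming three hypothetical conditions labeled A, B, and C. The function name `blocked_order` and the seed are illustrative choices, not part of any standard library.

```python
import random

def blocked_order(conditions, n_blocks, seed=0):
    """Build a trial order in which every block contains each condition once."""
    rng = random.Random(seed)
    order = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)      # random order within this block
        order.extend(block)     # a condition cannot repeat until the block ends
    return order

trials = blocked_order(["A", "B", "C"], n_blocks=4)
print(trials)
```

Each consecutive group of three trials contains A, B, and C exactly once, so no condition is experienced a second time before the others have been experienced once.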
Attrition/Mortality effects
A threat to the internal validity of a study; occurs when participants fail to complete a study, usually but not necessarily a longitudinal study; those finishing the study may not be equivalent to those who started it; Participants may be more likely to complete the experiment in one condition than another • Even if random assignment ensures equivalent groups at the start of an experiment, they may not be equivalent by the end of the experiment
Subject selection effects
A threat to the internal validity of a study; occurs when those participating in a study cannot be assigned randomly to groups; hence the groups are nonequivalent
Counterbalancing
For a within-subjects variable, any procedure designed to control for sequence effects; can be used to control for order effects, sequence effects, and item effects
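One common form is complete counterbalancing, in which every possible order of conditions is used equally often. The sketch below assumes three hypothetical conditions and six hypothetical participants; `complete_counterbalance` is an illustrative name.

```python
from itertools import permutations

def complete_counterbalance(conditions, participants):
    """Cycle participants through every possible order of the conditions."""
    orders = list(permutations(conditions))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

participants = [f"P{i}" for i in range(1, 7)]
assignments = complete_counterbalance(["A", "B", "C"], participants)

for person, order in assignments.items():
    print(person, order)
```

Three conditions give six possible orders, so with six participants each order is used once and each condition appears first for exactly two people. With many conditions the number of orders grows quickly, which is why partial schemes are often used instead.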
ratio
Interval with True Zero (e.g., age, response time, errors); The amount of time someone waits before finally eating the marshmallow
Continuous variables
Not limited to particular values
2) What describes exactly how a variable is measured in the context of a research study? a. Operational definition b. Convergent validity c. Demand characteristics d. Cohen's d
a
A researcher is interested in the effect of using Snapchat on feelings of self-worth. To explore the relationship between the two variables, the researcher measures the amount of time participants spend using Snapchat per day (in minutes), and the degree to which they report having strong feelings of self-worth on a scale of 0 to 10 (0 indicating low feelings of self-worth; 10 indicating high feelings of self-worth). 10) Which variable is being measured on a ratio scale? a. Snapchat Use b. Self-worth c. Both d. Neither
a
Cohen's d
Effect size used to indicate the standardized difference between two means: the difference between two means divided by the standard deviation. d = (M1 - M2)/SD
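The formula can be worked through in Python. The card does not say which standard deviation to use, so this sketch assumes the pooled standard deviation of the two groups, which is one common choice. The scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled sample SD (an assumption)."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled_sd

# Hypothetical scores for two conditions.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
d = cohens_d(treatment, control)
print(round(d, 2))
```

Here M1 = 8, M2 = 6, and the pooled SD is about 1.58, so d is about 1.26.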
Data
Facts, figures, and other evidence gathered through observations.
Type II errors
Failing to reject the null hypothesis when it is false; failing to find a statistically significant effect when the effect truly exists.
between-subject designs
An experimental design in which different groups of participants serve in the different conditions of the study; Each subject experiences only one experimental condition
Within-subject designs
An experimental design in which the same participants serve in each of the conditions of the study: also called a repeated measures design; Each subject experiences all experimental conditions - More powerful design, but requires additional control
confounding variables
An extraneous variable that covaries with the independent variable and could provide an alternative explanation of the results. (threat to researcher); Variables that vary systematically with the IV (large threat to internal validity)
Error bars/Confidence intervals
An inferential statistic in which a range of scores is calculated; with some degree of confidence (e.g., 95%) it is assumed that population values lie within the interval.
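A rough sketch of how a 95% confidence interval for a mean might be computed. It assumes a normal approximation (mean plus or minus z times the standard error); for small samples a t-based interval is usually more appropriate. The sleep scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(scores, confidence=0.95):
    """Normal-approximation confidence interval for the mean (an assumption)."""
    m = mean(scores)
    standard_error = stdev(scores) / sqrt(len(scores))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m - z * standard_error, m + z * standard_error

# Hypothetical hours of sleep reported by eight students.
scores = [6.5, 7.0, 6.8, 7.2, 6.4, 6.9, 7.1, 6.6]
low, high = confidence_interval(scores)
print(round(low, 2), round(high, 2))
```

The interval is centered on the sample mean, and it narrows as the sample gets larger or the scores get less variable.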
Extraneous variables
An uncontrolled factor not of interest to the researcher but that could affect the results; All possible variables other than the IV (can affect the DV, but if controlled, then not a threat to internal validity)
Limits to how we know what we know (e.g., authority, logic, intuition)
Authority - Government, media, professors, doctors, parents. - Often right, sometimes very wrong. Sometimes we just don't know... • Logic and Reason - Assumes assumptions are true... under all conditions... • Intuition and Experience - Initial impression, gut feeling, judgment, etc. - Not objective; can be inaccurate and biased
dependent variables
Behavior measured as the outcome of an experiment; The outcome being measured
Order effects
Can occur in a within-subjects design when the experience of participating in one of the conditions of the study influences performance in subsequent conditions; Subjects may perform differently when a condition is experienced first than when it is experienced second
Correlations and relationships between variables
Cannot infer cause-effect relationship; A representation of the relationship between two variables
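One standard way to represent the relationship between two variables is the Pearson correlation coefficient r. The sketch below computes it by hand; the Snapchat minutes and self-worth ratings are invented for illustration and are not data from any study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_deviation / (spread_x * spread_y)

# Hypothetical data: minutes of Snapchat use per day and self-worth (0 to 10).
minutes = [10, 20, 30, 40, 50]
self_worth = [8, 7, 5, 4, 2]
r = pearson_r(minutes, self_worth)
print(round(r, 2))
```

A value near -1 means higher use predicts lower self-worth in these made-up numbers. It still says nothing about which variable, if either, is the cause.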
nominal
Categorical (e.g., major in college, graduate or not); Whether or not someone eats the marshmallow
Importance of the control group
Controls are needed to increase power and prevent confounding variables, but they can also reduce external validity
Basics of the scientific method
Determinism and discoverability • Consider alternative hypotheses and collect data to see which is supported • Objectively verify assumptions through empirical evidence (direct observation and experience) • Remain skeptical and reserve judgment in the face of contradictory information (self-correcting/tentative) • Probabilistic (inferences/assumptions are never 100%)
Variability (range, standard deviation)
Difference between the high and low scores of a sample; Estimate of the average amount by which scores in a sample deviate from the mean
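Both measures can be computed directly. The two sets of sleep scores below are hypothetical and were chosen to have the same mean but very different spread.

```python
from statistics import mean, stdev

def score_range(scores):
    """Difference between the highest and lowest score."""
    return max(scores) - min(scores)

# Hypothetical hours of sleep in two class sections.
section_a = [6.5, 6.8, 7.0, 6.9, 6.8]
section_b = [0.0, 10.0, 6.8, 9.2, 8.0]

for name, scores in [("A", section_a), ("B", section_b)]:
    print(name, round(mean(scores), 2), score_range(scores), round(stdev(scores), 2))
```

Both sections average 6.8 hours, but section B has a much larger range and standard deviation, so extreme scores (such as no sleep at all) are more plausible there.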
Ensuring equivalent groups through random assignment
Large sample plus random assignment ensures equivalent groups - All extraneous subject factors are controlled - Matched random assignment can also be useful (e.g., small sample size and or when an extraneous variable is likely to co-vary with the DV) • An experiment without random assignment will suffer from subject selection effects and is likely to have low internal validity • Even if a difference is statistically significant (reject the null), you won't be able to interpret - Many possible confounding variables - Low internal validity
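A minimal sketch of simple random assignment, assuming 20 hypothetical participant IDs and two hypothetical conditions. The function name `randomly_assign` and the fixed seed are illustrative.

```python
import random

def randomly_assign(participants, conditions, seed=0):
    """Shuffle participants, then deal them out to conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

groups = randomly_assign(range(1, 21), ["treatment", "control"])
print(groups)
```

Because chance alone decides who lands in which group, subject factors such as age or motivation should balance out across groups on average, and more reliably so as the sample gets larger.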
Discrete variables
Limited to particular values (often whole numbers)
Problems with single-group pretest-posttest designs
Measure behavior of a single group before and after some treatment/intervention; no control group thus no real interpretation of the results
Disadvantages to using within-subject designs
Need to control for order, sequence, and item effects - Instrumentation and issues with measurement (way in which DV is measured may change across repeated observations) - Study takes longer and subjects may get tired of task/develop strategies (which can threaten external validity)
Experimenter bias (and ways to control)
Occurs when an experimenter's expectations about a study affect its outcome. Ways to control include random assignment and keeping the experimenter blind to each participant's condition; Tendency for researcher to (consciously or unconsciously) treat participants differently in the different conditions - Impossible to self-monitor
ordinal
Order (e.g., class standing, birth order); The types of food that are most difficult to avoid eating, in rank order
Descriptive statistics
Provide a summary of the main features of a set of data collected from a sample of participants.
Advantages to using within-subject designs
Reduce random error by controlling for individual differences (every participant serves as own control) - Requires fewer participants (more power)
Type I errors
Rejecting the null hypothesis when it is true; finding a statistically significant effect when no true effect exists.
interval
Roughly equal intervals (e.g., IQ scores, stress rating); The extent to which someone believes they would be able to avoid eating the marshmallow (1 = not at all, 5 = maybe, 9 = very likely)
Statistical Power
The chances of finding a significant difference when the null hypothesis is false; depends on alpha, effect size, and sample size; the probability of rejecting the null if the null is false (1 - β)
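To see how power depends on alpha, effect size, and sample size, here is a rough sketch. It assumes a two-sided z-test comparing two equal-size groups with a known standard deviation, which is a simplification of the t-tests typically used in practice. The function name `approximate_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approximate_power(d, n_per_group, alpha=0.05):
    """Power of a two-sided, two-group z-test for a true effect size d (assumed model)."""
    z = NormalDist()
    z_critical = z.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the true effect is d.
    expected_z = d * sqrt(n_per_group / 2)
    # Probability that the statistic lands in either rejection region.
    return z.cdf(expected_z - z_critical) + z.cdf(-expected_z - z_critical)

for n in (20, 64, 200):
    print(n, round(approximate_power(0.5, n), 2))
```

Under these assumptions, power rises with larger samples, larger effects, and a more lenient alpha. When d is 0 the function returns alpha itself, which is the Type I error rate.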
Definition of internal validity
The extent to which a study is free from methodological flaws, especially confounding factors; Researcher's ability to conclude that the IV caused changes in the DV (IV → DV?)
External validity
The extent to which the findings of a study generalize to other populations, other settings, and other times; Extent to which the results of a study can be generalized to other situations (populations, environments, tasks, etc.)
Independent variables
The factor of interest to the researcher; it can be directly manipulated by the experimenter (e.g., creating different levels of anxiety in subjects), or participants can be selected by virtue of their possessing certain attributes (e.g., selecting two groups that differ in normal anxiety); The factor being manipulated
item effects
The items in one condition may have a different effect than the items in another condition
Multi-level independent variables
The most common approach is the factorial design, in which each level of one independent variable is combined with each level of the others to create all possible conditions.
Operationally defining an independent variable
The precise meaning of a concept within a study (e.g., how a construct of interest is measured) • Provides a way of measuring something that can't be observed directly (e.g., learning, creativity, anxiety, honesty) • Different operational definitions allow researchers to make different conclusions; explaining how something is being manipulated so it is possible to replicate.
alpha level
The probability of making a Type I error; the significance level; Probability of rejecting the null if the null is true (typically 5%) - In a study we observe a "p-value" for an observed effect and then compare it to alpha - If p-value is less than alpha we "reject the null hypothesis"
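The compare-p-to-alpha step can be illustrated with an exact permutation test. This test is swapped in here because it needs no statistical tables; it is not necessarily the test used in the course. The scores and the name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group1, group2):
    """Share of all possible regroupings with a mean difference at least as large as observed."""
    pooled = list(group1) + list(group2)
    n1, n2 = len(group1), len(group2)
    observed = abs(mean(group1) - mean(group2))
    total = sum(pooled)
    extreme = count = 0
    for indices in combinations(range(len(pooled)), n1):
        sum1 = sum(pooled[i] for i in indices)
        difference = abs(sum1 / n1 - (total - sum1) / n2)
        if difference >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count

alpha = 0.05
treatment = [8, 9, 10, 11, 12]   # hypothetical scores
control = [3, 4, 5, 6, 7]        # hypothetical scores
p_value = permutation_p_value(treatment, control)
reject_null = p_value < alpha
print(round(p_value, 4), reject_null)
```

Only 2 of the 252 possible regroupings separate the scores as completely as the observed groups, so the p-value is about 0.008. That is below alpha = .05, so the null is rejected.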
Examples of how our intuitions can be biased/inaccurate
We never really/fully know why we do the things we do (or think the way we think) • Cognition is designed to rely on heuristics to make decisions, guide behavior, and inform beliefs (e.g., availability, anchoring) • We are influenced without conscious awareness
hindsight bias
What seems common sense now wasn't common sense before
A researcher is interested in the effect of using Snapchat on feelings of self-worth. To explore the relationship between the two variables, the researcher measures the amount of time participants spend using Snapchat per day (in minutes), and the degree to which they report having strong feelings of self-worth on a scale of 0 to 10 (0 indicating low feelings of self-worth; 10 indicating high feelings of self-worth). 12) After completing the study, participants are given a new question. In this question, they are told to imagine that they are applying for a job and that they will need to decide how much salary to ask for in their negotiations. Individuals who scored high on the self-worth questionnaire ask for more money on average than individuals who scored low on the self-worth questionnaire. What does this observation tell you about the measurement of self-worth? a. It has high convergent validity b. It has high discriminant validity c. It has high systematic error d. It has high operationalizationism
a
Operational definition
a definition of a concept or variable in terms of precisely described operations, measures, or procedures.
Prediction
a goal of psychological science in which statements about the future occurrence of a behavioral event are made, usually with some probability
Random samples
a sample that fairly represents a population because each member has an equal chance of inclusion; Random sample from the population. Ensures a representative sample by giving each member of the population the same chance of being included in the sample. (need to be careful about nonresponse bias)
Theory
a set of statements that summarizes and organizes existing information about a phenomenon, provides an explanation for it, and serves as a basis for making predictions to be tested empirically
Scientific theories (useful, parsimonious, falsifiable)
a statement or set of statements that can be used to explain, predict, and understand a given phenomenon or observation - Summarizes and makes tentative conclusions based on existing empirical knowledge - Proposes explanations - Makes predictions
Normal distribution and its role in null hypothesis testing
allows us to see how far, and in which direction, an observed result falls from what the null hypothesis predicts
Hypothesis
an educated guess about a relationship between variables that is then tested empirically
Statistical significance
determined by comparing an observed result (e.g., the difference between two means in an experiment) to the distribution of results expected if the null is true
3) Students in section A and B are given a sleep questionnaire in which they report the amount of sleep they got the night before the quiz. Amazingly, the mean number of hours in the two sections was identical (6.8 hours)! The standard deviations, however, were quite different. In section A, the standard deviation was very small. In section B, the standard deviation was very large. Knowing only what you know in this question, which section is most likely to have someone who failed to get any sleep the night before the quiz? a. Section A b. Section B c. They are equally likely
b
4) By definition, a measurement that does not measure what it is supposed to measure has low: a. Reliability b. Construct validity c. Both of the above d. None of the above
b
5) If a theory cannot be proven wrong, then it is: a. A scientific theory b. Not a scientific theory c. An interesting/important idea d. Not an interesting/important idea
b
7) Control procedures such as counterbalancing are designed to maximize internal validity by eliminating ___________. a. Extraneous variables b. Confounding variables c. Both of the above
b
A researcher is interested in the effect of using Snapchat on feelings of self-worth. To explore the relationship between the two variables, the researcher measures the amount of time participants spend using Snapchat per day (in minutes), and the degree to which they report having strong feelings of self-worth on a scale of 0 to 10 (0 indicating low feelings of self-worth; 10 indicating high feelings of self-worth). 11) Participants take two versions of the self-worth questionnaire. Participants who score high on the first questionnaire also tend to score high on the second questionnaire, whereas participants who score low on the first questionnaire also tend to score low on the second questionnaire. Based only on this information, what do you know to be true? a. The questionnaire has high construct validity b. The questionnaire has high reliability c. The questionnaire is worthless
b
Logic of the experimental method
Begin with two identical or equivalent groups of individuals and do something to the members of one group; Manipulate factor of interest while keeping other factors under control (possible to assess cause-effect relationship)
1) Based on what was discussed in lecture, who can be susceptible to the effects of confirmation bias? a. Everyday people when making inferences based on their own experiences b. Well-meaning scientists when deciding what research questions to ask and how to try to answer them c. Both of the above
c
6) Within-subject designs are typically: a. More powerful than between-subject designs b. More susceptible to order and sequence effects than between-subject designs, and thus tend to require more counterbalancing and control c. Both of the above
c
Working independently, two researchers at two different universities examine the benefits of a new mindfulness intervention program on prospective health outcomes. Both researchers find statistically significant benefits and thus reject the null hypothesis that there are no benefits of the mindfulness program. Researcher A's sample consisted of 50 psychology majors currently attending UCSC. Researcher B's sample consisted of 200 individuals sampled more broadly from the entire U.S. population. 8) Who could have possibly committed a Type I error? a. Researcher A b. Researcher B c. Both researchers could have possibly committed a Type I error d. Neither researcher could have possibly committed a Type I error
c
Basic idea of null hypothesis testing
involves deciding whether the data provide enough evidence to "reject" the null hypothesis
Ways of describing/illustrating data
charts, graphs, diagrams
illusory correlations
correlation appears to exist, but either does not exist or is much weaker than assumed
Working independently, two researchers at two different universities examine the benefits of a new mindfulness intervention program on prospective health outcomes. Both researchers find statistically significant benefits and thus reject the null hypothesis that there are no benefits of the mindfulness program. Researcher A's sample consisted of 50 psychology majors currently attending UCSC. Researcher B's sample consisted of 200 individuals sampled more broadly from the entire U.S. population. 9) Who could have possibly committed a Type II error? a. Researcher A b. Researcher B c. Both researchers could have possibly committed a Type II error d. Neither researcher could have possibly committed a Type II error
d
Sequence (carryover) effects
form of sequence effect in which systematic changes in performance occur as a result of completing one sequence of conditions rather than a different sequence; Experiencing one condition may affect the experiencing of another condition
Criterion validity
form of validity in which a psychological measure is able to predict some future behavior or is meaningfully related to some other measure; Does the measure predict specific outcomes or behaviors associated with the construct?
Construct validity
in measurement, it occurs when the measure being used accurately assesses some hypothetical construct; also refers to whether the construct itself is valid; in research, refers to whether the operational definitions used for independent and dependent variables are valid; The extent to which a measure taps into the construct of interest (does it measure what it is supposed to measure and not something else?)
Systematic error
may not be problematic if you are interested in relative comparisons (between conditions, pre-post, etc.), as opposed to absolute observations; systematic error that varies between the conditions being compared is very problematic
Four scales of measurement (nominal, ordinal, interval, and ratio)
Nominal: measurement scale in which the numbers have no quantitative value, but rather identify categories into which events can be placed; Ordinal: measurement scale in which assigned numbers stand for relative standing or ranking; Interval: measurement scale in which numbers refer to quantities and intervals are assumed to be of equal size; a score of zero is just one of many points on the scale and does not denote the absence of the phenomenon being measured; Ratio: measurement scale in which numbers refer to quantities and intervals are assumed to be of equal size; a score of zero denotes the absence of the phenomenon being measured
Random error
not problematic if multiple observations lead to a reliable measurement
convergent validity
occurs when scores on a test designed to measure some construct (e.g., self-esteem) are correlated with scores on other tests theoretically related to the construct; Does the measure relate to scores on other measures that are theoretically related to the construct?
discriminant validity
occurs when scores on a test designed to measure some construct (e.g., self-esteem) are uncorrelated with scores on other tests theoretically unrelated to the construct; Does the measure not relate to scores on other measures that are theoretically unrelated to the construct?
Ceiling/Floor effects
occurs when scores on two or more conditions are at or near the maximum/minimum possible for the scale being used, giving the impression that no differences exist between the conditions; When scores are too high or too low, you can fail to observe an effect even if there is one
What it means to "reject the null hypothesis"
rejecting the null hypothesis means concluding that the observed result would be unlikely if the null were true (p-value less than alpha); it does not prove the null false, because a Type I error is always possible
Multiple methods and converging evidence
research involves more than one method being used in study; refers to the concept that independent scientific findings acquired with different methodological techniques can come together, or converge, to create strong and well-supported conclusions
Basic research
research with the goal of describing, predicting, and explaining fundamental principles of behavior; Describing, predicting, and explaining fundamental principles of behavior and mental processes.
applied research
research with the goal of trying to solve an immediate real-life problem; Direct and immediate relevance to the solution of real-world problems
Ways to achieve internal validity
researchers need to operationally define IV in a way that isolates construct of interest - Simple/subtle manipulations tend to be easiest to interpret - Big manipulations are easily confounded by other factors unrelated to the construct of interest - Researchers often need to operationalize IVs in multiple ways and provide converging evidence
Ecological validity
said to exist when research studies psychological phenomena in everyday situations (e.g. memory for where we put our keys); Extent to which the conclusions of a study can be generalized to natural, real-life settings and situations.
Confirmation bias
social cognition bias in which events that confirm a strongly held belief are more readily perceived and remembered; disconfirming events are ignored or forgotten; We search for, interpret, favor, and recall information in a way that confirms preexisting beliefs and hypotheses • Ignore/forget contradicting evidence (memory is designed to function this way) • Belief Perseverance • Anecdotal "evidence"
Tentative nature of the scientific method
subject to change in the light of new evidence or new interpretation of existing evidence
Reliability of a measurement
the extent to which measures of the same phenomenon are consistent and repeatable; measures high in reliability contain a minimum of measurement error; The extent to which a measurement gives the same result on different occasions
Central tendency (mode, median, mean)
Mode: the most frequently occurring score in a data set. Median: the middle score of a rank-ordered data set; an equal number of scores is both above and below the median. Mean: the arithmetic average of a data set, found by adding the scores and dividing by the total number of scores in the set.
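All three measures are available in Python's standard `statistics` module. The scores below are made up for illustration.

```python
from statistics import mean, median, mode

# Hypothetical sample of seven scores.
scores = [2, 3, 3, 4, 5, 7, 18]

most_frequent = mode(scores)    # score that occurs most often
middle = median(scores)         # middle score once the data are rank-ordered
average = mean(scores)          # sum of scores divided by number of scores

print(most_frequent, middle, average)
```

For this sample the mode is 3, the median is 4, and the mean is 6, so the three measures need not agree.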
Psychological science
the scientific method is the backbone of psychology
Skewed distributions; outliers; interpreting the mean
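things that can change the interpretation of the results: skewed distributions and outliers pull the mean toward the extreme scores, so the mean may not reflect a typical score; the median is less affected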
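A small sketch of how a single outlier can change what the mean suggests. The values are hypothetical response times in seconds.

```python
from statistics import mean, median

# Hypothetical response times, without and with one extreme score.
typical = [30, 32, 35, 38, 40]
with_outlier = typical + [400]

mean_before, median_before = mean(typical), median(typical)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)
print(round(mean_after, 1), median_after)
```

Adding one extreme score moves the mean from 35 to about 95.8, while the median only moves from 35 to 36.5. This is why the median is often reported for skewed data.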
inferential statistics
used to draw conclusions about the broader population on the basis of a study using a sample of that population; Method of drawing conclusions about data to make generalizations