Psych Methods Midterm 1
Differentiate between a psychological construct, a conceptual definition, and an operational definition. Be prepared to identify these in provided examples and to generate new examples yourself. Discuss some of the steps that Friedman & Woods (2005) describe in developing operational definitions related to the construct of early intervention.
A psychological construct is a conceptual tool used to understand and explain human behavior; constructs include cognitions, emotions, attitudes, personality characteristics, and intelligence. A conceptual definition is abstract, similar to something we would find in a dictionary: it describes what the construct is but not how to measure it. One example is defining hunger as a desire for food. An operational definition defines a construct in terms of the specific operations or measurements used to assess it; the same hunger example would be operationalized as being deprived of food for X hours. In Friedman & Woods' (2005) study of early intervention, they developed operational definitions by sorting observed behaviors into categories and then collapsing those categories into broader subsections, such as "Conversation and Information Sharing."
Describe some limitations of behavioral measures.
- Can be ambiguous
- Can be difficult or impossible to collect
- Researchers could lose scientific objectivity
- Researchers run the risk of influencing the behavior of the participants
What are some advantages and disadvantages of disguised and nondisguised observation techniques? What do we mean by unobtrusive measures?
Disguised observation techniques: the researchers conceal the fact that they are watching/recording.
- Advantage: behavior is not influenced by the researcher's presence
- Disadvantage: raises ethical issues about privacy and consent
Nondisguised observation: individuals know they are being studied.
- Advantages: researchers can obtain informed consent; measures can be more direct and unambiguous
- Disadvantage: people often do not respond naturally when they know they're being recorded
Unobtrusive measures are measures that can be taken without participants knowing that they are being studied. ex) counting the number of empty liquor bottles in neighborhood garbage cans instead of asking residents to report their consumption
What implications do the Milgram and Zimbardo studies have for our role as experimenters regarding the power of the situation and especially social roles?
Background: The study used teacher and learner roles. One participant was a confederate (in on the study); the actual participant was always assigned the teacher role. While the teacher watched, the learner was strapped to electrodes, and the teacher was then instructed to deliver what they believed were shocks of increasing intensity whenever the learner gave a wrong answer. As the shocks escalated, the learner would scream and pound on the wall, and eventually fall silent. The experimenter instructed the teacher to continue even when the teacher wanted to stop. Most psychologists predicted that teachers would rarely continue under protest, and that fewer than 2% would ever use the strongest shock; in reality, 65% of teachers went all the way to the maximum level. Implications: The Milgram and Zimbardo studies show that experimenters have authority over participants. As experimenters, we present the information about a study to prospective participants, and in this way we have the power to deceive people, as Milgram did by recruiting under the guise of a study on memory and learning. The studies demonstrated that everyday people will comply with an authority figure even when it means going against their own moral values.
Describe 4 categories of response variable measures (behavioral observation, self-report, physiological, and archival) and give examples of each.
Behavioral observation: direct observation of behavioral tendencies in either a contrived or naturalistic setting. ex) eating, blinking, smiling
Self-report: a measure on which participants provide information about themselves, usually in one of three formats: free response, rating scale, or fixed alternative. ex) filling out a rating-scale questionnaire about a movie participants watched
Physiological: a measure of bodily activity, generally used to assess processes within the nervous system. ex) neural electrical activity, neuroimaging, biochemical processes like salivation
Archival: data are analyzed from existing records. ex) census reports, court records, or personal letters
What are the three principles identified in the Belmont report?
Beneficence, Justice, and Respect for Persons
Describe practices to increase reliability of our observational measures.
- Clear and precise operational definitions must be provided for the behaviors that will be observed and recorded.
- Raters should practice using the coding system, comparing and discussing their practice ratings with one another before observing the behavior to be analyzed.
Describe three methods of nonprobability sampling.
Convenience sampling
- Most common in psychology: recruiting participants based on their availability (researchers use undergrads or people from the local community)
- You can still test hypotheses about relationships among variables even if you cannot define the population from which the individuals came
- If you do want to generalize, you can repeat the experiment with other convenient samples drawn from a particular population
Quota sampling
- A type of convenience sampling in which the researcher determines beforehand what proportions they want from particular groups (ex: 60 women and 60 men in the sample)
- Still based on availability of participants; the difference is that an effort is made to recruit certain numbers of individuals from each group
Purposive (judgmental) sampling
- Researchers use past research findings to assemble a sample they judge to be representative
- Ex: national elections, where researchers identify populations that have tended to vote the same way as the country as a whole in previous elections
- Generally unreliable; not great at creating good representative samples
Snowball sampling
- Existing participants recruit from among their acquaintances
- Bias: the people recruited will most likely be people who have many friends
- Used with populations for which you cannot obtain a sampling frame (ex: sex workers)
What are deduction and induction, and how do they relate to theories and hypotheses? Why is it impossible to prove a theory?
Deduction and induction are ways to generate a hypothesis. Deduction involves reasoning from a general proposition (a theory) to specific implications of that proposition (hypotheses): theory → hypothesis → confirmation. Induction involves generating a hypothesis from observed facts, including previous research findings: observations → hypothesis → theory. It is impossible to prove a theory because a theory is a set of concepts that organizes observations and inferences to predict phenomena; it is not a single hypothesis but a framework substantiated by a collection of findings. Proving a theory would require showing that its predictions are always true, but we can only ever test a finite set of predictions. In this way, a theory can be supported, but never proven.
Describe some of the methodological challenges Dennett (1998) described encountered in moving from thought experiments to real experiments. What advantages does an empirical rather than armchair approach afford?
Dennett encountered problems mostly because philosophical thought experiments are set in an idealized world where everything goes right; in the field, this is often not the case. Among the problems he encountered: not all experiments were replicable, and details such as the proximity and orientation of the monkeys mattered (i.e., it is hard to set the experiment up just right for data collection). In addition, in real experiments the data can suggest multiple conclusions, and coding behaviors is not a simple task. Nevertheless, although data collection and implementation are difficult, the empirical rather than armchair approach yields actual evidence for a theory rather than merely generalized reasoning about it.
Dennett (1998)
Dennett's study started as a thought experiment that later became a real-world experiment on vervet monkeys and how to decipher their vocalizations. It connects to the class and readings by demonstrating how to create an experiment and the difficulties of measurement: it works through the challenges of using data collection (description) to reach explanations and make predictions.
We discussed 4 (description, prediction, explanation, control) goals of scientific research. Explain what is meant by each goal, how they are interrelated, and when and why researchers would pursue one over the other. What types of research designs/strategies (and variables) are often associated with these goals? Be able to give an example of each.
Description: Describing behavior gives context. Description in psychology is used to characterize a problem, issue, or behavior, for example to distinguish abnormal from normal behavior. Designs associated with description include case studies, surveys, self-reports, and naturalistic observation. Description is the basis for the other goals: explanations cannot be offered, and predictions cannot be made, without descriptions of behavior. Researchers may rely solely on description when there is not yet a basis for the other three goals.
Prediction: Prediction is the act of using past observed behavior to forecast a plausible future outcome, or whether other people might exhibit similar patterns of behavior. It relies on statistical analysis of previously collected data (correlational designs, with predictor and criterion variables), and in turn informs attempts to control or change behavior.
Explanation: Explanation in psychology means interpreting behavior rather than simply describing it. It builds on descriptions (e.g., "the man is frowning") and on predictive relationships to give a reason why the behavior occurred, and past results are often used to fortify the explanation. If an explanation is established, it can inform how to influence participants' behavior.
Control: Psychologists also aim to control, influence, and change participants' behavior. By understanding the explanation behind a behavior, a study can be designed to influence that behavior. For example, if we know that the explanation for frowning is sadness, a study might aim to induce sadness in participants so that the resulting behavior can be observed.
All four goals work together and build on one another.
What are the phases of scientific research as we covered them in lecture? Relate these by example to your lab project on naturalistic observation - what aspects of that project do you see relating to the different phases of research?
1. Idea generation
2. Problem definition
3. Procedures design
4. Observation
5. Data analysis
6. Interpretation
7. Communication
Discuss some of the considerations that one should make when selecting a sample size. What is an economic sample? What is power, and what are some of the factors that will affect it?
If the goal of the research is to make generalizations about a population, the sample needs to be representative of that population. Smaller samples are easier to manage, but larger samples generally represent the population better. The key idea behind an economic sample is that we are not trying to get the biggest possible sample: although a huge sample increases generalizability to the population as a whole, it takes a great deal of time and effort. Instead, we strive for a sample that is just large enough to detect an effect, given how likely that effect is to exist. Power is the probability that a study will detect an effect of a given size if that effect truly exists; power analysis therefore helps us estimate a good economic sample size. Factors that affect power include the sample size and the likely size of the effect (estimated from past research).
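The link between sample size, effect size, and power can be made concrete by simulation. The sketch below is illustrative, not from the course materials: it estimates power for a hypothetical two-group experiment by running many simulated studies and counting how often a significant difference is detected (the function name and defaults are assumptions).

```python
import math
import random
import statistics

def simulated_power(n, effect_size, trials=2000, seed=42):
    """Monte Carlo estimate of power: the fraction of simulated two-group
    experiments (n participants per group) that yield a significant result.
    Scores are drawn from normal distributions with SD = 1, so effect_size
    is the true difference between group means (Cohen's d)."""
    rng = random.Random(seed)
    critical_z = 1.96  # two-tailed test at alpha = .05
    hits = 0
    for _ in range(trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treatment = [rng.gauss(effect_size, 1.0) for _ in range(n)]
        # z-test on the difference of means (population SD known to be 1)
        se = math.sqrt(1.0 / n + 1.0 / n)
        z = (statistics.mean(treatment) - statistics.mean(control)) / se
        if abs(z) > critical_z:
            hits += 1
    return hits / trials
```

Running this with a fixed effect size shows power climbing as n per group grows, which is exactly why an economic sample is chosen by power analysis rather than by "as big as possible."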
What is the difference between a probability sample and a nonprobability sample? When is it appropriate to use one or the other?
In a probability sample, the only factor determining which members of the population end up in the sample is probability; selection is random. A common way of getting a probability sample is to choose participants systematically. For example, if your population is people who go to the ER on Friday nights, you might choose every fourth person who walks into the ER on a Friday night. Probability samples are useful for surveys and other research in which the sample itself must accurately represent the population. Because they require a way to enumerate the population, however, probability samples are rarely used in psychological studies; instead, nonprobability samples are used, which select participants on the basis of factors other than probability, most often their availability.
Explain why researchers rarely use a true random sample. What is a representative sample? What is sampling error? What is the error of estimation, for which types of sampling methods does it apply, and by what factors is it affected? What is misgeneralization, and what are some of the factors that might make it more likely to occur? What does it mean to be a WEIRD participant (Rad, et al., 2018), and what are some of the problems with convenience samples of such individuals?
In order to use a true random sample, a sampling frame listing every individual in the population (or every individual within sampled clusters representing that population) must be obtained; researchers rarely use a true random sample because they cannot obtain such a frame for large populations. A representative sample is one from which we can draw accurate, unbiased estimates of the characteristics of the larger population. Sampling error is the difference between the values obtained in the sample and the values in the population at large; if a sample is truly representative, sampling error should be very low. The error of estimation applies only to probability sampling, because it predicts sampling error on the basis of probability alone. It indicates the degree to which data obtained from the sample are expected to deviate from the population as a whole, and it expresses our confidence in a probability sample; it is affected by sample size, population size, and the variance of the data. For example, if you say the results of a study are accurate within 3%, you are (typically at 95% confidence) claiming that the population value lies within +/- 3% of the sample value. Misgeneralization is applying the results from a nonrepresentative sample to the population at large. It is more likely to occur when the sample is small in proportion to the population and when the variance of the data is high, because the sample mean will not accurately reflect all the values in the population. A WEIRD participant comes from a population that is Western, educated, industrialized, rich, and democratic. A problem with convenience samples of such individuals is that they are frequently used in studies that generalize to populations including non-WEIRD individuals; it is difficult to support universal claims without cultural variability in the sample.
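The "+/- 3%" figure above comes from the standard formula for the margin of error of a sample proportion. A minimal sketch (the function name is an assumption; it assumes a simple random sample and 95% confidence):

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """95% margin of error for a proportion from a simple random sample.
    p_hat is the observed sample proportion, n is the sample size."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)
```

For example, `margin_of_error(0.5, 1067)` is about 0.03, which is why national polls of roughly 1,000 people commonly report results "accurate within 3 percentage points." Note that the error shrinks with the square root of n, so quadrupling the sample only halves the margin.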
What criteria are seen as important to doing good science? What are the properties of a good scientific theory? Of a good hypothesis?
It is important to have a background in prior research, to use systematic empiricism (relying on structured observation to draw conclusions), to make findings publicly verifiable (so that others can replicate the study, further test the hypothesis, and detect fabricated data), and to study solvable problems (questions that can be answered given current knowledge). A good scientific theory is backed by empirical evidence, proposes causal relationships, is coherent, is parsimonious, generates testable hypotheses, stimulates others to conduct further research, and solves an existing theoretical question. A good hypothesis is testable and falsifiable and follows logically, via deduction or induction, from theory or observation.
List some categories of physiological measures and give an example of each.
Measures of neural electrical activity ex) EEG or EMG
Neuroimaging ex) CAT scan or fMRI
Measures of autonomic nervous system activity ex) heart rate, respiration, blood pressure
Blood and saliva assays ex) measuring testosterone in saliva or properties of blood correlated with health
Precise measurement of overt reactions ex) special sensors attached to the face to measure blushing in studies of embarrassment
When might one use naturalistic observation, and when might one instead observe in a contrived setting?
Naturalistic observation involves observing ongoing behavior as it occurs naturally, with no intervention or intrusion by the researcher. ex) parent-child interactions on a playground, animals in their natural habitat
Contrived observation involves observing behavior in settings that are arranged specifically for observing and recording behavior. ex) observing relationships in a laboratory behind a one-way mirror, or setting up situations outside the lab to observe people's reactions
Explain when (and why) behavioral observation measures are preferred. Provide an example of situations that call for behavioral observation.
- Often unobtrusive
- Must be used if participants cannot give self-reports ex) infants
- Does not require accurate introspection
What are some approaches to generating research ideas? From where might we draw inspiration?
One's own experiences and observations, folk psychology, problems already studied that you want to modify, testing an existing explanation, etc.
Explain what reliability is and why it is so important. How does it relate to measurement error, true score, and observed score? What are some factors affecting measurement error? Provide examples to illustrate. Describe with examples three types of reliability and discuss how you might evaluate your operational definition on each (conceptually, not statistically).
Reliability is important because it pertains to how close our measures of a particular variable come to the true score, a perfect, error-free measurement of that variable. For example, if we are doing research on people's intelligence, there is no way we can obtain their true intelligence; all we can do is create measures of intelligence, which yield observed scores. The closer observed scores are to participants' true scores, indicating low measurement error, the more reliable the measure is. Many factors affect measurement error: transient states (ex: a participant's mood), stable attributes (ex: less intelligent participants give incorrect answers because they do not understand the questions), situational factors (ex: if a researcher is very friendly, the participant may try harder than they normally would), characteristics of the measurement itself (ex: the researcher makes questions too ambiguous), and actual mistakes (ex: accidentally recording a measurement that did not occur).
Three types of reliability:
Test-retest reliability: the consistency of participants' responses on a measure over time (assuming the true score is constant over time). Example: scores on an intelligence test you create should not vary much when the test is taken a month apart. To evaluate: administer the measure at two points in time and compare.
Interitem reliability: the consistency among multiple items in a single measure. Example: if you are measuring depression with an inventory of multiple questions, all the questions should pertain to the same construct. To evaluate: look at the correlation between an individual item and the overall measure, and remove the item if the correlation is low. A related approach is split-half reliability: split the items in two and correlate the two halves.
Interrater reliability: the consistency among two or more researchers who observe and record participants' behavior. Example: you and another researcher record the number of times a participant smiles and get different counts. To evaluate: compare raters' records and use only data on which the raters agree or are close.
What is meant broadly by the word "measurement?" What is meant by the different properties of numbers (identity, magnitude, equal intervals, true zero), and which scales of measurement (nominal, ordinal, interval, ratio) possess which of these properties? Why is it important to consider the scale of measurement of the response measures that we use? Contrast reliability and validity and how they affect our ability to evaluate our hypothesis.
Measurement broadly means assigning numbers to observations according to a system of rules. The properties of numbers build on one another: identity (each number has a distinct meaning), magnitude (numbers can be ordered), equal intervals (equal differences between numbers reflect equal differences in the attribute), and true zero (zero means a complete absence of the attribute). Nominal scales have only identity; ordinal scales add magnitude; interval scales add equal intervals; ratio scales add a true zero. The scale of measurement matters because it determines which mathematical operations and statistics are meaningful for a response measure. Reliability pertains to the extent to which the measurement of a construct is free of error and is consistent, whereas validity pertains to whether that measurement is actually measuring its intended construct. Without reliability, we will misinterpret our hypothesis because our measurements will not have reflected the actual constructs; without validity, we will also misinterpret our hypothesis because we were not actually measuring the intended construct. Therefore, both reliability and validity are needed to evaluate a hypothesis correctly.
When are self-report measures appropriate? What advantages do they have? What are some examples of self report measures?
Self-report measures are especially useful when researchers need information on things like past experiences, feelings, self-views, and attitudes. This information can be especially difficult, impractical, or even unethical to acquire through direct observation. It can be much easier and more direct to get information firsthand from participants. Two examples of self-report measures are questionnaires and interviews.
What are four basic methods of probability sampling? Be prepared to identify examples of each.
Simple random sampling
- Assign a number to everyone in the population, then choose based on some random process, frequently a random number generator
Systematic sampling
- Pick people according to a predetermined pattern (e.g., every fourth person, as in the ER example above)
Stratified random sampling
- The population is separated into strata (based on some variable), and then participants are randomly sampled from each stratum
- For example, you could divide a population by where they live and then randomly choose an equal number of people from each area to get a good representation of each stratum
Cluster sampling
- Unlike simple/stratified random sampling, cluster sampling does not require a sampling frame (a list of everyone in the population), which saves a LOT of time when the population is large
- First sample groupings (clusters) rather than individual participants; then obtain a sampling frame for each chosen cluster and randomly sample participants from it
- Example: the population is college students in the United States (very large). Get a list of all colleges in the US; these serve as clusters. Randomly sample from the clusters (say, five universities), then get a sampling frame from each chosen university and randomly select individual participants from it.
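The four methods above can be sketched in a few lines each. This is an illustrative toy (the population of 1,000 IDs and the two geographic strata are made up), but the selection logic matches each definition:

```python
import random

random.seed(0)  # fixed seed so the draws are reproducible
population = list(range(1000))  # a hypothetical sampling frame of 1,000 IDs

# Simple random sampling: every member has an equal chance of selection
simple = random.sample(population, 50)

# Systematic sampling: every k-th member after a random starting point
k = 20
start = random.randrange(k)
systematic = population[start::k]

# Stratified random sampling: random draws within each predefined stratum
strata = {"north": population[:500], "south": population[500:]}
stratified = [person for group in strata.values()
              for person in random.sample(group, 25)]

# Cluster sampling: randomly pick whole clusters, then sample within them
clusters = [population[i:i + 100] for i in range(0, 1000, 100)]
chosen_clusters = random.sample(clusters, 3)
cluster_sample = [person for c in chosen_clusters
                  for person in random.sample(c, 10)]
```

Note how cluster sampling never touches the full frame at once: only the chosen clusters need member lists, which is the time-saving property described above.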
What kinds of biases can be observed to affect self-report measures?
Social desirability response bias: participants are concerned about how they will be perceived and evaluated, so they respond in a socially desirable manner rather than naturally and honestly.
Acquiescence and nay-saying response styles: the tendency to agree with statements regardless of content (acquiescence) or to disagree regardless of content (nay-saying).
What is a stimulus variable? What is a response variable? How do these relate to different types of variables as defined by their use in research (predictor variable, criterion variable, independent variable, dependent variable)? What do we mean by levels of a variable? Be prepared to identify different variable types (and levels) in provided examples, and to generate examples yourself.
Stimulus and response variables are general categories of variables. A response variable is the outcome, the mental state or behavior of interest that a researcher is studying, while stimulus variables are factors that have an effect on that mental state or behavior. Predictor and independent variables are types of stimulus variables used in specific research designs: the predictor is the stimulus variable in a correlational design, and the independent variable is the stimulus variable in an experimental design. Criterion and dependent variables are the corresponding response variables: the criterion variable in a correlational design, the dependent variable in an experimental design. The levels of a variable, usually a stimulus variable, are the conditions it takes on. In the Vohs paper, the stimulus variable is the environment and its two levels are orderly and disorderly. Researchers often aim, especially in higher-constraint research, for a directional hypothesis, in which we predict the direction of the effect of one variable on another. For example, in the Vohs paper we predict that orderliness will affect behavior in a particular direction, with an orderly environment leading to conventionality; that is, the stimulus variable will increase or decrease the response variable.
Describe 5 ways of knowing and how or whether they are used in science.
Tenacity: "it has always been that way." Includes superstitions (ex: the number 13). Not connected to empirical evidence or observation.
Intuition: "it feels true." Sometimes aligns with experience, but not always; not necessarily connected with empirical evidence.
Authority: "the boss says it's true." The authority could be wrong (ex: the discredited research claiming vaccines cause autism).
Rationalism: "it makes sense logically." Beliefs are derived from assumptions via rules of logic.
Empiricism: "it is observed to be true." Based on direct observation, though our observations are never fully objective.
Science: the combination of rationalism and empiricism. You cannot have one without the other: empiricism means nothing outside the context of rationality, and rationalism cannot be supported without empiricism.
What is an Institutional Review Board?
The Institutional Review Board (IRB) is an administrative body established to protect the rights and welfare of human research subjects recruited to participate in research activities. The IRB is charged with the responsibility of reviewing, prior to its initiation, all research involving human participants.
Understand the basic elements of ethical treatment of human research subjects and what to consider in a cost- benefit analysis of a study; describe the basic issues when considering informed consent, privacy and confidentiality, coercion, physical and mental stress and harm, use of deception, debriefing, and vulnerable populations. Be prepared to assess an example study along ethical dimensions.
The official guidelines for research provided by the federal government and professional organizations, including the APA, are utilitarian: they rest on cost-benefit analysis.
Benefits: basic knowledge, improvement of research or assessment techniques, practical outcomes (ex: antibiotic research), benefits for researchers (ex: career advancement), and benefits for participants (ex: clinical research).
Costs: at the most minor, the time and effort participants must give; more serious, risks to mental or physical welfare (ex: aversive states like anxiety, boredom, or pain); most serious, threats to health or life. Other costs include money, supplies, and harm to the research community at large.
Informed consent: the primary way to ensure the protection of participants' rights is to obtain informed consent, meaning participants were informed about the nature of the study and gave their explicit consent, usually via signed forms. It prevents researchers from violating privacy by studying people without their knowledge and gives participants enough information to make a reasoned decision about whether to participate. Problems with informed consent: it can undermine a study's validity, since participants act differently when under scrutiny and, if they know the specific hypothesis, may adjust their behavior in ways they normally would not; not all populations can give informed consent (for children or individuals with severe mental impairment, consent is obtained from a parent or legal guardian); and it may not be necessary for some studies, such as naturalistic observation (ex: a researcher observing seating patterns on public buses). An IRB can waive the requirement of informed consent if (1) the research involves no more than minimal risk to participants, (2) the waiver will not adversely affect the rights and welfare of participants, and (3) the research could not practicably be carried out without the waiver.
Privacy: the APA guidelines do not offer explicit rules. Privacy matters when participants do not know that they are being studied or are not told that certain kinds of private information are being collected.
Coercion: all ethical guidelines insist that participants must not be pressured into participating in research, as happens when people agree because of real or implied pressure from someone who has authority over them (ex: professors who ask their own students to participate). Researchers must respect people's freedom to decline or discontinue participation and cannot offer excessively high incentives.
Physical and mental stress: minimal risk is a risk of harm or discomfort no greater in probability or magnitude than the risks people experience in daily life. Higher-risk procedures (those that cause stress or pain) are allowed only if the benefit is high and the participant agrees after being informed of the possible risks. Additional safeguards include having the lead investigator monitor participants after they leave the study and make ongoing reports on any evidence of adverse effects.
Deception: the APA and federal government prohibit deception about aspects of a study that would affect participants' willingness to participate. Why use it? To prevent participants from learning the true purpose of a study. It can include using a confederate, providing false feedback to participants, or giving incorrect information about stimulus materials. Cons: some people believe deception is never ethical in research, and even when it is justified by its positive outcomes it can have undesired consequences (ex: participants enter a study already suspicious of what the researcher tells them because deception is known to be used in research). Most agree that deception can be acceptable in research as long as participants are debriefed afterward and were deceived for good reasons.
Confidentiality: information collected about research participants in the course of a study is confidential. How can you ensure it? With anonymous data: avoid collecting information that could uniquely identify participants. When the data cannot be anonymous or identities are needed for the research, coded identifiers are used.
Explain what validity is and why it is so important. Why is validity harder to assess than reliability? Describe the different kinds of validity and give examples of each. What are some problems with face validity?
Validity refers to the extent to which a measurement actually measures the intended construct, and it is so important because it determines whether your hypothesis was actually tested: if a measurement does not reflect its intended construct, you were not actually measuring that construct, invalidating any conclusion about your hypothesis. Validity is harder to assess than reliability because reliability can be determined by computing correlations over time or across items, whereas validity requires knowledge about the nature of the construct itself.
Types:
Face validity: involves subjective judgment; a measure has face validity if, in the judgment of experts or the public at large, it appears to measure what it is intended to measure. Face validity is never sufficient to establish actual validity, and a measure can be valid without appearing so, or appear valid without being so. Ex (valid construct lacking face validity): job applicants sued a company that required them to take a test with unusual questions (ex: "Evil spirits possess me sometimes"); although these items came from measures that experts considered valid, they lacked face validity for the applicants, which prompted the suit. Problems with face validity: it tells you nothing about actual validity, and its absence can prevent research from being carried out even when the measures are actually valid, because the public objects.
Construct validity: in behavioral science we usually measure hypothetical constructs (which are unobservable: love, intelligence, learning, etc.), so to assess the validity of a measure of a hypothetical construct we can compare its scores to measures of related constructs. Ex: if we are measuring self-esteem, our scores should correlate positively with scores on confidence and negatively with scores on anxiety.
Criterion-related validity: the extent to which a measure allows us to distinguish among participants on the basis of a behavioral criterion. Ex: Do SAT scores distinguish students who will do well in college from those who will not? If yes, the SAT has criterion-related validity.