Intro to Experimental Psych Exam 2
Shere Hite had 4,500 surveys returned to her. This is a large sample, which is desirable, so what was the problem with using all of the surveys returned?
The 4,500 returned surveys were a very small fraction of the 100,000 sent out (a return rate of about 4.5%), so the respondents were self-selected and unlikely to be representative of the population.
Multiple-group time-series design
A design in which a series of measures are taken on two or more groups both before and after a treatment
Single-group posttest-only design
A design in which a single group of subjects is given a treatment and then tested
Single-group time-series design
A design in which a single group of subjects is measured repeatedly before and after a treatment
Single-group pretest/posttest design
A design in which a single group of subjects takes a pretest, then receives some treatment, and finally takes a posttest
Nonequivalent control group pretest/posttest design
A design in which at least two nonequivalent groups are given a pretest, then a treatment, and finally a posttest
Nonequivalent control group posttest-only design
A design in which at least two nonequivalent groups are given a treatment and then a posttest measure
Solomon four-group design
A design with four groups that is a combination of the posttest-only control group design and the pretest/posttest control group design
Null hypothesis
A hypothesis that specifies what the results of an experiment will be if the main hypothesis being tested is wrong. Often states that there will be no difference between experimental groups.
Floor effect
A limitation of the measuring instrument that decreases its ability to differentiate between scores at the bottom of the scale
Ceiling effect
A limitation of the measuring instrument that decreases its ability to differentiate between scores at the top of the scale
Rating scale
A numerical scale on which survey respondents indicate the direction and strength of their response
Socially desirable response
A response that is given because a respondent believes it is deemed appropriate by society
Two-tailed hypothesis
A hypothesis test in which the critical region lies in both tails of the distribution, so the test asks whether a sample value is either greater than or less than the comparison value
Systematic replication
A study that varies from an original study in one systematic way
With which survey methods is interviewer bias of greatest concern?
Telephone surveys and personal interviews are the survey methods in which interviewer bias is of greatest concern.
Sampling bias
A tendency for one group to be overrepresented in a sample.
How is it possible for a test to be reliable but not valid?
A test is reliable but not valid when it produces consistent scores yet does not measure what it claims to measure. In other words, a test can consistently measure something other than what it is intended to measure.
Regression to the mean
A threat to internal validity in which extreme scores upon retesting tend to be less extreme, moving toward the mean
Maturation effect
A threat to internal validity in which naturally occurring changes within the participants could be responsible for the observed results.
Diffusion of treatment
A threat to internal validity in which observed changes in the behaviors or responses of the subjects may be due to information received from other subjects in the study.
Testing effect
A threat to internal validity in which repeated testing leads to better or worse scores
Experimenter effect
A threat to internal validity in which the experimenter, consciously or unconsciously, affects the results of the study
Mortality (attrition)
A threat to internal validity in which differential dropout rates in the experimental and control groups lead to inequality between the groups.
Mail survey
A written survey that is self-administered
Why does alternate-forms reliability provide a measure of both equivalency of items and stability over time?
Alternate-forms reliability is the degree of relationship between scores on two equivalent forms of a test. Because the two forms contain different items, a high correlation shows that the items are equivalent (the forms are truly parallel); because the two forms are administered at different times, it also shows that scores are stable over time.
Type I Error
An error that occurs when a researcher concludes that the independent variable had an effect on the dependent variable, when no such relation exists; a "false positive" (Source: CHH, 2 Ed).
Type II Error
An error that occurs when a researcher concludes that the independent variable had no effect on the dependent variable, when in truth it did; a "false negative" (Source: CHH, 2 Ed).
Between-subjects design
An experiment in which different subjects are assigned to each group
Posttest-only control group design
An experimental design in which the dependent variable is measured after the manipulation of the independent variable
Pretest/posttest control group design
An experimental design in which the dependent variable is measured both before and after manipulation of the independent variable
Single-blind experiment
An experimental procedure in which either the experimenter or the participants are blind to the manipulation being made
Double-blind experiment
An experimental procedure in which neither the experimenter nor the participant knows the condition to which each participant has been assigned; both parties are blind to the manipulation
Statistical significance
An observed difference between two descriptive statistics such as means that is unlikely to have occurred by chance
Confound
An uncontrolled extraneous variable or flaw in an experiment
On the most recent exam in your biology class, every student made an A. The professor claims that he must really be a good teacher for all of the students to have done so well. Given the confounds discussed in this module, what alternative explanation can you offer for this result?
Ceiling effect—the test would not be sensitive enough to detect differences in knowledge of biology.
Partially open-ended questions
Closed-ended questions with an open-ended "other" option
You have just developed a new comprehensive test for introductory psychology that covers all aspects of the course. What type(s) of validity would you recommend establishing for this measure?
Content and criterion/concurrent validity. This would measure multiple aspects of the people's knowledge on the test.
Explain the differences between a stratified random sample and quota sampling.
Differences: Quota sampling uses convenience sampling to obtain the participants. It does not sample from the population randomly like stratified random sampling does. Not all of the members of the population have an equal chance of being selected for the sample in quota sampling, whereas in stratified random sampling they do. In stratified random sampling, subgroups or strata are fairly represented, unlike in quota sampling.
External validity
Extent to which we can generalize findings to real-world settings
Why is face validity not considered a true measure of validity?
Face validity is not considered a true measure of validity because it does not refer to what the test is actually measuring, only what it appears to be measuring on the surface. It has less to do with actual validity than with the public's view.
A researcher hypothesizes that children in the South weigh less (because they spend more time outside) than the national average. Identify Ho and Ha. Is this a one- or two-tailed test?
Ho: µSouthern children ≥ µchildren in general. Ha: µSouthern children < µchildren in general. This is a one-tailed test because the hypothesis predicts a difference in a specific direction: children in the South weigh less than the national average.
The admissions counselors at Brainy University believe that the freshman class they have just recruited is the brightest yet. If they wanted to test this belief (that the freshmen are brighter than the other classes), what would the null and alternative hypotheses be? Is this a one- or two-tailed hypothesis test?
Ho: µFreshman = µAll other classes, or µFreshman ≤ µAll other classes. Ha: µFreshman > µAll other classes. This hypothesis test is one-tailed because it predicts the direction of the expected difference between the groups.
If on your next psychology examination you find that all of the questions are about American history rather than psychology, would you be more concerned about the reliability or validity of the test?
I would be more concerned about the validity of the test because if the exam is about history rather than psychology in a psych course, then the instrument (test) will not be measuring what it claims to measure (students' knowledge).
Imagine that a husband and wife who are very tall (well above the mean for their respective height distributions) have a son. Would you expect the child to be as tall as his father? Why or why not?
I would not expect the son to be as tall as his father. Both parents are extreme scorers on height, and regression to the mean says that extreme scores tend to be followed by less extreme ones. The son will probably still be taller than average, but closer to the mean than his parents are.
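The regression-to-the-mean reasoning in this answer can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population mean, standard deviation, and father-son correlation below are made-up illustrative values, and the model uses only the father's height for simplicity, whereas the question involves both parents.

```python
import random
from statistics import mean

POP_MEAN = 175.0  # hypothetical mean adult male height (cm)
POP_SD = 7.0      # hypothetical standard deviation (cm)
R = 0.5           # hypothetical father-son height correlation

def simulate_pairs(n, seed=0):
    """Generate (father, son) heights that correlate at roughly R."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        father = rng.gauss(POP_MEAN, POP_SD)
        # The son's height is partly predicted by the father and partly
        # independent noise, scaled so sons have the same mean and SD.
        noise = rng.gauss(0.0, POP_SD * (1 - R ** 2) ** 0.5)
        son = POP_MEAN + R * (father - POP_MEAN) + noise
        pairs.append((father, son))
    return pairs

pairs = simulate_pairs(20000)

# Keep only fathers who are more than one SD above the mean.
tall = [(f, s) for f, s in pairs if f > POP_MEAN + POP_SD]
tall_father_mean = mean(f for f, _ in tall)
tall_son_mean = mean(s for _, s in tall)

print(f"Population mean:        {POP_MEAN:.1f} cm")
print(f"Mean of tall fathers:   {tall_father_mean:.1f} cm")
print(f"Mean of their sons:     {tall_son_mean:.1f} cm")
```

Under these assumptions the sons of very tall fathers average above the population mean but below their fathers, which is the pattern the answer describes.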
A student at your school wants to survey students regarding their credit card use. She decides to conduct the survey at the student center during lunch hour by surveying every fifth person leaving the center. What type of survey would you recommend she use? What type of sampling technique is being used? Can you identify a better way of sampling the student body?
I would recommend that she use personal interviews to conduct the survey. Selecting every fifth person is systematic, but the sample is still a convenience sample because only students who happen to be at the student center during lunch hour can be selected. A better way of sampling the student body would be to randomly select a certain number of students from a list of everyone enrolled at the school and then survey them on their credit card use.
Which of the following variables would be a subject variable if used as a nonmanipulated independent variable in a quasi-experiment? Gender Religious Affiliation Ethnicity Amount of alcohol consumed Amount of time spent studying Visual acuity
If used as a nonmanipulated independent variable in a quasi-experiment, the subject variables would be gender, ethnicity, religious affiliation, and visual acuity because these are conditions that each member cannot be assigned to as they come to a study. They come with these characteristics to the study.
What are internal validity and external validity? Why are they so important to researchers?
Internal validity: The extent to which a causal conclusion is warranted based on a study. It is important to researchers because it lets them conclude that the independent variable, rather than a confound, caused the change in the dependent variable. External validity: The extent to which the results of an experiment can be generalized. It is important to researchers because it allows them to apply the findings of the study to the outside world.
Two people observe whether or not vehicles stop at a stop sign. They make 250 observations and disagree 38 times. What is the interrater reliability? Is this good, or should it be of concern to the researchers?
Interrater reliability = 212/250 × 100 = 84.8%. This should not be of concern to the researchers because the interrater reliability is fairly high, which means the two observers were consistent in their measures.
The librarians are interested in how the computers in the library are being used. They have three observers watch the terminals to see if students do research on the Internet, use e-mail, browse the Internet, play games, or do schoolwork (write papers, type homework, and so on). The three observers disagree 32 out of 75 times. What is the interrater reliability? How would you recommend that the librarians use the data?
Interrater reliability = 43/75 × 100 = 57.33%. This low agreement indicates a problem with the librarians' measurement technique. I would recommend that the librarians not use these data because they are not reliable; the observers may not be applying the activity categories in the same way.
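The two interrater reliability answers above use the same percent-agreement arithmetic: agreements divided by total observations, times 100. A minimal sketch of that calculation follows; the function name `interrater_reliability` is just an illustrative choice.

```python
def interrater_reliability(observations, disagreements):
    """Percent agreement: agreements / total observations * 100."""
    agreements = observations - disagreements
    return agreements / observations * 100

# Stop-sign study: 250 observations, 38 disagreements.
stop_sign = interrater_reliability(250, 38)
# Library study: 75 observations, 32 disagreements.
library = interrater_reliability(75, 32)

print(f"Stop-sign study: {stop_sign:.1f}%")  # 84.8%
print(f"Library study:   {library:.2f}%")    # 57.33%
```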
Explain the differences in terms of flexibility and control between naturalistic and laboratory observational research.
Naturalistic observational research: With this type of research, there is more flexibility with the observation of the subjects. Therefore, there is not a lot of control because the researcher is simply observing the reactions of animals or humans in their everyday setting. This allows for the subjects to act how they normally would on a daily basis, which is why the researcher does not necessarily have a lot of control over what happens during this type of observation. Laboratory observational research: With this type of research, control is increased, so flexibility is decreased. The subjects most likely will have much more reactivity because they know they are being watched.
Provide several operational definitions of anxiety. Include nonverbal measures
Nonverbal measures: Number of leg shakes per minute; number of knuckle cracks per minute.
A researcher believes that family size has increased in the last decade in comparison to the previous decade, that is, people are now having more children than they were before. What would the null and alternative hypotheses be in a study designed to assess this contention? Is this a one- or two-tailed hypothesis test?
Null hypothesis: µFamily size now ≤ µFamily size in previous decade. Alternative hypothesis: µFamily size now > µFamily size in previous decade. This is a one-tailed test because it predicts the direction of the expected difference: family size is larger in the last decade than in the previous one.
Identify the type of measure used in each of the following situations: e. As part of a research study, your professor takes pulse and blood pressure measurements on students before and after completing a class exam.
Physiological measure
Provide several operational definitions of anxiety. Include physiological measures.
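Physiological measures: The amount of muscle tension, perspiration, and stress that a person is experiencing; an increase in heart or respiratory rate. (A particular pattern of answers on a self-survey about anxiousness is also an operational definition of anxiety, but it is a self-report measure rather than a physiological one.)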
Describe the difference between probability and nonprobability sampling.
Probability sampling is a technique in which each member of the population has a known probability of being selected as part of the sample, whereas in nonprobability sampling the individual members of the population do not have a known likelihood of selection.
What is the difference between a true experimental design and a quasi-experimental design?
A quasi-experimental design may not have a manipulated independent variable, whereas a true experimental design does. Also, in a quasi-experimental design, subjects are not randomly assigned to groups; the groups come to the study already differing.
Closed-ended questions
Questions for which subjects choose from a limited number of alternatives.
Open-ended questions
Questions for which subjects formulate their own responses.
Demographic question
Questions that ask for basic information such as age, gender, ethnicity, or income.
Identify the type of measure used in each of the following situations: b. When you join a weight loss group, they ask that you keep a food journal noting everything that you eat each day.
Self-report
Identify the type of measure used in each of the following situations: d. While eating in the dining hall one day, you notice that food services have people tallying the number of patrons selecting each entrée.
Behavioral measure
Identify the type of measure used in each of the following situations: a. As you leave a restaurant, you are asked to answer a few questions regarding what you thought about the service.
Self-report measure
Explain the similarities between a stratified random sample and quota sampling.
Similarities: Both accurately represent the population on certain characteristics Both are methods of sampling a given population
How is stratified random sampling different from random sampling?
Stratified random sampling guarantees that specific subgroups (strata) of the population are represented in the sample, whereas simple random sampling only ensures that each member of the population is equally likely to be chosen.
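The contrast between the two methods can be sketched in code. This is an illustrative example only: the population of 900 undergraduates and 100 graduate students, the sample size of 50, and the function names are all assumptions made up for the sketch, and the strata are sampled in proportion to their size.

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, rng):
    """Every member has an equal chance; subgroup counts vary by luck."""
    return rng.sample(population, k)

def stratified_random_sample(population, stratum_of, k, rng):
    """Randomly sample within each stratum, in proportion to its size."""
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for members in strata.values():
        share = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, share))
    return sample

# Hypothetical population: (id, level) pairs.
population = [(i, "undergrad") for i in range(900)]
population += [(i, "grad") for i in range(900, 1000)]

rng = random.Random(42)
simple = simple_random_sample(population, 50, rng)
stratified = stratified_random_sample(population, lambda m: m[1], 50, rng)

def count_grads(sample):
    return sum(1 for _, level in sample if level == "grad")

print("Grad students in simple random sample:", count_grads(simple))
print("Grad students in stratified sample:   ", count_grads(stratified))
```

The stratified sample always contains exactly 5 graduate students (10% of 50), while the count in the simple random sample depends on chance.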
Identify the type of measure used in each of the following situations: c. When you visit your career services office, they give you a test that indicates professions to which you are best suited.
Test (self-report)
Which of the following correlation coefficients represents the highest (best) reliability score? a. +.10 b. -.95 c. +.83 d. .00
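The +.83 (c.) represents the highest reliability score because reliability coefficients should be strong and positive. The -.95 is larger in magnitude, but a negative correlation would mean that people who score high on one measurement score low on the other, which indicates poor reliability.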
While grading a large stack of essay exams, Professor Hyatt becomes tired and hence more lax in her grading standards. Which confound is relevant in this example? Why?
The confound relevant in this example is the instrumentation effect, because changes in the dependent variable (the exam grades) may be due to changes in the measuring device: the professor's grading standards became more lax as she grew tired.
How does the quasi-experimental method allow us to draw slightly stronger conclusions than the correlational method? Why is it that the conclusions drawn from the quasi-experimental studies cannot be stated in as strong a manner than those from a true experiment?
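The correlational method can only show that variables are related; it cannot show that the relationship is causal. The quasi-experimental method goes slightly further because it compares two or more groups and allows for observation of systematic differences between them. Its conclusions still cannot be stated as strongly as those from a true experiment because subjects are not randomly assigned to conditions, so the groups may differ in other ways that explain the results.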
Why is data reduction of greater concern when using narrative records as opposed to checklists?
With checklists, the data are already quantified and do not have to be reduced at all. Narrative records are far more subjective because the researchers write down everything that happens during the observation based on their own perceptions, so the records must be reduced before they can be analyzed.
Internal validity
The extent to which the results can be attributed to the manipulation of the independent variable rather than to some confounding variable
Alternative hypothesis
The hypothesis that the researcher wants to support and that predicts a significant difference exists between the groups being compared
Nonmanipulated independent variable
The independent variable in a quasi-experimental design in which subjects are not randomly assigned to conditions but rather come to the study as members of each condition
Which of the following is an operational definition of depression? a. That low feeling you get sometimes. b. What happens when a relationship ends. c. Your score on a 50-item depression inventory. d. The number of boxes of tissues that you cry your way through
The operational definition of depression is c, your score on a 50-item depression inventory because it states the variable in terms of the activities a researcher uses to measure or manipulate it.
How are pretest/posttest designs an improvement over posttest-only designs?
The pretest/posttest designs are an improvement over posttest-only designs because the measures are taken twice: before and after treatment. Then, the measures can be compared to each other. This cannot happen in a posttest-only design.
Imagine that the following questions represent some of those from the survey described in Exercise 1. Can you identify any problems with these questions? c. Most Americans believe that a credit card is a necessity. Do you agree?
The problem with this question is that it is a leading question. Opening with "Most Americans believe..." pushes the respondent toward agreeing.
Imagine that the following questions represent some of those from the survey described in Exercise 1. Can you identify any problems with these questions? b. How much did you charge on your credit cards last month? $0-$400; $500-$900; $1,000-$1,400; $1,500-$1,900; $2,000 or more
The problem with this question is that the response alternatives are not exhaustive. There are gaps between the categories, so someone who charged $450, for example, has no option to choose.
Imagine that the following questions represent some of those from the survey described in Exercise 1. Can you identify any problems with these questions? a. Do you believe that capitalist bankers should charge such high interest rates on credit card balances?
The problem with this question is that it is a loaded question. It uses emotionally laden wording ("capitalist bankers," "such high interest rates") that pushes the respondent toward a particular answer.
A researcher collects data on children's weights from a random sample of children in the South and concludes that children living there weigh less than the national average. The researcher, however, does not realize that the sample includes many children who are small for their age and that in reality there is no difference in weight between children in the South and the national average. What type of error is the researcher making?
The researcher is making a Type I error because he or she was claiming that there is a difference between the groups of people when in reality there was none.
If the psychology professor in Exercise 2 had access to only one section of introductory psychology, describe how she might use a single-group design to assess the effectiveness of weekly quizzes. Which of the three single-group designs would you recommend?
The researcher would use a single-group posttest-only design in which each exam given in the semester serves as a posttest measure.
Interviewer bias
The tendency for the person asking the questions to bias the subjects' answers
Response bias
The tendency to consistently give the same answer to almost all of the items on a survey.
What is the problem with the following survey question? a. Do you agree that school systems should be given more money for computers and recreational activities?
This is a double-barreled question, because it asks about two things at once (computers and recreational activities) but allows only one answer.
What is the problem with the following survey question? c. Most people feel that teachers are underpaid. Do you agree?
This is a leading question because it uses the terms "most people feel...", leading the person being asked to answer in a desired manner.
Assume that the following conclusion represents an error in hypothesis testing. Indicate whether the statement is a Type I or II error. a. Based on the data, the null hypothesis was rejected.
This is a type I error.
Assume that the following conclusion represents an error in hypothesis testing. Indicate whether the statement is a Type I or II error. c. There was no significant difference between right- and left-handers in their ability to perform a special task.
This is a Type II error, because the conclusion of no significant difference means the null hypothesis was not rejected.
Assume that the following conclusion represents an error in hypothesis testing. Indicate whether the statement is a Type I or II error. b. There was no significant difference in the quality of work between nurses who work 8- and 12-hour shifts
This is a type II error.
Assume that the following conclusion represents an error in hypothesis testing. Indicate whether the statement is a Type I or II error. d. The researcher failed to reject the null hypothesis based on these data.
This is a type II error.
What is the problem with the following survey question? b. Do you favor eliminating the wasteful excesses in the city budget?
This question is a loaded question, because it uses the words "wasteful excesses" which makes the person being asked think a certain way about the money that is going into the city budget.
A researcher is interested in whether listening to classical music improves spatial ability. She randomly assigns subjects to either a classical music condition or a no-music condition. Participants serve in the music or no music conditions for the specified time period and then are tested on their spatial ability. What type of design is this?
This study is a between-subjects design. If subjects are randomly assigned to one of the two conditions, then different subjects are used in each condition.
A researcher randomly selects a group of smokers and a group of non-smokers and then measures lung disease in each group. What type of design is this? If the researcher observes a difference between the groups in the rate of lung disease, why can he or she not conclude that the difference is caused by smoking?
This is a nonequivalent control group posttest-only design: two groups that already differ (smokers and nonsmokers) are each measured once. The researcher cannot conclude that smoking caused the difference in the rate of lung disease because smoking was not manipulated and subjects were not randomly assigned to the groups, so the groups may differ in other ways that could account for the difference.
Subject (participant) effect
Threat to internal validity in which the subject, consciously or unconsciously, affects the results of the study
If reactivity were your greatest concern in an observational study, which method would you recommend using?
To decrease the amount of reactivity, I would recommend using naturalistic observation because that way, the participants or subjects in a study are less likely to change their behavior because they are carrying on with normal activities.
If a researcher decides to use the .10 level rather than the conventional .05 level of significance, what type of error is more likely to be made? Why? If the .01 level is used, what type of error is more likely? Why?
With the .10 level of significance, the researcher is willing to accept a higher probability that the result may be due to chance. Therefore a Type I error is more likely to be made than if the researcher used the more traditional .05 level of significance. With a .01 level of significance, the researcher is willing to accept only a .01 probability that the result may be due to chance. In this case a true result is more likely to be missed and a Type II error is more likely.
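The trade-off described in this answer can be checked with a rough simulation. The sketch below assumes a two-tailed z-test with a known standard deviation and uses made-up values throughout (a comparison mean of 100, a standard deviation of 15, samples of 25, and a true effect of 5 points); none of these numbers come from the exercise.

```python
import random
from statistics import NormalDist, mean

MU0, SIGMA, N = 100.0, 15.0, 25  # hypothetical test parameters

def two_tailed_p(sample):
    """Two-tailed p-value for a z-test against MU0 with known SIGMA."""
    z = (mean(sample) - MU0) / (SIGMA / N ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate_p_values(true_mean, n_studies=2000, seed=0):
    """Run many simulated studies and collect their p-values."""
    rng = random.Random(seed)
    return [
        two_tailed_p([rng.gauss(true_mean, SIGMA) for _ in range(N)])
        for _ in range(n_studies)
    ]

null_true = simulate_p_values(true_mean=100.0)            # no real effect
effect_real = simulate_p_values(true_mean=105.0, seed=1)  # real effect

def type1_rate(alpha):
    """Share of no-effect studies that wrongly reject the null."""
    return sum(p < alpha for p in null_true) / len(null_true)

def type2_rate(alpha):
    """Share of real-effect studies that fail to reject the null."""
    return sum(p >= alpha for p in effect_real) / len(effect_real)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  Type I rate={type1_rate(alpha):.3f}  "
          f"Type II rate={type2_rate(alpha):.3f}")
```

As alpha is lowered from .10 to .01, the simulated Type I rate can only fall and the Type II rate can only rise, matching the reasoning in the answer.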
History effect
A threat to internal validity in which an outside event that is not a part of the manipulation of the experiment could be responsible for the results
Instrumentation effect
A threat to internal validity in which changes in the dependent variable may be due to changes in the measuring device.
Conceptual replication
A type of replication of research using different procedures for manipulating or measuring the variables
We discussed the history effect with respect to a study on stress reduction. Review that section and explain how having a control group of equivalent subjects would help to reveal the confound of history.
Having a control group in the stress reduction study would help reveal the confound of history because if this confound is present, we would expect the control group also to increase in stress level.
One-tailed hypothesis
A hypothesis in which only one direction of an effect or relationship is predicted in the alternative hypothesis of the test
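To see what predicting one direction does in practice, the sketch below compares the one-tailed and two-tailed p-values for the same test statistic. It assumes a z-test and uses an arbitrary illustrative value of z = 1.8 in the predicted direction; the function name is hypothetical.

```python
from statistics import NormalDist

def one_and_two_tailed_p(z):
    """Return (one-tailed, two-tailed) p-values for a z statistic.

    The one-tailed value assumes the predicted direction is 'greater than'.
    """
    standard_normal = NormalDist()
    one_tailed = 1 - standard_normal.cdf(z)
    two_tailed = 2 * (1 - standard_normal.cdf(abs(z)))
    return one_tailed, two_tailed

one_tailed, two_tailed = one_and_two_tailed_p(1.8)
print(f"one-tailed p = {one_tailed:.4f}")
print(f"two-tailed p = {two_tailed:.4f}")
```

With z = 1.8 the one-tailed p-value falls below .05 while the two-tailed p-value does not, because the two-tailed test splits the critical region across both tails.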
Exact replication
Replication of research using the same procedures for manipulating and measuring the variables that were used in the original research