Exam 2
How do we know that a weight scale is reliable?
The device shows the same weight when you step on it twice in a row
The goal of using a sample is to provide estimates about the characteristics of ____________________________________
an entire population
If the p-value is greater than alpha (p > 0.05), then we _____________________ the null hypothesis and we say the result is _________________
fail to reject; statistically nonsignificant
Scientists are afraid to commit which type of error?
false alarm
How could random errors affect the true value?
They make the observed scores fluctuate around the true value, pushing them above or below it, so we never really see the true value itself.
type of face-to-face unstructured interview in which a number of people are interviewed at the same time and share ideas both with the interviewer and with each other
focus group
A statistical table that indicates how many individuals in the sample fall into each set of categories on a quantitative variable is a
frequency distribution
One of the advantages of naturalistic research is its
high ecological validity
when criterion validity involves attempts to foretell the future (ex. predicted school performance based on SAT scores)
predictive validity
the proportion of variability in the dependent variable that is "explained by" the independent variable(s)
proportion of explained variability (indicated by the square of the effect-size statistic)
a set of fixed-format, self-report items that is completed by respondents at their own pace, often without supervision
questionnaire
refers to the selection of people who participate in a research project, usually with the goal of being able to use these people to make inferences about a larger group of individuals
sampling
the distribution of all the possible values of a statistic
sampling distribution
refers to the tendency to pay attention to the events that are occurring around you and to adjust your behavior to "fit in" with the specific situation you are in
self-monitoring
In our Meteorologist example, setting alpha to a high value will follow which known principle: "Better...."
"Better safe than sorry"
Cronbach's coefficient alpha ranges from alpha = ______________ to alpha = _______________, because it reflects the underlying correlational structure of the scale
0.00 to +1.00
What is alpha normally set to?
0.05
List two measures that are used to summarize sample data.
1. Measures of central tendency 2. Measures of variability (e.g., variance)
Here is a distribution of scores from a measure of depression: 3, 7, 9, 9, 9, 11, 13, 20, 27, 30. In this sample the range is _________ and the mode is____________.
27; 9
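The range and mode for this card can be checked in a few lines. This is only a quick sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mode

# Depression scores from the card above
scores = [3, 7, 9, 9, 9, 11, 13, 20, 27, 30]

score_range = max(scores) - min(scores)  # highest score minus lowest score
most_common = mode(scores)               # the value that occurs most often

print(score_range, most_common)
```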
According to polls, candidate 1 has a 5 percent lead over candidate 2, and the margin of error is 6. What is the likely true population value of the candidate's lead?
Somewhere inside the confidence interval set by the margin of error: 5 ± 6, that is, anywhere from -1 to +11 percent.
How can we get rid of the random errors in a study?
Take many measurements and average them. Random errors cancel each other out, so the average is a better estimate of the true value.
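A small simulation can illustrate why averaging helps. This is a sketch under stated assumptions, not a result from the course: the true value (70), the error spread (5), and the normally distributed noise are all hypothetical choices.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable
true_value = 70.0        # hypothetical true score (assumption)

# Each measurement = true value + zero-mean random error (assumed normal, SD 5)
measurements = [true_value + rng.gauss(0, 5) for _ in range(10_000)]

single_error = abs(measurements[0] - true_value)      # error of one measurement
average_error = abs(mean(measurements) - true_value)  # error of the average

print(single_error, average_error)
```

With many measurements the positive and negative errors largely offset each other, so the average usually lands much closer to the true value than a typical single measurement does.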
How do we know whether the measures actually assess the conceptual variables they are designed to measure?
One has to establish two properties of the measured variable (presumably reliability and validity).
What is the relationship between margin of error and number of sample measures?
As the sample size increases, margin of error decreases
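To see how quickly the margin of error shrinks, here is a sketch that uses the standard approximation for a sample proportion. The formula, the 95% multiplier `z = 1.96`, and the worst-case `p = 0.5` are assumptions brought in for illustration; they do not come from the cards.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion (assumed formula)."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size cuts the margin of error roughly in half
for n in (100, 400, 1600):
    print(n, margin_of_error(n))
```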
Why, in our examples, did some random errors have a positive and some have a negative sign?
Because of the fluctuations
After collecting evidence (data) we can simply calculate the descriptive statistics (such as measures of central tendency and variability). Why can't we clearly make the conclusion about the true population's parameter from such descriptive analysis?
Because there's random variability in the data
What is a sampling distribution of Null Hypothesis?
The distribution of the statistic we would expect by chance if the null hypothesis were true. It can be estimated from the data pretty well and used for hypothesis testing.
According to polls, candidate 1 has a 5 percent lead over candidate 2, and the margin of error is 6. What does it mean in terms of the possible final election outcome? Who is likely to win?
Can't trust the lead, could just be a sampling error
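The poll numbers in this card can be turned into an interval with simple arithmetic. A minimal sketch; the variable names are illustrative only.

```python
lead = 5.0             # candidate 1's lead in the poll, in percentage points
margin_of_error = 6.0  # reported margin of error

low = lead - margin_of_error   # lower end of the interval
high = lead + margin_of_error  # upper end of the interval

# The lead can only be trusted if the whole interval is above zero
lead_is_trustworthy = low > 0

print(low, high, lead_is_trustworthy)
```

Because the interval includes zero and negative values, candidate 2 could actually be ahead in the population.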
Who calculates a p-value?
Computer does
How do we know that it measures weight? [what is a concept of weight?]
Concept of weight is force placed on the scale, need validity
What should be the correlation between different measured variables in the converging validities?
Correlation should be strong
What should be the correlation between different measured variables in the discriminant validities?
Correlation should be zero
measure is an estimate of the average correlation among all the items on a scale and is numerically equivalent to the average of all possible split-half reliabilities
Cronbach's coefficient alpha
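For reference, coefficient alpha is usually computed from the item variances and the variance of the total score. The sketch below uses that standard formula, alpha = k/(k-1) × (1 - sum of item variances / variance of totals); the response data are made up for illustration.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])  # number of items on the scale
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents answering 6 items scored 1-5
responses = [
    [5, 4, 5, 4, 5, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 4],
    [4, 5, 4, 4, 5, 4],
    [1, 2, 1, 1, 2, 1],
]

alpha = cronbach_alpha(responses)
print(alpha)
```

Items that rise and fall together across respondents, as in this made-up data, give an alpha close to +1.00.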
Which error is more problematic for testing hypotheses?
Depends on the situation
Which measure of central tendency is the best?
Depends on the type of sampling distribution (if symmetric, mean is best; if skewed, median or mode is best)
Which of the four validities come(s) with a "cheaper" price?
Face and content validities are the cheaper types of validity
A failure of which of the two would lead to more detrimental effects in measurement?
Failure of validity is more problematic than failure of reliability.
In our "where is the camera" example, what was demonstrated?
Figuring something out by testing the null hypothesis
What was the main idea behind t-statistics and beer brewing, invented by William Gosset?
From a sample, you can make accurate conclusions about the population; you don't have to test the entire population to get accurate conclusions (don't have to taste all the beer, just a sample of it, main idea is generalization and sampling)
How do we know that a weight scale is reliable? (like the Wechsler Adult Intelligence Scale, not a scale for body weight)
Get the same score when you repeat the test (or close to the same score)
What do we use statistical hypothesis testing for?
To go beyond descriptive statistics and deal with random error in the data, providing evidence that either supports or falsifies the hypothesis.
What is systematic in a systematic error?
A systematic source of variability is added to the true score. It is systematic because some other conceptual variable, not random error, is influencing the measure.
What is the p-value? (define it)
The probability of getting a statistic at least as extreme as the one observed, given that the null hypothesis is true (said another way, how likely it is that the data were generated by chance alone)
Why can't we tell that some research would prove a hypothesis?
Hypothesis can't be proven, so forget about the word "prove"
How can you affect alpha and beta values?
Improve the testing measures
Describe the Null Hypothesis distribution in a case of a correlational study.
In a correlational study, the correlation coefficient indicates how strongly two variables are related to each other. Under the null hypothesis the true correlation is zero, so the null distribution is centered on zero, and any correlation coefficient above zero is assumed to have happened by chance (refer to the definition of the null hypothesis)
In the jury example, how would a judge "determine" an alpha value?
Increase the amount of evidence required to make the case
What is an alpha value?
Indicates rate of false alarms in the conclusions
Draw a 2x2 table showing the state of the world and decisions about these states and name the cells.
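Columns are the true state of the world (H₀ true / H₀ false); rows are the decision (reject H₀ / fail to reject H₀). Reject + H₀ true = Type 1 error (false alarm, rate = alpha); reject + H₀ false = correct decision (rate = power = 1 - beta); fail to reject + H₀ true = correct decision; fail to reject + H₀ false = Type 2 error (miss, rate = beta). MEMORIZE THE TABLE AND PRACTICE DRAWING IT.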
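One way to rehearse the four cells is to write them as a tiny lookup. This is just a memory aid, sketched with a hypothetical function name; the cell labels follow the Type 1 / Type 2 definitions used elsewhere in these cards.

```python
def classify(null_is_true: bool, reject_null: bool) -> str:
    """Name the cell of the 2x2 decision table."""
    if reject_null:
        # We claimed an effect
        return "Type 1 error (false alarm)" if null_is_true else "correct decision (hit)"
    # We did not claim an effect
    return "correct decision" if null_is_true else "Type 2 error (miss)"

# Print all four cells of the table
for null_is_true in (True, False):
    for reject_null in (True, False):
        print(null_is_true, reject_null, classify(null_is_true, reject_null))
```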
Order the measures of central tendencies with respect to their sensitivity to outliers, from the least to the most sensitive.
Mode, median, mean
How do we know that it measures weight? [what is a concept of weight?] (like the Wechsler Adult Intelligence Scale, not a scale for body weight)
Need validity
What is a null hypothesis (Ho)?
No relationship between the variables of interest, and even if there is a relationship, it probably happened by chance
What could be a source(s) of random errors in psychological sciences?
Participant, test, researcher
What is a beta value?
Rate of misses in hypothesis testing (how often you fail to reject the null hypothesis when it is false, ex. someone was deemed not guilty in court when in reality they are guilty)
How do we determine an interview's validity in practice?
Try to quantify the information from the interview, then wait 20 years to see whether it was predictive.
Which one is more problematic for researchers: sampling error or sampling bias?
Sampling bias is more problematic than sampling error
What is the most important property of random errors?
Self-cancelling
Describe the Null Hypothesis distribution in the case of a study comparing two groups' scores.
Subtract one group's score from the other's and plot the distribution of that difference. If the groups are the same, as the null hypothesis assumes, the difference will be zero, so the null distribution is centered on zero
What does the Null Hypothesis "tell" about the property under investigation?
Tells us that there's no relationship between variables and if there is a relationship, it happens by chance (basically the definition of the null hypothesis)
What is the margin of error?
The confidence interval around an estimated value, within which we can expect the true population value to fall.
What is an actual score?
The observed score: the measured value of the variable of interest that you would like to learn about (true score plus error)
Why can't we use the words "prove" and "proof" to describe the results of hypothesis testing?
There's random variability, so you can't prove a hypothesis because you can't get 100% clear results
What is the relationship between alpha and beta?
They are inversely related. Increase alpha, beta decreases. Increase beta, alpha decreases.
Define a concept of power in terms of a beta value.
They are inversely related. Larger beta, smaller power. Smaller beta, larger power (power = 1 - beta).
Why are random errors considered random?
They happen haphazardly. There's no systematic reason behind them; it could be a glitch or a recording accident
Why would you compare p-value with alpha?
To find significant results
In what sense did we use an analogy of soup making and the main topic in chapter 6?
You only need to try a few samples (bites) of the soup, not eat the entire pot. In the same way, the sample does not need to be the whole population; a few representative samples of the population are enough.
Do you remember what is a likely validity value of an interview? And what does it mean?
Very low validity. It mostly helps you look at the personality side of the person being interviewed.
Do you want p-value to be high or low in your study?
Want it to be very low
Who calculates an alpha-value?
We do
Does the following hold: If the results are reliable then they are valid?
We don't know whether they are valid or not. They could be valid or not valid (but if they are valid, then they are reliable).
In the case of Hypothesis testing flow chart (Figure 8.1) we used an analogy about sausage making that is related to the seeming logical paradox. What was the issue about?
We think of our idea using a research hypothesis, but then end up testing the idea by using the null hypothesis
Using the conventional approach to hypothesis testing (shown in the book, Figure 8.1) when would you find a significant result?
When we compare the p-value to alpha and find that p < alpha
Define the true value in the following: Hypothesis 2: More caffeine increases alertness. The truth in this case relates to which group?
Whether adding more caffeine will change alertness (subtract less caffeine participants from more caffeine participants and see whether the difference is more than zero)
What is the main goal of an interview?
To find out whether you can read information about the qualities of the interviewee from their self-report
An employer is interested in learning about the employees in his company. He interviews each of his employees, one at a time, using a short interview. He has conducted which of the following?
a census
A Type 2 error occurs when
a researcher fails to reject the null hypothesis when it is false.
What is a research hypothesis?
a specific and falsifiable prediction regarding the relationship between or among two or more variables
Which of the following will be likely to increase statistical significance? a. A larger sample size b. A smaller alpha c. A smaller effect size d. A two-sided (rather than a one-sided) p-value
a. A larger sample size
In a study involving altruism in restaurants, Elena focused only on the helping behaviors that each restaurant employee demonstrated (and not on any other types of behaviors). This is an example of which of the following? a. Event sampling b. Time sampling c. Individual sampling d. Snowball sampling
a. Event sampling
Which of the following is true about the difference between random and systematic error? a. Random error is self-canceling, whereas systematic error tends to increase or decrease the scores on the measured variable. b. Random error tends to increase or decrease the scores on the dependent variable, whereas systematic error is self-canceling. c. Both random error and systematic error are self-canceling. d. Because systematic error is self-canceling, it is less problematic in research than is random error.
a. Random error is self-canceling, whereas systematic error tends to increase or decrease the scores on the measured variable.
A report indicates that a survey is accurate within "plus or minus 1 percent." This statement refers to which of the following? a. The margin of error b. The standard deviation c. The response rate d. The sampling plan
a. The margin of error
Which of the following is an example of inferential statistics? a. William uses a small sample of depressed people to predict mood changes in a large population of depressed people. b. Charles interviews three schizophrenic patients. c. Leslie notes the relationships between a psychology professor and the students in her class. d. Janet gives participants a personality test both before and after they take a final exam.
a. William uses a small sample of depressed people to predict mood changes in a large population of depressed people.
What is the equation for the actual score?
actual score = true score + random error
Which of the following could lead a scientist to fail to detect a relationship between variables? a. High statistical power b. A small alpha c. A small beta d. A large effect size
b. A small alpha
Which of the following is true about the p-value? a. It indicates how many observations have been made. b. It indicates whether the result is likely to have occurred by chance. c. It indicates the likelihood of having made a Type 1 error. d. It indicates the likelihood of having made a Type 2 error.
b. It indicates whether the result is likely to have occurred by chance.
the probability of the scientist making a type 2 error
beta
the sampling distribution for events that have two equally likely possibilities (ex. correct and incorrect guesses)
binomial distribution
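The binomial distribution for a guessing task can be written out directly. The sketch below assumes 10 independent guesses, each with a 0.5 chance of being correct; the number of guesses is a hypothetical choice.

```python
from math import comb

def binomial_prob(k, n, p=0.5):
    """Probability of exactly k correct guesses out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Sampling distribution for 10 guesses: probability of 0, 1, ..., 10 correct
probs = [binomial_prob(k, 10) for k in range(11)]

for k, prob in enumerate(probs):
    print(k, prob)
```

Outcomes near 5 correct are the most likely, and extreme outcomes (0 or 10 correct) are rare, which is what makes them surprising under chance alone.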
In terms of age, a twelve-year-old college student can probably be considered to be which of the following? a. A skewed value b. A modal value c. An outlier d. Close to the median
c. An outlier
Which of the following is true regarding Type 2 errors? a. They are less likely to occur when alpha is smaller. b. They increase as N increases. c. They are generally of less concern to scientists than are Type 1 errors. d. They can be modified to show statistical significance.
c. They are generally of less concern to scientists than are Type 1 errors.
A test designed to measure baseball performance skills would be said to have high predictive validity if a person who scores low on the test also a. scores low on a test measuring basketball performance. b. scores high on a test measuring basketball performance. c. has a poor batting average over the baseball season. d. scores high on the same test given two weeks later.
c. has a poor batting average over the baseball season.
Rob is studying the relationship between caffeine intake and aggression. He found that the amount of caffeine consumed is positively correlated with aggressive behavior. Which of the following p-values would support his research hypothesis? a. p > .05. b. p = .06. c. p = .0089 d. p < -.05
c. p = .0089
to measure each person about whom we wish to know
census
when criterion validity involves assessment of the relationship between a self-report and a behavioral measure that are assessed at the same time
concurrent validity
refers to the extent to which a measured variable actually measures the conceptual variable (that is, the construct) that it is designed to assess
construct validity
concerns the degree to which the measured variable appears to have adequately sampled from the potential domain of questions that might relate to the conceptual area of interest
content validity
refers to the extent to which a measured variable is found to be related to other measured variables designed to measure the same conceptual variable
convergent validity
the correlation that occurs when validity is assessed through correlation of a self-report measure with a behavioral measured variable
criterion validity
the behavioral variable when validity is assessed through correlation of a self-report measure with a behavioral measured variable
criterion variable
Donna is in a two-part experiment. She completed a self-esteem measure one week ago and is now taking the same measure again. Which of the following problems may arise from this process? a. Systematic error phenomenon b. Violation of face validity c. Equivalent form effects d. Retesting effects
d. Retesting effects
Which of the following is an example of systematic error? a. The participant misreads the question b. The experimenter misprints the question. c. The participant forgets to answer a question. d. The participant displays socially desirable responding.
d. The participant displays socially desirable responding.
Which of the following research approaches is most likely to have ethical problems? a. The acknowledged participant b. The case study c. The acknowledged observer d. The unacknowledged observer
d. The unacknowledged observer
refers to the extent to which a measured variable is found to be unrelated to other measured variables designed to assess different conceptual variables
discriminant validity
Research that is conducted in situations that are similar to the everyday life experiences of the participants is said to have
ecological validity
indicates the magnitude of a relationship
effect size
in this approach, two different but equivalent versions of the same measure are given at different times and the correlation between the scores on the two variables is assessed
equivalent-forms reliability
Consider the following item that might appear on a Likert scale: "I normally do not get along well with other people." Such an item has high __________ validity, but may not have __________ validity because respondents may not answer it honestly.
face; construct
refers to the extent to which the measured variable appears to be an adequate measure of the conceptual variable
face validity
using sample data to draw inferences about the true state of affairs
inferential statistics
refers to the extent to which the scores on the items correlate with each other and thus are all measuring the true score rather than random error
internal consistency
the reliability that occurs when the internal consistency of a group of judges is calculated
interrater reliability
questions are read to the respondent in person or over the phone
interview
A measure only has construct validity if _________________________________
it measures what we want it to measure
used to measure agreement among judges when the variable(s) of interest is (are) nominal; ranges from k=0.00 to k=+1.00
kappa (k)
In comparison to interviews, questionnaires are
less likely to be influenced by the presence of the researcher.
A scientist has concluded that two variables are significantly correlated when in fact they are not. The scientist has
made a Type 1 error
Do you usually specify a null hypothesis in your study?
no
an effect size of zero indicates ___________ and a larger (positive) effect size indicates __________________
no relationship between the variables; stronger relationships between the variables
the complicated pattern formed from the relationships among the many different measured variables, both self-report and otherwise
nomological net
the assumption that the observed data reflect only what would be expected under the sampling distribution (the sampling distribution being what would happen by chance)
null hypothesis (symbolized as H₀)
A researcher who used a self-report measure of job interest to predict performance on a managerial task would be interested in the _____ of her self-report measure.
predictive validity
Case studies are frequently based on a descriptive record of
one or two individuals who have abnormal experiences.
Results are said to be statistically significant if
p < alpha
each statistic has this that shows the likelihood of an observed statistic occurring on the basis of the sampling distribution
p-value (probability value)
involves trying out a questionnaire or other research on a small group of individuals to get an idea of how they react to it before the final version of the product is created
pilot testing
Questionnaires are good because they provide ___________________________________ however, they are limited as they don't provide _________________________________________
plenty of information; explanations of the relationships (typically)
the entire group of people that the researcher desires to learn about
population
probability that the researcher will be able to reject the null hypothesis given that the null hypothesis is actually false and thus should be rejected
power (power of a statistical test)
What is the equation of power? (power of a statistical test)
power = 1 - beta
chance fluctuations in measurement; usually cancel each other out
random error
Sampling error is _______________________________
random error
The researcher hopes to ______________ the null hypothesis
reject
If the p-value is less than alpha (p < 0.05), then we __________ the null hypothesis, and we say the result is ____________________
reject; statistically significant
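The decision rule on this card (and its mirror image for p > alpha) fits in one small function. A minimal sketch; the function name is hypothetical and alpha defaults to the conventional 0.05.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare the p-value to alpha and state the conclusion."""
    if p_value < alpha:
        return "reject H0: statistically significant"
    return "fail to reject H0: statistically nonsignificant"

print(decide(0.0089))  # a p-value below alpha
print(decide(0.06))    # a p-value above alpha
```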
If the results of an experiment are determined to be statistically significant, then the experimenter should
reject the null hypothesis
the extent to which a measure is free from random error
reliability
What is the equation for reliability?
reliability = true score/actual score
the percentage of people who actually complete the questionnaire and return it to the investigator
response rate
when the same or similar measures are given twice, responses on the second administration may be influenced by the measure having been taken the first time
retesting effect
If a researcher wanted to be certain to detect even a small relationship between variables, she would
set alpha to be larger.
the standard that the observed data must meet
significance level (or alpha)
correlate a person's score on one half of the items (for instance, the even-numbered items) with their score on the other half of the items (the odd-numbered items). If the scale is reliable, then the correlation between the two halves will approach r=1.00, indicating that both halves measure the same thing
split-half reliability
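The odd/even procedure described in this card can be sketched as follows. The responses are made up for illustration, and the Pearson correlation is written out by hand so the example has no dependencies.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: 5 respondents answering 6 items scored 1-5
responses = [
    [5, 4, 5, 4, 5, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 4],
    [4, 5, 4, 4, 5, 4],
    [1, 2, 1, 1, 2, 1],
]

odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6

split_half_r = pearson_r(odd_half, even_half)
print(split_half_r)
```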
other conceptual variables, such as levels of stress, moods, or even preference for classical over rock music, which are personality variables that are expected to change within the same person over short periods of time
states
The smaller the alpha value, the more ___________ the standard
stringent
uses quantitative fixed-format items
structured interview
a series of self-report measures administered either through an interview or a written questionnaire
survey
errors (and variables) that systematically increase or decrease the scores on the measured variable
systematic error
Event sampling, individual sampling, and time sampling are all used in
systematic observation
A university needs to create a sample of students to interview about their perceptions of the campus climate. They find a computerized list of all of the currently registered students and choose every fiftieth name on the list. What sampling method has been used?
systematic random sampling
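Choosing every fiftieth name from a list is easy to express with slicing. The roster below is hypothetical (1,000 placeholder names); only the "every fiftieth" step comes from the card.

```python
# Hypothetical computerized list of registered students
students = [f"student_{i:04d}" for i in range(1, 1001)]

step = 50
# Take the 50th, 100th, 150th, ... name on the list
sample = students[step - 1::step]

print(len(sample), sample[:3])
```

A common variation picks a random starting point within the first 50 names and then steps by 50 from there.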
refers to the extent to which scores on the same measured variable correlate with each other on two different measurements given at two different times
test-retest reliability
Discriminant validity refers to
the extent to which a measured variable does not correlate with other variables designed to measure different conceptual variables.
If sampling distribution is perfectly symmetric and bell shaped, then the best measure of central tendency to use is _____________________
the mean
In an experimental research design, what is H₀? (the null hypothesis)
the mean score on the dependent variable is the same in all of the experimental groups
Which descriptive statistic would best portray the central tendency of the following distribution (1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 1000)?
the median
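Computing both statistics for this distribution shows why the median is the better choice here. A quick sketch using Python's standard `statistics` module.

```python
from statistics import mean, median

data = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 1000]

data_mean = mean(data)      # pulled far upward by the single outlier (1000)
data_median = median(data)  # middle value, unaffected by the outlier

print(data_mean, data_median)
```

The mean lands far above ten of the eleven scores, while the median stays in the middle of the bulk of the data.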
Sampling bias occurs when
the probability with which the members of the population have been selected is not known.
Conclusions can be drawn about the population on the basis of ____________________________
the properties of samples
In a correlational research design, what is H₀? (the null hypothesis)
there is no correlation between the measured variables
What does rejecting the null hypothesis mean?
to be able to conclude that the observed data were caused by something other than chance alone
conceptual variables such as intelligence, friendliness, assertiveness, and optimism, which are personality variables that are not expected to vary (or at most vary only slowly) within people over time
traits
the part of the scale score that is not random error of the individual on the measure
true score
Define the true value in the following: Hypothesis 1: Males are more risk prone than females. B. The truth in this case relates to which group?
true value: the difference between males' and females' scores (subtract females from males) B. It relates to the difference between the groups' scores, not to particular males and females
take into consideration that unusual outcomes may occur in more than one way
two-sided p-values
type of error that occurs when you reject the null hypothesis when it is really true (concluding that a psychotherapy program reduces anxiety when it really doesn't; test says you have covid when you really don't)
type 1 error (aka false alarm)
type of error that occurs when you fail to reject the null hypothesis when the null hypothesis is really false (concluding that a psychotherapy program is not working even though it really is; test says you don't have covid when you really do)
type 2 error (aka miss)
A measured variable that contains a large proportion of random error is said to be
unreliable
the interviewer talks freely with the person being interviewed about many topics
unstructured interview
Which ingredients make a research hypothesis?
variables and relationship
A Type 1 error occurs when
we reject the null hypothesis when it is true.