Research Methods Ch 3 (exam 1)


Divergent validity

- Demonstrated by showing little or no relationship between the measurements of two different constructs.
- Involves demonstrating that we are measuring one specific construct and not combining two different constructs in the same measurement process.
- The goal is to differentiate between two conceptually distinct constructs by measuring both constructs and then showing that there is little or no relationship between the two measurements.

Four types of measurement scales:

1. Nominal 2. Ordinal 3. Interval 4. Ratio

Types and Measures of Reliability

1. Successive measurements 2. Simultaneous measurements 3. Internal consistency

Successive measurements

1. Test-retest Reliability 2. Parallel-forms reliability

Common sources of error that influence reliability

1. Observer error 2. Environmental changes 3. Participant changes

How are reliability and validity related? How are they independent?

A measure cannot be valid unless it is reliable, but a measure can be reliable without being valid. For example, we could measure your height and claim that it is a measure of intelligence. Although this is a foolish and invalid method for defining and measuring intelligence, it would be very reliable, producing consistent scores from one measurement to the next. Thus, consistency of measurement is no guarantee of validity.

split-half reliability

A measure of reliability obtained by splitting the items on a questionnaire or test in half, computing a separate score for each half, and then measuring the degree of consistency between the two scores for a group of participants. ex) use of exams that consist of multiple items (questions or problems) to measure performance in an academic course
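The split-half procedure can be sketched in code. This is a minimal illustration with made-up exam scores; the odd/even split rule and the data are assumptions for demonstration, not from the text:

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation between two equal-length score lists
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    # item_scores: one list of item scores per participant.
    # Split the items into odd/even halves, total each half,
    # then correlate the two half-scores across participants.
    first = [sum(p[0::2]) for p in item_scores]
    second = [sum(p[1::2]) for p in item_scores]
    return pearson(first, second)

# Hypothetical data: 4 participants x 6 exam items (1 = correct, 0 = wrong)
scores = [
    [1, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1],
]
r = split_half_reliability(scores)
```

A higher correlation between the two halves indicates more internally consistent items.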

Self Report measure

A measurement obtained by asking a participant to describe his or her own attitude, opinion, or behavior.

Behavioral Measures

A measurement obtained by the direct observation of an individual's behavior. pro: provides researchers with a vast number of options, making it possible to select the behaviors that seem best for defining and measuring the construct. For example, the construct "mental alertness" could be operationally defined by behaviors such as reaction time, reading comprehension, logical reasoning ability, or ability to focus attention. The behavior can also be the variable of interest in its own right.

operational definition

A procedure for indirectly measuring and defining a construct (a variable that cannot be observed or measured directly). An operational definition specifies a measurement procedure (a set of operations) for measuring an external, observable behavior and uses the resulting measurements as a definition and a measurement of the hypothetical construct.

double blind

A research study in which both the researcher and the participants are unaware of the predicted outcome for any specific participant.

Interval scale

A scale of measurement in which the categories are organized sequentially and all categories are the same size. The zero point of an interval scale is arbitrary and does not indicate a total absence of the variable being measured. ex) temperature

Ratio Scale

A scale of measurement in which the categories are sequentially organized, all categories are the same size, and the zero point is absolute or nonarbitrary, and indicates a complete absence of the variable being measured. •We can measure the absolute amount of the variable. •We can measure the direction and magnitude of the difference between measurements and describe differences in terms of ratios. ex) glasses of water

nominal scale

A scale of measurement in which the categories represent qualitative differences in the variable being measured. The categories have different names but are not related to each other in any systematic way.

ordinal scale

A scale of measurement on which the categories have different names and are organized sequentially (for example, first, second, third).

Face Validity

An unscientific form of validity concerning whether a measure superficially appears to measure what it claims to measure; a subjective judgment that asks whether the measurement technique looks like it measures the variable it is supposed to measure.

demand characteristics

Any potential cues or features of a study that (1) suggest to the participants the purpose and hypothesis of the study and (2) influence the participants to respond or behave in a certain way. Demand characteristics are artifacts and can threaten the validity of the measurement, as well as both internal and external validity.

Constructs

Hypothetical attributes or mechanisms that help explain and predict behavior in a theory.

single blind

A research study conducted by an experimenter (assistant) who does not know the expected results, so that the experimenter cannot influence the participants toward the predicted outcome.

Theories

In the behavioral sciences, statements about the mechanisms underlying a particular behavior.

Artifact

In the context of a research study, an external factor that could influence or distort measures. Artifacts threaten the validity of the measurement, as well as both internal and external validity. ex) a doctor who startles you with an ice-cold stethoscope is probably not going to get accurate observations of your heartbeat.

Physiological Measures

Measurement obtained by recording physiological activity (heart rate, brain scanning such as MRI or PET, etc.) pro: highly objective

Multiple measures

One method of obtaining a more complete measure of a construct is to use two (or more) different procedures to measure the same variable ex) could measure both heart rate and behavior as measures of fear

participant reactivity

Participants' modification of their natural behavior in response to the fact that they are participating in a research study or the knowledge that they are being measured. Reactivity is an artifact and can threaten the validity of the measurement, as well as both internal and external validity.

pros and Limitations of self report

Pros: Each individual is in a unique position of self-knowledge and self-awareness, and a direct question and its answer have more face validity than measuring some other response that theoretically is influenced by fear. Cons: It is easy for participants to distort self-report measures; they may deliberately lie to create a better self-image, or a response may be influenced subtly by the presence of a researcher, the wording of the questions, or other aspects of the research situation -> undermines validity

Range effect

The clustering of scores at one end of a measurement scale.

Ceiling effect

The clustering of scores at the high end of a measurement scale, allowing little or no possibility of increases in value.

floor effect

The clustering of scores at the low end of a measurement scale, allowing little or no possibility of decreases in value; a type of range effect.

inter-rater reliability

The degree of agreement between two observers who simultaneously record measurements of a behavior. can be measured by computing the correlation between the scores from the two observers or by computing a percentage of agreement between the two observers
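The two scoring approaches named above can be sketched in code. The ratings below are hypothetical, used only to show how percent agreement and a correlation between two observers would each be computed:

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation between two equal-length rating lists
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def percent_agreement(x, y):
    # Proportion of observations on which the two observers agree exactly
    return sum(a == b for a, b in zip(x, y)) / len(x)

# Hypothetical ratings: two observers scoring the same 5 behaviors
rater_a = [1, 2, 3, 4, 5]
rater_b = [1, 2, 3, 4, 4]
agreement = percent_agreement(rater_a, rater_b)  # 4 of 5 ratings match -> 0.8
r = pearson(rater_a, rater_b)
```

Percent agreement requires exact matches, while the correlation also rewards observers whose ratings differ by a constant amount but move together.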

Reliability

The degree of stability or consistency of measurements. If the same individuals are measured under the same conditions, a reliable measurement procedure will produce identical (or nearly identical) measurements.

Validity

The degree to which the measurement process measures the variable it claims to measure.

Observer error

The individual who makes the measurements can introduce simple human error into the measurement process, especially when the measurement involves a degree of human judgment. ex) the same pitch in baseball could be called a ball one time and a strike another ex) the same essay could get an A one semester and a B another time

Participant changes

The participant can change between measurements. ex) a person's degree of focus and attention can change quickly and can have a dramatic effect on measures of reaction time.

scales of measurement

The set of categories used for classification.

predictive validity

The type of validity demonstrated when scores obtained from a measure accurately predict behavior according to a theory. ex)school districts often attempt to identify at-risk students so that preventive interventions can be arranged before academic or behavioral problems develop. One technique that is used to identify at-risk students is the Student Risk Screening Scale (SRSS). A recent study (Menzies & Lane, 2012) evaluated the predictive validity of this test by administering it to students three times during a school year. Scores on the test significantly predicted teacher ratings of student self-control and student performance in language arts

Concurrent Validity

The type of validity demonstrated when scores obtained from a new measure are directly related to scores obtained from a more established measure of the same variable. ex) if you develop a new way to measure intelligence, you could compare results from your test with the IQ test

What is the goal of an operational definition?

To provide a definition and a method for measuring a hypothetical construct

What are the two general criteria for evaluating the quality of any measurement procedure?

Validity and reliability

Simultaneous measurements

When measurements are obtained by direct observation of behaviors, it is common to use two or more separate observers who simultaneously record measurements (inter-rater reliability). ex) two psychologists may watch a group of preschool children and observe social behaviors

Advantage and disadvantage of multiple measures

advantage: provides more confidence in the validity of the measurements disadvantage: •Statistical analysis and interpretation are more complex •The two measures may not behave in the same way (large change in heart rate, no change in behavior)

How do you determine consistency of a relationship?

by computing a correlation between the two measures.

types of range effects

ceiling and floor effects

internal consistency

A complex construct such as intelligence or personality is measured using a test or questionnaire consisting of multiple items. The idea is that no single item or question is sufficient to provide a complete measure of the construct. Measured by split-half reliability.

Convergent Validity

Creating two different methods for measuring the same construct, and then showing that the two methods produce strongly related scores. The goal is to demonstrate that different measurement procedures "converge" (or join) on the same construct.

Construct Validity

demonstrated when scores obtained from a measurement behave exactly the same as the variable itself. Construct validity is based on many research studies and grows gradually as each new study contributes more evidence.

How are validity and reliability established?

By demonstrating the consistency of a relationship between two different measurements.

In many theories, constructs can be influenced by external stimuli and can influence

external behaviors. ex) external factors such as an upcoming exam can affect anxiety (a construct) and anxiety can then affect behavior (worry, nervousness, increased heart rate, and/or lack of concentration). Although researchers may not be able to observe and measure a construct directly, it is possible to examine the factors that influence a construct and the behaviors that are influenced by the construct.

(T/F): ordinal measurements allow us to determine the magnitude of the difference between the two individuals.

false. we can only determine direction of difference

In a _______ study, participants are observed in their natural environment and are much less likely to know that they are being investigated, hence they are less reactive

field

example of operational definition

intelligence (construct) may be operationally defined with IQ test (external behavior)

Reactivity is especially a problem in studies conducted in a ________, where participants are fully aware that they are participants in a study

laboratory

Example of convergent and divergent validity

Researchers used a scale (the RQI) to measure relationship quality in five specific domains: emotional intimacy, sexual relationship, support transactions, power sharing, and problem solving. The researchers demonstrated convergent validity by showing strong relationships among the five RQI scale ratings, indicating that the five domains of the RQI are converging on the same construct (relationship quality). After establishing convergent validity, however, the researchers wanted to demonstrate that the RQI is really measuring relationship quality and not some other variable. For example, the scores may actually reflect the general level of satisfaction with the relationship rather than its quality; it is possible for couples to be satisfied with a low-quality relationship. To resolve this problem, it is necessary to demonstrate that the two constructs, "quality" and "satisfaction," are separate and distinct. The researchers established divergent validity by showing a weak relationship between the RQI quality scores and measures of general satisfaction. Specifically, correlations between the domain-specific measures of quality from the RQI and global relationship satisfaction scores were generally low.

example of reliability

If we measure your intelligence with an IQ test, the score we get is determined partially by your actual level of intelligence (your true score), but it is also influenced by a variety of other factors such as your current mood, your level of fatigue, your general health, how lucky you are at guessing on questions to which you do not know the answers, and so on. These other factors are lumped together as error and are typically a part of any measurement.

A research study reports that participants who scored high on a new test measuring self-esteem made eye contact during an interview, whereas participants who scored low on the test avoided eye contact. Assuming that more eye contact is associated with higher self-esteem, what kind of validity is being demonstrated?

predictive

Limiting experimenter bias

single blind research and double blind research

Environmental change

There are small changes in the environment from one measurement to another, and these small changes can influence the measurements. ex) time of day, temperature, weather conditions, and lighting

parallel forms reliability

Type of reliability established by comparing scores obtained by using two alternate versions of a measuring instrument to measure the same individuals, then calculating a correlation between the two sets of scores.

test-retest reliability

Type of reliability established by comparing the scores obtained from two successive measurements of the same individuals, then calculating a correlation between the two sets of scores.
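The test-retest calculation can be sketched in code. The scores below are hypothetical (same five people tested twice); the same correlation step would apply to parallel-forms reliability with scores from two alternate test versions:

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation between two equal-length score lists
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores: the same 5 people measured on two occasions
time1 = [100, 95, 110, 120, 105]
time2 = [98, 97, 108, 118, 107]
test_retest_r = pearson(time1, time2)
```

A value near 1.0 means the measurement procedure ranks and spaces people consistently across the two occasions, i.e. high test-retest reliability.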

Operational definitions can be used as a basis for measuring variable

we could measure hunger for a group of rats by recording how much food each animal eats when given free access to a dish of rat chow. The amount that each rat eats defines how hungry it is.

example of construct validity

you are examining a measurement procedure that claims to measure aggression. Past research has demonstrated a relationship between temperature and aggression: In the summer, as temperature rises, people tend to become more aggressive. To help establish construct validity, you would need to demonstrate that the scores you obtain from the measurement procedure also increase as the temperature goes up. You would also need to examine all the past research on aggression and show that the measurement procedure produces scores that behave in accordance with everything that is known about the construct "aggression."

Limitations of Operational Definitions

•An operational definition is not the same as the construct itself. •There are always concerns about the quality of operational definitions and the measurements they produce. •The operational definition may not be an accurate reflection of the construct (it may leave out important components of the construct, or may include extra components that are not part of the construct).

Limitations of behavioral measures

•A behavior may be only a temporary or situational indicator of an underlying construct. •It is best to measure a cluster of related behaviors rather than rely on a single indicator. •A complete definition of behavior could require several behavioral indicators.

types of artifacts

•Experimenter bias •Demand characteristics •Participant reactivity

Six definition of validity

•Face validity •Concurrent validity •Predictive validity •Construct validity •Convergent validity •Divergent validity

Other Aspects of Measurement

•Multiple measures •Sensitivity and range effects •Experimenter bias and participant reactivity •Selection of a measurement procedure

What is the scale's ability in comparing different measurements?

•Nominal: reveals whether a difference exists. •Ordinal: indicates the direction of the difference (which is more, and which is less). •Interval and ratio: determine the direction and the magnitude of a difference.

3 types of consistency

•Positive relationship (correlation) •Negative relationship •No relationship

Three categories that define the three different types, or modalities, of measurement.

•Self-report •Physiological •Behavioral

limitations of physiological measures

•Typically require equipment that may be expensive or unavailable. •The presence of monitoring devices creates an unnatural situation that may cause participants to react differently than they would under normal circumstances. •Do they provide a valid measure of the construct? e.g., heart rate increased, but did this mean fear, anxiety, arousal, or embarrassment?

Experimenter bias

•Ways an experimenter can influence a participant's behavior (Rosenthal and Fode, 1963): •Paralinguistic cues (variations in tone of voice) that influence the participants to give the expected or desired responses. •Kinesthetic cues (body posture or facial expressions). •Misjudgment of participants' responses in the direction of the expected results. •By not recording participants' responses accurately (errors in recording of data) in the direction of the expected or desired results.

Selecting a Measurement Procedure

•The best starting point for selecting a measurement procedure is to review past research reports involving the variables or constructs to be examined. •Need to decide what type of scale of measurement (nominal, ordinal, interval, or ratio) is appropriate for the kind of conclusion you would like to make.

