Psych 155 test 2

Three different types of reliability

test-retest reliability, internal consistency, and scorer reliability and agreement

written test

a paper and pencil test in which a test taker must answer a series of questions

Cohen's kappa

an index of agreement for two sets of scores or ratings
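
A minimal sketch of the computation, using hypothetical ratings from two scorers (the data and the cohens_kappa helper are illustrative, not from the course materials):

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Agreement corrected for chance: kappa = (p_o - p_e) / (1 - p_e)."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    p_o = np.mean(r1 == r2)  # observed proportion of agreement
    # chance agreement: summed products of each rater's category proportions
    p_e = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in np.union1d(r1, r2))
    return (p_o - p_e) / (1 - p_e)

# Two scorers rating the same 8 responses (1 = pass, 0 = fail)
print(cohens_kappa([1, 1, 0, 1, 0, 0, 1, 1],
                   [1, 1, 0, 0, 0, 1, 1, 1]))  # ~0.47
```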

measurement error

variations or inconsistencies in the measurements yielded by a test or survey

generalizable

when a test can be expected to produce similar results even though it has been administered in different locations

practice effects

when test takers benefit from taking a test the first time (practice) because they are able to solve problems more quickly and correctly the second time they take the same test

item response theory (IRT)

a theory that relates the performance of each item to a statistical estimate of the test taker's ability on the construct being measured
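
As an illustration of how an item's statistics map ability to performance, here is a sketch of the standard two-parameter logistic (2PL) item characteristic curve; the course card names no specific model, and the parameter values here are hypothetical:

```python
import numpy as np

def icc_2pl(theta, a, b):
    """P(correct) given ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# For a test taker of average ability (theta = 0):
print(icc_2pl(0.0, a=1.0, b=-1.0))  # easy item: ~0.73 chance of success
print(icc_2pl(0.0, a=1.0, b=1.0))   # hard item: ~0.27 chance of success
```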

item difficulty

the percentage of test takers who answer a question correctly
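
A quick sketch of the calculation on a made-up response matrix (rows are test takers, columns are items, 1 = correct):

```python
import numpy as np

responses = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])

# Item difficulty (p value) = proportion answering each item correctly
print(responses.mean(axis=0))  # [0.75 0.75 0.25]
```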

face validity

the perception of the test taker that the test measures what it is supposed to measure

test specifications

the plan prepared before test development that documents the content to be covered, the format of the questions, and the target audience whose behavior is measured

test format

the type of questions on a test

differential validity

when a test yields significantly different validity coefficients for subgroups

intrascorer reliability

the consistency with which a single scorer (e.g., a clinician) assigns scores from test to test

concurrent evidence of validity (concurrent method)

a method for establishing evidence of validity based on a test's relationships with other variables in which test administration and criterion measurement happen at roughly the same time

generalizability theory

a proposed method for systematically analyzing the many causes of inconsistency or random error in test scores, seeking to find systematic error that can then be eliminated

heterogeneous test

a test that measures more than one trait or characteristic

testing universe

the body of knowledge or behaviors that a test represents

reliable test

a test that consistently yields the same measurements for the same phenomena

random responding

responding to items in a random fashion by marking answers without reading or considering the items

focus group

a method that involves bringing together people who are similar to the target respondents in order to discuss issues related to the survey

pilot test

a scientific investigation of a new test's reliability and validity for its specified purpose

test item

a stimulus or test question

practical test

a test in which a test taker must actively demonstrate skills in specific situations

discriminant evidence of validity

one of two strategies for demonstrating construct validity; showing that measures of constructs that theoretically should not be related are indeed unrelated; evidence that test scores are not correlated with measures of unrelated constructs

response sets

patterns of responding to a test or survey that result in false or misleading information

test of significance

the process of determining what the probability is that a study would have yielded the observed results simply by chance
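
For a validity coefficient, this amounts to asking how likely the observed correlation would be if the true correlation were zero; a sketch using scipy with invented scores and criterion ratings:

```python
from scipy import stats

# Hypothetical test scores and criterion ratings for 10 people
test_scores = [48, 52, 55, 60, 61, 65, 70, 72, 75, 80]
criterion = [2.1, 2.4, 2.3, 3.0, 2.8, 3.2, 3.5, 3.4, 3.9, 4.1]

r, p = stats.pearsonr(test_scores, criterion)
print(f"r = {r:.2f}, p = {p:.4f}")  # a small p means chance alone is unlikely
```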

surveys

instruments used for gathering information from a sample of the individuals of interest

single-group validity

when a test is valid for one group but not for another group, such as valid for whites but not for blacks

qualitative analysis

when test developers ask test takers to complete a questionnaire about how they viewed the test and how they answered the questions

true/false

a test item that asks, "Is this statement true or false?"

field test

an administration of a survey or test to a larger representative group of individuals to identify problems with administration, item interpretation, and so on

convergent evidence of validity

one of two strategies for demonstrating construct validity showing that constructs that theoretically should be related are indeed related; evidence that the scores on a test correlate strongly with scores on other tests that measure the same construct

operational definition

specific behaviors that define or represent a construct

projective tests

tests that are unstructured and require test takers to respond to ambiguous stimuli

testing environment

the circumstances under which a test is administered

cumulative model of scoring

the more the test taker responds in a particular fashion (either with "correct" answers or ones that are consistent with a particular attribute), the more the test taker exhibits the attribute being measured (e.g., multiple-choice questions)

experimental research techniques

research designs that provide evidence for cause and effect

individually administered surveys

surveys administered by a facilitator in person for respondents to complete in the presence of the facilitator

face-to-face surveys

surveys in which an interviewer asks a series of questions in a respondent's home, a public place, or the researcher's office

descriptive research techniques

techniques that help us describe a situation or phenomenon

reliability

the consistency with which an instrument yields measurements

subjective test format

a test format that does not have a response that is designated as "correct"; interpretation of the response as correct or providing evidence of a specific construct is left to the judgment of the person who administers, scores, or interprets the test taker's response

projective techniques

a type of psychological test in which the response requirements are unclear so as to encourage test takers to create responses that describe the thoughts and emotions they are experiencing; three projective techniques are projective storytelling, projective drawing, and sentence completion

interrater agreement

the consistency with which scorers rate or make yes/no decisions

alternate forms

two forms of a test that are alike in every way except for the questions; used to overcome problems such as practice effects; also referred to as parallel forms

classical test theory

the theory that no instrument is perfectly reliable or consistent, so every test score contains some error: X = T + E. Factors that affect reliability include:
• test length
• homogeneity
• test-retest interval
• effective test administration
• careful scoring
• guessing or faking
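
A simulation sketch of X = T + E with made-up variances: each observed score is a true score plus independent random error, and the test-retest correlation recovers var(T) / (var(T) + var(E)):

```python
import numpy as np

rng = np.random.default_rng(0)
true_scores = rng.normal(50, 10, 10_000)     # T: variance = 100
x1 = true_scores + rng.normal(0, 5, 10_000)  # X = T + E, first testing
x2 = true_scores + rng.normal(0, 5, 10_000)  # X = T + E, second testing

# Expected reliability: 100 / (100 + 25) = 0.80
print(np.corrcoef(x1, x2)[0, 1])
```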

quantitative item analysis

a statistical analysis of the responses that test takers gave to individual test questions

correlation

a statistical procedure that provides an index of the strength and direction of the linear relationship between two variables

objective test format

a test format that has one response that is designated as "correct" or that provides evidence of a specific construct, such as multiple-choice questions

construct validity

an accumulation of evidence that a test is based on sound psychological theory and therefore measures what it is supposed to measure; evidence that a test relates to other tests and behaviors as predicted by a theory

construct

an attribute, trait, or characteristic that is abstracted from observable behaviors

content validity ratio

an index that describes how essential each test item is to measuring the attribute or construct that the item is supposed to measure

item nonresponse rate

how often an item or question was not answered

homogeneity of the population

how similar the people in a population are to one another

intrarater agreement

how well a scorer makes consistent judgments across all tests

experts

individuals who are knowledgeable about a topic or who will be affected by the outcome of something

categorical model of scoring

places test takers in a particular group or class (e.g., displays a pattern of responses that indicates a clinical diagnosis of a certain psychological disorder)

ipsative model of scoring

requires the test taker to choose among the constructs the test measures (e.g., forced choice)

empirically based tests

tests in which the decision to place an individual in a category is based solely on the quantitative relationship between the predictor and the criterion

interscorer agreement

the consistency with which scorers rate or make decisions

validity coefficient

the correlation coefficient obtained when test scores are correlated with a performance criterion; it represents the amount or strength of the evidence of validity for the test

scorer reliability

the degree of agreement between or among persons scoring a test or rating an individual; also known as interrater reliability

content validity

the extent to which the questions on a test are representative of the material that should be covered by the test

parallel forms

two forms of a test that are alike in every way except for the questions; used to overcome problems such as practice effects; also referred to as alternate forms

multiple choice

an objective test format that consists of a question or partial sentence, called a stem, followed by a number of responses, only one of which is correct

content validity ratio

CVR = (E - N/2) / (N/2), where E is the number of experts who rate the item essential and N is the total number of experts. As an example, say you assembled a team of 10 experts, 7 of whom rated the item essential:
CVR = (7 - 5) / 5 = 2/5 = 0.40
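
The same arithmetic as a one-line helper (the function name is ours, not the textbook's):

```python
def cvr(essential, total):
    """Content validity ratio: (E - N/2) / (N/2)."""
    return (essential - total / 2) / (total / 2)

print(cvr(7, 10))  # 0.40, matching the worked example above
```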

what a test manual provides, describes, and contains

• Provides: the rationale for constructing the test, the history of the development process, and the results of the validation studies
• Describes: the appropriate target audience and instructions for administering and scoring the test
• Contains: norms and information on interpreting individual scores

7 methods of testing reliability

1. Test-retest reliability: the same test administered to the same people at two points in time
2. Alternate or parallel forms: two forms of the test administered to the same people
3. Internal consistency (split-half): give the test in one administration, then split the test into two halves for scoring (sketched below)
4. Internal consistency (all split halves): give the test in one administration, then compare all possible split halves
5. Inter-rater reliability: give the test once, and have it scored (at the interval/ratio level) by two scorers or two methods
6. Inter-rater agreement: give a rating instrument and have it completed by two or more judges
7. Intra-rater agreement: calculate the consistency of scores for one scorer across multiple tests
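
A minimal sketch of the split-half approach from item 3, with a hypothetical 0/1 response matrix; the standard Spearman-Brown correction (2r / (1 + r)), not named on the card, is applied to estimate full-length reliability:

```python
import numpy as np

def split_half_reliability(responses):
    """Correlate odd-item vs. even-item half scores, then correct the
    correlation to full test length with Spearman-Brown: 2r / (1 + r)."""
    odd = responses[:, 0::2].sum(axis=1)
    even = responses[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)

# Hypothetical data: 5 test takers, 6 items scored 1 = correct, 0 = incorrect
data = np.array([
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
])
print(split_half_reliability(data))
```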

cut scores

decision points for dividing test scores into pass/fail groupings

validity

evidence that the interpretations that are being made from the scores on a test are appropriate for their intended purpose

confidence interval

a range of scores that the test user can feel confident includes the true score
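
A sketch of one common way to build such a range, using the standard error of measurement (SEM = SD * sqrt(1 - rxx)); the standard deviation, reliability, and observed score here are invented:

```python
import math

sd, r_xx, observed = 15, 0.91, 110  # hypothetical values
sem = sd * math.sqrt(1 - r_xx)      # standard error of measurement = 4.5

# 95% confidence interval around the observed score
low, high = observed - 1.96 * sem, observed + 1.96 * sem
print(f"95% CI: {low:.1f} to {high:.1f}")  # 101.2 to 118.8
```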

two methods for demonstrating evidence of validity

• The predictive method: used when it is important to show a relationship between test scores and a future behavior; appropriate for validating employment tests
• The concurrent method: used when test administration and criterion measurement happen at the same time; appropriate for validating clinical tests that diagnose behavioral, emotional, or mental disorders, as well as selection tests; often used for selection tests because employers do not want to hire applicants with low test scores or wait a long time to get criterion data

criterion-related validity (evidence of validity based on test-criteria relationships)

evidence that test scores correlate with or predict independent behaviors, attitudes, or events; the extent to which the scores on a test correlate with scores on a measure of performance or behavior

norms

a group of scores that indicate the average performance of a group and the distribution of scores above and below this average

validity - past vs present

Whether there is evidence supporting the interpretation of the resulting test scores for their intended purpose.
• In the past: three types of validity (content, criterion-related, and construct)
• Now: validity is viewed as a single concept, "evaluating the interpretation of test scores and accumulating evidence to provide a sound scientific basis for the proposed score interpretations"

test-retest method

a method for estimating test reliability in which a test developer gives the same test to the same group of test takers on two different occasions and correlates the scores from the first and second administrations

test plan

a plan for developing a new test that specifies the characteristics of the test, including a definition of the construct and the content to be measured (the testing universe), the format for the questions, and how the test will be administered and scored

order effects

changes in test scores resulting from the order in which tests or questions on tests were administered

item bias

differences in responses to test questions that are related to differences in culture, gender, or experiences of the test takers

item analysis

the process of evaluating the performance of each item on a test

survey objectives

the purpose of a survey, including a definition of what it will measure

random error

the unexplained difference between a test taker's true score and the obtained score; error that is nonsystematic and unpredictable, resulting from an unknown cause

factors that influence reliability

1. The test itself: poorly designed or trick questions; ambiguous questions; poorly written questions; a reading level higher than that of the target population
2. Test administration: not following administration instructions; disturbances during the test period; answering test takers' questions inappropriately; an extremely cold or hot room temperature
3. Test scoring: not scoring according to instructions; inaccurate scoring; errors in judgment; errors in calculating test scores
4. Test takers: fatigue; illness; exposure to test questions before the test; not providing truthful and honest answers

predictive evidence of validity (predictive method)

a method for establishing evidence of validity based on a test's relationships with other variables that shows a relationship between test scores obtained at one point in time and a criterion measured at a later point in time

homogeneous test

a test that measures only one trait or characteristic

content areas

the knowledge, skills, and/or attributes that a test assesses

criterion

the measure of performance that we expect to correlate with test scores

predictions using validity information

When a relationship can be established between test scores and a criterion, we can use test scores from other individuals to predict how well those individuals will perform on the criterion measure
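
A sketch of that prediction step as a simple least-squares regression on an invented validation sample:

```python
import numpy as np

# Hypothetical validation sample: test scores and later criterion performance
scores = np.array([50, 55, 60, 65, 70, 75, 80])
performance = np.array([2.0, 2.6, 2.9, 3.1, 3.6, 3.8, 4.2])

slope, intercept = np.polyfit(scores, performance, 1)  # least-squares line
print(intercept + slope * 68)  # predicted criterion score for a new test taker
```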

evidence of validity based on relationship with external criteria

When test scores correlate with independent behaviors, attitudes, or events

instructional objectives

a list of what individuals should be able to do as a result of taking a course of instruction

subjective criterion

a measurement that is based on judgment, such as supervisor or peer ratings

population

all members of the target audience

reliability coefficient

• Correlation provides an index of the strength and direction of the linear relationship between two variables
• Correlation coefficient: rxy
• Reliability coefficient: rxx
• Reliable tests will have positive reliability coefficients

5 sources of evidence of validity

• Evidence based on test content • Evidence based on response processes • Evidence based on internal structure • Evidence based on relations with other variables • Evidence based on the consequences of testing

internal consistency

the internal reliability of a measurement instrument; the extent to which each test item measures the same attribute that the test measures

reliability

A reliable test is one we can trust to measure each person in approximately the same way every time it is used

