Exam 3: Psychological Tests and Measurement


A written test is to paper and pencil as a practical test is to...

Demonstrating skills

Kevin did not pass his cardiology board exam (barely!). Based on his score, we could infer that he is not knowledgeable enough about cardiology to be board certified at this time. Which type of evidence refutes (i.e., goes against) that inference?

evidence based on reliability

Sources of validity evidence include all of the following EXCEPT...

evidence based on inter-rater agreement (the other options — evidence based on response processes, evidence based on test content, and evidence based on relations with other variables — are all genuine sources of validity evidence)

Kevin did not pass his cardiology board exam (barely!). Based on his score, we could infer that he is not knowledgeable enough about cardiology to be board certified at this time. Which of the following is criterion-related validity evidence to support that inference?

there is a strong, positive correlation between scores on the board exam and patient outcomes

Tests are discriminatory if...

there is evidence of testing bias

Suppose a university is selecting applicants based on their SAT score. If the validity coefficient is .9 ...

they are likely to have more true positives than false positives

Heterotrait-heteromethod correlations

different traits measured with different methods; should be low (below .2), e.g., a survey measure of environmental activism correlated with how many users someone has for online video games

Heterotrait-monomethod correlations

different traits measured with the same method; should be low (below .3), e.g., the correlation between a dog-lover scale and conscientiousness measured using the same type of scale

how do we make a test fair

ensure content validity and use within-group norming

Kevin did not pass his cardiology board exam (barely!). Based on his score, we could infer that he is not knowledgeable enough about cardiology to be board certified at this time. Which of the following is content validity evidence to support that inference?

the test does not measure anything irrelevant

In terms of construct validity, what questions are we asking?

-are test scores strongly and positively associated with scores on similar constructs? -are they associated with other measures of the same construct? -are the scores unrelated to scores on dissimilar constructs?

Why do we hate cut scores, and what can we do about it?

-because scores tend to fall right on the borderline of any given cut point -one fix is to round up

Ethical principles of psychological assessment (APA)

1. Bases of assessment
2. Use of assessments
3. Informed consent
4. Release of test data
5. Test construction
6. Interpretation of assessment results
7. Assessment by unqualified persons
8. Obsolete tests/outdated results
9. Test scoring/interpretation services
10. Explaining assessment results
11. Maintaining test security

When developing a test, what are the recommended steps to improve the content validity?

1) define the testing universe 2) develop test specifications 3) establish a test format 4) construct test questions

test specifications

A documented plan containing details about a test's content

Objective criterion

A measurement that is observable and measurable, such as the number of accidents on the job.

Common problem with criterion related validity

Restriction of range

Which of the following is NOT an example of validity evidence?

Scores on this quiz are consistent with scores on a parallel form of this quiz

Concurrent evidence of validity

A method for establishing evidence of validity based on a test's relationships with other variables, in which test administration and criterion measurement happen at roughly the same time. It is not prediction; it provides information about the present (the status quo).

Discrimination index

A statistic that compares the performance of those who made very high test scores with the performance of those who made very low test scores on each item; the ideal value is .30 or higher.
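
As a hedged sketch of that comparison, here is a small Python function (the function name, the 27% grouping fraction, and the data are all illustrative, not from the course):

```python
import numpy as np

def discrimination_index(item_scores, total_scores, frac=0.27):
    """Difference in proportion correct between the top and bottom scoring groups on one item."""
    item_scores = np.asarray(item_scores)
    total_scores = np.asarray(total_scores)
    n_group = max(1, int(len(total_scores) * frac))
    order = np.argsort(total_scores)            # indices from lowest to highest total score
    low_group = item_scores[order[:n_group]]    # bottom scorers
    high_group = item_scores[order[-n_group:]]  # top scorers
    return high_group.mean() - low_group.mean()

# Hypothetical data: 10 test takers, one item scored 0/1
item = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
totals = [88, 42, 75, 90, 35, 80, 50, 95, 70, 40]
print(discrimination_index(item, totals))  # values of .30 or higher are generally considered good
```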

Quantitative item analysis

A statistical analysis of the responses that test takers gave to individual test questions.

Based on the Ethical Principles of Psychologists and Code of Conduct outlined by the American Psychological Association (APA), which of the following is NOT an ethical principle involving assessments?

Psychologists should obtain informed consent for routine educational assessment

A large, positive correlation between individuals' scores on the ACT and SAT would be...

Black individuals who score lower than White individuals on the SAT are expected to perform the same in college

examining the content of a test to evaluate the validity of an inference or decision is...

Content validity

How would you measure test-retest reliability?

Correlate individuals' scores on two occasions of a test
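
A minimal sketch of that calculation, assuming made-up scores for the same five people on two occasions:

```python
import numpy as np

# Hypothetical scores for the same five people at time 1 and time 2
time1 = np.array([12, 18, 25, 30, 22])
time2 = np.array([14, 17, 27, 29, 20])

# The test-retest reliability estimate is the Pearson correlation between occasions
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 2))
```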

face validity

Do the questions seem relevant and important to the test taker? Face validity alone does not demonstrate evidence of validity for a test.

what would an expert review look like

Experts review and rate how relevant each question/item is to the attribute the test measures

What would expert content categorization look like

Experts look at each question and try to match it with the construct it is intended to measure. (Note: about 40% on most-missed questions from the last exam, 10% for item analysis, and 40% for validity.)

The textbook tells the story of Michael, who was administered an intelligence test by his school. The school determined he was "retarded," and they moved him into a special education class. When his parents asked about his intelligence score, the principal said it was better to leave such matters to school authorities. Which ethical principle of assessments was violated?

Explaining Assessment Results

To have bias is to

have poor evidence of content validity

Internal structure is to

Homogeneous or heterogeneous

What do integrity tests measure

Individuals' attitudes toward and experiences with honesty, dependability, trustworthiness, reliability, and prosocial behavior

Which of the following questions is relevant to content validity?

Is the test content representative? Do the test questions measure anything irrelevant? Does the test fail to assess any important concepts?

What two questions should we ask in terms of validity?

Is there evidence supporting the interpretation of the test scores? Is there evidence supporting the proposed use of test scores?

Assume a distribution of scores is negatively skewed. What are the mean, median, and mode in order of smallest to largest?

Mean, Median, Mode
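
A quick, illustrative check with a small made-up left-skewed sample (not course data):

```python
from statistics import mean, median, mode

data = [1, 3, 4, 5, 5, 5]  # negatively skewed: a long tail toward low values
print(mean(data), median(data), mode(data))  # ~3.83, 4.5, 5 -> mean < median < mode
```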

Interitem correlations tend to be relatively

Small in size, often in the .15 to .20 range

item difficulty

The percentage of test takers who answer a question correctly: divide the number of people who answer the item correctly by the total number who respond to it. Values from 0 to .2 are too hard, values from .9 to 1 are too easy, and a value around .5 gives the most variation.
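
A minimal sketch of that division, using hypothetical 0/1 responses:

```python
def item_difficulty(responses):
    """Proportion of respondents who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # made-up answers from 10 test takers
print(item_difficulty(responses))  # 0.7; 0-.2 is too hard, .9-1 is too easy, around .5 gives the most variation
```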

what is a construct?

an attribute, trait, or characteristic that's not directly observable but can be inferred by looking at observable behaviors (e.g., aggression or knowledge)

content validity ratio

an index that describes how essential each test item is to measuring the attribute or construct the item is supposed to measure; it ranges from -1.00 to +1.00, and computed values are compared against minimum (critical) values
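
As a hedged illustration, assuming Lawshe's commonly used formula CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating the item essential and N is the panel size (the data are hypothetical):

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2)."""
    return (n_essential - n_experts / 2) / (n_experts / 2)

# Hypothetical panel: 8 of 10 experts rate the item essential
print(content_validity_ratio(8, 10))  # 0.6 on the -1.00 to +1.00 scale; compare against the minimum (critical) value
```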

What questions are we asking with relation to a criterion (criterion-related validity)?

Are scores related to the outcomes we expect them to relate to (i.e., meaningful outcomes)?

What is one way to examine content validity after a test is developed?

ask experts to match every test question to the content area that is covered

what can one do to mitigate problems associated with score differences by race?

change the decisions we make based on test scores: take scores into consideration but do not make decisions contingent on them alone (consider other factors)

Suppose you are developing a selection test for a company (i.e., a test that will help the company make hiring decisions). Which of the following could you do to help define the testing universe?

conduct a job analysis

A large, positive correlation between individuals' scores on the ACT and SAT would be...

convergent

validity coefficient

correlation coefficient between a test score (predictor) and a performance measure (criterion). It is desirable to get the widest range of test scores

multitrait-multimethod matrix

A correlation matrix that displays data on convergent and discriminant evidence; trait = construct, method = how the construct is measured
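
A small sketch of how such a matrix could be assembled, assuming hypothetical scores for two traits each measured by two methods (all labels and numbers are illustrative):

```python
import numpy as np

# Rows = people, columns = trait/method combinations (hypothetical data)
labels = ["TraitA_Method1", "TraitA_Method2", "TraitB_Method1", "TraitB_Method2"]
scores = np.array([
    [1, 2, 2, 2],
    [2, 2, 5, 4],
    [3, 3, 3, 3],
    [4, 5, 1, 1],
    [5, 5, 4, 5],
])

# Correlating the columns gives the multitrait-multimethod matrix
mtmm = np.corrcoef(scores, rowvar=False)
for label, row in zip(labels, mtmm):
    print(label, np.round(row, 2))
# Monotrait-heteromethod cells (e.g., TraitA_Method1 vs. TraitA_Method2) should be high
# (convergent evidence), while heterotrait cells should be comparatively low (discriminant evidence).
```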

SPSS can be used to do all of the following EXCEPT...

help write good scale items (SPSS can display the distribution of scores on a scale, provide Cronbach's alpha of a scale, and help inform decisions about which scale items to keep or remove)

For a good survey item, the item-total correlation is

high if the survey is homogeneous

Restriction of range

If there is not a full range of scores on one of the variables in the association, the correlation can appear smaller than it really is. For example, a validity coefficient calculated only from the restricted group (e.g., the applicants who were hired) is likely to be lower than if all candidates had been hired and included in the study.
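
A small simulation can illustrate this, assuming made-up predictor and criterion data in which only high scorers are "hired" (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
test = rng.normal(size=500)                      # selection test scores (predictor)
performance = 0.6 * test + rng.normal(size=500)  # criterion related to the test

# Validity coefficient in the full applicant pool
r_full = np.corrcoef(test, performance)[0, 1]

# Validity coefficient among only the top scorers (restricted range)
hired = test > 1.0
r_restricted = np.corrcoef(test[hired], performance[hired])[0, 1]

print(round(r_full, 2), round(r_restricted, 2))  # the restricted correlation typically comes out noticeably smaller
```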

If a test question has a content validity ratio of .9 ...

it is a good test question; experts agree that it is important and relevant

Project Stage 4 and the final exam are both intended to assess students' overall knowledge of the course material. The correlation between scores on Project Stage 4 and the final exam would be a...

a monotrait-heteromethod correlation

Cut scores are typically...

problematic because measurement is not perfectly reliable

Monotrait-monomethod correlations

same trait measured with the same method (e.g., test-retest reliability: have people take your scale twice); should be above .7

convergent validity is to

scores being strongly, positively associated with scores on similar constructs

Which of the following is evidence related to criterion-related validity?

scores on the SAT are moderately, positively associated with college GPA

Monotrait-heteromethod correlations

the same trait measured with different methods (e.g., a dog-lover scale and how many dogs the person owns); should be above .6

Which of the following is a subjective criterion?

supervisor ratings of performance

What steps could we take to examine discriminant validity?

take two unrelated topics, then correlate scores on both scales (should be below .3)

item-total correlation

the correlation between scores on an individual item and the total score on all items of a measure, calculated with the Pearson product-moment correlation formula; it tends to be higher on homogeneous tests. Unlike interitem correlations, which are correlations between pairs of individual items. Typical range: .2 to .4
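
A minimal sketch of that computation, using hypothetical item responses (a "corrected" version would exclude each item from its own total):

```python
import numpy as np

# Hypothetical responses: rows = respondents, columns = items
items = np.array([
    [4, 5, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 4, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
])
total = items.sum(axis=1)

# Pearson correlation of each item with the total score
for i in range(items.shape[1]):
    r = np.corrcoef(items[:, i], total)[0, 1]
    print(f"item {i + 1}: {r:.2f}")
```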

Which of the following is a common problem when examining predictive validity?

the full range of data is unavailable

Validity is about...

the quality of inferences individuals draw from a test score

If Cronbach's alpha of a scale is .60...

the reliability of the scale is inadequate (i.e., not acceptable)
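
As a hedged illustration of where that number comes from, the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score) can be computed by hand from made-up item data:

```python
import numpy as np

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: rows = people, columns = scale items
data = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 2, 3, 3],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(data), 2))  # values around .70 or higher are usually considered acceptable
```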

Scenario: the same assessment was used in two different situations; participants were not allowed to consent to take the test or told why they were taking it; and the items of the test did not correlate with one another, giving an overall low correlation (validity was miscalculated). Which ethical principles were violated?

use of assessments; informed consent; test construction

Scenario: job applicants were assessed on intellect based on a job analysis; no feedback was provided; applicants who scored high were misled; and management assumed that those with lower intellect would be less likely to leave the job. Which ethical principles were violated?

use of assessments; explaining assessment results

