Psyc 350 Final
a pilot test is
A scientific evaluation of the test's performance, followed by revisions to determine the final form that the test will take
Norms ->_______-> _______ ____________
norms -> 60 -> normal and cut scores
Test fairness is
a social concept, not a statistical one
Types of test formats
objective and subjective
survey questions can be:
open ended or closed ended, fill in the blank, column format, row format, implied no choice, single-item choice, multiple choice, ranking, rating, likert-type format, semantic differential.
After writing the initial test questions, the test developer conducts a
pilot test
What are the 2 types of sampling?
probability and nonprobability sampling
cumulative model is
produces interval data
categorical model is
produces nominal data
Examples of subjective formats
projective tests, essay (open-ended), interview
Survey questions are
purposeful and straightforward, unambiguous, correct syntax (complete sentences with an orderly arrangement of words), Appropriate rating scale, appropriate categorical alternatives, no double barreled questions (a question that is actually asking two or more questions in one), and it has to have a comfortable reading level (prefer low readability level).
Piloting the test provides an opportunity to gather
quantitative and qualitative data about the test.
What things do you consider for a target audience?
reading level; disabilities; cognitive, motor, or sensory impairments; and whether they are motivated to fake their answers
complex test formats-performance assessments
require test takers to directly demonstrate their skills and abilities to perform a group of complex behaviors and tasks.
Random response
responding to items in a random fashion by marking answers without reading or considering them.
What is a pilot test?
scientific investigation of evidence for reliability and validity.
For essay and interview questions you
score them anonymously and should have multiple scorers.
criterion approach
score to qualify as passing or failing
empirically based tests
sort individuals into 2 or more categories based on their scores on the criterion measure.
Composing the test items. Test items can be
statements, pictures, incomplete sentences.
Way to gather evidence of validity of a survey:
survey assesses the concept that a researcher attempts to measure.
Example of a survey research firm
survey monkey and qualtrics
stimulus is a
test question
Ways to gather evidence of reliability/precision of a survey:
test-retest, alternate forms, split-half reliability. split-half is more practical and most common.
Test plan for educational settings
the curricula (assigned readings, handouts, lectures) provide the basis for developing a test plan.
Why do we use internal consistency for pilot testing?
the developer consults the interitem correlation matrix to see which items should be dropped or revised to increase the test's overall internal consistency. Reliability/precision and Cronbach's alpha
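Worked example (a minimal sketch, not from the course material): Cronbach's alpha is commonly computed as k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the ratings below are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item; position i in every list
    belongs to the same test taker. Sample variances are used throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per test taker: sum across items at each position.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical pilot data: 3 items answered by 5 test takers.
pilot_items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(pilot_items)
print(round(alpha, 3))
```

Recomputing alpha with one item left out is one way to see whether dropping that item would raise overall internal consistency.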
Defining the target audience
the developer makes a list of the characteristics of the persons who will take the test.
homogeneity of the population
the more similar the members of the population, the smaller the sample that is necessary. The more dissimilar the members of the population, the larger the sample that is necessary to have this variation represented in the sample.
The front page of a survey must include
title, seal, logo, appeal, and instructions
What is pilot testing used for
to determine how a new test performs.
What are you looking for in a pilot test?
to see if test scores are accurate/meaningful, if the test questions are easy or difficult, if high scores reflect a high level of the quality being measured, if the test is biased, if it can be used for any kind of group, and if the instructions are clear/easy to follow and the length of time is appropriate.
On a survey there has to be
transitions, bolded, white space, print->3 pages or double sided or spiral and booklet(for long surveys).
For MC and T/F items, what should you do and shouldn't do?
try not to use the word "NOT". Make sure all responses are similar in detail and length. Have 1 correct answer and try using "sometimes" or "often". Try not to use inclusive distractors like ALL OF THE ABOVE or NONE OF THE ABOVE.
What is nonprobability sampling?
type of sampling in which not everyone has an equal chance of being selected from the population. Often used because it is more convenient and less expensive than probability sampling.
What is probability sampling?
type of sampling that uses statistics to ensure that a sample is representative of a population. Simple random sampling, stratified random sampling, and cluster sampling are examples of probability sampling methods.
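Worked example (a minimal sketch, not from the course material): the three probability sampling methods named above, written with Python's standard `random` module. The function names, the student population, and the dorm clusters are all hypothetical.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member has the same chance of being selected."""
    return rng.sample(population, n)


def stratified_random_sample(population, stratum_of, n_per_stratum, rng):
    """Split the population into strata, then sample randomly within each."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for key in sorted(strata):
        sample.extend(rng.sample(strata[key], n_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


# Hypothetical population: 100 students tagged by class year.
population = [(i, "freshman" if i < 50 else "senior") for i in range(100)]
clusters = {
    "dorm_a": population[:30],
    "dorm_b": population[30:60],
    "dorm_c": population[60:],
}

rng = random.Random(350)  # fixed seed so the draw is repeatable
srs = simple_random_sample(population, 10, rng)
strat = stratified_random_sample(population, lambda m: m[1], 5, rng)
clus = cluster_sample(clusters, 2, rng)
```

The stratified sample always contains 5 freshmen and 5 seniors, while the simple random sample may not split evenly.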
D=+?
upper group answered correctly
What happens after piloting the test?
validating the test. Now you collect evidence for validity. Test + another test measuring same construct + criteria
After calculating D value, the test developers look for items that have high positive numbers. Negative numbers indicate? D=-?
more of the people who scored low on the test responded to this item correctly than those who scored high
Test plan for organizational tests
will be based on job analysis that defines the knowledge, skills, abilities, and other characteristics required to perform a job successfully.
Is the scientific method that's used for survey designs the same for psychological tests?
yes
Example of Free Choice:
if the psychology department started a psychology club, would you attend the meetings? ____ Yes ___Probably ____ Uncertain
Surveys
instruments used to collect important information from individuals.
What is particularly important for subjective tests?
interrater reliability (two raters)
What is the main purpose for pretesting surveys
looking for nonsampling (associated with the design and administration) measurement errors
Self Administered surveys include
mail and individually administered
How many research firms were there in 2015?
more than 2000
normative approach
higher score will receive the job
You want interitem correlations to be
highly positively correlated
Objective items (stimulus) True/False
1 out of 2 (50%)
Presenting the findings include:
1. outline a report 2. order content 3. using slides and handouts.
response bias-acquiescence
50% negative 50% positive. for instance, someone who labels each statement on a true/false test as true would be demonstrating a response set of acquiescence.
interitem correlation matrix
A matrix that displays the correlation of each item with every other item
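Worked example (a minimal sketch, not from the course material): building an interitem correlation matrix with Pearson correlations. The function names and the ratings are made up; an item on which everyone gives the same answer has zero variance and would need to be handled separately.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # fails if either item has zero variance


def interitem_matrix(items):
    """Correlate each item with every other item (and with itself)."""
    k = len(items)
    return [[pearson(items[i], items[j]) for j in range(k)] for i in range(k)]


# Hypothetical ratings: one list per item, one position per test taker.
items = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 4, 3, 2, 1],
]
matrix = interitem_matrix(items)
for row in matrix:
    print([round(r, 2) for r in row])
```

In this made-up data the third item correlates negatively with the other two, which is the kind of item a developer would look at dropping or revising.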
When would forced item be used?
Appropriate for when the potential test takers are likely to answer dishonestly. It has low face validity.
example of Likert-type format:
Being able to approach your professors is a major advantage of attending a small college. Please indicate the extent to which you agree or disagree with the following statements about your professors.
Survey Research Method
Both a science and an art. It's science because it's a process you have to follow.
Objective items (stimulus) Forced Item-
Do you go more by facts or principle? Do you prefer planned events or unplanned events?
Purpose of the test
Does it have a normative approach or a criterion approach
example of column format:
For each of the courses listed below, identify how many hours you study per week and how many pages you read for the course per week. Course Hours per week you study Read Psychology _________ ______
How do you set up the pilot test?
For example: to diagnose emotional disabilities in adolescents, it has to be in a school setting. Part of the sample should be adolescents who have been determined to have emotional disabilities and the others should be adolescents who have been determined not to have emotional disabilities. The sample also has to include males AND females from various economic and ethnic backgrounds.
Quantitative data of a pilot test
How you decide if an item is difficult or not. You divide the # of persons who answered correctly by the # of persons who responded to the question (P). .4 to .6 is the target range; 0 to .2 is too difficult; .9 to 1 is too easy.
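Worked example (a minimal sketch, not from the course material): computing P for one item and labeling it with the bands from the card above. The function names, the label wording, and the responses are hypothetical; values that fall between the stated bands are simply reported as outside the target range.

```python
def item_difficulty(responses):
    """Return P: number correct divided by number who responded.

    `responses` holds 1 (correct), 0 (incorrect), or None (no response).
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no one responded to this item")
    return sum(answered) / len(answered)


def classify_difficulty(p):
    """Label P using the bands from the card (labels are hypothetical)."""
    if p <= 0.2:
        return "too difficult"
    if p >= 0.9:
        return "too easy"
    if 0.4 <= p <= 0.6:
        return "target range"
    return "outside target range"


# Hypothetical item: 10 test takers, one of whom skipped the question.
item = [1, 0, 1, 1, 0, None, 1, 0, 0, 1]
p = item_difficulty(item)  # 5 correct out of 9 who responded
print(round(p, 2), classify_difficulty(p))
```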
Subjective items Sentence completion-
I feel happiest when I am ______________.
Preparing for survey:
Identify the objectives (lit review will help you or contact experts), define your objectives operationally (operational definitions), construct a plan (estimate the cost, administer, timeline for completing each phase of the survey).
example of an open ended question:
Last year you were a member of student government. please comment on your experience.
Examples of objective formats
Multiple choice, true/false, fill in the blank, matching
New theory of why we develop a new test.
A new theory offers a fresh definition of constructs.
example of semantic differential:
Please circle the number representing the demeanor of your professor. Happy 1 2 3 4 5 6 7 Grumpy
example of row format:
Please indicate how many hours per week you study for each of the following courses. Course Hours Psychology _______ Biology _______
What are the 5 steps for survey research?
Prepare, construct, administer, analyze, communicate findings.
Final Phase of survey development:
Presenting the findings
Which is the most popular? Survey monkey or qualtrics?
Qualtrics
qualitative analysis
Questionnaires, group discussions, panel of experts to discuss items.
Types of surveys
Self Administered and personal interviews
What are people who deal with scientific method of survey design called?
Survey researchers
Conducting the pilot test
Test + Criterion + Questionnaire, Interviews, Group Discussions, and expert panel.
What type of instructions should be given for the administrators?
Whether the test is given in a group or individually; requirements for the administration location (privacy, quiet, and comfortable chairs, tables, or desks); No. 2 pencils; a computer with a DVD; time limits; a script for the administrator to read, including answers to questions that test takers are likely to ask; credentials or training required for the test administrator.
Who do you have to write administration instructions for?
Test taker, administrator, and test scorer.
faking
The inclination of some test takers to try to answer items in a way that will cause a desired outcome or diagnosis
Example of Single-Item Choice:
There are two methods that can be used to evaluate your understanding of course material. One method is to give you an in class exam, and the other method is to have you write a term paper. The question is, which method do you prefer? ________ In-class exam ________Term Paper
What do survey researchers do?
They design, conduct, and analyze surveys
Why do we need to know about surveys?
To know psychology, we need to know about surveys. Psychological tests are used for individual outcomes which are overall scores. Surveys are also used for group outcomes which are the question level.
Example of Implied No Choice:
Why didn't you pass your psychology exam? _____ I did not study. _____I did not feel well. ____ I don't know.
General rule of thumb for developers of tests.
Write twice as many items as you expect to use in final test.
a sample means
a representative subset of the population
Sampling error
a statistic that reflects how much error can be attributed to the lack of representation of the target population by the sample of respondents chosen. Fewer people will lead to increased error.
population means
all members of the target audience
Why develop a new test?
an achievement test may be needed for a special population of individuals with a disability that affects how they perceive or answer the test questions.
Subjective items Essay Questions-
analysis, synthesis, and evaluation. Many students prefer essay questions because this format allows them to focus on demonstrating what they have learned rather than limiting them to answering specific questions.
complex test formats-portfolio
architects
Response sets
are patterns of responding that result in false or misleading information.
What makes a company a survey research firm?
company's server is accessed by user
Discrimination index
compares the performance of those who obtained very high test scores (the upper group [U]) with the performance of those who obtained very low test scores (the lower group [L]) on each item. U = (# of people in the upper group who were correct / total # in the upper group) x 100. L = (# of people in the lower group who were correct / total # in the lower group) x 100.
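Worked example (a minimal sketch, not from the course material): computing U and L as percentages for one item. It assumes D is U minus L, which matches the cards on positive and negative D values. The function names and group data are hypothetical.

```python
def percent_correct(group):
    """Percentage of a group (list of 1/0 item scores) answering correctly."""
    if not group:
        raise ValueError("group is empty")
    return 100 * sum(group) / len(group)


def discrimination_index(upper, lower):
    """D = U - L (assumed formula), with U and L as percentages."""
    return percent_correct(upper) - percent_correct(lower)


# Hypothetical scores on one item: 1 = correct, 0 = incorrect.
upper_group = [1, 1, 1, 0, 1]  # 4 of 5 correct, so U = 80
lower_group = [0, 1, 0, 0, 0]  # 1 of 5 correct, so L = 20
d = discrimination_index(upper_group, lower_group)
print(d)
```

A positive D means the upper group did better on the item; swapping the two groups in this example would give a negative D.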
complex test formats-simulation
consider the job of a bomb disposal technician
Objective items (stimulus) Multiple choice-
consists of a question or partial sentence, called a stem, followed by a number of responses. The incorrect responses are called distractors. The chance of guessing the correct answer is 1 out of 4.
When choosing the final items, test developers do what?
construct a matrix. Lists each item, followed by its performance in terms of internal consistency, validity, difficulty, discrimination, and bias.
Administering and scoring the tests first step:
deciding if it's cumulative or categorical model
What is the first step in test development?
define the testing universe, the target audience, and the purpose of the test. Define construct operationally (that means in terms of behavior)
example of a closed ended question:
did you attend last week's biology study group? yes___ no___
To decode subjective formats, is it easy or hard?
easy
To score objective formats, is it easy or hard?
easy
Subjective items Projective Techniques-
evaluating children's play
Test plan for clinical tests
example: developmental and psychological problems. It's based on carefully researched constructs.
personal interview surveys include:
face to face and telephone interviews
Response bias-social desirability
favorable impression
Subjective items Interview questions (organizational settings)-
general in scope. Interviewer decides what is a good or poor answer.
To decode objective formats, is it easy or hard?
hard
To score subjective formats, is it easy or hard?
hard
What are the 5 steps for regular research?
have a plan, design the study, conduct your study, analyze, communicate findings.