PSY 451- Midterm

Time Sampling

1. Error associated with administering a test at two or more different times Ex: Intelligence test

Settings of the Testing Process- Educational

Achievement test, diagnosis, diagnostic test

Settings of the Testing Process- Geriatric

May look at quality of life, dementia, pseudodementia (severe depression)

Range

Highest score minus lowest score (note: Q3-Q1 is the interquartile range, not the range)

Psychometric Soundness

Technical quality

Ethical Guidelines

Body of principles of right, proper, or good conduct Ex: APA Ethics Code

Computer Test Pro

o Test administrators have greater access to potential assessees due to the Internet o Scoring and interpretation of the data tend to be quicker than with paper-and-pencil tests o Costs are typically lower than paper and pencil o Can reach more populations (isolated or those with disabilities)

High Construct validity

.3-.4

Code of Fair Testing Principles in Education

1. Developing/selecting tests 2. Interpreting scores 3. Striving for fairness 4. Informing test takers

How Can Ethical Issues Be Resolved

1. Describe the problem situation 2. Define the potential legal and ethical issues involved; review guidelines and consult others as needed 3. Evaluate the rights, responsibilities and welfare of all affected parties 4. Consider alternative actions and the consequences of each action 5. Make the decision and take responsibility for it; monitor outcomes

Internal Consistency

1. Error associated with different sets of items within one test a. Ex: How well do the items measure the concept being measured?

Three Types of Rational Theoretical Approach

1. Intuitively developed by the test author 2. Content validation method using the judgement of experts in developing and selecting items Ex: including others who identify with other groups for validity 3. Theory based, according to a recognized theory of personality or social-emotional functioning

Some Assumptions About Psychological Testing and Assessment

1. Psychological traits and states exist (there are internal things we don't typically observe but that do exist) 2. Psychological traits and states can be quantified and measured (ex: how we define engagement) 3. Test-related behavior predicts non-test-related behavior (relates to validity~ Ex: wanting to understand how a student's engagement can tell us about what is happening in their life~ potential for achievement) 4. Tests and other measurement techniques have strengths and weaknesses (often, we need to remember legal and ethical guidelines) 5. Various sources of error are part of the assessment process (reliability~ how consistently are we measuring) 6. Testing and assessment can be conducted in a fair, unbiased manner (equity lens) 7. Testing and assessment benefit society

Process of Assessment

1. Referral for assessment from a source (teacher, school psychologist, counselor, judge, clinician, corporate human resources specialist) 2. Typically one or more referral questions are posed Ex: Can this child function in a general education environment? Is this defendant competent to stand trial? 3. The assessor may meet with the assessee or others before the final assessment to clarify reasons for referral 4. Assessment 5. The assessor writes a report about findings, referring to the referral questions 6. More feedback sessions with the assessee and/or third parties may be scheduled

Process of Developing a Test

1. Test conceptualization 2. Test construction 3. Test tryout 4. Item analysis 5. Test revision

Testing People With Disabilities- challenges

1. Transforming the test into a form that can be taken by the test taker 2. Transforming the responses of the test taker so they are scorable 3. Meaningfully interpreting test data

Test Conceptualization- Things to Consider

1. What is the test designed to measure? a. How is the construct defined? 2. Is there a need for the test? a. Self-report for students 3. Who will take this test? 4. How will the test be administered? 5. What is the ideal format of this test? 6. Is there any potential harm as the result of an administration of this test? a. Using anonymous names 7. How will meaning be attributed to scores on this test?

Three Steps of Split Half Reliability

1. Divide the test into two equivalent halves 2. Calculate Pearson's r between the two halves 3. Adjust the half-test reliability using the Spearman-Brown formula
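The three steps above can be sketched in Python; this is a minimal illustration in which the odd-item and even-item half scores are made up, not from any real test:

```python
# Step 2: Pearson's r between the two half-test score lists
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Step 3: Spearman-Brown adjustment of the half-test correlation:
# r_full = 2r / (1 + r)
def spearman_brown(r_half):
    return 2 * r_half / (1 + r_half)

# Hypothetical odd-item and even-item half scores for six test takers
odd = [10, 12, 9, 14, 11, 8]
even = [11, 13, 9, 13, 12, 7]
r_half = pearson_r(odd, even)
print(round(spearman_brown(r_half), 3))  # → 0.954
```

Note that the adjusted value is higher than the raw half-test correlation (about .912 here), since the formula estimates reliability for the full-length test.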

Code of Professional Ethics

A body of guidelines that sets forth standards of conduct for the members of a profession

Ethics

A body of principles of right, proper or good conduct; contrast with laws

Psychological Test

A device or procedure designed to measure variables related to psychology (such as intelligence, personality, aptitude, interest, attitudes or values)

Ecological Validity

A judgement regarding how well a test measures what it purports to measure at the time and place that the variable being measured is actually emitted

Test

A measuring device or procedure

Interview

A method of gathering info through direct communication involving reciprocal exchange. Interviews differ in length, purpose and nature.

Cumulative Scoring

A method of scoring whereby points or scores accumulated on individual items or subtests are tallied; the higher the total sum, the higher the individual is presumed to be on the ability, trait or other characteristic being measured; contrast with class scoring and ipsative scoring

Assent

A participant is willing to do what we want them to do

Generalizability Theory

A person's test scores vary from testing to testing because of variables in the testing situation

Group Think

A result of the varied forces that drive decision-makers to reach a consensus

Quota System

A selection procedure whereby a fixed number or percentage of applicants with certain characteristics or from certain backgrounds are selected, regardless of other factors such as documented ability

Kuder-Richardson Formula 20

A series of equations designed to estimate the inter-item consistency of tests
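For dichotomously scored (right/wrong) items, KR-20 can be sketched as below; the `data` matrix of 0/1 item scores is invented for illustration:

```python
# KR-20 = (k/(k-1)) * (1 - sum(p*q) / total-score variance),
# where p is the proportion passing each item and q = 1 - p.
def kr20(item_scores):
    # item_scores: one row per test taker, one column per item (0/1)
    n = len(item_scores)
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)

# Hypothetical responses: six test takers, five dichotomous items
data = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(round(kr20(data), 3))  # → 0.797
```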

Criterion Contamination

A state in which a criterion measure is itself based, in whole or in part, on a predictor measure. When contamination occurs, results cannot be taken seriously.

Item Response Theory

A system of assumptions about measurement, including the extent to which each test item measures the trait being assessed

Classical Test Theory (True Score Theory)

A system of assumptions about measurement that includes the notion that a test score is composed of a relatively stable component that actually is what the test or individual item is designed to measure, as well as a component that is error

Portfolio

A work sample; referred to as a portfolio assessment when used as a tool in an evaluative or diagnostic process. Has been used for instructor hiring: the portfolio may contain documents such as lesson plans, published writings and visual aids.

APA Ethics Code- General Principles

A. Beneficence and nonmaleficence B. Fidelity and responsibility C. Integrity D. Justice E. Respect for people's rights and dignity

Item Branching

Ability of the computer to tailor the content and order of presentation of test items on the basis of responses to previous items

Measurement

Act of assigning numbers or symbols to characteristics of things

Role Play

Acting an improvised or partially improvised part in a simulated situation

Test Revision

Action taken to modify a test's content or format for the purpose of improving the test's effectiveness as a tool of measurement

Accommodation

Adaptation of a test, procedure or situation, or the substitution of one test for another, to make the assessment more suitable for an assessee with exceptional needs

Measurement Error

All factors associated with the process of measuring some variable, other than the variable being measured.

Frequency Distribution

All scores listed alongside the number of times each score occurred

Spearman Brown Formula

Allows a test developer or user to estimate internal consistency reliability from the correlation of two halves of a test. Can be used to estimate the effect of shortening or lengthening a test on its reliability. Can be used to determine the number of items needed to obtain a desired level of reliability.

Source of Measurement Error- Item Sampling

Alternate, equivalent or parallel forms reliability

Parallel Forms Reliability

An estimate of the extent to which item sampling and other errors have affected test scores on two versions of the same test; for parallel forms, the means and variances of observed test scores are equal

Alternate Forms Reliability

An estimate of the extent to which these different forms of the same test have been affected by item sampling or other errors

Reliability Coefficient

An index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance

Overt behavior

An observable action or the product of an observable action including test or assessment related responses Ex: shy, very shy, not shy

Assessment Center

An organizationally standardized procedure for evaluation involving multiple assessment techniques. The name acknowledges that tests are only one type of tool used by professional assessors.

Test Development

An umbrella term for all that goes into the process of creating a test

Trait

Any distinguishable, relatively enduring way in which one individual varies from another (trait is used very broadly and ambiguously when considering the concept)

Panel Interview (Board Interview) Pro

Any idiosyncratic biases of a lone interviewer are minimized

Developmental Norms

Any trait, ability, skill or other characteristic that is presumed to develop, deteriorate or otherwise be affected by chronological age, school grade or stage of life

Construct Validity

Appropriateness of making inferences about the construct you are trying to measure based on the test scores from the test you developed

Assessment Process

Assessment is usually individualized. In contrast to testing, assessment more typically focuses on how an individual processes rather than simply the results of that processing

Assessment Skill of Evaluator

Assessment typically requires an educated selection of tools of evaluation, skill in evaluation, and thoughtful organization and integrating of data

Collaborative Psychological Assessment

Assessor and assessee may work as "partners" from initial contact through final feedback

Naturalistic Observation

Behavioral observation that takes place in a naturally occurring setting for the purpose of evaluation and info gathering

Laws

Body of rules that must be obeyed for the good of society

Variance

Can only be used when distributions are approximately normal

Score

Code or summary statement, usually but not necessarily numerical, that reflects an evaluation of performance on tests, tasks, interviews or other samples of behavior

Assessment Approaches

Collaborative psychological assessment Therapeutic psychological assessment Dynamic assessment

Error

Collective influence of all the factors on a test score or measurement beyond those specifically measured by the test or measurement

Ipsative Scoring

Comparing a test taker's score on one scale within a test to another scale within the same test

CAT

Computer adaptive testing; the computer has the ability to tailor the test to the test taker's ability or test-taking patterns

Computer Test

Computer can serve as test administrator and as a highly effective test scorer

Test Conceptualization

Conceiving an idea for a test

Threats to Fairness

Construct-irrelevant variance Test content Test context Test response Opportunity to learn

Reference Sources- Online Databases

Contain abstracts of articles, original articles and links to other useful websites

Split Half Reliability

Correlating two pairs of scores obtained from equivalent halves of a single test administered once

Techniques Used to Calculate Measurement Error and Reliability- Alternate, Equivalent or parallel form Reliability

Correlation between equivalent forms of a test with different items Ex: Look at scores between 2 PSY 100 exams

Techniques Used to Calculate Measurement Error and Reliability- Test Retest Reliability

Correlation between scores obtained on two occasions

Techniques Used to Calculate Measurement Error and Reliability- Split Half Reliability

Correlation between two halves of a test (Spearman-Brown formula). Alpha (use when a test doesn't have right or wrong answers Ex: psychological scale)

Spearman's Rho

Correlation coefficient frequently used when the sample size is small (fewer than 30 pairs of measurements) and when both sets of measurements are ordinal
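With no tied scores, rho can be sketched with the shortcut formula rho = 1 - 6*Σd²/(n(n²-1)), where d is the difference between a pair's ranks; all values below are invented:

```python
# Convert raw values to ranks (1 = lowest); assumes no tied values.
def ranks(values):
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical ordinal data for five people
x = [3, 1, 4, 2, 5]
y = [30, 20, 50, 10, 40]
print(round(spearman_rho(x, y), 3))  # → 0.8
```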

Validity Coefficient

Correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure

Privileged Info

Data protected by law from disclosure in a legal proceeding; typically, exceptions to privilege are also noted in law

Coefficient of Equivalence

The degree of relationship between various forms of a test, evaluated by means of an alternate-forms or parallel-forms coefficient of reliability

Homogeneity

Degree to which a test measures a single factor. The more homogeneous a test is, the more inter-item consistency it can be expected to have.

Heterogeneity

Degree to which a test measures different factors

Incremental Validity

Degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use

Discrimination

Degree to which an item differentiates among people with higher or lower levels of what is being measured

User Norms (Program Norms)

Descriptive statistics based on a group of test takers in a given period of time; sampling is used to develop the norms

Criterion-Referenced Test

Designed to provide an indication of where a test taker stands with respect to some variable or criterion

Alternate Forms

Different version of a test that have been constructed to be parallel

Floor Effect

Diminished utility of an assessment tool for distinguishing test takers at the low end of the ability, trait or other attribute being measured

Ceiling Effect

Diminished utility of an assessment tool for distinguishing test takers at the high end of the ability, trait or other attribute being measured

State

Distinguishes one person from another but is relatively less enduring

Threats to Fairness- opportunity to learn

Do all test takers on a student engagement test have the opportunity to be cognitively and affectively engaged, given the instruction provided to them at the secondary school they attend?

Threats to Fairness- test content

Does the test measure student engagement only as defined by the test authors, which may or may not represent diverse identities, or does it represent student engagement across all the diverse identities individual participants might exhibit or identify with? If the authors do not identify with those diverse identities, it is important that they collaborate with people who do, to ensure that the research reflects those with diverse identities as well as individual recommendations

Parallel Forms

For each form of a test, the means and variances of observed test scores are equal

Variables of Assessment

Educational Assessment Retrospective Assessment Remote Assessment Ecological momentary Assessment (EMA)

Settings of the Testing Process

Educational, clinical, counseling, geriatric, business, military, gov and organizational credentialing, academic, program evaluation, health psychology

Observer Differences

Error associated with observers judging the same behavior differently using the same instrument

Item Sampling

Error associated with the selection of one set of items from the potential items within a domain for inclusion in a test Ex: items that could be in an SEI

Test Retest Reliability

Estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test

Estimate of Inter item Consistency

Estimate of the reliability of a test obtained from a measure of inter-item consistency

Confidentiality

Ethical obligation of professionals to keep confidential all communications made or entrusted to them in confidence, although professionals may be compelled to disclose such confidential communications under court order or extraordinary conditions, such as when such communications refer to a third party in immediate danger; contrast with privacy right

Alternative Assessment

Evaluative or diagnostic procedure or process that varies from the usual, customary or standardized way a measurement is derived, either by virtue of some special accommodation made to the assessee or by means of alternative methods designed to measure the same variables

Correlation

Expression of the degree and direction of a correspondence between 2 things

Meta Analysis

Family of techniques used to statistically combine info across studies to produce a single estimate of the data

Case History Data Example

Files or excerpts of files maintained by schools, employers and religious institutions; letters, written correspondence, photos, newspaper and magazine clippings, etc.

Summative Scale

Final test score is obtained by summing the ratings across all items Ex: Likert scale, method of paired comparisons, categorical scaling, Guttman scale, scalogram analysis

Protocol

Form or sheet or booklet on which a test taker's responses are entered

Format

Form, plan, structure, arrangement and layout of test items Ex: computerized, pencil and paper, etc.

Psychological Assessment

Gathering and integration of psychology-related data for the purpose of making a psychological evaluation, accomplished through the use of tools such as tests, interviews, case studies, behavioral observation and specially designed apparatuses and measurement procedures

Settings of the Testing Process- Gov and Organizational Credentialing Example

Government licensing, certification or other credentialing of professionals- passing the bar exam

Normative Sample

Group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual test takers

Rating Scale

Grouping of words, statements or symbols on which judgements of the strength of a particular trait, attitude or emotion are indicated by test takers

Active Consent

Requiring someone to sign something before the assessment is done

Face Validity

How relevant do the test items appear to be · Cannot be statistically measured Ex: Reading fluency

Content Referenced Testing and Assessment

How scores relate to particular content area or domain

Content Validity

How well a test samples items measuring what it is intended to measure

Criterion- Related Validity

How well a test taker's performance on a criterion measure can be inferred from the test that was given

Threats to Fairness- test context

Importance of the environment and the test itself as an environment Ex: directions for a student engagement test are given to all test takers the same way~ reading instructions aloud so administration is standardized for all

Testing People With Disabilities

Important to note that not all items translate, such as questions involving artwork for people who are blind

Threats to Fairness- construct irrelevance variance

In a test, we are trying to measure a specific construct; variance from anything else the test measures is construct-irrelevant

Ethics beneficence and nonmaleficence

In professional actions, psychologists seek to safeguard the welfare and rights of those with whom they interact professionally and other affected persons

Coefficient of Determination r2

Indication of how much variance is shared by the x and y variables

Construct

Informed, scientific concept developed or constructed to describe or explain behavior

Dynamic Assessment

Interactive approach to psychological assessment that usually follows a model of 1. evaluation 2. Intervention of some sort 3. evaluation

Source of Measurement Error- Observer Differences

Interrater, inter scorer, interobserver, inter judge reliability

Panel Interview (Board Interview)

Interview conducted with one interviewee by two or more interviewers at a time

Scalogram Analysis

Item-analysis procedure and approach to test development that involves a graphic mapping of a test taker's responses

Guttman Scale

Items on it range sequentially from weaker to stronger expressions of the attitude, belief or feeling being measured

Comparative Scaling

Judgements of a stimulus in comparison with every other stimulus on the scale

Notification

Letting people know you are doing the assessment and asking them to reply if they have any questions; not asking for active or passive consent

Inference

Logical result or deduction

Concurrent Validity

Looking at the relationship between, and our ability to make inferences from, a test and another test administered at the same time Ex: IQ used to help make an inference about an achievement test

Testing and Assessment with Communities of Color

The majority of tests are standardized, validated and found reliable primarily with white, middle-class, English-speaking samples · But historically they have still been viewed as objective, culture-free and generalizable across cultures

Settings of the Testing Process- Business and Military

May use tests, interviews and other tools of assessment

Measures of Central Tendency

Mean, median and mode

Average proportional Distance Methods

Measure used to evaluate the internal consistency of a test that focuses on the degree of difference that exists between item scores

Criterion-Referenced Testing and Assessment

Method of evaluation and a way of deriving meaning from test scores by evaluating an individual's score with reference to a set standard § Ex: Taking a driver's test

Norms Referenced Testing and Assessment

Method of evaluation and a way of deriving meaning from test scores by evaluating an individual test taker's score and comparing it to scores of a group of test takers

Behavioral Observation

Monitoring the actions of others or oneself by visual or electronic means while recording qualitative and/or quantitative info regarding those actions

Ethics Fidelity and responsibility

Must remember professional and scientific responsibilities

Passive Consent

Requiring someone to send something back saying they are not willing to participate in the assessment; otherwise consent is assumed

Types of Scale Measurement

Nominal- everything is mutually exclusive and exhaustive Ordinal- rank order, most frequently used in psychology Interval- equal intervals with no true zero (easier to manipulate statistically than ordinal) Ex: IQ Ratio- has a true zero

National Norm

Norms derived from a standardized sample that was nationally representative of the population Ex: age, gender, racial/ethnic background, socioeconomic status, geographical background

Extended Scoring Report

Not only provides a listing of scores but statistical data as well

Correlation Coefficient

Number that provides us with an index of strength of the relationship between 2 things

Pearson R

Obtaining an index of the relationship between 2 variables when that relationship is linear and when the 2 correlated variables are continuous

Types of Scaling

Ordinal, nominal, interval, ratio Age-based Unidimensional, multidimensional Comparative, categorical

Techniques Used to Calculate Measurement Error and Reliability- Inter rater, Inter Scorer, Inter Observer, Inter Judge Reliability

Percent agreement (most common, but NOT the best method; does not consider the level of agreement expected by chance) Kappa (better method; actual agreement as a proportion of potential agreement, corrected for chance; ranges from 1 (perfect agreement) to -1 (less agreement than expected by chance alone); 1. >.75 = excellent agreement 2. .40-.74 = fair to good agreement 3. <.40 = poor agreement)
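The contrast between percent agreement and kappa can be sketched as follows; the two raters' "shy"/"not shy" classifications below are invented for illustration:

```python
# Raw percent agreement: fraction of cases where the raters match.
def percent_agreement(r1, r2):
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

# Cohen's kappa: observed agreement corrected for chance agreement,
# kappa = (po - pe) / (1 - pe), with pe from each rater's marginal proportions.
def cohens_kappa(r1, r2):
    n = len(r1)
    po = percent_agreement(r1, r2)
    categories = set(r1) | set(r2)
    pe = sum((r1.count(c) / n) * (r2.count(c) / n) for c in categories)
    return (po - pe) / (1 - pe)

# Hypothetical ratings of 10 behaviors as "S" (shy) or "N" (not shy)
rater1 = ["S", "S", "N", "S", "N", "N", "S", "N", "S", "N"]
rater2 = ["S", "N", "N", "S", "N", "N", "S", "N", "S", "S"]
print(percent_agreement(rater1, rater2))       # → 0.8
print(round(cohens_kappa(rater1, rater2), 3))  # → 0.6
```

The 80% raw agreement drops to a kappa of .60 once chance agreement is removed, illustrating why percent agreement alone overstates reliability.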

Percentiles

Percentage of people whose score on a test or measure falls below a particular raw score § A converted score that refers to a percentage of test takers A popular way of organizing all test-related data
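A percentile-rank conversion can be sketched as below (the score list is made up):

```python
# Percentile rank: percentage of test takers scoring below the given raw score.
def percentile_rank(raw, all_scores):
    below = sum(s < raw for s in all_scores)
    return 100 * below / len(all_scores)

scores = [55, 60, 62, 67, 70, 73, 75, 80, 84, 91]
print(percentile_rank(73, scores))  # → 50.0
```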

Informed Consent

Permission to proceed with a (typically) diagnostic, evaluative or therapeutic service on the basis of knowledge about the service and its risks and potential benefits

Types of Kurtosis

Platykurtic --> relatively flat Leptokurtic--> relatively peaked Mesokurtic--> somewhere in the middle

Types of Criterion-Related Validity

Predictive, concurrent

Pilot Work (Pilot study, pilot research)

Preliminary research surrounding the creation of a prototype of the test o Items may be evaluated as to whether they should be used in the final form o Pilot work is followed by test construction

Standardization (test standardization)

Process of administering a test to a representative sample of test takers for the purpose of establishing norms

Scoring

Process of assigning such evaluative codes or statement to performance on tests, tasks, interviews or other behavioral samples

Validation

Process of gathering and evaluating evidence about validity

Psychological Testing

Process of measuring psychology related variables by means of devices or procedures designed to obtain a sample of behavior

Scaling

Process of setting rules for assigning numbers in measurement

Psychometrist/Psychometrician

Professional who uses, analyzes and interprets psychological test data

Reliability

Proportion of the total variance attributed to true variance o The greater the proportion of total variance attributed to true variance, the more reliable the test is

Psychological Assessment- Test

Psychological tests or other tools of assessment may vary by content, format, administration procedures, scoring, interpretation procedure, technical quality

Ethics Justice

Psychologists exercise reasonable judgement and take precautions to ensure that their potential biases, the boundaries of their competence and the limitations of their expertise do not lead to or condone unjust practices

Ethics Respect for the peoples Right and Dignity

Psychologists respect the dignity and worth of all people, and the rights of individuals to privacy, confidentiality and self-determination

CAPA (Computer Assisted Psychological Assessment) Example

Questions are interactive

Scaling Methods

Rating scales, summative scales

Standard Scores

Raw score that has been converted from one scale to another scale, where the latter scale has some arbitrarily set mean and SD · Raw scores may be converted to standard scores because standard scores are more easily interpretable than raw scores

Case History Data

Records, transcripts and other accounts in written, pictorial or other form that preserve archival info, official and informal accounts, and other data and items relevant to an assessee.

Cut Score (cutoff scores)

Reference point, usually numerical, derived by judgement and used to divide a set of data into 2+ classifications (cut scores on tests are usually used in combination with other data in many school contexts) Ex: used by employers as aids to decision making about personnel hiring, placement and advancement

Ecological momentary Assessment (EMA)

Refers to the "in the moment" evaluation of specific problems and related cognitive and behavioral variables at the very time and place that they occur

Panel Interview (Board Interview) Con

Relates to utility; the cost of using multiple interviewers may not be justified

Item Bank

Relatively large and easily accessible collection of test questions

Characteristics of Criterion

Relevant, valid, uncontaminated

Case Study (case history)

Report or illustrative account concerning a person or an event that was compiled on the basis of case history data

Validation Study

Research that entails gathering evidence relevant to how well a test measures what it purports to measure, for the purpose of evaluating the validity of a test or other measurement

Item Pool

Reservoir or well from which items will or will not be drawn for the final version of the test

Defining Fairness in Testing

Responsiveness to individual characteristics and testing contexts, so that test scores will yield valid interpretation for intended users (pg 50)

Z- Score

Results from the conversion of a raw score into a number indicating how many SD units the raw score is below or above the mean of the distribution
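The z-score conversion can be sketched as below (the raw scores are invented, and the population SD is used):

```python
# z = (raw - mean) / SD: expresses each raw score in SD units
# above (+) or below (-) the mean of the distribution.
def z_scores(raw_scores):
    n = len(raw_scores)
    mean = sum(raw_scores) / n
    sd = (sum((x - mean) ** 2 for x in raw_scores) / n) ** 0.5  # population SD
    return [(x - mean) / sd for x in raw_scores]

raw = [70, 80, 90, 100, 110, 120, 130]  # mean 100, SD 20
print([round(z, 2) for z in z_scores(raw)])
# → [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
```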

Continuous Scale

Scale used to measure a continuous variable

Discrete Scale

Scale used to measure a discrete variable Ex: mental health status~ a group previously hospitalized vs. a group never hospitalized

Settings of the Testing Process- Counseling Examples

Schools, prisons, government or privately owned institutions

Psychometrics

Science of psychological measurement

Central Processing

Scoring conducted at a central location

Local Processing

Scoring done onsite

Simple Scoring List

Scoring report providing only a listing of scores

Domain Sampling Theory

Seeks to estimate the extent to which specific sources of variation under defined conditions contribute to test scores

Scale

Set of numbers or symbols whose properties model empirical properties of the objects to which the numbers are assigned

Distribution

Set of test scores arranged for recording or study

Culture

Socially transmitted behavior patterns, beliefs and products of work of a particular population community or group of people

Systematic Error

Source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured

Random Error

Source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies in other variables in the measurement process

Health Psychology

Specialty area of psychology that focuses on understanding the role of psychological variables in the onset, course, treatment, prevention of illness, disease or disability

Source of Measurement Error- Internal Consistency

Split half reliability

Stanine

Standard score derived from a scale with a mean of 5 and SD of approximately 2 Ex: achievement tests, SAT

Linear Transformation

Standard score that retains a direct numerical relationship to the original score

Criterion

A standard on which a judgement or decision may be based

Standard Error of the Difference

Statistical measure that can aid a test user in determining how large a difference between two scores should be before it is considered statistically significant

Kurtosis

Steepness of a distribution in its center

Categorical Scaling

Stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum

Raw Score

Straightforward, unmodified accounting of performance that is usually numerical Ex: number of items responded to correctly on an achievement test

Types of Standardization Sampling

Stratified random sample Purposive sampling Convenience sampling (incidental sampling)

Fixed Reference Group Scoring System

System of scoring whereby the distribution of scores obtained on the test from one group of test takers (the fixed reference group) is used as the basis for the calculation of test scores for future administrations Ex: SAT

Motivational Interviewing

Targeted change in the interviewee's thinking and behavior; a therapeutic dialogue that combines person-centered listening skills, such as openness and empathy, with the use of cognition-altering techniques designed to positively affect motivation and effect therapeutic change

Tools of Psychological Assessment

Test Interview Portfolio Case history data Role play test Computer test

Where to go for Authoritative Info: Reference Sources

Test catalogues Test manuals Professional books Reference volumes Journal articles Online databases Directory of Unpublished Experimental Mental Measures

Sources of Error

Test construction (item/content sampling) Test administration (test environment, test taker variables, examiner-related variables) Test scoring and interpretation (scoring glitch) Other (margin of error)

Teleprocessing

Test data may be sent to and returned from a central facility by phone lines, mail or courier

Who are the parties in the Testing Process

Test developers, test users, test taker, society at large, other parties (organizations, companies, gov. agencies)

Polytomous Test Item

Test items or questions with three or more alternative responses, where only one is correct or scored as being consistent with the targeted trait or construct

Norms

Test performance data of a particular group of test takers that are designed for use as a reference when evaluating or interpreting individual test scores

Source of Measurement Error- Time Sampling

Test retest reliability

Group Frequency Distribution

Test score intervals replace the actual test scores
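
A minimal sketch of building one, assuming an interval width of 10 and made-up scores:

```python
from collections import Counter

def grouped_frequency(scores, width):
    """Replace raw scores with class intervals of the given width
    and count how many scores fall in each interval."""
    counts = Counter((s // width) * width for s in scores)
    return {f"{lo}-{lo + width - 1}": counts[lo] for lo in sorted(counts)}

scores = [62, 67, 71, 74, 75, 78, 81, 83, 83, 90]
print(grouped_frequency(scores, 10))
# {'60-69': 2, '70-79': 4, '80-89': 3, '90-99': 1}
```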

Class Scoring (category scoring)

Test takers' responses earn credit toward placement in a particular class or category with other test takers whose responses were scored in a similar way

Testing Process

Testing may be individual or group in nature. After test administration, the tester typically adds up the number of correct answers or the number of certain types of responses... with little if any regard for the how or mechanics of such content

Testing Skill of Evaluator

Testing typically requires technician-like skills in terms of administering and scoring a test as well as interpreting results

Culture Specific Tests

Tests designed for use with people from one culture but not from another

Settings of the Testing Process- Clinical

Tests employed may include intelligence tests, personality tests, neuropsychological tests, and other specialized instruments. o Ex: Public, private, and military hospitals; inpatient and outpatient clinics; private-practice consulting rooms; schools and other institutions.

Assessment Role of Evaluator

The assessor is key to the process of selecting tests and/or other tools of evaluation as well as in drawing conclusions from the entire evaluation

Error Variance

The component of a test score attributable to sources other than the trait or ability measured

Privacy Right

The freedom of people to choose the time, circumstances, and extent to which they wish to share with or withhold from others their personal beliefs, opinions, and behavior; contrast with confidentiality

Standard of Care

The level at which the average, reasonable, and prudent professional would provide diagnostic or therapeutic services under the same or similar conditions

Testing Role of Evaluator

The tester is not key to the process; practically speaking, one tester may be substituted for another tester without appreciably affecting the evaluation

Therapeutic Psychological Assessment

Therapeutic self-discovery and new understanding are encouraged throughout the assessment process

Diagnostic Test

Tool of assessment used to help narrow down and identify areas of deficit to be targeted for intervention

Role Play Tests

Tool of assessment wherein assessees are directed to act as if they were in a particular situation. May be used in clinical settings to get a baseline and again at the end of treatment

Standard Error of Measurement

Tool used to estimate or infer the extent to which an observed score deviates from the true score
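
The standard formula SEM = SD · sqrt(1 − r) can be sketched as follows (the SD of 15 and reliability of .91 are hypothetical values, not from the course):

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r): estimates how far an observed
    score is likely to deviate from the true score."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test with SD = 15 and reliability r = .91
print(standard_error_of_measurement(15, 0.91))  # 4.5
```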

Consultative Report

Type of interpretive report designed to provide expert and detailed analysis of test data that mimics the work of an expert consultant

Effect Size

Typically expressed as a correlation coefficient · Can be replicated · Conclusions tend to be more precise · More focus on effect size than statistical significance · Promotes evidence-based practice o Clinical and research findings

Assessment Objective

Typically to answer a referral question, solve a problem or arrive at a decision through the use of tools of evaluation

Testing Objective

Typically to obtain some gauge, usually numerical in nature, with regard to an ability or attribute

Local Validation Study

Typically undertaken in conjunction with a population different from the population for whom the test was originally validated

Assessment Outcome

Typically, assessment entails a logical problem-solving approach that brings to bear many sources of data designed to shed light on a referral question

Testing Outcome

Typically, testing yields a score or set of test scores

Retrospective Assessment

Use of evaluative tools to draw conclusions about psychological aspects of a person as they existed at some point in time prior to the assessment

Educational Assessment

Use of tests and other tools to evaluate abilities and skills relevant to success or failure in a school or pre-school context Ex: Intelligence tests, achievement tests, reading comprehension

Remote Assessment

Use of tools of psychological evaluation to gather data and draw conclusions about a subject who is not in physical proximity to the person or people conducting the evaluation

Utility

Usefulness or practical value that a test or other tool of assessment has for a practical purpose

Predictive Validity

Validity coefficient, incremental validity

True Score

Value that genuinely reflects an individual's ability (or trait) level as measured by a particular test o Domain sampling and generalizability theory

Error Variance

Variance from irrelevant, random sources

True Variance

Variance from true difference
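
In classical test theory, observed variance is the sum of true variance and error variance, and reliability is the proportion that is true variance. A toy illustration with assumed numbers:

```python
# Classical test theory: observed variance = true variance + error variance.
# The variance figures below are hypothetical, chosen for a clean ratio.
true_var = 80.0
error_var = 20.0
observed_var = true_var + error_var

# Reliability = share of observed variance attributable to true differences
reliability = true_var / observed_var
print(reliability)  # 0.8
```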

Item Analysis

Various procedures, usually statistical, designed to explore how individual test items work as compared to other items in the test and in the context of the whole test; contrast with qualitative item analysis

Issues Regarding Culture and Assessment

Verbal communication (vocab may change, translator skill or professionalism, unintentional hints, knowing the test taker's proficiency in the language of the assessment) Nonverbal communication (cultural norms that may be missed but make a difference to the test taker's answers) Standards of evaluation (preferences someone may have, individualist vs. collectivist culture)

Affirmative Action

Voluntary and mandatory efforts undertaken by federal, state, and local governments, private employers, and schools to combat discrimination and promote equal opportunity for all in education and employment

Right to the Least Stigmatizing Label

When reporting test results, the least stigmatizing label consistent with accurate reporting should be used

Non-Linear Transformation

When the data under consideration are not normally distributed but comparisons with normal distributions are to be made

Coefficient of Stability

When the interval between testing is greater than 6 months

Predictive Validity

Whether a measure you are using at one point in time can help predict future scores Ex: predicting how students will do on state tests at the end of the year by giving them a test earlier in the year

Convergent Validity

Tells you whether a test correlates highly, in the predicted direction, with a test that measures the same or a similar construct; if so, the scale you are developing is measuring the same thing (converging) as the other measure you are comparing it to. How well do they correlate?

Threats to Fairness- test response

Will test takers be able to understand how to respond to a five-point Likert scale? Ex: looking at age to determine whether a Likert scale is understandable starting in grade 6

Rapport

Working relationship between the examiner and the examinee

Test Construction

Writing test items (or rewriting or revising existing items) as well as formatting items, setting scoring rules, and otherwise designing or building the test

Types of Standard Scores

Z score T score Stanine Linear transformation Non-linear transformation
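
The linear conversions among these standard scores can be sketched as follows (the raw score, mean, and SD below are hypothetical):

```python
def z_score(raw, mean, sd):
    """Standard score: mean 0, SD 1."""
    return (raw - mean) / sd

def t_score(raw, mean, sd):
    """Linear transformation of z: mean 50, SD 10."""
    return 50 + 10 * z_score(raw, mean, sd)

def stanine(raw, mean, sd):
    """Nine-point standard scale: mean 5, SD ~2, clipped to 1-9."""
    s = round(5 + 2 * z_score(raw, mean, sd))
    return max(1, min(9, s))

# Raw score of 65 on a test with mean 50, SD 10
print(z_score(65, 50, 10))   # 1.5
print(t_score(65, 50, 10))   # 65.0
print(stanine(65, 50, 10))   # 8
```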

Minimum Competency Testing Program

Formal evaluation program in basic skills, such as reading, writing, and arithmetic, designed to aid in educational decision making that ranges from remediation to graduation

Other Considerations

o A good test is one that trained examiners can administer, score, and interpret with a minimum of difficulty o A good test is useful § Yields results that will benefit individual test takers or the larger society

Graphic Representation of Correlation

o Bivariate distribution o Scattergram o Scatter diagram o Scatterplot o Curvilinear ~ a relationship that follows a curve rather than a straight line

Reference Sources- Professional Book

o Books may shed light on how or why the test may be used for a particular assessment purpose, or administered to members of some special population o May provide useful guidelines for pretest interviews, drawing conclusions, and making inferences about data derived from the test o May alert users to common errors made

Reliability

o Consistency of a measuring tool o Psychological tests are consistent to varying degrees

Reference Sources- Journal Articles

o Current journals may contain reviews of the test, updates, or independent studies of its psychometric soundness, or examples of how the instrument was used in either research or applied contexts o Some journals specifically focus on matters related to testing and assessment

Grade Norms

o Designed to indicate the average test performance of test takers in a given school grade § Only useful with respect to years and months of schooling completed § A type of developmental norm

Reference Sources- Test Manual

o Detailed info on a particular test and technical info relating to it should be found in the test manual o Test publishers typically require documentation of professional training before filling an order for a test manual o Universities often have test manuals

Cultural Difference

o Differences between groups explained § As differences and potential strengths and not deficits

Cultural Deficit

o Differences between groups explained § By genetic and biological differences § By cultural beliefs, values, and practices, or lack of assimilation to the majority culture

Reference Sources- Test Catalogue

o Distributed by publishers of tests o Usually contains only brief descriptions of tests and seldom contains the detailed technical info a prospective user may require o The catalogue's objective is to sell the test; very few highly critical reviews are ever found in a publisher's test catalogue

Evoking Interests in Culture Related Issues

o Ex: immigrants and intelligence tests o Minorities often left out of the sample population Culture-specific tests (verbal and nonverbal communication, standards of evaluation)

Family Educational Rights and Privacy Act

o Guarantees privacy and confidentiality of educational records (including test results) o Records can only be released to school employees with a "legitimate educational interest" Ex: posting grades somewhere before being able to submit them electronically

Right to Be Informed of Test Findings

o If a test is voided, test takers have a right to know o Test takers are entitled to know what recommendations are being made as a consequence of test data

Age Norms (Age Equivalent Scores)

o Indicate the average performance of different samples of test takers who were at various ages at the time the test was administered § Can be done with physical characteristics like height or psychological characteristics like intelligence

Test Users Qualifications

o Level A~ tests or aids that can adequately be administered, scored, and interpreted with the aid of the manual and a general orientation to the kind of institution or organization in which one is working § Ex: achievement or proficiency tests o Level B~ tests or aids that require some technical knowledge of test construction and use and of supporting psychological and educational fields such as statistics, individual differences, psychology of adjustment, personality psychology, and guidance § Ex: aptitude tests, adjustment inventories applicable to normal populations o Level C~ tests and aids that require substantial understanding of testing and supporting psychological fields together with supervised experience in the use of the devices § Ex: projective tests, individual intelligence tests

Test Tryout

o Look at data, get feedback o Narrow down items for a final test

Coefficient Alpha

o Mean of all possible split-half correlations o Preferred method for obtaining an estimate of internal-consistency reliability o Ranges from 0 to 1
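
Coefficient alpha is usually computed from the standard formula alpha = k/(k−1) · (1 − sum of item variances / variance of total scores); a sketch with made-up item data:

```python
def cronbach_alpha(items):
    """items: one list of scores per test item, respondents in the
    same order. Returns Cronbach's coefficient alpha."""
    k = len(items)          # number of items
    n = len(items[0])       # number of respondents

    def var(xs):            # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Three hypothetical items answered by five respondents
items = [[3, 4, 3, 5, 4],
         [2, 4, 4, 5, 3],
         [3, 5, 4, 5, 4]]
print(round(cronbach_alpha(items), 2))  # 0.9
```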

Norm Referenced vs Criterion Referenced Evaluation

o In norm-referenced interpretation of test data, a usual area of focus is how an individual performed relative to other people who took the test o In criterion-referenced interpretation of test data, a usual area of focus is the test taker's performance against a set standard o Culture and inference

Reference Sources- Reference Volume

o Often contains a lot of details about test related info Ex: publishers, test author, intended population, test administration time

Example of Discrimination from Class

o Ricci v. DeStefano (2009) § New Haven (CT) Fire Department § Exam to determine eligibility for promotion to lieutenant and captain § No African Americans and only 1 Hispanic would have been among the 15 promoted based on the exam § Civil Service Board threw out the results and did not promote anyone § Frank Ricci, a white firefighter who would have been promoted based on the results, and others sued the New Haven Fire Department § Case eventually went to the U.S. Supreme Court § Arguments · Ricci o Should have been promoted because: § Had dyslexia § Studied 13 hr./day § Paid someone to read textbook audiotapes to prepare flashcards · New Haven Fire Department o Was right in throwing out the test results because: § Disparate impact of the test on firefighters of color § If they hadn't thrown out the results, firefighters of color would likely have sued

Testing and Assessment with Communities of Color- Biases Can Involve

o Tests being more accurate for one group than for another Ex: tests designed by and for the white middle class, so white middle-class students do better o One group scoring higher than another on a test designed to predict an outcome on which the groups are equal o Interpreting those differences as a cultural deficit

Computer Test Con

o Verification of identity is difficult o Results may be reported in more general terms o Test takers may have unstructured access to materials and other tools despite guidelines for test administrators

Major Issues with CAPA

§ Access to test administration, scoring, and interpretation · Computerized tests are easily copied and duplicated § Compatibility of paper-and-pencil and computerized versions of a test § Value of computerized test interpretation § Unprofessional, unregulated "psychological" testing online

Competency, based on MacCAT-T

§ Being able to evidence a choice as to whether one wants to participate or not § Demonstrating factual knowledge of the issues § Being able to reason about the facts of a study, treatment, or whatever it is to which consent is sought

Confidentiality Vs. Privilege

§ Confidentiality concerns matter of communication outside the courtroom § Privilege protects clients from disclosure to judicial proceedings

Deception

§ Do not use deception unless absolutely necessary § Do not use deception at all if it will cause participants emotional distress § Fully debrief participants

Written Form of Consent Specifies

§ General purpose of the testing § Specific reason it is being undertaken in the present case § General types of instruments to be administered

Restriction or Inflation of Range

§ If the variance of either variable in a correlational analysis is restricted by the sampling procedure used, the resulting correlation coefficient tends to be lower § If the variance of either variable in a correlational analysis is inflated by the sampling procedure, the resulting correlation coefficient tends to be higher

Disparate Treatment

§ Practice intentionally designed to result in a discriminatory outcome § Possibly due to social prejudice or a desire to maintain the status quo

Disparate Impact

§ Practice unintentionally results in a discriminatory outcome § Not viewed as stemming from planning or intent

Ethics Integrity

§ Psychologists seek to promote accuracy, honesty, and truthfulness in the science, teaching, and practice of psychology § Psychologists always minimize or avoid harm and, if harm does occur, they need to correct it

Item format

§ Selected-response format · Select a response from alternative responses § Constructed-response format · Test takers supply or create the correct answer § Multiple choice § Matching item § True-false item/binary item § Completion item (fill in the blank) § Short-answer item § Essay item

Testing and Assessment Benefits Society

· Allows a person's hard work and merit on an assessment, rather than nepotism, to determine outcomes · Can help identify educational difficulties

Factor Analysis Approach

· Approach relies on factor analysis and other related methods to sort and arrange individual test items into clusters or scales that are mathematically related or have specific properties · Test developers typically start by using the rational-theoretical approach to develop items first

Potential harm as the Result of an Administration of a Test

· Discrimination · Informed consent (make sure the person giving permission is well educated about the purpose) · Privacy and confidentiality (only share data with those who have a legitimate interest)

Various Sources of Error Are Part of the Assessment Process

· Error refers to a long-standing assumption that factors other than what a test attempts to measure will influence performance on the test o Test scores are always subject to questions about the degree to which the measurement process includes error

Discriminant Validity

· Is your scale not measuring what it is not supposed to measure? o You do not want a correlation o The test you are giving should not tell you anything about another test with a completely different construct

SEI

· Model used by school psychologists · Broadens the idea that assessment is more than just testing · SEI → Student Engagement Instrument · A school psychologist might use this to measure a student's engagement by observing the student, seeing how instruction looks for them, the layout of classes, etc. o Also look for student engagement ~ opportunities for students to speak · Testing the learner is only one part of a global process that involves multiple forms of assessment · Assessment is the greater process while testing is only one component

How are Assessment Conducted

· Responsible test users have obligations before, during, and after any measurement procedure is administered · All appropriate materials and procedures must be collected prior to the assessment · Test users have the responsibility to make sure the room they use is suitable for testing in

APA Ethics Code- Intro and Applicability

· The Preamble and General Principles are aspirational goals to guide psychologists toward the highest ideals of psychology · Most ethical standards are written broadly in order to apply to psychologists' varied roles · This ethics code applies only to psychologists' activities that are a part of their scientific, educational, or professional roles · APA may impose sanctions on its members for violations of the standards of the ethics code, including termination of APA membership, and may notify other bodies and individuals of its actions

Concerns of the Public

· Concern over the use of psychological testing first arose after WWI due to articles on "the abuse of tests" · A year after the Russians launched Sputnik, a satellite, the U.S. studied assessing testing ability and aptitude to identify gifted and academically talented students, which led to another round of public discussion reflected in magazine articles · Assessment has been affected in numerous and important ways by activities of the legislative, executive, and judicial branches of the federal and state governments

Test Related Behavior Non-Tester Related Behavior

· The objective of the test is to provide some indication of other aspects of the examinee behavior · Test related behavior can be used to aid in understanding of behavior that has already taken place

Legislation

· The public has also been quick to judge the utility of a test by calling it unfair and discriminatory, leading to group hiring quotas · State and federal legislatures, executive bodies, and courts have been involved in many aspects of testing and assessment. There has been little consensus about whether validated tests on which there are social differences can be used to assist with employment-related decisions · Rule 702 allowed more experts in court to testify regarding the admissibility of the original expert testimony. Beyond expert testimony indicating that some research method or technique enjoyed general acceptance in the field, other experts were now allowed to testify and present their opinions with regard to the admissibility of evidence · The Daubert case gave trial judges a great deal of leeway in deciding what juries would be allowed to hear · General Electric Co. v. Joiner (1997): the Court emphasized that the trial court had a duty to exclude unreliable expert testimony as evidence · Kumho Tire Co. v. Carmichael (1999) expanded on Daubert to include all experts, including psychologists

APA Ethics Code- Preamble

· This ethics code is intended to provide specific standards to cover most situations encountered by psychologists · It has as its goal the welfare and protection of the individuals and groups with whom psychologists work

