psychological testing


statistical estimates we can make about the scores we are given

-Standard error of measurement: an estimate of how far an observed score is likely to be from the testtaker's true score
-Standard error of estimate: the degree of error involved when predicting the value of one variable from another
-Standard error of the mean: an estimate of sampling error
-Standard error of the difference: how different two scores must be before the difference is statistically significant
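As a rough illustration of two of these estimates, the sketch below uses the standard textbook formulas SEM = SD * sqrt(1 - r) and SE_diff = SD * sqrt(2 - r1 - r2). The function names and the example values (SD of 15, reliabilities of .91 and .84) are hypothetical and chosen only for illustration.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r), where r is the test's reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)

def standard_error_of_difference(sd: float, r1: float, r2: float) -> float:
    """SE_diff = SD * sqrt(2 - r1 - r2) for two scores expressed on the same scale."""
    return sd * math.sqrt(2.0 - r1 - r2)

def confidence_band(observed: float, sem: float, z: float = 1.96) -> tuple:
    """Band around an observed score; z = 1.96 gives roughly 95% confidence."""
    return (observed - z * sem, observed + z * sem)

# Hypothetical values, for illustration only.
sem = standard_error_of_measurement(sd=15, reliability=0.91)
se_diff = standard_error_of_difference(sd=15, r1=0.91, r2=0.84)
low, high = confidence_band(observed=100, sem=sem)
```

With these assumed values the SEM is about 4.5 points, so an observed score of 100 would carry a 95% band of roughly 91 to 109.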

fixed reference group scoring systems

1) Fixed reference group scoring systems: The distribution of scores obtained on the test from one group of testtakers is used as the basis for calculating test scores on future administrations of the test. The SAT employs this method.
2) Norm-referenced versus criterion-referenced evaluation: Norm-referenced tests compare individuals to the normative group; in criterion-referenced tests, testtakers are evaluated on whether they meet a set standard (e.g., a driving exam).

Advantages of internet testing

1) Greater access to potential test users 2) Scoring and interpretation tend to be quicker 3) Costs tend to be lower 4) Facilitates testing of otherwise isolated populations and people with disabilities

Norm-referenced testing and assessment

A method of evaluation and a way of deriving meaning from test scores by evaluating an individual testtaker's score and comparing it to the scores of a group of testtakers.
- The meaning of an individual test score is understood relative to other scores on the same test.
- Norms are the test performance data of a particular group of testtakers, designed for use as a reference when evaluating or interpreting individual test scores.
- A normative sample is the reference group to which the performance of testtakers is compared.
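One common way to express a score relative to a normative sample is a standard (z) score. The sketch below is a minimal illustration, not a prescribed procedure; the function name and the five-person normative sample are hypothetical, and it assumes the normative sample is treated as the whole reference group (hence the population standard deviation).

```python
import statistics

def z_score(raw: float, norm_scores: list) -> float:
    """How many standard deviations a raw score lies above or below the norm group mean."""
    mean = statistics.mean(norm_scores)
    sd = statistics.pstdev(norm_scores)  # assumption: norm group treated as the full reference group
    return (raw - mean) / sd

# Hypothetical normative sample, for illustration only.
norm_sample = [40, 45, 50, 55, 60]
z_at_mean = z_score(50, norm_sample)
z_above = z_score(60, norm_sample)
```

A score equal to the norm group mean gets a z of 0; positive values indicate performance above the norm group average.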

verbal communication

Certain nuances of meaning may be lost in translation. Some interpreters may not be familiar with mental health issues, and pre-training may be necessary. In interviews, language deficits may be detected by trained examiners but may go undetected in written tests. Assessments need to be evaluated in terms of the language proficiency required and the language level of the testtaker.

traits and states can be quantified and measured

Different test developers may define and measure constructs in different ways. Once a construct is defined, test developers turn to item content and weighting. A scoring system and a way to interpret results need to be devised.

define psychological assessment

Gathering and integration of psychology-related data for the purpose of making a psychological evaluation through tools such as tests, interviews, case studies, behavioral observation, and specially designed apparatuses and measurement procedures. Assessment is carried out to answer a specific question.

standards of evaluation

Judgments related to certain psychological traits can be culturally relative. Cultures differ with regard to gender roles and views of psychopathology. Cultures also vary in terms of collectivist vs. individualist values. Collectivist cultures value traits such as conformity, cooperation, interdependence, and striving toward group goals. Individualist cultures place value on traits such as self-reliance, autonomy, independence, uniqueness, and competitiveness.

objectives of testing and assessment

a. The objective of testing is typically to obtain some gauge, usually numerical in nature, with regard to an ability or attribute.
b. The objective of assessment is typically to answer a referral question, solve a problem, or arrive at a decision through the tools of evaluation.

trait

any distinguishable, relatively enduring way in which one individual varies from another

classical test theory

assumes individuals have a true score that would be obtained if there were no errors in measurement
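In classical test theory this is usually written as X = T + E: an observed score X is the true score T plus measurement error E. The simulation below is only a sketch under stated assumptions (normally distributed error with mean zero, a hypothetical true score of 100 and error SD of 5); it shows that the average of many error-laden measurements converges toward the true score.

```python
import random
import statistics

def simulate_observed_scores(true_score: float, error_sd: float, n: int, seed: int = 0) -> list:
    """Generate n observed scores X = T + E, with E drawn from a normal distribution (assumption)."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n)]

# Hypothetical values, for illustration only.
scores = simulate_observed_scores(true_score=100.0, error_sd=5.0, n=10_000)
mean_observed = statistics.mean(scores)
```

Any single simulated score can miss the true score by several points, but the mean of the 10,000 scores lands very close to 100.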

overt behavior

observable action or the product of an observable action

computers as psychological tools

- Scoring may be done on-site (local processing) or at a central location (central processing).
- Reports may come in the form of a simple scoring report, extended scoring report, interpretive report, consultative report, or integrative report.
- Computer-assisted psychological assessment (CAPA) and computer adaptive testing (CAT) have allowed for tailor-made tests with built-in scoring and interpretive capabilities.

sampling to develop norms (7)

- Standardization: The process of administering a test to a representative sample of testtakers for the purpose of establishing norms
- Sampling: Test developers select a population for which the test is intended; its members have at least one common, observable characteristic
- Stratified sampling: Sampling that includes different subgroups, or strata, from the population
- Stratified-random sampling: Stratified sampling in which every member of the population has an equal opportunity of being included in the sample
- Purposive sample: Arbitrarily selecting a sample that is believed to be representative of the population
- Incidental/convenience sample: A sample that is convenient or available for use; it may not be representative of the population
- Generalization of findings from convenience samples must be made with caution
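The sampling ideas above can be sketched in a few lines. This is a minimal illustration of stratified-random sampling with proportional allocation, not a description of how any particular test was normed; the function name, the "region" stratum, and the 10% sampling fraction are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_random_sample(population, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one person per stratum
        sample.extend(rng.sample(members, k))       # random draw without replacement
    return sample

# Hypothetical population: 80 urban and 20 rural testtakers.
population = [{"id": i, "region": "urban" if i < 80 else "rural"} for i in range(100)]
sample = stratified_random_sample(population, lambda p: p["region"], fraction=0.1)
urban_count = sum(1 for p in sample if p["region"] == "urban")
rural_count = sum(1 for p in sample if p["region"] == "rural")
```

Because each stratum is sampled at the same rate, the 8-to-2 split in the sample mirrors the 80-to-20 split in the population, which is the point of stratifying.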

assumptions about psychological testing // psychological traits and states exist

- A trait has been defined as "any distinguishable, relatively enduring way in which one individual varies from another" (Guilford, 1959, p. 6).
- States also distinguish one person from another but are relatively less enduring (Chaplin et al., 1988).
- Thousands of trait terms can be found in the English language (e.g., outgoing, shy, reliable, and calm).
- Psychological traits exist as constructs: informed, scientific concepts developed or constructed to describe or explain behavior.
- We cannot see, hear, or touch constructs, but we can infer their existence from overt behavior, such as test-related responses.
- Traits are relatively stable; they may change over time, yet there are high correlations between trait scores at different time points.
- The nature of the situation influences how traits will be manifested.
- Traits refer to ways in which one individual varies, or differs, from another.

developing norms

Having obtained a sample, test developers:
1) Administer the test with a standard set of instructions
2) Recommend a setting for administering the test
3) Collect and analyze data
4) Summarize data using descriptive statistics, including measures of central tendency and variability
5) Provide a detailed description of the standardization sample
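The descriptive-statistics step can be illustrated with Python's standard library. This is only a sketch; the `summarize` helper and the five raw scores are hypothetical, and a real norming study would involve a far larger sample.

```python
import statistics

def summarize(scores: list) -> dict:
    """Central tendency (mean, median) and variability (SD, range) for a set of raw scores."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "sd": statistics.stdev(scores),      # sample standard deviation (n - 1 denominator)
        "range": max(scores) - min(scores),
    }

# Hypothetical raw scores from a standardization sample, for illustration only.
summary = summarize([12, 15, 15, 18, 20])
```

For these five scores the mean is 16, the median is 15, and the range is 8.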

types of norms

1) Percentile: The percentage of people whose score on a test or measure falls below a particular raw score. Percentiles are a popular method for organizing test-related data because they are easily calculated. One problem is that real differences between raw scores may be minimized near the ends of the distribution and exaggerated in the middle of the distribution.
2) Age norms: Average performance of different samples of testtakers who were at various ages when the test was administered
3) Grade norms: The average test performance of testtakers in a given school grade
4) National norms: Derived from a normative sample that was nationally representative of the population at the time the norming study was conducted
5) National anchor norms: An equivalency table for scores on two different tests; allows for a basis of comparison
6) Subgroup norms: A normative sample can be segmented by any of the criteria initially used in selecting subjects for the sample
7) Local norms: Provide normative information with respect to the local population's performance on some test
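Following the percentile definition in item 1 above (the percentage of people scoring below a given raw score), a percentile rank can be computed as in this sketch. The function name and the ten-score normative sample are hypothetical.

```python
def percentile_rank(raw: float, norm_scores: list) -> float:
    """Percentage of scores in the normative sample that fall below the given raw score."""
    below = sum(1 for s in norm_scores if s < raw)
    return 100.0 * below / len(norm_scores)

# Hypothetical normative sample, for illustration only.
norm_sample = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
rank_of_70 = percentile_rank(70, norm_sample)
```

Six of the ten normative scores fall below 70, so a raw score of 70 sits at the 60th percentile in this sample.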

where to go for information on tests

1) Test catalogues - Catalogues distributed by publishers of tests; they usually contain brief and uncritical descriptions of tests
2) Test manuals - Contain detailed information concerning the development of a particular test and technical information
3) Reference volumes - Reference volumes like the Mental Measurements Yearbook or Tests in Print provide detailed information on many tests
4) Journal articles - Contain reviews of a test, updated or independent studies of its psychometric soundness, or examples of how the instrument was used in either research or an applied context
5) Online databases - Educational Resources Information Center (ERIC) contains a wealth of resources and news about tests, testing, and assessment; there are abstracts of articles, original articles, and links to other useful websites
6) The American Psychological Association (APA) has a number of databases including PsycINFO, ClinPSYC, PsycARTICLES, and PsycSCAN
7) Other sources - Directory of Unpublished Experimental Mental Measures; also, university libraries provide access to online databases, such as PsycINFO, and electronic journals

The rights of test takers

1) Testtakers have a right to know why they are being evaluated, how the test data will be used, and what (if any) information will be released to whom
2) With full knowledge of such information, testtakers give their informed consent
3) Information needed for consent must be in language the testtaker can understand
4) Some groups (e.g., people with Alzheimer's disease) may not have the capacity, or competency, to provide informed consent
5) The right to be informed of test findings - In the past, testtakers were often not informed of diagnostic findings or anything that might hurt their self-image
6) Currently, giving information about test performance to examinees is ethically and legally mandated and may be useful from a therapeutic perspective as well
7) Testtakers have a right to know about test findings and recommendations
8) Test users should sensitively inform testtakers of the purpose of the test, the meaning of the score relative to those of other testtakers, and the possible limitations and margins of error of the test
9) The right to privacy and confidentiality - In most states, information provided by clients to psychologists is considered privileged information
10) Privilege is not absolute - Psychologists may have to disclose information if it will prevent harm either to the client or to some endangered third party
11) Another ethical mandate regarding confidentiality pertains to safeguarding test data
12) The right to the least stigmatizing label - The Standards advise that the least stigmatizing labels should always be assigned when reporting test results

Components of competency include:

1. Being able to evidence a choice as to whether one wants to participate
2. Demonstrating a factual understanding of the issues
3. Being able to reason about the facts of a study, treatment, or whatever it is to which consent is sought
4. Appreciating the nature of the situation
If the person cannot demonstrate competency, consent may be obtained from a parent or a legal representative.

what are the settings

1. Educational settings - Students typically undergo school ability tests and achievement tests. Diagnostic tests may be used to identify areas of educational intervention. Educators may also make informal evaluations of their students.
2. Clinical settings - Include hospitals, inpatient and outpatient clinics, private-practice consulting rooms, schools, and other institutions. Assessment tools are used to help screen for or diagnose behavior problems.
3. Counseling settings - Include schools, prisons, and governmental or privately owned institutions. The goal of assessment in this setting is improvement in adjustment, productivity, or some related variable.
4. Geriatric settings - Assessments primarily evaluate cognitive, psychological, adaptive, or other functioning; the focus is on quality of life.
5. Business and military settings - Decisions regarding careers of personnel are made with a variety of achievement, aptitude, interest, motivational, and other tests.
6. Government and organizational credentialing - Includes governmental licensing, certification, or general credentialing of professionals (e.g., attorneys, physicians, and psychologists).

testing and assessment can be conducted in a fair manner

All major test publishers strive to develop instruments that are fair when used in strict accordance with guidelines in the test manual. Problems arise if the test is used with people for whom it was not intended.

Tests have strengths and weaknesses

Competent test users understand and appreciate the limitations of the tests they use as well as how those limitations might be compensated for by data from other sources

legal and ethical concerns

Concerns of the public: Concerns started after World War I, when tests developed for military use were adapted for use in schools and industry. The launch of Sputnik by the Soviet Union prompted the U.S. government to greatly increase testing of abilities and aptitudes in schools to identify talented students. Simultaneously, ability and personality testing greatly increased in government, the military, and business. This led to renewed public concern. Public concern was further stoked in 1969 by Arthur Jensen's article in the Harvard Educational Review, in which he suggested that "genetic factors are strongly implicated in the average Negro-white intelligence difference." Jensen's work caused renewed public concern over nature-versus-nurture issues and what intelligence tests really measured. In recent decades, the government has been extensively involved in various aspects of assessment.

tests and group membership

Conflict often ensues when groups systematically differ in terms of scores on a particular test. In vocational assessment, test users are sensitive to legal and ethical mandates concerning the use of tests with regard to hiring, firing, and related decision making. Conflicts may arise from disagreements about the criteria for performing a particular job. Some would argue that if tests are measuring what they are supposed to measure, then group membership should not be an issue, while others seek to "level the playing field" through initiatives such as affirmative action.

culture and assessment

Culture: The socially transmitted behavior patterns, beliefs, and products of work of a particular population, community, or group of people (Cohen, 1994). Professionals in assessment have shown increasing sensitivity to cultural issues in every aspect of test development and use. Early psychological testing of immigrant populations by Henry Goddard was controversial: he reported that the majority of the immigrants he tested were "feebleminded."

How are assessments conducted?

Different methods are used. Responsible test users have obligations before, during, and after testing. Professional obligations include familiarity with test materials and procedures, and ensuring that the room in which the test will be conducted is suitable and conducive to testing. It is important to establish rapport during test administration; rapport can be defined as a working relationship between the examiner and the examinee.

legislation

Minimal competency testing programs: In the 1970s, many states passed laws to the effect that high school graduates should be able to meet "minimal competencies" in reading, writing, and arithmetic.
Truth-in-testing legislation: Passed at the state level starting in the 1980s; the objective was to give testtakers a way to learn the criteria by which they are being judged.
The Civil Rights Act of 1964 created the Equal Employment Opportunity Commission (EEOC) to enforce the act. The EEOC has published sets of guidelines concerning standards to be met in constructing and using employment tests; they seek to prevent discriminatory testing practices in employment.
There is public demand for proportional representation in hiring and school acceptance, yet there are gaps in test performance by various groups. Some scholars have argued that if the tests are valid and useful, they should not be changed or dismissed; rather, the skill gap should be addressed.
Law can also derive from litigation. PARC v. Commonwealth of Pennsylvania (1971) and Mills v. Board of Education of District of Columbia (1972) prompted Congress to ensure appropriate educational opportunities for children with disabilities.
Psychologists may act as expert witnesses in civil and criminal cases. The 1923 case of Frye v. the United States established that scientific research is admissible as evidence when the research study or method enjoys general acceptance; general acceptance could typically be established by the testimony of experts and by reference to publications in peer-reviewed journals.
The Daubert v. Merrell Dow Pharmaceuticals ruling by the Supreme Court superseded the long-standing policy, set forth in Frye, of admitting into evidence only scientific testimony that had won general acceptance in the scientific community. Opposing expert testimony, whether or not such testimony had won general acceptance in the scientific community, would be admissible. The Daubert ruling gave trial judges more leeway in deciding which expert testimony could be used. Some jurisdictions still rely on the Frye standard when it comes to admitting expert testimony, and some subscribe to Daubert.

nonverbal communication

Nonverbal signs or body language may vary from one culture to another. Psychoanalysis pays particular attention to the symbolic meaning of nonverbal behavior. Other cultures may complete tasks at a different pace, which may be particularly problematic for timed tests.

define psychological testing

Process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior. A psychological test is a device or procedure designed to measure variables related to psychology (e.g., intelligence, attitudes, personality, and interests). Psychological tests vary by content, format, technical quality, and administration, scoring, and interpretation procedures.

reliability, validity, other considerations

Reliability - The consistency of the measuring tool: the precision with which the test measures and the extent to which error is present in measurements
Validity - The test measures what it purports to measure
Other considerations - Administration, scoring, and interpretation should be straightforward for trained examiners; a good test is a useful test that will ultimately benefit individual testtakers or society at large

Test-Related Behavior Predicts Non-Test-Related Behavior

Responses on tests are thought to predict real-world behavior; the obtained sample of behavior is expected to predict future behavior

test developer

Test developer - Creates tests for research, for publication (as commercially available instruments), or as modifications of existing tests. The Standards for Educational and Psychological Testing covers issues related to test construction and evaluation, test administration and use, and special applications of tests, such as considerations when testing linguistic minorities.

collaborative psychological assessment

The assessor and assessee work as partners

Assessment of people with disabilities

The law mandates "alternate assessment," and the definition of this is up to individual states or school districts. Accommodations (the adaptation of a test, procedure, or situation, or the substitution of one test for another) are essential to make the assessment more suitable for individuals with exceptional needs.

therapeutic psychological assessment

Therapeutic self-discovery is encouraged through the assessment process

testing and assessment benefit society

There is a great need for tests, especially good tests, considering the many areas of our lives that they benefit

purposive sample

a nonrandom sample that is chosen for some characteristic that it possesses

tools for psychological assessment

a. Content: The subject matter of the test varies with the focus of the particular test and with the theoretical orientation of different test developers.
b. Format: The form, plan, structure, and layout of test items, and other considerations (e.g., time limits).
c. Administration: Tests may involve demonstration of certain tasks demanded of the assessee and trained observation of performance, or they may not require the involvement of test administrators at all.
d. Scoring and interpretation: Scoring of tests may be simple, such as summing responses to items, or may require more elaborate procedures. Some test results can be interpreted easily or interpreted by computer, whereas other tests require expertise for proper interpretation.
e. Cut score: A reference point, usually numerical, used to divide data into two or more classifications (e.g., pass or fail; mild, moderate, severe).
f. Psychometric soundness: Psychometrics is the science of psychological measurement; the psychometric soundness of a test depends on how consistently and accurately the test measures what it purports to measure. Test users are sometimes referred to as psychometrists or psychometricians.
g. Interview: A method of gathering information through direct communication involving reciprocal exchange. Interviews vary based on their purpose, length, and nature. The quality of information obtained in an interview often depends on the skills of the interviewer (e.g., their pacing, rapport with the interviewee, and their ability to convey genuineness, empathy, and humor).
h. Case history data: Information preserved in records, transcripts, and/or other forms.
i. Behavioral observation: Monitoring the actions of people through visual or electronic means.
j. Portfolio: A file containing the products of one's work; it may serve as a sample of one's abilities and accomplishments.
k. Role play tests: Assessees are directed to act as if they were in a particular situation; this is useful in evaluating various skills.
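The cut score idea above (a numerical reference point dividing scores into classifications) can be sketched as follows. The cut points of 10 and 20 and the convention that a score equal to a cut point falls into the higher category are assumptions made for illustration; an actual test manual defines its own cut scores and boundary rules.

```python
from bisect import bisect_right

def classify(score: float, cut_scores: list, labels: list) -> str:
    """Map a score to a label using ascending cut scores; needs one more label than cut scores."""
    if len(labels) != len(cut_scores) + 1:
        raise ValueError("need exactly one more label than cut scores")
    # bisect_right counts how many cut scores are <= score, which indexes the label.
    return labels[bisect_right(cut_scores, score)]

# Hypothetical cut scores and labels, for illustration only.
cuts = [10, 20]
labels = ["mild", "moderate", "severe"]
result_low = classify(9, cuts, labels)
result_boundary = classify(10, cuts, labels)
result_high = classify(25, cuts, labels)
```

A single cut score with the labels "fail" and "pass" works the same way, for example `classify(72, [70], ["fail", "pass"])`.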

who are the parties?

a. Test user - Tests are used by a wide range of professionals
b. The Standards contains guidelines for those who should administer psychological tests, but many countries have no ethical or legal guidelines for test use
c. Test taker - Anyone who is the subject of an assessment or evaluation is a test taker. Test takers may differ on a number of variables at the time of testing (e.g., test anxiety, emotional distress, physical discomfort, and alertness)
d. Society at large - Test developers create tests to meet the needs of an evolving society
e. Laws and court decisions may play a major role in test development, administration, and interpretation
f. Other parties - Organizations, companies, and governmental agencies sponsor the development of tests
g. Companies may offer test-scoring and interpretation services
h. Academicians may review tests and evaluate their psychometric soundness

Rorschach Inkblot test

An example of a projective test: a test in which an individual is assumed to "project" onto some ambiguous stimulus his or her own unique needs, fears, hopes, and motivation.
Psychological assessment has proceeded along two lines, the academic and the applied. In the academic tradition, researchers at universities throughout the world use the tools of assessment to help advance knowledge and understanding of human and animal behavior. In the applied tradition, the goal is to select applicants for various positions on the basis of merit.

incidental sample

a sample that is convenient or available for use

stratified sampling

sampling that includes different subgroups, or strata, from the population; it helps prevent sampling bias and ultimately aids in the interpretation of the findings

most common computer reports used

simple scoring reports and interpretive reports

dynamic assessment

Typically employed in educational settings but may also be used in correctional, corporate, neuropsychological, clinical, and other settings. It usually follows a model of evaluation, then intervention, then evaluation.

rapport

working relationship between the examiner and the examinee

various sources of error are part of assessments

- Error refers to a long-standing assumption that factors other than what a test attempts to measure will influence performance on the test
- Error variance: The component of a test score attributable to sources other than the trait or ability measured
- Both the assessee and assessor are sources of error variance

culture and inference

- In selecting a test for use, responsible test users should research the test's available norms to check how appropriate they are for use with the targeted testtaker population
- When interpreting test results, it helps to know about the culture and era of the testtaker
- It is important to conduct a culturally informed assessment

