NCE-assessments and testing


Best practices

The counselor thoroughly understands the results. The counselor should explain results in easily understood terms and be able to provide supporting details and norms as needed. The counselor should explain and understand average scores and the meanings of results. The counselor should allow the client to ask questions and review aspects of the test to ensure understanding. The counselor must explain the ramifications and limitations of any data obtained through testing.

Standardized Scores z-scores T scores

A "common language" used to compare several different test scores for the same individual. They are obtained by converting raw score distributions. These derived scores provide a constant normative or relative meaning and allow for comparisons between individuals. Specifically, they EXPRESS THE PERSON'S DISTANCE FROM THE MEAN IN TERMS OF THE STANDARD DEVIATION of that standard score distribution. They are continuous and have equality of units. The two most commonly used are: - -

Construct Validity Convergent Validity Discriminant Validity

A test that measures abstract traits or theories and isn't inadvertently testing another variable. For example, a math test with complex word problems may be assessing reading skills. Two subtypes are needed to assess it:

Equivalence

ALTERNATE FORMS OF THE SAME TEST are administered to the same group and the correlation between them is calculated How comparable the forms of the test are will influence this reliability

Major Types of Tests and Inventories

Achievement test Aptitude test Intelligence Test Occupational Test Personality Test

Test-Retest

Administering the same test twice to a group of individuals, then correlating the scores to evaluate stability

Occupational Test

Assess skills, values, or interest as they relate to vocational and occupational choices EX.- O*NET Interest Profiler, Career Assessment Inventory, Self-Directed Search

Norm referenced Criterion referenced Ipsatively interpreted

Assessments may be: - - -

Personality Test

Can be objective (structured, such as rating-scale or self-report inventories) or projective (unstructured, ambiguous stimuli), and help the counselor and client understand personality traits and underlying beliefs and behaviors. EX.- Myers-Briggs Type Indicator (MBTI), Minnesota Multiphasic Personality Inventory (MMPI-2), Beck Depression Inventory. Projective test- Rorschach (inkblot), intended to reveal unconscious thoughts, motives, and views

Inter-Rater Reliability

Checks to see that raters (those administering, grading, or judging a measure) do so in agreement. Each rater should value the same measures and at the same degree to ensure consistency Prevents overly subjective ratings, since rater is measuring on the same terms

Face Validity

Commonsense view that a test measures what it should or looks accurate from a non-professional viewpoint

34% + 34% = 68%; adding 13.5% + 13.5% brings the total to 95%; adding 2% + 2% brings the total to 99%

Counselors should be familiar with the distribution of scores within the normal curve: _____________, and comprise one standard deviation _____________, and comprises two standard deviations _____________, and comprise three standard deviations
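The percentages on this card can be checked numerically. The following is a minimal sketch, assuming a perfectly normal distribution; the function name share_within is made up for illustration. The card's 68%, 95%, and 99% are rounded versions of the values this prints.

```python
import math

def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# One, two, and three standard deviations on either side of the mean.
for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```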

Average Inter-Item Correlation

Determines whether scores on one item relate to the scores on all of the other items in that scale. Correlated items provide a deliberate form of redundancy, which helps ensure the same content is assessed with each question

intelligence achievement aptitude personality interests

Different types of test: - - - - -

Curricular Validity

Evaluated by experts; measures whether a test aligns with the curriculum being tested. For example, a high school exit exam measures the information taught in the high school curriculum

Criterion Validity Predictive validity Concurrent Validity

Measures success and the relationship between a test score and an outcome, such as scores on the SAT and success in college Two Subtypes:

Parallel-Forms Reliability (aka equivalence)

Involves administering two different versions of an assessment that measure the same set of skills, knowledge, etc. and then correlating the results. A test can be written and split into two parts, thus creating parallel versions

Intelligence test

Measure mental capacity and potential EX. WISC, WAIS, WPPSI, Woodcock-Johnson, Kaufman Assessment Battery for Children

Aptitude Test

Measure the capacity for learning and can be used as part of a job application Can measure abstract/conceptual reasoning, verbal reasoning, and/or numerical reasoning EX- Differential Aptitude Test (DAT), Wonderlic Cognitive Ability Test, Career Ability Placement Survey (CAPS)

Achievement Test

Measures knowledge of a specific subject; primarily used in education. EX- exit exams for high school, GED

Range Standard Deviation Variance

Measures of variability: - - -

Validity Face Validity Curricular Validity Criterion Validity Construct Validity

Refers to how well a test or assessment measures what it's intended to measure For example, an assessment on depression should only measure the degree to which an individual meets the diagnostic criteria for depression Though it does indicate reliability, a test can be reliable but not be _____ • Four types:

Internal Consistency Average inter-item correlation Split-half Reliability

Refers to how well a test or assessment measures what it's intended to measure, while producing similar results each time. Questions on an assessment should be similar and in agreement, but not repetitive High _________ indicates that a measure is reliable Involves: - -

Consequential Validity

Social consequences of testing Though not all researchers feel it's a true measure of validity, some believe a test must benefit society in order to be considered valid

Objective Test Items

Standardized questions with clear correct or incorrect answers; not open to any interpretation

Correlation Coefficient

Statistic that describes the strength and direction of the relationship between two variables. In positive correlation, both variables move in the same direction. In negative correlation, the variables move in opposite directions
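A short worked example may help here. This is a minimal sketch of the Pearson product-moment correlation, which is one common correlation coefficient (the card does not name a specific one); the function name pearson_r and the score lists are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

hours_studied = [1, 2, 3, 4]   # hypothetical data
test_score = [2, 4, 6, 8]      # rises with hours: positive correlation
errors_made = [8, 6, 4, 2]     # falls with hours: negative correlation

print(pearson_r(hours_studied, test_score))
print(pearson_r(hours_studied, errors_made))
```

The sign gives the direction of the relationship and the absolute value gives its strength.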

reliable / valid Valid / reliable

Test may be ______ BUT NOT ________ _______ are ________ unless of course there is a change in the underlying trait or characteristic which might occur through maturation, training, or development

Power Based Speed Based

Test may be: ____________ : no time limits or very generous ones (such as NCE) ______________: timed, the emphasis is placed on speed and accuracy. (EX. measure of intelligence, ability, and aptitude)

T score T / ten (T)ransforming

The mean of this standardized score scale is 50 and the standard deviation is 10. By Transforming this standard score, negative scores are eliminated unlike the z-score. The ___ should remind you of ____ which is the standard deviation of this distribution
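The transformation described on this card can be sketched as follows, assuming the score has already been expressed as a z-score (mean 0, standard deviation 1). The function name t_score is made up for illustration.

```python
def t_score(z):
    """Convert a z-score to a T score (mean 50, standard deviation 10)."""
    return 50 + 10 * z

# A z-score of -2.0 is negative, but its T score is not.
for z in (-2.0, 0.0, 1.5):
    print(z, "->", t_score(z))
```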

Split-Half Reliability

The random division of questions into two sets Results of both halves are compared to ensure correlation

Internal Consistency split half method Spearman-Brown Formula interitem consistency Kuder-Richardson formulas (there are two) Cronbach alpha coefficient

Two methods: ____________ - the test is divided into two halves and the correlation between the two halves is calculated. Note: when you reduce the length of the test with this method, you necessarily reduce its measured reliability. To help, you may apply the ________ to see how reliable the test would be had you not split it in two. _________ - the more homogeneous the items, the more reliable the test. ______ are used if the test contains dichotomous items (yes or no; true or false). _______ is applied if the instrument contains nondichotomous items (essay, multiple choice)
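Two of the formulas named on this card can be sketched in a few lines. This is an illustration only: it uses the standard textbook forms of the Spearman-Brown correction for a test split in two and of Cronbach's alpha, and the function names and item data are hypothetical.

```python
import statistics

def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between two half-tests."""
    return 2 * r_half / (1 + r_half)

def cronbach_alpha(items):
    """items: one list of scores per test item, all in the same respondent order."""
    k = len(items)
    sum_item_variances = sum(statistics.variance(item) for item in items)
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_variances / statistics.variance(totals))

# A half-test correlation of .60 corrects to a higher full-length estimate.
print(spearman_brown(0.60))

# Hypothetical data: 3 items answered by 5 respondents.
items = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
print(round(cronbach_alpha(items), 2))
```

The corrected value is always at least as large as the half-test correlation, which matches the note that splitting a test lowers its measured reliability.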

Stability Equivalence Internal consistency (interrater)

Types of reliability - - -

Face Content Predictive Concurrent Construct

Types of validity - - - - -

Convergent Validity

Use two sets of tests to determine that the same attributes are being measured and correlated For example, two separate tests can measure students similarly

Discriminant Validity

Using tests that measure differently and results that don't correlate

standardized nonstandardized

_________ - the instruments are administered in a formal, structured procedure and the scoring is specified _________ - there are no formal or routine instructions for administration or for scoring. Some examples may be checklists or rating scales

Intrusive (or reactive) measurement Unobtrusive (or nonreactive) measurement

_____________ - means the participant knows he or she is being watched or questioned and this knowledge may affect his or her performance. Examples- questionnaires, interviews, or observation ___________ - means data is collected without the awareness of the individual, or without changing the natural course of events. Examples are reviewing existing records or unobtrusive observation

Case or historical study rating scales

_______________ - this may be an analytical and/or diagnostic investigation of a person or group _______________ - these may be used to report the degree to which an attribute or characteristic is present

Regression toward the mean statistical regression

_______________ means that if one earns a very low score (15% or lower) or very high score (85% or higher) on a pretest, the individual will probably earn a score closer to the mean on the posttest. This is because of the error occurring due to chance, personal, and environmental factors. These factors can reliably be expected to be different on the posttest
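One way to see the effect in numbers is the standard linear-regression prediction, which is not stated on the card and is used here as an assumption: with equal means and standard deviations on both testings, the expected posttest score is the mean plus the test-retest correlation times the pretest deviation. The function name and values are hypothetical.

```python
def expected_posttest(pretest, mean, r):
    """Predicted posttest score, assuming equal means and SDs on both testings."""
    return mean + r * (pretest - mean)

# Hypothetical scale with mean 50 and a pretest-posttest correlation of .60.
print(expected_posttest(80, 50, 0.60))  # an extreme high score moves down toward 50
print(expected_posttest(20, 50, 0.60))  # an extreme low score moves up toward 50
```

Unless the correlation is perfect (r = 1.00), the predicted score always lies closer to the mean than the pretest score did.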

Grade equivalent scores Age equivalent scores

________________ - scores on an achievement test are often reported as this. The individual's score is compared to the average score of others in their grade. Usually done in school settings _____________ - An individual's score is compared to the average score of others at the same age

maximal performance test typical performance

a __________________ may generate a person's best performance on an aptitude or achievement test and a ______ may occur on an interest or personality test

Measures of Central tendency: - Mean - Median -Mode

a distribution of scores (measurements on a number of individuals) can be examined using the following measures: -________: the arithmetic average symbolized by X or M -________ : the middle score in a distribution of scores -________ : the most frequent score in a distribution of scores All three of these fall in the same place when the distribution of scores is symmetrical, i.e. normally distributed (not skewed.)
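The three measures can be computed with Python's standard statistics module. A minimal sketch with a made-up, symmetrical set of scores, chosen so that all three land in the same place as the card describes:

```python
import statistics

# Hypothetical, symmetrical distribution of nine test scores.
scores = [60, 70, 70, 80, 80, 80, 90, 90, 100]

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle score of the ordered list
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```

In a skewed distribution the three values would separate.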

test battery

a group or set of tests administered to the same group and scored against a standard

Test

a measuring device or procedure

Stanine (STAndard NINE)

a nine-point scale used to convert a test score to a single digit. They are always positive whole numbers from one to nine

Horizontal Test

a test covering material across various subjects

Construct multiple traits Convergent validation Discriminant validation

a test has construct validity to the extent it measures some hypothetical construct such as anxiety or creativity Usually several tests or instruments are used to measure different components of the construct or of the hypothesized relationships between that construct and other constructs this is best when ________ are being measured using a variety of methods _______________ - occurs when there is high correlation between the construct under investigation and others ______________ - occurs when there is no significant correlation between the construct under investigation and others

Percentile

a value below which a specified percentage of cases fall. Ex. - the 75th percentile: this score is higher than 75% of the scores; 25% of the scores are higher than this score

Norm referenced

comparing individuals to others who have taken the test before may be national, state, or local in this testing, how you compare with others is more important than what you know

Fluid:

ability to think and act quickly and to solve new problems these are skills that are independent of education and enculturation

Free Choice test

aka Liberal Choice; questions that allow for a subjective/open-ended response

Z-score

aka standard score measure the number of standard deviations a raw score is from the mean use zero as the mean
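The conversion can be sketched as follows. The function name and the example values (a scale with mean 100 and standard deviation 15) are assumptions chosen for illustration.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

# Hypothetical scale: mean 100, standard deviation 15.
print(z_score(115, 100, 15))  # one SD above the mean
print(z_score(100, 100, 15))  # exactly at the mean
print(z_score(70, 100, 15))   # two SDs below the mean
```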

Aptitude

also called ability test, these measure the effects of general learning and are used to predict future performance

Percentile ranks

an individual's score can be compared to a group (norm group) already examined. this indicates what percentage of individuals in that group has scores above or below this individual
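A minimal sketch of the comparison, assuming the percentile rank is defined as the percentage of norm-group scores that fall below the individual's score (some texts count scores at or below instead). The function name and norm group are hypothetical.

```python
def percentile_rank(score, norm_group):
    """Percentage of norm-group scores that fall below the given score."""
    below = sum(1 for s in norm_group if s < score)
    return 100 * below / len(norm_group)

norm_group = [50, 60, 70, 80, 90]  # hypothetical, previously tested group
print(percentile_rank(80, norm_group))
```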

Psychological Assessment

an informal process of testing, interviews or observations used to determine the psychological needs of an individual. Assessments can expose the need for more formal testing

Halo Effect

an overgeneralized positive view of a person from limited data

Standard Error of Measurement (SEM) confidence band or confidence limits

another measure of reliability, useful in interpreting the test scores of an individual. May also be referred to as ____ or _______. Helps determine the range within which an individual's test score probably falls
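The card does not give a formula. A minimal sketch, assuming the standard textbook form SEM = SD x sqrt(1 - reliability) and a band of one SEM on either side of the observed score; the function names and numbers are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and its reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

def confidence_band(observed, sd, reliability, width=1.0):
    """Range of observed score plus or minus `width` SEMs."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - width * sem, observed + width * sem

# Hypothetical test: SD of 15, reliability of .91, observed score of 100.
print(standard_error_of_measurement(15, 0.91))
print(confidence_band(100, 15, 0.91))
```

A more reliable test gives a smaller SEM and therefore a narrower band around the observed score.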

Unobtrusive measurement

assessment tools (such as observation) conducted without knowledge of the individual

Sociometry sociogram

can be used to identify isolates, rejectees or stars (popular individuals) You can measure the structure and organization of social groups which could be a classroom of fourth graders who have been together for a few months, or a work unit It requires revealing personal feelings about others _____________ - a figure or map showing the interrelationships or structure of the group

Criterion referenced

comparing an individual's performance to some predetermined criterion which has been established as important Ex. - NCE cut off score

Ipsatively Interpreted

comparing the results on the test within the individual Ex. - looking at an individual's highs and lows on an aptitude battery which measures several aptitudes. There is no comparison with others Ex.- When an individual's score on a second test is compared to the score on the first test

Coefficient of determination

denoted by R^2; the proportion of the variance in the dependent variable that's predictable from the independent variable. It equals the square of the coefficient of correlation
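As a quick arithmetic illustration, assuming a single predictor so that R^2 is simply the square of r; the function name and the value of r are made up.

```python
def coefficient_of_determination(r):
    """Proportion of variance explained, given a correlation coefficient r."""
    return r ** 2

r = 0.70  # hypothetical correlation between a test score and an outcome
print(f"{coefficient_of_determination(r):.0%} of the variance is shared")
```

Note that a negative correlation of the same size explains the same proportion of variance.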

external validity

describes how well results from a study can be generalized to the larger population

Concurrent Validity

determine if measures can be substituted, such as taking an exam in place of a class. Measures must take place concurrently to accurately test for validity

Percentile

determines how test scores rank on a scale of 100. Determines the number of individuals who are at or below a given rank. For example, a test taker who scores in the 65th percentile performed better than 65 percent of the other test takers

rapport

development of trust, understanding, respect, and liking between two people; essential for an effective therapeutic relationship

Crystallized:

encompasses acquired and learned skills and is influenced by personality, motivation, education, and culture

Normal Bell Curve

essentially distributes the scores (individuals) into SIX equal parts: three above the mean and three below the mean

Stanine

from STAndard NINE, converts a distribution of scores into 9 parts (1 to 9) with five in the middle and a standard deviation of about 2.
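A common way to approximate the conversion, assumed here rather than taken from the card, is to rescale a z-score to a mean of 5 and a standard deviation of 2, round to the nearest whole number, and clamp the result to the range 1 to 9. The function name is hypothetical.

```python
import math

def stanine_from_z(z):
    """Approximate stanine for a z-score: mean 5, SD 2, limited to 1 through 9."""
    rounded = math.floor(2 * z + 5 + 0.5)  # round half up
    return max(1, min(9, rounded))

for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(z, "->", stanine_from_z(z))
```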

measurement

general process of determining the dimensions of an attribute or trait

Predictive Validity

how useful test scores are at predicting future performance

Variance

how widely individuals in a group vary how data is distributed from the mean and the square of the standard deviation

Bell Curve

illustration of data distribution that resembles the shape of a bell

Appraisal

implies going beyond measurement to making judgments about human attributes and behaviors and is used interchangeably with evaluation

Validity content construct criterion consequential

indicates how well any given test or assessment measures what it's intended to measure. There are four types: * does indicate reliability

Subjective

individual perceptions/ interpretations based on feelings and opinions, but not necessarily based on fact

Personality projectives inventories specialized

is the dynamic product of genetic factors, environmental experiences, and learning to include traits and characteristics. Three different types: ______________________ - these tests present a relatively unstructured task or stimulus. The person projects thought processes, needs, anxieties, etc. _____________________ _____________________

Interpretation

making a statement about the meaning or usefulness of measurement data according to the professional counselor's knowledge and judgement

ipsative format

means of testing that measures how individuals prefer to respond to problems, people, and procedures and doesn't compare results to others

Normative format

means of testing to compare individuals to others

Standard deviation

measure of dispersion of numbers calculated by the square root of the variance

Difficulty Index

measure of the proportion of examinees who answer test items correctly
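A minimal sketch of the calculation for a single item, assuming responses are coded 1 for correct and 0 for incorrect. The function name and data are hypothetical.

```python
def difficulty_index(responses):
    """Proportion of examinees who answered the item correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

# Hypothetical item answered by five examinees, three of them correctly.
print(difficulty_index([1, 1, 0, 1, 0]))
```

A higher value means more examinees answered correctly, so the item is easier.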

Reliability test-retest parallel forms inter-rater internal consistency

measures that a tool is producing consistent and stable results that must be quantified. Doesn't indicate validity. Four types:

Achievement

measures the effects of learning or a set of experiences. These tests may be used diagnostically

Trait

method of describing individuals through observable characteristics that are unique and distinguishable

Score

numerical value associated with a test or measure

J.P. Guilford

o Conducted psychometric studies of human intelligence and creativity in the mid-1900s o Believed intelligence tests were limited and overly one-dimensional, and didn't factor in the diversity of human abilities, thinking, and creativity

Binet and Simon

o Developed the first test to determine which children would succeed in school in the early 1900s o Focused on the concept of mental age and included memory, attention, and problem solving o Brought to Stanford University, it has since been revised many times and is still used widely

David Wechsler

o Developed intelligence tests for adults and children in the 1950s o Tests were good at identifying learning disabilities in children o Believed intelligence has both verbal and performance components and that factors other than pure intellect influenced intellectual behavior o WISC, WAIS, WPPSI

Robert Williams

o Developed the Black Intelligence Test of Cultural Homogeneity (BITCH test) to address the racial inequalities of traditional intelligence tests in 1970s o Used vernacular and experiences common to African American culture

Raymond Cattell and John Horn

o Developed theories of fluid and crystallized intelligence in 1940s

John Ertl

o Invented a neural efficiency analyzer to more effectively measure intelligence o Believed traditional intelligence tests were limited to understanding an abstract degree of intelligence o His system measured the speed and efficiency of electrical activity in the brain using an electroencephalogram (EEG)

Francis Galton

o One of the first to study intelligence in the late 1800s o Cousin to Darwin o Coined term Eugenics o Believed intelligence was genetically determined and could be promoted through selective parenting

Charles Spearman

o Responsible for bringing statistical analysis to intelligence testing in early 1900s o Proposed g Factor Theory for general intelligence, which laid the foundation for analyzing intelligence tests o Prior to him, tests weren't highly correlated with the factors the test attempted to measure

Arthur Jensen accounted for simple associative learning and memory involved more abstract and conceptual reasoning

o Supported g Factor Theory and believed intelligence consisted of two distinct sets of abilities Level I - _________________ Level II - ________________ o Believed genetic factors were the most influential indicator of intelligence

Dichotomous Items

opposing choices on a test, such as yes/no or true/false options

Interests

preferences, likes and dislikes of an individual and more broadly includes values. These are often not stable in the teen years

Rating Scale

process of measuring degrees of experience and attitudes through questions

assessment

processes and procedures for collecting information about human behavior _______ tools include tests, inventories, rating scales, observation, interview data and other techniques

appraisal

professionally administered assessment tools and tests used to evaluate, measure, and understand clients

mean

provides the average for all scores; calculated by adding all given test scores and dividing by the number of scores

Correlation coefficient (r) cause and effect the degree of relationship

ranges from -1.00 to 1.00 shows the relationship between two sets of numbers. when a very strong correlation exists, if you know one score of an individual you can predict the other score of that person. A correlation between two variables is called ______ A correlation between three or more variables is called ___________ It can tell you NOTHING about _____, only _____

Likert Scale

rating measuring attitudes to a degree of like or dislike

Psychological Test

refers to any number of specific tests or measurements conducted to evaluate, diagnose, or develop treatment plans. It can include personality assessments, projective or subjective tests, intelligence tests, or diagnostic batteries.

standard error of measurement (SEM)

refers to test reliability and the difference between the true score and the observed score. Since no test is without error, the SEM describes the expected dispersion of an individual's scores on the same test around the true score; also referred to as the "standard error" of a score

median

refers to the middle or center number in an ordered list of scores or data; also referred to as the midpoint. In an even data set, the two middle numbers are typically averaged to determine the median

Projective Test

responses to ambiguous images that are intended to uncover unconscious desires, thoughts, or beliefs

Vertical Test

same-subject tests given to different levels or ages

Measure

score assigned to traits, behaviors, or actions

Q-Sort

self-assessment procedure requiring subjects to sort items relative to one another along a dimension, such as agree/disagree

T-Score

specific to psychometrics; used to standardize test scores and convert them to positive numbers. Expresses the number of standard deviations a score is from the mean on a scale where the mean is always 50 and the standard deviation is 10

Regression to the Mean

statistical tendency of a data series to gravitate towards the center of a distribution

range

subtraction of the lowest score from the highest score

Intelligence

the ability to think in abstract terms; to learn Some also believe it is the ability to adapt to the environment and adjust to it aka general ability or cognitive ability

Reliability reliable reliable

the consistency of a test or measure; the degree to which the test can be expected to provide similar results for the same subjects on repeated administration. Can be viewed as the extent to which a measure is FREE FROM ERROR. If the instrument has little error, it is ________. A correlation coefficient is used to determine this - if the reliability coefficient is high, about .70 or higher, test scores have little error and the instrument is said to be ______

Skew positive skew (-->) negative skew (<--)

the degree to which a distribution of scores is not normally distributed - -

Validity

the degree to which a test measures what it purports to measure for the specific purpose for which it is used. It is SITUATION SPECIFIC - depending on the purpose and population, an instrument could be this for some purposes and not others

Reliability test-retest Parallel-Forms Reliability (aka equivalence) Inter-Rater Reliability Internal Consistency

the degree to which the assessment tool produces consistent and stable results Four types:

content

the instrument contains items drawn from the domain of items which could be included Ex. Two professors of Psych 101, create a final exam which covers the important content they both teach

face

the instrument looks valid Ex. A math test has math items

z-score z / z score

the mean is 0; the standard deviation is 1.0. The range for the standard deviation is -3.0 to 3.0 the __ in ____ should remind you of ZERO which is the mean of this distribution

Skew

a measure of how far a distribution of scores departs from the symmetry of the normal curve

Mode

the most common or frequent score that occurred in a group of scores. If no score occurs more than once, the data set doesn't have one

Predictive

the predictions made by the test are confirmed by later behavior (criterion) Ex. The scores on the Graduate Record Exam predict later grade point average

Psychometrics

the process or study of psychological measurement

Concurrent

the results of the test are compared with other tests' results or behaviors (criteria) at or about the same time Ex. Scores of an art aptitude test may be compared to grades already assigned to students in an art class

Forced Choice Items

the use of two or more specific response options on a survey

Variance

this is simply the square of the standard deviation The variance does not describe the dispersion of scores as well as the standard deviation.
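The relationship between the two measures can be checked with Python's standard statistics module. A minimal sketch with made-up scores, treating them as a whole population (pvariance and pstdev); the sample versions, variance and stdev, divide by n - 1 instead.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean of 5

variance = statistics.pvariance(scores)  # average squared deviation from the mean
sd = statistics.pstdev(scores)           # square root of the variance

print(variance, sd)
```

The standard deviation is in the same units as the scores, while the variance is in squared units, which is one reason the standard deviation is usually the easier of the two to interpret.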

Stability Two weeks

this is test-retest reliability obtained using the same instrument on both occasions; the same group is tested twice. The results of the two administrations are correlated. The length of time and intervening experiences may influence reliability. ________ is a good time between administrations

Range Inclusive range

this is the highest score minus the lowest score. Some researchers talk of ________, which is the high score minus the low score plus one (1)
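Both versions can be sketched in a couple of lines. The function names and scores are made up for illustration.

```python
def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)

def inclusive_range(scores):
    """Highest score minus lowest score, plus one."""
    return max(scores) - min(scores) + 1

scores = [55, 70, 98]  # hypothetical scores
print(score_range(scores), inclusive_range(scores))
```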

Social desirability

this is the tendency for test takers to respond in ways they perceive to be socially desirable

semantic differential

this scale asks respondents to report where they are on a dichotomous range between two affective polar opposites Very Good _________ _________ _________ Very bad

Standard Deviation mean of all the deviations

this value describes the variability within a distribution of scores We use the symbol SD to signify this of a sample this is essentially the _______________

behavioral observation

type of assessment used to document the behavior of clients or research subjects

scale nominal ordinal interval ratio

used to categorize and/or quantify variables four ___ of measurement:

observation as appraisal technique

with this technique, you observe samples from a stream of behavior. In observation, you may use schedules, coding systems, and record forms

Ethical Issues in Testing

• Counselor must be adequately trained and earn any certifications and supervision necessary to administer and interpret the test • Test must be appropriate for needs of client • Client must provide informed consent and must understand the purpose and scope of any test • Test results must remain confidential • Test must be validated for the specific client and be unbiased toward the race, ethnicity, and gender of the client

