ASSESSMENT AND TESTING
intelligence testing
A subset of assessment focused on intellectual and cognitive functioning; it measures a broad range of cognitive capabilities and generally results in an IQ score
mean
Add all the scores and divide by the number of scores
measures of central tendency describe
Attempt to describe a distribution based on typical or average performance or scores
Alternate forms reliability
Alternate forms reliability is a way of assessing stability using different, but equivalent, versions of a test. To create alternate forms of a test, individual items are selected from a pool of test questions designed to assess a given construct. These forms should have similar score means, variances, item difficulties, and correlations with other measures. Example: counselors are required to take an exam to become nationally certified counselors, and that exam is produced in multiple equivalent forms.
Correlation coefficients (r)
This is a number between -1 and +1 that indicates the direction and strength of the relationship. As r approaches +1, strength increases in a direct and positive way; as r approaches -1, strength increases in an inverse and negative way; as r approaches 0, the relationship is weak or nonexistent (at zero, there is no relationship)
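The direction-and-strength behavior described above can be illustrated by computing r directly from paired scores. The sketch below is a minimal plain-Python implementation of the Pearson correlation coefficient; the score lists are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Perfectly direct relationship -> r = +1
print(pearson_r([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0
# Perfectly inverse relationship -> r = -1
print(pearson_r([1, 2, 3, 4], [40, 30, 20, 10]))  # -1.0
```

Real score data rarely reach exactly +1 or -1; intermediate values indicate weaker relationships.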
Range
Computed by subtracting the lowest score from the highest score
Assessment procedures can be
Formal: rigor has been demonstrated in test development (e.g., good, valid, and reliable). Informal: rigor has not been demonstrated in the test development
Purpose of Assessment
Four general purposes: 1) screening, 2) identification and diagnosis, 3) intervention planning, 4) progress and outcome evaluation
Describing scores measures of central tendency
Helps to put more meaning to scores • Tells you something about the "center" of a series of scores • Three common measures of central tendency are: mean, median, and mode
The normal curve
It is unimodal, which means that it has a single point of maximum frequency or maximum height • Virtually all scores fall between -3 and +3 standard deviations from the mean, with approximately: • 68% of scores falling between -1 and +1 standard deviations • 95% of scores falling between -2 and +2 standard deviations • 99.7% of scores falling between -3 and +3 standard deviations
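The 68/95/99.7 proportions can be checked numerically from the normal cumulative distribution: the proportion of scores within ±k standard deviations is erf(k/√2). A minimal sketch using only the standard library:

```python
import math

def within_sd(k):
    """Proportion of a normal distribution falling within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {within_sd(k):.1%}")
# within ±1 SD: 68.3%
# within ±2 SD: 95.4%
# within ±3 SD: 99.7%
```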
Internal consistency
In internal consistency, you are looking at individual items to see if those items are consistent with one another and thus represent a singular construct or trait that is being assessed by the test. This method is used to determine whether errors associated with content sampling are present
Increasing reliability
Increase the number of test items • Write understandable, unambiguous items • Use selected-response (multiple choice) instead of constructed-response (essay) items • Ensure items are not too difficult or too easy • Provide clearly stated administration and scoring procedures • Require training before individuals can administer, grade, and interpret a test
Select and implement assessment methods
Interviews: background information including family history, work and education background, social history, environmental factors, etc. Tests: evaluation of a person's cognitive functioning, knowledge, skills, abilities, or personal traits. Observation: may be used to record or monitor a client's behavior in a particular setting
Variance
The average amount of variability in a group's scores. It is computed as the average of the squared deviations of values from the mean
measures of variability refer to
The degree to which scores are spread out and differ from one another
What is a test
Tests are a subset of assessment yielding scores based on collected data. Example: finding the sum of correct items on a multiple-choice exam
Test
Tests are instruments designed to measure specific attributes of an individual including knowledge or skill level, intellectual functioning, aptitude, interests, psychological symptoms, etc.
standards for scoring errors
The 2000 Guidelines of the International Test Commission (ITC) state that test users should apply quality control procedures to the scoring, analysis, and reporting process and should maintain high standards • Scoring of assessments should be conducted properly and efficiently so that results are reported accurately and in a timely manner
Brief history of intelligence testing
The First test pioneered by Binet and Simon• Revised by Lewis Terman • Over the years, many new models of intelligence were defined • Many new intelligence tests were created based on the models
KR Formulas and coefficient Alpha
The Kuder-Richardson formulas were developed to serve as an alternative to the split-half method of assessing internal consistency. As in the split-half approach, a single test is administered in a single session. The difference is that rather than arbitrarily dividing the test into two equivalent halves, a statistical process is applied to determine the split-half reliability of all possible combinations of test halves that can be created by separating the items.
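Coefficient alpha (which generalizes the KR formulas to items that are not scored simply right/wrong) can be sketched in a few lines. The item-score matrix below is hypothetical; each inner list holds one item's scores across the same examinees.

```python
def variance(xs):
    """Population variance: average squared deviation from the mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(item_scores):
    """Coefficient alpha: item_scores is a list of items,
    each a list of scores for the same examinees."""
    k = len(item_scores)
    totals = [sum(col) for col in zip(*item_scores)]  # total score per examinee
    item_var = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / variance(totals))

# Two items that rank examinees identically -> perfectly consistent
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))  # 1.0
```

Higher alpha indicates that the items behave consistently and plausibly measure a single construct.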
Standard deviation
The average distance of test scores from the mean
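The three variability measures (range, variance, standard deviation) follow directly from their definitions. A minimal sketch with a hypothetical score list:

```python
import math

scores = [70, 75, 80, 85, 90]  # hypothetical test scores

rng = max(scores) - min(scores)  # highest minus lowest
mean = sum(scores) / len(scores)
var = sum((s - mean) ** 2 for s in scores) / len(scores)  # average squared deviation
sd = math.sqrt(var)  # square root of the variance

print(rng, var, sd)  # 20 50.0 7.071...
```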
Factors that can impact the validity of score interpretations: Contextual factors
When interpreting results, test users should consider the relationship of the test to the instructional program, the opportunity to learn, quality of the educational program, work and home environment, and other factors. For example, if the test does not align to curriculum standards and how those standards are taught in the classroom, then the test results may not provide useful information
Z-scores
Z scores convert test scores into standard deviation units, typically ranging from -3.0 to +3.0. The mean of the z-score distribution is 0 and the standard deviation is 1.0
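The conversion is simply the raw score's distance from the mean, divided by the standard deviation. A sketch using a hypothetical IQ-style scale (mean 100, SD 15):

```python
def z_score(raw, mean, sd):
    """Convert a raw score to standard deviation units."""
    return (raw - mean) / sd

print(z_score(115, 100, 15))  # 1.0  (one SD above the mean)
print(z_score(70, 100, 15))   # -2.0 (two SDs below the mean)
```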
An effective assessment report is
a document that outlines the presenting problem that led the client to seek counseling, summarizes the services provided, and addresses the outcomes of those services that were provided
what is intelligence
the ability to acquire and apply knowledge
skilled counselors begin the process of screening their client when?
Almost immediately. Nonverbal communication is an interesting phenomenon: appearance, motor skills (movements), motivation, who accompanies the client, and whether the presentation is voluntary or involuntary
Case Conceptualization
can best be defined as "a method and clinical strategy for obtaining and organizing information about a client, understanding and explaining the client's situation and maladaptive patterns, guiding and focusing treatment, anticipating challenges and roadblocks, and preparing for successful termination"
screening and initial assessments intake interview:
Demographic information • Referral reasons • Current situation • Previous counseling experience • Birth and developmental history • Family history • Medical history • Educational and/or vocational background (Video: Intake and Assessment Role-Play Part 1 - Referral and Presenting Problems, YouTube)
Frequency distribution
A distribution is simply a set of scores. A frequency distribution orders a set of scores from high to low and lists the corresponding frequency of each score. Frequency distributions are useful in allowing examiners to see 1) the entire set of scores at a glance, 2) whether scores are generally high or generally low, and 3) whether scores are concentrated in one area or spread out
Selecting assessment instruments and strategies
Does the instrument's manual provide clear and detailed instructions about administration procedures? Does the manual provide sufficient information about scoring, interpreting, and reporting results? Is the instrument biased? What level of competency is needed to use the instrument: A, B, or C?
Identify the problem
Does this student have a learning disability? If so, does he/she qualify for special education or related services? Is this child ready to begin kindergarten? Does this adult have PTSD? Is this individual suicidal? Does this parent have a mental disorder that might interfere with parenting? What are this individual's vocational interests?
Five steps in selecting an assessment instrument: step five
evaluate and select an assessment instrument or strategy - When selecting which assessment instrument and strategies to use, counselors must evaluate instruments on the basis of several factors, including purpose, test scores, reliability, validity, etc.
Five steps in selecting an assessment instrument: step one
identify the type of information needed - Counselors determine what information to collect based on the purpose of the assessment. If you were looking to screen a client to see if there was a need for acute psychiatric care at the inpatient level, what information would be needed to help you make this decision? Information on presenting signs and symptoms, the presence of any psychotic features, and any verbalization of threats to harm self or others would support your decision to admit this client to an inpatient care facility
The Methods for collecting assessment information
interviews, tests, observations
observation
is an assessment method that involves watching and recording the behavior of an individual in a particular environment. It is a way of seeing what a person actually does
Interrater
is used when we want to assess the level of agreement between two or more raters in their evaluation of a particular outcome. The most effective way to assess interrater reliability is to correlate the scores obtained independently by each of the raters
What practical issues should be considered for this instrument?
The time required for administration • Cost of the instrument • Format • Readability • Administration procedures • Scoring procedures • Interpretation
true score
the true reflection of one's ability, skills, or knowledge
Time sampling error
Time sampling errors result from repeated administrations of a test to the same individual. When an individual takes the same test multiple times, the scores obtained will most likely vary due to factors such as fatigue
Presenting assessment reports, three kinds
To the adult client • To minors • Debriefing
activities during administration
• The counselor may begin with a final check to see that all is in order • Examiners need to check on materials, lighting, ventilation, work space, etc. • Counselors need to deliver verbatim the instructions given in the test manual • Counselors need to establish rapport with the examinee • The examiner should record any test behavior or other critical incidents that may increase or reduce an individual's opportunity to perform to capacity
purpose of intelligence testing
• To assist in determining giftedness • To assess mental retardation • To identify certain types of learning disabilities • To assess intellectual ability following an accident, the onset of dementia, substance abuse, disease processes, or trauma to the brain • Admissions processes at certain private schools • As part of a personality assessment battery to aid in understanding the whole person
The areas counselors assess are in diagnosis
• the history of the client's presenting problem • the developmental history of the problem • any precursors and consequences related to the problem • the existence of any strengths or weaknesses the client possesses that may need to be highlighted or addressed. Use: the DSM
Charles Spearman's g theory
(1863-1945) Two-factor approach: Spearman postulated that performance on intelligence tests is based on • a general ability factor (g) and • one or more specific factors (s) • The g factor represented a measure of general intelligence that underlies performance on a wide variety of tasks, while the s factors were specific learned skills that can influence intelligence performance
The assessment process is divided into four parts
1) Identify the problem 2) Select and implement assessment methods 3) Evaluate the assessment information 4) Report assessment results and make recommendations
Scoring performance assessment components
1. One or more dimensions or attributes on which performance is rated 2. Descriptors or examples that illustrate each dimension or attribute being measured (e.g., poor, fair, good, excellent) 3. A rating scale for each dimension (e.g., unsatisfactory, below satisfactory, satisfactory, and exemplary)
Correlation coefficients (r)
Correlation is the statistical expression of the relationship between two sets of scores (or variables). Positive correlation: an increase in one variable accompanied by an increase in another, a "direct" relationship. Negative correlation: an increase in one variable accompanied by a decrease in the other, an "inverse" relationship
Content sampling error
Content sampling errors are related to the development and construction of tests. In some cases, it would be nearly impossible to create a test with enough items to sufficiently assess every aspect of the dimension or construct being assessed. E.g., an IQ test
activities after administration
Counselors may need to collect materials according to a predetermined order, counting the test booklets and answer sheets and arranging them all face-up
Five steps in selecting an assessment instrument: step three
Determine the methods for obtaining information - Counselors may use interviews, tests, and observations when they have established appropriate competency. However, the counselor should pay attention to external variables, including setting, timing, financing, and legal involvement
Diagnosis
Diagnosis refers to the process of learning more about the client and his or her problem
Evaluate the assessment information
Evaluation involves scoring, interpreting, and integrating information obtained from all assessment methods and sources to answer referral questions. Using basic skills such as statistical concepts, psychometric principles, and procedures, counselors can organize collected data in the following ways: • Document any significant findings that clearly identify problem areas • Identify convergent findings across methods and sources • Identify and explain discrepancies in information across methods and sources • Arrive at a tentative formulation or hypothesis of the individual's problem • Determine the information to include in the assessment report
percentiles
Express the examinee's relative position on a norm-referenced test. Percentile ranks range from 0 to 100 and indicate the percentage of scores that fall below the examinee's score. Percentiles are not equal-interval measurements
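A percentile rank is just the share of norm-group scores falling below a given score. A minimal sketch with a hypothetical norm group (note that some tests instead count scores at or below the examinee's score; this sketch uses strictly below):

```python
def percentile_rank(score, norm_scores):
    """Percentage of scores in the norm group that fall below the given score."""
    below = sum(1 for s in norm_scores if s < score)
    return 100 * below / len(norm_scores)

norm = [60, 65, 70, 75, 80, 85, 90, 95]  # hypothetical norm group
print(percentile_rank(80, norm))  # 50.0 - half the norm group scored lower
```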
Factor analysis
Factor analysis also analyzes the relationships among variables to simplify the description of data by reducing the number of necessary variables
Factors that can impact the validity of score interpretations: Psychometric Factors
Factors such as the reliability, norms, standard error of measurement, and validity of instrument can impact an individual's scores and the interpretation of the test results
Selecting assessment instruments and strategies level A test
Level A tests are those that are designated for general use. They do not require advanced training or education. To administer a Level A test, users would need to familiarize themselves with the administration, scoring, and result-reporting protocols for the instrument. This information is typically found in an accompanying test manual. An example of a Level A test would be the Self-Directed Search (Holland, 1994).
Level B test selecting instruments and stratagies
Level B tests require users to possess technical knowledge related to the practice of assessment. This includes an understanding of instrument development, psychometric issues (reliability and validity), test score properties, and appropriate test usage. Individuals interested in using Level B tests should have completed graduate coursework in assessment as part of their master's degree in counseling, psychology, or a related field. Most test publishers also require those interested in using these tests to document that they have appropriate licensing and credentialing in their field. Examples of Level B assessments are the Myers-Briggs Type Indicator and the Sixteen Personality Factor Questionnaire (16PF). When you begin your career as a counselor following the completion of your master's degree, these are the tests you most likely will be using
level c test when selecting an assessment instrument and strategy
Level C is the highest level of qualification. Level C tests require advanced education and training. An earned doctorate in counseling, psychology, education, or a related field is required. In addition to the general assessment coursework required for Level B tests, potential users must also have had coursework in either a specific instrument (e.g., Wechsler Adult Intelligence Scale-Fourth Edition) or a class of instruments (e.g., intelligence testing). An example of a Level C test would be the Minnesota Multiphasic Personality Inventory-Second Edition (MMPI-2). Counselors interested in using Level C tests in their practice should check to see if there are any state laws or regulations that may limit their use of certain assessments or tests from this level. In some states, the use of testing is protected and only members of certain professions (counselors, psychologists, social workers) can use specified tests
Thurstone's multifactor approach
Louis L. Thurstone, an American psychologist who lived from 1887 to 1955, did not believe that g was the only factor constituting intelligence, nor did he support the idea that a single IQ score fully and comprehensively assessed intelligence. Seven primary mental abilities: • Verbal comprehension - ability to understand ideas expressed in word form • Number ability - ability to perform basic mathematical processes accurately and rapidly • Word fluency - ability to speak and write fluently • Perceptual speed - ability to perceive things quickly, such as visual details and similarities and differences among pictured objects • Spatial ability - ability to visualize and form relationships in three dimensions • Reasoning - ability to derive rules and solve problems inductively • Memory - ability to recognize and recall information such as numbers, letters, and words
Median
Middle score • Odd number of scores = exact middle score • Even number of scores = average of the two middle scores
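The three measures of central tendency (mean, median, mode) can be computed in plain Python; the score list is hypothetical and has an odd number of scores, so the median is the exact middle score.

```python
from collections import Counter

scores = [85, 90, 90, 70, 95]  # hypothetical test scores

mean = sum(scores) / len(scores)  # add scores, divide by the count
ordered = sorted(scores)
mid = len(ordered) // 2
# odd count -> exact middle score; even count -> average of the two middle scores
median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
mode = Counter(scores).most_common(1)[0][0]  # most frequent score

print(mean, median, mode)  # 86.0 90 90
```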
Five steps in selecting an assessment instrument: Step four
search assessment resources - Mental Measurements Yearbook, Tests in Print, publishers' websites or catalogs, manuals, research literature, internet resources, etc.
Scoring errors
Occur frequently, regardless of who is scoring the instrument or the scorer's level of experience with testing • Include assignment of incorrect score values to individual responses, incorrectly converting raw scores to derived scores, and making calculation errors
Correlation coefficients (r)
Plotting two sets of scores from the previous examples on a graph: place Person A's SAT score on the x-axis and his/her GPA on the y-axis, then continue this for Persons B, C, D, etc. This process forms a scatterplot
Treatment plan
Problems, goals, objectives, interventions
qualifications to use standardized tests depend on at least four factors
Purpose of testing • Characteristics of tests • Settings and conditions of test use • Roles of test selectors, administrators, scorers, and interpreters
Regression
Regression is a statistical method related to correlation but is primarily used for prediction. Regression is the analysis of relationships among variables for the purpose of understanding how one variable may predict another
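Simple linear regression fits a least-squares line so one variable can predict another. The sketch below uses hypothetical SAT/GPA pairs (echoing the scatterplot example elsewhere in these notes); the data are invented for illustration.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

# hypothetical SAT scores predicting GPA
slope, intercept = fit_line([400, 500, 600], [2.0, 3.0, 4.0])
print(round(slope * 550 + intercept, 2))  # 3.5 - predicted GPA for SAT 550
```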
Three commonly used measures of variability are
Range, variance, and standard deviation
Standard scores
Refers to scores that have been converted to an interpretable scale that has a set mean and standard deviation. Standard scores allow individual test scores to be interpreted in terms of the normal curve
Report assessment results and make recommendations
Reporting results and making recommendations involve the following steps: • Describing the individual being assessed and his/her situation • Reporting general hypotheses about the individual • Supporting these hypotheses with assessment information • Proposing recommendations related to the original reason for referral
Stanine
Scores range from 1 to 9. Mean of 5 with a standard deviation of 2
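Because stanines have a mean of 5 and a standard deviation of 2, a z score converts to a stanine by scaling and clipping to the 1-9 range. A minimal sketch:

```python
def stanine(z):
    """Convert a z score to a stanine: mean 5, SD 2, clipped to 1-9."""
    return max(1, min(9, round(z * 2 + 5)))

print(stanine(0.0))   # 5 (average)
print(stanine(2.5))   # 9 (clipped at the top)
print(stanine(-3.0))  # 1 (clipped at the bottom)
```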
Administering Assessment Instrument
Self-administered • Individually administered • Group administered • Computer administered • Video administration • Audio administration • American Sign Language • Nonverbal
Types of Norm-Referenced Scores
Standard scores • Normal curve • Z-scores • T-scores • Stanine • Percentile
T-scores
T scores use a fixed mean of 50 and a fixed standard deviation of 10. T scores eliminate the need for decimals and negative values. T scores are calculated using the z score as a base
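Since T scores are built on z scores, the conversion is T = 50 + 10z. A minimal sketch:

```python
def t_score(z):
    """T score: fixed mean of 50 and fixed SD of 10; removes negatives and decimals."""
    return 50 + 10 * z

print(t_score(-2.0))  # 30.0 (a negative z becomes a positive T)
print(t_score(1.5))   # 65.0
```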
Five steps in selecting an assessment instrument: step two
identify available information - The existing information often provides useful, relevant information about a client that may be used in the current assessment process (intake questionnaires, biographical data, preliminary diagnosis, family information, educational history, grades, etc.). Once you are able to quantify what you already know, you can look to select a test that helps you fill in the gaps and address those areas where you have an incomplete picture of the client or presenting situation.
Multiple Methods and Multiple Sources
Counselors should gather assessment information using multiple methods (interviews, tests, observations) and from multiple sources. The interview, for example, is a face-to-face meeting of the assessment professional and the client
Level of acceptable reliability
The level of acceptable reliability depends on: the construct being measured the way the test scores will be used the method used for estimating reliability
What is Assessment
The term assessment refers to any systematic procedure for collecting information that is used to make inferences or decisions about the characteristics of a person.
Classical test theory
Test scores are never completely consistent, and there will always be at least some degree of variance. • To better understand the concept of measurement error, we have to understand classical test theory (CTT). • CTT, often referred to as the true score model, describes a set of psychometric procedures that can be used to test the reliability, difficulty, and discriminatory properties of test items and scales
Factors that can impact the validity of score interpretations: Test-taker factors
The test taker's group membership (Gender, age, ethnicity, race, SES, relationship status) is a critical factor in the interpretation of test results. Test users should evaluate how the test taker's group membership can affect his or her test results
Methods of estimating reliability
There are four primary methods through which reliability can be assessed. These methods are: (a) test-retest, (b) alternate forms, (c) internal consistency, and (d) inter-rater reliability
test-retest
This approach is used when you are interested in assessing how reliable or stable scores on an instrument are over time. The process of assessing test-retest reliability is straightforward: a set of participants is tested using the same test on two separate occasions. The observed scores of the participants on each test administration are then compared and a reliability coefficient is calculated. In this design, instability over time can be viewed as the primary source of measurement error
sources of measurement
Time-sampling error • Content-sampling error • Interrater differences • Other sources of error: quality of test items, test length, test-taker variables, test administration
Standard error of measurement (SEM)
measures an individual's test score fluctuations (due to error) if he/she took the same test repeatedly. The SEM is not a measure of reliability but can be used to create confidence intervals around specific observed scores. Confidence intervals establish the upper and lower limits within which a test taker's true score is expected to fall.
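A common formula for the SEM is SD x sqrt(1 - reliability), and a confidence interval is the observed score plus or minus a multiple of the SEM. The sketch below uses hypothetical values (an IQ-style scale with SD 15 and reliability .91) and a 1.96 multiplier for an approximate 95% band.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement from the test's SD and reliability."""
    return sd * math.sqrt(1 - reliability)

def confidence_interval(observed, sd, reliability, z=1.96):
    """Approximate 95% band in which the true score is expected to fall."""
    margin = z * sem(sd, reliability)
    return observed - margin, observed + margin

s = sem(15, 0.91)  # hypothetical: SD 15, reliability .91
print(round(s, 1))  # 4.5
low, high = confidence_interval(100, 15, 0.91)
print(round(low, 2), round(high, 2))  # 91.18 108.82
```

Note how higher reliability shrinks the SEM and tightens the interval around the observed score.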
Multiple Linear Regression
multiple IVs and one DV
Scales of measurement
nominal ordinal interval ratio
Classical test theory consists of
observed score, true score, and error: observed score = true score + error (X = T + E)
Simple linear regression
one IV and one DV
Reliability
refers to the degree to which test scores are dependable, consistent, and stable across repeated administrations. Multiple factors can impact reliability. Reliability refers to the results obtained, not to the instrument itself
INTEGRATING ASSESSMENT INTO COUNSELING PRACTICE: six concepts
screening and initial assessments • diagnosis • case conceptualization • developing client treatment plans • writing effective assessment reports • presenting assessment reports
Models of intelligence
Spearman's g theory • Cattell's fluid and crystallized intelligence • Thurstone's primary mental abilities • Vernon's hierarchical model of intelligence • Sternberg's triarchic theory of intelligence • Piaget's cognitive developmental theory • Gardner's multiple intelligences • emotional intelligence • information-processing view
There are various sources of measurement error that can affect the reliability of test scores
Systematic error • Unsystematic error (random error)
Observed score
the actual score obtained on a test