Assessment and Testing

Objective Test Items

Standardized questions with clear correct or incorrect answers; not open to any interpretation

Range

Subtraction of the lowest score from the highest score
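
As a quick illustration of the definition above, here is a minimal Python sketch. The function name `score_range` and the sample scores are made up for this example.

```python
def score_range(scores):
    """Return the highest score minus the lowest score."""
    return max(scores) - min(scores)

scores = [72, 85, 90, 64, 85]  # hypothetical test scores
print(score_range(scores))     # 90 - 64 = 26
```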

Skew

A measure of how far a distribution of scores deviates from the symmetry of the normal curve

Mode

The most common or frequent score in a group of test scores. If no score occurs at least twice, the group doesn't have a mode.
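
One way to sketch this rule in Python, assuming that a score must occur at least twice to count as a mode. The helper name `modes` and the sample scores are hypothetical, and tied scores are all returned.

```python
from collections import Counter

def modes(scores):
    """Return every most-frequent score, or [] if no score repeats."""
    counts = Counter(scores)
    top = max(counts.values(), default=0)
    if top < 2:  # no score occurs twice, so there is no mode
        return []
    return sorted(s for s, c in counts.items() if c == top)

print(modes([70, 85, 85, 90]))  # [85]
print(modes([70, 80, 90]))      # []
```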

Major Types of Tests and Inventories

-Achievement Tests
-Aptitude Tests
-Intelligence Tests
-Occupational Tests
-Personality Tests

Test

A measuring device or procedure

Variance

How widely individuals in a group vary; a measure of how far the data are spread from the mean. It equals the square of the standard deviation.
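
A minimal sketch of the calculation, assuming the population form of variance (dividing by the number of scores rather than by one less). The function name and data are illustrative only.

```python
import math

def variance(scores):
    """Population variance: the mean squared distance from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5
var = variance(scores)
print(var)             # 4.0
print(math.sqrt(var))  # 2.0, the standard deviation
```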

Score

Numerical value associated with a test or measure

Test-Retest Reliability

Involves administering the same test twice to a group of individuals, then correlating the scores to evaluate stability

Regression to the Mean

Statistical tendency of a data series to gravitate towards the center of a distribution

Z-score

Also referred to as a standard score, Z-scores measure the number of standard deviations a raw score is from the mean. Z-scores use zero as the mean.
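
The conversion can be sketched as follows. The function name `z_score` and the example mean and standard deviation are assumptions for illustration.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

# Hypothetical test with a mean of 70 and a standard deviation of 10
print(z_score(85, 70, 10))  # 1.5
print(z_score(70, 70, 10))  # 0.0, a score at the mean
```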

Isabella is ready to join the workforce, but she is unsure what kind of job she would like. A friend recommends to Isabella a test that is used to assess what kind of career she might like or what kind of work she might be good at. What kind of test is Isabella's friend probably referring to? a. Intelligence Test b. Occupational Test c. Achievement Test d. Personality Test

B: Isabella's friend is probably referring to an occupational test. Occupational tests assess skills and interests as they relate to occupational choices.

Predictive validity is used for what purpose? a. To measure abstract traits b. To predict future behavior c. To determine the ability to master skills or knowledge d. To compare test scores

B: Predictive validity predicts behavior in the future.

Subjective

Individual perceptions/interpretations based on feelings and opinions, but not necessarily based on fact

Trait

Method of describing individuals through observable characteristics that are unique and distinguishable.

Horizontal Test

A test covering material across various subjects

Psychological Assessment

An informal process of testing, interviews, or observations used to determine, diagnose, or develop treatment plans. It can include personality assessments, projective or subjective tests, intelligence tests, or diagnostic batteries.

Obtrusive Measurement

Assessment tools (such as observation) conducted with the knowledge of the individual being assessed. Measurement conducted without the individual's knowledge is unobtrusive.

Which theorist developed ideas on fluid and crystallized intelligence? a. John Ertl b. Alfred Binet c. Sir Francis Galton d. Raymond Cattell

D: Raymond Cattell proposed the concept of fluid and crystallized intelligence in the 1940s.

External Validity

Describes how well results from a study can be generalized to the larger population

Reliability

Four types of reliability are test-retest, parallel forms, inter-rater, and internal consistency. Each type checks that a tool produces consistent and stable results, and each must be quantified. Reliability doesn't indicate validity.

David Wechsler

American psychologist who, in the 1950s, developed intelligence tests for adults and children. His tests were adept at identifying learning disabilities in children. He began his career developing personality tests for the US military. He disagreed with some aspects of the Stanford-Binet Intelligence Scale, and believed intelligence had both verbal and performance components. He also believed factors other than pure intellect influenced intellectual behavior. Wechsler's tests are still used today for adults, as well as school-age and primary-age children. They include the Wechsler Adult Intelligence Scale (WAIS-IV), the Wechsler Intelligence Scale for Children (WISC-IV), and the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-III).

Concurrent Validity

Is used to determine if measures can be substituted, such as taking an exam in place of a class. Measures must take place concurrently to accurately test for validity

Normative Format

Means of testing to compare individuals to others

Achievement Tests

Measure knowledge of a specific subject and are primarily used in education. Examples include exit exams for high school diplomas and tests used in the Common Core for educational standards. The General Educational Development (GED) test and the California Achievement Test are both achievement tests that measure learning.

Difficulty Index

Measure of the proportion of examinees who answer test items correctly
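
A small sketch of the proportion for a single test item, assuming each examinee's response is recorded as True (correct) or False (incorrect). The function name and data are hypothetical.

```python
def difficulty_index(responses):
    """Proportion of examinees who answered the item correctly."""
    return sum(responses) / len(responses)

# Hypothetical item: three of four examinees answered correctly
print(difficulty_index([True, True, False, True]))  # 0.75
```

A value near 1 means most examinees answered the item correctly; a value near 0 means few did.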

Aptitude Tests

Measure the capacity for learning and can be used as part of a job application. These tests can measure abstract/conceptual reasoning, verbal reasoning, and/or numerical reasoning. Examples include the Wonderlic Cognitive Ability Test, the Differential Aptitude Test (DAT), the Minnesota Clerical Test, and the Career Ability Placement Survey (CAPS).

Rating Scale

Process of measuring degrees of experience and attitudes through questions

Construct Validity

Refers to whether a test measures the abstract trait or theory it is meant to measure, and isn't inadvertently testing another variable. For example, a math test with complex word problems may inadvertently be assessing reading skills. Its two subtypes are Convergent Validity and Discriminant Validity.

Median

Refers to the middle or center number in an ordered list of scores or data; also referred to as the midpoint. In a data set with an even number of values, the two middle numbers are typically averaged to determine the median.
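
The two cases (odd and even number of scores) can be sketched in Python like this. The function name and scores are illustrative.

```python
def median(scores):
    """Middle value of the ordered scores; average the two middle values if the count is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([90, 70, 80]))      # 80
print(median([90, 70, 80, 60]))  # 75.0
```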

Content Validity

Ensures that the test questions align with the content or study area. This can be measured by two subtypes of validation: Face Validity and Curricular Validity

Mrs. Louise favors Judge Benson above all the other judges in Wilmington, North Carolina. Judge Benson is tall, has blue eyes, speaks clearly, and smiles warmly anytime Mrs. Louise walks into the room. What kind of effect does Judge Benson have on Mrs. Louise? a. Halo effect b. Bystander effect c. Audience effect d. Bandwagon

A: The halo effect is a type of cognitive bias in which a positive perception of one trait influences views of a person's other traits. Mrs. Louise has a positive perception of Judge Benson due to his attractive appearance, and she is mistakenly extending that attractiveness to his personality.

Intelligence Tests

Measure mental capability and potential. One example is the Wechsler Adult Intelligence Scale (WAIS-IV), currently in its fourth edition. The Wechsler Intelligence Scale for Children (WISC-IV), also in its fourth edition, is used for children ages 6 to 16 years 11 months of age, and can be completed without reading or writing. There's a separate version of the test for children aged 2 years 6 months to 7 years 7 months, known as the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-III). Examples of other intelligence tests are the Stanford-Binet Intelligence Scale, the Woodcock-Johnson Tests of Cognitive Abilities, and the Kaufman Assessment Battery for Children.

Ethical Issues in Testing

A variety of ethical issues must be considered before, during, and after any test or assessment is administered. To begin, the counselor must be adequately trained and must obtain any certifications and supervision necessary to administer and interpret the test. Tests must be appropriate for the needs of the specific client. Next, the client must provide informed consent, and they must understand the purpose and scope of any test. Test results must remain confidential, which includes access to any virtual information. Finally, tests must be validated for the specific client and be unbiased toward the race, ethnicity, and gender of the client.

Face validity is a subtype of content validity. Content validity ensures that the questions align with the content. What is face validity? a. Common sense b. Verifiable by professionals c. Measures success d. Measures abstract traits or theories

A: Face validity is obvious to any user, so it's considered common sense.

Fluid intelligence refers to what type of abilities? a. Thinking and acting quickly, solving new problems b. Utilizing learned skills c. Adapting to new situations d. Developing opportunity from adversity

A: Fluid intelligence refers to the ability to think and act quickly and solve new problems. It is independent of education and culture, making other choices incorrect.

Stanine (STAndard NINE)

A nine-point scale used to convert a test score to a single digit. Stanines are always positive whole numbers from one to nine.
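
The card doesn't say how the conversion is done. The sketch below assumes the common convention of a stanine scale with a mean of 5 and a standard deviation of 2, so each band is half a standard deviation wide and the band edges fall at z-scores of ±0.25, ±0.75, ±1.25, and ±1.75. The function name and cutoff list are illustrative.

```python
from bisect import bisect_right

# Assumed z-score boundaries between the nine stanine bands
CUTOFFS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

def stanine(z):
    """Map a z-score to a stanine from 1 to 9."""
    return bisect_right(CUTOFFS, z) + 1

print(stanine(0.0))   # 5, an average score
print(stanine(2.3))   # 9, the top band
print(stanine(-3.0))  # 1, the bottom band
```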

Test Battery

A group or set of tests administered to the same group and scored against a standard

Criterion Validity

Measures success and the relationship between a test score and an outcome, such as scores on the SAT and success in college. Its two subtypes are Predictive Validity and Concurrent Validity.

JP Guilford

American psychologist who conducted psychometric studies of human intelligence and creativity in the early 1900s. He believed intelligence tests were limited and overly one-dimensional, and didn't factor in the diversity of human abilities, thinking, and creativity.

Halo Effect

An overgeneralized positive view of a person from limited data. An example for this would be favoring a politician for their attractiveness and assuming that attractiveness extends to their ethical beliefs or personality.

What kind of test is used to explore the client's unconscious attitudes or motivations? a. Objective test b. Projective test c. Free choice test d. Vertical test

B: A projective test would be given to explore the client's unconscious attitudes or motivations.

Who was one of the first to measure intelligence by way of a structured test? a. Raymond Cattell b. Carl Jung c. Alfred Binet d. Sir Francis Galton

C: Alfred Binet created one of the first intelligence tests, which when brought to the US, became the Stanford-Binet.

Which test did Robert Williams develop? a. Culture Fair Intelligence Test b. Wechsler Adult Intelligence Scale c. Black Intelligence Test of Cultural Homogeneity d. Minnesota Clerical Test

C: The Black Intelligence Test of Cultural Homogeneity, which addressed racial inequalities of traditional intelligence tests.

The Black Intelligence Test of Cultural Homogeneity, designed by Robert Williams, was used to do what? a. Show cultural dissimilarities b. Expose culture bias in testing c. Test one's cultural background d. All of the above

D: All of the above. The Black Intelligence Test of Cultural Homogeneity (BITCH Test) was a culture-specific test designed to expose cultural and racial bias in testing and to prove the dissimilarities of cultures within the US.

Dichotomous Items

Opposing choices on a test, such as yes/no or true/false options

Face Validity

Refers to a commonsense view that a test measures what it should and looks accurate from a non-professional viewpoint

Standard Error of Measurement (SEM)

Relates to test reliability and estimates the difference between an individual's true score and observed score. Since no test is without error, the SEM describes how widely observed scores on the same test would spread around the true score. Also referred to as the "standard error" of a score.
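
The card doesn't give a formula. A commonly used one is SEM = SD × √(1 − reliability), where SD is the standard deviation of the test scores and reliability is the test's reliability coefficient. The sketch below assumes that formula; the function name and the example values are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), assuming the classical formula."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: standard deviation of 15, reliability coefficient of 0.91
print(round(standard_error_of_measurement(15, 0.91), 2))  # 4.5
```

The more reliable the test, the smaller the SEM; a perfectly reliable test (reliability of 1) would have an SEM of zero.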

Consequential Validity

Refers to the social consequences of testing. Though not all researchers feel it's a true measure of validity, some believe that a test must benefit society in order to be considered valid.

Discriminant Validity

Shown when tests that measure different attributes produce results that don't correlate with one another.

Reliability

Reliability in testing is the degree to which the assessment tool produces consistent and stable results

Projective Test

Responses to ambiguous images that are intended to uncover unconscious desires, thoughts, or beliefs

T-Score

Specific to psychometrics; used to standardize test scores and convert them to positive numbers. T-scores express how many standard deviations a score is from the mean on a scale where the mean is always 50 and the standard deviation is 10.
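
A minimal sketch of the conversion from a z-score, assuming the usual convention of a mean of 50 and a standard deviation of 10. The function name is illustrative.

```python
def t_score(z):
    """Convert a z-score to a T-score (assumed mean 50, standard deviation 10)."""
    return 50 + 10 * z

print(t_score(0))     # 50, a score at the mean
print(t_score(-1.5))  # 35.0, still a positive number
```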

Correlation Coefficient

Statistic that describes the strength and direction of the relationship between two variables, ranging from -1 to +1. In positive correlation, both variables move in the same direction. In negative correlation, the variables move in opposite directions.
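
The card doesn't name a specific coefficient. The sketch below assumes the Pearson correlation coefficient, one widely used version, and uses made-up paired scores.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

hours_studied = [1, 2, 3, 4]       # hypothetical data
test_scores = [2, 4, 6, 8]         # rises with hours studied
errors_made = [8, 6, 4, 2]         # falls with hours studied

print(round(pearson_r(hours_studied, test_scores), 3))  # 1.0 (positive)
print(round(pearson_r(hours_studied, errors_made), 3))  # -1.0 (negative)
```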

Scale

Used to categorize and/or quantify variables. The four scales of measurement are nominal, ordinal, interval, and ratio.

Administering Tests to Clients

-As part of the counseling process, it can be necessary for the counselor to administer tests or assessments to measure and evaluate the client
-Tests are a more formulated means to quantify information and guide treatment options, or to develop goals
-Assessments are more informal
-They can include surveys, interviews, and observations

Types of Validity

-Content Validity
  -Face Validity
  -Curricular Validity
-Criterion Validity
  -Predictive Validity
  -Concurrent Validity
-Construct Validity
  -Convergent Validity
  -Discriminant Validity
-Consequential Validity

Reasons to administer a test or assessment

-Help the client gain a better understanding of themselves
-Provide counselors with concrete data
-Ensure a client's needs are within the counselor's scope of practice
-Assist in decision-making and goal-setting for the counseling process
-Provide insight to both the client and the counselor
-Assist in setting clear expectations for clients
-Help the counselor gain a deeper understanding of their client's needs
-Set benchmarks to ensure client and counselor are making progress towards their goals
-Evaluate the effectiveness of counseling interventions

Types of Reliability

-Test-Retest Reliability
-Parallel-Forms Reliability (aka equivalence)
-Inter-Rater Reliability (aka inter-observer)
-Internal Consistency

Interpreting Test Scores

-To begin, any test or assessment should be given under controlled circumstances. The counselor should follow any instructions provided in the test manual. Once completed, the counselor and client can discuss the results.
-Best practices for interpreting results:
  -Counselor must thoroughly understand the results
  -Counselor should explain results in easily understood terms, and be able to provide supporting details and norms as needed
  -Counselor should understand and explain average scores and the meanings of results
  -Counselor should allow the client to ask questions and review aspects of the test to ensure understanding
  -Counselor must explain the ramifications and limitations of any data obtained through testing

John Ertl

A professor working in Canada in the 1970s who invented a neural efficiency analyzer to more effectively measure intelligence. He believed traditional intelligence tests were limited to understanding an abstract degree of intelligence. Ertl's system measured the speed and efficiency of electrical activity in the brain using an electroencephalogram (EEG).

Which of the following is NOT a reason a counselor would administer a test or assessment to a client? a. Aid the counselor in sharing pertinent information with the client's loved ones b. Ensure the client's needs are within the counselor's scope of practice c. Help the client gain a better understanding of themselves d. Evaluate the effectiveness of counseling interventions

A: Sharing information with the client's loved ones is not a reason to administer a test; doing so violates counselor/client confidentiality laws. Ensuring the client's needs are within the scope of practice, helping the client gain a better understanding of themselves, and evaluating the effectiveness of counseling interventions are all reasons a counselor would administer a test to a client.

Why did John Ertl invent a neural efficiency analyzer to more effectively measure intelligence? a. He believed traditional intelligence tests were limited to understanding an abstract degree of intelligence b. His rival, Robert Williams, was gaining popularity with his own intelligence test, and Ertl wanted one that surpassed Williams' genius. c. He had been injured in an intelligence test in the past and wanted to develop an intelligence test that was safe for the subject. d. He believed that genetic factors were the most influential indicator of intelligence

A: He believed traditional intelligence tests were limited to understanding an abstract degree of intelligence.

Sir Francis Galton

An English anthropologist and explorer who, in the late 1800s, was one of the first individuals to study intelligence. A cousin of Charles Darwin, Galton coined the term eugenics and believed that intelligence was genetically determined and could be promoted through selective parenting.

Inter-Rater Reliability

Checks to see that raters (those administering, grading, or judging a measure) do so in agreement. Each rater should value the same measures and at the same degree to ensure consistency. Inter-rater reliability prevents overly subjective ratings, since each rater is measuring on the same terms

Which of the following is the type of reliability that involves administering two different versions of an assessment that measure the same set of skills and then correlating the results? a. Internal consistency b. Inter-Rater c. Test-Retest d. Parallel-forms

D: Parallel-forms reliability involves administering two different versions of an assessment that measure the same set of skills and then correlating the results.

Coefficient of determination

Denoted by R^2; the proportion of variance in the dependent variable that's predictable from the independent variable. It equals the square of the correlation coefficient.
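
As a quick arithmetic illustration, assuming a single independent variable and a correlation coefficient that is already known. The function name and the value 0.8 are made up.

```python
def coefficient_of_determination(r):
    """Square of the correlation coefficient r."""
    return r ** 2

# A hypothetical correlation of 0.8 means about 64 percent of the variance is predictable
print(round(coefficient_of_determination(0.8), 2))   # 0.64
print(round(coefficient_of_determination(-0.8), 2))  # 0.64, the sign drops out
```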

Percentile

Determines how test scores rank on a scale of 100. A percentile indicates the percentage of individuals who scored at or below a given score. For example, a test taker who scores in the 65th percentile performed as well as or better than 65 percent of the test takers.
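
A small sketch using the "at or below" reading of the definition. Other conventions exist (for example, counting only scores strictly below), so treat this as one assumption among several; the function name and scores are hypothetical.

```python
def percentile_rank(score, all_scores):
    """Percentage of scores in the group that are at or below the given score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)

group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # hypothetical scores
print(percentile_rank(80, group))  # 60.0
```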

Rapport

Development of trust, understanding, respect, and liking between two people; essential for an effective therapeutic relationship

Bell Curve

Illustration of data distribution that resembles the shape of a bell

Arthur Jensen

Supported the g Factor Theory and believed intelligence consisted of two distinct sets of abilities. Level I accounted for simple associative learning and memory, while Level II involved more abstract and conceptual reasoning. Jensen also believed that genetic factors were the most influential indicator of intelligence. In 1998, he published the book The g Factor: The Science of Mental Ability.

Appraisal

Professionally administered assessment tools and tests used to evaluate, measure, and understand clients

Validity

Refers to how well a test or assessment measures what it's intended to measure. For example, an assessment on depression should only measure the degree to which an individual meets the diagnostic criteria for depression. Though validity does indicate reliability, a test can be reliable but not valid. There are four major types of validity, with subtypes.

Vertical Test

Same-subject tests given to different levels or ages.

Measure

Score assigned to traits, behaviors, or actions

Q-Sort

Self-assessment procedure requiring subjects to sort items relative to one another along a dimension, such as agree or disagree.

Convergent Validity

Uses two sets of tests to determine that the same attributes are being measured and correlated. For example, two separate tests can measure students similarly.

Which of the following is NOT a personality test? a. Minnesota Multiphasic Personality Inventory (MMPI-2) b. Beck Depression Inventory c. Stanford-Binet Intelligence Scale d. Myers-Briggs Type Indicator (MBTI)

C: The Stanford-Binet Intelligence Scale is an intelligence test.

Psychometrics

The process or study of psychological measurement

Forced Choice Items

The use of two or more specific response options on a survey

Behavioral Observation

Type of assessment used to document the behavior of clients or research subjects

Alfred Binet

French psychologist who, in the early 1900s, along with medical student Theodore Simon, developed the first test to determine which children would succeed in school. His initial test, the Binet-Simon, focused on the concept of mental age, and included memory, attention, and problem-solving skills. In 1916, his work was brought to Stanford University and developed into the Stanford-Binet Intelligence Scale. It's since been revised multiple times and is still widely used.

Personality Tests

Can be objective (rating scale or self-report based) or projective (based on responses to ambiguous stimuli), and help the counselor and client understand personality traits and underlying beliefs and behaviors. The Myers-Briggs Type Indicator (MBTI) provides a specific psychological type, reflecting the work of Carl Jung. It's often used as part of the career development process. Other rating scale personality tests include the Minnesota Multiphasic Personality Inventory (MMPI-2), the Beck Depression Inventory, and the Tennessee Self-Concept Scale. The Rorschach (inkblot) and the Thematic Apperception Test are both projective tests, designed to reveal unconscious thoughts, motives, and views.

Parallel-Forms Reliability

Involves administering two different versions of an assessment that measure the same set of skills, knowledge, etc. and then correlating the results. A test can be written and split into two parts, thus creating parallel versions.

Raymond Cattell

In the 1940s, began developing theories on fluid and crystallized intelligence. His student, John Horn, continued this work. The Cattell-Horn Theory hypothesized that over one hundred abilities work together to create forms of intelligence. Fluid intelligence is defined as the ability to think and act quickly and to solve new problems, skills that are independent of education and enculturation. Crystallized intelligence encompasses acquired and learned skills, and is influenced by personality, motivation, education, and culture. In 1949, Cattell and his wife, Alberta Karen Cattell, founded the Institute for Personality and Ability Testing at the University of Illinois. Cattell developed several assessments, including the 16 Personality Factor Questionnaire and the Culture Fair Intelligence Test.

Validity

Indicates how well any given test or assessment measures what it's intended to measure. There are four major types of validity: content, construct, criterion, and consequential. Validity does indicate reliability.

Curricular Validity

Is evaluated by experts, and measures that a test aligns with the curriculum being tested. For example, a high school exit exam measures the information taught in the high school curriculum.

Ipsative Format

Means of testing that measures how individuals prefer to respond to problems, people, and procedures and doesn't compare results to others.

Standard Deviation

Measure of dispersion of numbers; calculated by the square root of the variance

Mean

Provides the average of all scores; calculated by adding all given test scores and dividing by the number of scores.

Likert Scale

Rating scale measuring attitudes to a degree of like or dislike

Predictive Validity

Refers to how useful test scores are at predicting future performance

Internal Consistency

Refers to how consistently the items within a test or assessment measure the same thing, producing similar results each time. Questions on an assessment should be similar and in agreement, but not repetitive. High internal consistency indicates that a measure is reliable.
-Average Inter-Item Correlation is used to determine if scores on one item relate to the scores on all of the other items in that scale. Checking the correlation between each pair of items builds in redundancy, confirming that the same content is assessed with each question.
-Split-Half Reliability is the random division of questions into two sets. Results of both halves are compared to ensure correlation.
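
Split-half reliability can be sketched as follows. To keep the example repeatable, it splits the items into odd-numbered and even-numbered questions instead of dividing them at random, and it compares the halves with a Pearson correlation; both choices are assumptions, and the function names and response data are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def split_half_correlation(item_scores):
    """item_scores holds one list per examinee: 1 = item correct, 0 = incorrect."""
    first_half = [sum(row[0::2]) for row in item_scores]   # questions 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in item_scores]  # questions 2, 4, 6, ...
    return pearson_r(first_half, second_half)

# Hypothetical responses from five examinees on a six-item test
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
]
print(round(split_half_correlation(responses), 2))
```

A correlation closer to 1 suggests the two halves are measuring the same thing consistently.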

What do achievement tests measure? a. Academic potential b. Unconscious behaviors c. Learning d. Intellectual capacity

C: Achievement tests measure learning and are often given at the end of grade levels.

Occupational Tests

Can assess skills, values, or interests as they relate to vocational and occupational choices. Examples include the Strong Interest Inventory, the Self-Directed Search, the O*NET Interest Profiler, the Career Assessment Inventory, and the Kuder Career Interests Assessment.

Charles Spearman

English psychologist who was responsible for bringing statistical analysis to intelligence testing. In the early 1900s, Spearman proposed the g Factor Theory for general intelligence, which laid the foundation for analyzing intelligence tests. Prior to his work, tests weren't highly correlated with the factors they attempted to measure.

