Testing and assessment terminology (Delta)
criterion-referenced test
testing where exams are marked against an agreed, fixed standard rather than against other candidates' performance, as is the case in public examinations e.g. IELTS
CEFR
the Common European Framework of Reference which provides a common basis for language education in the areas of curriculum design, methodology and assessment e.g. It is divided into levels such as A2, B1, etc. Note: It was created by the Council of Europe.
discrimination
the degree to which a test or test item distinguishes between stronger and weaker students e.g. the DELTA M1 exam is designed to discriminate between Pass-, Merit- and Distinction-level candidates
spin-off
the effect a test result may have on learner, teacher and class activities following the test e.g. a teacher finds from a progress test that students are unclear about something, so spends more time on it to improve that skill/area.
test battery
a collection of tests that usually assess a variety of different attributes e.g. there are 3 components in Cambridge Preliminary (PET): 1. reading plus writing 2. listening 3. speaking Note: it is composed of a group of activities which may be scored individually or whose scores may be combined into one overall score
progress test
a form of assessment focussing on course content, administered periodically during a course to monitor the learning process e.g. at the end of a unit to test grammar and vocabulary covered in that unit Note: also called formative tests. Note: They are often set to encourage revision.
formal assessment
a pre-planned, systematic attempt to ascertain what students have learned e.g. a series of planned end-of-course tests
test
one form of assessment, generally completed individually, often considered objective e.g. an end-of-unit multiple choice grammar _______
predictive validity
the extent to which the results accurately indicate whether or not the test taker will be able to perform a specified "real life" task. e.g. The IELTS exam is intended to predict whether or not the test taker will be able to cope, as far as English language competence is concerned, when following a university course in English.
scorer reliability
the extent to which the score of the test is similar no matter which scorer is scoring the test e.g. two people scoring a test agree on the test score, meaning the test has high ______ _______ Note: You can have higher scorer reliability with objective/discrete-item testing. Note: You can improve scorer reliability to some extent via standardisation and/or clear marking guidelines.
distractors
the incorrect alternatives in a multiple choice test e.g. a, c and d in: Thanks for .... me about it. a) tell b) telling c) told d) you tell
rubric
the instructions for a test item e.g. 'Provide a definition and an example or illustration for each of the terms below.'
impact
the social consequences which the format or content of a test has e.g. its effect on the education system, employment policy or the economy
indirect test
the testing of underlying knowledge and competencies that make up the basic or advanced requirements of the skill e.g. an __________ ______ of telephone skills might include asking learners to complete a gapfill of common phrases used during phone calls
objective marking
a way of scoring a test in which every marker would give the same score e.g. a true/false test with a set of correct answers
subjective marking
a way of scoring a test in which different markers could give different scores e.g. essays without marking criteria
scoring validity
whether the test is only scored on the areas which it is supposed to be testing e.g. if it's a listening test, you shouldn't penalise for spelling mistakes as this reduces ______ ________
analytic scoring
a process for scoring that uses a description of major features to be considered when assessing writing or speaking e.g. the pronunciation, language use, discourse management and communicative achievement marks given in Cambridge Main Suite speaking exams Note: The commonly analysed features in writing tasks include content, organisation, cohesion, style, register, vocabulary, grammar, spelling and mechanics Note: The commonly analysed features in speaking tasks include pronunciation, fluency, accuracy and appropriateness.
barrier test
a test aimed at differentiating between learners' abilities, and therefore their suitability for a particular course e.g. SAT tests in the USA Note: a certain score is often needed on these tests for course entry
achievement test
a test designed to assess what a person has learned / their progress either during or at the end of a course e.g. an end of unit test when studying from a coursebook Note: also called a summative test
reliability
a test is reliable if the same students with the same amount of knowledge could take the same test at different times and get more or less the same results e.g. if two B1 groups take the same vocabulary test after the same course of study, they should get a similar spread of results
placement test (AKA entry test)
a test or assessment done to place a student in the correct level/class at the start of a course e.g. the test you do when you first join a language school to work out which group you should be in
proficiency test
a test taken to assess candidates' language ability regardless of any course of study or previous training e.g. IELTS e.g. Cambridge First (FCE) e.g. Cambridge Advanced (CAE)
objective test
a test that gives the same score when different people mark it e.g. multiple-choice, matching, true/false, short-answer, and fill-in tests Note: scoring doesn't require interpretation / judgment Note: It makes test results more reliable as there is no variation in scoring.
diagnostic test
a test that helps the teacher to determine students' areas of weakness and to inform syllabus decisions and guide instruction e.g. a grammar test at the start of a course to find out which areas learners need to work on Note: It helps to identify specific areas, (sub-)skills or knowledge that are problems for the student. Note: It can be targeted at one specific area e.g. listening, or it can be a more general test.
direct test
a test that measures ability directly by requiring test takers to perform tasks designed to approximate an authentic target language use situation as closely as possible e.g. a ______ ______ of writing might include asking test takers to write an essay
cloze test
a test where words are deleted from a text at regular intervals (originally every 7th word), leaving blanks for the learners to fill e.g. Cambridge First uses an open _____ as part of the Use of English section e.g. when you write a text and __________ every seventh word blank for learners _____ fill in.
discrete-item test
a test which assesses knowledge of individual language items e.g. a gapfill asking learners to complete dependent prepositions
integrative test (AKA holistic test)
a test which combines assessment of a number of language elements at the same time e.g. writing an essay tests grammar structures, lexis, text layout, register, etc.
norm-referenced test
a test which compares test takers to each other rather than against external criteria e.g. once you get all of the results, you allocate a pass to the top 80% of learners, regardless of their scores
subjective test
a test which requires the markers to evaluate and not just to follow a mark sheet e.g. impressionistic marking of an essay using band descriptors (for example, for IELTS or CAE) Note: using band descriptors is an attempt to make scoring more objective Note: results could be unreliable. Note: some judgment is required by the scorer during the scoring process
continuous assessment
a type of testing where some or all of the work that learners do during a course is considered by the teacher on a regular basis and contributes to the final grade given to learners e.g. a series of reading assessments across the course make up the final grade for reading Note: It may also include regular monitoring of classroom performance and contributions.
concurrent validity
a type of validity obtained by comparing the result of a new test with those of an older one, already known as valid e.g. the results of this year's DELTA M1 exam are compared to last year's - if they are similar, the tests have _____ ______
assessment
all the ways we can learn about a learner's progress and achievement, formal or informal e.g. we can notice progress by observing learners in class over time and giving marks for participation in speaking activities Note: they can be completed alone or with others and be objective or subjective
self-assessment
an evaluation of your own performance in a given situation, ideally done in line with clear criteria e.g. completing a __________ checklist at the end of a unit to assess your own performance in it
exam
an official, formal test, generally considered to be objective e.g. Cambridge First Note: these are often set by an external body Note: these often carry consequences such as access to a course, or proof of a level
summative assessment
assessment data collected after instruction to evaluate a student's mastery of the curriculum objectives and a teacher's effectiveness at instructional delivery e.g. an end-of-course test
informal assessment
assessment that results from a teacher's spontaneous, day-to-day observations of how students behave and perform in class or for homework e.g. noticing learners' ability to understand instructions given in class Note: It can be very subjective.
formative assessment
assessment used throughout the teaching of a lesson and/or unit to gauge students' understanding and inform and guide teaching e.g. an assessed speaking task in the middle of a unit, the results of which inform future speaking lessons in the unit
can-do statements
criteria concerning what a learner can successfully do with language in the real world, against which they can be assessed or self-assess themselves e.g. B1 Level: CAN write letters or make notes on familiar or predictable matters.
fresh starts
each task in an assessment is independent and does not rely on a good response to a previous one e.g. key word transformations in the Cambridge exams e.g. if a student has to write about 6 different pictures but is unable to produce a sentence for one of them, she may still do the others and gain some marks
aptitude test
a test that estimates the probability that a person will be successful in learning a specific new skill e.g. a verbal reasoning test Note: you might use this type of test when selecting candidates for a role that would require them to learn a new language.
open questions
questions with no fixed answer/response and respondents can answer in any way they wish, and with whatever language they want to e.g. writing tasks for Cambridge Main Suite exams
construct validity
refers to whether an assessment tests what it claims to test and nothing else e.g. a progress test at the end of a unit to test if students have learnt grammatical structures from the unit, but the test requires them to understand/use new vocabulary = reduced ________ _________
closed questions
questions which require learners to provide a specific piece of information e.g. sentence matching, multiple choice Note: T_______ __________ ask for specific information and typically elicit a short, one- or two-word answer, a "yes" or "no," or a forced choice
holistic scoring
scoring method in which a single score is given to represent the overall quality of the essay across all dimensions e.g. the global mark given in Cambridge Main Suite speaking exams Note: It focuses on the text as a communicative whole. Note: It can be more subjective than analytic scoring, reducing validity / reliability.
face validity
the extent to which respondents feel that an assessment measures what it is supposed to measure, i.e. it appears to be "a good test" to the people using it - the learners taking the test, their parents, the teachers and institutions putting them in for the test, etc. e.g. writing an email and doing an oral interview to test a student who needs to write emails and speak to clients in their work is likely to have high ______ ________
validity
the extent to which a test measures or predicts what it is supposed to e.g. there are various types of _______, for example 'content ______', 'face _______', 'scoring _______', 'construct _______', 'predictive _______' [if you mention one or two of these in your example, that should be enough] Note: It is not usually possible to assess everything covered on a language course, but any test should give a balanced sample of the most important areas.
practicality
the extent to which an assessment instrument or procedure is inexpensive, easy to use and takes a suitable amount of time to administer and score compared to the importance of the assessment e.g. an assessment requiring 2 assessors for a 20-minute test for formative assessment is unlikely to be practical
content validity
the extent to which an assessment is representative of the range of knowledge and skills you are aiming to test e.g. if students who need to write business letters are asked to write about their last holiday, this has low ______ _______ [Although it involves writing, the content in terms of language and discourse, as well as layout, would not be relevant.] Another way to think about this: the assessment tests only what has been covered in the preceding course (for progress tests), the preceding course syllabus (for achievement tests) or the test specification (for proficiency tests)
washback (OR backwash)
the extent to which the format and content of assessments influences teaching and syllabus decisions e.g. if a written exam only includes essays, do students only learn how to write essays, or other genres too?