Types of Reliability and Validity

Internal Reliability

Internal reliability assesses the consistency of results across items within a test.

Improving inter-rater reliability

Clearly defining behavioural categories; running a pilot study; test-retest.

Construct Validity

Construct validity is the appropriateness of inferences made on the basis of observations or measurements (often test scores), specifically whether a test measures the intended construct (theory).

External Reliability

External reliability refers to the extent to which a measure varies from one use to another.

External Validity

External validity is the validity of generalised (causal) inferences in scientific research, usually based on experiments. In other words, it is the extent to which the results of a study can be generalised to other situations and to other people.

Criterion Validity

In psychometrics, criterion validity is a measure of how well one variable or set of variables predicts an outcome based on information from other variables, and will be achieved if a set of measures from a personality test relate to a behavioural criterion on which psychologists agree.

Predictive Validity

In psychometrics, predictive validity is the extent to which a score on a scale or test predicts scores on some criterion measure. For example, the validity of a cognitive test for job performance is the correlation between test scores and a criterion such as supervisor performance ratings.

Internal Validity

Internal validity refers to how well an experiment is done, especially whether it avoids confounding (more than one possible independent variable [cause] acting at the same time). The less chance for confounding in a study, the higher its internal validity is.

Temporal Validity

Refers to how far the time period in which a study was conducted affects the relevance of its findings, e.g. a study on attitudes conducted decades ago cannot be expected to have temporal validity because attitudes in society shift quickly.

Face validity

The degree to which a test appears, on the surface, to measure what it is supposed to measure.

Validity

The extent to which a research technique actually measures the behaviour it is claimed to measure

Reliability

The extent to which the measurement of a particular behaviour is consistent.

Split-half Reliability

The split-half method assesses the internal consistency of a test, such as a psychometric test or questionnaire. It measures the extent to which all parts of the test contribute equally to what is being measured. This is done by comparing the results of one half of a test with the results from the other half. A test can be split in half in several ways, e.g. first half and second half, or odd- and even-numbered items. If the two halves provide similar results, this suggests that the test has internal reliability. The method can also be used to improve reliability: items on separate halves that correlate poorly (e.g. r = .25) should be removed or rewritten. The split-half method is a quick and easy way to establish reliability. However, it is only effective with large questionnaires in which all questions measure the same construct, so it is not appropriate for tests that measure different constructs. For example, the Minnesota Multiphasic Personality Inventory has subscales measuring different behaviours such as depression, schizophrenia and social introversion, so the split-half method would not be an appropriate way to assess the reliability of this personality test.
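As a rough illustration of the odd/even split described above, the sketch below totals each participant's odd-numbered and even-numbered items and correlates the two sets of totals. The function names (`pearson_r`, `split_half_r`) and the response data are hypothetical, and Pearson's r is assumed as the correlation measure.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_r(responses):
    """Correlate odd-item totals with even-item totals.

    `responses` holds one list of item scores per participant.
    """
    odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)

# Hypothetical data: 5 participants, 6 items each scored 1-5.
responses = [
    [5, 4, 5, 5, 4, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [4, 5, 4, 4, 5, 4],
    [1, 2, 1, 1, 2, 1],
]
r = split_half_r(responses)
print(f"split-half r = {r:.2f}")
```

A value of r close to 1 means the two halves rank participants in nearly the same order, which is what the definition treats as evidence of internal reliability.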

Test-Retest Reliability

The test-retest method assesses the external consistency of a test. Examples of appropriate tests include questionnaires and psychometric tests. It measures the stability of a test over time. A typical assessment would involve giving participants the same test on two separate occasions. If the same or similar results are obtained, then external reliability is established. A disadvantage of the test-retest method is that it takes a long time for results to be obtained. The interval between the two tests is also important: if it is too brief, participants may recall information from the first test, which could bias the results. Alternatively, if the interval is too long, the participants could have changed in some important way, which could also bias the results.
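One common way to put a number on "the same or similar results" is to correlate the scores from the two occasions. The sketch below assumes Pearson's r for that purpose; the function name `retest_correlation` and the scores are hypothetical.

```python
from math import sqrt

def retest_correlation(first, second):
    """Pearson correlation between scores from two testing occasions."""
    if len(first) != len(second):
        raise ValueError("each participant needs a score on both occasions")
    n = len(first)
    mean_1, mean_2 = sum(first) / n, sum(second) / n
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / sqrt(ss_1 * ss_2)

# Hypothetical scores for the same 5 participants on two occasions.
time_1 = [12, 18, 25, 30, 22]
time_2 = [14, 17, 27, 29, 21]
stability = retest_correlation(time_1, time_2)
print(f"test-retest r = {stability:.2f}")
```

The closer the correlation is to 1, the more stable the scores are across the two occasions.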

Ecological validity

The extent to which findings can be generalised to everyday life, i.e. the extent to which the task represents a real-world task. For example, memory studies that ask participants to remember lists of words usually have low ecological validity.

Inter-rater reliability

The extent to which observers agree on the behaviour they observe. Agreement of 80% or above is treated as high.
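The 80% figure can be checked with a simple percent-agreement calculation. The sketch below is a minimal version that assumes two observers who each assign one behavioural category per observation; the function name `percent_agreement` and the observation data are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of observations on which two raters chose the same category."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must code the same observations")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

# Hypothetical codings of the same 10 observations by two observers.
rater_a = ["play", "fight", "play", "rest", "play",
           "rest", "fight", "play", "rest", "play"]
rater_b = ["play", "fight", "play", "rest", "fight",
           "rest", "fight", "play", "rest", "rest"]

agreement = percent_agreement(rater_a, rater_b)
is_high = agreement >= 80  # threshold taken from the definition above
print(f"agreement = {agreement:.0f}%, high = {is_high}")
```

Here the observers match on 8 of the 10 observations, so agreement is 80% and just meets the threshold.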

