SCALES OF MEASUREMENT, VALIDITY
T-Test
A test for comparing the means of two groups, such as in an experiment that applies a different condition to each group and looks for a difference in outcome. The t statistic is the ratio of the difference between the two group means (averages) to the dispersion of the scores.
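As a rough illustration of that ratio, the sketch below computes Student's pooled-variance t statistic for two independent groups. The card does not say which form of the t-test is meant, so the pooled (equal-variance) form is an assumption, and the function name and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's t for two independent groups (assumes equal variances)."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance uses the sample (n - 1) denominator.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    # Dispersion term: standard error of the difference between means.
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical scores for illustration only.
treatment = [12, 14, 15, 17, 17]
control = [10, 11, 12, 13, 14]
t = pooled_t_statistic(treatment, control)
print(round(t, 3))  # 2.535
```

The resulting t would then be compared against a t distribution with n_a + n_b - 2 degrees of freedom.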
Variable
A measured characteristic of a unit. A variable is a valid measure of a property when it is a relevant and appropriate representation of that property. It may be a measured characteristic of the unit or subject itself (e.g. height, weight, mood, age, temperature), or a condition separate from the unit under study that is of interest for its potential impact on the unit or subject (e.g. maturation, weather, family, height). The variable that is specifically being studied is called the dependent variable, and the variable(s) believed to have an effect on the dependent variable are the independent variable(s). The standard deviation is a measure of variability.
Dependent Variable
A variable whose changes are being studied; a response variable.
Independent Variable
A variable with an effect upon dependent variables being studied. An independent variable in an experiment is called a factor.
Criterion Validity
Also referred to as empirical or predictive validity; asks whether the measurement instrument correlates significantly with other variables that may be relevant. Most of the time, those other variables have previously been established as valid measures. There are two primary forms of criterion validity: predictive and concurrent.
Treatment
Any specific experimental condition applied to the units. A treatment is usually a combination of specific values (called levels) of each experimental factor.
Construct Validity
Asks whether the instrument actually taps the theorized psychological construct it proposes to measure. Construct validity examines the scale to determine whether the theoretical ideas or traits under consideration have been operationalized in the measure. To accomplish this, the measurement must be shown to have both convergent and discriminant validity.
Content Validity
Focuses on whether the items cover the content the instrument is meant to measure. No instrument can ask every question relevant to a client's issue, so content validity checks that the items form a representative sample of the content area relevant to the issue of focus. Face validity and logical content validity are used to establish it.
Range
Highest score minus the lowest score, plus 1. The range is a crude measure because it is based only on the highest and lowest scores, so it can be misleading when a distribution contains a very high or very low score.
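A minimal sketch of this definition, using hypothetical scores, shows how a single extreme score changes the range. The "plus 1" follows the card's definition; many texts define the range as simply highest minus lowest.

```python
def inclusive_range(scores):
    """Range as defined on this card: highest - lowest + 1."""
    return max(scores) - min(scores) + 1


# Hypothetical test scores for illustration only.
scores = [70, 72, 75, 78, 80]
scores_with_extreme = scores + [15]  # one very low score added

print(inclusive_range(scores))               # 11
print(inclusive_range(scores_with_extreme))  # 66
```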
Validity
In statistics, validity is concerned with the applicability of the measure to the characteristic being evaluated, that is, the extent to which the instrument accurately measures what it was designed to measure. This goes hand-in-hand with reliability. The soundness of a study's conclusion for the subjects of the study themselves is sometimes called internal validity. Generalization of the conclusion of a study to a larger population is sometimes called external validity.
Association In Bivariate Data
A systematic connection between changes in one variable and changes in another. When an increase in one variable tends to be accompanied by an increase in the other, the variables are positively associated. When an increase in one variable tends to be accompanied by a decrease in the other, the variables are negatively associated.
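One common way to quantify the direction of a linear association is Pearson's correlation coefficient, which is positive for positively associated variables and negative for negatively associated ones. The card does not name a specific coefficient, so using Pearson's r here is an assumption, and the data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)


# Hypothetical bivariate data: two measurements per student.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 77]
errors_made = [9, 7, 6, 4, 2]

r_positive = pearson_r(hours_studied, exam_score)   # above 0
r_negative = pearson_r(hours_studied, errors_made)  # below 0
print(r_positive > 0, r_negative < 0)  # True True
```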
standard scores
Observations expressed in standard deviation units about the mean. A standard score (z-score) is the observation minus the mean, divided by the standard deviation.
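A short sketch of converting raw scores to standard scores: each observation's deviation from the mean is divided by the standard deviation. The population standard deviation (`statistics.pstdev`) is used here as an assumption, and the scores are hypothetical.

```python
from statistics import mean, pstdev


def standard_scores(observations):
    """Express each observation in standard deviation units about the mean."""
    center = mean(observations)
    spread = pstdev(observations)  # population SD (divides by n)
    return [(x - center) / spread for x in observations]


# Hypothetical raw scores: mean is 5 and population SD is 2.
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]
z_scores = standard_scores(raw_scores)
print(z_scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

A score of 9 has a standard score of 2.0, meaning it lies two standard deviations above the mean.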
Units
The basic objects on which the experiment is done. When the units are human beings, they are called subjects.
Variance
The mean of the squares of the deviations of observations from their mean. Because variance is expressed in squared units, it is less convenient than the standard deviation for describing variability.
Standard Deviation
The positive square root of the variance. It describes the variability found within the distribution: the larger the SD, the greater the dispersion of scores around the distribution's mean. SD is an excellent measure of variability, especially when N is more than 120, but it is also used when N is less than 120.
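The two definitions can be sketched directly: variance as the mean of squared deviations, and standard deviation as its positive square root. This follows the cards' wording (dividing by n); sample estimates often divide by n - 1 instead. The function names and data are illustrative.

```python
from math import sqrt


def mean_squared_deviation(observations):
    """Variance as defined on the card: mean of squared deviations."""
    n = len(observations)
    center = sum(observations) / n
    return sum((x - center) ** 2 for x in observations) / n


def standard_deviation(observations):
    """Positive square root of the variance."""
    return sqrt(mean_squared_deviation(observations))


# Hypothetical observations with mean 3.
observations = [1, 2, 3, 4, 5]
print(mean_squared_deviation(observations))        # 2.0
print(round(standard_deviation(observations), 4))  # 1.4142
```

The standard deviation comes out in the same units as the original scores, while the variance is in squared units.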
Double-blind technique
When both the subjects and those who evaluate the outcome are ignorant of which treatment was given, a double-blind experiment exists.
parameter
a number describing a population. For example, the proportion of the population with some special property is a parameter that may be called p. In a statistical inference problem, population parameters are fixed numbers, but their values are unknown.
Behavioral observations
are a useful tool for both overt (headaches, walking, tantrums, sleeping, crying, laughing, eating) and covert (feeling, remembering, thinking) actions. The important element is that the behavior is measurable and countable by someone. The client can count the number of negative thoughts, the hours of sleep or the number of times he or she curses while driving in traffic.
Frequency measurements
are simply the count of how many times the target behavior occurs.
Duration counts
are the measure of time a behavior lasts.
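Frequency and duration can be read off the same observation log. The sketch below assumes each episode of the target behavior is recorded as a hypothetical (start, end) pair in seconds; that record format is an illustration, not something the cards prescribe.

```python
# Hypothetical log of tantrum episodes as (start_second, end_second).
episodes = [(0, 45), (120, 150), (300, 390)]

# Frequency measurement: how many times the behavior occurred.
frequency = len(episodes)

# Duration count: how long each occurrence lasted.
durations = [end - start for start, end in episodes]
total_duration = sum(durations)

print(frequency)       # 3
print(durations)       # [45, 30, 90]
print(total_duration)  # 165
```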
Convergent validity
explores whether a measure of a construct, such as depression, correlates with theoretically relevant variables, for example the amount of time a person spends crying, difficulty sleeping, decreased appetite or negative self-talk. With convergent validity, it is important to find statistically significant correlations between the instrument itself and the relevant variables.
Face Validity
explores whether the item appears to reach the content desired. It asks, "Does this appear to ask what it is supposed to ask?" While face validity is highly subjective, logical content validity follows a more systematic process.
statistic
is a number describing the sample data. For example, the proportion of the sample with some special property is a statistic that is often written p̂ (p-hat), to distinguish it from the population parameter p. Statistics change from sample to sample. Observed statistics provide information about unknown parameters.
mean
is the average; for a set of observations, the arithmetic average. It is the sum of the observations (x1 + x2 + ... + xn) divided by the number of observations (n). Every score in the distribution affects the mean. The mean can be used only on interval and ratio scales of measurement.
median
is the typical value. It is the midpoint of the observations when the data are arranged in increasing order: the score that divides the distribution in half. The value of the median is largely unaffected by extreme scores. The median can be used on ordinal, interval and ratio scales of measurement.
mode
is the most frequent value. When two or more values occur at the same highest frequency in a set of numbers, the set is either bimodal (a distribution with two most-frequently occurring scores) or multimodal (two or more most-frequently occurring scores). When all scores occur equally often, the distribution does not have a mode. The mode can be used on any scale of measurement: nominal, ordinal, interval or ratio.
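The three measures of central tendency can be compared with Python's standard `statistics` module. The sketch below uses hypothetical scores and replaces one score with an extreme value to show that the mean shifts while the median stays put.

```python
from statistics import mean, median, multimode

# Hypothetical scores; the second list swaps 9 for an extreme 90.
scores = [3, 5, 5, 6, 7, 8, 9]
scores_with_extreme = [3, 5, 5, 6, 7, 8, 90]

print(round(mean(scores), 2), median(scores))  # 6.14 6
print(round(mean(scores_with_extreme), 2), median(scores_with_extreme))  # 17.71 6

# multimode returns every value tied for the highest frequency.
print(multimode(scores))           # [5]    one mode
print(multimode([1, 1, 2, 2, 3]))  # [1, 2] bimodal
# When every score occurs equally often, multimode returns all of them,
# which corresponds to "no mode" in the sense used on the card.
print(multimode([1, 2, 3]))        # [1, 2, 3]
```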
relative frequency
of any value is the proportion, fraction or percent of all observations having that value. Data are univariate when only one variable is measured on each unit. Data are bivariate when exactly two variables are measured on each unit. Data are multivariate when more than one variable is measured on each unit.
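A relative frequency is a count divided by the total number of observations. The sketch below tabulates this for a hypothetical univariate data set of letter grades; the function name is illustrative.

```python
from collections import Counter


def relative_frequencies(observations):
    """Map each distinct value to its share of all observations."""
    counts = Counter(observations)
    total = len(observations)
    return {value: count / total for value, count in counts.items()}


# Hypothetical univariate data: one grade per student.
grades = ["A", "B", "A", "C", "A", "B", "A", "B"]
shares = relative_frequencies(grades)
print(shares)  # {'A': 0.5, 'B': 0.375, 'C': 0.125}
```

The relative frequencies always sum to 1 (or 100 percent).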
Predictive Validity
questions whether or not the instrument has a correlation to a future event.
Discriminant Validity
references the requirement that theoretically non-relevant variables, and variables dissimilar to the theoretically relevant ones, are not associated with scores on the measurement. The purpose of discriminant validity is to confirm that the instrument does not correlate with variables it should not correlate with because they are clinically irrelevant.
Concurrent Validity
references the instrument's correlation with an event occurring at the same time the measure is taken.
Logical content validity
references the method the developer used to ensure the required content was included in the test.
Ratio scale
a scale of measurement that provides for an absolute zero, which is not an arbitrary point. Any statistical measure that can be used on an interval scale can also be used on a ratio scale, in addition to geometric and harmonic means, coefficients of variation and logarithms. Interval and ratio scales are examples of quantitative scales. Quantitative properties are measurable and are used for scientific investigation and analysis.
Interval scales
scales that are measurable and linear, with equal intervals between values. An example of an interval scale is the Celsius scale. On an interval scale, the zero point is an arbitrary point, and negative values may be used. Ratios are meaningless when used with interval values; for example, 20 °C is not twice as hot as 10 °C.
Ordinal scales
scales that have an ordered set, such as 1st, 2nd, 3rd and so forth, but do not necessarily provide for a set measure. For example, in a marathon, rank order could be assigned, but the time between runners is not established by assigning rank. First and second place could be seconds or minutes apart. Nominal and ordinal data are examples of qualitative data.
Nominal scales
scales that simply name data, placing them into categories, but nothing more. An example might be in describing surfaces, such as hard, soft, coarse and smooth. Nominal scales provide for a standard set or structure, but no order.
interval measure
selects a discrete unit of time and observes that time block for the target behavior. If the behavior occurs, the block is marked in the affirmative. If the behavior does not occur, it is marked in the negative. It is an all-or-nothing recording. The interval measure is useful if a behavior has occurred at a high frequency for an extensive period of time, or if no discrete start and end can be delineated for timing it (as a duration count requires).
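The all-or-nothing marking can be sketched as follows. The function name, the 30-second interval length and the event times are all hypothetical; the point is only that each time block is marked once, no matter how many times the behavior occurs within it.

```python
def interval_record(event_times, session_length, interval_length):
    """Mark each time block True if the behavior occurred in it at all."""
    n_intervals = session_length // interval_length
    marks = [False] * n_intervals
    for t in event_times:
        index = int(t // interval_length)
        if 0 <= index < n_intervals:
            marks[index] = True  # all or nothing: repeats add nothing
    return marks


# Hypothetical observation: seconds at which the behavior was seen
# during a 180-second session split into 30-second blocks.
event_times = [12, 15, 47, 130, 131, 135]
marks = interval_record(event_times, session_length=180, interval_length=30)

print(marks)             # [True, True, False, False, True, False]
print(sum(marks))        # 3 blocks marked in the affirmative
print(len(event_times))  # 6 occurrences by a plain frequency count
```

Six occurrences collapse into three affirmative blocks, which is why this method suits high-frequency behaviors that are tedious to count one by one.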
Normal Distribution
• Half the observations fall above the mean and half fall below.
• 68% of the observations fall within one standard deviation of the mean: half of these (34%) fall within one standard deviation above the mean, and the other half (34%) fall within one standard deviation below the mean.
• Another 27% of the observations fall between one and two standard deviations away from the mean, so 95% (68% plus 27%) fall within two standard deviations of the mean.
• In all, 99.7% of the observations fall within three standard deviations of the mean.
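These percentages can be checked against the standard normal distribution using `statistics.NormalDist` from the Python standard library (available from Python 3.8). The helper name is illustrative; the figures on the card are rounded versions of the values computed here.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of observations within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_one = share_within(1)
within_two = share_within(2)
within_three = share_within(3)
between_one_and_two = within_two - within_one

print(round(within_one, 4))           # 0.6827
print(round(between_one_and_two, 4))  # 0.2718
print(round(within_two, 4))           # 0.9545
print(round(within_three, 4))         # 0.9973
```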
Normal Curve Or Bell-shaped Curve
• The largest cluster of scores falls in the center, and frequency decreases with progress towards the extremes in both directions.
• The bell curve is bilaterally symmetrical with a single peak in the center.
• Most distributions of human traits approximate the bell curve.
• In general, the larger the group, the more closely the distribution will resemble the theoretical normal distribution.
• The mean, median and mode fall at the same point in a normal curve.
• A distinguishing feature of a ratio scale is an absolute zero point.
• When the alpha level is set at .05 for a two-tailed test, there is a .025 probability of error in each tail of the normal curve.
• Standard deviation is associated with the mean.
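The split of alpha across the two tails can be illustrated with the standard normal curve. This sketch assumes a two-tailed test, which the .025-per-side figure implies; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
tail_area = alpha / 2  # .025 in each tail for a two-tailed test

# z value that leaves tail_area in the upper tail of the normal curve.
critical_z = NormalDist().inv_cdf(1 - tail_area)

# By symmetry the same area lies below -critical_z.
upper_tail = 1 - NormalDist().cdf(critical_z)
lower_tail = NormalDist().cdf(-critical_z)

print(round(critical_z, 2))  # 1.96
print(round(upper_tail, 3))  # 0.025
print(round(lower_tail, 3))  # 0.025
```

The critical value of about 1.96 is close to the "two standard deviations" that bound roughly 95% of a normal distribution.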