Schuyler W. Huck, "Reading Statistics and Research," 6th ed.
Latent variable
(from Latin: present participle of lateo, "lie hidden," as opposed to observable variables) a variable that is not directly observed but is instead inferred (through a mathematical model) from other variables that are observed (directly measured).
Leptokurtic
(of a frequency distribution or its graphical representation) having greater kurtosis than the normal distribution; more concentrated about the mean.
Intercept
(often labeled the constant) the expected mean value of Y when all predictors X = 0. In a regression equation with one predictor X, if X = 0 is a value that actually occurs, the intercept is simply the expected mean value of Y at that value.
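A minimal sketch in Python (hypothetical numbers, not from the text) of reading the intercept off a one-predictor regression fit:

    # Fit y = b0 + b1*x and read off the intercept b0: the expected
    # mean of y when x = 0.
    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # predictor; 0 is a real value here
    y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # outcome

    b1, b0 = np.polyfit(x, y, deg=1)          # returns slope, then intercept
    print(f"intercept = {b0:.2f}, slope = {b1:.2f}")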
Discriminant validity
(or divergent validity) tests whether concepts or measurements that are not supposed to be related are actually unrelated. Campbell and Fiske (1959) introduced the concept of discriminant validity within their discussion on evaluating test validity.
Standard error of measurement (SEM)
a measure of how much measured test scores are spread around a "true" score
Construct validity
"the degree to which a test measures what it claims, or purports, to be measuring." ... Modern validity theory defines construct validity as the overarching concern of validity research, subsuming all other types of validity evidence.
Familywise error rate
(FWE or FWER) the probability of coming to at least one false conclusion in a series of hypothesis tests. The term "familywise" comes from the idea of a family of tests, the technical term for a series of hypothesis tests performed on the same data.
Post hoc comparisons
(Latin post hoc, "after this") comparisons made after analyzing the experimental data. They are often based on a familywise error rate: the probability of at least one Type I error in a set (family) of comparisons. Common post hoc tests include the Bonferroni procedure, Tukey's HSD, and the Scheffé test.
Factor
(also called an independent variable) is an explanatory variable manipulated by the experimenter. Each factor has two or more levels (i.e., different values of the factor). Combinations of factor levels are called treatments.
Multicollinearity
(also collinearity) is a phenomenon in which one predictor variable in a multiple regression model can be linearly predicted from the others with a substantial degree of accuracy.
Contingency table
(also known as a cross tabulation or crosstab) is a type of table in a matrix format that displays the (multivariate) frequency distribution of the variables. ... They provide a basic picture of the interrelation between two variables and can help find interactions between them.
Content validity
(also known as logical validity) refers to the extent to which a measure represents all facets of a given construct. ... Disagreement about what facets make up a construct (such as a personality trait) will prevent a measure from attaining high content validity.
Measurement error
(or observational error) is the difference between a measured value of a quantity and its true value. In statistics, an error is not a "mistake". Variability is an inherent part of the results of measurements and of the measurement process.
Simple main effect
(sometimes called a simple effect) a difference among particular cell means within the design. More precisely, a simple effect is the effect of one independent variable within one level of a second independent variable.
Omega squared
(ω2) is a measure of effect size, or the degree of association for a population. It is an estimate of how much variance in the response variables is accounted for by the explanatory variables.
Vectors
quantities that have magnitude and direction and that are usually represented by part of a straight line with the given direction and with a length representing the magnitude.
Box plot
A box plot is a graphical rendition of statistical data based on the minimum, first quartile, median, third quartile, and maximum. The term "box plot" comes from the fact that the graph looks like a rectangle with lines extending from the top and bottom.
Line of best fit
A line of best fit is a straight line drawn through the center of a group of data points plotted on a scatter plot. Scatter plots depict the results of gathering data on two variables. The line of best fit shows whether these two variables appear to be correlated and can be used to help identify trends occurring within the dataset.
Split-half reliability coefficient
A measure of consistency where a test is split in two and the scores for each half of the test are compared with one another. ... A test that is consistent most likely is measuring something; the experimenter just does not know what that "something" is.
Quota sample
A sampling method of gathering representative data from a group. As opposed to random sampling, quota sampling requires that representative individuals are chosen out of a specific subgroup. For example, a researcher might ask for a sample of 100 females, or 100 individuals between the ages of 20 and 30.
Scheffé test
A statistical test that is used to make unplanned comparisons, rather than pre-planned comparisons, among group means in an analysis of variance (ANOVA) experiment.
Bonferroni test
A type of multiple comparison test used in statistical analysis. When an experimenter performs enough tests, he or she will eventually end up with a result that shows statistical significance even if there is none; the Bonferroni test compensates by evaluating each comparison against the alpha level divided by the number of comparisons.
Rectangular
A uniform distribution, also called a rectangular distribution, is a probability distribution that has constant probability.
Concomitant variable
A variable that is observed in a statistical experiment, but is not specifically measured or utilized in the analysis of the data. It is sometimes necessary to correct for concomitant variables in order to prevent distortion of the results of experiments or research.
Ungrouped frequency distribution
AKA "simple" data set which displays the categories and the frequency of responses on each one. There is no lumping together of those categories into any larger categories.
Conclusion
AKA Discussion
Subject
AKA participant; an individual taking part in the study in that role.
Participant
AKA subject; an individual taking part in the study in that role.
Analysis of covariance
(ANCOVA) allows one to compare one variable in 2 or more groups while taking into account (or correcting for) the variability of other variables, called covariates. Analysis of covariance combines one-way or two-way analysis of variance with linear regression (General Linear Model, GLM).
Dependent variable
The variable observed for change in an experiment, sometimes called an outcome variable; it is what the experimenter measures when the independent (experimental or predictor) variable is manipulated.
ANCOVA
Analysis of covariance is used to test the main and interaction effects of categorical variables on a continuous dependent variable, controlling for the effects of selected other continuous variables, which co-vary with the dependent variable.
Factor extraction
Available methods are principal components, unweighted least squares, generalized least squares, maximum likelihood, principal axis factoring, alpha factoring, and image factoring. ... Principal components analysis is used to obtain the initial factor solution.
Alpha
Also called the "significance level." Before you run any statistical test, you must first determine your alpha level; by definition, the alpha level is the probability of rejecting the null hypothesis when the null hypothesis is true.
Abstract population
But suppose, after asking my question to everyone now enrolled in this course, I want to generalize my "findings" to other groups of students who will take this course next semester and in the academic terms that follow. If that's the case, then you and your peers no longer are the population; instead, you constitute the sample from which I make my inference. This population would be abstract (not tangible) in nature because I cannot list or point to the students who will be enrolled in this course in future semesters. At present, those "future" students exist only in my imagination. That's why they would constitute an "abstract" population.
Degrees of freedom (df)
Degrees of freedom of an estimate is the number of independent pieces of information that went into calculating the estimate. It's not quite the same as the number of items in the sample. In order to get the df for the estimate, you have to subtract 1 from the number of items. Let's say you were finding the mean weight loss for a low-carb diet. You could use 4 people, giving 3 degrees of freedom (4 - 1 = 3), or you could use one hundred people with df = 99.
DeltaR2
Delta R2 is the change in R2 between two equations. Usually you see this come up when doing hierarchical regression with more than one step. For example, Step 1 R2 = .25 and Step 2 deltaR2 = .10. This would mean that Step 2 added .10 beyond the .25 of Step 1, for a total R2 = .35. The deltaR2 is also associated with its own F value, which indicates whether the increase in R2 is statistically significantly greater than no increase.
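A minimal sketch with statsmodels (simulated data; variable names are hypothetical) of deltaR2 in a two-step hierarchical regression, with its associated F test:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 100
    x1 = rng.normal(size=n)                      # Step 1 predictor
    x2 = rng.normal(size=n)                      # predictor added at Step 2
    y = 0.5 * x1 + 0.3 * x2 + rng.normal(size=n)

    step1 = sm.OLS(y, sm.add_constant(x1)).fit()
    step2 = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

    delta_r2 = step2.rsquared - step1.rsquared          # the change in R2
    f_val, p_val, df_diff = step2.compare_f_test(step1) # F test for the increase
    print(f"deltaR2 = {delta_r2:.3f}, F = {f_val:.2f}, p = {p_val:.4f}")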
Endogenous variable
Dependent variable generated within a model and, therefore, a variable whose value is changed (determined) by one of the functional relationships in that model. For example, consumption expenditure and income are considered endogenous to a model of income determination.
Qualitative variable
Describes data that fits into non-numerical categories. For example: Eye colors (variables include: blue, green, brown, hazel).
Coefficient alpha
Technically speaking, Cronbach's alpha is not a statistical test; it is a coefficient of reliability (or consistency). Exploratory factor analysis is one method of checking the dimensionality of the items it is computed on.
Kuder-Richardson 20 (K-R 20)
First published in 1937, is a measure of internal consistency reliability for measures with dichotomous choices. It is a special case of Cronbach's α, computed for dichotomous scores.
Post hoc test
In a scientific study, post hoc analysis (from Latin post hoc, "after this") consists of analyses that were not specified before seeing the data. This typically creates a multiple testing problem because each potential analysis is effectively a statistical test.
Level
In an experiment, each factor (independent variable) has two or more levels, i.e., different values of the factor. Combinations of factor levels are called treatments.
Validity
In its purest sense, this refers to how well a scientific test or piece of research actually measures what it sets out to, or how well it reflects the reality it claims to represent.
Model specification
In regression analysis specification is the process of developing a regression model. This process consists of selecting an appropriate functional form for the model and choosing which variables to include.
Mixed ANOVA
In statistics, a mixed-design analysis of variance model (also known as a split-plot ANOVA) is used to test for differences between two or more independent groups whilst subjecting participants to repeated measures.
Planned comparisons
In the context of one-way ANOVA, the term planned comparison is used when you focus on a few scientifically sensible comparisons rather than every possible comparison, and the choice of which comparisons to make was part of the experimental design.
Ratings
In the social sciences, particularly psychology, common examples are the Likert response scale and 1-10 rating scales in which a person selects the number which is considered to reflect the perceived quality of a product.
MS
Mean squares are estimates of variance across groups. Mean squares are used in analysis of variance and are calculated as a sum of squares divided by its appropriate degrees of freedom.
Multiple comparison test
One popular way to investigate the cause of rejection of the null hypothesis is a Multiple Comparison Procedure. These are methods which examine or compare more than one pair of means or proportions at the same time.
Orthogonal rotation
Orthogonal and oblique are two different types of rotation methods used to analyze information from a factor analysis, a statistical procedure conducted to identify clusters or groups of related items (called factors) on a test. An orthogonal rotation keeps the factors uncorrelated, whereas an oblique rotation allows them to correlate.
Oversampling
Oversampling and undersampling are opposite and roughly equivalent techniques. They both involve using a bias to select more samples from one class than from another. The usual reason for oversampling is to correct for a bias in the original dataset. One scenario where it is useful is when training a classifier using labelled training data from a biased source, since labelled training data is valuable but often comes from un-representative sources. For example, suppose we have a sample of 1000 people of which 66.7% are male. We know the general population is 50% female, and we may wish to adjust our dataset to represent this. Simple oversampling will select each female example twice, and this copying will produce a balanced dataset of 1333 samples with 50% female. Simple undersampling will drop some of the male samples at random to give a balanced dataset of 667 samples, again with 50% female. There are also more complex oversampling techniques, including the creation of artificial data points.
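A minimal sketch in Python of the simple oversampling described above (the 667/333 split mirrors the example; the data are hypothetical):

    import numpy as np

    sex = np.array(["male"] * 667 + ["female"] * 333)  # biased sample of 1000

    female_rows = sex[sex == "female"]
    balanced = np.concatenate([sex, female_rows])      # copy each female once

    print(balanced.size)                   # 1333
    print((balanced == "female").mean())   # ~0.50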
Highly Significant
P < 0.001 (under the null hypothesis, less than a one-in-a-thousand probability of obtaining the result by chance). ... The significance level (alpha) is the probability of Type I error. The power of a test is one minus the probability of Type II error (beta).
Practical significance
Statistical significance refers to the unlikelihood that mean differences observed in the sample have occurred due to sampling error. Given a large enough sample, despite seemingly insignificant population differences, one might still find statistical significance. Practical significance looks at whether the difference is large enough to matter.
Unpaired samples
Scientific experiments often consist of comparing two or more sets of data. This data is described as unpaired or independent when the sets of data arise from separate individuals or paired when it arises from the same individual at different points in time.
Statistical significance
Statistical significance means that a result from testing or experimenting is not likely to occur randomly or by chance, but is instead likely to be attributable to a specific cause. Statistical significance can be strong or weak, and it is important to disciplines that rely heavily on analyzing data and research, such as finance, investing, medicine, physics and biology.
Attrition
The 'wearing away' or progressive loss of data in research. Attrition occurs when cases are lost from a sample over time or over a series of sequential processes. One form of sample attrition occurs in longitudinal research when the subjects studied drop out.
Power analysis
The power of any test of statistical significance is defined as the probability that it will reject a false null hypothesis. ... Statistical power is affected chiefly by the size of the effect and the size of the sample used to detect it.
Group separation
The process of sorting or distinguishing cases into different components, groups, or categories (e.g., the gradual separation of the sciences into physical and biological), or the condition of being so sorted or distinguished.
Homogeneity of variance assumption
The assumption of homogeneity of variance is that the variance within each of the populations is equal. This is an assumption of analysis of variance (ANOVA). ANOVA works well even when this assumption is violated except in the case where there are unequal numbers of subjects in the various groups.
Response variable
The changes in an experiment are made to the independent variable (also called the manipulated variable); the responses that happen as a result of those deliberate changes are the responding variables. ... The variable you change would be the amount of light. The responding variable would be the height of the plants.
Covariate variable
The most precise definition is its use in Analysis of Covariance, a type of General Linear Model in which the independent variables of interest are categorical, but you also need to control for an observed, continuous variable: the covariate.
Hierarchical multiple regression
A variant of the basic multiple regression procedure that allows you to specify a fixed order of entry for variables, in order to control for the effects of covariates or to test the effects of certain predictors independently of the influence of others.
Tests of independence
The chi-square test for independence is applied when you have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables.
A priori power analyses
To review, power is defined as the probability that a statistical test will reject the null hypothesis, or the ability of a statistical test to detect an effect. Power is equal to 1 − β (beta). In order to reject the null hypothesis (which states that there is no relationship between the variables of interest), power should be at least .80. In general, the larger your sample size, the greater the power, though with very large samples even trivial effects can reach statistical significance. A power analysis provides information for determining the minimum number of subjects you need to collect in order to make your study worthwhile.
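A minimal sketch with statsmodels of an a priori power analysis: solving for the per-group sample size needed to detect a hypothetical medium effect (d = 0.5) in a two-sample t-test at alpha = .05 and power = .80:

    from statsmodels.stats.power import TTestIndPower

    n_per_group = TTestIndPower().solve_power(effect_size=0.5,
                                              alpha=0.05, power=0.80)
    print(round(n_per_group))   # roughly 64 subjects per group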
Tukey test
Tukey's range test, also known as the Tukey's test, Tukey method, Tukey's honest significance test, Tukey's HSD (honest significant difference) test, or the Tukey-Kramer method, is a single-step multiple comparison procedure and statistical test.
Normality assumption
When the sample size is sufficiently large (>200), the normality assumption is not needed at all as the Central Limit Theorem ensures that the distribution of disturbance term will approximate normality. When dealing with very small samples, it is important to check for a possible violation of the normality assumption.
t-test
When you perform a t-test, you're usually trying to find evidence of a significant difference between population means (2-sample t) or between the population mean and a hypothesized value (1-sample t). The t-value measures the size of the difference relative to the variation in your sample data.
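A minimal sketch with scipy (simulated data) of both forms mentioned above: a 2-sample t-test between group means, and a 1-sample t-test against a hypothesized value:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    group_a = rng.normal(loc=10.0, scale=2.0, size=30)   # hypothetical scores
    group_b = rng.normal(loc=11.5, scale=2.0, size=30)

    t2, p2 = stats.ttest_ind(group_a, group_b)         # 2-sample t
    t1, p1 = stats.ttest_1samp(group_a, popmean=10.0)  # 1-sample t vs. 10
    print(f"2-sample: t = {t2:.2f}, p = {p2:.4f}")
    print(f"1-sample: t = {t1:.2f}, p = {p1:.4f}")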
Fisher's LSD test
When you run an ANOVA (Analysis of Variance) test and get a significant result, that means at least one of the groups tested differs from the other groups. However, you can't tell from the ANOVA test which group differs. To address this, Fisher developed the least significant difference test in 1935; it is only used when you reject the null hypothesis as a result of your ANOVA. The LSD calculates the smallest significant difference between two means as if a test had been run on those two means alone (as opposed to all of the groups together). This enables you to make direct comparisons between two means from two individual groups. Any difference larger than the LSD is considered a significant result.
McNemar's chi-square test
You may have heard of McNemar tests as a repeated measures version of a chi-square test of independence. This is basically true, and it is worth seeing how these two tests differ and what, exactly, each one is testing. First of all, although chi-square tests can be used for larger tables, McNemar tests can only be used for a 2×2 table.
Ordinal
a categorical, statistical data type where the variables have natural, ordered categories and the distances between the categories are not known. These data exist on an ordinal scale, one of four levels of measurement described by S. S. Stevens in 1946.
Goodness-of-fit indices
a measure of fit between the hypothesized model and the observed covariance matrix. The adjusted goodness of fit index (AGFI) corrects the GFI, which is affected by the number of indicators of each latent variable.
Contingency coefficient
a coefficient of association, symbolized C, that tells whether two variables or data sets are independent or dependent of each other. It is also known as Pearson's coefficient (not to be confused with Pearson's coefficient of skewness).
Analysis of variance
a collection of statistical models and their associated procedures (such as "variation" among and between groups) used to analyze the differences among group means. ANOVA was developed by statistician and evolutionary biologist Ronald Fisher
Concurrent validity
a concept commonly used in psychology, education, and social science. It refers to the extent to which the results of a particular test, or measurement, correspond to those of a previously established measurement for the same construct
Communality
in factor analysis, the proportion of a variable's variance that is explained by the common factors; it equals the sum of the squared factor loadings for that variable across all factors.
Histogram
a diagram consisting of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval.
Bar graph
a diagram in which the numerical values of variables are represented by the height or length of lines or rectangles of equal width.
Independence assumption
a foundation for many statistical tests. The assumption of independence is used for T Tests, in ANOVA tests, and in several other statistical tests. ... The observations between groups should be independent, which basically means the groups are made up of different people.
Scatter plot
a graph in which the values of two variables are plotted along two axes, the pattern of the resulting points revealing any correlation present.
Research hypothesis
the researcher's hunch or prediction about the outcome of the study, stated totally irrespective of the statistical hypotheses H0 and Ha.
Linearity
the assumption that the relationship between a scalar dependent variable y and one or more explanatory variables (or independent variables) X can be modelled with a straight line. ... When there is more than one explanatory variable, the modelling process is called multiple linear regression.
Principal components analysis
a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. ... PCA is sensitive to the relative scaling of the original variables.
Adjusted odds ratio
an odds ratio that has been adjusted to control for other covariates (potential confounders), typically by estimating it within a multiple logistic regression model. The OR represents the odds that an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the absence of that exposure.
Odds ratio
a measure of association between an exposure and an outcome. The OR represents the odds that an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the absence of that exposure.
Phi
a measure of association for two binary variables. Introduced by Karl Pearson, this measure is similar to the Pearson correlation coefficient in its interpretation.
KMO measure of sampling adequacy
a measure of how suited your data is for Factor Analysis. The test measures sampling adequacy for each variable in the model and for the complete model. The statistic is a measure of the proportion of variance among variables that might be common variance.
Internal consistency reliability
a measure of how well the items on a test measure the same construct or idea.
Cronbach's alpha
a measure of internal consistency, that is, how closely related a set of items are as a group. It is considered to be a measure of scale reliability. A "high" value for alpha does not imply that the measure is unidimensional.
Parallel-forms reliability
a measure of reliability obtained by administering different versions of an assessment tool (both versions must contain items that probe the same construct, skill, knowledge base, etc.) to the same group of individuals.
Pearson's product-moment correlation
a measure of the linear correlation between two variables X and Y.
Coefficient of determination
a measure of the proportion of variance in the outcome that is accounted for by the predictor(s). With a value of 0 to 1, the coefficient of determination is calculated as the square of the correlation coefficient (R) between the observed and predicted values.
Chi-square
a measurement of how expectations compare to results. The data used in calculating a chi square statistic must be random, raw, mutually exclusive, drawn from independent variables and drawn from a large enough sample. For example, the results of tossing a coin 100 times meets these criteria.
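A minimal sketch with scipy using the coin-toss example from the definition (the observed counts are hypothetical):

    from scipy.stats import chisquare

    observed = [58, 42]   # heads, tails in 100 hypothetical tosses
    expected = [50, 50]   # counts expected from a fair coin

    stat, p = chisquare(f_obs=observed, f_exp=expected)
    print(f"chi-square = {stat:.2f}, p = {p:.4f}")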
Stepwise multiple regression
a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. In each step, a variable is considered for addition to or subtraction from the set of explanatory variables based on some prespecified criterion.
Sobel test
a method of testing the significance of a mediation effect. ... In mediation, the relationship between the independent variable and the dependent variable is hypothesized to be an indirect effect that exists due to the influence of a third variable (the mediator).
Hit rate
a metric or measure of performance. In classification analyses such as discriminant analysis, it is the proportion of cases correctly classified by the model; in business, it is traditionally the number of sales of a product divided by the number of customers who go online, call, or visit a company to find out about the product.
Split-plot ANOVA
a mixed-design analysis of variance model (also known as a split-plot ANOVA) is used to test for differences between two or more independent groups whilst subjecting participants to repeated measures.
Two-way mixed ANOVA
a mixed-design analysis of variance model (also known as a split-plot ANOVA) is used to test for differences between two or more independent groups whilst subjecting participants to repeated measures.
Dunnett test
a multiple comparison procedure developed by Canadian statistician Charles Dunnett to compare each of a number of treatments with a single control. Multiple comparisons to a control are also referred to as many-to-one comparisons.
Duncan's multiple range test
a multiple comparison procedure developed by David B. Duncan in 1955. Duncan's MRT belongs to the general class of multiple comparison procedures that use the studentized range statistic qr to compare sets of means.
Structural model
a multivariate statistical analysis technique that is used to analyze structural relationships. This technique is the combination of factor analysis and multiple regression analysis, and it is used to analyze the structural relationship between measured variables and latent constructs.
NPMANOVA
a non-parametric test of significant difference between two or more groups, based on any distance measure (Anderson 2001). This is normally used for ecological taxa-in-samples data, where groups of samples are to be compared, but may also be used as a general non-parametric MANOVA.
Purposive sample
a non-probability sample that is selected based on characteristics of a population and the objective of the study. Purposive sampling is also known as judgmental, selective, or subjective sampling.
Convenience sample
a non-probability sampling technique where subjects are selected because of their convenient accessibility and proximity to the researcher.
Spearman's rho
a nonparametric measure of rank correlation (statistical dependence between the rankings of two variables)
Wilcoxon-Mann-Whitney test
a nonparametric test of the null hypothesis that it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample. Unlike the t-test it does not require the assumption of normal distributions. It is nearly as efficient as the t-test on normal distributions. This test can be used to determine whether two independent samples were selected from populations having the same distribution; a similar nonparametric test used on dependent samples is the Wilcoxon signed-rank test.
Snowball sample
a nonprobability sampling technique where existing study subjects recruit future subjects from among their acquaintances. Thus the sample group is said to grow like a rolling snowball.
Correlation coefficient
a number between −1 and +1 calculated so as to represent the linear dependence of two variables or sets of data.
Average
a number expressing the central or typical value in a set of data, in particular the mode, median, or (most commonly) the mean.
Convergent validity
a parameter often used in sociology, psychology, and other behavioral sciences, refers to the degree to which two measures of constructs that theoretically should be related, are in fact related. Convergent validity, along with discriminant validity, is a subtype of construct validity.
Null hypothesis
a pinpoint statement as to the unknown quantitative value of the parameter in the population of interest. Symbolized by H subscript 0.
Outlier
a point which falls more than 1.5 times the interquartile range above the third quartile or below the first quartile. Easy to spot on histograms.
Stratified random sample
a population sample that requires the population to be divided into smaller groups, called 'strata'. Random samples can be taken from each stratum, or group.
Newman-Keuls test
a post hoc test for differences in means. Once an ANOVA has given a statistically significant result, you can run a Newman-Keuls to see which specific pairs of means are different. The test is based on the studentized range distribution.
Directional hypothesis
a prediction made by a researcher regarding a positive or negative change, relationship, or difference between two variables of a population. This prediction is typically based on past research, accepted theory, extensive experience, or literature on the topic. Key words that distinguish a directional hypothesis are: higher, lower, more, less, increase, decrease, positive, and negative. A researcher typically develops a directional hypothesis from research questions and uses statistical methods to check the validity of the hypothesis.
Expected frequency
a probability count that appears in contingency table calculations including the chi-square test. ... Observed Frequencies are counts made from experimental data. In other words, you actually observe the data happening and take measurements.
Wilks' lambda
a probability distribution used in multivariate hypothesis testing, especially with regard to the likelihood-ratio test and multivariate analysis of variance (MANOVA).
Multimodal
a probability distribution with more than one peak, or "mode." A distribution with one peak is called unimodal. A distribution with two peaks is called bimodal. A distribution with two peaks or more is multimodal.
MANOVA
a procedure for comparing multivariate sample means. As a multivariate procedure, it is used when there are two or more dependent variables, and is typically followed by significance tests involving individual dependent variables separately.
Standard deviation
a quantity calculated to indicate the extent of deviation for a group as a whole; the square root of the variance, expressing how much scores typically differ from the group mean.
Sum of squares (SS)
a quantity that appears as part of a standard way of presenting results of such analyses. It is defined as being the sum, over all observations, of the squared differences of each observation from the overall mean.
Confidence interval
a range of values so defined that there is a specified probability that the value of a parameter lies within it.
Attenuation
a reduction in the estimated effect size because of errors of measurement.
Positive correlation
a relationship between two variables in which both variables move in tandem
Negative correlation
a relationship between two variables in which one variable increases as the other decreases, and vice versa
Coefficient of equivalence
a reliability coefficient that measures the internal consistency of a test. The coefficient is the expected correlation of one test form with an alternate form that contains the same number of items
Notes
a section for miscellaneous acknowledgements, clarifications, or directions to the reader like contact info
Scree plot
a simple line segment plot that shows the fraction of total variance in the data. It is a plot, in descending order of magnitude, of the eigenvalues of a correlation matrix.
Effect size
a simple way of quantifying the difference between two groups that has many advantages over the use of tests of statistical significance alone. Effect size emphasises the size of the difference rather than confounding this with sample size.
Measure of central tendency
a single value that describes the way in which a group of data cluster around a central value. To put in other words, it is a way to describe the center of a data set. There are three: the mean, the median, and the mode.
Median test
a special case of Pearson's chi-squared test. It is a nonparametric test that tests the null hypothesis that the medians of the populations from which two or more samples are drawn are identical. The data in each sample are assigned to two groups, one consisting of data whose values are higher than the median value in the two groups combined, and the other consisting of data whose values are at the median or below. A Pearson's chi-squared test is then used to determine whether the observed frequencies in each sample differ from expected frequencies derived from a distribution combining the two groups.
Tetrachoric correlation
a special case of the polychoric correlation applicable when both observed variables are dichotomous. These names derive from the polychoric and tetrachoric series once used for estimating these correlations.
Eigenvalue
a special set of scalars associated with a linear system of equations (i.e., a matrix equation) that are sometimes also known as characteristic roots, characteristic values (Hoffman and Kunze 1971), proper values, or latent roots (Marcus and Minc 1988, p. 144).
Beta weight
a standardized regression coefficient (the slope of a line in a regression equation). They are used when both the criterion and predictor variables are standardized (i.e. converted to z-scores). A beta weight will equal the correlation coefficient when there is a single predictor variable.
Test statistic
a standardized value that is calculated from sample data during a hypothesis test. You can use test statistics to determine whether to reject the null hypothesis. The test statistic compares your data with what is expected under the null hypothesis.
Kendall's tau
a statistic used to measure the ordinal association between two measured quantities.
Cohen's kappa
a statistic which measures inter-rater agreement for qualitative (categorical) items. It is generally thought to be a more robust measure than simple percent agreement calculation, as κ takes into account the possibility of the agreement occurring by chance
Validity coefficient
a statistical index used to report evidence of validity for intended interpretations of test scores and defined as the magnitude of the correlation between test scores and a criterion variable.
Maximum likelihood
a statistical method for estimating population parameters (such as the mean and variance) from sample data that selects as estimates those parameter values maximizing the probability of obtaining the observed data.
Systematic sample
a statistical method involving the selection of elements from an ordered sampling frame. The most common form of systematic sampling is an equiprobability method. In this approach, progression through the list is treated circularly, with a return to the top once the end of the list is passed.
Sign test
a statistical method to test for consistent differences between pairs of observations, such as the weight of subjects before and after treatment. ... The sign test can also test if the median of a collection of numbers is significantly greater than or less than a specified value.
Correlated samples
samples in which each subject or entity is measured twice, yielding pairs of observations; the paired-sample t-test is the statistical procedure used to determine whether the mean difference between the two sets of observations is zero.
Correction for attenuation
a statistical procedure, due to Spearman (1904), to "rid a correlation coefficient from the weakening effect of measurement error" (Jensen, 1998), a phenomenon known as regression dilution. In measurement and statistics, the correction is also called disattenuation.
Bonferroni adjustment procedure
a statistical technique for dealing with the problem of an inflated Type I error risk: the alpha level is divided by the number of tests, and this modified alpha becomes the criterion against which each test's p-value is compared.
Causal paths
a statistical technique that is used to examine and test purported causal relationships among a set of variables. A causal relationship is directional in character, and occurs when one variable (e.g., amount of exercise) causes changes in another variable (e.g., physical fitness).
Multiple regression
a statistical technique that uses several explanatory variables to predict the outcome of a response variable. The goal of multiple linear regression (MLR) is to model the relationship between the explanatory and response variables.
Confirmatory factor analysis (CFA)
a statistical technique used to verify the factor structure of a set of observed variables. CFA allows the researcher to test the hypothesis that a relationship between observed variables and their underlying latent constructs exists.
Pearson chi-square
a statistical test applied to sets of categorical data to evaluate how likely it is that any observed difference between the sets arose by chance. It is suitable for unpaired data from large samples. ... Its properties were first investigated by Karl Pearson in 1900.
z-test
a statistical test used to determine whether two population means are different when the variances are known and the sample size is large. The test statistic is assumed to have a normal distribution, and nuisance parameters such as standard deviation should be known for an accurate z-test to be performed.
Mauchly test
a statistical test used to check the sphericity assumption in a repeated measures analysis of variance (ANOVA).
Regression line
a straight line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.
Summary table
a table of ANOVA values (sources of variation, sums of squares, degrees of freedom, mean squares, F) which researchers sometimes include and sometimes convey as a simple sentence or two of the text.
Correlation matrix
a table showing correlation coefficients between sets of variables. Each random variable (Xi) in the table is correlated with each of the other values in the table (Xj). This allows you to see which pairs have the highest correlation.
Confidence interval
a technique, related to sampling error, which helps to provide an estimate of the variability of the sample mean.
Two-tailed test
a test of a statistical hypothesis , where the region of rejection is on both sides of the sampling distribution. For example, suppose the null hypothesis states that the mean is equal to 10. The alternative hypothesis would be that the mean is less than 10 or greater than 10.
Manifest variable
a variable that can be directly measured or observed. ... Manifest variables are used in latent variable statistical models, which test the relationships between a set of manifest variables and a set of latent variables.
Predictor variable
a variable used in regression to predict another variable. It is sometimes referred to as an independent variable if it is manipulated rather than just measured.
Cramer's V
a way of calculating correlation in contingency tables larger than 2x2. It is used as a post-test to determine strengths of association after chi-square has determined significance.
Reliability coefficient
a way of quantifying how consistent a test or measure is, for example by giving it to the same subjects more than once and computing the correlation between the two sets of scores; the coefficient expresses the strength of the relationship and similarity between the two scores.
Wald test
a way of testing the significance of particular explanatory variables in a statistical model. In logistic regression we have a binary outcome variable and one or more explanatory variables.
Raw score
an unaltered measurement. For example, if you took a test in class and scored 85, that unaltered 85 is your raw score.
Equal variance assumption
the assumption that variance is equal across samples, also called homogeneity of variance. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples. The Levene test can be used to verify that assumption; it is an alternative to the Bartlett test.
Factor loading
also called component loadings in PCA, are the correlation coefficients between the variables (rows) and factors (columns). Analogous to Pearson's r, the squared factor loading is the percent of variance in that variable explained by the factor.
A posteriori test
(also called post hoc tests) statistical analyses performed after the initial analyses have been run, to explore the results in more depth. For example, let's say we conduct an experiment to examine the effects of background music on a short-term memory task.
Principal axis factoring
also called principal factor analysis (PFA) seeks the least number of factors which can account for the common variance (correlation) of a set of variables.
Coefficient of concordance
(also known as Kendall's coefficient of concordance) a non-parametric statistic. It is a normalization of the statistic of the Friedman test, and can be used for assessing agreement among raters.
Population
a well-defined collection of individuals or objects known to have similar characteristics. All individuals or objects within a certain population usually have a common, binding characteristic or trait.
Response rate
also known as completion rate or return rate, is the number of people who answered the survey divided by the number of people in the sample. It is usually expressed in the form of a percentage
Independent-samples chi-square test
also written as χ2 test, is any statistical hypothesis test where the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. ... A chi-squared test can be used to attempt rejection of the null hypothesis that the data are independent.
Bonferroni adjustment technique
an adjustment made to P values when several dependent or independent statistical tests are being performed simultaneously on a single data set. To perform a Bonferroni correction, divide the critical P value (α) by the number of comparisons being made
Bonferroni technique
an adjustment made to P values when several dependent or independent statistical tests are being performed simultaneously on a single data set. To perform a Bonferroni correction, divide the critical P value (α) by the number of comparisons being made.
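A minimal sketch in Python of the correction just described: the critical alpha is divided by the number of comparisons (the four p-values are hypothetical):

    p_values = [0.010, 0.020, 0.030, 0.040]   # hypothetical test results
    alpha = 0.05
    adjusted_alpha = alpha / len(p_values)    # 0.05 / 4 = 0.0125

    for p in p_values:
        verdict = "significant" if p < adjusted_alpha else "not significant"
        print(f"p = {p:.3f}: {verdict} at the adjusted alpha of {adjusted_alpha}")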
Latin square
an ancient puzzle where you try to figure out how many ways Latin letters can be arranged in a set number of rows and columns (a matrix); each symbol appears only once in each row and column. It's called a Latin square because it was developed from Leonhard Euler's works, which used Latin symbols.
Model respecification
in structural equation modeling and related analyses, the process of modifying a hypothesized model (for example, adding or deleting paths or variables) after evaluating its fit to the data, and then re-estimating the revised model.
Carryover
an effect that "carries over" from one experimental condition to another. Whenever subjects perform in more than one condition (as they do in within-subject designs) there is a possibility of carryover effects. For example, consider an experiment on the effect of rate of presentation on memory.
Within groups
an estimate of the population variance based on the average of all variances within the samples. The within-groups mean square is a weighted measure of how much individual (squared) scores vary from their sample mean.
Consistency
an estimator—a rule for computing estimates of a parameter θ0—having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ0.
Binomial test
an exact test of the statistical significance of deviations from a theoretically expected distribution of observations into two categories.
Between subjects
an experiment that has two or more groups of subjects, with each group tested under a different condition simultaneously.
MANCOVA
an extension of analysis of covariance (ANCOVA) methods to cover cases where there is more than one dependent variable and where the control of concomitant continuous independent variables - covariates - is required.
Power analysis
an important aspect of experimental design. It allows us to determine the sample size required to detect an effect of a given size with a given degree of confidence.
Sphericity assumption
an important assumption of a repeated-measures ANOVA. It refers to the condition where the variances of the differences between all possible pairs of within-subject conditions (i.e., levels of the independent variable) are equal
Parameter
an important component of any statistical analysis. In simple words, a parameter is any numerical quantity that characterizes a given population or some aspect of it. This means the parameter tells us something about the whole population.
Interval estimation
the use of sample data to construct an interval around a sample statistic (such as the mean), typically one expected to capture the population parameter with 95% confidence.
Hypotheses
any a priori prediction, hunch, or inference about the results, made before the investigation.
Probability sample
any method of sampling that utilizes some form of random selection. In order to have a random selection method, you must set up some process or procedure that assures that the different units in your population have equal probabilities of being chosen.
Rank-order correlation
any of several statistics that measure an ordinal association—the relationship between rankings of different ordinal variables or different rankings of the same variable, where a "ranking" is the assignment of the ordering labels "first", "second", "third", etc. For example, two common nonparametric methods of significance that use this are the Mann-Whitney U test and the Wilcoxon signed-rank test.
F-test
any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled.
Bootstrapping
any test or metric that relies on random sampling with replacement. Bootstrapping allows assigning measures of accuracy (defined in terms of bias, variance, confidence intervals, prediction error or some other such measure) to sample estimates.
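A minimal sketch in Python (simulated data) of bootstrapping a 95% confidence interval for a sample mean by resampling with replacement:

    import numpy as np

    rng = np.random.default_rng(0)
    sample = rng.normal(loc=100, scale=15, size=50)   # hypothetical scores

    boot_means = [rng.choice(sample, size=sample.size, replace=True).mean()
                  for _ in range(10_000)]             # resample with replacement
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    print(f"bootstrap 95% CI for the mean: [{lo:.1f}, {hi:.1f}]")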
Omnibus F-test
are a kind of statistical test. They test whether the explained variance in a set of data is significantly greater than the unexplained variance, overall. One example is the F-test in the analysis of variance.
Nonparametric test
are also called distribution-free tests because they don't assume that your data follow a specific distribution. You may have heard that you should use nonparametric tests when your data don't meet the assumptions of the parametric test, especially the assumption about normally distributed data.
Practice effect
are influences on test results when a test is taken more than once. As a simple example, a practice effect occurs when you take multiple practice SAT exams; practice can increase your overall score.
Nested model
are used for several statistical tests and analyses, including multiple regression, likelihood-ratio tests, conjoint analysis, and independence of irrelevant alternatives (IIA).
Adjusted means
arises when statistical averages must be corrected to compensate for data imbalances. Outliers present in data sets will often be removed, as they have a large impact on the calculated means of small populations; an adjusted mean can be determined by removing these outlier figures.
Guttman's split-half reliability
assesses the internal consistency of a test, such as psychometric tests and questionnaires. There, it measures the extent to which all parts of the test contribute equally to what is being measured. This is done by comparing the results of one half of a test with the results from the other half.
Multivariate space
based on the statistical principle of multivariate statistics, which involves observation and analysis of more than one statistical outcome variable at a time.
Partial eta squared
can be defined as the ratio of variance accounted for by an effect to that effect plus its associated error variance within an ANOVA study. Formulaically: partial η2 = SSeffect / (SSeffect + SSerror).
A priori power analysis
can either be done before (a priori or prospective power analysis) or after (post hoc or retrospective power analysis) data are collected. A priori power analysis is conducted prior to the research study, and is typically used in estimating sufficient sample sizes to achieve adequate power.
Dependent variable
closely connected to the measuring instrument used to collect data
Independent samples
the independent-samples t-test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different. It is a parametric test, also known as the independent t-test.
Curvilinear
contained by or consisting of a curved line or lines.
Homogeneous
composed of parts or elements that are all of the same kind; a homogeneous group is one whose members are similar on the characteristic of interest.
Results
contains crucial and summative information about the study
Background
contextual information which highlights the connection between their study and others' work. There needs to be some kind of rationale in order to state why the work is being undertaken.
Dependent variable
corresponds to the measured characteristics of people, animals, or things; when several dependent variables are analyzed together, the analysis is described as a multivariate ANOVA, or MANOVA.
Criterion-related validity
criterion or concrete validity is the extent to which a measure is related to an outcome. Criterion validity is often divided into concurrent and predictive validity. Concurrent validity refers to a comparison between the measure in question and an outcome assessed at the same time.
Prediction
data are collected on the variable that is to be predicted, called the dependent variable or response variable, and on one or more variables whose values are hypothesized to influence it, called independent variables or explanatory variables.
Univariate
data where only one variable is involved.
Power
defined as the probability that it will reject a false null hypothesis. Statistical power is inversely related to beta or the probability of making a Type II error. In short, power = 1 - β.
Coefficient of variation
defined as the ratio of the standard deviation to the mean: it shows the extent of variability in relation to the mean of the population. For example, suppose we wanted to determine which of 2 workers has the more consistent commuting time driving to work in the morning. If one lives 5 miles away and another 25, a direct comparison of their standard deviations does not yield a fair comparison, because the worker with the longer commute is expected to have more variability.
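A minimal sketch in Python of the commuting example above (the times are hypothetical):

    import numpy as np

    short_commute = np.array([11.0, 9.0, 10.0, 12.0, 8.0])    # ~5 miles away
    long_commute = np.array([52.0, 48.0, 50.0, 55.0, 45.0])   # ~25 miles away

    for name, times in [("short", short_commute), ("long", long_commute)]:
        cv = times.std(ddof=1) / times.mean()   # CV = SD / mean
        print(f"{name} commute: mean = {times.mean():.1f} min, CV = {cv:.3f}")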
Normal distribution
defined by two parameters, the mean (μ) and the standard deviation (σ). 68% of the area of a normal distribution is within one standard deviation of the mean. Approximately 95% of the area of a normal distribution is within two standard deviations of the mean.
Platykurtic
describes a statistical distribution with thinner tails than a normal distribution. ... The prefix of the term, 'platy', means broad, which is actually an historical error.
Model fit
describes how well it fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between observed values and the values expected under the model in question.
Method
details about how the study was conducted: Who participated? What kinds of instruments were used? What were participants required to do?
Range
difference between the lowest and highest scores
Practical significance vs. statistical significance
statistical significance refers to the unlikelihood that differences observed in the sample have occurred due to sampling error; given a large enough sample, despite seemingly insignificant population differences, one might still find statistical significance. Practical significance looks at whether the difference is large enough to matter.
Nonprobability sample
does not attempt to select a random sample from the population of interest. Rather, subjective methods are used to decide which elements are included in the sample.
Quartile
each of four equal groups into which a population can be divided according to the distribution of values of a particular variable. Also, each of the three values of the random variable that divide a population into four groups.
Pairwise comparison
generally is any process of comparing entities in pairs to judge which of the two is preferred or has a greater amount of some quantitative property, or whether or not the two entities are identical.
PERMANOVA
geometric partitioning of variation across a multivariate data cloud, defined explicitly in the space of a chosen dissimilarity measure, in response to one or more factors in an analysis of variance design.
Conservative f-test
has the same general meaning as in other areas: avoiding excess by erring on the side of caution. In statistics, a conservative test is one that is cautious about rejecting the null hypothesis; a conservative F-test, for example, guards against an inflated Type I error risk when assumptions such as sphericity are violated.
Shrinkage
has two meanings; the one relevant here is the general observation that, in regression analysis, a fitted relationship appears to perform less well on a new data set than on the data set used for fitting. In particular, the value of the coefficient of determination 'shrinks'.
Computer-generated random numbers
have important applications, especially in cryptography where they act as ingredients in encryption keys. These happen to be numbers that occur in a sequence such that two conditions are met: (1) the values are uniformly distributed over a defined interval or set, and (2) it is impossible to predict future values based on past or present ones.
Distributional shape
helps to show when a data set is said to be normal, skewed, bimodal, or rectangular.
Effect size
how large the true condition effect is relative to the true amount of variability in this effect across the population.
Dependent samples
if the members of one sample can be used to determine the members of the other sample, the samples are dependent. Hints: words such as dependent, repeated, before and after, matched pairs, and paired signal dependent samples.
Cochran Q test
in the analysis of two-way randomized block designs where the response variable can take only two possible outcomes (coded as 0 and 1), Cochran's Q test is a non-parametric statistical test to verify whether k treatments have identical effects.
Heterogeneous
incommensurable through being of different kinds, degrees, or dimensions.
Semi-interquartile range
index of the amount of dispersion among a group of scores; it equals half the interquartile range.
Cause and effect
indicates a relationship between two events where one event is affected by the other. In statistics, when the value of one event, or variable, increases or decreases as a result of another event or variable, a cause-and-effect relationship is said to be present.
Z-score
indicates how many standard deviations an element is from the mean. A z-score can be calculated from the following formula. z = (X - μ) / σ where z is the z-score, X is the value of the element, μ is the population mean, and σ is the standard deviation.
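A short Python illustration of the formula (the scores and population parameters are hypothetical):

    import numpy as np

    scores = np.array([55, 60, 65, 70, 75])
    mu, sigma = 65, 10          # hypothetical population mean and SD
    z = (scores - mu) / sigma   # z = (X - mu) / sigma
    print(z)                    # [-1.  -0.5  0.   0.5  1. ]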
Interquartile range
indicates how much spread exists among the middle 50% of the scores
Bivariate
involving or depending on two variables.
Biserial correlation
is a correlation coefficient used when one variable (e.g. Y) is dichotomous; Y can either be "naturally" dichotomous, like whether a coin lands heads or tails, or an artificially dichotomized variable
Logistic regression
is a predictive analysis. Logistic regression is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables.
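A minimal sketch using scikit-learn (my choice of tool; the hours-studied/pass-fail data are invented for illustration):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hours studied (predictor) and pass/fail outcome (binary, 0/1)
    X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
    y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

    model = LogisticRegression().fit(X, y)
    print(model.predict_proba([[4.5]]))  # probabilities of fail vs. pass at 4.5 hours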
Effect size
is a simple way of quantifying the difference between two groups that has many advantages over the use of tests of statistical significance alone. Effect size emphasizes the size of the difference rather than confounding it with sample size.
Goodness-of-fit test
is a statistical hypothesis test of how well sample data fit a hypothesized distribution. In other words, it tells you whether your sample data represent the data you would expect to find in the actual population.
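For example, a chi-square goodness-of-fit test of die-roll counts against the uniform distribution expected of a fair die, sketched with SciPy (the counts are hypothetical):

    from scipy import stats

    observed = [18, 22, 16, 25, 20, 19]  # 120 hypothetical rolls
    chi2, p = stats.chisquare(observed)  # expected defaults to equal frequencies
    print(chi2, p)                       # a large p is consistent with a fair die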
Nagelkerke's R-squared
is a pseudo-R-squared for logistic regression: it rescales the Cox and Snell R-squared so that it can reach a maximum of 1, giving an R-squared-like index (from 0 to 1) of how well the model explains the outcome.
R-squared change
is the increase in R-squared when one or more predictors are added to a regression model (as in hierarchical regression). The change indicates how much additional variance in the dependent variable the new predictors explain, and it can be tested for significance with an F-test.
Mesokurtic
is a statistical term describing a probability distribution whose outlier (rare, extreme-value) character is similar to that of a normal distribution. Kurtosis is a measure of the tails, or extreme values, of a probability distribution.
Explanatory variable
is a type of independent variable. The two terms are often used interchangeably. But there is a subtle difference between the two. When a variable is independent, it is not affected at all by any other variables. When a variable isn't independent for certain, it's an explanatory variable.
Residual error
is associated with latent variables, but only latent variables that function as dependent variables. This kind of error can be thought of as what is left after the relevant independent (exogenous) variable(s) explain, or account for, as much variability in the dependent (endogenous) variable as they can.
Equal variance assumption
is called homogeneity of variance. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples. The Levene test can be used to verify that assumption. Levene's test is an alternative to the Bartlett test.
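A quick SciPy sketch of Levene's test (the group values are hypothetical):

    from scipy import stats

    group1 = [12, 15, 14, 10, 13]
    group2 = [22, 25, 24, 20, 38]
    stat, p = stats.levene(group1, group2)
    print(p)  # a small p casts doubt on the equal-variance assumption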
Practical significance
is concerned with whether the result is useful in the real world.
Kaiser's criterion
is the rule that only factors with eigenvalues greater than 1 should be retained. It is expected to perform better than parallel analysis for short correlated scales, particularly for larger ρ, in part because the reference eigenvalues of Kaiser's rule (equal to 1) are smaller than those of parallel analysis (larger than 1).
Univariate analysis
is the simplest form of analyzing data. "Uni" means "one"; in other words, your data have only one variable. It doesn't deal with causes or relationships (unlike regression), and its major purpose is to describe: it takes data, summarizes those data, and finds patterns in them.
Pillai's trace
is used as a test statistic in MANOVA and MANCOVA. It is a positive-valued statistic ranging from 0 to 1. Increasing values mean that effects are contributing more to the model; you should reject the null hypothesis for large values. This test is considered the most powerful and robust statistic for general use, especially for departures from assumptions. For example, if the MANOVA assumption of homogeneity of variance-covariance is violated, this is your best option. It is also a good choice when you have uneven cell sizes or small sample sizes (i.e., when n is small). However, when the hypothesis degrees of freedom are greater than one, Pillai's tends to be less powerful than the other three. If you have a large deviation from the null hypothesis or the eigenvalues have large differences, Roy's Maximum Root is a far better option (Seber 1984).
Moderated multiple regression
is used to determine whether the relationship between two variables depends on (is moderated by) the value of a third variable. ... We use the standard method of determining whether a moderating effect exists, which entails the addition of a (linear) interaction term in a multiple regression model.
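A sketch of that standard method using the statsmodels formula interface (my choice of tool; the data frame is hypothetical):

    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.DataFrame({
        "y": [3, 5, 6, 8, 9, 12, 13, 15],   # outcome
        "x": [1, 2, 3, 4, 5, 6, 7, 8],      # predictor
        "m": [0, 0, 0, 0, 1, 1, 1, 1],      # moderator
    })
    # "x * m" expands to x + m + x:m; the x:m interaction term tests moderation
    model = smf.ols("y ~ x * m", data=df).fit()
    print(model.summary())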
One-way ANOVA
is used to determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups.
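A minimal SciPy sketch with three hypothetical groups:

    from scipy import stats

    g1 = [4, 5, 6, 5, 4]
    g2 = [7, 8, 6, 9, 8]
    g3 = [5, 6, 7, 6, 5]
    F, p = stats.f_oneway(g1, g2, g3)
    print(F, p)  # a small p suggests at least one group mean differs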
Counterbalancing
is usually thought of as a method for controlling order effects in a repeated measures design (see the notes on variance and experimental design). In a counterbalanced design to control for order effects, we use separate groups of subjects, each group receiving treatments in a different order.
Skewed distribution
it's quite common to have one tail of the distribution considerably longer or drawn out relative to the other tail. A "skewed right" distribution is one in which the tail is on the right side. A "skewed left" distribution is one in which the tail is on the left side.
Normality assumption
just the supposition that the underlying random variable of interest is distributed normally, or approximately so. Intuitively, normality may be understood as the result of the sum of a large number of independent random events.
Stem-and-leaf display
like a grouped frequency distribution, but with no loss of information: each leaf preserves the final digit of the original score.
Inferential statistics
makes inferences about populations using data drawn from a sample of the population. Instead of gathering data from the entire population, the statistician collects a sample or samples and makes inferences about the entire population from them.
Interaction
may arise when considering the relationship among three or more variables, and describes a situation in which the simultaneous influence of two variables on a third is not additive. Most commonly, interactions are considered in the context of regression analyses.
Orthogonal
means "uncorrelated." An orthogonal model means that all independent variables in that model are uncorrelated. If one or more independent variables are correlated, then that model is non-orthogonal. ... The term "orthogonal" usually only applies to classic ANOVA
Bivariate regression
means the analysis of bivariate data. It is one of the simplest forms of statistical analysis, used to find out whether there is a relationship between two sets of values. It usually involves the variables X and Y.
Equivalent-forms reliability
measure of the reliability of a test based on the correlation between equivalent forms of a test. Also called parallel-forms reliability.
Factor analysis
a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors.
Test-retest reliability coefficient
measured with a test-retest correlation. Test-Retest Reliability (sometimes called retest reliability) measures test consistency — the reliability of a test measured over time. In other words, give the same test twice to the same people at different times to see if the scores are the same.
Eta squared
measures the proportion of the total variance in a dependent variable that is associated with the membership of different groups defined by an independent variable. Partial eta squared is a similar measure in which the effects of other independent variables and interactions are partialled out.
Partial eta squared
measures the proportion of the total variance in a dependent variable that is associated with membership of the groups defined by an independent variable, after the effects of the other independent variables and interactions are partialled out.
Adjusted R-squared
measures the proportion of the variation in your dependent variable (Y) explained by your independent variables (X) for a linear regression model. Adjusted R-squared adjusts the statistic based on the number of independent variables in the model.
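The adjustment itself is simple enough to show directly (a sketch; the inputs are hypothetical):

    def adjusted_r2(r2, n, p):
        # 1 - (1 - R^2)(n - 1) / (n - p - 1) for n observations, p predictors
        return 1 - (1 - r2) * (n - 1) / (n - p - 1)

    print(adjusted_r2(0.75, n=50, p=3))  # slightly below the raw 0.75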
Regression equation
models the dependent relationship of two or more variables. It is a measure of the extent to which researchers can predict one variable from another, specifically how the dependent variable typically acts when one of the independent variables is changed.
Dichotomous variable
nominal variables which have only two categories or levels. For example, if we were looking at gender, we would most probably categorize somebody as either "male" or "female".
Critical value
nothing more than the number extracted from one of many statistical tables developed by mathematical statisticians.
Quantitative variable
numerical variables: counts, percents, or numbers.
Tied observations
occur when two or more observations are equal, whether the observations occur in the same sample or in different samples. In theory, nonparametric tests were developed for continuous distributions, where the probability of a tie is zero.
Confounding
occurs when the experimental controls do not allow the experimenter to reasonably eliminate plausible alternative explanations for an observed relationship between independent and dependent variables. For example, if a drug trial lacks adequate controls, many variables are confounded, and it is impossible to say whether the drug was effective.
Cohen's F
one appropriate effect-size index to use for a one-way analysis of variance (ANOVA). Cohen's f is a measure of a kind of standardized average effect in the population across all the levels of the independent variable.
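Cohen's f can be obtained from eta squared via f = sqrt(η² / (1 - η²)); a short sketch (the eta-squared value is hypothetical):

    import math

    def cohens_f(eta_squared):
        # f = sqrt(eta^2 / (1 - eta^2))
        return math.sqrt(eta_squared / (1 - eta_squared))

    print(cohens_f(0.06))  # about 0.25, a "medium" effect by Cohen's conventions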
T-score
one form of a standardized test statistic (the other you'll come across in elementary statistics is the z-score). The formula enables you to take an individual score and transform it into a standardized form, one which helps you to compare scores.
Parceling
one of several procedures for combining individual items and using these combined items as the observed variables, typically in Confirmatory Factor Analysis (CFA) or Structural Equation Modelling (SEM). Parcels are an alternative to using the individual items.
Parametric test
one that makes assumptions about the parameters (defining properties) of the population distribution(s) from which one's data are drawn, while a non-parametric test is one that makes no such assumptions.
Dummy variable
one that takes the value 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome. Dummy variables are used as devices to sort data into mutually exclusive categories (such as smoker/non-smoker, etc.). For example, in econometric time series analysis, dummy variables may be used to indicate the occurrence of wars or major strikes. A dummy variable can thus be thought of as a truth value represented as a numerical value 0 or 1 (as is sometimes done in computer programming).
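Creating dummy variables is routine in pandas (a sketch; the column values are hypothetical):

    import pandas as pd

    df = pd.DataFrame({"smoker": ["yes", "no", "no", "yes"]})
    # One 0/1 column; drop_first avoids a redundant, perfectly collinear column
    dummies = pd.get_dummies(df["smoker"], drop_first=True)
    print(dummies)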
Standardized regression equation
or beta coefficients are the estimates resulting from a regression analysis in which the dependent and independent variables have been standardized so that their variances are 1. ... The unstandardized coefficients, by contrast, are often labeled "b".
Mean square (MS)
or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors or deviations, that is, the difference between the estimator and what is estimated. In ANOVA, a mean square is a sum of squares divided by its degrees of freedom.
Negatively skewed
or skewed to the left, if the scores fall toward the higher side of the scale and there are very few low scores
Intraclass correlation
or the intraclass correlation coefficient (ICC), is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other.
Trimodal
possessing three equivalent modes
Spearman-Brown
prediction formula, also known as the S-B prophecy formula, is a formula relating psychometric reliability to test length and used by psychometricians to predict the reliability of a test after changing the test length.
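The prophecy formula is ρ* = kρ / (1 + (k - 1)ρ), where k is the factor by which the test length changes; a sketch with hypothetical numbers:

    def spearman_brown(r, k):
        # Predicted reliability after changing test length by a factor of k
        return (k * r) / (1 + (k - 1) * r)

    # Doubling a test whose current reliability is .70
    print(spearman_brown(0.70, k=2))  # about 0.82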
Point estimation
procedure whereby an educated guess is made (based on sample data) about the unknown value of the population parameter. No level of confidence, no statistical table, no interval: just compute the statistic on the basis of the sample data and posit that the unknown value of the population parameter is the same as the data-based number.
Alternative hypothesis
referred to as H subscript a (or H subscript 1), it is stated at the beginning of the hypothesis testing procedure alongside the null hypothesis, and it covers the parameter values that the null hypothesis excludes.
Multicollinearity
refers to a situation where a number of independent variables in a multiple regression model are closely correlated to one another.
Cluster sample
refers to a type of sampling method . With cluster sampling, the researcher divides the population into separate groups, called clusters. Then, a simple random sample of clusters is selected from the population. The researcher conducts his analysis on data from the sampled clusters.
Accuracy
refers to the closeness of a measured value to a standard or known value. For example, if in lab you obtain a weight measurement of 3.2 kg for a given substance, but the actual or known weight is 10 kg, then your measurement is not accurate.
Data transformation
refers to the modification of every point in a data set by a mathematical function. When applying transformations, the measurement scale of the variable is modified. Data transformation is most often employed to change data to the appropriate form for a particular statistical test or method.
Specificity
relates to the test's ability to correctly reject healthy patients without a condition. Consider the example of a medical test for diagnosing a disease. Specificity of a test is the proportion of healthy patients known not to have the disease, who will test negative for it.
Linear
relationship which can be expressed either in a graphical format, where the variable and the constant are connected via a straight line, or in a mathematical format, where the independent variable is multiplied by the slope coefficient and added to a constant, which together determine the dependent variable.
Between subject variable
researchers sometimes refer to the study's independent variable as this when comparisons are being made with data that have come from independent samples.
Type 2 error risk
retaining a false null hypothesis (also known as a "false negative" finding); the risk is the probability, β, of doing so.
Exploratory factor analysis (EFA)
a statistical method used to uncover the underlying structure of a relatively large set of variables. EFA is a technique within factor analysis whose overarching goal is to identify the underlying relationships between measured variables. It is commonly used by researchers when developing a scale (a scale is a collection of questions used to measure a particular research topic) and serves to identify a set of latent constructs underlying a battery of measured variables. It should be used when the researcher has no a priori hypothesis about factors or patterns of measured variables. Measured variables are any one of several attributes of people that may be observed and measured. An example of a measured variable would be the physical height of a human being. Researchers must carefully consider the number of measured variables to include in the analysis. EFA procedures are more accurate when each factor is represented by multiple measured variables in the analysis.
Mann-Whitney U test
same as Wilcoxon-Mann-Whitney test
Simple ANOVA
same as one-way ANOVA
Paired samples
samples in which natural or matched couplings occur. This generates a data set in which each data point in one sample is uniquely paired to a data point in the second sample.
Discussion
section which is devoted to nontechnical interpretation of the results, in a more direct way than is offered in the "statement of purpose" section.
ANOVA
similar to t-tests, it is a measure of signal vs noise
Measure of variability
simply indicates the degree of dispersion among the scores.
Grouped frequency distribution
shows how many people (or animals or objects) were similar in the sense that, measured on the dependent variable, they ended up in the same category or had the same score.
Bimodal
having or involving two modes; in particular (of a statistical distribution) having two maxima.
Outcome variable
the variable that is measured in an experiment in order to observe the effect of manipulating the independent variable; synonymous with the dependent variable.
Independent variable
sometimes called an experimental or predictor variable, is a variable that is being manipulated in an experiment in order to observe the effect on a dependent variable, sometimes called an outcome variable.
Unstandardized regression equation
produced by a regression carried out on the original (unstandardized) variables; its coefficients are unstandardized, often labeled "b". Standardized (beta) coefficients, by contrast, are the estimates that result after the variances of the dependent and independent variables have been standardized to 1.
Parallel analysis
a statistical method for deciding how many factors or components to retain: eigenvalues from the observed data are compared with eigenvalues generated from random data of the same size, and a factor is retained only if its observed eigenvalue exceeds the corresponding random one.
Control variable
strongly influences experimental results, and it is held constant during the experiment in order to test the relative relationship of the dependent and independent variables. The control variable itself is not of primary interest to the experimenter.
Normality
tests which are used to determine if a data set is well-modeled by a normal distribution and to compute how likely it is for a random variable underlying the data set to be normally distributed.
Homogeneity of regression slopes
the ANCOVA assumption that the regression slope relating the covariate to the dependent variable is the same within each of the groups being compared.
Ordinate
the Cartesian coordinate obtained by measuring parallel to the y-axis — compare abscissa.
Sensitivity
the ability of a test to correctly identify those with the disease (true positive rate), whereas test specificity is the ability of the test to correctly identify those without the disease (true negative rate).
Sample size determination
the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample.
Small standardized effect size
a standardized effect size at the low end of conventional benchmarks (e.g., Cohen's d near 0.2), indicating a difference that is small relative to the variability in the data.
Standardized effect size
an effect size expressed in scale-free units, most often standard deviation units (e.g., Cohen's d), so that effects can be compared across studies and across measures with different scales.
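Cohen's d, the most common standardized effect size for two groups, is easy to compute by hand (a sketch; the scores are hypothetical):

    import numpy as np

    def cohens_d(a, b):
        # Mean difference divided by the pooled standard deviation
        a, b = np.asarray(a, float), np.asarray(b, float)
        na, nb = len(a), len(b)
        pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
        return (a.mean() - b.mean()) / np.sqrt(pooled_var)

    # By Cohen's conventions, d near 0.2 is small, 0.5 medium, 0.8 large
    print(cohens_d([5, 6, 7, 8], [4, 5, 6, 7]))  # about 0.77 here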
Homogeneity of variance assumption
the assumption that the variance within each of the populations is equal. This is an assumption of analysis of variance (ANOVA). ANOVA works well even when this assumption is violated except in the case where there are unequal numbers of subjects in the various groups.
Nonresponse bias
the bias that results when respondents differ in meaningful ways from nonrespondents. Nonresponse is often a problem with mail surveys, where the response rate can be very low.
F
the calculated ANOVA test statistic, obtained by dividing the mean square (MS) on the between-groups row of the summary table by the mean square on the within-groups row.
Procedure
the content of what exactly was done and/or what steps were followed
Criterion variable
the criterion variable is the variable being predicted. In general, the criterion variable is the dependent variable
Reject or fail-to-reject
the decision which must be made at the end of the hypothesis testing procedure.
Interrater reliability
the degree of agreement among raters. It gives a score of how much homogeneity, or consensus, there is in the ratings given by judges, and it is one of the aspects of test validity.
Dispersion
the degree to which scores are very similar or very different.
Reliability
the degree to which the result of a measurement, calculation, or specification can be depended on to be accurate.
Mahalanobis distance measure
the distance between two points in multivariate space. ... The Mahalanobis distance measures distance relative to the centroid — a base or central point which can be thought of as an overall mean for multivariate data.
Main effect F
the effect of one independent variable on the dependent variable. It ignores the effects of any other independent variables.
Variance
the expected value of the squared deviation from the mean.
Predictive validity
the extent to which a score on a scale or test predicts scores on some criterion measure. For example, the validity of a cognitive test for job performance is the correlation between test scores and, for example, supervisor performance ratings.
Significant
the likelihood that a relationship between two or more variables is caused by something other than random chance. Statistical hypothesis testing is used to determine whether the result of a data set is statistically significant.
Statistical significance
the likelihood that a relationship between two or more variables is caused by something other than random chance. Statistical hypothesis testing is used to determine whether the result of a data set is statistically significant.
Sampling error
the magnitude of the difference of the sample statistic from the estimated parameter. If 20 coin flips yield 10 heads and 10 tails, this is said to equal 0.
Unit of analysis
the major entity that is being analyzed in a study. It is the 'what' or 'who' that is being studied. In social science research, typical units of analysis include individuals (most common), groups, social organizations and social artifacts.
Positively skewed
the mean is usually greater than the median, which is always greater than the mode.
Coefficient of stability
the test-retest reliability coefficient: the more similar the scores across the two testings, the higher the correlation. The coefficient is heavily dependent upon the amount of time that passes between the two occasions.
Mode
the most frequently occurring score
Median
the number that lies in the midpoint of the distribution of earned scores.
Measurement model
the part of the model that examines the relationships between the latent variables and their measures. The structural model, by contrast, specifies the relationships among the latent variables.
Mean
the point that minimizes the collective distances of scores from that point. Found by dividing the sum of the scores by the number of scores in the data set.
Alpha Level
the probability of making the wrong decision when the null hypothesis is true. Alpha levels (sometimes just called "significance levels") are used in hypothesis tests. Usually, these tests are run with an alpha level of .05 (5%), but other levels commonly used are .01 and .10.
Level of significance
the probability of rejecting the null hypothesis in a statistical test when it is true — called also significance level.
Hypothesis testing
the procedure of using sample data to evaluate a hypothesis inferentially. Six steps: State the null hypothesis. State the alternative hypothesis. Select a level of significance. Collect and summarize the sample data. Refer to a criterion for evaluating the sample evidence. Make a decision to reject or retain the null hypothesis.
Estimation
the procedures by which researchers use sample statistics to make educated guesses as to the values of population parameters.
Model building
the process of developing a probabilistic model that best describes the relationship between the dependent and independent variables.
Sampling
the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen.
Linearity
the property of a mathematical relationship or function which means that it can be graphically represented as a straight line. Examples are the relationship of voltage and current across a resistor (Ohm's law), or the mass and weight of an object.
Type 1 error
the rejection of a true null hypothesis (also known as a "false positive" finding).
Type 1 error risk
the risk (α) of rejecting a true null hypothesis (also known as a "false positive" finding).
Type 2 error
the retention of a false null hypothesis (also known as a "false negative" finding).
Indicator
the representation of statistical data for a specified time, place or any other relevant characteristic, corrected for at least one dimension (usually size) so as to allow for meaningful comparisons.
Kurtosis
the sharpness of the peak of a frequency-distribution curve; in modern usage, a measure of the heaviness of the distribution's tails relative to the normal distribution.
Standard score
the signed number of standard deviations by which the value of an observation or data point is above the mean value of what is being observed or measured. Observed values above the mean have positive standard scores, while values below the mean have negative standard scores. Also referred to as z-score.
One-factor ANOVA
the simplest version of ANOVA, permits the researcher to use the data in samples for the purpose of making a single inferential statement concerning the means of the study's populations. Are the means of the various populations equal to one another?
Sampling frame
the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions. Importance of the sampling frame is stressed by Jessen and Salant and Dillman.
Standard error
the standard deviation of the sampling distribution of a statistic. Standard error is a statistical term that measures the accuracy with which a sample represents a population. In statistics, a sample mean deviates from the actual mean of a population; this deviation is the standard error.
Standard error
the standard deviation of values which make up the sampling distribution, AKA an index of how variable the sample statistic is when multiple samples of the same size are drawn from the same population.
Statement of Purpose
the stated reason for the investigation. E.g.: The ___ of this investigation thus was to assess the relationship of changes in BMI with changes in body satisfaction in a sample of European-American and African-American women with obesity who participated in a program of moderate exercise.
Hypothesis testing
the theory, methods, and practice of testing a hypothesis by comparing it with the null hypothesis. The null hypothesis is only rejected if its probability falls below a predetermined significance level, in which case the hypothesis being tested is said to have that level of significance.
Calculated value (or test statistic)
the value, computed from the sample data, that the hypothesis testing procedure compares against the critical value.
Mediator variable
the variable that accounts for (mediates) the relationship between the independent and dependent variables. In other words, it explains how the independent variable relates to the dependent variable. Complete mediation occurs when the mediator fully accounts for that relationship.
Sampling distribution
the distribution of a sample statistic (such as the mean) across all possible samples of the same size drawn from the population.
Abscissa
this and the ordinate are respectively the first and second coordinate of a point in a coordinate system. ... Usually these are the horizontal and vertical coordinates of a point in a two-dimensional rectangular Cartesian coordinate system.
Modification indices
this and the standardized expected parameter change (SEPC) are two statistics that may be used to aid in the selection of parameters to add to a model to improve its fit.
Eta squared
a measure of effect size for use in ANOVA (analysis of variance). Eta-squared (η²) is analogous to R² from multiple linear regression.
Matched samples
those in which each member of a sample is matched with a corresponding member in every other sample by reference to qualities other than those immediately under investigation.
Independence
two events are independent, statistically independent, or stochastically independent if the occurrence of one does not affect the probability of occurrence of the other.
Roy's largest root
used when testing for mean differences on a single dependent measure while controlling for the other dependent measures.
Exogenous variable
used for setting arbitrary external conditions, and not in achieving a more realistic model behavior. For example, the level of government expenditure is exogenous to the theory of income determination. See also endogenous variable.
Indicator variable
used in regression analysis and Latent Class Analysis. ... A set of observed variables can "indicate" the presence of one or more latent (hidden) variables
Confidence band
used in statistical analysis to represent the uncertainty in an estimate of a curve or function based on limited or noisy data. Similarly, a prediction band is used to represent the uncertainty about the value of a new data-point on the curve, but subject to noise.
Tests of normality
used to determine whether sample data has been drawn from a normally distributed population (within some tolerance). A number of statistical tests, such as the Student's t-test and the one-way and two-way ANOVA require a normally distributed sample population.
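A Shapiro-Wilk sketch with SciPy (the sample is simulated from a normal distribution, so normality should not be rejected):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    sample = rng.normal(loc=50, scale=10, size=40)
    W, p = stats.shapiro(sample)
    print(W, p)  # a large p gives no evidence against normality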
Yates' correction for continuity
used to test independence of events in a cross table i.e. a table showing frequency distribution of variables. It is used to test if a number of observations belonging to different categories validate a null hypothesis. It is a correction made to chi-square values in a binomial frequency distribution table.
Point biserial correlation
used when one variable (e.g. Y) is dichotomous; Y can either be "naturally" dichotomous, like whether a coin lands heads or tails, or an artificially dichotomized variable.
Introduction
usually contains background and statement of purpose, also hypotheses
Within-subjects factor
an independent variable that is manipulated by testing each subject at each level of the variable. Compare with a between-subjects variable, in which different groups of subjects are used for each level of the variable.
Manifest variable
variable that can be directly measured or observed. ... Manifest variables are used in latent variable statistical models, which test the relationships between a set of manifest variables and a set of latent variables.
Observed variable
variables that can be directly observed and measured; contrast latent variables, which are inferred (through a mathematical model) from the observed variables.
Homogeneity of proportions
we consider two or more populations (or two or more subgroups of a population) and a single categorical variable. ... The null hypothesis says that the distribution of proportions for all categories is the same in each group or population.
Alternate-forms reliability
when an individual participating in a research or testing scenario is given two different versions of the same test at different times. The scores are then compared to see if it is a reliable form of testing. An individual is given one form of the test and after a period of time (usually a week or so) the person is given a different version of the same test. If the scores differ dramatically then something is wrong with the test and it is not measuring what it is supposed to. If this is the case the test needs to be analyzed more to determine what is wrong with the different forms.
Direct relationship
when one increases so does the other or as one decreases so does the other
Regression coefficient
when the regression line is linear (y = ax + b) the regression coefficient is the constant (a) that represents the rate of change of one variable (y) as a function of changes in the other (x); it is the slope of the regression line.
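A sketch with SciPy's linregress (the x-y values are hypothetical):

    from scipy import stats

    x = [1, 2, 3, 4, 5]
    y = [2.1, 3.9, 6.2, 7.8, 10.1]
    result = stats.linregress(x, y)
    print(result.slope, result.intercept)  # the a and b of y = ax + b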
One-tailed test
where the region of rejection is on only one side of the sampling distribution. For example, suppose the null hypothesis states that the mean is less than or equal to 10. The alternative hypothesis would be that the mean is greater than 10.
Wilcoxon test
which refers to either the Rank Sum test or the Signed Rank test, is a nonparametric test that compares two paired groups. ... The Wilcoxon Rank Sum test can be used to test the null hypothesis that two populations have the same continuous distribution.
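Both variants are available in SciPy (a sketch; the before/after scores are hypothetical):

    from scipy import stats

    before = [10, 12, 9, 11, 14, 13]
    after = [12, 14, 10, 13, 15, 15]
    stat, p_signed = stats.wilcoxon(before, after)    # Signed Rank: paired groups
    u, p_ranksum = stats.mannwhitneyu(before, after)  # Rank Sum: independent groups
    print(p_signed, p_ranksum)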