Study Statistics and Research Design
what happens when you multiply or divide by a constant
-- both central tendency and variability change; for example, multiplying each value by five results in a mean and a standard deviation five times larger than the originals
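A minimal sketch of this rule with hypothetical scores: multiplying every value by 5 scales both the mean and the standard deviation by 5.

```python
# Hypothetical raw scores; multiplying each by a constant scales both the
# mean and the standard deviation by that same constant.
import statistics

scores = [2, 4, 6, 8, 10]
scaled = [x * 5 for x in scores]  # multiply each value by 5

mean_ratio = statistics.mean(scaled) / statistics.mean(scores)
sd_ratio = statistics.stdev(scaled) / statistics.stdev(scores)
print(mean_ratio, sd_ratio)  # both ratios equal 5 (within floating-point error)
```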
what is partial correlation
-- it is a correlation coefficient that reflects the degree of relationship between two variables after each variable's relationship with a third variable has been removed
what is a MANCOVA
-- as with ANCOVA, one or more covariates are added in order to reduce error variation; a MANCOVA is a MANOVA with covariates
what happens when a constant is added to or subtracted from a variable
-- measures of central tendency change by the same constant -- measures of variability remain the same
what is internal validity
-- the extent to which a research study rules out alternative explanations and establishes causality -- if you are seeking to establish that A causes B, then internal validity is of paramount importance and more of a concern than external validity
what happens when you add, subtract, and multiply, or divide by a constant
-- the shape of the distribution remains the same -- correlations with other variables remain the same
what are type 1 errors and type 2 errors
-- Type I errors assert that there is an effect when there is not -- Type II errors assert that there is no effect when there actually is
what are the two most common applications of the single subject design
--ABAB --multiple baseline, in which the effect of treatment is tested with different behaviors, people, or settings
who were the first researchers to use meta-analysis
--Karl Pearson was the first to develop the concept of effect sizes and meta-analysis -- Smith, Glass, and Miller in 1980 were the first to name the statistical technique and apply it to the realm of psychological research
what is sampling error
--the discrepancy or amount of error that exists between a sample statistic and the corresponding population parameter -- for a normally distributed sampling distribution, there is a 50% chance that the sample will have either overestimated or underestimated the true population parameter
what are the assumptions of the ANOVA
1. independence of observations 2. homogeneity of variance 3. normality
name 4 types of reliability and the methods for determining them
1. internal consistency: all items measure the same thing; assessed via (a) the reliability coefficient r, (b) split-half reliability, (c) Cronbach's alpha AKA coefficient alpha, (d) the Kuder-Richardson formula (KR-20), and (e) the Spearman-Brown prophecy formula, used when adding test questions to see if doing so improves reliability 2. consistency between alternate forms: measured by the coefficient of equivalence 3. test-retest consistency: measured by the coefficient of stability 4. inter-rater reliability: assesses the extent to which raters of subjective measures rate objects consistently; measured by the kappa coefficient with nominal or ordinal data
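Cronbach's alpha can be sketched directly from its standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores). The item scores below are hypothetical.

```python
import statistics

# rows = respondents, columns = items (a hypothetical 3-item scale)
items = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 5],
    [2, 3, 2],
]
k = len(items[0])  # number of items
item_vars = [statistics.variance(col) for col in zip(*items)]
total_var = statistics.variance([sum(row) for row in items])
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))  # → 0.98: a highly consistent (hypothetical) scale
```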
what are 3 advantages of using the mode to determine central tendency
1. it represents the largest number of people 2. it is applicable to nominal data whereas the median and the mean are not 3. unaffected by extreme values/ outliers
how does one go about SEM
1. it starts with a hypothesis represented by a model 2. then the constructs of interest are operationalized and measured 3. finally the model is tested
in what three ways does logistic regression differ from multiple regression
1. logistic coefficients are partial coefficients, controlling for other variables in the model, whereas correlation coefficients are uncontrolled 2. logistic coefficients reflect linear and nonlinear relationships, whereas correlation reflects only linear relationships 3. a significant logistic regression means that there is a relationship between the independent variable and the dependent variable for selected control groups, but not necessarily overall
what are two additional particular strengths of SEM
1. models latent variables 2. can test the overall model instead of focusing on individual coefficients
what are two practical disadvantages of using the interrupted time series design
1. multiple measurements may be expensive and/or time-consuming 2. gaps in data may invalidate results (as may allowing insufficient time for the effect to surface in post-test measurements)
what are three ways to help ensure internal validity of the study
1. random assignment (most effective) 2. matching: subjects grouped together on the basis of similarities on a particular characteristic related to the chosen variable 3. blocking: extraneous variables are included in the experimental design as additional independent variables
how does a larger sample size serve a study
1. it reduces sampling error because the sample will be more representative of the population 2. it increases statistical power, so that with fewer errors, i.e. noise, differences or relationships are more likely to be deemed statistically significant
what are the three types of standard error
1. standard error of the mean: estimates how much the sample mean will deviate from the population mean; as sample size increases, the standard error of the mean decreases; SD/sqrt(N) 2. standard error of measurement: constructs confidence intervals for test scores; how much a person's observed score is expected to vary from the true score the person is capable of receiving based on actual ability; SD × sqrt(1 − reliability coefficient) 3. standard error of the estimate (SEE) AKA confidence band: constructs confidence intervals around a predicted score; it is affected by the magnitude of the standard deviation of the criterion score; the higher the SD, the higher the SEE
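The first two formulas can be sketched with hypothetical values (SD = 15, N = 100, reliability = .91):

```python
import math

sd, n = 15, 100
se_mean = sd / math.sqrt(n)  # standard error of the mean: SD / sqrt(N)

reliability = 0.91           # hypothetical reliability coefficient
se_measurement = sd * math.sqrt(1 - reliability)  # SD * sqrt(1 - reliability)

print(se_mean)                   # → 1.5
print(round(se_measurement, 2))  # → 4.5
```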
what two steps does the sem center around
1. the measurement model is validated through confirmatory factor analysis 2. then the structural model is fitted through path analysis with latent variables
what are some measures of effect size
1. the square of Pearson's r / r-squared / coefficient of determination: the proportion of variation in one variable accounted for by the linear relationship with another 2. chi-square / Cramer's phi: strength of relationship between two variables in a contingency table 3. t-test / Cohen's d: the difference between two group means in terms of standard deviation (control-group or pooled) 4. ANOVA / eta-squared or omega-squared: proportion of variation in the dependent variable accounted for by the independent variable
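Cohen's d from the list above can be sketched with hypothetical group scores, using the pooled standard deviation:

```python
import math
import statistics

group_a = [10, 12, 14, 16, 18]  # hypothetical treatment-group scores
group_b = [8, 10, 12, 14, 16]   # hypothetical control-group scores
na, nb = len(group_a), len(group_b)

# pooled SD weights each group's variance by its degrees of freedom
pooled_sd = math.sqrt(
    ((na - 1) * statistics.variance(group_a)
     + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
)
d = (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd
print(round(d, 3))  # → 0.632
```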
if you knew that there is a strong positive relationship between two variables, for example Pearson's r = .90, how much of the variability in one variable is accounted for by changes in the other
81%, because r-squared expresses the proportion of variability in one variable that can be accounted for by changes in another
what is the relationship between validity and reliability
a measure can be reliable but not necessarily valid, but a measure can never be valid if it is not reliable
what is the main advantage to using the interrupted time series design
due to the ability to identify pre-existing trends, it provides strong evidence that the intervention was effective
what are carryover effects
the effects of one treatment are carried over to the next
what is a triple blind study
the participants, experimenters, and others involved in the research outside the experimental setting are blind
what is Phi correlation coefficient
a correlation coefficient used when both variables are dichotomous
what are Type I error, p-values, and alpha
-- Type I error: false rejection of the null hypothesis; its probability = α (alpha) -- a Type I error happens when you conclude there is an effect when there is no real effect -- the p-value is the probability of obtaining the observed result by chance if the null hypothesis is true, and it is compared to a conventionally determined cutoff value termed alpha (usually .05, .01, or .001) -- if the obtained p-value is less than alpha, the results are unlikely to have occurred by chance alone and are taken to reflect a true effect
what is a type 2 error, beta, and its complement (1-B)
-- Type II error: retention of a false null hypothesis; its probability = β (beta) -- a Type II error occurs when there actually is an effect but the study failed to capture it -- the probability of this occurrence is termed beta, and its complement (1 − β) is power
what is a t-test
-- a commonly used parametric inferential statistical test that compares the means of two independent samples on a given dependent variable (independent-samples t-test) -- the t-statistic equals the variability between treatments divided by the variability within treatments
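A hand-rolled sketch of the pooled-variance independent-samples t statistic, with hypothetical data; in practice one would normally use scipy.stats.ttest_ind.

```python
import math
import statistics

treatment = [23, 25, 28, 30, 32]  # hypothetical scores
control = [20, 22, 24, 26, 28]
n1, n2 = len(treatment), len(control)
m1, m2 = statistics.mean(treatment), statistics.mean(control)

# pooled variance, then t = mean difference / standard error of the difference
sp2 = ((n1 - 1) * statistics.variance(treatment)
       + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
print(round(t, 3))  # → 1.668
```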
what is a Latin Square design
-- a counterbalancing technique used with multiple treatment designs (i.e. multiple treatments are administered to each participant) in which order effects (e.g. practice, fatigue) are of concern, but complete counterbalancing is impractical given the large number of treatments -- e.g. there are 5 treatments and a total of 5 sequences: ABCDE, EABCD, DEABC, CDEAB, BCDEA -- the advantage is that it can determine whether there were effects of treatment and/or treatment order -- the disadvantage is that it doesn't represent all possible sequences of treatment -- in order to be a balanced Latin Square design, any given treatment has to occur an equal number of times before and after any other treatment -- also, in order to be a Latin Square design the number of sequences must equal the number of treatments
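The five-sequence example above is a cyclic Latin square, which can be generated by shifting each row one position (note that a cyclic square is not automatically a balanced Latin square):

```python
treatments = ["A", "B", "C", "D", "E"]
n = len(treatments)

# row i shifts the base sequence i positions to the right
square = [[treatments[(j - i) % n] for j in range(n)] for i in range(n)]
for row in square:
    print("".join(row))
# prints ABCDE, EABCD, DEABC, CDEAB, BCDEA -- the sequences in the card above
```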
what is cluster sampling
-- a sampling technique involving naturally occurring groups or clusters -- the population is divided into clusters and some clusters are randomly selected for inclusion in the sample -- ideally the clusters should be internally heterogeneous but relatively homogeneous between clusters
what is an ABAB design AKA Reversal Design
-- a single subject design in which baseline data are obtained (A) before a treatment is introduced (B), removed (A), and then reintroduced (B) -- the advantage of this design is that with an additional baseline the researcher is able to determine whether an observed change was actually due to the treatment rather than other confounding factors -- if the treatment is withdrawn and the behavior goes back to the original baseline functioning and then improves again with treatment, one can say that the treatment was effective -- it can also incorporate additional treatments beyond the second treatment; the ability to adjust treatment is an intrinsic part of this design and increases its clinical utility
what is Pearson's r AKA Pearson's product moment correlation coefficient
-- a statistic with values ranging from -1.0 (perfectly negatively correlated) through zero (no relationship whatsoever) to 1.0 (perfectly positively correlated) -- the best way to visualize a correlation is to plot the data on a scatter plot
what is a discriminant functional analysis
-- a statistical procedure used to classify individuals into groups (such as sex, age, education, or income level) on the basis of the variables measured -- for example, a researcher may want to discriminate between levels of education based on perspectives on the death penalty
what is structural equation modeling (SEM)
-- a technique for building and testing statistical models -- a multivariate technique that uses factor analysis, path analysis (a statistical procedure that involves the application of correlational techniques to test models of causality), and regression -- it is confirmatory rather than exploratory, in that it seeks to test rather than develop a theory
what is logistic regression
-- a technique for fitting a regression surface to data including a dichotomous dependent variable -- can be used to predict group membership from a set of variables -- can be used to provide knowledge of the relationships and strengths among variables -- a regression with an outcome variable that is categorical and independent variables that can be a mix of continuous and/or categorical variables
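At its core, logistic regression passes a linear combination of predictors through the logistic (sigmoid) function to yield a probability of category membership. The intercept and slope below are hypothetical, as if taken from a fitted model.

```python
import math

def predict_prob(x, intercept=-1.0, slope=0.8):
    """P(y = 1 | x) for a hypothetical one-predictor logistic model."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))

print(round(predict_prob(0), 3))  # → 0.269 (low predictor value, low probability)
print(round(predict_prob(5), 3))  # → 0.953 (high predictor value, high probability)
```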
what are the advantages and disadvantages of the Solomon four group design
-- advantages: can yield rich results, as it is a true experiment -- disadvantages: it requires several groups, increasing the time and cost of research
what is cluster analysis
-- another method of data reduction in which groups are identified from a correlation matrix -- the two most strongly correlated variables form the nucleus of the first cluster -- then other variables correlated with the nucleus are added to the cluster -- two strongly correlated variables that correlate weakly with the first cluster form the nucleus of the second cluster, and so on
how are z-scores calculated and what do they mean
-- by subtracting the mean from every observation and dividing each resultant value by the standard deviation -- the z-score reveals the number of standard deviations that a score lies above or below the mean -- the z-score distribution is standardized to have a mean of 0 and a standard deviation of 1, permitting comparisons across different measures and tests -- the shape of the z-score distribution is identical to the shape of the distribution of raw scores
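The standardization steps above can be sketched with hypothetical raw scores:

```python
import statistics

scores = [50, 60, 70, 80, 90]      # hypothetical raw scores
mu = statistics.mean(scores)
sigma = statistics.pstdev(scores)  # population SD, as in the usual z formula

z = [(x - mu) / sigma for x in scores]
print([round(v, 2) for v in z])    # → [-1.41, -0.71, 0.0, 0.71, 1.41]
```

The resulting distribution has mean 0 and standard deviation 1, which is what permits comparison across different measures.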
what is construct validity
-- constructs are characteristics that cannot be directly observed or measured such as anxiety or hope -- construct validity refers to the adequacy of measuring an abstract trait or the theoretical meaning of a construct -- it is directly related to the appropriateness and adequacy of the data, analysis, findings, and level of external validity
what is a correlation
-- correlational or observational studies are used to examine the relationship between two interval or ratio scale continuous variables -- a positive or direct correlation indicates that as one variable moves up so does the other, while a negative or inverse correlation shows that the variables move together but in opposite directions
what is a mixed method design
-- designs that use both quantitative and qualitative data in order to draw integrated conclusions on the construct of Interest -- it corrects for the limitations of both quantitative and qualitative designs, increasing the strength of the study
what are some assumptions of chi-square tests
-- each observation is independent of the others -- no individual's response can be classified in more than one category -- calculations must be based on observed frequencies, not percentages -- chi-square statistics are restricted to situations in which the expected frequency of each cell is 5 or higher
what is measurement validity
-- face validity reflects the extent to which a test taker feels the instrument measures what it is supposed to measure
what is forward selection analysis and backward elimination analysis
-- forward selection analysis is most commonly used and adds the independent variable with the highest correlation with the dependent variable first, then adds the most predictive ones one at a time -- backward elimination analysis starts with all of the significant independent variables, removing the least predictive ones one at a time
what is cross-sectional research
-- groups at each level are assessed at the same time; for example, 12 groups of students, one at each grade level, are observed at one time -- it assumes that differences reflect natural development -- Advantage: it requires much less time than a longitudinal study -- disadvantage: differences may be due to cohort effects; that is, group differences reflect different experiences rather than natural development
what is variability
-- it refers to the extent to which measurements differ from each other -- it describes how much the scores differ among themselves within a given distribution -- the simplest statistic is the range, but it is limited in usefulness because it is affected by outliers and gives little information about the distribution -- variance and standard deviation are better measurements
what is a matched subjects design
-- an individual in one sample is matched with a subject in another sample with respect to a specific variable of interest, rendering them equivalent or at least nearly so -- this reduces the error variance in the dependent variable (as long as it is related to the matching variable) and increases the power of the study -- some disadvantages, however, are the potential decrease in external validity and the possibility that matching might alert the subjects to the research hypothesis
what is principal component analysis PCA
-- it analyzes all the variability in a set of observed variables to produce a smaller number of components -- factor loading is the correlation between an observed variable and a given factor -- an eigenvalue is the amount of variability in the observed variables accounted for by a given factor and is calculated as the sum of the squared factor loadings for that factor -- preferable to principal axis factoring (PAF) for data reduction
what is a cross sequential research design
-- it combines longitudinal and cross-sectional designs -- different groups are assessed repeatedly over time -- considered the most powerful of the three designs -- the advantage is that it reduces the time required to perform and minimizes the assumptions/cohort effects -- the disadvantage is that it is much more demanding and therefore rarely seen in research
what is the goodness of fit test AKA the one way chi-square test
-- it determines the extent to which observed frequencies across one independent variable fit expected frequencies predicted by the null hypothesis
what is statistical significance
-- it determines whether or not a relationship or mean difference is of sufficient magnitude -- it can be subject to Type I or Type II errors
what is a mediating variable
-- it explains the process, mechanism, or means by which a variable produces a particular outcome -- for example, a researcher finds that a new therapy improves one's self-beliefs, and this more positive self-view then reduces depression (self-belief quality would be the mediating variable here)
what is a MANOVA
-- it includes more than one DV -- if a significant effect is obtained, follow up with univariate ANOVAs for each DV in order to interpret -- the advantages are that it protects against inflated alpha from numerous ANOVAs; with multiple DVs, it might reveal effects not found by separate ANOVAs -- disadvantages include a more complicated design and ambiguity as to which IV affects which DV; increases in power are perhaps offset by a loss in degrees of freedom
how is SEM more powerful than multiple regression
-- it incorporates multiple latent and manifest independent variables and dependent variables (perhaps measured with multiple indicators) -- it incorporates measurement error and correlated error terms -- it incorporates interactions and nonlinearities
what is stepwise multiple regression AKA statistical regression
-- it involves adding or subtracting predictors one at a time and calculating the multiple correlation coefficient (AKA R-squared or coefficient of multiple determination) -- the ultimate goal of stepwise regression is to identify the fewest number of predictors needed to account for criterion variability
what is adjusted r-squared
-- it is a conservative reduction that penalizes r-squared for adding variables -- it is a response to the fact that chance variations in some independent variables explain small parts of the variance of the dependent variable -- required when the number of independent variables is high relative to the number of cases or when comparing models with different numbers of independent variables
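A sketch of the usual adjustment formula, with hypothetical values for R-squared, the number of cases, and the number of predictors:

```python
def adjusted_r_squared(r2, n, k):
    """Penalize R-squared for the number of predictors:
    1 - (1 - R^2) * (n - 1) / (n - k - 1), with n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# hypothetical model: R^2 = .50 from 5 predictors on only 30 cases
print(round(adjusted_r_squared(0.50, n=30, k=5), 3))  # → 0.396
```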
what is a meta-analysis
-- it is a set of statistical procedures used to combine the results of multiple studies to determine the relationship between variables -- it requires multiple related data sets, findings, and interpretations that must be recoded and analyzed as one research sample
how do degrees of freedom vary from test to test in a chi-square
-- it is based on the number of categories used to organize the data and therefore differs from study to study; for a one-way (goodness of fit) test, df = C − 1, where C is the number of categories, and for a two-way test of independence, df = (R − 1)(C − 1)
what is the odds ratio OR
-- it is used in binary logistic regression analysis as a measure of effect size -- the closer the value is to 1, the more likely it is that the independent variable does not significantly influence the outcome, i.e. that the predictor and the dependent variable are independent
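An odds ratio can be sketched from a hypothetical 2x2 table of exposure by outcome:

```python
# hypothetical 2x2 table: rows = exposed/unexposed, columns = outcome yes/no
a, b = 30, 70  # exposed: outcome present / absent
c, d = 15, 85  # unexposed: outcome present / absent

odds_ratio = (a / b) / (c / d)  # equivalently (a * d) / (b * c)
print(round(odds_ratio, 3))  # → 2.429; a value near 1 would suggest no effect
```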
what is an ANCOVA
-- it is used when a covariate ( extraneous variable) is used to account for a portion of the variance in the DV -- The covariate is continuous, must correlate with the DV, and must be measured prior to the IV to ensure independence of treatment -- the ultimate goal of this analysis is to reduce error variation
what is a moderating variable
-- it is what influences the direction, nature, and magnitude of the relationship between the specified IVs and DVs -- for example, a researcher finds that a new therapy reduces depression, but only among men. In this case sex is a moderating variable -- in correlational research, a moderating variable is explained in terms of a third variable that will affect the zero-order correlation between the other two variables --with ANOVA a basic moderator effect is seen as an interaction between the chosen independent variable and the factor that specifies the conditions that allow it to operate
what are the advantages of using a multiple Baseline design
-- it provides flexibility in that the findings can be extended to other people, places, and behaviors -- if a particular intervention is not effective the researchers can make efforts to improve it before extending it to other people, behaviors, or things
what is principal axis factoring (PAF)
-- it reduces a set of observed variables to factors but does so by removing the variability unique to the individual items -- usually, PCA and PAF yield similar results, but PCA is the preferred method for data reduction, while PAF is preferred when the goal of analysis is to detect structure
what is theoretical validity
-- it refers to the adequacy and appropriateness of how the stated theories explain the relationships between constructs and the degree to which the theoretical findings are consistent with the analysis and interpretation of the research data
what is a family-wise alpha and the bonferroni adjustment
-- it refers to the overall probability of committing a Type I error for a set of statistical tests (for example, running multiple t-tests) -- it is approximately equal to the sum of the individual alphas (for example, running 5 t-tests each with an alpha of .05 yields a family-wise alpha of approximately .25) -- this can be corrected with the Bonferroni adjustment (dividing alpha by the number of tests), or an alternative test such as a MANOVA can be conducted to replace multiple ANOVAs
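The 5-test example can be sketched numerically, comparing the additive approximation with the exact family-wise error rate and the Bonferroni per-test alpha:

```python
n_tests, alpha = 5, 0.05

approx = n_tests * alpha            # additive approximation: .25
exact = 1 - (1 - alpha) ** n_tests  # exact P(at least one Type I error)
bonferroni = alpha / n_tests        # per-test alpha after adjustment

print(round(approx, 2), round(exact, 3), round(bonferroni, 3))  # 0.25 0.226 0.01
```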
what is R-squared AKA coefficient of determination or multiple determination if there are two or more predictors
-- it reflects the reduction in error made when using the regression model to predict values of the dependent variable, referenced to the total error made when using only the dependent variable's mean as the estimate for all cases -- it cannot be compared between samples due to differences in the variances of the dependent variable
when does one use stepwise regression analysis and what are the drawbacks
-- it should only be used in the exploratory phase of research for purposes of pure prediction and not for theory testing -- the .05 significance level used at each step is subject to inflation, so that the real significance level by the last step may be much worse -- potentially above .50, dramatically increasing the chances of Type I errors -- collinearity can also be a problem with stepwise methods
what is a mixed design ANOVA
-- multiple IVs including both within-subjects (e.g. time) and between-subjects (condition) factors -- example: pre and post-test with control condition
what is a factorial (n-way) ANOVA
-- n represents the number of IVs, or factors -- used when examining effects of two or more IVs -- an example would be a 2x2 design with two IVs, each with two levels (sex: male/female; instruction method: novel/traditional) -- main effects: difference across sexes, difference across instructional methods -- interaction effects: differing effect of one IV at different levels of another (e.g. women achieve higher scores than men, but only with the novel method of instruction) -- with an interaction, interpret the main effect with caution because there is more to the story than just women scoring higher than men -- when graphing means using separate lines for different levels of one IV, parallel lines indicate a lack of an interaction effect
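The parallel-lines check can be sketched with hypothetical cell means for the 2x2 example (sex x instruction method):

```python
# hypothetical cell means for a 2x2 design
means = {
    ("women", "novel"): 85, ("women", "traditional"): 70,
    ("men", "novel"): 72, ("men", "traditional"): 71,
}

# equal sex gaps across methods would mean parallel lines (no interaction);
# here the gaps differ, so sex and method interact
novel_gap = means[("women", "novel")] - means[("men", "novel")]
trad_gap = means[("women", "traditional")] - means[("men", "traditional")]
print(novel_gap, trad_gap)  # 13 vs -1: unequal gaps indicate an interaction
```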
what are some problems with single subject designs
-- often non statistical methods for evaluating data are implemented and used in drawing conclusions -- autocorrelation and practice effects stemming from repeatedly measuring the same subject on the same variable -- the amount of time required -- potential lack of generalizability -- the carryover effect: the effect that carries over from one experimental condition to another whenever subjects perform in more than one condition
what is longitudinal research
-- one group is followed for an extended period of time -- advantage: it can provide valuable qualitative and quantitative data -- disadvantages: lengthy time requirement, participant mortality, lack of randomization, and history, which is a primary threat to internal validity
how do percentile ranks differ from percentages
-- percentage scores reflect the number of items answered correctly out of a specified total number -- percentile rank refers to a number between 0 and 100 that indicates the percentage of cases falling at or below a particular score; for example, if your score on an examination is in the 93rd percentile, 93% of the scores fall at or below yours
what is an interrupted time series design
-- repeated measurements are made on the participants both before and after a manipulated intervention or a naturally occurring event -- it is a quasi-experimental, single-group, multiple pre- and post-test design -- with multiple pre-tests, trends can be identified prior to the treatment -- seasonal effects entangled with the measurements could be confounded with the treatment effect
what is research validity
-- researcher bias must be addressed in both qualitative and quantitative research, but it is particularly important in qualitative studies -- one's prior experience, values, and personal beliefs must be addressed, as well as personal and/or professional relationships with the study participants
what is ABA Design
-- similar to the ABAB, but it does not implement the treatment again --ABAB is the preferred design
what are some differences between cluster and stratified sampling
-- stratified sampling draws from each stratum; its main objective is improved precision -- cluster sampling involves only elements of randomly selected clusters; its main objective is to improve sampling efficiency and reduce cost
what is a one-way ANOVA
-- tests for differences in 1 DV across multiple levels (i.e. conditions) of one IV -- it does not state which means differ from one another and one must conduct post hoc tests
what are some criticisms with meta-analysis
-- the central issue is that researchers occasionally include poor quality studies and meta-analysis affords equal weight to all studies -- a sound meta-analysis should clarify the inclusion and exclusion criteria used in determining which studies are selected
what is reliability
-- the extent to which a measure or test is consistent and repeatable -- it is necessary, but not sufficient, for validity
what is a proband AKA patient zero
-- the first family member to seek medical attention for a genetic disorder -- following the study of the proband, relatives are studied to determine the frequency with which they are likely to receive the same diagnosis -- if a genetic predisposition is found, the likelihood that first-degree relatives of probands also have the disorder is higher than that found in the general population -- e.g. approximately 10% of first-degree relatives of probands are diagnosed with schizophrenia versus 1% of the general population
what is the Solomon four group design
-- the purpose of it is to control for the effects of testing, that is practice effects. --4 groups 1. experimental (pre and post-test): these subjects experienced a pretest, treatment, and post test 2. control (pre and post-test): these subjects experience the pretest and posttest but no treatment 3. experimental (post test only): these subjects experience the treatment as well as the post test but no pretest 4. control (post test only): these subjects experience only the post test with no pretest or treatment
what is the F ratio
-- the variance between groups (error + treatment) divided by the variance within groups (error) -- the formula is MSB/MSW -- MSB = the mean of the squared deviations between groups -- MSW = the mean of the squared deviations within groups -- a ratio of about 1 indicates a lack of treatment effect -- a ratio significantly larger than 1 suggests that the means are farther apart than would be expected from sampling error alone, and thus a significant effect of the IV on the DV
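The MSB/MSW computation can be sketched by hand for three hypothetical groups; in practice one would typically use scipy.stats.f_oneway.

```python
import statistics

groups = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]  # three hypothetical treatment groups
grand_mean = statistics.mean(x for g in groups for x in g)
k = len(groups)                  # number of groups
n = sum(len(g) for g in groups)  # total observations

ssb = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
ssw = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
msb = ssb / (k - 1)  # mean squared deviations between groups
msw = ssw / (n - k)  # mean squared deviations within groups
print(msb / msw)     # F ratio; well above 1 for these hypothetical groups
```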
what did Grove and Meehl find in their 1996 study
-- they found that clinical judgments rarely outperformed actuarial data when making predictions -- out of 136 studies: 8 favored clinical judgment, 64 favored actuarial methods, and 64 were equivalent -- a later meta-analysis found that effect sizes for actuarial data were 10% better than clinical judgment in predicting delinquent and criminal behaviors
what are some special considerations regarding percentile rank
-- they have a uniform distribution: an equal number of values is expected at any given percentile rank -- changes in scores in the middle of a distribution, where most values are clumped, are associated with larger changes in percentile rank than at the extreme ends; e.g. percentile ranks increase from the 50th to the 84th percentile when going from the mean to one standard deviation above the mean, whereas they only increase from the 84th to the 98th percentile when going from one to two standard deviations above the mean
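The percentile figures above can be checked against the standard normal CDF:

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal distribution (mean 0, SD 1)
print(round(nd.cdf(0) * 100))  # 50 -> the mean sits at the 50th percentile
print(round(nd.cdf(1) * 100))  # 84 -> +1 SD is roughly the 84th percentile
print(round(nd.cdf(2) * 100))  # 98 -> +2 SD is roughly the 98th percentile
```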
what is the test for Independence aka the two-way chi-square test
-- used to determine whether the frequency distribution for one variable is or is not dependent upon the categories of a second variable (e.g. occurrence of an illness among those with a certain risk factor) -- the expected frequency for each cell of a contingency table is calculated by multiplying the column total (e.g. number of people ill) by the row total (e.g. number of people exposed to the risk factor) and dividing by N, the total number of people observed
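The expected-frequency rule can be sketched with a hypothetical 2x2 contingency table (scipy.stats.chi2_contingency performs the full test in practice):

```python
# hypothetical 2x2 contingency table: risk factor (rows) x illness (columns)
table = [[20, 30],   # exposed:   ill, not ill
         [10, 40]]   # unexposed: ill, not ill

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# expected frequency per cell = row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected)  # [[15.0, 35.0], [15.0, 35.0]]
```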
what is counterbalancing
-- varying the sequence of treatments when subjects are exposed to all of the treatments -- the Latin Square design is one type of counterbalancing procedure -- the advantage is that a researcher can separate and examine the between-subjects treatment sequence variable in addition to the within-subjects treatment variable -- the disadvantages are that counterbalancing may require more participants and more sophisticated analysis
how does protocol analysis evaluate an individual's problem-solving approach
-- verbalizations -- reaction times, error rates, patterns of brain activation, and sequences of eye fixations -- there seems to be a close correspondence between participants' thoughts and the information or objects they examine -- protocol analysis is one of the principal research methods within cognitive psychology, cognitive science, and behavior analysis
how do the relationships among the measures of central tendency depend on the shape of the distribution
-- with a normal distribution, all are equal -- in a positively-skewed distribution, skewed to the right, the mean is greater than the median, which is greater than the mode -- in a negatively skewed distribution, the mode is greater than the median, which is greater than the mean
how is multiple Baseline design different from an ABAB design
-- with a multiple baseline design, change should occur only when the treatment is introduced, and it may be used when a return to baseline is not possible; for example, one cannot unlearn how to ride a bike -- an ABAB design examines one behavior while introducing, removing, and reintroducing an intervention to one person in one setting
can you determine the percentile rank from a standard score and vice versa
--84% of a distribution falls below a z-score of 1 or a T-score of 60; as such, either of these two standard scores may also be referred to as the 84th percentile
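The z-score/percentile conversion can be done with the standard normal CDF; here is a quick sketch using Python's stdlib `statistics.NormalDist`:

```python
from statistics import NormalDist

# Percentile rank from a z-score via the standard normal CDF.
z = 1.0
percentile = NormalDist().cdf(z) * 100   # ~84.13, i.e., the 84th percentile

# And back: z-score from a percentile rank via the inverse CDF.
z_back = NormalDist().inv_cdf(0.8413)    # ~1.0

# A T-score of 60 (mean 50, SD 10) is the same standard score as z = 1.
t_score = 50 + 10 * z
```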
what is the difference between discriminant analysis and logistic regression
--LR prefers a dichotomous DV, whereas DFA prefers a nominal DV (more than two categories) --LR accepts continuous and categorical predictors; DFA uses only continuous predictors --LR is more interested in the independent variables' power to predict the outcome, whereas DFA is more interested in the outcome itself --LR is preferred over DFA when the stricter DFA assumptions are not met --LR generally requires larger samples
what is latent trait analysis (LTA), AKA item response theory (IRT)
--a form of factor analysis for binary/dichotomous or ordered-categorical data (e.g., multiple choice) -- frequently used in educational testing and psychological measurement -- researchers use LTA to reduce a set of many binary or ordered-categorical variables to a small set of factors called latent traits
what is multistage sampling
--a more complex form of cluster sampling --e.g. a school district might be chosen from all school districts, then a school might be randomly chosen from that district, and then a classroom randomly chosen from that school, and the children from that classroom are included in the study
what is counterbalancing and the 2 types
--a procedure that is a tool for controlling order effects by strategically presenting treatments in a variety of sequences 1. within-subject counterbalancing: different treatment sequences are presented to each participant 2. within-group counterbalancing: the presentation of treatments is varied in different ways across participants
what is standard error
--a way to measure the average or standard distance between a sample mean and the population mean -- basically, it is a way to measure sampling error -- it is expressed in standard deviation units and is affected by two quantities: the standard deviation of the variable and the sample size --SE = SD / square root of N
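The SE = SD / sqrt(N) formula, sketched with a hypothetical sample chosen so the numbers come out round:

```python
import math
import statistics

# Standard error of the mean = SD / sqrt(N).
sample = [4, 8, 6, 5, 3, 7, 8, 9, 4, 6]   # hypothetical scores, mean = 6

sd = statistics.stdev(sample)             # sample SD, N - 1 denominator -> 2.0
se = sd / math.sqrt(len(sample))          # 2 / sqrt(10), about 0.63
```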
what is criterion-related validity
--establishes a statistical relationship with a particular criterion variable independent of the measurement instrument itself --2 types 1. concurrent validity: the degree to which scores on a given test (e.g., exposure to sexual abuse) correlate with scores on another, already established test (e.g., post-traumatic stress reactions) administered at the same time 2. predictive validity: determined by the degree to which scores on a given test correlate with scores on another already established test or criterion that is administered at a future point in time
what is effect size
--it indicates whether an effect is practically rather than merely statistically significant; that is, is the effect large enough to matter? -- it is used in meta-analysis to combine findings from multiple research studies due to its independence from sample sizes -- specific effect size measures vary by context
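One common effect size measure is Cohen's d, the standardized mean difference between two groups; a minimal sketch with hypothetical summary statistics:

```python
import math

# Cohen's d: standardized mean difference (all numbers hypothetical).
mean1, mean2 = 105.0, 100.0
sd1, sd2 = 15.0, 15.0
n1, n2 = 50, 50

# Pooled SD weights each variance by its degrees of freedom.
pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
d = (mean1 - mean2) / pooled_sd   # 5 / 15, about 0.33: a small-to-medium effect
```

Because d is expressed in SD units rather than raw units, effects from studies with different measures and sample sizes can be combined, which is why meta-analysis relies on it.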
what is internal validity
--it involves the appropriate research design and methodology, data collection procedures, and the analysis and interpretation of findings -- method triangulation (using multiple research methodologies within the study) and data triangulation (using multiple sources of data for analysis) can increase the internal validity of a study
what is content validity
--it refers to the adequacy with which a specified domain of content or theoretical/social construct is measured within a representative sample -- this usually involves an expert systematically assessing whether the questions in the test accurately measure the key characteristics of the specific domain or phenomenon they are intended to measure
what is the central limit theorem
--it states that if random samples of a fixed size N are drawn from any population, the distribution of sample means will approach a normal curve as N increases -- the mean of that distribution will be equal to the mean of the population -- its standard deviation (that is, the standard error of the mean) will be equal to the population standard deviation divided by the square root of N
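A quick simulation makes the theorem concrete: sample means drawn from a decidedly non-normal (uniform) population still cluster around the population mean with spread close to SD / sqrt(N). The population and sample counts here are arbitrary choices:

```python
import random
import statistics

# Draw many samples of size N from a uniform(0, 1) population
# (population mean 0.5, population SD 1/sqrt(12) ~ 0.289).
random.seed(0)
N = 30
means = [statistics.mean(random.uniform(0, 1) for _ in range(N))
         for _ in range(2000)]

grand_mean = statistics.mean(means)   # close to the population mean, 0.5
se = statistics.stdev(means)          # close to 0.289 / sqrt(30) ~ 0.053
```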
how does hierarchical regression differ from stepwise regression
--it utilizes the researcher, not the computer, to determine the order of entry of the variables --F-tests are used to compute the significance of each variable or group of variables as reflected in the change in R-squared
what is a chi-square test
--non-parametric test (doesn't assume normality) -- tests the significance of the difference among frequencies within a given data distribution --data are categorical (e.g., eye color)
how are confidence intervals constructed
--say you have a mean = 20 --standard error = 0.1 --assume a large enough sample that the z-distribution may be used to approximate t --multiply the two-tailed z-value of 2 (rounded up from 1.96) times the standard error of the mean (0.1) --the confidence interval of our mean would then be 20 +/- 0.2 (19.8-20.2) --you can use this for single scores also, to determine the range in which a test-taker's true score lies
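The same worked example in Python, using the exact z of 1.96 rather than the rounded 2:

```python
# 95% confidence interval for the worked example: mean 20, SE 0.1.
mean, se, z = 20.0, 0.1, 1.96

margin = z * se                      # 0.196, roughly the 0.2 used above
ci = (mean - margin, mean + margin)  # about (19.80, 20.20)
```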
what are two disadvantages to using the mean
--the data require an interval or ratio variable; that is, there must be meaningful differences between values -- it is affected by extreme values / outliers
what is autocorrelation
--the relationship between two values of the same variable measured at different times -- the correlation between measurements of a dependent variable taken at different times from the same subject -- in regression analysis it tends to underestimate error terms, inflating t-values and reducing p-values so that results are more likely to be deemed statistically significant
what is variance
--the statistical average of the amount of dispersion in a distribution of scores -- it is equal to the square of the standard deviation and measures a distribution's variability, or how the scores are dispersed around the mean -- when the range of scores increases, the variance also increases; conversely, the tighter the distribution, the smaller the variance --sample variance is denoted by s² and population variance by the Greek σ² --calculated as the sum of squared deviations divided by (N-1) -- the variance is a critical step in the calculation of the standard deviation (SD; the square root of the variance)
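The sum-of-squared-deviations / (N-1) calculation, checked against the stdlib's own `statistics.variance` on a hypothetical score set:

```python
import statistics

# Sample variance = sum of squared deviations / (N - 1); SD = sqrt(variance).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)                 # 5.0
ss = sum((x - mean) ** 2 for x in scores)      # sum of squared deviations = 32
variance = ss / (len(scores) - 1)              # 32 / 7

# Matches the library's sample variance (N - 1 denominator).
assert abs(variance - statistics.variance(scores)) < 1e-12
sd = variance ** 0.5
```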
what is pooled variance
--the weighted average of two sample variances -- calculated by first multiplying each sample variance by its degrees of freedom (condition sample size N-1) -- then dividing the sum of those two values by the total degrees of freedom (total sample size minus 2) -- pooled variance operates on the assumption that the population variances are equal, even if the sample variances differ -- the combined estimate of the population variance is thought to be a better estimate than either of the two sample variances alone
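The two-step calculation above in one line of Python, with hypothetical sample sizes and variances:

```python
# Pooled variance: weighted average of two sample variances, each weighted
# by its degrees of freedom (n - 1). All values are hypothetical.
n1, var1 = 10, 4.0
n2, var2 = 20, 7.0

pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
# (9 * 4 + 19 * 7) / 28 = 169 / 28, about 6.04 -- closer to the variance
# of the larger sample, as the df weighting intends.
```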
what are 2 disadvantages of using median and mode
--they are not as easily manipulated algebraically as the mean; they are not mathematically friendly -- they are not as stable an estimate as the mean
what is external validity
--this has to do with how generalizable the research findings and conclusions are to other populations with similar characteristics -- internal validity must be strong because it influences external validity
what is the purpose of developmental research
--to assess changes over an extended period of time -- includes longitudinal, cross-sectional, and cross-sequential studies
what is the attenuation correction formula
--true correlation = obtained correlation ÷ sqrt (reliability of x * reliability of y) -- note that if the absolute value of the result is greater than 1, round it to 1, because Pearson's r ranges from -1 to 1
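The attenuation correction formula as a small helper function (the example correlation and reliabilities are made up):

```python
import math

# Correction for attenuation: estimated true correlation between constructs,
# given the observed correlation and each measure's reliability.
def corrected_r(observed_r, rel_x, rel_y):
    r = observed_r / math.sqrt(rel_x * rel_y)
    # Pearson's r ranges from -1 to 1, so clamp any overshoot.
    return max(-1.0, min(1.0, r))

# e.g., observed r = .40 with reliabilities .70 and .80:
r_true = corrected_r(0.40, 0.70, 0.80)   # 0.40 / sqrt(.56), about .53
```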
what is the canonical correlation coefficient
--two sets of variables, one representing multiple independent variables, the other representing multiple dependent variables -- so it examines many-to-many rather than one-to-one or many-to-one relationships -- similar to factor analysis, it produces multiple canonical correlation coefficients (the first accounting for the largest portion of the relationship)
what is the difference between univariate analysis, bivariate analysis, and multivariate analysis
--univariate analysis is descriptive and explains the qualities of a single variable at a time, while bivariate analysis evaluates the properties of two variables in relation to each other -- multivariate analysis considers the properties and relationships of more than two variables at a time -- both bivariate and multivariate statistical analyses are inferential
what is a confidence interval
--usually it is set at 1-alpha (1-a; 95%) --basically, it says that 95% of intervals constructed this way are expected to contain the population mean --e.g., if 100 samples were drawn and a confidence interval computed for each, 95 of those intervals would be expected to contain the population mean
z-scores are assumed to have a normal distribution. What percentages are included between the mean and one, two, and three standard deviations from the mean
1. 50% of the distribution falls on either side of the mean--and here is how it breaks down on one side of the mean: 2. 34% of the scores lie between the mean and one standard deviation 3. 14% between one and two standard deviations from the mean 4. 2% between two and three standard deviations from the mean
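These rounded percentages come straight from the standard normal CDF, which the stdlib can evaluate:

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal: mean 0, SD 1

# Percent of scores between the mean and 1, 2, and 3 SDs (one side only).
within_1 = (nd.cdf(1) - nd.cdf(0)) * 100     # ~34.1%
between_1_2 = (nd.cdf(2) - nd.cdf(1)) * 100  # ~13.6%
between_2_3 = (nd.cdf(3) - nd.cdf(2)) * 100  # ~2.1%
```

The flashcard's 34/14/2 figures are these values rounded to whole percents.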
what are some examples of post hoc tests one can run if your ANOVA shows significant differences in means
1. Scheffé's test: conservative; provides more protection against type I errors but increases the likelihood of type II errors 2. Tukey's HSD: best for pairwise tests and protection against type I errors (Dunnett's is also good) 3. Fisher's LSD: liberal
what are the steps in meta-analysis
1. conduct a thorough literature search 2. convert all results to a common scale using a statistical measure known as effect size; this is important because different studies typically use different statistical tests
what are the four key characteristics of a single subject design
1. continuous assessment: the subject is observed at several different times over a set period before and after the intervention is introduced 2. baseline assessment: where the subject is assessed for a distinct period of time before the intervention is implemented 3. stability of performance: when there is little variability over time; if there is too much variability in the behavior prior to the intervention, being able to evaluate the impact of a subsequent intervention would be very difficult 4. the use of different phases: such as weeks or days in which particular conditions (either the baseline or an intervention) are implemented and the data are collected; the effects are inferred from assessing the pattern of data across the phases
what are two categories / subtypes of construct validity
1. convergent validity: demonstrated when tests that purport to measure the same attribute correlate strongly, and when tests that purport to measure theoretically related attributes correlate to a moderate degree 2. divergent validity: tests that purport to measure theoretically unrelated attributes should correlate negligibly
what are three ways one can control extraneous variables in a study
1. elimination: for example, researchers may eliminate clothing to ensure that cultural elements of dress do not influence identification of facial expressions 2. constancy: ensure that the variable is experienced by all participants 3. balancing: distributing the variable evenly across groups using a matched-pairs design
what are the four general mixed-methods designs
1. explanatory: qualitative data used to clarify quantitative data; with the sequential explanatory strategy, the quantitative data are collected and analyzed first to indicate major factors, domains, or variables for further qualitative exploration 2. exploratory: quantitative data used to clarify qualitative data; sequential exploratory designs are the opposite of explanatory designs in that the design begins with a collection of qualitative data to identify a theoretical framework, which can later be tested through quantitative data collection and analysis 3. triangulation: strengths of both methods used to compare, contrast, and validate findings; AKA concurrent triangulation 4. embedded: another concurrent design where one method has priority over the other; that is, one method provides the primary data set and the other serves as the secondary, interpretive method
what are 9 main threats to internal validity
1. history: for example, we cannot determine whether changes in neighborhood crime rates are attributable to the new program or to additional officers on patrol 2. maturation: an instructional method appears to improve reading ability for 4th graders over the course of an academic year, but is it due to the instructional method or does it reflect normal child development 3. testing: do the fourth grade students score higher on the second administration of a reading ability measure because of the treatment, or did they just learn from already taking the test 4. instrumentation: changes to the measure; for example, if the contents of a political knowledge survey become widely known, the value of the instrument as a general knowledge measure is reduced 5. regression to the mean: for example, if extremely aggressive offenders are placed in an experimental program, one might question whether decreased aggression levels are a result of the intervention or just regression to the mean 6. selection: participants may differ systematically by the experimental condition to which they are assigned; for example, someone is comparing the student heights of public vs. private colleges but the public college sample includes only women and the private school sample contains only men 7. mortality AKA dropout: for example, maybe the drug abuse treatment subjects who are waiting for treatment drop out of the study to go get it, and now the formerly equivalent groups differ in their proportion of severely addicted participants 8. interactions with selection: for example, in selection-maturation, groups might mature at different rates, leaving doubts as to what role the independent variable played in producing the effect 9. ambiguity about the direction of causation: for example, an observational study measuring sociability and self-esteem may establish a relationship between the two but fail to determine the causal nature of the relationship and/or whether both are brought about by a third factor
name three different threats to internal validity when using the interrupted time series design
1. history: an event other than that under investigation occurs; may include control condition in two- group time series design 2. instrumentation: different pre and post-test may produce an illusory effect 3. testing: merely completing the pre-tests serves as a form of intervention to alter post test results
what are some ways to increase the power of a study, that is, to increase the likelihood of finding an effect
1. increase the mean difference or magnitude of effect 2. decrease the sample variance 3. increase sample size (a power analysis entails calculating the sample size necessary to find an effect of a given size) 4. increase alpha, thereby running a less conservative test 5. conduct a one-tailed or directional test instead of a two-tailed test
what are the assumptions of Pearson's r
1. independent observations 2. linear relationship between the two variables 3. bivariate normality -- also important to note is that Pearson's R is susceptible to bivariate outliers and at times a suppressor variable may affect results, reducing or even obscuring a correlation that is actually present
what are the five main threats to external validity that limit the generalizability of research results
1. interaction between different treatments: this happens when subjects receive more than one treatment 2. interaction between testing and treatment: the measurement interacts with the treatment; for example, a pretest prepares participants to pay particular attention to select components of the treatment; comparison to posttest-only control groups is recommended if this is a concern 3. interaction between setting and treatment: findings in one setting (for example, a grade school classroom) are not applicable in another (the front line of a combat zone) 4. interaction between history and treatment: a treatment has different effects at different times (for example, before and after the attacks on the World Trade Center); if this is a concern one might perform a literature review on earlier work or replicate the study at a different time 5. reactivity: when measuring behavior, an individual's behavior may be a reaction to being measured
what are the four types of measurement scales and which variables can be measured
1. nominal: used for identifying or labeling 2. ordinal: ranking from highest to lowest but no indication of how much higher or lower one is in relation to another (e.g., Likert scales) 3. interval: represent continuous numeric values but do not have an absolute zero point 4. ratio: also represent continuous numeric values but do have an absolute zero, so you can say that X is 2 times more than Y
what are three types of t-tests
1. one sample t-test: tests the hypothesis that a single sample mean is different from a specific hypothesized value 2. independent samples t-test: tests the hypothesis that two unrelated samples are different from each other 3. related or dependent samples t-test: tests the hypothesis that the difference between two related samples (for example, pre- and post-test scores) is not equal to zero (that is, the samples have different means)
what are some measures of variability accounted for
1. r-squared (single predictor) and R-squared (multiple predictors): proportion of variation accounted for in one variable through a linear relationship with another (or others for R-squared) 2. eta-squared: proportion of variability accounted for in one variable by a relationship (not necessarily linear) with another (or others) 3. squared factor loading: proportion of variability accounted for in one variable by a factor
in a Time series design what two purposes does autocorrelation serve
1. to detect non-randomness in the data (statistical significance) 2. to identify an appropriate time series model if the data are not random -- consistent relationships in time series data can be used to predict future values in the series
in which cases is the t-test more powerful
1. with larger sample sizes 2. larger mean differences 3. small sample variation
how do literature reviews and meta-analyses differ
a literature review is a summation of all of the research regarding a particular topic, while a meta-analysis includes both a summation and a calculation of effect size
what is a subject variable
a quasi-independent variable that is not manipulated by the experimenter, e.g., sex, as in comparing males vs. females on some measure
what is portfolio assessment
a technique used in research in which a representative set of tasks and assignments is assembled for the psychologist to review
what is a uniform frequency distribution
equal frequencies across the distribution; that is, a block
what kind of validity is dependent upon random selection
external validity
how are observational / correlational studies and experimental studies different with regard to the measurement of variables
in correlational studies both the independent and dependent variables are measured, whereas in experimental studies one variable is manipulated (IV) and one is measured (DV)
what validity is affected by random assignment
internal validity -- when random assignment is not used, the researcher cannot be certain that the independent variable caused the effect
what is a multiple Baseline design
it attempts to replicate treatment effects across different behaviors, people, or settings
what is a double-blind
it is a design in which neither the experimenter nor the participants know what condition a participant has been assigned to
what is protocol analysis
it is a qualitative research technique that assumes that it is possible to instruct participants to verbalize their thoughts as they complete a task, thereby identifying the cognitive processes regarding the task at hand
what are some advantages of using the median
it is applicable to all BUT nominal data and is resistant to extreme values / outliers
what is discriminant analysis
it is used when several independent variables are used to predict group membership
what is systematic sampling
it uses a set interval to select participants from a randomly determined starting point
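A minimal sketch of systematic sampling, using a hypothetical frame of 100 participant IDs and an interval of 10:

```python
import random

# Systematic sampling: pick a random starting point, then take every k-th
# member of the sampling frame. Frame and interval are hypothetical.
population = list(range(100))   # sampling frame of 100 IDs
k = 10                          # sampling interval

random.seed(42)
start = random.randrange(k)     # random start within the first interval
sample = population[start::k]   # 10 participants, each k apart
```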
what is a semi partial correlation
it's the relationship between two variables after only one variable's relationship with a third variable has been removed
what are data reduction techniques
methods by which the interrelationships between a set of variables are analyzed to produce a smaller number of Dimensions or factors
what is Eta
an index of the relationship between two variables that can capture nonlinear relationships
what are sequence or order effects
occur when the order of treatments in a series influences a participant's response
what is point biserial correlation coefficient
one continuous variable, one dichotomous variable possessing only two values; an alternative or supplement to the t-test
what is Actuarial data
predictions based on statistical information rather than judgement
what is a j-shaped frequency distribution
skewed but without a tail on the side of the distribution with the mode
what is the coefficient of determination AKA r- squared
the proportion of variation shared by two variables; that is the amount of variability in one measure that is accounted for by the variability in the other
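Computing r-squared from scratch on two made-up variables makes the "shared variation" idea concrete:

```python
import math

# Coefficient of determination: square of Pearson's r (hypothetical data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx = sum(x) / len(x)
my = sum(y) / len(y)

# Pearson's r = covariance term / sqrt(product of sums of squares).
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
ss_x = sum((a - mx) ** 2 for a in x)
ss_y = sum((b - my) ** 2 for b in y)

r = cov / math.sqrt(ss_x * ss_y)
r_squared = r ** 2   # proportion of variability in y accounted for by x
```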
what are demand characteristics
they are the unintended cues that create an experimental artifact in that subjects can guess the hypothesis being tested and change their actions accordingly
what are demand characteristics
this refers to when a participant behaves according to what they think is expected based on external cues that a participant may receive throughout the treatment
what is tetrachoric correlation coefficient
two artificial dichotomous variables
what is biserial correlation coefficient
two continuous variables, one transformed into an artificial dichotomous variable
what is contingency correlation coefficient
two nominal variables
what is spearman's rho
two ordinal variables
what is attenuation
when a correlation coefficient decreases in response to measurement error in one or both variables