Research Methods and Statistics


Experimental Research - Group Designs

1. Between-Subjects Designs: A study using a between-subjects design includes two or more groups of subjects, with each group being exposed to a different level of the independent variable. -For example, in a study comparing the effectiveness of low, moderate, and high doses of an antidepressant for reducing depressive symptoms, one group of subjects would receive the low dose, a second group would receive the moderate dose, and a third group would receive the high dose. 2. Within-Subjects Designs: When using a within-subjects design, each participant is exposed to some or all levels of the independent variable, with each level being administered at a different time. -When using a single-group within-subjects design to evaluate the effects of an antidepressant drug dose on depression, the low, moderate, and high doses would be compared by sequentially administering the three doses to all subjects and evaluating their depressive symptoms after they've taken each dose for a prespecified period of time. -The time-series design is a type of within-subjects design that's essentially a group version of the single-subject AB design and involves measuring the dependent variable at regular intervals multiple times before and after the independent variable is administered so that all participants act as both the control (no treatment) and treatment groups. 3. Mixed Designs: When using a mixed design, a study includes at least two independent variables, with at least one variable being a between-subjects variable and another being a within-subjects variable. -A mixed design is being used when the effects of drug dose on depressive symptoms are measured weekly for six weeks after participants begin taking either the low, moderate, or high dose of the antidepressant. 
In this situation, drug dose is a between-subjects variable because each subject will receive only one dosage level and time of measurement is a within-subjects variable because each subject's depressive symptoms will be measured at regular intervals over time.

Threats to Internal Validity

1. History: History refers to events that occur during the course of a study and are not part of the study but affect its results. -The best way to control history when it's due to events that occur outside the context of the study is to include more than one group and randomly assign participants to the different groups. --When this is done, participants in all groups should be affected to the same extent by history. -History can also be a threat when participants are exposed to the independent variable in groups and one group experiences an unintended event (e.g., a power outage or other disturbance) that's not experienced by other groups and that affects the results of the study. --This type of history is more difficult to control and must be considered when interpreting the results of a study.
2. Maturation: Maturation refers to physical, cognitive, and emotional changes that occur within subjects during the course of the study that are due to the passage of time and affect the study's results. The longer the duration of the study, the more likely its results will be threatened by maturation. -The best way to control maturation is to include more than one group in the study and randomly assign participants to the different groups. --When this is done, participants in all groups should experience similar maturational effects and any differences between the groups at the end of the study will not be due to maturation.
3. Differential Selection: Differential selection is a misnomer because it actually refers to differential assignment of subjects to treatment groups. It occurs when groups differ at the beginning of the study due to the way they were assigned to groups and this difference affects the study's results. -The best way to control differential selection is to randomly assign participants to groups so the groups are similar at the start of the study.
4. Statistical Regression: Statistical regression is also known as regression to the mean and threatens a study's internal validity when participants are selected for inclusion in the study because of their extreme scores on a pretest. -It occurs because many characteristics are not entirely stable over time and many measuring instruments are not perfectly reliable. -Statistical regression is controlled by not including only extreme scorers in the study or by having more than one group and ensuring that the groups are equivalent in terms of extreme scorers at the beginning of the study.
5. Testing: Testing threatens a study's internal validity when taking a pretest affects how participants respond to the posttest. -This threat is controlled by not administering a pretest or by using the Solomon four-group design, which is described below.
6. Instrumentation: Instrumentation is a threat to internal validity when the instrument used to measure the dependent variable changes over time. -For example, raters may become more accurate at rating participants over the course of the study. -The only way to control instrumentation is to ensure that instruments don't change over time. If that's not possible, its potential effects must be considered when interpreting the study's results.
7. Differential Attrition: Differential attrition threatens internal validity when participants drop out of one group for different reasons than participants in other groups do and, as a result, the composition of the group is altered in a way that affects the results of the study. -Attrition is difficult to control because researchers often don't have the information needed to determine how participants who drop out from a study differ from those who remain.

Threats to External Validity

1. Reactivity: Reactivity threatens a study's external validity whenever participants respond differently to the independent variable during a study than they would normally respond. Factors that contribute to reactivity include demand characteristics and experimenter expectancy. --Demand characteristics: are cues that inform participants of what behavior is expected of them. --Experimenter expectancy: occurs when the experimenter acts in ways that bias the results of the study and can involve (a) actions that take the form of demand characteristics and directly affect participants (e.g., saying "good" whenever a participant gives the expected or desired response) or (b) actions that don't directly affect participants (e.g., recording the responses of participants inaccurately in a way that supports the purpose of the study). -The best ways to control reactivity are to use unobtrusive measures, deception, or the single- or double-blind technique. -When using the single-blind technique, participants do not know which group they're participating in (e.g., if they're in the treatment or control group); when using the double-blind technique, participants and researchers do not know what group participants are in. 2. Multiple Treatment Interference: Multiple treatment interference is also referred to as carryover effects and order effects. It may occur whenever a within-subjects research design is used - i.e., when each participant receives more than one level of the independent variable. -For example, if a low dose, moderate dose, and high dose of a drug are sequentially administered to a group of participants and the high dose is most effective, its superior effect may be due to the fact that it was administered after the low and moderate doses. -Multiple treatment interference is controlled by using counterbalancing, which involves having different groups of participants receive the different levels of the independent variable in a different order. 
-The Latin square design is a type of counterbalanced design in which each level of the independent variable occurs equally often in each ordinal position. 3. Selection-Treatment Interaction: A selection-treatment interaction is a threat to external validity when research participants differ from individuals in the population, and the difference affects how participants respond to the independent variable. -For example, people who volunteer for research studies may be more motivated and, therefore, more responsive to the independent variable than non-volunteers would be. The best way to control this threat is to randomly select subjects from the population. 4. Pretest-Treatment Interaction. A pretest-treatment interaction is also known as pretest sensitization and threatens a study's external validity when taking a pretest affects how participants respond to the independent variable. -For example, answering questions about a controversial issue in a pretest may make subjects pay more attention to information about that issue when it's addressed in a lecture or discussion during the study. -The Solomon four-group design is used to identify the effects of pretesting on a study's internal and external validity. --When using this design, the study includes four groups that allow the researcher to evaluate (a) the effects of pretesting on the independent variable by comparing two groups that are both exposed to the independent variable, with only one group taking the pretest and (b) the effects of pretesting on the dependent variable by comparing two groups that are not exposed to the independent variable, with one group taking the pre- and posttests and the other taking the posttest only.
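The counterbalancing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `latin_square` and the three dose labels are hypothetical, and the cyclic construction used here is just one simple way to build a Latin square.

```python
def latin_square(levels):
    """Return a cyclic Latin square: row i is the level list rotated left by i."""
    n = len(levels)
    return [[levels[(row + pos) % n] for pos in range(n)] for row in range(n)]


doses = ["low", "moderate", "high"]  # hypothetical level labels
orders = latin_square(doses)

# Each row is the administration order for one group of participants.
for group, order in enumerate(orders, start=1):
    print(f"Group {group}: {' -> '.join(order)}")
```

Each dose appears exactly once in each ordinal position. Note that a cyclic square balances position only: each dose is always preceded by the same dose, so it does not by itself balance which treatment follows which.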

Factorial Designs

A research design is referred to as a factorial design whenever it includes two or more independent variables. An advantage of a factorial design is that it allows a researcher to obtain information on the main effects of each independent variable as well as the interaction between the variables. -A main effect is the effect of one independent variable on the dependent variable. -An interaction effect is the combined effect of two or more independent variables on the dependent variable. Note that any combination of significant main and interaction effects is possible: -There can be main effects of one or more independent variables and interaction effects, main effects of one or more independent variables and no interaction effects, or no main effects but significant interaction effects. -In addition, when there's a significant interaction, any main effects must be interpreted with caution because the interaction may modify the meaning of the main effects. As an example, a research study evaluating the effects of type of therapy (cognitive-behavioral therapy, interpersonal therapy, and supportive therapy) and antidepressant drug dose (high, moderate, and low) on depressive symptoms has two independent variables (type of therapy and drug dose) and subjects will be assigned to one of nine groups that each represent a different combination of the levels of the two variables - cognitive therapy and low dose, interpersonal therapy and low dose, supportive therapy and low dose, cognitive therapy and moderate dose, etc. The results of the statistical analysis of the data obtained in this study will indicate if there are significant main and/or interaction effects. For example, the results might indicate that, overall, cognitive therapy is significantly more effective than interpersonal therapy and supportive therapy and the high dose is significantly more effective than the moderate or low dose of the antidepressant drug.
In other words, there are main effects for both type of therapy and drug dose. The results of the study might also indicate that the moderate dose is most effective for people who received cognitive therapy but that the high dose is most effective for people who received interpersonal therapy or supportive therapy. -In other words, there's an interaction between type of therapy and drug dose: The effects of drug dose differ for different types of therapy.
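The pattern in the therapy-by-dose example can be made concrete with a small sketch. The cell means below are invented purely for illustration (higher numbers mean more symptom improvement), and the variable names are hypothetical. The code only describes the pattern; deciding whether the effects are significant would require a statistical test such as a factorial analysis of variance.

```python
from statistics import mean

therapies = ["cognitive", "interpersonal", "supportive"]
doses = ["low", "moderate", "high"]

# Hypothetical mean improvement for each of the nine groups (made-up numbers).
cell_means = {
    ("cognitive", "low"): 6, ("cognitive", "moderate"): 12, ("cognitive", "high"): 9,
    ("interpersonal", "low"): 3, ("interpersonal", "moderate"): 5, ("interpersonal", "high"): 10,
    ("supportive", "low"): 2, ("supportive", "moderate"): 4, ("supportive", "high"): 9,
}

# Main effects are read from marginal means: average over the other variable.
therapy_means = {t: mean(cell_means[(t, d)] for d in doses) for t in therapies}
dose_means = {d: mean(cell_means[(t, d)] for t in therapies) for d in doses}

# An interaction shows up when the best dose depends on the type of therapy.
best_dose = {t: max(doses, key=lambda d: cell_means[(t, d)]) for t in therapies}

print("Therapy marginal means:", therapy_means)
print("Dose marginal means:", dose_means)
print("Best dose within each therapy:", best_dose)
```

With these invented numbers, cognitive therapy and the high dose have the highest marginal means, yet the moderate dose is best within cognitive therapy, which mirrors the interaction described above.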

Decision Errors

Because inferential statistics is based on probability theory, when a researcher makes a decision to retain or reject the null hypothesis based on the results of an inferential statistical test, it's not possible to be certain whether the decision is correct or incorrect. With regard to correct decisions, a researcher can either retain a true null hypothesis or reject a false null hypothesis. -When a researcher retains a true null hypothesis, he or she has correctly concluded that the independent variable has not had a significant effect on the dependent variable and that any observed effect is due to sampling error or other factors. -And when a researcher rejects a false null hypothesis, the researcher has correctly concluded that the independent variable has had a significant effect on the dependent variable. With regard to incorrect decisions, a researcher can either reject a true null hypothesis or retain a false null hypothesis. -When a researcher rejects a true null hypothesis, the researcher has concluded that the independent variable has had a significant effect on the dependent variable, but the observed effect is actually due to sampling error or other factors. --This type of incorrect decision is known as a Type I error. ---For the EPPP, you want to know that the probability of making a Type I error is equal to alpha, which is also known as the level of significance and is set by a researcher before analyzing the data he/she has collected. Alpha is usually set at .05 or .01: When it's .05, this means there's a 5% chance of making a Type I error; when it's .01, this means there's a 1% chance of making a Type I error. The second type of incorrect decision occurs when a researcher retains a false null hypothesis. 
-In other words, the researcher has concluded that the independent variable has not had a significant effect on the dependent variable when it actually has, but the researcher was not able to detect the effect because of sampling error or other factors. --This type of incorrect decision is referred to as a Type II error. ---The probability of making a Type II error is equal to beta, which is not set by the researcher but can be reduced by increasing statistical power (power is equal to 1 minus beta).
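The link between alpha and the Type I error rate can be checked with a small simulation. This is a sketch under assumptions that are not in the card: a two-tailed one-sample z-test, a normally distributed population with a hypothetical mean of 100 and standard deviation of 15, and a null hypothesis that is true in every simulated study. The function name `type_i_error_rate` is made up for the example.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=25, trials=4000, seed=1):
    """Proportion of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu, sigma = 100.0, 15.0  # hypothetical population values
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type I error, because the null is true
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

Because the null hypothesis is true in every simulated study, each rejection is a Type I error, and the observed proportion should land close to the chosen alpha.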

Community-Based Participatory Research (CBPR)

CBPR "is a collaborative approach to research that equitably involves all partners in the research process and recognizes the unique strengths that each brings. CBPR begins with a research topic of importance to the community and has the aim of combining knowledge with action and achieving social change to improve health outcomes and eliminate health disparities". The research topic can be identified by the community itself or by the community in collaboration with the research team, which may include educators; policy decision-makers; and psychologists, physicians, and other healthcare professionals. Nine core principles of CBPR have been identified by Israel and her colleagues (1998): CBPR (1) recognizes the community as a unit of identity, (2) builds on the community's strengths and resources, (3) emphasizes an equitable and collaborative partnership during all phases of the research study, (4) fosters co-learning and capacity building among all partners, (5) integrates knowledge generation and intervention for the benefit of all partners, (6) recognizes that research should be driven by the community and locally relevant problems, (7) involves a cyclical and iterative process, (8) disseminates research findings to all partners and involves community participants in the dissemination process, and (9) understands that CBPR is a long-term process that requires a commitment to sustainability. These principles are not considered to be absolute or exhaustive and are modified to fit the particular circumstances of a research project.

Regression Analysis

Correlation is often of interest because the goal is to use obtained predictor scores to estimate criterion scores. Prediction is made possible with regression analysis, which uses the data collected in a correlational study to produce a regression equation. This equation is then used to predict a person's criterion score from his/her obtained predictor score. The accuracy of prediction increases as the correlation between the predictor and criterion increases.
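As a minimal sketch of how a regression equation is produced and then used for prediction, the following fits a least-squares line to a tiny made-up data set. The helper name `fit_line` and the scores are hypothetical; real analyses would normally use a statistics package.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


predictor = [1, 2, 3, 4, 5]   # hypothetical predictor (X) scores
criterion = [2, 4, 5, 4, 5]   # hypothetical criterion (Y) scores

slope, intercept = fit_line(predictor, criterion)

# Regression equation: predicted Y = intercept + slope * X
predicted = intercept + slope * 6
print(f"Y' = {intercept:.2f} + {slope:.2f} * X; predicted Y for X = 6: {predicted:.2f}")
```

For these scores the slope works out to 0.6 and the intercept to 2.2, so a person with a predictor score of 6 has a predicted criterion score of about 5.8.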

Correlation

Correlation is used to determine the degree of association between two or more variables. It's often of interest when the goal is to use one or more predictors to estimate status on one or more criteria. (In the context of correlation, an independent variable is usually referred to as the predictor or X variable, and a dependent variable is referred to as the criterion or Y variable.) Most correlation coefficients range from -1.0 (a perfect negative correlation) to +1.0 (a perfect positive correlation), and many coefficients are symbolized with the letter "r." It's important to notice the subscript of a correlation coefficient: When it contains two different letters or numbers (e.g., "xy"), this means the coefficient is a measure of the relationship between two different variables. -Depending on the context, this coefficient can just be a measure of the relationship between variables or it can be a criterion-related validity coefficient or a factor loading. In contrast, when the subscript contains two of the same letters or numbers (e.g., "xx"), it's a reliability coefficient. This summary addresses correlation coefficients for two different variables; reliability coefficients are covered in a test construction summary.
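To illustrate the range and sign of a correlation coefficient, here is a short sketch that computes the Pearson r for two made-up pairs of variables. The function name `pearson_r` and all scores are hypothetical; the Pearson r is used here only as one familiar example of a coefficient symbolized with "r."

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical predictor (X) and criterion (Y) scores.
r_xy = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# Y falls by the same amount for every unit increase in X.
r_negative = pearson_r([1, 2, 3], [6, 4, 2])

print(f"r_xy = {r_xy:.3f}")
print(f"perfect negative example: r = {r_negative:.1f}")
```

The first coefficient is positive but below +1.0 (about .77), while the second data set produces a perfect negative correlation of -1.0.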

Parametric vs. Nonparametric Tests

Inferential statistical tests are divided into two types - nonparametric and parametric. Nonparametric tests are used to analyze nominal and ordinal data and include the chi-square test. Parametric tests are used to analyze interval and ratio data and include the t-test and analysis of variance. -In addition, the use of a parametric test requires that certain assumptions are met - e.g., that the data are normally distributed and that there is homogeneity of variances, which means that the variances for the different groups are similar. Consequently, even when the dependent variable is measured on an interval or ratio scale, a nonparametric test is ordinarily used when these assumptions are violated and the group sizes are small and unequal.

Probability Theory and Sampling Distributions

Inferential statistics are used to determine if the results of a research study are due to the effects of an independent variable on a dependent variable or to sampling error and involve using an inferential statistical test to compare the obtained sample value to the values in an appropriate sampling distribution. When the sample value of interest is a mean, the appropriate sampling distribution is a sampling distribution of means. -It's the distribution of mean scores that would be obtained if a very large number of same-sized samples were randomly drawn from the population, and the mean score on the variable of interest was calculated for each sample. While many of the sample means would be equal to the population mean, some of the samples would have higher or lower means because of the effects of sampling error, which is a type of random error. -In other words, the sample means would vary, not because individuals in the samples were exposed to the independent variable, but because of the effects of sampling error. In inferential statistics, a sampling distribution of means is not actually constructed by obtaining a large number of random samples from the population and calculating each sample's mean. -Instead, probability theory - and, more specifically, the central limit theorem - is used to estimate the characteristics of the sampling distribution. The central limit theorem makes three predictions about the sampling distribution of means: (a) The sampling distribution will increasingly approach a normal shape as the sample size increases, regardless of the shape of the population distribution of scores. (b) The mean of the sampling distribution of means will be equal to the population mean. (c) The standard deviation of the sampling distribution - which is referred to as the standard error of means - will be equal to the population standard deviation divided by the square root of the sample size.
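The three predictions of the central limit theorem can be checked with a quick simulation. This sketch makes assumptions that go beyond the card: the population is an exponential distribution with a mean of 1 and a standard deviation of 1 (chosen because it is strongly skewed), the sample size is 30, and 5,000 samples are drawn. All variable names are hypothetical.

```python
import random
from statistics import pstdev

rng = random.Random(42)
n, num_samples = 30, 5000

# Exponential population with rate 1: mean = 1, SD = 1, strongly skewed.
pop_mean, pop_sd = 1.0, 1.0

# Draw many same-sized random samples and record each sample's mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

observed_mean = sum(sample_means) / num_samples
observed_se = pstdev(sample_means)   # SD of the simulated sampling distribution
expected_se = pop_sd / n ** 0.5      # population SD / square root of sample size

print(f"Mean of sample means: {observed_mean:.3f} (population mean {pop_mean})")
print(f"SD of sample means:   {observed_se:.3f} (expected standard error {expected_se:.3f})")
```

The mean of the simulated sample means should sit close to the population mean, and their standard deviation should sit close to the population standard deviation divided by the square root of 30, even though the population itself is far from normal.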

Sampling Methods for Group Designs

It is ordinarily not possible to collect data from all members of the target population when conducting a research study and, consequently, a sample of individuals is selected from the population. Methods for selecting a sample are categorized as probability and non-probability sampling methods. 1. Probability Sampling: Probability sampling requires the random selection of the sample from the population, which helps ensure that members of the sample are representative of the population. However, even when a sample is randomly selected, the sample may be affected by sampling error, which means that the sample is not completely representative of the population from which it was selected due to the effects of chance (random) factors. Sampling error is most likely to be a problem when the sample size is small. Methods of probability sampling include the following: (a) When using simple random sampling, all members of the population have an equal chance of being selected. -Using a computer-generated sample of individuals that was randomly chosen from a list of all individuals in the population is one method of simple random sampling. (b) Systematic random sampling can be used when a random list of all individuals in the population is available. -It involves selecting every nth (e.g., 10th or 25th) individual from the list until the desired number of individuals has been selected. (c) Stratified random sampling is useful when the population is heterogeneous with regard to one or more characteristics that are relevant to the study (e.g., gender, age range, DSM diagnosis), and the researcher wants to make sure that each characteristic is adequately represented in the sample. -This involves dividing the population into subgroups (strata) based on the relevant characteristics and selecting a random sample from each subgroup. 
(d) Cluster random sampling is used when it is impossible to randomly select individuals from a population because the population is very large and there are natural clusters in the population (e.g., mid-sized cities, school districts, mental health clinics). -It involves randomly selecting a sample of clusters and then either including in the study all individuals in each selected cluster or a random sample of individuals in each selected cluster. 2. Non-Probability Sampling: When using non-probability sampling, individuals are selected on the basis of non-random criteria and all members of the population do not have an equal chance of being selected. Non-probability sampling is vulnerable to sampling error and sampling bias, which is also known as selection bias and systematic error. Sampling bias occurs when participants in the study over- or underrepresent one or more relevant population characteristics because of the way that the sample was obtained. Consequently, non-probability sampling is most useful for qualitative and exploratory studies designed to acquire a better understanding of an under-researched issue or population rather than studies designed to test hypotheses. Methods of non-probability sampling include the following: (a) Convenience sampling involves including in a sample individuals who are easily accessible to the researcher (e.g., the students in a psychologist's clinical psychology classes). (b) When using voluntary response sampling, the sample consists of individuals who volunteered to participate in the study. (c) Purposive sampling is also known as judgmental sampling. When using this method, researchers use their judgment to select individuals who are appropriate for the purposes of their studies. (d) Snowball sampling is used when direct access to members of the target population is difficult. It involves asking initial individuals who participate in the study if they can recommend others who qualify for inclusion in the study.
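The four probability sampling methods can be sketched with Python's standard `random` module. Everything about the population below is made up for illustration: 100 individuals, two strata ("adult" and "adolescent"), and five clinics that serve as clusters.

```python
import random

rng = random.Random(0)

# Hypothetical population of 100 people with a stratum and a clinic (cluster).
population = [
    {"id": i, "stratum": "adult" if i % 4 else "adolescent", "clinic": i % 5}
    for i in range(100)
]

# (a) Simple random sampling: every member has an equal chance of selection.
simple = rng.sample(population, 10)

# (b) Systematic random sampling: every kth person from a randomly ordered list.
listing = population[:]
rng.shuffle(listing)              # puts the list in random order
k = len(listing) // 10
start = rng.randrange(k)          # random starting point within the first k
systematic = listing[start::k]

# (c) Stratified random sampling: a random sample from each subgroup.
strata = {}
for person in population:
    strata.setdefault(person["stratum"], []).append(person)
stratified = [p for members in strata.values() for p in rng.sample(members, 5)]

# (d) Cluster random sampling: randomly pick clusters, then include everyone
# in the selected clusters (a random subsample per cluster is the alternative).
chosen_clinics = rng.sample(range(5), 2)
cluster = [p for p in population if p["clinic"] in chosen_clinics]

print(len(simple), len(systematic), len(stratified), len(cluster))
```

Drawing five people from each stratum gives the smaller stratum the same number of cases as the larger one; sampling each stratum in proportion to its size is an equally common choice.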

Leptokurtic and Platykurtic Distributions

Leptokurtic and platykurtic distributions are another type of non-normal distribution. Leptokurtic distribution: has a sharper peak and heavier tails than a normal distribution (i.e., compared with a normal distribution that has the same variance, more scores are "piled up" in the middle of the distribution and more fall in the extreme tails). Platykurtic distribution: is flatter in the middle and has thinner tails than a normal distribution (i.e., scores are more evenly distributed throughout the distribution).

Multivariate Correlational Techniques

Multivariate correlational techniques are extensions of bivariate correlation and regression analysis. They make it possible to use two or more predictors to estimate status on one or more criteria. (a) Multiple regression is the appropriate technique when two or more predictors will be used to estimate status on a single criterion that's measured on a continuous scale. There are two forms of multiple regression: 1. Simultaneous (standard) multiple regression involves entering data on all predictors into the equation simultaneously. 2. Stepwise multiple regression involves adding or subtracting one predictor at a time to the equation in order to identify the fewest number of predictors that are needed to make accurate predictions. When using multiple regression, the optimal circumstance is for each predictor to have a high correlation with the criterion but low correlations with other predictors since this means that each predictor is providing unique information. When predictors are highly correlated with one another, this is referred to as multicollinearity. (b) Canonical correlation is the appropriate technique when two or more continuous predictors will be used to estimate status on two or more continuous criteria. (c) Discriminant function analysis is the appropriate technique when two or more predictors will be used to estimate status on a single criterion that's measured on a nominal scale. -Logistic regression is the alternative to discriminant function analysis when the assumptions for discriminant function analysis are not met (e.g., when scores on the predictors are not normally distributed).

Qualitative Research - Approaches

Qualitative research is used to study the kind and quality of behavior and produces information that's interpreted and usually summarized in a narrative description. The approaches to qualitative research include the following: (a) Grounded Theory: The primary goal of research based on grounded theory is "to derive a general, abstract theory of a process, action, or interaction grounded in the views of the participants in a study". -The primary data collection methods are interviews and observations. (b) Phenomenology: The purpose of research using a phenomenological approach is to gain an in-depth understanding of the "lived experience" of participants - i.e., "how they perceive it, describe it, feel about it, judge it, remember it, make sense of it, and talk about it with others". -In-depth interviews are the primary source of information. (c) Ethnography: Ethnography involves "studying participants in their natural culture or setting ... [while they're engaged] in their naturally occurring activities". -The primary data collection method is participant observation, which involves joining a culture and participating in its activities. (d) Thematic Analysis: Thematic analysis "is a method for identifying, analyzing, and reporting patterns (themes) within the data". It is a "stand-alone" method but also sometimes serves as the starting point for other methods. The primary sources of information are in-depth interviews and focus groups.

Quantitative Research

Quantitative research is used to identify and study differences in the amount of behavior and produces data that's "expressed numerically and can be analyzed in a variety of ways". The types of quantitative research can be categorized as descriptive, correlational, or experimental. (a) Descriptive research is conducted to measure and "describe a variable or set of variables as they exist naturally". (b) Correlational research involves correlating the scores or status of a sample of individuals on two or more variables to determine the magnitude and direction of the relationship between the variables. -Variables are usually measured as they exist, without any attempt to modify or control them or determine if there's a causal relationship between them. -The data collected in a correlational research study are often used to conduct a regression analysis or multiple regression analysis to derive a regression or multiple regression equation. -The equation is then used to predict a person's score on a criterion from his/her obtained score(s) on the predictor(s). (In the context of correlational research, an independent variable is often referred to as the predictor or X variable and the dependent variable is referred to as the criterion or Y variable.) (c) Experimental research is conducted to determine if there's a causal relationship between independent and dependent variables. -A distinction is made between true experimental and quasi-experimental research: --True experimental: A researcher conducting a true experimental research study has more control over the conditions of the study and, consequently, can be more confident that an observed relationship between independent and dependent variables is causal. ---The most important aspect of control for true experimental research is the ability to randomly assign subjects to different levels of the independent variable, which helps ensure that groups are equivalent at the beginning of the study.

Positively and Negatively Skewed Distributions

Skewed distributions are one type of non-normal distribution. These distributions are asymmetrical, with most scores "piled up" on one side of the distribution and a few scores in the extended tail on the other side. Negatively skewed distribution: the few scores are in the negative tail (the low side of the distribution); the mean has the lowest value, the median is the middle value, and the mode has the highest value. Positively skewed distribution: the few scores are in the positive tail (the high side of the distribution) - i.e., the "tail containing the few scores tells the tale." The mean has the highest value, the median is the middle value, and the mode has the lowest value. In skewed distributions, the mean, median, and mode do not have the same value: Instead, the mean is in the extended tail with the few scores, the median is in the middle, and the mode is on the side of the distribution that contains most of the scores. When a distribution is skewed, the median is often the preferred measure of central tendency because, unlike the mean, it's not distorted by the atypical scores in the distribution.
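The ordering of the mean, median, and mode in skewed distributions can be verified with a small made-up data set. The scores below are hypothetical; the negatively skewed set is simply the mirror image of the positively skewed one.

```python
from statistics import mean, median, mode

# Positively skewed: most scores are low, with a few high scores in the tail.
positive = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

# Negatively skewed: mirror image, so the few extreme scores are low.
negative = [16 - score for score in positive]

for label, scores in (("positive skew", positive), ("negative skew", negative)):
    print(f"{label}: mode={mode(scores)}, median={median(scores)}, mean={mean(scores)}")
```

In the positively skewed set the mode (2) is lowest, the median (3) is in the middle, and the mean (4.6) is pulled toward the high tail; in the mirrored set the order reverses (mean 11.4, median 13, mode 14).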

Statistical Power

Statistical power refers to the ability to reject a false null hypothesis and is affected by several factors: One factor is the size of alpha: The larger the size of alpha, the greater the power. -However, alpha is kept small to reduce the probability of making a Type I error, which is why it's usually set at .01 or .05 rather than at a larger value. A second factor is the size of the effect of the independent variable on the dependent variable: An independent variable is more likely to have a significant effect when it's of sufficient magnitude and is administered for a sufficient length of time. A third factor is the sample size: The larger the sample, the greater the power. And a fourth factor is the type of inferential statistical test that's used to analyze the data. Parametric tests are more powerful than nonparametric tests, but they can be used only when the data to be analyzed are interval or ratio data and certain assumptions are met. The t-test and analysis of variance are parametric tests and are more powerful than the chi-square test, which is a nonparametric test that's used to analyze nominal data.
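How alpha, effect size, and sample size affect power can be sketched with a simple calculation. The assumptions here are not from the card: a one-tailed, one-sample z-test with a known population standard deviation, and an effect size expressed in standard deviation units. The function name `z_test_power` is hypothetical.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a one-tailed, one-sample z-test.

    effect_size is the true mean difference in standard deviation units.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # critical value under the null hypothesis
    # Under the alternative, the test statistic is centered at effect_size * sqrt(n).
    return 1 - z.cdf(z_crit - effect_size * n ** 0.5)


baseline = z_test_power(effect_size=0.5, n=25, alpha=0.05)
smaller_alpha = z_test_power(effect_size=0.5, n=25, alpha=0.01)
larger_effect = z_test_power(effect_size=0.8, n=25, alpha=0.05)
larger_sample = z_test_power(effect_size=0.5, n=50, alpha=0.05)

print(f"baseline:      {baseline:.3f}")
print(f"smaller alpha: {smaller_alpha:.3f}")
print(f"larger effect: {larger_effect:.3f}")
print(f"larger sample: {larger_sample:.3f}")
```

Relative to the baseline, lowering alpha reduces power, while a larger effect or a larger sample increases it. When the effect size is zero, the function returns alpha, because the only rejections left are Type I errors.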

Clinical Significance

Statistical significance and practical significance do not indicate if the effects of an intervention have clinical significance, which refers to the importance or meaningfulness of the effects. For example, even when an intervention has statistical and practical significance, this does not indicate if the intervention is likely to move an individual from a dysfunctional to a normal level of functioning. The Jacobson-Truax method is one method for evaluating the clinical significance of an intervention for each participant in a clinical trial or other research study. It involves two steps: The first step is to calculate a reliable change index (RCI) to determine if the difference in an individual's pretreatment and posttreatment test scores is statistically reliable - i.e., if the difference is due to actual change rather than measurement error. It's calculated by subtracting the individual's pretest score from his or her posttest score and dividing the result by the standard error of the difference. -The RCI can be positive or negative, depending on whether a high or low test score is indicative of improvement. -When the change in scores is in the desired direction and the RCI is larger than +1.96 or smaller than -1.96 (i.e., its absolute value exceeds 1.96), the change is considered reliable (not due to measurement error). The second step is to identify the test cutoff score that distinguishes between dysfunctional and functional behavior or performance to determine if an individual's posttest score is within the functional range. -One way to determine the cutoff score is to calculate the score that is midway between the mean score for the dysfunctional (patient) population and the mean score for the functional (non-patient) population.
Finally, using the information derived from these two steps, the individual is classified as recovered (passed both the RCI and cutoff criteria), improved (passed the RCI criterion but not the cutoff criterion), unchanged/indeterminate (passed neither criterion), or deteriorated (passed the RCI criterion in the unintended direction).
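The two steps and the final classification can be sketched in a few lines. This is a minimal illustration, not a validated scoring tool: the function names, the `lower_is_better` flag, and the example numbers are all assumptions made for the sketch, and the standard error of the difference is taken as given rather than derived.

```python
def reliable_change_index(pre, post, se_diff):
    """RCI = (posttest - pretest) / standard error of the difference."""
    return (post - pre) / se_diff

def midpoint_cutoff(dysfunctional_mean, functional_mean):
    """Cutoff midway between the patient and non-patient population means."""
    return (dysfunctional_mean + functional_mean) / 2

def classify(pre, post, se_diff, cutoff, lower_is_better=True):
    """Classify one participant using the RCI and the cutoff score."""
    rci = reliable_change_index(pre, post, se_diff)
    # Re-express the RCI so that positive values always mean improvement.
    improvement = -rci if lower_is_better else rci
    in_functional_range = post < cutoff if lower_is_better else post > cutoff
    if improvement > 1.96:
        return "recovered" if in_functional_range else "improved"
    if improvement < -1.96:
        return "deteriorated"
    return "unchanged/indeterminate"

# Hypothetical depression scale on which lower scores are better.
cutoff = midpoint_cutoff(dysfunctional_mean=24, functional_mean=6)  # 15.0
print(classify(pre=30, post=12, se_diff=4.0, cutoff=cutoff))
print(classify(pre=30, post=20, se_diff=4.0, cutoff=cutoff))
```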

Student's t-test

The Student's t-test is used when a study includes one independent variable that has two levels and one dependent variable that's measured on an interval or ratio scale. In this situation, the t-test will be used to compare two means. For example, the t-test would be used to compare the mean mock EPPP exam scores obtained by psychologists who participated in either a live exam review workshop or an online exam review workshop. There are three t-tests and the appropriate one depends on how the two means were obtained: -The t-test for a single sample is used to compare an obtained sample mean to a known population mean. (In this situation, the population is acting as the no-treatment control group.) -The t-test for unrelated samples is also known as the t-test for uncorrelated samples and is used to compare the means obtained by two groups when subjects in the groups are unrelated - e.g., when subjects were randomly assigned to one of the two groups. -Finally, the t-test for related samples is also known as the t-test for correlated samples and is used to compare two means when there's a relationship between subjects in the two groups. --This occurs when (a) participants are "natural" pairs (e.g., twins), and members of each pair are assigned to different groups; (b) participants are matched in pairs on the basis of their pretest scores or status on an extraneous variable, and members of each pair are assigned to different groups; or (c) a single-group pretest-posttest design is used and subjects are "paired" with themselves.
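The three versions differ only in how the two means and their standard error are obtained. The sketch below computes each t statistic by hand with the standard library; the function names and data are invented for illustration, the unrelated-samples version assumes equal variances (pooled variance), and no p-values are computed.

```python
from math import sqrt
from statistics import mean, stdev, variance

def t_single_sample(sample, population_mean):
    """Compare a sample mean to a known population mean."""
    return (mean(sample) - population_mean) / (stdev(sample) / sqrt(len(sample)))

def t_unrelated(group1, group2):
    """Compare the means of two unrelated groups using a pooled variance."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

def t_related(first, second):
    """Compare two means from paired scores (e.g., pretest and posttest)."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical scores.
print(t_single_sample([1, 2, 3, 4, 5], population_mean=2))
print(t_unrelated([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))
print(t_related([10, 12, 14, 16], [12, 13, 17, 18]))
```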

Chi-Square Test

The chi-square test is used when the data to be analyzed are nominal data. There are two chi-square tests - the single-sample chi-square test, which is also known as the chi-square goodness-of-fit test, and the multiple-sample chi-square test, which is also known as the chi-square test for contingency tables. It will be easier to determine which chi-square test to use for an exam question if you substitute the word "variable" for "sample": The single-sample (single-variable) chi-square test is used to analyze data from a descriptive study that includes only one variable. The multiple-sample (multiple-variable) chi-square test is used to analyze data from (a) a descriptive study that has two or more variables that can't be identified as independent or dependent variables or (b) an experimental study that has independent and dependent variables. Remember that, when determining the number of variables for the chi-square test, you count all of the variables. As an example, you would use the single-sample chi-square test to analyze data collected in a study to determine whether college undergraduates prefer to use a hard-copy textbook or an online textbook for their introductory statistics class. This is a descriptive study with a single nominal variable that has two categories, and the single-sample chi-square test would be used to compare the number of students in the two categories. -If this study is expanded to include type of course (face-to-face course or online course), the study is still a descriptive study but it includes two variables, and a statistical test will be used to compare the number of students in the four categories (prefer hard-copy text/face-to-face course, prefer hard-copy text/online course, prefer online text/face-to-face course, and prefer online text/online course).
--Because the study includes two variables and the data to be analyzed are nominal (the number of subjects in each nominal category), the multiple-sample chi-square test is the appropriate statistical test.
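Both chi-square statistics compare observed category counts with the counts expected if nothing systematic were going on. The sketch below computes each statistic by hand; the function names and the student counts are invented for the textbook example, the single-sample version assumes equal expected frequencies unless others are supplied, and significance testing is left out.

```python
def chi_square_goodness_of_fit(observed, expected=None):
    """Single-sample chi-square: one nominal variable."""
    if expected is None:
        # Assumption: equal expected frequencies across categories.
        expected = [sum(observed) / len(observed)] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_contingency(table):
    """Multiple-sample chi-square: two nominal variables in a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: 60 students prefer hard-copy text, 40 prefer online text.
print(chi_square_goodness_of_fit([60, 40]))

# Hypothetical counts: rows are text preference, columns are course type.
print(chi_square_contingency([[30, 20], [10, 40]]))
```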

Bivariate Correlation Coefficients

The choice of a bivariate correlation coefficient is based primarily on the scale of measurement used to measure the two variables. (a) The Pearson r is also known as the Pearson product moment correlation coefficient and is used when both variables are measured on a continuous (interval or ratio) scale and the relationship between the variables is linear. -When the relationship between variables is nonlinear, the Pearson r will underestimate the degree of that relationship. -An alternative to the Pearson r is eta, which can be used when both variables are continuous and their relationship is linear or nonlinear. (b) The Spearman rho is also known as the Spearman rank correlation coefficient and is used when data on both variables are reported as ranks. (c) The point biserial correlation coefficient is used when one variable is continuous and the other is a true dichotomy. (A dichotomy is a nominal variable with only two categories.) -The distinction between being pregnant or not being pregnant is a true dichotomy. (d) The biserial correlation coefficient is used when one variable is continuous and the other is an artificial dichotomy. -An artificial dichotomy occurs when a continuous variable is dichotomized. -Final exam scores represent an artificial dichotomy when a cutoff score is used to divide the scores into two categories - pass and fail. (e) The contingency correlation coefficient is used when both variables are measured on a nominal scale.
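The point that the Pearson r underestimates a nonlinear relationship can be seen with a small hand calculation. In the sketch below (invented data, hand-written `pearson_r` helper), one set of y values is a perfect linear function of x and the other is a perfect curved (squared) function of x.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson product moment correlation computed from deviation scores."""
    mx, my = mean(x), mean(y)
    cross_products = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross_products / sqrt(ss_x * ss_y)

x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]   # perfectly linear relationship
curved = [v ** 2 for v in x]      # perfectly predictable, but U-shaped

print(pearson_r(x, linear))
print(pearson_r(x, curved))
```

Even though the curved values are completely determined by x, the Pearson r for that pair is zero, because r only captures the linear component of a relationship.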

Other Forms of the Analysis of Variance

The factorial ANOVA, mixed ANOVA, randomized block ANOVA, ANCOVA, MANOVA, and trend analysis are other forms of the analysis of variance that you want to be familiar with for the exam. (a) The factorial ANOVA is an extension of the one-way ANOVA that's used when a study includes more than one independent variable. It's also referred to as a two-way ANOVA when the study includes two independent variables, a three-way ANOVA when a study includes three independent variables, etc. A factorial ANOVA produces separate F-ratios for the main effects of each independent variable and their interaction effects. (b) The mixed ANOVA is also known as the split-plot ANOVA and is used when the data were obtained from a study that used a mixed design - i.e., when the study included at least one between-subjects independent variable and at least one within-subjects independent variable. (c) The randomized block ANOVA is used to control the effects of an extraneous variable on a dependent variable by including it as an independent variable and determining its main and interaction effects on the dependent variable. When using the randomized block ANOVA, the extraneous variable is referred to as the "blocking variable." (d) The analysis of covariance (ANCOVA) is also used to control the effects of an extraneous variable on a dependent variable but does so by statistically removing its effects from the dependent variable. When using the ANCOVA, the extraneous variable is the "covariate." (e) The multivariate analysis of variance (MANOVA) is the appropriate statistical test when a study includes one or more independent variables and two or more dependent variables that are each measured on an interval or ratio scale. (f) Trend analysis is used when a study includes one or more quantitative independent variables and the researcher wants to determine if there's a significant linear or nonlinear (quadratic, cubic, or quartic) relationship between the independent and dependent variables.

One-Way Analysis of Variance

The one-way analysis of variance (ANOVA) is the appropriate statistical test when a study includes one independent variable that has more than two levels and one dependent variable that's measured on an interval or ratio scale and the groups are unrelated. -It would be the appropriate statistical test to compare the effects of cognitive-behavior therapy, interpersonal therapy, and acceptance and commitment therapy on severity of depressive symptoms when clinic clients are randomly assigned to one of the therapies and symptoms are measured on an interval or ratio scale. Although the one-way ANOVA can be used when a study has one independent variable with only two levels, the t-test has traditionally been used in this situation. Also, separate t-tests can be used to compare three or more levels of a single independent variable, but this would require conducting a separate t-test for each pair of means. A disadvantage of using separate t-tests is that it increases the probability of making a Type I error - i.e., it increases the experimentwise error rate. As an example, when the independent variable has three levels and separate t-tests are used to compare means, three t-tests would have to be conducted (Group #1 vs. Group #2, Group #1 vs. Group #3, and Group #2 vs. Group #3). -If alpha is set at .05 for each t-test, this would result in an experimentwise error rate of about .14 if the tests are treated as independent (1 - .95^3), which is often rounded up to .15 (3 x .05). When using the one-way ANOVA, all possible comparisons between means are made in a way that maintains the experimentwise error rate at the alpha level set by the researcher. (Note that the terms experimentwise error rate and familywise error rate are often used interchangeably, but that some authors distinguish between the two, with experimentwise error rate referring to the Type I error rate for all statistical analyses made in a research study and familywise error rate referring to the Type I error rate for a subgroup of statistical analyses. For example, in a study with two or more independent variables, analyses of the main effects of each independent variable would be one family and analyses of the interaction effects of the independent variables would be another family.)
The one-way ANOVA produces an F-ratio. For the EPPP, you want to know that the numerator of the F-ratio is referred to as the "mean square between" (MSB) and is a measure of variability in dependent variable scores that's due to treatment effects plus error and that the denominator is referred to as the "mean square within" (MSW) and is a measure of variability that's due to error only. Whenever the F-ratio is larger than 1.0, this suggests that the independent variable has had an effect on the dependent variable. Whether or not this effect is statistically significant depends on several factors including the size of alpha.
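Two of the quantities discussed for the one-way ANOVA, the experimentwise error rate for separate t-tests and the F-ratio (MSB divided by MSW), can be sketched as follows. The function names and scores are invented for illustration, and the error-rate formula 1 - (1 - alpha)^c treats the c pairwise tests as independent, which is itself a simplifying assumption.

```python
from itertools import combinations
from statistics import mean

def experimentwise_error(alpha, n_groups):
    """Chance of at least one Type I error across all pairwise t-tests."""
    n_tests = len(list(combinations(range(n_groups), 2)))
    return 1 - (1 - alpha) ** n_tests

def f_ratio(groups):
    """F = mean square between (MSB) / mean square within (MSW)."""
    grand_mean = mean(score for g in groups for score in g)
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    ms_between = ss_between / (k - 1)       # treatment effects plus error
    ms_within = ss_within / (n_total - k)   # error only
    return ms_between / ms_within

# Three groups compared with three separate t-tests at alpha = .05.
print(round(experimentwise_error(0.05, 3), 3))

# Hypothetical symptom scores for three therapy groups.
print(f_ratio([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

The first value comes out slightly below the .15 obtained by simply multiplying alpha by the number of tests; the two figures are close whenever alpha and the number of comparisons are small.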

Practical Significance

The results of a statistical test indicate whether or not the results of a study are statistically significant; however, researchers often want to know if the results have practical significance, which refers to the magnitude of the effects of an intervention - i.e., the intervention's "effect size." Cohen's d is one of the methods used to measure effect size and indicates the difference between two groups (a treatment group and a control group or two different treatment groups) in terms of standard deviation units. It's calculated by dividing the mean difference between the groups on the dependent variable by the pooled standard deviation for the two groups. -As an example, if d is .50 for treatment and control groups, this means that the treatment group's mean on the dependent variable was one-half standard deviation above the control group's mean. Cohen (1969) provided guidelines for interpreting d: A d of about .2 indicates a small effect of the independent variable, a d of about .5 indicates a medium effect, and a d of .8 or larger indicates a large effect. (Cohen's f is the alternative to Cohen's d when the comparison involves more than two groups.)
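The calculation of Cohen's d described above (mean difference divided by the pooled standard deviation) can be sketched as follows. The function name and the scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Mean difference between two groups in pooled standard deviation units."""
    n1, n2 = len(group1), len(group2)
    pooled_sd = sqrt(
        ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    )
    return (mean(group1) - mean(group2)) / pooled_sd

# Hypothetical scores for a treatment group and a control group.
treatment = [3, 4, 5, 6, 7]
control = [1, 2, 3, 4, 5]
print(round(cohens_d(treatment, control), 2))
```

A positive d here means the first group's mean is above the second group's mean; swapping the arguments flips the sign but not the magnitude.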

Choosing an inferential statistical test

The selection of an inferential statistical test requires consideration of several factors. The first factor is the scale of measurement of the data that will be analyzed with the test. The scale of measurement narrows the choices, but it's sometimes necessary to consider other factors such as the number of independent variables, the number of levels of the independent variable, whether the groups are related or unrelated, if an extraneous variable needs to be controlled, and the number of dependent variables.

Correlation Assumptions

The use of most correlation coefficients is based on three assumptions. 1. The first assumption is that the relationship between variables is linear. -When the relationship is nonlinear, the correlation coefficient may underestimate their actual relationship. 2. The second assumption is that there's an unrestricted range of scores for all variables. -When there's a restricted range (e.g., when the sample included only people with average scores rather than low, average, and high scores), the correlation coefficient may underestimate the actual relationship. 3. And the third assumption is that there's homoscedasticity - i.e., that the variability of criterion scores is similar for all predictor scores. -When this assumption is violated (when the variability of criterion scores differs for different predictor scores), the use of a regression equation to predict people's criterion scores from their obtained predictor scores will not have the same accuracy of prediction for all predictor scores.

Experimental Research - Single Subject Designs

The various single-subject designs share the following characteristics: (a) They include at least two phases: a baseline (no treatment) phase, which is designated with the letter "A," and a treatment phase, which is designated with the letter "B." (b) The treatment phase does not usually begin until a stable pattern of performance on the dependent variable is established during the baseline phase. (c) The dependent variable is measured multiple times during each phase, which helps a researcher determine if a change in the dependent variable is due to the independent variable or to maturation, history, or another factor. Although single-subject designs are ordinarily used with a single subject, the multiple-baseline across subjects design includes two or more subjects and the other single-subject designs can be used with multiple subjects when the subjects are treated as a single group. 1. AB Design: The AB design consists of a single baseline (A) phase and a single treatment (B) phase. Like all single-subject designs, it helps a researcher determine if an observed change in the dependent variable is due to the independent variable or to maturation since maturational effects (e.g., fatigue, boredom) usually occur gradually over time. Consequently, changes in performance on the dependent variable due to maturation would be apparent in the pattern of the individual's performance. The AB design does not control history, however, because any change in the dependent variable that occurs when the independent variable is applied could be due to the independent variable or to an unintended event that occurred at the same time the independent variable was applied. 2. Reversal Designs: A single-subject design is referred to as a reversal or withdrawal design when at least one additional baseline phase is added. The ABA and ABAB designs are reversal designs.
The ABAB design begins with a baseline phase which is followed by a treatment phase, withdrawal of the treatment during a second baseline phase, and then application of the same treatment during the second treatment phase. The advantage of adding phases is that doing so helps a researcher determine if a change in the dependent variable is due to history rather than the independent variable: When the dependent variable returns to its initial baseline level during the second baseline phase and to its initial treatment level during the second treatment phase, it's unlikely that changes in the dependent variable were due to unintended events. 3. Multiple Baseline Design: When using the multiple baseline design, the independent variable is sequentially applied across different "baselines," which can be different behaviors, tasks, settings, or subjects. For example, a psychologist might use a multiple-baseline across behaviors design to evaluate the effectiveness of response cost for reducing a child's undesirable interactions with other children during recess - i.e., name calling, hitting, and making obscene gestures. To do so, the psychologist would obtain baseline data on the number of times the child engages in each behavior during morning recess for five school days. He would then apply response cost to name calling during recess for the next five school days while continuing to obtain baseline data for hitting and making obscene gestures. Next, the psychologist would apply response cost to name calling and hitting for the next five school days while continuing to obtain baseline data for making obscene gestures. And, finally, he would apply response cost to name calling, hitting, and making obscene gestures during recess for the next five school days. 
If the results of the study indicate that each undesirable behavior remained stable during its baseline phase and decreased only when response cost was applied to it, this would demonstrate the effectiveness of response cost for all three behaviors. An advantage of the multiple baseline design over the reversal designs is that, once the independent variable is applied to a behavior, task, setting, or participant, it does not have to be withdrawn during the course of the study.

Ordinal Scale

Variables measured on an ordinal scale divide people into categories that are ordered in terms of magnitude. When a variable is measured on an ordinal scale, you can conclude that one person has more or less of the characteristic being measured but you cannot determine how much more or less. Likert scale scores (e.g., strongly agree, agree, disagree, and strongly disagree) and ranks (1st, 2nd, 3rd, etc.) are ordinal scores.

Null and Alternative Hypothesis

To use inferential statistics to test a hypothesis about the relationship between independent and dependent variables, the verbal hypothesis about the relationship must be converted to two statistical hypotheses: the null hypothesis and the alternative hypothesis. The null hypothesis is stated in a way that indicates that the independent variable does not have an effect on the dependent variable. The alternative hypothesis is stated in a way that indicates that the independent variable does have an effect on the dependent variable. The alternative hypothesis is usually most similar to the verbal hypothesis.

Qualitative Research - Triangulation

Triangulation "is the research practice of comparing and combining different sources of evidence in order to reach a better understanding of the research topic." It is most associated with qualitative research as a method for increasing the credibility of a study's data and results, but it is also used in quantitative research and in mixed methods research, which combines qualitative and quantitative methods. Denzin (1978) distinguished among four types of triangulation: -Methodological triangulation: the most commonly used type; involves using multiple methods to obtain data (e.g., interviews, focus groups, observations, questionnaires, documents). -Data triangulation: involves using the same method to obtain data at different times, in different settings, or from different people. -Investigator triangulation: involves using two or more investigators to collect and analyze data. -Theory triangulation: involves interpreting data using multiple theories, hypotheses, or perspectives.

Nominal Scale

Variables measured on a nominal scale divide people into unordered categories. Gender, eye color, and DSM diagnosis are nominal variables. Numbers can be assigned to the categories, but they're just labels and do not provide any quantitative information. When a nominal variable has only two categories, it's also known as a dichotomous variable.

Ratio Scale

Variables measured on a ratio scale assign people to ordered categories, with the difference between adjacent categories being equal and the scale having an absolute zero point. Weight measured in pounds and yearly income measured in dollars represent a ratio scale. Because of the absolute zero point, it's possible to draw certain conclusions about ratio data that can't be drawn about interval data. For example, it's not possible to conclude that a person who has an IQ of 200 is twice as intelligent as a person who has an IQ of 100 because IQ scores represent an interval scale, but it is possible to conclude that a person who weighs 200 pounds is twice as heavy as a person who weighs 100 pounds because weight in pounds is a ratio scale

Interval Scale

Variables measured on an interval scale assign people to ordered categories, with the difference between adjacent categories being equal. Scores on standardized tests often represent an interval scale. For example, IQ scores are interval scores, and the one point difference between IQ scores of 100 and 101 is considered to be the same as the one point difference between IQ scores of 101 and 102. -Note, however, that interval scales do not have an absolute zero point: Even if it were possible for an examinee to obtain a score of 0 on an IQ test, this would not mean the examinee has no intelligence

Planned Comparisons and Post Hoc Tests

When an analysis of variance produces a statistically significant F-ratio, this indicates that at least one group is significantly different from another group but does not indicate which groups differ significantly from each other. Conducting planned comparisons and post hoc tests are two ways to obtain this information. Planned comparisons are also known as planned contrasts and a priori tests. These comparisons are designated before the data is collected and are based on theory, previous research, or the researcher's hypotheses. -For example, assume that a psychology professor at a large university conducts a study to test the hypothesis that adding instructor-led study sessions to her introductory psychology lectures will improve the final exam scores of undergraduate students. To test this hypothesis, she designs a study to evaluate four teaching methods: two that are currently available to students and two new methods that include instructor-led study sessions. --The four teaching methods are lectures only (L), lectures with peer-led study sessions (LP), lectures with instructor-led in-person study sessions (LIP), and lectures with instructor-led Zoom study sessions (LIZ). Because the professor is interested only in comparing lectures to lectures with instructor-led study sessions, she will not conduct a one-way analysis of variance but, instead, will use two t-tests to compare the mean final exam scores obtained by students in the L and LIP groups and the mean final exam scores obtained by students in the L and LIZ groups. Post hoc tests are also known as a posteriori tests and are conducted when an ANOVA produces a significant F ratio. For the teaching method study, if the psychology professor decides she is interested in comparing the effects of all of the teaching methods, she will first conduct a one-way ANOVA. 
If the ANOVA yields a significant F-ratio, this indicates that at least one teaching method differs significantly from another teaching method but does not indicate which teaching methods differ significantly from each other. Therefore, the professor will use t-tests to compare all possible pairs of group means: L versus LP, L versus LIP, L versus LIZ, LP versus LIP, LP versus LIZ, and LIP versus LIZ. As noted in the description of the one-way ANOVA, the greater the number of statistical tests used to analyze the data collected in a research study, the greater the experimentwise error rate. Consequently, when conducting planned comparisons or post hoc tests, it is desirable to control the experimentwise error rate. One way to do so for both planned comparisons and post hoc tests is to use the Bonferroni procedure, which simply involves dividing alpha by the total number of statistical tests to obtain an alpha level for each test. -For example, there are two planned comparisons for the teaching method study and, if the professor sets alpha at .05, alpha would be .025 (.05/2) for each comparison. An alternative for post hoc tests is to use one of the modifications of the t-test that are each appropriate for a different situation and differ in terms of the ways they control the experimentwise error rate. Frequently used post hoc tests include Tukey's honestly significant difference (HSD) test, the Scheffe test, and the Newman-Keuls test.
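The Bonferroni procedure is simple enough to show directly. The sketch below applies it to the teaching method example: two planned comparisons, and all possible pairwise post hoc comparisons among the four methods. The function name is invented for illustration.

```python
from math import comb

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha level: overall alpha divided by the number of tests."""
    return alpha / n_tests

# Two planned comparisons: L vs. LIP and L vs. LIZ.
planned = bonferroni_alpha(0.05, 2)

# Post hoc: every pair among the four teaching methods (L, LP, LIP, LIZ).
n_pairs = comb(4, 2)
post_hoc = bonferroni_alpha(0.05, n_pairs)

print(planned, n_pairs, round(post_hoc, 4))
```

The per-test alpha shrinks quickly as the number of comparisons grows, which is one reason planned comparisons are limited to the few contrasts the hypothesis actually requires.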

Frequency Polygons

When the data obtained on a variable represent an interval or ratio scale, the data can be depicted in a frequency polygon by plotting scores on the horizontal (X) axis and the frequency of each score on the vertical (Y) axis. Frequency polygons can be described in terms of their shape as being normal or non-normal. A normal distribution is symmetrical and bell-shaped and has certain characteristics. -First, in a normal distribution, the three measures of central tendency - the mean, median, and mode - are equal to the same value. -Second, about 68% of scores fall between the scores that are plus and minus one standard deviation from the mean, about 95% of scores fall between the scores that are plus and minus two standard deviations from the mean, and about 99% of scores fall between the scores that are plus and minus three standard deviations from the mean. For example, if a distribution of scores on a job knowledge test for a sample of employees has a mean of 100 and standard deviation of 10, about 68% of employees obtained scores between 90 and 110, about 95% obtained scores between 80 and 120, and about 99% obtained scores between 70 and 130.
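The percentages for a normal distribution can be checked with the standard library's `NormalDist`, using the job knowledge test example (mean of 100, standard deviation of 10). The helper name `share_within` is invented for this sketch.

```python
from statistics import NormalDist

# Job knowledge test scores: mean 100, standard deviation 10.
scores = NormalDist(mu=100, sigma=10)

def share_within(dist, k):
    """Proportion of scores within k standard deviations of the mean."""
    upper = dist.mean + k * dist.stdev
    lower = dist.mean - k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

for k in (1, 2, 3):
    low, high = 100 - 10 * k, 100 + 10 * k
    print(f"{low} to {high}: {share_within(scores, k):.1%}")
```

The exact proportions are slightly different from the rounded figures usually memorized, particularly for three standard deviations, where the share is a little above 99%.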

Moderator variable

affects the direction and/or strength of the relationship between independent and dependent variables. If a study finds that cognitive-behavior therapy is more effective for treating adolescents with social anxiety disorder when the adolescents have authoritative parents than when they have authoritarian parents, parenting style is a moderator variable

Mediator variable

explains the relationship between independent and dependent variables. For instance, cognitive therapies are based on the assumption that therapy reduces anxiety because it alters clients' dysfunctional thinking. In other words, therapy (the independent variable) leads to more realistic thinking (mediator variable) which, in turn, leads to reduced anxiety (dependent variable).

Extraneous Variables

not an intentional part of a research study but affect the relationship between the study's independent and dependent variables and make it difficult to determine if an apparent effect of an independent variable on a dependent variable is actually due to the independent variable. In a study designed to compare two different teaching methods on memorization, if participants in one group unintentionally receive instruction in noisier conditions than participants in the other group do, noise is likely to be an extraneous variable. Note that the terms extraneous variable, confounding variable, and disturbance variable are often used interchangeably. -Some authors, however, describe a confounding variable as a specific type of extraneous variable that's related to the independent variable and a disturbance variable as another type of extraneous variable that's related to the dependent variable

Internal Validity

the extent to which it's possible to derive accurate conclusions from a study's results about the nature of the relationship between its independent and dependent variables.

External Validity

the extent to which it's possible to generalize a study's conclusions to other people and conditions.

Independent Variable

the variable that a researcher believes has an effect on the dependent variable and on which research participants differ, either because they're exposed to different levels of the independent variable during the study or because they begin the study with different levels. Assigning participants with major depressive disorder to either cognitive-behavior therapy or interpersonal therapy is an example of the former; comparing participants who begin the study with high, average, or low levels of self-esteem is an example of the latter. The independent variable always has at least two levels - e.g., treatment versus no treatment, treatment #1 versus treatment #2

Dependent variable

the variable that's expected to be affected by the independent variable, and it's data on the dependent variable that's analyzed with an inferential statistical test.

