Final Bleh


APA Guidelines: Tables

- Results in tables do not need to be repeated in a graph or in the text.
- The table includes horizontal lines spanning the entire table at the top and bottom, and just beneath the column headings.
- Every column has a heading, including the leftmost column, and there are additional headings that span two or more columns that help to organize the information and present it more efficiently.
- Tables are numbered consecutively starting at 1 (Table 1, Table 2, Table 3...) and given a brief but clear and descriptive title.

APA Guidelines: Layout of Graphs

- Scatterplots, bar graphs, and line graphs should be slightly wider than they are tall.
- The independent variable should be plotted on the x-axis and the dependent variable on the y-axis.
- Values should increase from left to right on the x-axis and from bottom to top on the y-axis.
- The x-axis and y-axis should begin at 0.

Basic Features of Single-Subject Research

- The dependent variable is measured repeatedly over time at regular intervals.
- The study is divided into distinct phases, and the participant is tested under one condition per phase. The conditions are often designated by capital letters: A, B, C... For example, participants might be tested first in one condition (A), then tested in another condition (B), and retested in the original condition (A).
- Steady state strategy: the idea is that when the dependent variable has reached a steady state, any change across conditions will be relatively easy to detect. The effect of an independent variable is easier to detect when the "noise" in the data is minimized.

APA Guidelines: Figures

- The figure should always add important information rather than repeat information that appears in the text or in a table. If data are better suited for a figure, don't include them in the text or in a table.
- The figure should be as simple as possible, avoiding color (unless necessary) in professional settings; colors are effective for posters, slideshows, or textbooks.
- The figure should be easy to interpret, even by a lay person, who should understand the result based on the figure and its caption.

Statistical Procedures for Single-Subject Research

- The mean and standard deviation of each participant's responses under each condition are computed and compared, and inferential statistical tests such as the t-test or analysis of variance are applied. Averaging across participants is less common.
- Compute the percentage of non-overlapping data. These are supplements to visual inspection, NOT a replacement.

Problems With Visual Inspection

- Visual inspection is inadequate for deciding whether and to what extent a treatment has affected a dependent variable.
- Visual inspection can be unreliable, with different researchers reaching different conclusions about the same set of data.
- The results of visual inspection, an overall judgment of whether or not a treatment was effective, can't be clearly and efficiently summarized or compared across studies (unlike the measures of relationship strength in group research).

Type I and II Error Implications

- We should be cautious about interpreting the results of any individual study because there is a chance that it reflects a Type I or II error.
- This possibility is why researchers consider it important to replicate their studies.
- Each time researchers replicate a study and find a similar result, they become more confident that the result represents a real phenomenon and not just a Type I or II error.

When is Pearson's r Misleading?

- When the relationship is nonlinear. It is imperative to make a scatterplot to confirm that a relationship is approximately linear before computing Pearson's r.
- When one or both of the variables have a limited range in the sample relative to the population (restriction of range). Studies should be planned to avoid restriction of range (e.g., if age is a primary variable, include a wide range of ages), but it is not always anticipated or easily avoidable. It is also good practice to examine the data for possible restriction of range and interpret Pearson's r with that information in mind.

APA Guidelines: Scatterplots

- When the variables on the x-axis and y-axis are conceptually similar and measured on the same scale, such as when they are measures of the same variable on two different occasions, this can be emphasized by making the axes the same length.
- When two or more individuals fall at exactly the same point on the graph, this can be indicated by offsetting the points slightly along the x-axis, displaying the number of individuals in parentheses next to the point, or making the point larger or darker in proportion to the number of individuals.
- The straight line that best fits the points in the scatterplot, the regression line, can also be included.

Online (Internet) Surveys

A common way of conducting surveys; they're easy to construct and use. Contact is typically made via email, but it is tough to find a comprehensive list of email addresses to serve as a sampling frame. Alternatively, a request to participate in the survey with a link to it can be posted on websites that the population is likely to visit, but it would be difficult to get anything approaching a random sample this way because the members of the population who visit those websites are likely to differ from the population as a whole. All in all, given their low cost and the growing number of people online, internet surveys are likely to become the dominant approach to survey data collection in the near future.

Normal Distribution (Bell Curve)

A continuous probability distribution that is symmetrical on both sides of the mean and falls off into tails at both ends.

Reversal/ABABA Design (Single-Subject)

A design in which a baseline condition (A) is measured first, followed by measurements during a treatment condition (B), followed by a return to the baseline measurement condition (A), followed by a return to the treatment condition (B) and a final baseline measurement condition (A) to verify that the change in behavior is linked to the experimental condition.

Mixed Factorial Design

A design that includes both independent groups (BS) and repeated measures (WS) variables; each participant is tested in some, but not all, of the conditions.

Central Tendency

The middle of a distribution, the point around which the scores in the distribution tend to cluster. Measured by the mean, median, and mode.

Symmetrical Distribution

A distribution where the left and right are mirror images of each other.

Negatively Skewed Distribution

A distribution where the peak is shifted toward the upper end of its range and the distribution has a long negative tail.

Positively Skewed Distribution

A distribution where the peak is toward the lower end of its range and the distribution has a long positive tail.

Higher-Order Factorial Design

A factorial research design with more than two factors.

Combined Strategy

A factorial study that combines two different research strategies, such as experimental and non-experimental or quasi-experimental, in the same factorial design.

Mixed Design

A factorial study that combines two research designs (i.e., BS and WS) in the same factorial design.

Null Hypothesis Testing/Null Hypothesis Significance Testing

A formal approach to deciding between two interpretations of a statistical relationship in a sample.

Bar Graph

A graph that uses horizontal or vertical bars to display data; used to present and compare the mean scores for two or more groups or conditions.

Scatterplot

A graphed cluster of dots, each of which represents an individual rather than the mean for a group of individuals. It presents relationships between quantitative variables when the variable on the x-axis (IV) has a large number of levels.

Histogram

A graphical display of a distribution that conveys the same information as a frequency table but is easier and quicker to grasp. The x-axis shows the possible scores and the y-axis shows the frequency (the number of individuals with each score). A small gap between the bars is sometimes shown, but only when the labels are categorical rather than quantitative.

Two-Tailed Test

A hypothesis test in which rejection of the null hypothesis occurs if the t score for the sample is extreme in either direction.

One-Tailed Test

A hypothesis test in which rejection of the null hypothesis occurs if the t score for the sample is extreme in one direction that we specify before collecting the data.

Cohen's d

A measure of effect size indicating how many standard deviations two group means are from each other. An appropriate effect size measure for the comparison between two means.

Variance

A measure of variability/spread; the mean of the squared differences from the mean, equal to the standard deviation squared.

Manipulation Check

A measure, typically administered at the end of the study, used to determine whether the manipulation of the independent variable was successful.

Likert Scale

A numerical scale used to assess people's attitudes; it includes a set of possible answers with labeled anchors on each extreme.

Repeated-Measures ANOVA

A one-way ANOVA that involves correlated groups of participants. The main difference is that measuring the dependent variable multiple times for each participant allows for a more refined measure of MSW.

Alternative Explanations to a Change in Posttest Scores: Regression to the Mean

A participant who scores extremely high or low on a variable on one occasion will tend to score closer to the mean (less extremely) on the next occasion. This is especially an issue when participants are selected for further study because of their extreme scores.

Replicability Crisis

A phrase that refers to the inability of researchers to replicate earlier research findings.

Open Science Practices

A practice in which researchers openly share their research materials with other researchers in hopes of increasing the transparency and openness of the scientific enterprise.

Cluster Sampling

A probability sampling technique in which larger clusters of individuals are randomly sampled and then individuals within each cluster are randomly sampled. It is especially useful for surveys that involve face-to-face interviews because it minimizes the amount of traveling that the interviewers must do.

Disproportionate Stratified Random Sampling

A probability sampling technique that can be used to sample extra respondents from particularly small subgroups, allowing valid conclusions to be drawn about those subgroups.

Proportionate Stratified Random Sampling

A probability sampling technique that can be used to select a sample in which the proportion of respondents in each of the various subgroups matches the proportion in the population.

Sampling Bias

A problem that occurs when a sample is selected in such a way that it isn't representative of the population and therefore produces inaccurate results.

Interrupted Time-Series Design

A quasi-experimental design in which a set of measurements is taken at intervals over a period of time. It's similar to a pretest-posttest design in that it measures the dependent variable before and after a treatment, but it includes multiple pretest and posttest measurements.

One-Group Posttest Only Design

A quasi-experimental design where a treatment is used (independent variable is manipulated) and then the dependent variable is measured after.

One-Group Pretest-Posttest Design

A quasi-experimental design where the dependent variable is measured before and after the treatment is used (the independent variable is manipulated).

Open-Ended Item

A questionnaire item that asks a question and respondents can answer in whatever way they want. Usually qualitative.

Closed-Ended Item

A questionnaire item that asks a question with several response options that respondents choose from. For categorical variables like sex, race, or political party; the categories are usually listed and participants choose the one (or ones) that they belong to. Quantitative variables use rating scales.

Linear Relationships

A relationship that fits a straight line on a graph.

Nonlinear Relationship

A relationship in which the points of a scatterplot are better fit by a curved line than by a straight line.

Alternative Hypothesis (H1)

A relationship/difference between variables (sample and population).

Factorial Design

A research design that includes two or more factors.

Three-Factor Design

A research study involving three independent or quasi-independent variables.

Two-Factor Design

A research study involving two independent or quasi-independent variables.

Single-Factor Design

A research study with one independent variable or one quasi-independent variable.

Quasi-Experimental Research

A research technique in which the two or more groups that are compared are selected based on predetermined characteristics, rather than random assignment.

Multiple-Treatment Reversal/ABCACB Design (Single-Subject)

A researcher establishes a baseline of studying behavior for a disruptive student (A), then introduces a treatment involving positive attention from the teacher (B), and then switches to a treatment involving mild punishment for not studying (C). The participant is then returned to a baseline phase (A), receives the mild punishment treatment again (C), and finally receives the positive attention treatment again (B).

Cluster Sampling Example

A researcher wants to select a sample of small-town residents in the US. The researcher might randomly select several small towns and then randomly select several individuals within each town.

Confidence Interval Example

A sample of 20 students might have a mean calorie estimate for a cookie of 200, with a 95% confidence interval of 160 to 240. There is a very good (95%) chance that the mean calorie estimate for the population of students lies between 160 and 240. The sample mean of 200 is significantly different at the .05 level from any hypothetical population mean that lies outside the confidence interval. For example, the confidence interval of 160 to 240 tells us the sample mean is statistically significantly different from a hypothetical population mean of 250, because the confidence interval does not include the value of 250.
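
As a rough illustration of how such an interval is computed, here is a minimal Python sketch; the sample standard deviation (about 85) is an assumption chosen so the numbers come out close to the example, since the card does not give it.

```python
# Hedged sketch of a 95% confidence interval around a sample mean.
# The SD of 85 is assumed; it is not given in the example above.
import numpy as np
from scipy import stats

M, SD, N = 200, 85, 20                     # sample mean, assumed SD, sample size
sem = SD / np.sqrt(N)                      # standard error of the mean
t_crit = stats.t.ppf(0.975, df=N - 1)      # two-tailed critical t for 95%
lower, upper = M - t_crit * sem, M + t_crit * sem
print(round(lower), round(upper))          # roughly 160 and 240
```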

Simple Random Sampling

A sampling procedure in which each individual in the population has an equal probability of being selected for the sample. This could involve putting the names of the individuals into a hat, mixing them, and drawing out the number needed for the sample. In practice, however, random sampling is more likely to involve computerized sorting or selection of respondents.
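
A minimal sketch of what computerized simple random sampling might look like, using Python's standard library; the population names are hypothetical.

```python
# Simple random sampling sketch: every individual in the frame has an
# equal chance of being drawn. Names are hypothetical.
import random

sampling_frame = [f"student_{i}" for i in range(1, 501)]   # frame of 500 people
sample = random.sample(sampling_frame, k=25)               # draw 25 without replacement
print(sample[:5])
```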

Non-Probability Sampling

A sampling technique in which the researcher can't specify the probability of each individual being selected. Convenience sampling is the most common form.

Applied Behavior Analysis

A scientific approach to understanding behavior. It refers to a set of principles that focus on how behaviors change, or are affected by the environment, and how learning takes place. The goal is to establish and enhance socially important behaviors. It plays an important role in contemporary research on developmental disabilities, education, organizational behavior, health, etc.

Interrupted Time-Series Design with Nonequivalent Groups

A set of measurements taken at intervals over time before and after an intervention of interest in two or more nonequivalent groups; this is an improvement on the interrupted time-series design because it adds a control group.

Descriptive Statistics

A set of techniques for summarizing and displaying data.

Registered Reports

A solution to the file drawer problem. Journal editors and reviewers evaluate research submitted for publication without knowing the results of that research. If the research question is judged to be interesting and the method is judged to be sound, then a non-significant result should be published just like a significant one. There are now even journals that publish only nonsignificant results, like the Journal of Articles in Support of the Null Hypothesis.

Data File

A spreadsheet program like Excel or a statistical analysis program like SPSS is used to create a data file. The most common format is for each row to represent a participant and for each column to represent a variable.
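
A minimal pandas sketch of this layout, with hypothetical column names and values:

```python
# Data-file layout sketch: one row per participant, one column per variable.
# Column names and values are hypothetical.
import pandas as pd

data = pd.DataFrame({
    "participant": [1, 2, 3],
    "condition":   ["A", "B", "A"],
    "score":       [12.5, 9.0, 14.2],
})
data.to_csv("data.csv", index=False)   # a format Excel or SPSS can open
```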

Test Statistic

A statistic that is computed solely to find the p-value.

Multiple Regression

A statistical technique that includes two or more predictor variables in a prediction equation. A huge advantage is that it can show whether an independent variable makes a contribution to a dependent variable over and above the contributions made by other independent variables.

Factor Analysis

A statistical technique that organizes the variables into a smaller number of clusters, such that the variables are strongly correlated WITHIN each cluster but weakly correlated BETWEEN clusters; each cluster is then interpreted as multiple measures of the same underlying construct (factor). The Big Five personality factors are a good example.

t-Test (t is ALWAYS italicized)

A statistical test used to evaluate the size and significance of the difference between two means.

Group Research

A type of quantitative research that involves studying large numbers of participants and examining their behavior primarily in terms of group means, standard deviations, etc. It is the most common approach in psychological research and stands in contrast to single-subject research.

Single-Subject Research

A type of quantitative research that involves studying the behavior of each of a small number of participants in detail, typically 2-10 participants. Also known as small-n designs.

Probability Sampling

A type of sampling in survey research where each member of the population has a known probability of being selected for the sample. Types of probability sampling are simple random sampling, stratified random sampling, and cluster sampling.

Degrees of Freedom (df)

A value derived from the number of subjects and the number of levels of the independent variables in a study (e.g., N - 1 for a one-sample t-test).

Confidence Interval Formula

Above, but look at photo.
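
The photo isn't reproduced here; as a hedged reconstruction consistent with the example above, the usual 95% confidence interval around a sample mean takes this form:

```latex
% Reconstruction (the referenced photo is not available): the usual 95%
% confidence interval around a sample mean.
\[ CI_{95\%} = M \pm t_{crit} \times SEM, \qquad SEM = \frac{SD}{\sqrt{N}} \]
```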

Does Psychotherapy Work?

After many studies, psychotherapy has been found to be effective; the average treatment participant improved more than 80% of control participants.

Between-Subjects Factorial Design

All of the independent variables are manipulated between subjects; each participant is tested in only one condition.

Within-Subjects Factorial Design

All of the independent variables are manipulated within subjects; each participant is tested in all conditions. This is more efficient for the researcher and controls extraneous participant variables compared to BS.

Mean, Median, Mode (Unimodal and Symmetrical)

All three measures of central tendency will be very close to each other at the peak of the distribution.

Multiple-Baseline Design Across Settings (Single-Subject) Example

A baseline might be established for the amount of time a child spends reading during their free time at school and during their free time at home. Then a treatment, such as positive attention, might be introduced first at school and later at home. If the dependent variable changes after the treatment is introduced in each setting, this gives the researcher confidence that the treatment is responsible for the change.

Multiple-Treatment Reversal Design (Single-Subject)

A baseline phase followed by separate phases in which different treatments are introduced.

Quasi-Experimental Research vs Experimental Research

Quasi-experimental research resembles experimental research in that it has a manipulated independent variable, but it isn't a true experiment because it doesn't have random assignment or counterbalancing. It does eliminate the directionality problem, since the independent variable is manipulated before the dependent variable is measured, but it also creates a huge issue of confounds. In terms of internal validity, it falls between experiments and non-experiments. It is usually conducted in field settings, unlike true experiments, where the setting is more controlled.

Cognitive Model

Respondents must interpret the question, retrieve relevant information from memory, form a tentative judgement, convert the tentative judgement into one of the response options provided, and edit their response as necessary.

What is Survey Research?

Non-experimental. It describes single variables (e.g., the percentage of voters who prefer one candidate or another) and assesses statistical relationships between variables (e.g., the relationship between income and health). BUT surveys can also be used in experimental research, since you can use surveys while also manipulating the independent variable and using random sampling.

Snowball Sampling

Non-probability sampling in which existing research participants help recruit additional participants.

Self-Selection Sampling

Non-probability sampling in which individuals choose to take part in the research on their own accord, without being approached by the researcher directly.

Quota Sampling

Non-probability sampling in which subgroups in the sample are recruited to be proportional to those subgroups in the population.

Sample Size (N=50) and Relationship Strength

Not weak. Definitely medium and strong.

Sample Size (N=20) and Relationship Strength

Not weak nor medium. d = maybe strong; r = strong.

Testing Correlation Coefficients

The null states ρ = 0; the alternative states ρ ≠ 0. The t-test can be two-tailed if the researcher doesn't have a reason to believe the relationship will go in one particular direction. The correlation coefficient is treated as its own test statistic. If the p-value is <= 0.05, we reject the null; if the p-value is > 0.05, we fail to reject the null.

Posttest Only Nonequivalent Groups Design

Participants in one group are exposed to a treatment while the nonequivalent group isn't exposed, and the two groups are compared. To increase internal validity, which is already low due to confounds and no random assignment, researchers should try to select similar groups.

Defense For Null Hypothesis Testing

Robert Abelson argued that when it is correctly understood and carried out, null hypothesis testing does serve an important purpose; especially when dealing with new phenomena, it gives researchers a principled way to convince others that their results shouldn't be dismissed as mere chance occurrences.

Convenience Sampling

Studying individuals who happen to be nearby and willing to participate.

Mail Surveys

Surveys sent via mail; they are less costly but generally have even lower response rates, and they are the most susceptible to non-response bias.

Alternative Explanations to a Change in Posttest Scores: Testing

The act of measuring the dependent variable during the pretest affects the participants' response during the posttest. The researchers or the participants completing the measure might've created a bias or change in attitude.

Standard Deviation

The average distance between the scores and the mean; the most common measure of variability.

Variability

The extent to which the scores in a data set tend to vary from each other and from the mean.

Mean, Median, Mode (Highly Skewed)

The mean can be pulled so far in the direction of the skew (tail) that it isn't a good measure of central tendency of that distribution.

Mean, Median, Mode (Skewed)

The mean will differ from the median in the direction of the skew (tail).

Median

The middle score in a distribution; half the scores are above it and half are below it. To find the median, it is best to organize the scores from lowest -> highest and locate the score in the middle.

Nonresponse Bias

The most pervasive form of sampling bias, which occurs when people who don't respond to the survey differ in important ways from people who do respond. The best way to minimize it is to maximize the response rate: pre-notifying respondents (to minimize the number of non-responders), sending them reminders, constructing questionnaires that are short and easy to complete, and offering incentives.

Survey Research

The most popular technique for gathering primary data, in which a researcher interacts with people or people take a questionnaire to obtain facts, opinions, and attitudes. They typically have large samples and use random sampling.

Interaction Example

People who are highly motivated to change improve more in psychotherapy than people who aren't motivated to change. The effect of one independent variable (whether or not one receives psychotherapy) depends on the level of another (motivation to change).

Pretest-Posttest Design with Switching Replication

The nonequivalent (second) group is given a pretest of the dependent variable, then the first group receives a treatment while the nonequivalent doesn't, the dependent variable is assessed again, then the treatment is added to the nonequivalent group, and the dependent variable is assessed.

Z-Score

The number of standard deviations a particular score is from the mean.
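
A tiny worked sketch with hypothetical numbers:

```python
# z-score: how many standard deviations a score sits from the mean.
# Numbers are hypothetical.
score, mean, sd = 130, 100, 15
z = (score - mean) / sd
print(z)   # 2.0 -> two standard deviations above the mean
```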

Item-Order Effect

The order in which the items are presented may affect people's responses. One item can change how participants interpret a later item or change the information that they retrieve to respond to later items.

Percentage of Non-Overlapping Data (PND)

The percentage of responses in the treatment condition that are more extreme than the most extreme response in a relevant control condition.
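
A minimal sketch of that computation, assuming higher scores are better; both data series are hypothetical.

```python
# Percentage of non-overlapping data (PND): share of treatment responses more
# extreme than the most extreme baseline response. Data are hypothetical.
baseline  = [2, 3, 4, 3, 5]      # baseline/control phase
treatment = [6, 7, 5, 8, 9, 4]   # treatment phase
most_extreme_baseline = max(baseline)
non_overlapping = [x for x in treatment if x > most_extreme_baseline]
pnd = 100 * len(non_overlapping) / len(treatment)
print(pnd)   # about 66.7
```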

Percentile Rank

The percentage of scores in the distribution that are at or below a particular value.

Sampling Error

The random variability between the true value of a parameter in the population and the estimate of that value from the sample data. It's not a mistake; it's naturally occurring variability between the sample estimate and the population parameter.

Results of Factorial Experiments

The results with two independent variables can be graphed by putting one independent variable on the x-axis and representing the other with different colors; the y-axis is the dependent variable. Line graphs are better when the measurements have been taken over a time interval and when the variable on the x-axis is quantitative with a small number of distinct levels.

Grouped Frequency Table

The same as a frequency table but is used when there's a wide range of values. The first column lists the ranges of values and the second column lists the frequency of scores in each range.

Level

The specific values that the experimenter chooses for a factor. In visual inspection of single-subject data, level also refers to how high or low the dependent variable is within a phase: if it is much higher or much lower in one condition than another, this suggests that the treatment had an effect.

When to Use N-1

The standard deviation of a sample tends to be lower than the standard deviation of the population of interest. N-1 corrects this, and the result is usually a better estimate of the population standard deviation. Use this when you are mainly drawing conclusions about the population.

Standard Error of the Mean Equation

The standard deviation of the group divided by the square root of the sample size of that group.
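
A minimal worked sketch of that formula, with hypothetical numbers:

```python
# Standard error of the mean: SEM = SD / sqrt(N). Numbers are hypothetical.
import math

SD, N = 12.0, 36
sem = SD / math.sqrt(N)
print(sem)   # 2.0
```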

Standard Error of the Mean

The standard deviation of the sampling distribution of the mean. A difference between group means that is greater than two standard errors is statistically significant.

Statistical Power Example

The statistical power of a study with 50 participants and an expected Pearson's r of 0.30 in the population is 0.59: there is a 59% chance of rejecting the null if the population correlation is 0.30. The probability of a Type II error is therefore 1 - 0.59 = 0.41.

Main Effect (Detailed Definition)

The statistical relationship between one independent variable and a dependent variable, averaging across the levels of the other independent variables; there is one main effect for each independent variable.

Dependent Samples t-Test

The statistical test used to compare two means for the same sample tested at two different times or under two different conditions. Usually used for pretest-posttest designs or within-subjects experiments. The null hypothesis is that the means at the two times or under the two conditions are the same in the population. The alternative hypothesis is that they are not the same. This test can also be one-tailed if the researcher has good reason to expect the difference goes in a particular direction.

Mean Equation

The sum of the scores divided by the number of scores (μ for a population mean, M for a sample mean).

Alternative Explanations to a Change in Posttest Scores: Spontaneous Remission

The tendency for many medical and psychological problems to improve over time without any treatment, like the common cold or depression.

The Test Statistic for ANOVA

The test statistic is F, a ratio of two estimates of the population variance based on the sample data. One estimate of the population variance is called the mean squares between groups (MSB) and is based on the differences among the sample means. The other is the mean squares within groups (MSW) and is based on the differences among the scores within each group. So, F = MSB/MSW. The reason that F is useful is that we know how it is distributed when the null hypothesis is true.
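
A minimal sketch of that ratio on hypothetical data, checked against SciPy's built-in one-way ANOVA:

```python
# One-way ANOVA F ratio: F = MSB / MSW. Data are hypothetical.
import numpy as np
from scipy import stats

groups = [np.array([4., 5., 6., 5.]), np.array([7., 8., 6., 9.]), np.array([2., 3., 2., 1.])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, N, n = len(groups), len(all_scores), len(groups[0])

msb = sum(n * (g.mean() - grand_mean) ** 2 for g in groups) / (k - 1)   # between-groups
msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)        # within-groups
print(msb / msw, stats.f_oneway(*groups).statistic)                     # same F value
```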

Null Hypothesis Test and Relationship Strength Example

There is a BS experiment with 20 participants in each of the two conditions and a medium difference (d = 0.50) that is expected in the population. So, the statistical power is 0.34. If there is a medium difference in the population, there is only about a 1 in 3 chance of rejecting the null and a 2 in 3 chance of a Type II error. For this, there is an unacceptably low chance of rejecting the null and an unacceptably high chance of a Type II error.

Alternating Treatments Design

Two or more treatments are alternated relatively quickly on a regular schedule. For example, positive attention for studying could be used one day and mild punishment for not studying the next, and so on. Or one treatment could be implemented in the morning and another in the afternoon. The alternating treatments design can be a quick and effective way of comparing treatments, but only when the treatments are fast acting.

When to Use N

Use N when the goal is simply to describe the variability in the sample; this emphasizes that the variance is the mean of the squared differences and the standard deviation is the square root of that mean.

Type I Error Rate

When the null is true and alpha is 0.05, we mistakenly reject the null 5% of the time.
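
A quick simulation sketch of this idea; the group sizes and number of simulations are arbitrary choices.

```python
# Type I error rate sketch: when the null is true, roughly 5% of tests are
# "significant" at alpha = .05 just by sampling error.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, false_positives = 2000, 0
for _ in range(n_sims):
    a = rng.normal(0, 1, 20)   # both groups drawn from the same population,
    b = rng.normal(0, 1, 20)   # so the null hypothesis is true
    if stats.ttest_ind(a, b).pvalue <= 0.05:
        false_positives += 1
print(false_positives / n_sims)   # close to 0.05
```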

Implication of Sampling Error

When there is a statistical relationship in a sample, it's not always clear that there is a statistical relationship in the population. A small difference between two group sample means might indicate a small difference between the two groups in the population, but it could also be that there is no difference in the population and the sample difference reflects only sampling error. A correlation of -0.29 might mean there is a negative relationship in the population, but it could also be that there is no relationship in the population.

Fail to Reject the Null Hypothesis

When you do not have enough statistical strength to show a difference or an association. p > 0.05.

Interaction/Interaction Between Factors

In a factorial design, it occurs when the effect of one independent variable depends on the level of another independent variable.

Main Effect

In a factorial design, the mean differences among the levels of one factor.

Pearson's r Equation

It is the mean cross-product of z-scores. For the x variable, subtract the mean of x from each score and divide each difference by the standard deviation of x. For the y variable, subtract the mean of y from each score and divide each difference by the standard deviation of y. For each individual, multiply the two z-scores together to form a cross-product. Finally, take the mean of the cross-products.
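
A minimal sketch of that computation on hypothetical data, checked against NumPy's built-in correlation:

```python
# Pearson's r as the mean cross-product of z-scores. Data are hypothetical.
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 7.0])
y = np.array([1.0, 3.0, 5.0, 9.0, 6.0])

zx = (x - x.mean()) / x.std()       # z-scores (dividing by N, i.e., ddof=0)
zy = (y - y.mean()) / y.std()
r = (zx * zy).mean()                # mean cross-product of z-scores
print(r, np.corrcoef(x, y)[0, 1])   # the two values match
```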

Bipolar Questions (Rating Scale)

For bipolar scales, it is useful to offer an earlier question that branches respondents into an area of the scale. For example, if the question is about liking ice cream, first ask "Do you generally like or dislike ice cream?" Then refine the answer by offering relevant choices from the seven-point scale. Branching improves reliability and validity.

Single-Subject Research Data

It relies heavily on visual inspection. Inferential statistics aren't typically used.

What is Misleading About the Term: Effect Size?

It suggests a causal relationship, but an effect size does not make a relationship causal.

When to Use Group Research

It's ideal for testing the effectiveness of treatments at the group level. Among the advantages of this approach is that it allows researchers to detect weak effects. Finding a weak treatment effect might lead to refinements of the treatment that eventually produce a larger and more meaningful effect. Group research is also good for studying interactions between treatments and participant characteristics. It's also necessary to answer questions that can't be addressed using the single-subject approach, including questions about independent variables that cannot be manipulated, like number of siblings, extraversion, and culture.

One-Group Pretest-Posttest Design vs Within-Subjects Experiment

It's like a WS experiment because each participant is first tested in the control condition and then the treatment condition. It's unlike a WS experiment because the order of conditions is not counterbalanced, since it's practically impossible for a participant to be tested in the treatment condition first and then in an untreated control condition.

When to Use Single-Subject Research

It's particularly good for testing the effectiveness of treatments on individuals when the focus is on strong, consistent, and biologically or socially important effects. It's especially useful when the behavior of particular individuals is of interest. Clinicians who work with only one individual at a time may find that it is their only option for doing systematic quantitative research.

Preregistrations of Analysis Plans

Level 0: Journal says nothing. Level 1: Journal encourages pre-analysis plans and provides link in article to registered plan if it exists. Level 2: Journal encourages pre-analysis plans and provides link in article and certification of meeting preregistration badge requirements. Level 3: Journal requires preregistration of studies with analysis plans and provides link and badge in article to meeting requirements.

Preregistration of Studies

Level 0: Journal says nothing. Level 1: Journal encourages preregistration of studies and provides link in article to preregistration if it exists. Level 2: Journal encourages preregistration of studies and provides link in article and certification of meeting preregistration badge requirements. Level 3: Journal requires preregistration of studies and provides link and badge in article to meeting requirements.

Visual Inspection

Plotting individual participants' data as shown throughout the lesson, looking carefully at those data, and making judgements about whether and to what extent the independent variable had an effect on the dependent variable.

BRUSO (Unambiguous Example)

Poor: Are you a gun person? Effective: Do you currently own a gun?

BRUSO (Brief Example)

Poor: Are you now or have you ever been the possessor of a firearm? Effective: Have you ever owned a gun?

BRUSO (Objective Example)

Poor: How much do you support the new gun control measure? Effective: What is your view of the new gun control measure?

BRUSO (Specific Example)

Poor: How much have you read about the new gun control measure and sales tax? Effective: How much have you read about the new sales tax?

BRUSO (Relevant Example)

Poor: What is your sexual orientation? Effective: Do not include this item unless it is clearly relevant to the research.

Preconceptions and Findings Pertaining to Internet Surveys

- Preconception: Internet samples are not demographically diverse. Finding: Internet samples are more diverse than traditional samples in many domains, although they are not completely representative of the population.
- Preconception: Internet samples are maladjusted, socially isolated, or depressed. Finding: Internet users do not differ from nonusers on markers of adjustment and depression.
- Preconception: Internet-based findings differ from those obtained with other methods. Finding: Evidence so far suggests that Internet-based findings are consistent with findings based on traditional methods (e.g., on self-esteem, personality), but more data are needed.

Null Hypothesis (H0)

No relationship/difference between variables (sample and population).

Cohen's d Examples

In the formula, s_pooled is the pooled standard deviation (SD). A Cohen's d of 0.50 means that the two group means differ by 0.50 standard deviations; it is a medium-sized difference. A Cohen's d of 1.20 means that the two group means differ by 1.20 standard deviations; it is a large-sized difference.
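
A minimal sketch of the computation on hypothetical scores, using the simple equal-n pooled SD:

```python
# Cohen's d: difference between two means in pooled-SD units. Scores are hypothetical.
import numpy as np

group1 = np.array([22.0, 25.0, 27.0, 24.0, 26.0])
group2 = np.array([18.0, 20.0, 19.0, 21.0, 17.0])

s1, s2 = group1.std(ddof=1), group2.std(ddof=1)
s_pooled = np.sqrt((s1**2 + s2**2) / 2)            # simple pooled SD (equal n)
d = (group1.mean() - group2.mean()) / s_pooled
print(round(d, 2))                                 # a very large d in this toy example
```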

What Do Outliers Represent?

- Extreme scores on the variable of interest.
- Errors or misunderstandings made by the researchers or participants.
- Equipment malfunctions.

Factor

A variable that differentiates a set of groups or conditions being compared in a research study. In an experimental design, it's an independent variable.

Difference Score

A variable that has been formed by subtracting one variable from another.

HARKing

Mining the data without an a priori hypothesis, only to claim that a statistically significant result had been originally predicted; short for Hypothesizing After the Results are Known.

Multiple Dependent Variables

More than one dependent variable in the same study; this allows researchers to answer more questions with minimal additional effort.

p-Value (p is ALWAYS italicized)

The probability level that forms the basis for deciding whether results are statistically significant (not due to chance).

Reject the Null Hypothesis

When you have enough statistical strength to show a difference or a relationship. p<=0.05.

Independent-Samples t-Test Equation

In the formula, X̄ = M (the sample mean of each group) and s = SD (the sample standard deviation of each group).

One-Sample t-Test Equation

In the formula, X̄ = M, μ = μ0, and s = SD. M is the sample mean, μ0 is the hypothetical population mean, SD is the sample standard deviation, and N is the sample size.
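
A minimal sketch of the statistic t = (M - μ0) / (SD / √N) on hypothetical scores, checked against SciPy's built-in test:

```python
# One-sample t statistic: t = (M - mu0) / (SD / sqrt(N)). Data and mu0 are hypothetical.
import numpy as np
from scipy import stats

scores = np.array([105., 110., 98., 112., 107., 101., 109., 104.])
mu0 = 100                                            # hypothetical population mean
M, SD, N = scores.mean(), scores.std(ddof=1), len(scores)
t = (M - mu0) / (SD / np.sqrt(N))
print(t, stats.ttest_1samp(scores, mu0).statistic)   # same t value
```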

Confidence Interval Equation

In the formula, X̄ = M (the sample mean) and σ = SD (the standard deviation).

Reversal/ABA Design (Single-Subject) Simple Definition

An experimental design, often involving a single subject, wherein a baseline period (A) is followed by a treatment (B). To confirm that the treatment resulted in a change in behavior, the treatment is then withdrawn (A).

Multiple-Baseline Design Across Behaviors (Single-Subject)

An experimental, single-subject design where multiple baselines are established for the same participant but for different dependent variables, and the treatment is introduced at a different time for each dependent variable.

Multiple-Baseline Design Across Participants (Single-Subject)

An experimental, single-subject design where multiple baselines are established for different participants on the same dependent variable, and the treatment is introduced at a different time for each participant.

Multiple-Baseline Design Across Settings (Single-Subject)

An experimental, single-subject design where multiple baselines are established for the same participant but in different settings.

Outlier

An extreme score that is either very high or very low in comparison with the rest of the scores in the distribution.

Case Studies

An in-depth analysis and description of an individual, typically qualitative in nature. Although it focuses on only one person, it is not the same as single-subject research, which is quantitative and experimental.

Non-Manipulated Independent Variable

An independent variable that is measured but not manipulated; they are usually participant variables (i.e., self-esteem) and, by definition, BS factors.

Rating Scale

An ordered set of responses that the participants must choose from, quantitative.

Who Clarified the Assumptions of Single-Subject Research?

B. F. Skinner clarified many of its assumptions and refined many of its techniques. He and others used it to describe how rewards, punishments, and other external factors affect behavior over time. This work was carried out mostly using nonhuman subjects, especially rats and pigeons. Skinner called this approach the experimental analysis of behavior; it remains an important subfield of psychology and continues to rely almost exclusively on single-subject research.

p-Value (High)

Fail to reject H0, means that the sample or more extreme result would be likely if the null hypothesis were true.

Type II Error

Failing to reject a false null hypothesis, false negative.

Parameters

Characteristics of a population (i.e. standard deviation, mean, etc.).

One-Sample t-Test

Compares a sample mean (M) with a hypothetical population mean (mu0) that provides some interesting standard of comparison. The null hypothesis is that the mean for the population (mu) is equal to the hypothetical population mean (mu = mu0). The alternative hypothesis is that the mean for the population is different from the hypothetical population mean (mu =/ mu0). To decide between these two hypotheses, you need to find the probability of obtaining the sample mean (or one more extreme) if the null were true. But to get the p-value, you need to compute a test statistic called t.

Reject the Null in One-Way ANOVA

Conclude that the group means are not all the same in the population. With three groups, this can indicate that all three means are significantly different from each other, or that one of the means is significantly different from the other two but the other two are not significantly different from each other.

Group Research Data

Group data are described using statistics such as means, standard deviations, correlation coefficients, etc. to detect general patterns. Inferential statistics are used to help decide whether the result for the sample is likely to generalize to the population.

Factors in Visual Inspection

Level, trend, and latency.

Sample Size (N=100) and Relationship Strength

Definitely medium and strong. d = weak r = not weak

Sample Size (N=500) and Relationship Strength

Definitely weak, medium, and strong.

Strengths of Switching Replication with Treatment Removal Design

Demonstrating a treatment effect in two groups staggered over time, and demonstrating the reversal of the treatment effect after the treatment has been removed, provides strong evidence for the treatment's effectiveness. It can also provide evidence for whether the treatment continues to show effects after it has been withdrawn.

Effect Size

Describes the strength of an association/statistical relationship.

Reversal/ABA Design (Single-Subject) OG Definition

During the first phase, A, a baseline is established for the dependent variable. This is the level of responding before any treatment is introduced, so the baseline phase is a control condition. When steady state responding is reached, phase B begins as the researcher introduces the treatment. There may be a period of adjustment to the treatment during which the behavior of interest becomes more variable and begins to increase or decrease. The researcher waits until the dependent variable reaches a steady state so that it is clear whether and how much it has changed. Finally, the researcher removes the treatment and waits until the dependent variable reaches a steady state again (the second phase A).

Experimental Analysis of Behavior vs Applied Behavior Analysis

The experimental analysis of behavior (EAB) involves basic research designed to add to the body of knowledge, while ABA focuses on applying these behavioral principles to real-world situations.

BRUSO (Relevant)

Effective questionnaire items are also relevant to the research question. This makes the questionnaire faster to complete, but also avoids annoying respondents with what they will rightly perceive as irrelevant or "nosy" questions.

BRUSO (Objective)

Effective questionnaire items are objective in the sense that they don't reveal the researcher's own opinions or lead participants to answer in a particular way.

BRUSO (Brief)

Effective questionnaire items are short and to the point. They avoid long, overly technical, or unnecessary words.

BRUSO (Specific)

Effective questionnaire items are specific, so that it is clear to respondents what their response should be about and clear to researchers what it is about.

BRUSO (Unambiguous)

Effective questionnaire items are unambiguous so they can be interpreted in only one way.

Why Do Type I Errors Happen?

Even when there is no relationship in the population, sampling error will occasionally produce an extreme result.

Low Replicability

If a team's findings have low replicability or are impossible to replicate, others might accuse the team of questionable research practices such as:
- The selective deletion of outliers to influence (usually by artificially inflating) statistical relationships among the measured variables.
- The selective reporting of results, cherry-picking only those findings that support one's hypotheses.
- HARKing.
- p-hacking.
- Fabrication of data.

Exploratory Analysis

Examining the possibility that there might be relationships in the data that were not hypothesized. These analyses help explore the data for other interesting results that might be used in this research or in the future. If they are included in the study, they should be clearly separated from the planned analyses, interpreted with caution, and flagged as needing replication.

Standard Deviation Equation

Find the difference between each score and the mean, square each difference and find the mean of those squared differences, and then take the square root of that mean. In the notation used here, μ = M and σ = SD; the denominator is N (or N - 1 when estimating the population standard deviation).
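
A step-by-step sketch of that procedure on hypothetical scores, alongside NumPy's version:

```python
# Standard deviation, step by step: mean squared deviation, then square root.
# Scores are hypothetical; ddof=0 divides by N, ddof=1 divides by N - 1.
import numpy as np

scores = np.array([4.0, 8.0, 6.0, 5.0, 7.0])
squared_diffs = (scores - scores.mean()) ** 2
sd = np.sqrt(squared_diffs.mean())          # divide by N
print(sd, np.std(scores, ddof=0))           # same value
```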

Qualitative Research

Focuses on understanding people's subjective experience by observing behavior and collecting relatively unstructured data (i.e., detailed interviews) and analyzing those data using narrative rather than quantitative techniques. It contrasts with single-subject research, which is quantitative.

Figures

Graphs, diagrams, flowcharts, etc. are in figure form.

Similarities Between Dependent Samples t-Test and One-Sample t-Test

The first step in the dependent-samples t-test is to reduce the two scores for each participant to a single difference score by taking the difference between them. At this point, the dependent-samples t-test becomes a one-sample t-test on the difference scores. The hypothetical population mean (µ0) of interest is 0, because this is what the mean difference score would be if there were no difference on average between the two times or two conditions. We can now think of the null hypothesis as being that the mean difference score in the population is 0 (µ = 0) and the alternative hypothesis as being that the mean difference score in the population is not 0 (µ ≠ 0).
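
A minimal sketch of that equivalence on hypothetical pretest/posttest scores:

```python
# A dependent-samples t-test equals a one-sample t-test on the difference
# scores against mu0 = 0. Scores are hypothetical.
import numpy as np
from scipy import stats

pretest  = np.array([10., 12., 9., 14., 11., 13.])
posttest = np.array([13., 15., 10., 16., 12., 17.])
diff = posttest - pretest

print(stats.ttest_rel(posttest, pretest).statistic)
print(stats.ttest_1samp(diff, 0).statistic)   # identical t value
```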

Alternative Explanations to a Change in Posttest Scores: Maturation

Participants might've changed between the pretest and posttest because they're constantly learning and growing; it is inevitable. Their attitudes will change, especially in a long study.

p-Value (Low)

Reject H0, means that the sample or more extreme result would be unlikely if the null hypothesis were true.

Type I Error

Rejecting null hypothesis when it is true, false positive.

P-Hacking

Researchers make various decisions in the research process to increase their chance of a statistically significant result (and Type I error) by arbitrarily removing outliers, selectively choosing to report dependent variables, only presenting significant results, etc. until their results yield a desirable p-value.

Prefix of Quasi

Resembling

Participants in Survey Research Are...

Respondents.

Context Effect

Survey questionnaire responses are subject to numerous context effects due to wording, item order, response options, and other factors. Researchers should be sensitive to such effects when constructing surveys and interpreting results.

Mean

The average of a distribution.

Practical Significance

The importance or usefulness of the result in some real-world context. It is different from statistical significance, and is similar to clinical significance when the research is in a clinical setting.

Mean, Median, Mode (Bimodal)

The mean and median will be between the peaks, while the mode will be the tallest peak.

Mode

The most frequent score in a distribution.

Latency

The time it takes for the dependent variable to begin changing after a change in conditions. In general, if a change in the dependent variable begins shortly after a change in conditions, this suggests that the treatment was responsible.

Switching Replication with Treatment Removal Design

The treatment is removed from the first group when it's added to the second group.

Critical Values (for df and alpha)

The values that lie exactly on the boundary of the region of rejection.

Weaknesses of One-Group Posttest Only Design

The weakest type of quasi-experimental design, as it lacks a control group or a comparison group. Nevertheless, these results are frequently reported in the media and then misinterpreted by the public.

Bimodal

Two peaks (modes).

Raw Data

Unanalyzed data.

Survey Research Characteristics

- Quantitative and qualitative.
- The variables of interest are measured using self-reports (questionnaires or interviews).
- There is considerable attention paid to the issue of sampling; survey research usually uses large random samples so that it can provide the most accurate estimates of what is true in the population.

When to Use Tables

- A common use is to present several means and standard deviations (usually for complex research designs with multiple independent and dependent variables).
- A common use is to present correlations, usually Pearson's r, among several variables; this is called a correlation matrix.

Null Hypothesis Testing Techniques

- Assume that the null hypothesis is true.
- Determine how likely the sample relationship would be if the null hypothesis was true.
- If the sample relationship would be extremely unlikely, reject the null hypothesis in favor of the alternative hypothesis.
- If the sample relationship would not be extremely unlikely, then retain (fail to reject) the null hypothesis.

APA Guidelines: Axis Labels and Legends

- Axis labels should be clear and concise and include the units of measurement if they don't appear in the caption.
- Axis labels should be parallel to the axis.
- Legends should appear within the figure.
- Text should be in the same simple font throughout and no smaller than 8 point and no larger than 14 point.

What To Do Before Analyzing Raw Data

- Be sure the data don't include any information that might identify individual participants, and be sure that you have a separate secure location where you can store any consent forms.
- Unless the data are highly sensitive, a locked room or password-protected computer is usually good enough.
- Make photocopies or backup files of your data and store them in another secure location until the project is complete. Professional researchers usually keep a copy of their raw data and consent forms for several years in case questions about the procedure, the data, or participant consent arise after the project is completed.
- Check your raw data to ensure that they are complete and appear to have been accurately recorded.
- If there are illegible or missing responses or obvious misunderstandings (like a response of 12 on a 1-10 rating scale), you will have to decide whether such problems are severe enough to make a participant's data unusable. If information about the main independent or dependent variable is missing, or if several responses are missing or suspicious, you may have to exclude that participant's data from the analyses. Set them aside and keep notes on why you excluded them.

How to Improve p-Values and Null Hypothesis Testing

- Adding an effect size measure (like Cohen's d or Pearson's r) gives the researcher an estimate of how strong the relationship in the population is, not just whether there is one.
- Confidence intervals.
- Bayesian statistics.

APA Guidelines: Captions

- Captions are titled with the word "Figure", followed by the figure number in the order in which it appears in the text, and terminated with a period.
- The title is italicized.
- After the title is a brief description of the figure terminated with a period (e.g., "Reaction times of the control versus experimental group.").
- Following the description, include any information needed to interpret the figure, such as any abbreviations, units of measurement (if not in the axis label), units of error bars, etc.

How to Approach Post Hoc Comparisons

- Conduct a series of independent-samples t-tests comparing each group mean to each of the other group means. But if we conduct several t-tests when the null is true, the chance of mistakenly rejecting at least one null increases with each test we conduct.
- So standard t-tests aren't used; modified versions (the Bonferroni procedure, Fisher's least significant difference [LSD] test, and Tukey's honestly significant difference [HSD] test) are used instead (see the sketch below).
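
A minimal sketch of the Bonferroni idea named above, using hypothetical data: each pairwise p-value is compared to alpha divided by the number of comparisons.

```python
# Bonferroni-style correction: with three groups there are three pairwise
# t-tests, so compare each p-value to alpha / 3. Data are hypothetical.
from itertools import combinations
import numpy as np
from scipy import stats

groups = {"A": np.array([4., 5., 6., 5.]),
          "B": np.array([7., 8., 6., 9.]),
          "C": np.array([2., 3., 2., 1.])}
alpha, n_comparisons = 0.05, 3

for (name1, g1), (name2, g2) in combinations(groups.items(), 2):
    p = stats.ttest_ind(g1, g2).pvalue
    print(name1, name2, "significant" if p <= alpha / n_comparisons else "not significant")
```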

Replicability Crisis Benefits

- Designing and conducting studies that have sufficient statistical power to increase the reliability of findings.
- Publishing both null and significant findings (counteracting the publication bias and reducing the file drawer problem).
- Describing one's research designs in sufficient detail to enable other researchers to replicate that study with an identical or very similar procedure.
- Conducting high-quality replications and publishing these results.

Preliminary Analyses

- For multiple-response measures, you should assess the internal consistency of the measure. Statistical programs will allow you to compute Cronbach's α or Cohen's κ and even compute and evaluate a split-half correlation.
- Analyze each important variable separately if they're not manipulated. Make histograms for each one, note their shapes, compute the measures of central tendency and variability, and understand what each statistical result MEANS.
- Identify outliers, examine them more closely, and decide what to do about them. Keep in mind that outliers don't always represent an error, misunderstanding, or lack of effort. They might represent truly extreme responses or participants; it's plausible that they represent honest and even accurate estimates.

Cohen's d Benefits

- Has the same meaning regardless of the variable being compared or the scale it was measured on.
- It's easier for researchers to communicate their results to other researchers.
- Makes it possible to combine and compare results across different studies using different measures.

Closed-Ended Item Benefits

- Quick and easy for participants to complete.
- Easier for researchers to analyze because the responses can be easily converted to numbers and entered into a spreadsheet.

Who First Used Single-Subject Research?

- In the late 1800s, one of psychology's founders studied sensation and consciousness by focusing intensively on each of a small number of research participants.
- Hermann Ebbinghaus's research on memory.
- Ivan Pavlov's research on classical conditioning.

How to Increase Statistical Power

- Increase the strength of the relationship. This can be achieved by using a stronger manipulation or by more carefully controlling extraneous variables to reduce the amount of noise in the data (i.e., using a WS design rather than a BS design).
- Increase the sample size.

Making Conclusions from Sample Size

- It allows you to develop expectations about how your formal null hypothesis tests are going to come out, which allows you to detect problems in your analyses.
- The ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.

Assumptions of Single-Subject Research

- It is important to focus intensively on the behavior of individual participants. One reason is that group research can hide individual differences and generate results that do not represent the behavior of any individual. For example, a treatment that has a positive effect for half of the people but a negative effect for the other half would, on average, appear to have no effect at all. Single-subject research, however, would reveal these individual differences. A second reason is that sometimes it is the behavior of a particular individual that is primarily of interest. For example, a school psychologist might be interested in changing the behavior of a particular disruptive student. Although previously published research (single-subject and group) is likely to provide some guidance on how to do this, conducting a study on this student would be more direct and probably more effective.
- It is important to discover causal relationships through the manipulation of an independent variable, the careful measurement of a dependent variable, and the control of extraneous variables. For this reason, single-subject research is often considered a type of experimental research with good internal validity. For example, Hall and his colleagues measured their dependent variable (studying) many times, first under a no-treatment control condition, then a treatment condition (positive teacher attention), and then the control condition again. Because there was a clear increase in studying when the treatment was introduced, a decrease when it was removed, and an increase when it was reintroduced, there is little doubt that the treatment was the cause of the improvement.
- It is important to study strong and consistent effects that have biological or social importance. Applied researchers, in particular, are interested in social validity. For example, the study by Hall and his colleagues had good social validity because it showed strong and consistent effects of positive teacher attention on a behavior that is of obvious importance to teachers, parents, and students. Furthermore, the teachers found the treatment easy to implement, even in their often chaotic elementary school classrooms.

Closed-Ended Item Negatives

- More difficult to write because they must include an appropriate set of response options.

Important Notes About Factorial Designs

- Non-manipulated independent variables are usually participant variables (e.g., self-esteem, gender). - These studies are generally considered to be experiments as long as at least one independent variable is manipulated. - Causal conclusions can only be drawn about the manipulated independent variable(s).

Criticisms of Null Hypothesis Testing

- The p-value is widely misinterpreted as the probability that the null hypothesis is true; in fact, it is the probability of the sample result (or a more extreme one) if the null hypothesis is true. - 1 - p is often misinterpreted as the probability of replicating a statistically significant result, which is not true. - The strict convention of rejecting the null when p <= 0.05 and retaining it when p > 0.05 makes little sense; there shouldn't be a rigid dividing line between results that are considered significant and results that aren't. - Deciding that results are only significant when p <= 0.05 adds to the file drawer problem. - It's not very informative. The null hypothesis basically says that there is no relationship between the variables in the population (Cohen's d or Pearson's r is precisely 0), so rejecting the null says only that there is some nonzero relationship in the population, which isn't saying much. - Rejecting the null doesn't tell us anything we didn't know before.

File Drawer Problem Effects

- The published literature (that has significant results) probably contains a higher proportion of Type I errors than we might expect on the basis of statistical considerations alone. - Even when there is a relationship between two variables in the population, the published research literature is likely to overstate the strength of that relationship.

External Validity

- The strong and consistent effects they are typically interested in, even when observed in small samples, are likely to generalize to others in the population. - There is a strong emphasis on replicating research results. When single-subject researchers observe an effect with a small sample of participants, they typically try to replicate it with another small sample, perhaps with a slightly different type of participant or under slightly different conditions. Each time they observe similar results, they rightfully become more confident in the generality of those results. - The principles of classical and operant conditioning, most of which were discovered using the single-subject approach, have been successfully generalized across an incredibly wide range of species and situations. - Studying large groups of participants doesn't entirely solve the problem of generalizing to other individuals!

Sample Size and Relationship Strength

- The stronger the sample relationship and the larger the sample, the less likely the result would be if the null hypothesis were true (the lower the p-value). - A weak relationship can still reach statistical significance if the sample is large, and a strong relationship can fail to reach significance if the sample is small.

Statistical Relationship (Sample)

- There is a relationship in the population, and the relationship in the sample reflects this. - There is no relationship in the population, and the relationship in the sample reflects only sampling error.

Advantages of Confidence Intervals

- They are much easier to interpret than null hypothesis tests. - They provide the information necessary to do a null hypothesis test if the person wants to.

Open-Ended Item Negatives

- They take more time and effort on the part of participants. - They're more difficult for the researcher to analyze because the answers must be transcribed, coded, and submitted to some form of qualitative analysis, such as content analysis. - Respondents are more likely to skip them because they take longer to answer.

Open-Ended Item Benefits

- They're easy to write because there are no response options to worry about.

Short History of Survey Research

- Turn of the 20th century: researchers wanted to document the extent of social problems like poverty. - Beginning of the 1930s: advances in questionnaire design, including techniques that are still used today, like the Likert scale. - 1930s: the US government conducted surveys to document economic and social conditions in the country. The need to draw conclusions about the entire population helped advance sampling procedures. - 1936: in the presidential election between Alf Landon and Franklin Roosevelt, the magazine Literary Digest conducted a survey by sending ballots to millions of Americans and predicted a Landon victory, but Roosevelt won in a landslide; pollsters using smaller, more scientifically selected samples correctly predicted the outcome. - 1948: the first national election survey was conducted by the Survey Research Center at the University of Michigan.

Misinterpretation of P-Value (1-p)

The belief that 1 - p equals the probability of replicating a statistically significant result is a misinterpretation. In one survey, 60% of a sample of professional researchers thought that a p-value of 0.01 (for an independent-samples t-test with 20 participants in each sample) meant there was a 99% chance of replicating the statistically significant result.

Multiple-Baseline Design Across Participants (Single-Subject) Example

A study by Scott Ross and Robert Horner explored how a school-wide bullying prevention program affected the bullying behavior of particular problem students. At each of three different schools, the researchers studied two students who had regularly engaged in bullying. During the baseline phase, they observed the students for 10-minute periods each day during lunch recess and counted the number of aggressive behaviors they exhibited toward their peers. After 2 weeks, they implemented the program at one school. After 2 more weeks, they implemented it at the second school. And after 2 more weeks, they implemented it at the third school. They found that the number of aggressive behaviors exhibited by each student dropped shortly after the program was implemented at the student's school. Notice that if the researchers had only studied one school, or if they had introduced the treatment at the same time at all three schools, then it would be unclear whether the reduction in aggressive behaviors was due to the bullying program or to something else that happened at about the same time it was introduced (e.g., a holiday, a TV program, a change in the weather). But with their multiple-baseline design, this kind of coincidence would have to happen three separate times, a very unlikely occurrence, to explain their results.

Factorial Design (Detailed Definition)

A study in which there are two or more independent variables. Each level of each independent variable (factor) is combined with each level of the others to produce all possible combinations.
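
As a minimal sketch (with hypothetical factor names and levels), the full set of conditions in a factorial design can be generated by crossing every level of each factor:

    from itertools import product

    # Hypothetical 2 x 3 factorial design: each level of "cell phone use"
    # is combined with each level of "time of day".
    cell_phone_use = ["no phone", "phone"]
    time_of_day = ["morning", "afternoon", "evening"]

    conditions = list(product(cell_phone_use, time_of_day))
    for phone, time in conditions:
        print(f"{phone} / {time}")
    print(f"Total conditions: {len(conditions)}")  # 2 x 3 = 6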

Multiple-Baseline Design Across Behaviors (Single-Subject) Example

A study on the effect of setting clear goals on the productivity of an office worker who has two tasks: making sales calls and writing reports. Baselines for both tasks could be established. The treatment could then be introduced for one of these tasks, and at a later time the same treatment could be introduced for the other. If productivity increases on one task after the treatment is introduced, it is unclear whether the treatment caused the increase. But if productivity increases on both tasks after the treatment is introduced, especially when the treatment is introduced at two different times, then it seems much clearer that the treatment was responsible.

Frequency Table

A table for organizing a set of data that shows the number of times each item or number appears, often listed from most frequent to least frequent. It can also be used for categorical variables.
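
A minimal sketch (with made-up scores) of building a frequency table in Python:

    from collections import Counter

    # Hypothetical self-esteem scores from a small sample
    scores = [24, 22, 24, 20, 24, 22, 23, 20, 22, 24]

    # Counter tallies how many times each value appears;
    # most_common() lists values from most to least frequent.
    for value, count in Counter(scores).most_common():
        print(f"{value}: {count}")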

Factorial Design Table

A table that represents all possible conditions of the experiment.

Correlation Matrix

A table that shows the correlation (Pearson's r) between every possible pair of variables in a study.
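
A minimal sketch (with hypothetical variables and data) of computing a correlation matrix with pandas:

    import pandas as pd

    # Hypothetical data on three quantitative variables
    df = pd.DataFrame({
        "stress":      [3, 7, 5, 9, 4, 6, 8, 2],
        "sleep_hours": [8, 5, 7, 4, 8, 6, 5, 9],
        "gpa":         [3.6, 2.9, 3.2, 2.5, 3.8, 3.1, 2.7, 3.9],
    })

    # corr() returns Pearson's r for every possible pair of columns
    print(df.corr())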

APA Guidelines: How Are Statistical Results Written?

Always in numbers, NEVER words.

AB vs ABA Design

An AB design is essentially an interrupted time-series design applied to an individual participant. One problem with that design is that if the dependent variable changes after the treatment is introduced, it isn't always clear that the treatment was responsible for the change. It's possible that something else changed at around the same time and that this extraneous variable is responsible for the change in the dependent variable. But if the dependent variable changes with the introduction of the treatment and then changes back with the removal of the treatment (assuming the treatment doesn't create a permanent effect), it is much clearer that the treatment and the removal of the treatment are the cause. The reversal greatly increases the internal validity of the study.

Stratified Random Sampling

An alternative to simple random sampling in which the population is divided into different subgroups (strata), usually based on demographic characteristics, and then a random sample is taken from each stratum. It can be used to select a sample in which the proportion of respondents in each of various subgroups matches the proportion in the population.
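
A minimal sketch (hypothetical population data; requires pandas 1.1+ for group-wise sampling) of proportionate stratified random sampling:

    import pandas as pd

    # Hypothetical population with a demographic stratum for each person
    population = pd.DataFrame({
        "person_id": range(10_000),
        "stratum": ["A"] * 1_260 + ["B"] * 560 + ["C"] * 8_180,
    })

    # Sampling 10% within each stratum keeps each subgroup's share of the
    # sample equal to its share of the population (proportionate sampling).
    sample = population.groupby("stratum").sample(frac=0.10, random_state=1)
    print(sample["stratum"].value_counts())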

Factorial ANOVA

An analysis of variance involving two or more independent variables or predictors. It produces separate F ratios and p-values for the main effects and interactions. Modifications must be made depending on whether the design is between-subjects, within-subjects, or mixed.
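
A minimal sketch (made-up data and factor names, between-subjects design) of a two-way factorial ANOVA with statsmodels:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Hypothetical 2 x 2 between-subjects design (4 participants per cell)
    df = pd.DataFrame({
        "phone":  ["no", "no", "no", "no", "yes", "yes", "yes", "yes"] * 2,
        "time":   ["day"] * 8 + ["night"] * 8,
        "errors": [2, 3, 2, 4, 5, 6, 5, 7, 3, 2, 4, 3, 9, 8, 10, 9],
    })

    # Fit a model with both main effects and their interaction, then get an
    # ANOVA table with a separate F ratio and p-value for each effect.
    model = ols("errors ~ C(phone) * C(time)", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))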

Bayesian Statistics

An approach in which the researcher specifies the probability that the null hypothesis and any important alternative hypotheses are true before conducting the study, conducts the study, and then updates the probabilities based on the data. It is less popular than null hypothesis testing and confidence intervals.

Experimental Analysis of Behavior

An approach to experimental psychology that explores the relationships between particular experiences and changes in behavior, emphasizing the behavior of individuals rather than group averages. Founded by Skinner and is used in single-subject research.

Tables

An arrangement of data made up of horizontal rows and vertical columns.

Spreading Interactions

One independent variable has an effect at one level of the other independent variable but a weak effect or no effect at the other level.

Reversal/ABAB Design (Single-Subject)

An experimental design, often involving a single subject, wherein a baseline period (A) is followed by a treatment (B). To confirm that the treatment resulted in a change in behavior, the treatment is then withdrawn (A) and reinstated (B). Hall and his colleagues used an ABAB design; baseline, positive attention, baseline, positive attention.

Seven-Point Rating Scales

Best for bipolar scales, where there are two opposite ends, such as liking (Like very much, Like somewhat, Like slightly, Neither like nor dislike, Dislike slightly, Dislike somewhat, Dislike very much).

Five-Point Rating Scales

Best for unipolar scales where only one construct is tested, like frequency (Never, Rarely, Sometimes, Often, Always).

Problems with the Reversal Design

Both have to do with the removal of the treatment. - If the treatment is working, it may be unethical to remove it. For example, if a treatment seemed to reduce the incidence of self-injury in a child with an intellectual delay, it would be unethical to remove that treatment just to show that the incidence of self-injury increases. - The dependent variable may not return to baseline when the treatment is removed. For example, when positive attention for studying is removed, a student might continue to study at an increased rate. This could mean that the positive attention had a lasting effect on the student's studying, but it could also mean that the positive attention wasn't the cause of the increased studying. Something else might've happened at about the same time, like the student's parents rewarding them for good grades. One solution is to use a multiple-baseline design.

Simple Effects

Breaking down an interaction to figure out precisely what is happening. This allows researchers to determine the effect of each independent variable at each level of the other independent variable(s). These analyses are conducted only when there is an interaction.

BRUSO

Brief, relevant, unambiguous, specific, objective.

Introduction to the Survey

Every survey should have a written or spoken introduction that encourages respondents to participate. - The intro should briefly explain the purpose of the survey and its importance, provide info about the sponsor of the survey (university-based?), acknowledge the importance of the respondent's participation, and describe any incentives for participating. - It should establish informed consent. This includes the topics covered by the survey, the amount of time it is likely to take, the respondent's option to withdraw at any time, confidentiality issues, etc. Written consent isn't always used in survey research when the research is minimal risk and completion of the survey instrument is accepted by the IRB as evidence of consent to participate, so it's important that this part of the intro be well documented and presented clearly and in its entirety to every respondent. - Present clear instructions for completing the questionnaire, including examples of how to use any unusual response scales. - The intro is the point at which respondents are usually most interested and least fatigued, so it's good practice to start with the items most important for the purposes of the research and proceed to less important items. - Items should be grouped by topic or by type. - Demographic items are often presented last because they're boring but easy to answer, in case respondents are tired or bored by that point.

In-Person Surveys

Face-to-face interviews that have the highest response rates and provide the closest personal contact with respondents. It is the most costly approach.

Distribution

How the scores of variables are distributed across the levels of those variables.

One-Group Pretest-Posttest Design and Internal Validity

If the average posttest score is better than the average pretest score, it is logical to conclude that the treatment was responsible for the improvement. However, it can't be concluded with a high degree of certainty because there might be other explanations why the posttest scores changed. It has low internal validity. Reasons are history, maturation, testing, instrumentation, regression to the mean, and spontaneous remission.

p<=0.05

If there is a 5% chance or less of a result at least as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected.

p>0.05

If there is more than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null is retained. This DOES NOT mean the null is accepted.

Levels

In an experiment, the different values of the independent variable, selected to create and define the treatment conditions. In other research studies, the different values of a factor.

PND Example

In the study of Hall and his colleagues, all measures of a participant's study time in the first treatment condition were greater than the highest measure in the first baseline, for a PND of 100%. The greater the percentage of non-overlapping data, the stronger the treatment effect.
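
A minimal sketch (made-up data) of computing the percentage of non-overlapping data (PND) when the treatment is expected to increase the dependent variable:

    def pnd(baseline, treatment):
        """Percentage of treatment observations exceeding the highest baseline observation."""
        highest_baseline = max(baseline)
        non_overlapping = sum(1 for x in treatment if x > highest_baseline)
        return 100 * non_overlapping / len(treatment)

    # Hypothetical percentages of time spent studying per session
    baseline_phase = [20, 25, 30, 28, 22]
    treatment_phase = [45, 50, 40, 55, 48]

    print(f"PND = {pnd(baseline_phase, treatment_phase):.0f}%")  # 100% here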

Four Ways to Conduct Surveys

In-person interviews, by telephone, through the mail, and online.

Sample Sizes to Achieve Statistical Power of 0.80: Null Hypothesis Test and Relationship Strength (Medium; d = 0.50, r = 0.30)

Independent-Samples t-Test: 128 Test of Pearson's r: 84

Sample Sizes to Achieve Statistical Power of 0.80: Null Hypothesis Test and Relationship Strength (Strong; d = 0.80, r = 0.50)

Independent-Samples t-Test: 52 Test of Pearson's r: 28

Sample Sizes to Achieve Statistical Power of 0.80: Null Hypothesis Test and Relationship Strength (Weak; d = 0.20, r = 0.10)

Independent-Samples t-Test: 788 Test of Pearson's r: 782
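
Assuming a two-tailed independent-samples t-test with alpha = 0.05, a minimal sketch of reproducing the t-test entries in these tables with statsmodels' power calculator (it returns the required n per group, so the total N is twice that, rounded up):

    import math
    from statsmodels.stats.power import TTestIndPower

    power_analysis = TTestIndPower()
    for label, d in [("weak", 0.20), ("medium", 0.50), ("strong", 0.80)]:
        # solve_power returns the required sample size per group
        n_per_group = power_analysis.solve_power(effect_size=d, alpha=0.05, power=0.80)
        print(f"{label} (d = {d}): total N = {2 * math.ceil(n_per_group)}")
    # Prints 788, 128, and 52, matching the t-test values above.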

Focus of Single-Subject Research

It focuses on understanding objective behavior through experimental manipulation and control, collecting highly structured data, and analyzing those data quantitatively.

Strengths of Pretest-Posttest Design with Switching Replication

It includes built-in replication and can test the treatment's efficacy twice. It also provides more control over history effects, since it is unlikely that an outside event would coincide with both the treatment for the first group and the delayed treatment for the second group.

Replication

Level 0: Journal discourages submission of replication studies, or says nothing. Level 1: Journal encourages submission of replication studies. Level 2: Journal encourages submission of replication studies and conducts results-blind review. Level 3: Journal uses Registered Reports as a submission option for replication studies, with peer review prior to observing the study outcomes.

Citation Standards

Level 0: Journal encourages citation of data, code, and materials, or says nothing. Level 1: Journal describes citation of data in its guidelines to authors with clear rules and examples. Level 2: Article provides appropriate citation for data and materials used, consistent with the journal's author guidelines. Level 3: Article is not published until it provides appropriate citation for data and materials following the journal's author guidelines.

Analytic Methods (Code) Transparency

Level 0: Journal encourages code sharing, or says nothing. Level 1: Article states whether code is available and, if so, where to access it. Level 2: Code must be posted to a trusted repository; exceptions must be identified at article submission. Level 3: Code must be posted to a trusted repository, and reported analyses will be reproduced independently prior to publication.

Data Transparency

Level 0: Journal encourages data sharing, or says nothing. Level 1: Article states whether data are available, and, if so, where to access them. Level 2: Data must be posted to a trusted repository. Exceptions must be identified at article submission. Level 3: Data must be posted to a trusted repository, and reported analyses will be reproduced independently prior to publication.

Design and Analysis Transparency

Level 0: Journal encourages design and analysis transparency, or says nothing. Level 1: Journal articulates design transparency standards. Level 2: Journal requires adherence to design transparency standards for review and publication. Level 3: Journal requires and enforces adherence to design transparency standards for review and publication.

Research Methods Transparency

Level 0: Journal encourages materials sharing, or says nothing. Level 1: Article states whether materials are available and, if so, where to access them. Level 2: Materials must be posted to a trusted repository; exceptions must be identified at article submission. Level 3: Materials must be posted to a trusted repository, and reported analyses will be reproduced independently prior to publication.

Z-Score Examples

Mean = 100, SD = 15, score = 110: (110 - 100) / 15 = 0.67. A score of 110 is 0.67 standard deviations above the mean. Mean = 100, SD = 15, score = 85: (85 - 100) / 15 = -1. A score of 85 is one standard deviation below the mean.

Outliers and Z-Scores

Most of the time, z-scores less than -3 or greater than 3 are defined as outliers.

Unimodal

One peak (mode).

Alternative Explanations to a Change in Posttest Scores: History

Other things might have happened between the pretest and the posttest, outside the researcher's control, that changed the scores. For example, participants might have read or watched something related to the topic of the study.

Statistically Significant

Refers to a result that is statistically unlikely to have occurred by chance, p<=0.05.

Trend

Refers to gradual increases or decreases in the dependent variable across observations. If the dependent variable begins increasing or decreasing with a change in conditions, this again suggests that the treatment had an effect. It can be especially telling when a trend changes direction, like when an unwanted behavior is increasing during baseline but then decreases with the introduction of the treatment.

Converging Evidence

Similar findings reported from multiple studies using different methods.

Disproportionate Stratified Random Sampling Example

Since Asian Americans make up a relatively small percentage of the American population (~5.6%), a simple random sample of 1,000 American adults might include too few Asian Americans to draw any conclusions about them as distinct from other subgroups, so a researcher could deliberately oversample this stratum to have enough respondents to draw conclusions about it.

Proportionate Stratified Random Sampling Example

Since about 12.6% of the American population is African-American, stratified random sampling can be used to ensure that a survey of 1,000 American adults includes about 126 African-American respondents.

Error Bars

Small bars at the top of each main bar in a bar graph that represent the variability in each group or condition. They either extend one standard error (more common) or standard deviation in each direction.

Post Hoc Comparisons

Statistical comparisons made between group means after finding a significant F ratio.

Replicability Crisis Example

The Reproducibility Project involved over 270 psychologists around the world coordinating their efforts to test the replicability of 100 previously published experiments. Although 97 of the 100 original studies had statistically significant effects, only 39 of the replications did; the other 61 replications failed to reproduce the findings of their original papers. Even the effect sizes of the replications were, on average, about half of those found in the original studies. A failure to replicate a result does not by itself discredit the original study, because differences in statistical power, the populations sampled, and the procedures used, or even the effects of moderating variables, could explain the different results.

Steady State Strategy

The change from one condition to the next doesn't occur after a fixed amount of time or number of observations. Instead, it depends on the participant's behavior; the researcher waits until the participant's behavior in one condition becomes fairly consistent from observation to observation before changing conditions.

α (Alpha)

The criterion for how unlikely the sample result must be (if the null hypothesis were true) in order to reject the null hypothesis; conventionally set at 0.05.

Pretest-Posttest Nonequivalent Groups Design

The dependent variable is measured before and after the treatment is given; the nonequivalent group goes through the same process without receiving the treatment. This is used to see if participants who received the treatment improved more than the participants who didn't receive the treatment; but alternatives, like history and maturation, might explain the change and difference in the posttest scores.

Z-Score Equation

The difference between an individual's score and the mean of the distribution, divided by the standard deviation of the distribution: z = (x - M) / SD, where M (sometimes written x-bar) is the mean.
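
A minimal sketch (made-up numbers) of the z-score computation:

    def z_score(x, mean, sd):
        """Distance of a raw score from the mean, in standard deviation units."""
        return (x - mean) / sd

    print(z_score(110, 100, 15))  # ~0.67 -> 0.67 SDs above the mean
    print(z_score(85, 100, 15))   # -1.0  -> 1 SD below the mean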

Range

The difference between the highest and lowest scores in a distribution; a simple measure of variability.

Cohen's d Equation

The difference between the two means divided by the standard deviation: d = (M1 - M2) / SD, where M1 and M2 are the two group means.
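
A minimal sketch (made-up scores, using the pooled standard deviation, which is one common choice) of computing Cohen's d for two groups:

    import statistics

    def cohens_d(group1, group2):
        """Difference between the two means divided by the pooled standard deviation."""
        n1, n2 = len(group1), len(group2)
        var1, var2 = statistics.variance(group1), statistics.variance(group2)
        pooled_sd = (((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)) ** 0.5
        return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

    treatment = [23, 25, 28, 22, 26, 24]
    control = [20, 21, 19, 22, 20, 18]
    print(round(cohens_d(treatment, control), 2))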

Interaction

The effect of one independent variable depends on the level of another. The primary research question is typically about an interaction.

Statistical Power

The probability of rejecting the null hypothesis given the sample size and expected relationship strength. It is the complement of the probability of committing a Type II error. Researchers typically aim for a power of 0.80, meaning an 80% chance of rejecting the null hypothesis for the expected relationship strength.

Confidence Intervals

A range of values computed in such a way that some percentage of the time, usually 95%, the range will contain the population parameter.
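
A minimal sketch (made-up scores) of a 95% confidence interval for a mean, based on the t distribution:

    from scipy import stats

    scores = [23, 25, 28, 22, 26, 24, 27, 21]

    mean = sum(scores) / len(scores)
    sem = stats.sem(scores)  # standard error of the mean

    # 95% of intervals constructed this way will contain the population mean
    low, high = stats.t.interval(0.95, df=len(scores) - 1, loc=mean, scale=sem)
    print(f"95% CI: [{low:.2f}, {high:.2f}]")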

Why Do Type II Errors Happen?

The research design lacks adequate statistical power to detect the relationship. For example, the sample size is too small.

Telephone Surveys

They have lower response rates than in-person interviews but still provide some personal contact with respondents, and they are less costly. Traditionally, telephone directories provided fairly comprehensive sampling frames, but with the rise of cell phones this is dwindling.

Crossover Interaction

The effect of one independent variable reverses at different levels of the other, so that when graphed, the lines literally cross over each other; this is the strongest form of interaction between independent variables.

Sampling Frame

A list of all the members of the population from which to select the respondents, constructed once the population has been specified. Sampling frames can come from a variety of sources, like telephone directories, lists of registered voters, hospital records, and even maps (for selecting cities, streets, or households).

What is the Goal in Conclusions?

To draw conclusions about the population and NOT the sample.

Social Validity

Treatments that have substantial effects on important behaviors and that can be implemented reliably in the real-world contexts in which they occur.

Alternative Explanations to a Change in Posttest Scores: Instrumentation

When the basic characteristics of the measuring instrument change over time. Participants might've been measured differently during the pretest and posttest.

Analysis of Variance (ANOVA)

Used for designs with three or more sample means.

One-Way ANOVA

Used to compare the means of more than two samples (M1, M2, ..., MG) in a between-subjects design. The null hypothesis is that all the population means are equal (mu1 = mu2 = ... = muG). The alternative hypothesis is that not all the means in the population are equal.
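
A minimal sketch (made-up data for three groups) of a one-way ANOVA with scipy:

    from scipy import stats

    # Hypothetical scores from three independent groups
    group_a = [4, 5, 6, 5, 7]
    group_b = [6, 7, 8, 7, 9]
    group_c = [9, 10, 8, 11, 10]

    f_ratio, p_value = stats.f_oneway(group_a, group_b, group_c)
    print(f"F = {f_ratio:.2f}, p = {p_value:.4f}")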

Independent-Samples t-Test

Used to compare the means of two separate samples (M1 and M2). The two samples might have been tested under different conditions in a between-subjects experiment, or they could be pre-existing groups in a cross-sectional design. The null hypothesis is that the means of the two populations are the same (mu1 = mu2). The alternative hypothesis is that they aren't the same (mu1 =/= mu2).
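
A minimal sketch (made-up data) of an independent-samples t-test with scipy, including the p <= 0.05 decision rule described above:

    from scipy import stats

    # Hypothetical scores for two independent groups
    treatment = [23, 25, 28, 22, 26, 24]
    control = [20, 21, 19, 22, 20, 18]

    t_statistic, p_value = stats.ttest_ind(treatment, control)
    decision = "reject the null" if p_value <= 0.05 else "retain the null"
    print(f"t = {t_statistic:.2f}, p = {p_value:.4f} -> {decision}")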

One-Sample t-Test Better Definition

Used to determine if a single sample mean is different from a known population mean.

Planned Analysis

Used to test a relationship that you expected based on your hypothesis. If you expected a difference between group or condition means, you can compute the relevant means and standard deviations, make a bar graph, and compute Cohen's d. If you expected a correlation between quantitative variables, you can make a line graph or scatterplot (checking for nonlinearity and restriction of range) and compute Pearson's r.

Line Graphs

Used when the independent variable is measured in a more continuous manner (e.g., time) or to present correlations between quantitative variables when the independent variable has, or is organized into, a relatively small number of distinct levels. Each point in a line graph represents the mean score on the dependent variable for participants at one level of the independent variable.

Statistical Control

Using statistical techniques to control for the influence of certain (third) variables. Since this approach is correlational, these variables are not controlled by random assignment or by holding them constant; instead, the researcher measures them and includes them in the analysis.
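
A minimal sketch (hypothetical variables and data) of statistical control using multiple regression in statsmodels, where income is measured and included in the analysis rather than held constant:

    import pandas as pd
    from statsmodels.formula.api import ols

    # Hypothetical data: does exercise relate to symptoms once income is controlled for?
    df = pd.DataFrame({
        "symptoms": [8, 6, 7, 4, 5, 3, 6, 2, 4, 1],
        "exercise": [1, 2, 2, 4, 3, 5, 2, 6, 4, 7],
        "income":   [30, 35, 32, 50, 45, 60, 40, 70, 55, 80],
    })

    # The coefficient on exercise estimates its relationship with symptoms
    # while statistically holding income constant.
    model = ols("symptoms ~ exercise + income", data=df).fit()
    print(model.params)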

Cohen's d Strength (Small/Weak)

Values near 0.20. (Pearson's r equivalent to 0.10).

Cohen's d Strength (Medium)

Values near 0.50. (Pearson's r equivalent to 0.30).

Cohen's d Strength (Large/Strong)

Values near 0.80. (Pearson's r equivalent to 0.50).

Restriction of Range

When one or both of the variables have a limited range in the sample relative to the population.

Nonequivalent Groups Design

When participants aren't randomly assigned to conditions, the resulting groups are likely to be dissimilar and nonequivalent. This is a between-subjects design where the participants haven't been randomly assigned to conditions.

M and SD vs Mean and Standard Deviation

When presented in the narrative mean and standard deviation are written out, but when presented parenthetically, M and SD are used instead. So, the treatment group had a mean of 23.40 (SD = 9.33), while 20.87 was the mean of the control group, which had a standard deviation of 8.45.

File Drawer Problem

When researchers obtain nonsignificant results, they tend not to submit them for publication, or if they do submit them, journal editors and reviewers tend not to accept them; so the results end up in a file drawer (or in a folder on a computer).

