ANOVA and Experimental Design

Consider an experiment conducted to study the effectiveness of different hand washing techniques (factor) on the prevalence of bacteria (response). The experiment tested four different methods—washing with water only, washing with regular soap, washing with antibacterial soap (ABS), and spraying hands with antibacterial spray (AS) (containing 65% ethanol as an active ingredient). Ten different hand washings were included within each level (i.e., ten people washed with water only, with regular soap, etc.). In the regression context, which of the following correctly gives the dimensions of the design matrix, 𝑋? 40 rows and 3 columns 10 rows and 3 columns 10 rows and 4 columns 40 rows and 4 columns

40 rows and 4 columns
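
A minimal sketch in Python (standard library only; the course itself uses R) of where the 40 rows and 4 columns come from. It assumes treatment coding with water-only as the reference level; the level labels and the helper name `design_matrix` are illustrative, not part of the original question.

```python
# Build a treatment-coded design matrix for a one-way layout.
# Assumption: the first level listed is the reference (baseline) level.

def design_matrix(groups, levels):
    """Return rows of [intercept, one indicator per non-reference level]."""
    non_reference = levels[1:]
    return [[1] + [1 if g == lvl else 0 for lvl in non_reference] for g in groups]

levels = ["water", "soap", "ABS", "AS"]              # four hand-washing methods
groups = [lvl for lvl in levels for _ in range(10)]  # ten washings per method

X = design_matrix(groups, levels)
print(len(X), len(X[0]))  # 40 4
```

One row per observation gives 40 rows; an intercept plus 4 - 1 = 3 indicators gives 4 columns.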

What is a factor? A factor is any variable used to predict or explain the response variable. A factor is a discrete, categorical variable used to predict or explain the response variable. A factor is a continuous variable used to predict or explain the response variable. A factor is a fixed constant in an ANOVA model that is estimated from experimental data.

A factor is a discrete, categorical variable used to predict or explain the response variable.

Blocking is a technique used to include other experimental units in an experiment, in order to reduce undesirable variability in the response True False

Answer: False Explanation: Blocking is indeed used to reduce undesirable variability in the response by grouping similar experimental units together based on certain characteristics. However, the statement that blocking is used "to include other experimental units" is misleading. Blocking does not include other experimental units but rather organizes existing experimental units into blocks to control for variability.

If the nuisance variable in question is known and controllable, then we should measure it and use it as a treatment factor. True False

Answer: False Explanation: If a nuisance variable is known and controllable, it is typically handled by either blocking or covariate adjustment, not by making it a treatment factor. A treatment factor is usually a primary factor of interest that you are explicitly testing. The nuisance variable, while controlled or accounted for, is not the primary focus of the experiment and should not be confused with treatment factors.

In a randomized complete block design, units are randomly assigned to blocks True False

Answer: False Explanation: In a randomized complete block design, experimental units are grouped into blocks based on shared characteristics (e.g., similar conditions or time periods). Units are not randomly assigned to blocks; rather, treatments are randomly assigned to units within each block. The purpose of blocking is to account for variability between blocks, so that treatments are compared among similar units.
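
A rough Python sketch of randomizing treatments within blocks. The block names, unit labels, and the helper `rcbd_assign` are hypothetical, and the fixed seed is only there to make the run reproducible.

```python
import random

def rcbd_assign(blocks, treatments, seed=0):
    """Randomly assign each treatment once within every block.

    blocks: dict mapping block name -> list of unit labels
            (each block must hold exactly len(treatments) units).
    """
    rng = random.Random(seed)
    plan = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("a complete block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)                 # randomization happens inside the block
        plan[block] = dict(zip(units, order))
    return plan

# Units are grouped into blocks by a known characteristic, not at random.
blocks = {"morning": ["u1", "u2", "u3"], "afternoon": ["u4", "u5", "u6"]}
treatments = ["A", "B", "C"]

plan = rcbd_assign(blocks, treatments)
for block, assignment in plan.items():
    print(block, assignment)
```

Every block receives every treatment exactly once; only the order within a block is random.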

In a randomized experiment, controls are assigned at random to different levels of a treatment factor. True False

Answer: False Explanation: In a randomized experiment, treatments (including the control) are assigned at random to experimental units. The statement as written suggests that controls are randomly assigned to different treatment levels, which is not correct. Instead, treatments (including control) are randomly assigned to different units.

One-way ANOVA can be used to analyze the results of a randomized complete block design. True False

Answer: False Explanation: One-way ANOVA is typically used to analyze experiments where there is a single factor of interest and no blocking factor. In a randomized complete block design, there are two factors: the treatment factor and the blocking factor. To properly analyze the results of a randomized complete block design, a two-way ANOVA is used, which accounts for both the treatment and blocking factors.

Any factorial design always includes experimental units in all possible factor level combinations. True False

Answer: False Explanation: While many factorial designs do include all possible combinations of factor levels, this is not universally true. For example, in fractional factorial designs, not all combinations of factor levels are included to reduce the number of experimental runs. These designs are often used when it's impractical to test all combinations due to resource constraints.

Consider the following modeling scenario: For a single year, researchers measure the number of motor vehicle accidents that result in death in each of the 50 states in the United States. They also record each state's speed limit laws over the same time period, and each state's population. They are interested in the following research question: Are the number of motor vehicle deaths in a given state related to a state's speed limit laws? Based on the information given, a reasonable first attempt at answering this question would include: A Poisson regression model with an offset term. A binomial regression model without an offset term. A binomial regression model with an offset term. A Poisson regression model without an offset term.

Correct Answer: A Poisson regression model with an offset term. Reasoning: The response variable (number of motor vehicle deaths) is count data, which is typically modeled with Poisson regression. An offset term is used when the number of events is measured relative to some unit of exposure, such as population. States differ greatly in population, so raw death counts are not directly comparable; including log(population) as an offset models the death rate per person rather than the raw count. Population is recorded for each state, so the offset is available.

Suppose that a group of potential pet food customers were surveyed on the type of pet they owned (factor A). Individuals had either cats or dogs. Then, individuals were asked about their purchasing preferences for pet food (factor B). Specifically, cat owners were only asked about cat foods produced by Brands X, Y, and Z. Dog owners were only asked about dog foods produced by brands X, Y, and Z. Factors A and B are... Nested. Crossed

Correct Answer: Nested Explanation: Factors A and B are nested because each level of Factor A (type of pet: cats or dogs) has a different subset of Factor B (brands of pet food). Specifically, cat owners were asked only about cat food brands, and dog owners were asked only about dog food brands, which implies that the type of food brand is nested within the type of pet.

Suppose that a group of potential pet food customers were surveyed on the type of pet they owned (factor A). Individuals had either cats or dogs. Then, individuals were asked about their purchasing preferences for pet food (factor B). Specifically, cat owners were only asked about cat foods produced by Brands X, Y, and Z. Dog owners were only asked about dog foods produced by those same brands. This study can be analyzed using a two-way ANOVA model because... There are two factors. Factor A has two levels. Factor B has two levels. There were two participants in the study.

Correct Answer: There are two factors. Explanation: A two-way ANOVA is appropriate here because the study involves two factors: the type of pet (Factor A) and the brand of pet food (Factor B). This analysis method is used to examine the influence of these two factors on a response variable, potentially identifying interactions between the factors as well.

Researchers designed an experiment to study factors affecting the particle size in the production of polyvinyl chloride (PVC) plastic. In the experiment, three operators (Factor A, levels X, Y, and Z) used eight different devices called resin railcars (Factor B, levels 1-8) to produce PVC. There were 24 total particle size measurements in the study. Suppose that, for resin railcar 1, operator X produces a larger particle size than operator Y, but a lower particle size for resin railcar 2. This is an example of... An interaction between Factor A and particle size. an interaction between Factor A and Factor B. a balanced design. an unbalanced design. An interaction between Factor B and particle size.

Correct Answer: an interaction between Factor A and Factor B. Explanation: The different outcomes of particle sizes based on the combination of specific operators and specific railcars indicate an interaction between these factors. An interaction occurs when the effect of one factor (e.g., operator) on the outcome (particle size) depends on the level of another factor (e.g., railcar).
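
To make the idea concrete, here is a small Python sketch with made-up cell means (the numbers are hypothetical, not from the PVC study). It shows the operator difference changing sign between railcars, which is what an interaction looks like in the cell means.

```python
# Hypothetical mean particle sizes for two operators on two railcars.
cell_means = {
    ("X", 1): 36.0, ("Y", 1): 34.0,   # railcar 1: X larger than Y
    ("X", 2): 33.0, ("Y", 2): 35.0,   # railcar 2: X smaller than Y
}

def operator_difference(means, railcar):
    """Difference in mean particle size, operator X minus operator Y."""
    return means[("X", railcar)] - means[("Y", railcar)]

d1 = operator_difference(cell_means, 1)
d2 = operator_difference(cell_means, 2)
interaction_contrast = d1 - d2   # zero would mean no interaction in these cells

print(d1, d2, interaction_contrast)  # 2.0 -2.0 4.0
```

If the operator effect were the same for every railcar, the two differences would match and the contrast would be zero.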

A difference across sample means (for example, through visual assessment) implies a difference across population means. True False

False

A one-way ANOVA model with a J-level factor is a multiple linear regression model with J predictor/explanatory variables, and a continuous response. True False

False

A two-way ANOVA model in regression form will always have 2−1=1 indicator variables. True / False

False

Consider a one-way ANOVA regression model with a factor of five levels: a control group and four treatments. The mean of the response variable in the third treatment group is 𝛽3 True False

False

Consider a two-way ANOVA model with two factors, τ, of three levels, and α of two levels, with the following regression form: Yi=β0+β1τi,2+β2τi,3+β3αi,2+β4τi,2αi,2+β5τi,3αi,2+εi, where the typical definitions of the indicator variables hold (e.g., τi,2=1 when the ith unit is in the second level of τ and τi,2=0 otherwise). Whether there is an interaction in the data can be successfully tested with a t-test. True False

False

Consider performing 𝑚>1 post hoc comparisons. Using Bonferroni's method, the type I error rate for each individual test will always be larger than the familywise type I error rate. True False

False
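
A short Python sketch of the Bonferroni adjustment, assuming the usual rule of dividing the familywise rate by the number of tests m. The function name is illustrative.

```python
def bonferroni_per_test_alpha(familywise_alpha, m):
    """Per-test significance level under Bonferroni's method."""
    if m < 1:
        raise ValueError("m must be a positive number of tests")
    return familywise_alpha / m

alpha_fw = 0.05
for m in (2, 3, 6):
    # For m > 1 the per-test level is always smaller than the familywise level.
    print(m, bonferroni_per_test_alpha(alpha_fw, m))
```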

In the context of one-way ANOVA, the alternative hypothesis for the (full) F-test is: 𝐻1​: there are no differences with respect to mean of a continuous variable across groups of experimental units True False

False

In the context of one-way ANOVA, the between groups variability is equal to the total variability plus the within group variability. True False

False
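
The correct relationship is total = between + within. A minimal Python check on a tiny made-up data set (the numbers are hypothetical) is sketched below.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SS_total, SS_between, SS_within) for a one-way layout."""
    all_values = [y for g in groups for y in g]
    grand = mean(all_values)
    ss_total = sum((y - grand) ** 2 for y in all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return ss_total, ss_between, ss_within

# Three hypothetical groups of three observations each.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
ss_total, ss_between, ss_within = sums_of_squares(groups)

print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
```

Here 54 + 6 = 60, so the between-groups variability is the total minus the within-group variability, not the total plus it.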

Let 𝛾=(𝛾1,...,𝛾4)be a set of parameters. Then 𝛾1−𝛾2+2𝛾3−4𝛾4​ is a contrast True False

False
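
A linear combination of parameters is a contrast only when its coefficients sum to zero. A quick Python check (the helper name `is_contrast` is illustrative):

```python
def is_contrast(coefficients, tol=1e-12):
    """A linear combination is a contrast when its coefficients sum to zero."""
    return abs(sum(coefficients)) <= tol

# Coefficients of gamma1 - gamma2 + 2*gamma3 - 4*gamma4: they sum to -2.
print(is_contrast([1, -1, 2, -4]))  # False

# A simple pairwise comparison, gamma1 - gamma2: coefficients sum to 0.
print(is_contrast([1, -1, 0, 0]))   # True
```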

Pairwise comparisons can still be conducted when the one-way ANOVA model assumptions are violated. True False

False

Planned hypothesis tests can be conducted if the p-value for the full F-test is greater than the pre-specified significance level for the full-F-test. True False

False

Suppose that we have one factor, τ, with 2 levels and another factor, α, at 2 levels. The regression form of this model is: Yi=β0+β1τi,1+β2τi,2+β3αi,1+β4αi,2+εi where εi∼iidN(0,σ2). True / False

False
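
One way to see the problem is that an indicator for every level of each factor, together with an intercept, makes the columns linearly dependent. The Python sketch below assumes one unit per cell of a 2x2 layout purely for illustration.

```python
# Each (tau level, alpha level) cell of a 2x2 layout, one unit per cell.
cells = [(1, 1), (1, 2), (2, 1), (2, 2)]

def overparameterized_row(tau, alpha):
    """Row with an intercept AND an indicator for every level of each factor."""
    return [1, int(tau == 1), int(tau == 2), int(alpha == 1), int(alpha == 2)]

X = [overparameterized_row(t, a) for t, a in cells]

# In every row the two tau indicators add up to the intercept column,
# and so do the two alpha indicators: the columns are linearly dependent.
tau_dependency = all(row[1] + row[2] == row[0] for row in X)
alpha_dependency = all(row[3] + row[4] == row[0] for row in X)

print(tau_dependency, alpha_dependency)  # True True
```

Dropping one indicator per factor (a reference level) removes the dependency, leaving an intercept plus (2 - 1) + (2 - 1) = 2 indicators.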

The Bonferroni method always compares the mean of each group to the mean of every other group (i.e., makes all pairwise comparisons). True False

False

The effects model for one-way ANOVA is given as: Y_(i,j)=μ+τ_j+ε_(i,j), where ε_(i,j)∼iidN(0,σ^2). In this model, μ can be interpreted as the mean response limited to the units in the jth level of the factor, τ_j. True False

False

The full F-test can tell researchers which groups differ with respect to the mean of a continuous response. True False

False

The hypotheses tested in post hoc comparisons are specified before looking at the data. True False

False

The larger the sample size in a given one-way ANOVA analysis, the smaller the power of associated tests (e.g., pairwise comparisons). True False

False

The means model for one-way ANOVA is given as: Y_(i,j)=μ_j+ε_(i,j), where ε_(i,j)∼iidN(0,σ^2). In this model, μ_j can be interpreted as the mean of the response over all units in the population. True False

False

The means model for one-way ANOVA is given as: Y_(i,j)=μ_j+ε_(i,j), where ε_(i,j)∼iidN(0,σ^2). In this model, μ_j can be interpreted as the mean of the response over all units in the sample. True False

False

The test statistic for the Tukey Method can be positive or negative True False

False

A very small p-value suggests that the differences with respect to the mean of a continuous variable across groups of experimental units are very small. True False

False. P-values are affected by both effect size (the size of the differences across groups) and sample size, so a small p-value does not by itself indicate small differences.

If Ȳ.j ≠ Ȳ.k for some j ≠ k, then there are differences, with respect to the population mean of a continuous variable, across groups of experimental units. True False

False. These are sample quantities that contain sampling error, which could be the cause of the difference rather than real differences in the population. Further investigation is needed to see whether the differences reflect real differences in the population rather than sampling variability.

When conducting planned comparisons in a two-way ANOVA, it is important to: Specify all tests after observing the data. Make sure that the full F-test is not statistically significant. Make sure that the full F-test is statistically significant. Specify all tests before observing the data. Adjust for multiple comparisons.

Make sure that the full F-test is statistically significant. Specify all tests before observing the data.

Which of the following is not a necessary condition for successfully conducting planned comparisons? Planned comparisons must be conducted regardless of the outcome of the full F-test. Planned comparisons must have a significance level greater than or equal to the familywise error rate. The number of planned hypothesis tests is no more than the corresponding degrees of freedom (number of groups minus one). The contrasts that are defined in the planned hypothesis tests are orthogonal.

Planned comparisons must be conducted regardless of the outcome of the full F-test. Planned comparisons must have a significance level greater than or equal to the familywise error rate.

An ANCOVA model includes additional variables over and above ANOVA. These additional variables are sometimes referred to as: Predictors Explanatory variables Features Covariates

Predictors Explanatory variables Features Covariates

One factor at a time (OFAT) designs... Are better when sample sizes are smaller. cannot detect or estimate interactions. produce estimates of treatment effects that are less precise than those produced by an appropriately designed experiment. cannot account for more than one experimental factor. often require more resources, such as time, energy, and material.

Produce estimates of treatment effects that are less precise than those produced by an appropriately designed experiment. Cannot detect or estimate interactions. Often require more resources, such as time, energy, and material.

What role does randomization play in statistical inference? Randomization helps mitigate the risk of applying the treatment to experimental units in some systematic way that would affect the causal conclusions of an experiment. Randomization is detrimental in making statistical inferences. Randomization helps make an experiment harder to replicate. Randomization helps researchers assign experimental units to levels of the response.

Randomization helps mitigate the risk of applying the treatment to experimental units in some systematic way that would affect the causal conclusions of an experiment.

Reminder to review graph questions in Week 3: Quiz: Interaction Terms in the Two-way ANOVA Model: Definitions and Visualizations

Which of the following are true statements about the difference between an experimental and observational study? There is no difference between an experimental study and an observational study. An experimental study is conducted in a lab. Researchers have control over the treatment in an experimental study. Researchers have control over the treatment in an observational study

Researchers have control over the treatment in an experimental study.

The power of tests associated with a one-way ANOVA analysis is affected by: Sample size The units of the response The significance level The true size of the mean differences across groups The within group variability

Sample size The significance level The true size of the mean differences across groups The within group variability

A completely randomized design is inappropriate if we have reason to believe that: Experimental units are homogeneous. The one-way ANOVA assumptions are violated in the analysis of the experiment. Experimental units are not homogeneous. There are several treatment levels.

The one-way ANOVA assumptions are violated in the analysis of the experiment. Experimental units are not homogeneous.

In the context of two-way ANOVA, before testing for marginal effects, researchers should be reasonably sure that there are significant interactions between factors. True False

This statement is False. Explanation: If significant interactions are present, the interpretation of marginal effects becomes problematic because the effect of one factor may not be consistent across the levels of the other factor. Therefore, it's important to ensure that interactions are not significant before interpreting marginal effects.

Testing the "marginal effect" of factor τ in a two-way ANOVA model with factors τ and α means that we test the effect of α while holding τ at its average value. True False

This statement is False. Explanation: Testing the marginal effect of factor τ means testing the effect of τ while averaging over the levels of α, not testing the effect of α while holding τ at its average value. The statement as written reverses the roles of the factors.

Researchers designed an experiment to study factors impacting the foam index for espresso. In the experiment, three espresso brewing machines (Factor A, machines X, Y, and Z) were used. Researchers also tested whether foam index was impacted by filtered or unfiltered water (Factor B, levels 1-2). Researchers can conduct up to 3−1=2 planned tests without correcting for the familywise type I error. True False

This statement is False. Explanation: The number of planned comparisons allowed without adjusting for the familywise error rate isn't simply k−1 (where k is the number of levels). Planned comparisons should always consider the risk of increasing the Type I error rate, and adjustments are typically recommended when conducting multiple comparisons, regardless of the number.

In the two-way ANOVA context, when using the F-test to test whether an interaction exists for a particular dataset, the model with the interaction term is the reduced model. True False

This statement is False. Explanation: When testing for interaction effects using an F-test, the reduced model (which serves as the baseline or null model) does not include the interaction terms, while the full model does. The F-test compares these two models to determine if adding the interaction terms significantly improves the fit of the model to the data. If the interaction terms do improve the fit significantly, this suggests the presence of an interaction effect.

In the two-way ANOVA model with interactions, the interaction term describes how the relationship between a factor and the response differs as a function of the level of the other factor. True False

This statement is True. Explanation: In a two-way ANOVA with interaction terms, the interaction term captures how the effect of one factor on the response variable changes depending on the level of the other factor. This is the essence of an interaction in ANOVA.

Researchers designed an experiment to study factors affecting the particle size in the production of polyvinyl chloride (PVC) plastic. In the experiment, three operators (Factor A, levels X, Y, and Z) used eight different devices called resin railcars (Factor B, levels 1-8) to produce PVC. There were enough observations to reasonably rule out a statistically significant interaction. Let α=0.05. For a test with p-value < α, is the result statistically significant?

Yes

In the space below, type R code to run a one-way ANOVA, using the aov() function. Assume that the response is called response, there is one predictor called predictor, and the data frame is called data.

aov(response ~ predictor, data = data)

What does the F-test tell us in one-way ANOVA?

The full F-test helps answer the question: are there differences in the mean of a continuous response variable across groups, or across levels of a factor? If we fail to reject the null hypothesis of the full F-test, we usually do not need to go further, because we have found no evidence that any of the means differ. If we do reject it, we have evidence that at least one mean differs, but the test does not tell us which groups differ.

Consider a two-way ANOVA model with two factors, τ, of three levels, and α of two levels, with the following regression form: Yi=β0+β1τi,2+β2τi,3+β3αi,2+β4τi,2αi,2+β5τi,3αi,2+εi, where the typical definitions of the indicator variables hold (e.g., τi,2=1 when the ith unit is in the second level of τ and τi,2=0 otherwise). Suppose that there is, in fact, an interaction between factors. What then is the mean of the response for units in the first level of the τ factor and the first level of the α factor?

μ1,1=β0

Suppose that the following regression model corresponding to a two-way ANOVA is correct (factors τ with levels 1-2 and α with levels 1-2): Yi=β0+β1τi,1+β2αi,1+εi, where: τi,1=1 if the ith unit is in the first level of τ and τi,1=0 if the ith unit is in the second level of τ. αi,1=1 if the ith unit is in the first level of α and αi,1=0 if the ith unit is in the second level of α. εi∼iidN(0,σ2). What is the mean of the response for all units in the second level of τ and the second level of α?

μ2,2=β0

Suppose that the following regression model corresponding to a two-way ANOVA is correct (factors τ with levels 1-4, and α with levels 1-2): Yi=β0+β1τi,1+β2τi,3+β3τi,4+β4αi,1+εi, where: τi,1=1 if the ith unit is in the first level of τ and τi,1=0 if the ith unit is in any other level of τ. τi,3=1 if the ith unit is in the third level of τ and τi,3=0 if the ith unit is in any other level of τ. τi,4=1 if the ith unit is in the fourth level of τ and τi,4=0 if the ith unit is in any other level of τ. αi,1=1 if the ith unit is in the first level of α and αi,1=0 if the ith unit is in the second level of α. εi∼iidN(0,σ2). What is the mean of the response for all units in the second level of τ and the second level of α?

μ2,2=β0

Consider a two-way ANOVA model with two factors, τ, of three levels, and α of two levels, with the following regression form: Yi=β0+β1τi,2+β2τi,3+β3αi,2+β4τi,2αi,2+β5τi,3αi,2+εi, where the typical definitions of the indicator variables hold (e.g., τi,2=1 when the ith unit is in the second level of τ and τi,2=0 otherwise). Suppose that there is, in fact, no interaction between factors at any level. What then is the mean of the response for units in the second level of the τ factor and the second level of the α factor?

μ2,2=β0+β1+β3
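
The cell means for this regression form can be read off by plugging in the indicator values. A minimal Python sketch follows; the coefficient values are hypothetical, with β4 = β5 = 0 to mimic the no-interaction case, and the helper name `cell_mean` is illustrative.

```python
def cell_mean(beta, tau_level, alpha_level):
    """Mean response for Y = b0 + b1*t2 + b2*t3 + b3*a2 + b4*t2*a2 + b5*t3*a2.

    Level 1 of each factor is the reference level (all its indicators are 0).
    """
    t2 = int(tau_level == 2)
    t3 = int(tau_level == 3)
    a2 = int(alpha_level == 2)
    x = [1, t2, t3, a2, t2 * a2, t3 * a2]
    return sum(b * xi for b, xi in zip(beta, x))

# Hypothetical coefficients; the last two (interaction terms) are zero.
beta = [10.0, 2.0, -1.0, 3.0, 0.0, 0.0]

print(cell_mean(beta, 1, 1))  # 10.0, i.e. beta0
print(cell_mean(beta, 2, 2))  # 15.0, i.e. beta0 + beta1 + beta3
```

With a nonzero β4, the (2, 2) cell mean would instead be β0 + β1 + β3 + β4.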

In a 2014 paper titled "Involving Children in Meal Preparation," published in the journal Appetite, researchers hoped to determine the effect of child participation in meal preparation (factor with two levels) on caloric intake (response). In one group, children participated in the preparation of a meal. In a second group, children did not participate. Which of the following is a correct interpretation of μ in the one-way ANOVA effects model? μ is the population mean of caloric intake, across both meal preparation groups. μ is the population mean of caloric intake in the group where children did not help prepare meals. μ is the sample mean of caloric intake, across both meal preparation groups. μ is the population mean of caloric intake in the group where children helped prepare meals.

𝜇 is the population mean of caloric intake, across both meal preparation groups.

Interaction plots... are helpful visualizations for gaining insight into the nature of interactions in a two-way ANOVA. provide a formal statistical analysis of the interactions between factors. show precisely how the sample means of the response change as a function of the factors. Show precisely how the population means of the response change as a function of the factors.

"are helpful visualizations for gaining insight into the nature of interactions in a two-way ANOVA." True: Interaction plots are indeed useful for visualizing how the interaction between factors affects the response variable. "provide a formal statistical analysis of the interactions between factors." False: Interaction plots do not provide a formal statistical analysis; they only offer a visual insight. Statistical tests like ANOVA are needed for formal analysis. "show precisely how the sample means of the response change as a function of the factors." True: Interaction plots depict how the sample means change across different levels of the factors. "Show precisely how the population means of the response change as a function of the factors." False: Interaction plots show sample means, not population means. The sample means are used as estimates of the population means, but the plot itself does not directly show the population means.

Researchers are conducting an experiment with one treatment of five levels. However, they also believe that an additional continuous variable will impact the response. Given this information, which method is most appropriate? Analysis of variance (ANOVA) Analysis of covariance (ANCOVA) Analysis of variance (ANOVA) with a separate regression model to control for the additional continuous variable. A regression model with the treatment as a discrete predictor/explanatory variable.

Analysis of covariance (ANCOVA)

Consider an experiment exploring factors related to the amount of pressure needed to have mountain climbing ropes fail. The experiment studied three factors, each with two levels: Abrasion: whether the rope had an abrasion on it or not. Dirt: whether the rope was dirty or clean. Soaked: whether the rope was soaked in water or not. Two replicates were recorded for each combination of factor levels. Suppose that each of these factors is negatively associated with the response; for example, an abrasion decreases the amount of pressure needed to have the rope fail. However, rope failure is actually largely due to a fourth variable, namely, whether the rope has been "fatigued" (that is, whether a climber has fallen on it before). Which condition for causal reasoning from experimental data is not met? Temporal relationship Nonspuriousness Empirical relationship

Answer: Nonspuriousness Explanation: Nonspuriousness means that the observed relationship is not due to some other confounding factor. In this case, the presence of fatigue as a potential confounding factor suggests that nonspuriousness might not be met.

In a completely randomized design, randomization helps with: Empirical association Correct temporal relationship Nonspuriousness

Answer: Nonspuriousness Explanation: Randomization in a completely randomized design helps ensure that the treatment groups are comparable, reducing the likelihood of confounding factors influencing the results, which helps establish nonspuriousness (i.e., the observed effect is due to the treatment and not some other variable).

Gina is a soccer player looking to improve her soccer skills. One measure of a player's skills is their "plus/minus" score. The plus/minus score subtracts a point for every goal surrendered while the player is on the field, and adds a point for every goal scored while the player is on the field. One week, Gina performs an aerobic conditioning workout, and notices that her plus/minus score is higher that week. From this, she reasons: if I did not perform this aerobic conditioning workout, I would not have increased my plus/minus score. In reasoning in this way, Gina is most likely employing which theory of causality? The structural model theory of causality The counterfactual theory of causality The probabilistic theory of causality

Answer: The counterfactual theory of causality Explanation: Gina is reasoning by considering what would have happened in the counterfactual scenario where she did not perform the aerobic conditioning. This reasoning aligns with the counterfactual theory of causality, which evaluates causality by considering alternate scenarios (what would happen if the factor were absent).

Beth and Jessie's Ice Cream Company is attempting to perfect their recipe for vegan chocolate ice cream. One change in the recipe is related to the milk substitute. In some recipes they use oat milk, and in others, almond milk. Through extensive focus group analysis, they notice an interesting trend: whenever they use the oat milk recipe, the mean consumer rating is higher. They conclude that, for any given focus group where oat milk was used, if almond had been used instead, the mean consumer rating would have been lower, and the cause would be the milk substitution type. In reasoning in this way, Beth and Jessie are most likely employing which theory of causality? The counterfactual theory of causality The probabilistic theory of causality The structural model theory of causality

Answer: The counterfactual theory of causality Explanation: The counterfactual theory of causality involves reasoning about what would have happened if a different action or condition had been in place. In this case, Beth and Jessie are considering how the mean consumer rating would have been different (lower) if almond milk had been used instead of oat milk, which is a classic example of counterfactual reasoning.

In a study, 100 different senior-level high school math classes were randomly chosen from all such classes in the state of Colorado. Each class had n=25 students. Among the 100 classes, 50 were randomly chosen, and teachers were asked to teach a lesson using a new active learning teaching method; in the remaining classes, teachers used the standard "lecture" teaching method. The response in this experiment was an exam, which was administered to each student and measured the extent to which each student learned the content of the lesson. Identify the experimental units in this study. The experimental units are the 100 teachers teaching the senior-level high school classes. None of the above. The experimental units are the 100 senior-level high school classes. The experimental units are the 2500 students in the senior-level high school classes.

Answer: The experimental units are the 100 senior-level high school classes. Explanation: The experimental unit is the entity to which the treatment is applied. In this case, the treatment (teaching method) is applied to the entire class, so the class is the experimental unit.

Beth and Jessie's Ice Cream Company is attempting to perfect their recipe for vegan chocolate ice cream. One change in the recipe is related to the milk substitute. In some recipes they use oat milk, and in others, almond milk. Through extensive focus group analysis, they notice an interesting trend: the use of oat milk increases the odds of a high mean consumer rating at a given focus group by a factor of two. They conclude that the oat milk is the cause of the odds increase. In reasoning in this way, Beth and Jessie are most likely employing which theory of causality? The probabilistic theory of causality The counterfactual theory of causality The structural model theory of causality

Answer: The probabilistic theory of causality Explanation: Beth and Jessie are focusing on the increase in odds (probability) of a high consumer rating due to oat milk, which fits within the probabilistic theory of causality.

Gina is a soccer player looking to improve her soccer skills. One measure of a player's skills is their "plus/minus" score. The plus/minus score subtracts a point for every goal surrendered while the player is on the field, and adds a point for every goal scored while the player is on the field. Suppose, through detailed analysis, Gina notices that, if she performs an aerobic conditioning workout in a given week, she is more likely to have a higher plus/minus score that week. From this, she concludes that the aerobic conditioning workout causes the increase in her plus/minus score. In reasoning in this way, Gina is most likely employing which theory of causality? The structural model theory of causality The counterfactual theory of causality The probabilistic theory of causality

Answer: The probabilistic theory of causality Explanation: Gina is reasoning that the aerobic conditioning increases the likelihood (probability) of a higher plus/minus score. This aligns with the probabilistic theory of causality, which focuses on the increased likelihood of an outcome due to a certain factor.

Beth and Jessie's Ice Cream Company is attempting to perfect their recipe for vegan chocolate ice cream. One change in the recipe is related to the milk substitute. In some recipes they use oat milk, and in others, almond milk. Beth and Jessie hire several food scientists who explain to them that oat milk is sweeter, and there is a stable (stochastic) positive relationship between sweetness and consumer ratings. Based on this information, Beth and Jessie conclude that oat milk is the cause of the higher consumer rating. In reasoning in this way, Beth and Jessie are most likely employing which theory of causality? The structural model theory of causality The counterfactual theory of causality The probabilistic theory of causality

Answer: The structural model theory of causality Explanation: This scenario involves a structural explanation (sweetness leads to higher ratings), which aligns with the structural model theory of causality. This theory involves understanding causality through structural relationships between variables.

In a completely randomized design, every experimental unit has the same chance of being assigned to the control or treatment groups. True False

Answer: True Explanation: A completely randomized design ensures that each experimental unit has an equal probability of being assigned to any of the treatment groups, including the control group.
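A minimal Python sketch of how such an assignment might be carried out is shown here. The function name, the unit labels 1 to 12, the two group labels, and the equal group sizes are all illustrative assumptions, not part of the question.

```python
import random

def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly split units into equal-sized treatment groups.

    Shuffling the full list of units first means every unit has the
    same chance of landing in any group.
    """
    units = list(units)
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    rng.shuffle(units)
    size = len(units) // len(treatments)
    return {t: units[i * size:(i + 1) * size] for i, t in enumerate(treatments)}

# Hypothetical example: 12 experimental units, two groups
assignment = completely_randomized_assignment(range(1, 13), ["control", "treatment"], seed=0)
print(assignment)
```

The seed is only there to make the illustration reproducible; in a real experiment the randomization would typically be run once and recorded.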

Factorial designs are experimental designs that consist of two or more treatment factors. True False

Answer: True Explanation: Factorial designs involve two or more factors, with each factor having multiple levels, allowing for the study of interactions between the factors.

Randomization reduces the effect of unknown nuisance variables that might be correlated with the response True False

Answer: True Explanation: Randomization helps ensure that unknown nuisance variables, which might otherwise influence the response, are evenly distributed across treatment groups. This minimizes the likelihood that these nuisance variables will confound the results, thereby reducing their effect on the response.

In a randomized experiment, a control is a baseline to which treatments are compared. True False

Answer: True Explanation: The control group serves as the baseline in a randomized experiment, allowing the effects of treatments to be compared against a standard or default condition.

Two-way ANOVA can be used to analyze the results of a randomized complete block design with one treatment factor and one blocking factor True False

Answer: True Explanation: Two-way ANOVA is commonly used to analyze data from a randomized complete block design. In this context, one factor represents the treatment, and the other factor represents the blocks. The two-way ANOVA allows the experimenter to assess both the treatment effect and the blocking effect.

Which of the following are benefits of casting one-way ANOVA as a linear regression model? Casting one-way ANOVA as a linear regression model allows us to rely on least squares (or maximum likelihood) to estimate our parameters. Casting one-way ANOVA as a linear regression model allows us to rely on the interpretation of regression parameters to answer our research questions of interest. Casting one-way ANOVA as a linear regression model provides a set of inference techniques (e.g., t-tests, F-tests) that might help us answer research questions of interest.

Casting one-way ANOVA as a linear regression model allows us to rely on least squares (or maximum likelihood) to estimate our parameters. Casting one-way ANOVA as a linear regression model allows us to rely on the interpretation of regression parameters to answer our research questions of interest. Casting one-way ANOVA as a linear regression model provides a set of inference techniques (e.g., t-tests, F-tests) that might help us answer research questions of interest.

Prospective power analyses are often used to: Choose a sample size to achieve a particular power level Estimate the power of a given study design Choose the effect size Minimize the type I error

Choose a sample size to achieve a particular power level Estimate the power of a given study design

Markus is conducting a study on the effect of eating dark chocolate on health. In the study, Markus recruits 𝑛=24 individuals, and splits them into three groups: 1. A control group that eats no dark chocolate. 2. A group that eats one ounce of dark chocolate per day for six weeks. 3. A group that eats one ounce of dark chocolate per day for six weeks and performs at least 30 minutes of exercise four times per week. Which of the following would help avoid data dredging? If hypotheses about the relationships between dark chocolate and health markers cannot be specified before conducting the study, then, to achieve an overall false positive rate of 5%, the team should set the familywise type I error rate to 5%, and adjust individual hypothesis test type I error rates accordingly. Specify any hypotheses about the relationships between dark chocolate and health markers before conducting the study.

If hypotheses about the relationships between dark chocolate and health markers cannot be specified before conducting the study, then, to achieve an overall false positive rate of 5%, the team should set the familywise type I error rate to 5%, and adjust individual hypothesis test type I error rates accordingly. Specify any hypotheses about the relationships between dark chocolate and health markers before conducting the study.

Consider a one-way ANOVA with a 𝐽-level factor. How many pairwise comparisons are possible (i.e., hypotheses of the form 𝐻0: 𝜇𝑖=𝜇𝑗 for 𝑖≠𝑗)?

J choose 2 comparisons are possible
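As a quick check of this count, the sketch below enumerates the pairs directly and compares the total with the binomial coefficient J(J−1)/2. The value J = 5 is an illustrative assumption, not taken from the question.

```python
from itertools import combinations
from math import comb

def pairwise_comparisons(J):
    """Return all index pairs (i, j) with i < j for a J-level factor."""
    return list(combinations(range(1, J + 1), 2))

J = 5  # illustrative number of factor levels (assumption)
pairs = pairwise_comparisons(J)
print(len(pairs), comb(J, 2))  # both counts agree: 10 10
```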

Social science researchers are sometimes interested in what social or cultural factors influence happiness. One such research question might be: are married people happier than non-married people? Select the most promising modeling approach for answering this question. An ANOVA model with a continuous happiness measure as the response and a two-level factor that records marital status as a predictor/explanatory variable. Since other variables like income level influence happiness, an ANCOVA model with a continuous happiness measure as the response, a two-level factor that records marital status, and other variables like income level as predictors/explanatory variables. Since other variables like income level influence happiness, a multiple linear regression model with a continuous happiness measure as the response, a two-level factor that records marital status, and other variables like income level as predictors/explanatory variables.

Since other variables like income level influence happiness, an ANCOVA model with a continuous happiness measure as the response, a two-level factor that records marital status, and other variables like income level as predictors/explanatory variables. Since other variables like income level influence happiness, a multiple linear regression model with a continuous happiness measure as the response, a two-level factor that records marital status, and other variables like income level as predictors/explanatory variables.

Consider an experiment conducted to study the effectiveness of different hand washing techniques (factor) on the prevalence of bacteria (response). The experiment tested four different methods: washing with water only, washing with regular soap, washing with antibacterial soap (ABS), and spraying hands with antibacterial spray (AS) (containing 65% ethanol as an active ingredient). Ten different hand washings were included within each level (i.e., ten people washed with water only, with regular soap, etc.). Whitney mistakenly sets up a one-way ANOVA regression model with four indicator variables, one for each level. What consequences does this mistake have? The regression model is non-identifiable. hat(𝛽)=(𝑋^𝑇𝑋)^(−1)𝑋^𝑇𝑌 cannot be accurately computed from the data. The matrix 𝑋^𝑇𝑋 (where 𝑋 is the design matrix) is invertible. The matrix 𝑋^𝑇𝑋 (where 𝑋 is the design matrix) is not invertible.

The regression model is non-identifiable. hat(𝛽)=(𝑋^𝑇𝑋)^(−1)𝑋^𝑇𝑌 cannot be accurately computed from the data. The matrix 𝑋^𝑇𝑋 (where 𝑋 is the design matrix) is not invertible.
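The rank deficiency behind this answer can be checked numerically. The sketch below assumes the model also contains an intercept column (that is what makes four indicators redundant, since they sum to the intercept) and uses a small exact Gaussian elimination written for illustration. The helper names `matrix_rank` and `design_matrix` are hypothetical, not from any library.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix (list of row lists) via exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a pivot row at or below the current rank position
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def design_matrix(n_per_group, n_levels, n_indicators):
    """Intercept column plus indicators for the last n_indicators levels."""
    rows = []
    for level in range(n_levels):
        for _ in range(n_per_group):
            indicators = [1 if level == k else 0
                          for k in range(n_levels - n_indicators, n_levels)]
            rows.append([1] + indicators)
    return rows

X_bad = design_matrix(10, 4, 4)  # 40 x 5: intercept + four indicators
X_ok = design_matrix(10, 4, 3)   # 40 x 4: intercept + three indicators
print(len(X_bad[0]), matrix_rank(X_bad))  # 5 columns but rank 4
print(len(X_ok[0]), matrix_rank(X_ok))    # 4 columns and rank 4
```

Because the rank of 𝑋^𝑇𝑋 equals the rank of 𝑋, a design matrix with fewer independent columns than total columns gives a singular 𝑋^𝑇𝑋, so the inverse in the least squares formula does not exist.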

In the study, 100 different senior-level high school math classes were randomly chosen from all such classes in the state of Colorado. Each class had n=25 students. Among the 100 classes, 50 were randomly chosen, and teachers were asked to teach a lesson using a new active learning teaching method; in the other classes, teachers used the standard "lecture" teaching method. The response in this experiment was an exam, which was administered to each student and measured the extent to which each student learned the content of the lesson. Identify the sampling units in this study. The sampling units are the 2500 students in the senior-level high school classes. The sampling units are the 100 teachers teaching the senior-level high school classes. None of the above. The sampling units are the 100 senior-level high school classes.

The sampling units are the 2500 students in the senior-level high school classes.

Interaction plots can tell researchers whether there is a statistically significant interaction between factors (with respect to the mean of a continuous response variable). True False

The statement is False. Explanation: Interaction plots are helpful visual tools for visualizing potential interactions between factors by showing how the mean response changes across levels of the factors. However, they do not provide a formal statistical test of significance. To determine whether an interaction is statistically significant, researchers need to conduct a formal statistical test, such as an ANOVA test, which provides p-values for the interactions.

Researchers designed an experiment to study factors affecting the particle size in the production of polyvinyl chloride (PVC) plastic. In the experiment, three operators (Factor A, levels X, Y, and Z) used eight different devices called resin railcars (Factor B, levels 1-8) to produce PVC. There were 24 total particle size measurements in the study. This experiment is balanced. This experiment is unbalanced. There is not enough information to conclude that the experiment is balanced or unbalanced.

There is not enough information to conclude that the experiment is balanced or unbalanced.

Researchers designed an experiment to study factors affecting the particle size in the production of polyvinyl chloride (PVC) plastic. In the experiment, three operators (Factor A, levels X, Y, and Z) used eight different devices called resin railcars (Factor B, levels 1-8) to produce PVC. There were 24 total particle size measurements in the study. However, there was no measurement of particle size for operator Y and resin railcar 3. This experiment is unbalanced. This experiment is balanced. This experiment contains replication. This experiment contains the same number of replications for each factor level combination

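This experiment is unbalanced. This experiment contains replication. Explanation: There are 3 × 8 = 24 operator-railcar combinations and 24 measurements. Because the combination of operator Y and railcar 3 has no measurement, the 24 measurements fall into at most 23 combinations, so at least one combination was measured more than once (replication), and the combinations cannot all have the same number of measurements (unbalanced).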
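The counting argument can be illustrated with a small sketch. The specific layout below (one run per combination, with operator X and railcar 1 run twice) is a hypothetical example consistent with the question, not the actual data, and the function names are illustrative.

```python
from collections import Counter
from itertools import product

def cell_counts(observations, levels_a, levels_b):
    """Count observations in every factor-level combination, including empty ones."""
    counts = Counter(observations)
    return {cell: counts.get(cell, 0) for cell in product(levels_a, levels_b)}

def is_balanced(counts):
    """A design is balanced when every cell has the same number of observations."""
    return len(set(counts.values())) == 1

operators = ["X", "Y", "Z"]
railcars = list(range(1, 9))

# Hypothetical layout: (Y, 3) is missing and (X, 1) was run twice (assumption)
obs = [cell for cell in product(operators, railcars) if cell != ("Y", 3)]
obs.append(("X", 1))

counts = cell_counts(obs, operators, railcars)
has_replication = max(counts.values()) > 1
print(len(obs), is_balanced(counts), has_replication)  # 24 False True
```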

The testing of combined effects in the context of two-way ANOVA implies that there are not interactions present. True False

This statement is False. Explanation: Combined effects can be tested regardless of whether interactions are present. However, if interactions are significant, they may complicate the interpretation of main effects because the effect of one factor may depend on the level of another factor.

Researchers designed an experiment to study factors affecting the particle size in the production of polyvinyl chloride (PVC) plastic. In the experiment, three operators (Factor A, levels X, Y, and Z) used eight different devices called resin railcars (Factor B, levels 1-8) to produce PVC. Suppose that there were several replications in the study. If the interaction terms in this study are statistically significant, then we cannot easily interpret the marginal effect of resin railcar in particle size. True False

This statement is True. Explanation: When the interaction term is statistically significant, it means the effect of one factor depends on the level of the other factor. In this case, the marginal effect of resin railcar on particle size cannot be easily interpreted without considering the specific operator (Factor A) involved. The interaction complicates the interpretation of the main effects because the effect of one factor is conditional on the levels of the other factor.

A one-way ANOVA model with a J-level factor is a multiple linear regression model with J-1 predictor/explanatory variables, and a continuous response. True False

True

Analysis of Covariance (ANCOVA) can help answer the question: are there differences, with respect to the population mean of a response variable, across groups, adjusting for several continuous variables thought to be correlated with the response? True False

True

Analysis of Covariance (ANCOVA) can help answer the question: how does the relationship between a continuous response and continuous predictor differ, on average, across groups? True False

True

Consider a one-way ANOVA regression model with a factor of five levels: a control group and four treatments. The mean of the response variable in the third treatment group is 𝛽0+𝛽3. True False

True
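To see why, note that with indicator (treatment) coding and the control as the reference level, the least squares estimates reduce to the control group mean for 𝛽0 and mean differences from control for the other coefficients. A minimal sketch with made-up data (the numbers and the function name are assumptions for illustration):

```python
from statistics import mean

def treatment_coding_coefficients(groups):
    """Least squares estimates under indicator coding.

    groups[0] is the control (reference) group.
    Returns [beta0, beta1, ..., beta_{J-1}].
    """
    baseline = mean(groups[0])
    return [baseline] + [mean(g) - baseline for g in groups[1:]]

# Hypothetical data: control plus four treatment groups
groups = [[10.0, 12.0], [13.0, 15.0], [9.0, 11.0], [20.0, 22.0], [11.0, 11.0]]
beta = treatment_coding_coefficients(groups)

# beta0 + beta3 recovers the mean of the third treatment group
print(beta[0] + beta[3], mean(groups[3]))  # 21.0 21.0
```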

Consider a two-way ANOVA model with two factors, τ and α, each of two levels, with the following regression form: Y_i = β_0 + β_1 τ_(i,2) + β_2 α_(i,2) + β_3 τ_(i,2) α_(i,2) + ε_i, where the typical definitions of the indicator variables hold (e.g., τ_(i,2) = 1 when the ith unit is in the second level of τ and τ_(i,2) = 0 otherwise). Whether there is an interaction in the data can be successfully tested with a t-test. True False

True

Consider the following ANCOVA regression model, which has a three level factor and a continuous predictor 𝑋. One can test the statistical significance of interactions with an F-test for regression coefficients. True False

True

In the context of one-way ANOVA, the total variability is equal to the between groups variability plus the within group variability. True False

True
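The decomposition can be verified numerically. In the sketch below, "variability" is measured by sums of squares, and the three small groups are made-up data used only for illustration.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (total, between, within) sums of squares for a list of groups."""
    all_values = [y for g in groups for y in g]
    grand = mean(all_values)
    ss_total = sum((y - grand) ** 2 for y in all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return ss_total, ss_between, ss_within

# Hypothetical data: three groups of three observations
groups = [[4.0, 5.0, 6.0], [7.0, 9.0, 8.0], [1.0, 2.0, 3.0]]
ss_total, ss_between, ss_within = sums_of_squares(groups)
print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
```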

In the one-way ANOVA regression model, the intercept term 𝛽0 is the expected response in the baseline/reference/control group. True False

True

Post hoc comparisons performed without adjusting for type I error rates is a form of data dredging. True False

True

Retrospective power analyses are widely thought to be epistemically unjustified True False

True

Suppose that a one-way ANOVA is conducted, with a factor of 5 levels. The contrast using 𝑐=(1,−1,1,0,−1) and 𝜇=(𝜇1,𝜇2,𝜇3,𝜇4,𝜇5) is equivalent to the null hypothesis 𝐻0: 𝜇1−𝜇2+𝜇3=𝜇5. True False

True
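The equivalence follows from writing the contrast out: 𝑐·𝜇 = 𝜇1 − 𝜇2 + 𝜇3 − 𝜇5, and setting this to zero gives 𝜇1 − 𝜇2 + 𝜇3 = 𝜇5. A short sketch, where the group means are hypothetical values chosen to satisfy the null hypothesis:

```python
def contrast_value(c, mu):
    """Return the linear combination sum_j c_j * mu_j."""
    if len(c) != len(mu):
        raise ValueError("c and mu must have the same length")
    return sum(cj * mj for cj, mj in zip(c, mu))

c = (1, -1, 1, 0, -1)
assert sum(c) == 0  # contrast coefficients sum to zero

# Hypothetical group means chosen so that mu1 - mu2 + mu3 = mu5
mu = (3.0, 2.0, 4.0, 10.0, 5.0)
print(contrast_value(c, mu))  # 0.0, so H0 holds for these means
```

Note that 𝜇4 has coefficient 0, so its value does not affect the contrast.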

The effects model for one-way ANOVA is given as: Y_(i,j)=μ+τ_j+ε_(i,j), where ε_(i,j)∼iidN(0,σ^2). In this model, 𝜇 can be interpreted as the mean of the response over all units in the sample. True False

True

The means model for one-way ANOVA is given as: Y_(i,j)=μ_j+ε_(i,j), where ε_(i,j)∼iidN(0,σ^2). In this model, 𝜇_𝑗 can be interpreted as the mean response limited to the units in the 𝑗th level of the factor, 𝜏_𝑗. True False

True

The use of one-way ANOVA in a completely randomized design provides the justification for the "empirical association" condition of causality. True False

True

Tukey's method always compares the mean of each group to the mean of every other group (i.e., makes all pairwise comparisons). True False

True

When conducting several hypothesis tests after the data have been observed, one must adjust the p-values of those tests to correct for the _____________ error rate. familywise type I individual type I

familywise type I
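To see why the familywise rate matters, the sketch below computes the probability of at least one false positive across m tests, 1 − (1 − α)^m, and then applies a Bonferroni adjustment (α/m per test). The formula assumes independent tests, which is a simplification (pairwise comparisons in ANOVA are generally not independent), and m = 10 is an illustrative choice.

```python
def familywise_error_rate(alpha, m):
    """P(at least one false positive) across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(familywise_alpha, m):
    """Per-test level that keeps the familywise rate at or below the target."""
    return familywise_alpha / m

m = 10  # e.g., all pairwise comparisons for a 5-level factor (illustrative)
unadjusted = familywise_error_rate(0.05, m)
adjusted = familywise_error_rate(bonferroni_alpha(0.05, m), m)
print(round(unadjusted, 3))  # roughly 0.401
print(round(adjusted, 3))    # just under 0.05
```

Bonferroni is only one possible adjustment; it is used here because it is the simplest to state.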

In a 2014 paper titled "Involving Children in Meal Preparation," published in the journal Appetite, researchers hoped to determine the effect of child participation in meal preparation (factor with two levels) on caloric intake (response). In group one, children participated in the preparation of a meal. In group two, children did not participate. Which of the following is a correct interpretation of 𝜇2 in the one-way ANOVA means model? 𝜇2 is the mean of the caloric intake in the meal preparation group where children did not participate. 𝜇2 is the mean of the meal preparation variable where children did not participate. 𝜇2 is the mean of the caloric intake in the meal preparation group where children did participate. None of the above are correct.

𝜇2​ is the mean of the caloric intake in the meal preparation group where children did not participate.

