Math 1201 chapter 4
Suppose there is a medical study that has the variable alcohol, which counts the number of drinks per day that a patient drinks. Suppose the mean of the variable alcohol is 3.5 drinks per day. A particular patient consumes 2 drinks per day. Which of the following symbols would be used to represent the value 2 in the notation of the general linear model?
Y_i
Suppose you make a histogram of the ages of housekeepers from the MindsetMatters data frame with the following code.
gf_histogram(~Age, data = MindsetMatters, color = "red", fill = "blue") %>% gf_labs(title = "Age of Housekeepers", x = "Age(years)", y = "Number of Housekeepers")
We run a study and calculate a parameter estimate b_0 = 43. We decide to run the study again, and again find a parameter estimate of b_0 = 43. Based on these two studies, what do we know about the population parameter β_0?
We know that 43 is our best estimate of β_0, but we can never be sure of the true population parameter.
What notation can be used to represent the mean of a sample?
Ȳ (Y-bar)
If the mean age of the housekeepers is 37, and a certain observation has an age of 25, what is the residual?
-12
What are some characteristics of a statistical model? (Choose all that apply.)
A model is an oversimplification of some aspects of the real world; a model is used to summarize distributions; a model focuses on only one dimension, usually what you are most interested in; a model helps us predict future observations; a model contains errors, which are called deviations.
If a given observation is 23.8 and the mean is 33.8, what part of the general linear model notation represents 23.8?
Y_i
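The card above can be checked with a little arithmetic. This is a minimal sketch that assumes the empty-model form Y_i = b_0 + e_i; the variable names are illustrative only.

```r
# Values taken from the card above; variable names are hypothetical.
y_i <- 23.8        # Y_i: the observed value
b_0 <- 33.8        # b_0: the model's prediction (here, the mean)
e_i <- y_i - b_0   # e_i: the residual, observation minus prediction

e_i

# Adding the prediction and the residual recovers the observation.
isTRUE(all.equal(y_i, b_0 + e_i))
```

The residual works out to -10, so the observation sits 10 units below the mean.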
Which of the following cannot be calculated from a data set?
A parameter
In your own words, define what a statistical model is.
A statistical model is a function that produces a predicted score for each observation in a distribution.
Suppose there is a medical study that has the variable alcohol, which counts the number of drinks per day that a patient drinks. Suppose the mean of the variable is 3.5 drinks per day. A particular patient consumes 2 drinks per day. Which of the following represents the residual for this patient under the empty model?
All of these
In the general linear model notation, which of the following represents the model (or prediction)?
b_0
Why is the mean a good model for a distribution?
Because the mean balances the deviations above and below the mean.
When we use the mean as a model, why do we call it a "parameter estimate"?
Because we can't calculate the mean of the data generating process, we must estimate it.
What notation can be used to represent the mean of a population?
Both of these
In the general linear model notation, which of the following represents the residual (or error)?
e_i
A parameter describes a sample.
False
A statistic describes a population.
False
We can think of the mean and the median as the center of the distribution. However, the median is affected by outliers.
False
What R code will you use to fit an empty model of Age from MindsetMatters?
lm(Age ~ NULL, data = MindsetMatters)
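To see what the empty model returns, here is a minimal sketch in base R. It uses a small made-up data frame (hypothetical ages, not the real MindsetMatters data) so it runs without extra packages. `Age ~ 1` is the more common base R spelling of the same intercept-only model.

```r
# Hypothetical ages for illustration; NOT the real MindsetMatters data.
housekeepers <- data.frame(Age = c(25, 31, 37, 42, 50))

# Fit the empty (intercept-only) model, as on the card above.
empty_model <- lm(Age ~ NULL, data = housekeepers)

# The single coefficient is the parameter estimate b_0.
b_0 <- unname(coef(empty_model))
b_0

# For the empty model, b_0 equals the sample mean.
mean(housekeepers$Age)
```

Both values come out to 37 for this made-up sample, which is why the mean is called the empty model's parameter estimate.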
A best fitting model is a model which minimizes deviations and makes the error as small as possible.
True
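One way to make "as small as possible" concrete is to measure error as the sum of squared residuals. That measure is an assumption here, since the card does not say how error is totaled. Under it, a numerical search over candidate values of b_0 should land on the mean. The data and function name are hypothetical.

```r
# Hypothetical ages for illustration.
ages <- c(25, 31, 37, 42, 50)

# Assumed error measure: sum of squared residuals for a candidate b_0.
sum_sq_error <- function(b_0) sum((ages - b_0)^2)

# Search the range of the data for the b_0 with the smallest error.
best <- optimize(sum_sq_error, interval = range(ages))

best$minimum   # location of the minimum found by the search
mean(ages)     # the sample mean, for comparison
```

The search result agrees with the mean up to the tolerance of `optimize`, which fits the idea that the mean is the best fitting single-number model under this error measure.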
We can think of the mean and the median as the center of the distribution. However, the mean is affected by outliers.
True
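A quick sketch of how an outlier affects each measure of center, using made-up ages and one hypothetical extreme value:

```r
# Hypothetical ages for illustration.
ages <- c(25, 31, 37, 42, 50)

# Add one extreme value (also hypothetical) to act as an outlier.
ages_with_outlier <- c(ages, 120)

# How far each measure of center moves once the outlier is added.
mean_shift   <- mean(ages_with_outlier) - mean(ages)
median_shift <- median(ages_with_outlier) - median(ages)

mean_shift
median_shift
```

In this sample the mean moves from 37 to about 50.8, while the median moves only from 37 to 39.5. The median can still shift slightly when a value is added, but it is far less sensitive to how extreme that value is.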
In an empty model, the mean of the residuals is:
Zero or very close to zero
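The "very close to zero" wording reflects floating-point rounding. A minimal sketch with made-up ages (hypothetical data, base R only):

```r
# Hypothetical ages for illustration.
housekeepers <- data.frame(Age = c(25, 31, 37, 42, 50))

# Fit the empty model and pull out one residual per observation.
empty_model <- lm(Age ~ NULL, data = housekeepers)
residuals_i <- resid(empty_model)

residuals_i

# The residuals above and below the mean cancel out.
mean_residual <- mean(residuals_i)
mean_residual
```

Depending on the data, R may print exactly 0 or a tiny number such as one on the order of 1e-16, which is zero for practical purposes.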