Final Exam Stat 1430
if p-value = alpha
"marginal result"
what is the derivative of e^-x
-e^-x
e^-1 =
1/e
what is the notation for coefficient of determination
R-Sq
type 1 error
Rejecting null hypothesis when it is true - false alarm - ex: your yogurt might have been ok and you said it wasn't
what does "Sy" stand for?
SD of y values
sigma xbar =
sigma x / sqrt(n)
variance of a discrete random variable
Weighted average of the squared deviations from the mean - notation: sigma^2 - units are squared, not the original units of x
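A quick worked example may help here. The sketch below computes the mean, variance, and SD of a discrete random variable; the distribution is made up for illustration and the function names are hypothetical.

```python
# Hypothetical distribution (made up for illustration): value -> probability
dist = {0: 0.2, 1: 0.5, 2: 0.3}

def discrete_mean(dist):
    """Weighted average of the outcomes; the weights are the probabilities."""
    return sum(x * p for x, p in dist.items())

def discrete_variance(dist):
    """Weighted average of the squared deviations from the mean."""
    mu = discrete_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

mu = discrete_mean(dist)
variance = discrete_variance(dist)
sd = variance ** 0.5  # SD is the square root of the variance

print("mean:", round(mu, 4))
print("variance:", round(variance, 4))
print("SD:", round(sd, 4))
```

For this made-up distribution the mean is 1.1, the variance is 0.49, and the SD is 0.7.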
If X is a binomial random variable then X is also a __________________ random variable.
discrete finite
population
entire group of interest
probability density function of a continuous random variable
f(x): a function that tells you how much probability is in the area near x (not at x, around it)
if p-value > alpha
fail to reject Ho
type 2 error
failing to reject a false null hypothesis - the yogurt might not be filling correctly but you didn't detect it - more complex than type 1 error
Suppose the correlation between two variables X and Y is .8. That means the correlation between Y and X is -.8. (true/false)
false
Suppose the correlation between yards rushing and yards passing is .6. That means the correlation between feet rushing and feet passing is .6 x 3 (since you multiply yards by 3 to convert to feet). (true/false)
false
uniform distribution means
flat
what type of correlation is -1
perfect downhill
what type of correlation is + 1
perfect uphill
With a continuous random variable, P(a < X < b) is the area under the curve f(x) between a and b. (true/false)
true
If x is continuous then P(x < 4) = P(x <= 4). (true/false)
true because the probability of x=4 is 0.
outliers affect confidence interval (true/false)
true because they affect the mean
what type of correlation is +- .3
weak linear relationship
the mean of a discrete random variable
weighted average of the possible outcomes; the weights are the probabilities - mu x = sum of x p(x)
use _______ to predict _________
x , y
coefficient of determination
% of variability in y that is due to x
confidence level=
(1 - alpha) x 100%
characteristics of a continuous random variable
- X is a continuous random variable if it takes on values in an interval on the real number line - the possible values are uncountably infinite
how to find the best line
- smallest SSE - find the values of b0 and b1 that minimize SSE
Central Limit Theorem (CLT)
If X has any distribution (not normal) then the SHAPE of the sampling distribution of Xbar is approximately normal, as long as n > 30
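A small simulation can make the CLT card concrete. This is only a sketch under stated assumptions: the population is an exponential distribution (mean 1, SD 1, strongly right-skewed), and the seed, sample size, and number of samples are arbitrary choices.

```python
import random
from math import sqrt

random.seed(1430)  # arbitrary fixed seed so the run is repeatable

N_SAMPLES = 5000   # number of samples to draw (assumption)
n = 40             # size of each sample, above the n > 30 rule of thumb

# Assumed population: exponential with mean 1 and SD 1 (not normal)
pop_mean, pop_sd = 1.0, 1.0

# Draw many samples of size n and record each sample mean (Xbar)
sample_means = []
for _ in range(N_SAMPLES):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(sum(sample) / n)

# Center and spread of the simulated sampling distribution of Xbar
mean_of_means = sum(sample_means) / N_SAMPLES
sd_of_means = sqrt(
    sum((m - mean_of_means) ** 2 for m in sample_means) / (N_SAMPLES - 1)
)

print("mean of Xbar:", round(mean_of_means, 3), "vs mu x =", pop_mean)
print("SD of Xbar:", round(sd_of_means, 3),
      "vs sigma x / sqrt(n) =", round(pop_sd / sqrt(n), 3))
```

The simulated center should land close to mu x and the simulated spread close to sigma x / sqrt(n); a histogram of sample_means would look roughly bell-shaped even though the population is skewed.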
mu xbar =
mu x
what does "Sx" stand for?
SD of x values
how should you round when you're finding sample size
always round up
sampling distribution
finding the distribution of all possible values of the sample statistic (from all possible samples of size n)
discrete random variable
finite or countably infinite ex: number of flips until 100 heads - x = 100, 101, 102, ...
if you switch x and y in a regression problem, does the correlation change?
no
what type of correlation is 0
no linear relationship
if X has a normal distribution, the shape of the distribution of X is _______________ and the shape of the sampling distribution of Xbar is _______________
normal; normal
if X does not have a normal distribution, the shape of the distribution of X is _______________ and the shape of the sampling distribution of Xbar is _______________
not normal; approximately normal if n is large enough
what are the 2 conditions n needs to meet to use z
1. np >= 10 2. n(1 - p) >= 10
b1=
slope
Correlation is affected by outliers (true/false)
true
b0=
y-intercept
best line formula =
y^ = b0 + b1x
can you have a high level of confidence but a small MOE?
yes - a higher confidence level makes Z bigger, but increasing n brings the MOE back down
If X is a continuous random variable, then p(x) = ___________ for any value of x. why?
0 because there is no area or probability at any single point
what is e^0
1
probability distribution of a discrete random variable 2 requirements:
1. 0 <= P(x) <= 1 2. sum of P(x) = 1
probability density function of a continuous random variable 2 requirements:
1. f(x) >= 0 2. total area under the curve = 1
what are the 2 conditions you need to check for confidence interval
1. np hat >= 10 2. n(1 - p hat) >= 10
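The two conditions can be checked mechanically. A minimal sketch, with a hypothetical helper name and made-up sample values:

```python
def conditions_met(n, p_hat):
    """Check both conditions: n * p_hat >= 10 and n * (1 - p_hat) >= 10."""
    return n * p_hat >= 10 and n * (1 - p_hat) >= 10

# Made-up examples
print(conditions_met(50, 0.30))  # 15 successes and 35 failures expected
print(conditions_met(50, 0.10))  # only 5 successes expected
```

The first call passes both conditions; the second fails because n * p hat is only 5.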
what are the 5 properties of correlation
1. quantitative variables only 2. linear relationship only 3. no units 4. if you switch x and y you get the same correlation 5. affected by outliers and skewness
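Properties 3 and 4 are easy to check numerically. The sketch below uses made-up rushing and passing yardage and a hand-written correlation function (the names are hypothetical) to show that r does not change when you switch x and y or convert yards to feet.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data for illustration
rushing_yards = [80, 120, 95, 150, 110]
passing_yards = [210, 250, 190, 300, 260]

r_yards = correlation(rushing_yards, passing_yards)
r_switched = correlation(passing_yards, rushing_yards)   # switch x and y
r_feet = correlation([3 * y for y in rushing_yards],     # yards -> feet
                     [3 * y for y in passing_yards])

print(round(r_yards, 4), round(r_switched, 4), round(r_feet, 4))
```

All three printed values agree, because correlation has no units and treats x and y symmetrically.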
to be 90% confident add and subtract ___________ standard errors
1.645
to be 95% confident add and subtract ___________ standard errors
1.96
Suppose X has a uniform distribution on the interval [0, 10]. What is f(x)?
1/10
Let f(x) = kx where x is between 0 and 2. What is the value of k that makes this a legitimate density function?
1/2
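The value of k follows from requiring the total area under f(x) to equal 1. A short derivation:

```latex
\int_0^2 kx \, dx = k \left[ \frac{x^2}{2} \right]_0^2 = 2k = 1
\quad\Longrightarrow\quad k = \frac{1}{2}
```

With k = 1/2, f(x) = x/2 is also non-negative on [0, 2], so both density requirements hold.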
to be 99% confident add and subtract ___________ standard errors
2.58
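The multipliers 1.645, 1.96, and 2.58 come from the standard normal distribution. A minimal sketch using Python's standard library; the helper name z_star is hypothetical:

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Z value that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_star(level), 3))
```

This prints 1.645, 1.96, and 2.576; the 99% value is usually rounded to 2.58.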
Selling price = $5,240 + $33.80 (Number of Square Feet). How do you interpret the slope for this equation?
As square feet increase by 1, selling price increases by $33.80
how to get the formula for sample size
set MOE = Z (sigma x / sqrt(n)) equal to the desired value MOE* and solve for n: n = (Z sigma x / MOE*)^2
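Solving MOE = Z (sigma x / sqrt(n)) for n gives n = (Z sigma x / MOE)^2, rounded up. A sketch of that calculation; the function name and the example numbers (sigma = 15, desired MOE = 2) are made up for illustration:

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_sample_size(sigma, desired_moe, confidence_level=0.95):
    """Smallest n whose margin of error is at most desired_moe."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z * sigma / desired_moe) ** 2)  # always round up

n = required_sample_size(sigma=15, desired_moe=2)
print(n)  # 217
```

The raw formula gives about 216.08, so rounding up to 217 is what keeps the MOE at or below 2.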
random variable
a characteristic you can measure, count, or categorize
what does "observed y" mean?
a data point
Suppose the equation y = 3.45 - 2.58x represents a valid regression equation and X can be used to predict Y. From this information, we know that X and Y have _____________ correlation.
a negative
statistic
a number that describes the sample ex: sample mean (X bar)
parameter
a number that summarizes the population ex: pop mean
sample
a subset of the population that you select
SD of a discrete random variable
square root of the variance (the weighted average of the squared deviations from the mean) - notation: sigma - same units as x
best y-intercept formula =
b0 = ybar - b1 (xbar)
best slope formula
b1 = r (sy/sx)
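The slope and intercept formulas can be tried on a tiny data set. This is a sketch with made-up data; it computes r, sx, and sy by hand, applies b1 = r (sy/sx) and b0 = ybar - b1 (xbar), and compares SSE against a slightly nudged line.

```python
from math import sqrt

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))  # SD of x values
s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))  # SD of y values
r = sum((x - x_bar) * (y - y_bar)
        for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b1 = r * s_y / s_x        # best slope
b0 = y_bar - b1 * x_bar   # best y-intercept

def sse(intercept, slope):
    """Sum of squared residuals for the line y^ = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

print("b1:", round(b1, 4), "b0:", round(b0, 4))
print("SSE of best line:", round(sse(b0, b1), 4))
print("SSE of nudged line:", round(sse(b0 + 0.1, b1), 4))
```

For this data the slope comes out to 0.6 and the intercept to 2.2, and any nudged line has a larger SSE than the best line.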
why do you want to choose the sample size beforehand?
bc you don't want surprises with the MOE
If a residual is negative, then that data point lies _________________ the regression line.
below
if the problem asks you to "estimate" what are you trying to find?
confidence interval
What factors affect margin of error?
1. confidence level 2. sample size 3. population SD
is Time continuous or discrete
continuous
when n increases, MOE ____________
decreases
Your boss gives you the following regression equation, where X = square feet and Y = selling price: Selling price = $5,240 + $33.80 x. Does it make sense to interpret the Y-intercept for this equation? (true/false)
false
when the pop SD increases, MOE _____________
increases
when the confidence level increases, Z ____________ and MOE _______
increases, increases
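The three factors can be seen directly in the formula MOE = Z (sigma x / sqrt(n)). A minimal sketch with a hypothetical helper and made-up numbers:

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence_level=0.95):
    """MOE = Z * sigma / sqrt(n) for a mean with known population SD."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * sigma / sqrt(n)

base = margin_of_error(sigma=15, n=100)
bigger_n = margin_of_error(sigma=15, n=400)            # n goes up
bigger_sd = margin_of_error(sigma=30, n=100)           # pop SD goes up
bigger_conf = margin_of_error(sigma=15, n=100,
                              confidence_level=0.99)   # confidence goes up

print(round(base, 3), round(bigger_n, 3),
      round(bigger_sd, 3), round(bigger_conf, 3))
```

Quadrupling n cuts the MOE in half, doubling the SD doubles it, and raising the confidence level raises it.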
continuous random variable
infinite number of possibilities (uncountably infinite) ex: the time you wait, anywhere from 0 to 10 minutes, before you leave
percentile
kth percentile is the value of x where k% lies below x
probability distribution of a discrete random variable
list of all the possible values of x and how often you expect them to occur (looking into future)
how do outliers and skewness affect correlation
usually make the linear relationship look weaker (r closer to 0), though an outlier that falls in line with the trend can inflate r
what does y bar stand for
mean/average of y-values
what type of correlation is +- .5
moderate linear relationship
what type of correlation is +- .6
moderately strong linear relationship
what is the Notation for the mean of a Continuous Random Variable
mu x
mu (x + y) =
mu x + mu y
Residual=
observed y - predicted y
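As an illustration, the residual can be computed with the selling-price equation from this set, Selling price = $5,240 + $33.80 (Number of Square Feet). The two houses below are made up, and the function names are hypothetical.

```python
def predicted_price(square_feet):
    """Predicted y from the equation: 5,240 + 33.80 * square feet."""
    return 5240 + 33.80 * square_feet

def residual(observed_price, square_feet):
    """Residual = observed y - predicted y."""
    return observed_price - predicted_price(square_feet)

# Made-up houses: (square feet, observed selling price)
res_a = residual(58000, 1500)   # sold above the prediction
res_b = residual(70000, 2000)   # sold below the prediction

print(round(res_a, 2), round(res_b, 2))
```

The first house has a residual of about +2,060, so its point lies above the line; the second has a residual of about -2,840, so its point lies below the line.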
p-value
probability, assuming Ho is true, of getting a result at least as extreme as (beyond) the test statistic
if p-value < alpha
reject Ho
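The three p-value cards in this set reduce to one comparison against alpha. A minimal sketch; the function name is hypothetical and the default alpha of 0.05 follows the usual cut-off:

```python
def decide(p_value, alpha=0.05):
    """Compare the p-value with alpha and return the conclusion."""
    if p_value < alpha:
        return "reject Ho"
    if p_value > alpha:
        return "fail to reject Ho"
    return "marginal result"  # p-value exactly equal to alpha

print(decide(0.01))   # reject Ho
print(decide(0.20))   # fail to reject Ho
print(decide(0.05))   # marginal result
```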
what does "r" stand for
sample correlation
the CLT only pertains to the ____________ of the distribution of the random variable Xbar
shape
spread
standard error/standard deviation
what type of correlation is +- .7
strong linear relationship
SSE stands for
sum of squares for error
significance level
the pre-set cut-off value before you collect any data - usually 0.05 - ALPHA
confidence interval
the range of values within which a population parameter is estimated to lie
what does correlation measure?
the strength and direction of linear relationships
If X and Y are independent, then True/False: Variance of (X-Y) = Variance of (Y-X)
true
Suppose X has a uniform distribution on the interval [0, 10]. True or false, the mean of X is 5.
true
linear transformation formula
y = ax + b; mu y = a (mu x) + b
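A quick numeric check of the mean rule, using a made-up example: converting Celsius temperatures to Fahrenheit is the linear transformation y = 1.8x + 32.

```python
from statistics import mean

a, b = 1.8, 32                      # Fahrenheit = 1.8 * Celsius + 32
celsius = [10, 15, 20, 25, 30]      # made-up x values
fahrenheit = [a * x + b for x in celsius]

mu_x = mean(celsius)
mu_y = mean(fahrenheit)

print(mu_y, a * mu_x + b)  # the two values match
```

Averaging the transformed values gives the same number as transforming the average, here 68.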