AP Stat Midterm
How to calculate r
2nd > CATALOG > ALPHA x^-1 (jumps to D) > DiagnosticOn, then STAT > CALC > LinReg
respondents tend to have strong opinions, so the sample is biased and not representative of the population (e.g., call-in polls); bad to use
voluntary response sample
-.25<r<0
weak negative linear association
0<r<.25
weak positive linear association
explanatory is the __ variable
x
symbol for the mean of a sample (statistic)
x̄ (x-bar)
response is the ___ variable
y
how to calculate residuals
y-yhat
how to find y intercept
ybar-b(xbar)
does "or equal to" (e.g., <= vs. <) matter for discrete random variables?
yes
the distance between an observation and the mean expressed as a certain number of standard deviations; this is positive (negative) if the observation lies above (below) the mean
z-score
what does the upside-down U (∩) mean between A and B?
and (intersection)
lurking variable affects both x and y
common response.
lurking variable affects only y (response)
confounding
measurable random variable
continuous
limit the effects of lurking variables by comparing treatment and placebo groups (principle of experimental design)
control
selection bias (mall intercept interviews) bad to use
convenience sample
both variables are quantitative
correlation
imposes treatment to observe responses
experiment
use of a regression line to predict outside the range of the explanatory variable (often inaccurate; avoid)
extrapolation
explanatory variables in experiments
factors
how to find the normal approximation for the probability of __ or more outcomes
find the mean and standard deviation, and make sure the sample size is large enough; then normalcdf(lower limit [the outcome #], 10^99, mean, standard dev.)
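A rough Python sketch of the same calculation, using the standard library's `statistics.NormalDist` in place of the calculator's normal cdf command. The helper name `normal_approx_at_least` and the example numbers (100 trials, p = 0.5, 60 or more) are made up for illustration, and like the card it applies no continuity correction.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx_at_least(n, p, k):
    """Approximate P(X >= k) for a binomial count X with a normal curve."""
    # Large-sample check: np >= 10 and n(1-p) >= 10
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("sample size too small for the normal approximation")
    mean = n * p                    # binomial mean
    sd = sqrt(n * p * (1 - p))      # binomial standard deviation
    # Area from k up to "infinity" (the 10^99 upper limit on the calculator)
    return 1 - NormalDist(mean, sd).cdf(k)

# Hypothetical example: 100 trials, p = 0.5, probability of 60 or more
print(normal_approx_at_least(100, 0.5, 60))
```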
How do you find relative frequency
for each count, divide by total number of data points, then convert to a percentage
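The same steps as a short Python sketch. The function name and the list of grades are invented for the example.

```python
from collections import Counter

def relative_frequencies(data):
    """Return each distinct value's share of the data, as a percentage."""
    counts = Counter(data)          # count for each distinct value
    total = len(data)               # total number of data points
    return {value: 100 * count / total for value, count in counts.items()}

# Made-up data: 8 letter grades
grades = ["A", "B", "B", "C", "B", "A", "C", "B"]
print(relative_frequencies(grades))
```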
Use when finding the first success on or before the nth trial (p,n)
geometcdf
either of two discrete probability distributions: (1) the probability distribution of the number X of Bernoulli trials needed to get one success, supported on the set {1, 2, 3, ...}; (2) the probability distribution of the number Y = X − 1 of failures before the first success, supported on the set {0, 1, 2, 3, ...}
geometric random variable
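A small Python sketch of the first version of the distribution (X = trial of the first success). The two functions are stand-ins named after the calculator commands, not the calculator's own code, and the example values are arbitrary.

```python
def geometpdf(p, n):
    """P(first success happens exactly on trial n)."""
    return (1 - p) ** (n - 1) * p   # n-1 failures, then one success

def geometcdf(p, n):
    """P(first success happens on or before trial n)."""
    return 1 - (1 - p) ** n         # 1 - P(first n trials all fail)

# Arbitrary example: p = 0.5, first success within 3 trials
print(geometcdf(0.5, 3))
```

Adding `geometpdf` for trials 1 through n gives the same answer as `geometcdf`.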
Why doesn't "or equal to" matter for continuous random variables?
probability is area under the curve, and a single line has no area
how do you get estimates
large random samples
law stating that a large number of items taken at random from a population will (on the average) have the population statistics
law of large numbers
a straight line that describes how a response variable changes as an explanatory variable changes
least squares regression line
symbol for sample size (statistic)
lowercase n
not resistant to outliers
mean
resistant to outliers
median
symbol for the mean of a population (parameter)
μ (mu)
expected sum of 2 outcomes
μ_X + μ_Y (add the means)
expected difference of 2 outcomes
μ_X - μ_Y (subtract the means)
-.75<r<-.25
moderate negative linear association
.25<r<.75
moderate positive linear association
how to find the mean of a probability distribution
multiply each value of X by its probability P(X), then add the results together
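As a quick Python sketch (the function name and the coin-toss table are illustrative only):

```python
def expected_value(dist):
    """dist maps each value x to its probability P(x)."""
    return sum(x * p for x, p in dist.items())

# Example table: number of heads in two fair coin tosses
tosses = {0: 0.25, 1: 0.5, 2: 0.25}
print(expected_value(tosses))
```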
how to get prob two independent things (and)
multiply their probabilities
Relative cumulative frequency graph
ogive
symbol for proportion of parameter
p
symbol for proportion of statistic
p hat
"and" for independent events: P(A and B) =
P(A) x P(B)
P(A) + P(B) - P(A and B) =
P(A or B)
"and" for events that are not independent: P(A and B) =
P(A) x P(B | A)
"or" for any events: P(A or B) =
P(A) + P(B) - P(A and B)
"or" for mutually exclusive events: P(A or B) =
P(A) + P(B)
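One way to check the "and" and "or" rules is to count outcomes directly. This sketch assumes one roll of a fair die; the events A, B, and C are chosen only for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair die
A = {2, 4, 6}                       # roll is even
B = {4, 5, 6}                       # roll is greater than 3
C = {1}                             # roll is 1 (mutually exclusive with A)

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

p_and = prob(A & B)                          # P(A and B), counted directly
p_or = prob(A) + prob(B) - p_and             # general "or" rule
p_b_given_a = p_and / prob(A)                # P(B | A)

assert p_or == prob(A | B)                   # matches direct counting
assert prob(A) * p_b_given_a == p_and        # general "and" rule
assert prob(A) + prob(C) == prob(A | C)      # "or" rule, mutually exclusive
print(p_and, p_or)
```

Here A and B are not independent, since P(A) x P(B) = 1/4 while P(A and B) = 1/3, so the shortcut P(A) x P(B) would give the wrong answer.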
sampling distributions: the true value that describes the population; a fixed number whose value we usually don't know
parameter
how to find the mean of the sample proportion (p hat)
it equals the population proportion (parameter) p
Experimental results caused by expectations alone; any effect on behavior caused by the administration of an inert substance or condition, which is assumed to be an active agent.
placebo effect
how to find y hat
plug x into the least-squares regression line
entire group of individuals we want info about
population
how to identify a power model on the calculator
put x in L1 and y in L2 and make a scatterplot. make L3 = log(L2). go to 2nd "Y=" (STAT PLOT), set the Ylist to L3, and graph. then make L4 = log(L1). in 2nd "Y=", set the Xlist to L4 and the Ylist to L3. if this makes a straighter line, it's a power model. if so, do STAT > CALC > PwrReg, enter the resulting equation in "Y=", then go to 2nd "Y=", set the lists back to L1 and L2, and graph.
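The log-log check above can be sketched in Python. This assumes the usual approach of fitting a straight line to (log x, log y) and converting back to y = a * x^b; it is not the calculator's own routine, and the function names and data are hypothetical.

```python
from math import log10

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    return ybar - b * xbar, b

def power_regression(xs, ys):
    """Fit y = coef * x**exponent by regressing log y on log x."""
    log_x = [log10(x) for x in xs]      # like L4 = log(L1)
    log_y = [log10(y) for y in ys]      # like L3 = log(L2)
    a, b = fit_line(log_x, log_y)
    return 10 ** a, b                   # undo the log on the intercept

# Made-up data that follows y = 3x^2 exactly
xs = [1, 2, 3, 4, 5]
ys = [3 * x ** 2 for x in xs]
coef, exponent = power_regression(xs, ys)
print(coef, exponent)
```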
What unit is r
r is unitless.
how to find slope
r(sy/sx)
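A Python sketch tying the regression cards together: slope b = r(sy/sx), intercept a = ybar - b(xbar), and residual = y - yhat. The formula used for r (the sum of products of z-scores, divided by n - 1) is the usual textbook one and is an assumption here, since the cards only give the calculator route. The data are made up.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """r = sum of (z-score of x times z-score of y), divided by n - 1."""
    xbar, ybar = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    total = sum(((x - xbar) / sx) * ((y - ybar) / sy) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

def least_squares_line(xs, ys):
    """Return (intercept a, slope b) for yhat = a + b*x."""
    b = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)   # r(sy/sx)
    a = mean(ys) - b * mean(xs)                               # ybar - b(xbar)
    return a, b

# Made-up data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]         # y - yhat
print(a, b, residuals)
```

The residuals from a least-squares line add up to zero (up to rounding), which is a handy check.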
use impersonal chance to assign experimental units to treatments (principle of experimental design)
randomize
best way to show causation
randomized experiments
reduce chance variation in results (principle of experimental design)
replicate
the difference between an observed value of the response variable and the value predicted by the regression line (y - y-hat)
residual
symbol for the standard deviation of a sample (statistic)
s
part of population we examine to get info
sample
union (a + b)
sample space
the set of all possible outcomes of a random phenomenon
sample space S
the most effective method of displaying the relation between two quantitative variables
scatterplot
how to find binomial stand dev
square root (np(1-p))
how to find stand. dev. of statistic prop.
square root of (p(1-p)/n), or square root of (pq/n)
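A short Python sketch comparing the binomial count formulas (mean np, standard deviation square root of np(1-p)) with the sample proportion formulas (mean p, standard deviation square root of p(1-p)/n). Function names and the values n = 100, p = 0.5 are illustrative.

```python
from math import sqrt

def binomial_count_mean_sd(n, p):
    """Mean and standard deviation of the number of successes."""
    return n * p, sqrt(n * p * (1 - p))

def sample_proportion_mean_sd(n, p):
    """Mean and standard deviation of p hat (successes divided by n)."""
    return p, sqrt(p * (1 - p) / n)

# Illustrative values
print(binomial_count_mean_sd(100, 0.5))
print(sample_proportion_mean_sd(100, 0.5))
```

The proportion results are just the count results divided by n.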
how to find the standard deviation of a probability distribution
square root of [(sum of x^2 times p(x)) - mean^2] (i.e., square root of the variance)
simple random sample
srs
whole population in hat and draw names, everyone has an equal chance
srs
sampling distributions: describes the sample; its value is known but can change from sample to sample; used to estimate a parameter
statistic
sampling method in which the population is split into groups (strata) of similar individuals and a random sample is taken from each
stratified
-1<r<-.75
strong negative linear association
.75<r<1
strong positive linear association
how to find the variance of a probability distribution
[sum (sigma) of x^2 times p(x)] - mean^2
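A minimal Python sketch of the shortcut formula: add up x^2 times p(x), subtract the squared mean, and take the square root for the standard deviation. The function names and the coin-toss table are only an example.

```python
from math import sqrt

def dist_mean(dist):
    """Sum of x * p(x); dist maps each value x to p(x)."""
    return sum(x * p for x, p in dist.items())

def dist_variance(dist):
    """(Sum of x^2 * p(x)) minus mean^2."""
    return sum(x ** 2 * p for x, p in dist.items()) - dist_mean(dist) ** 2

def dist_sd(dist):
    """Standard deviation is the square root of the variance."""
    return sqrt(dist_variance(dist))

# Example table: number of heads in two fair coin tosses
tosses = {0: 0.25, 1: 0.5, 2: 0.25}
print(dist_variance(tosses), dist_sd(tosses))
```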
standard deviation explanatory shorthand (sample)
sx
standard deviation response shorthand
sy
how to find a high outlier
the number is higher than Q3+1.5(IQR)
how to find a low outlier
the number is lower than Q1-1.5(IQR)
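The two outlier rules as a Python sketch. Quartiles are computed here as the medians of the lower and upper halves of the sorted data, which is an assumption (other quartile conventions give slightly different fences). The data set is invented.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves."""
    values = sorted(data)
    half = len(values) // 2         # the middle value is left out when n is odd
    return median(values[:half]), median(values[-half:])

def outliers(data):
    """Values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1                   # interquartile range
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

# Invented data
data = [2, 4, 5, 6, 7, 8, 9, 30]
print(quartiles(data), outliers(data))
```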
levels and combinations in experiments
treatment
how to word r^2 (linear regression)
a% of the variation in y can be explained by the linear model relating x to y, where a% is r^2 written as a percent.
what numbers can r be
between -1 and 1
systematically favors certain outcomes (prison survey to find prop of people who commit crimes)
bias
finds AT MOST. (number of trials, probability, number wanted)
binomcdf
how to find prob. at most __ outcome
binomcdf ( # trials, probability, outcome wanted)
used when working with total # of trials, not srs
binomials
how to find prob. exactly __ outcome
binompdf(# of trials,probability,outcome wanted)
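For reference, the binomial formula behind the two commands can be sketched in Python. These functions only borrow the calculator names and argument order; the example numbers are arbitrary.

```python
from math import comb

def binompdf(n, p, k):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomcdf(n, p, k):
    """P(at most k successes in n trials): add up exactly 0, 1, ..., k."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

# Arbitrary example: 10 trials, p = 0.5, 3 successes
print(binompdf(10, 0.5, 3), binomcdf(10, 0.5, 3))
```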
grouping experiment subjects by differences (male, female)
blocking
How to determine symmetry in distribution
boxplot
symbol for population size (parameter)
capital N
contacts every individual in population
census
As the sample size increases, the distribution of the sample mean of a randomly selected sample approaches the normal distribution.
central limit theorem
area of a density curve
1
area under density curve
1
what does cumulative relative freq. add up to
1
how to find prob __ or more outcome
1-binomcdf(# trials, prob, one less than the wanted outcome number)
how to find at least one in a binomial on calculator
1-binompdf(# of trials, prob, 0)
How to find at least 1
1-p(0)
how to find prob. at least one
1-p(0), so 1-binompdf(sample, p(success), 0)
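The "or more" and "at least one" cards both use the complement rule. A Python sketch, with stand-in versions of the calculator commands and a hypothetical helper `at_least`:

```python
from math import comb

def binompdf(n, p, k):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomcdf(n, p, k):
    """P(at most k successes in n trials)."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

def at_least(n, p, k):
    """P(k or more successes) = 1 - P(at most k - 1)."""
    return 1 - binomcdf(n, p, k - 1)

# "At least one" is the special case k = 1, i.e. 1 - P(0)
print(at_least(10, 0.5, 1), 1 - binompdf(10, 0.5, 0))
```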
divide population into clusters
cluster sampling
simple random sample: make a list of your population, assign numbers to the list, randomly pick numbers, do not replace. unbiased
SRS
inverse cumulative probability; used to find the X value when the mean, standard deviation, and area to the left of X are known
invNorm
how to find inverse norm (the X value, not the area under the curve)
invNorm(area to the left, mean, st. dev.)
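Python's standard library offers the same lookup through `statistics.NormalDist.inv_cdf`. The wrapper name `inv_norm` and the example mean and standard deviation are made up for illustration.

```python
from statistics import NormalDist

def inv_norm(area, mean=0.0, sd=1.0):
    """X value with the given area (probability) to its left."""
    return NormalDist(mean, sd).inv_cdf(area)

# Made-up example: mean 100, standard deviation 15
print(inv_norm(0.5, 100, 15))       # half the area lies below the mean
print(inv_norm(0.975))              # z-score with 97.5% of the area to its left
```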
how to find z-score
(x-mean) divided by standard deviation
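In Python the formula is one line. The example values (x = 130, mean 100, standard deviation 15) are invented.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Invented example values
print(z_score(130, 100, 15))
print(z_score(85, 100, 15))
```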
how to estimate correlation by hand
+-(1-(length minor axis/length major axis))
how to find mean sub x for a geometric random variable
1/p
states that nearly all values lie within three standard deviations of the mean in a normal distribution
68-95-99.7 rule
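The three percentages can be checked with `statistics.NormalDist` from Python's standard library; the helper name `area_within` is just for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist()           # mean 0, standard deviation 1

def area_within(k):
    """Area within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, area_within(k))
```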
how to find P(A or B) if independent
P(A) + P(B) - P(A and B), where P(A and B) = P(A) x P(B)
how to find interquartile range (IQR)
Q3-Q1
percent of a whole
Relative frequency
mathematical model of a distribution (e.g., the normal dist.); always on or above the x-axis; total area always equals one; the area under the curve and above any range of values equals the proportion of all observations that fall in that range
density curve
countable random variable
discrete
mutually exclusive, can't happen at the same time
disjoint
knowing the outcome of one random event does not change the probability of the other
independence
can you say explanatory variable causes the response variable
no
does equal to matter for continuous?
no
the tendency for a sample to differ from the population because measurements are not obtained from all individuals selected for inclusion in the sample
nonresponse bias
(left endpoint, right endpoint, mean, standard deviation)
normalcdf
calculator function used to find the area under a normal curve
normalcdf
what does the superscript c mean on A or B (complement)?
not that event: P(A^c) = 1 - P(A)
how to find binomial mean
np
how do you know if the sample size is large enough
np >= 10 and n(1-p) >= 10
symbol for the standard deviation of a population (parameter)
σ (sigma; looks like an o with a ponytail)
observes individuals and measures variables of interest but does not attempt to influence the responses
observational study