STATS EXAM 1
p-value
to look at the statistical significance for correlation, check the _________
false (could be an identifier, etc.)
(t/f) a variable that consists of numbers is always quantitative variable
true
(t/f) outliers can make a weak correlation strong
false
(t/f): A p-value for correlation which is statistically significant implies the correlation is due to random chance
false
(t/f): If the correlation coefficient (r) of two variables is close to 0, this implies there is no relationship between these two variables.
false (we do not visualize identifiers)
(t/f): a bar chart is a good way to visualize an identifier variable
true
(t/f): a strong relationship can have correlation equal to 0
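A quick check of the card above, as a sketch with made-up data (not from the exam): y is completely determined by x through y = x^2, yet the correlation coefficient comes out as 0 because the relationship is not linear. The helper name pearson_r is just an illustrative choice.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: a perfect curved (quadratic) relationship, symmetric around 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(r_curved)  # essentially 0 even though y depends entirely on x
```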
false
(t/f): an ordinal number is a quantitative variable that has been sorted in ascending or descending order
false
(t/f): outliers can dramatically impact the IQR and median
true
(t/f): quantitative variables must be numerical
false
(t/f): the correlation coefficient (r) is not affected by outliers
true
(t/f): there is sometimes a good reason to remove outliers
linear; non-linear
Correlation Coefficient is a measure of ________ association between two variables. Two variables could have a significant ____________ relationship.
N(98.2, 0.7). The area outside 97.14 and 99.26 is 0.13, and by symmetry it is split evenly between the two tails: 0.13 / 2 = 0.065. Percent higher than 99.26°F = 6.5%
In a 1992 article in the Journal of the American Medical Association, researchers reported that a more accurate average body temperature for adults is 98.2°F with a standard deviation of 0.7°F. Assume that a Normal model is appropriate N(98.2, 0.7). The area outside (shaded) of 97.14°F and 99.26°F is 0.13. What percent of adults have body temperatures higher than 99.26°F?
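One way to double-check the tail-area card above, sketched with Python's standard-library NormalDist (variable names here are illustrative): the symmetry shortcut and the direct Normal calculation should agree to about three decimal places.

```python
from statistics import NormalDist

# Normal model from the card: mean 98.2, standard deviation 0.7.
body_temp = NormalDist(mu=98.2, sigma=0.7)

# Shortcut from the card: the 0.13 outside area splits evenly into two tails.
upper_tail_shortcut = 0.13 / 2

# Direct check: area above 99.26 under the Normal curve.
upper_tail_direct = 1 - body_temp.cdf(99.26)

print(upper_tail_shortcut, round(upper_tail_direct, 3))
```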
N(98.2, 0.7). The 80th percentile has 80% below it (equivalently, 20% above). The middle 60% leaves 20% in each tail, so the 80th percentile sits at z = 0.842. (X - 98.2) / 0.7 = 0.842; X - 98.2 = 0.5894; X = 98.7894. Rounded to one decimal place, 98.8°F is the adult body temperature at the 80th percentile
In a 1992 article in the Journal of the American Medical Association, researchers reported that a more accurate average body temperature for adults is 98.2°F with a standard deviation of 0.7°F. Assume that a Normal model is appropriate N(98.2, 0.7). The middle 60% of N(0,1) is between -0.842 and 0.842. What adult body temperature is at the 80th percentile? (round to one decimal place)
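The percentile card above can be checked the same way. This is a sketch using the standard-library NormalDist; it un-standardizes the z-score by hand and also asks the model for the 80th percentile directly.

```python
from statistics import NormalDist

mean, sd = 98.2, 0.7

# z-score with 80% of the standard Normal below it (about 0.842).
z_80 = NormalDist().inv_cdf(0.80)

# Un-standardize: X = mean + z * sd, the same algebra as on the card.
temp_80_by_hand = mean + z_80 * sd

# Or ask the N(98.2, 0.7) model for its 80th percentile directly.
temp_80_direct = NormalDist(mu=mean, sigma=sd).inv_cdf(0.80)

print(round(temp_80_by_hand, 1))  # 98.8
```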
SAMPLES give you STATISTICS to help you understand PARAMETERS about POPULATION
SSPP
N(470, 24.4), your box: 442. z = (442 - 470) / 24.4 = -28 / 24.4 = -1.1475, so z-score ≈ -1.15
The Chocolataboom Candy Company manufactures many different types of candies. Their most popular Valentine's Day box of candy follows a Normal Distribution with a mean of 470 grams and a standard deviation of 24.4 grams. You decide to buy a box of candy and weigh it. Your box of candy weighs 442 grams. Calculate the z-score for your box of candy.
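The z-score formula from the candy card, z = (value - mean) / standard deviation, written out as a small sketch (the function name z_score is an illustrative choice):

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Candy box from the card: mean 470 g, standard deviation 24.4 g, your box 442 g.
z_box = z_score(442, 470, 24.4)

print(round(z_box, 2))  # -1.15
```

A negative z-score means the box is lighter than average, here a bit more than one standard deviation below the mean.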
identifier
UTK uses your student ID to track you in its database. This is an ______ type of variable
absolute value
__________ tells strength of correlation
A z-score of -1 is 70 (1 standard deviation below the mean: 75 - 5 = 70). Between z = -1 and z = 1 is 68%, and each tail outside that holds 16%. So 68 + 16 = 84, OR 50 + 34 = 84, OR 100 - 16 = 84. 84% score above 70 (z = -1)
scores on a STATS 201 test are normally distributed with a mean of 75 and a standard deviation of 5. what percent score above 70?
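The 68-95-99.7 shortcut used on the card above gives 84%. As a sketch, the standard-library NormalDist gives the unrounded Normal-model figure for comparison (variable names are illustrative):

```python
from statistics import NormalDist

# Test-score model from the card: mean 75, standard deviation 5.
scores = NormalDist(mu=75, sigma=5)

# Empirical-rule shortcut: 68% within one SD, plus the 16% upper tail.
above_70_rule = 0.68 + 0.16

# Normal-model area above 70 (one standard deviation below the mean).
above_70_model = 1 - scores.cdf(70)

print(round(above_70_rule, 2), round(above_70_model, 4))
```

The two values differ only because 68% is itself a rounded figure.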
nominal categorical variable
a categorical variable in which the categories do not have a natural order
ordinal categorical variable
a categorical variable in which the categories have a natural order (category order)
scatterplots
all timeplots are ____
IQR
best measure of spread for skewed right distribution
nominal or ordinal
categorical variables are either called ______ or _______
mean
center for normal data
median
center for skewed data
sample
collected/used to make inferences about a population
2 quantitative variables, straight enough, no outliers
correlation conditions:
x and y
correlation measures the strength of the linear relationship between ________
linear relationship
correlation of 0 means there may still be a relationship between the variables, but no ____________ _________
perfect linear
correlation of 1
shape, center, spread
describe the overall pattern of the distribution of a quantitative variable (univariate)
scatterplots and timeplots
display bivariate quantitative data
pie chart and bar chart (interchangeable)
graphic can be used for univariate categorical variables
quantitative
histogram variable type
the distribution follows the line
how does a normal probability plot determine if a distribution is normal?
unique
identifiers must be
bigger
if the data are right skewed, the MEAN is ________ than the median
not due to pure chance
if this p-value falls below 0.05, we conclude that the linear relationship we are seeing is:
discrete
values have to be whole numbers
correlation
outliers can do anything when it comes to ______
true
outliers can make a strong correlation weak. outliers can also make the slope of a negative line positive
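To see the card above in action, here is a sketch with made-up data (not from the exam): five points on a perfectly negative line, then the same points plus one extreme outlier. The helper names pearson_r and slope are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Made-up data: a perfect negative linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [5, 4, 3, 2, 1]
r_clean = pearson_r(xs, ys)
slope_clean = slope(xs, ys)

# Same data with a single extreme point added at (20, 20).
xs_out = xs + [20]
ys_out = ys + [20]
r_outlier = pearson_r(xs_out, ys_out)
slope_outlier = slope(xs_out, ys_out)

print(round(r_clean, 2), round(r_outlier, 2))
print(round(slope_clean, 2), round(slope_outlier, 2))
```

One point is enough to flip both the correlation and the slope from negative to positive, which is why the "no outliers" condition is checked on a scatterplot before trusting r.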
median and IQR
outliers do not impact _______
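A sketch of why median and IQR are called resistant, using made-up data: the largest value is mistyped as 200 instead of 20, and the summaries are compared. Quartiles here come from statistics.quantiles with its default method, which may differ slightly from a textbook's or calculator's quartile rule.

```python
from statistics import mean, median, quantiles, stdev

def iqr(data):
    """Interquartile range, Q3 - Q1, using statistics.quantiles defaults."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1

# Made-up data, and the same data with the largest value replaced by an outlier.
clean = [10, 12, 13, 14, 15, 16, 17, 18, 20]
with_outlier = [10, 12, 13, 14, 15, 16, 17, 18, 200]

for label, data in (("clean", clean), ("outlier", with_outlier)):
    print(label, mean(data), round(stdev(data), 1), median(data), iqr(data))
```

The mean and standard deviation jump, while the median and IQR do not move at all in this example.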
not significant (no evidence of linear correlation)
p-value greater than (>) 0.05
significant (correlation)
p-value less than (<) 0.05
categorical
pareto chart variable type
categorical
pie chart variable type
identifiers
special type of categorical variable
standard deviation
spread for normal data
IQR
spread for skewed data
quantitative
stem and leaf variable type
population
the group of individuals or things we want to understand
direction, form, strength, unusual features
ways to describe bivariate quantitative data: (4 things)
0
weakest correlation strength
mosaic plots and contingency tables
what graphics can be used for bivariate categorical variables?
the plot doesn't thicken
what is NOT a condition for calculating correlation
mean
what is more impacted by skew? (mean or median)
how much
what phrase helps identify quantitative variables?