Stats

Sampling Techniques

1. SRS (Simple Random Sample) - Number the entire population and draw numbers from a hat (every set of n individuals has an equal chance of selection).
2. Stratified - Split the population into homogeneous groups (strata), then select an SRS from each group.
3. Cluster - Split the population into heterogeneous groups called clusters, then randomly select whole clusters for the sample. Ex. Choosing a carton of eggs actually chooses a cluster (group) of eggs.
4. Census - An attempt to reach the entire population.
5. Convenience - Selects the individuals easiest to reach.
6. Voluntary response - People choose themselves by responding to a general appeal.
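The three random methods above can be sketched in a few lines of Python. This is only an illustration: the egg population, the stratum names, the carton size, and the function names are all made up for the example, and the cluster version assumes whole clusters are picked at random.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every set of n individuals has the same chance of selection."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate SRS of n_per_stratum from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, k, rng):
    """Randomly choose k whole clusters and keep every member of each."""
    chosen = rng.sample(clusters, k)
    return [member for cluster in chosen for member in cluster]

rng = random.Random(2024)   # fixed seed only so the sketch is repeatable
eggs = list(range(1, 25))   # hypothetical population of 24 numbered eggs
strata = {"brown": eggs[:12], "white": eggs[12:]}    # hypothetical strata
cartons = [eggs[i:i + 6] for i in range(0, 24, 6)]   # four cartons of six

print(simple_random_sample(eggs, 4, rng))
print(stratified_sample(strata, 2, rng))
print(cluster_sample(cartons, 2, rng))
```

Note that the stratified sample always contains eggs from both strata, while the cluster sample contains every egg from the chosen cartons and none from the others.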

Experimental Designs

1. CRD (Completely Randomized Design) - All experimental units are allocated at random among all treatments.
2. RBD (Randomized Block Design) - Experimental units are put into homogeneous blocks. The random assignment of the units to the treatments is carried out separately within each block.
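A minimal sketch of the two designs, assuming equal group sizes and using made-up plot labels and treatment names. The only difference between the designs is where the shuffling happens: once over all units (CRD), or separately inside each block (RBD).

```python
import random

def completely_randomized(units, treatments, rng):
    """CRD: shuffle all units, then deal them out evenly across treatments."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}

def randomized_block(blocks, treatments, rng):
    """RBD: run a separate random assignment inside each block."""
    assignment = {}
    for block_units in blocks.values():
        assignment.update(completely_randomized(block_units, treatments, rng))
    return assignment

rng = random.Random(7)  # fixed seed only so the sketch is repeatable
treatments = ["new fertilizer", "control"]          # hypothetical treatments
blocks = {                                          # hypothetical blocks
    "sunny": ["s1", "s2", "s3", "s4"],
    "shady": ["h1", "h2", "h3", "h4"],
}
all_units = [unit for units in blocks.values() for unit in units]

crd = completely_randomized(all_units, treatments, rng)
rbd = randomized_block(blocks, treatments, rng)
print(crd)
print(rbd)
```

In the RBD result every block is guaranteed to have two plots on each treatment; the CRD result only guarantees four plots per treatment overall.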

Interpreting a Residual Plot

1. Is there a curved pattern? If so, a linear model may not be appropriate.
2. Are the residuals small in size? If so, predictions using the linear model will be fairly precise.
3. Is there increasing (or decreasing) spread? If so, predictions for larger (smaller) values of x will be more variable.

Linear Transformations

Adding "a" to every member of a data set adds "a" to the measures of position, but does not change the measures of spread or the shape. Multiplying every member of a data set by "b" multiplies the measures of position by "b" and multiplies most measures of spread by abs(b), but does not change the shape.
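A quick numeric check of both rules, using a small made-up data set and arbitrary values a = 10 and b = -3. The sample standard deviation stands in for "measures of spread" and the mean and median for "measures of position".

```python
import statistics

data = [2, 4, 4, 6, 9]   # hypothetical data set
a, b = 10, -3            # arbitrary shift and scale factor

shifted = [x + a for x in data]
scaled = [x * b for x in data]

for label, values in [("original", data), ("plus a", shifted), ("times b", scaled)]:
    print(label,
          "mean:", statistics.mean(values),
          "median:", statistics.median(values),
          "stdev:", round(statistics.stdev(values), 4))
```

Adding a moves the mean and median by 10 and leaves the standard deviation alone; multiplying by b multiplies the mean and median by -3 but the standard deviation by 3, since spread cannot be negative.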

Interpret r

Correlation measures the strength and direction of the linear relationship between x and y.
- r is always between -1 and 1
- close to zero = very weak
- close to 1 or -1 = stronger
- exactly 1 or -1 = perfectly straight line
- positive r = positive correlation
- negative r = negative correlation
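As an illustration, r can be computed directly from its definition. The function name and the data sets below are made up for the example.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

print(correlation([1, 2, 3, 4], [3, 5, 7, 9]))        # points on a rising line
print(correlation([1, 2, 3, 4], [9, 7, 5, 3]))        # points on a falling line
print(correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))  # positive but scattered
```

The first two data sets lie exactly on a line, so r is 1 and -1 (up to rounding); the third gives a positive r below 1.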

Interpret LSRL Slope "b"

For every one-unit increase in the x variable (context), the y variable (context) is predicted to increase/decrease by __ units (context).

Interpret Standard Deviation

Standard Deviation measures spread by giving the "typical" or "average" distance that the observations (context) are away from their (context) mean

Advantage of using a Stratified Random Sample over an SRS

Stratified random sampling guarantees that each of the strata will be represented. When strata are chosen properly, a stratified random sample will produce better (less variable/more precise) information than an SRS of the same size.

Goal of Blocking / Benefit of Blocking

The goal of blocking is to create groups of homogeneous experimental units. The benefit of blocking is the reduction of the effect of variation within the experimental units. (context)

Outlier Rule

IQR = Q3 - Q1
Upper Bound = Q3 + 1.5(IQR)
Lower Bound = Q1 - 1.5(IQR)
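A small sketch of the rule in Python. One assumption to flag: quartiles are computed here as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd); other quartile conventions exist and can give slightly different bounds. The data set is made up.

```python
import statistics

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[half:] if len(ordered) % 2 == 0 else ordered[half + 1:]
    return statistics.median(lower), statistics.median(upper)

def outlier_bounds(values):
    """Return (lower bound, upper bound) from the 1.5(IQR) rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values):
    """Values that fall below the lower bound or above the upper bound."""
    lower, upper = outlier_bounds(values)
    return [v for v in values if v < lower or v > upper]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]   # hypothetical data
print(outlier_bounds(data))   # (-4.5, 15.5)
print(find_outliers(data))    # [30]
```

Here Q1 = 3 and Q3 = 8, so IQR = 5 and the bounds are 3 - 7.5 and 8 + 7.5; only 30 falls outside them.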

What is an Outlier?

One-variable data: An outlier is any value that falls more than 1.5(IQR) above Q3 or below Q1.
Regression outlier: Any value that falls outside the pattern of the rest of the data.

Interpret LSRL y-intercept "a"

When the x variable (context) is zero, the y variable (context) is estimated to be (put value here).

Interpret LSRL "^y"

^y is the "estimated" or "predicted" y-value (context) for a given x-value (context)

Interpret r^2

__% of the variation in y (context) is accounted for by the LSRL of y (context) on x (context). OR __% of the variation in y (context) is accounted for by using the linear regression model with x (context) as the explanatory variable
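To make the "variation accounted for" idea concrete, here is a sketch that computes r^2 as 1 - SSE/SST, where SSE is the sum of squared residuals from the LSRL and SST is the sum of squared deviations of y from its mean. The function names and the data are made up for the example.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line ^y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

def r_squared(xs, ys):
    """Fraction of the variation in y accounted for by the LSRL on x."""
    a, b = fit_lsrl(xs, ys)
    mean_y = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # left over
    sst = sum((y - mean_y) ** 2 for y in ys)                   # total
    return 1 - sse / sst

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]   # hypothetical response values
print(round(r_squared(xs, ys), 4))   # 0.6
```

For this data SST = 6 and SSE = 2.4, so about 60% of the variation in y is accounted for by the LSRL of y on x.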

Interpret LSRL "s"

s = ___ is the standard deviation of the residuals. It measures the typical distance between the actual y-values (context) and their predicted y-values (context).
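A sketch of how s could be computed by hand, assuming the usual regression formula s = sqrt(SSE / (n - 2)), which the card itself does not state. The function name and data are made up.

```python
import math

def residual_std_dev(xs, ys):
    """s = sqrt(sum of squared residuals / (n - 2)) for the LSRL of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]   # hypothetical response values
print(round(residual_std_dev(xs, ys), 4))
```

The result is in the units of y, so it reads directly as a typical prediction error.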

Interpret a z-score

z = (value - mean) / standard deviation
A z-score describes how many standard deviations a value or statistic (x, x_bar, ^p, etc.) falls away from the mean of the distribution and in what direction. The further the z-score is from zero, the more "surprising" the value of the statistic is.
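The formula translates directly into code. The test scores below are made-up numbers used only to show the sign of the result.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / standard_deviation

# Hypothetical test scores with mean 70 and standard deviation 10.
print(z_score(85, 70, 10))   # 1.5  (1.5 standard deviations above the mean)
print(z_score(55, 70, 10))   # -1.5 (1.5 standard deviations below the mean)
```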

Using Normalcdf and invnorm (Calculator Tips)

Normalcdf (min, max, mean, standard deviation)
Invnorm (area to the left as a decimal, mean, standard deviation)
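For readers without the calculator, the same two operations can be sketched with `NormalDist` from Python's standard library. The wrapper names `normalcdf` and `invnorm` simply mirror the calculator commands; they are not built-in Python functions.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

def invnorm(area_left, mean=0.0, sd=1.0):
    """Value that has the given area (as a decimal) to its left."""
    return NormalDist(mean, sd).inv_cdf(area_left)

print(round(normalcdf(-1, 1), 4))           # area within 1 SD of the mean
print(round(invnorm(0.975), 2))             # cutoff with 97.5% of the area to its left
print(round(normalcdf(85, 100, 70, 10), 4)) # hypothetical scores: mean 70, SD 10
```

The first two lines reproduce familiar values: roughly 68% of the area lies within one standard deviation of the mean, and the 97.5th percentile of the standard normal is about 1.96.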

What is a Residual?

Residual = y - ^y
A residual measures the difference between the actual (observed) y-value in a scatterplot and the y-value that is predicted by the LSRL using its corresponding x value.
In the calculator: L3 = L2 - Y1(L1)
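The calculator recipe can be mirrored in Python: `xs` plays the role of L1, `ys` the role of L2, and the returned list the role of L3. The function name and the data are made up for the example.

```python
def residuals(xs, ys):
    """Return y - ^y for each point, where ^y comes from the LSRL of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return [y - (a + b * x) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4, 5]   # hypothetical x-values (L1)
ys = [2, 4, 5, 4, 5]   # hypothetical y-values (L2)
print([round(res, 2) for res in residuals(xs, ys)])
```

A positive residual means the point sits above the line (the model under-predicted); a negative residual means it sits below. Residuals from an LSRL always sum to zero, apart from rounding.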

Interpret LSRL "SEb"

SEb measures the standard deviation of the estimated slope for predicting the y variable (context) from the x variable (context). SEb measures how far the estimated slope will be from the true slope, on average.
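A hedged sketch of where this number comes from, assuming the standard formula SEb = s / sqrt(sum of (x - x_bar)^2), with s the standard deviation of the residuals. The card does not give this formula, and in practice the value is usually read off computer output. The function name and data are made up.

```python
import math

def slope_standard_error(xs, ys):
    """SEb = s / sqrt(Sxx), where s = sqrt(SSE / (n - 2))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    return s / math.sqrt(sxx)

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]   # hypothetical response values
print(round(slope_standard_error(xs, ys), 4))
```

Spreading the x-values out more (a larger sum of squared deviations) or reducing the scatter about the line (a smaller s) both make SEb smaller, meaning the slope is estimated more precisely.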

Describe the Distributions OR Compare the Distribution

SOCS! Shape, Outliers, Center, Spread. Only discuss outliers if there are obviously outliers present. Be sure to address SCS in context! If it says "Compare", YOU MUST USE comparison phrases like "is greater than" or "is less than" for Center and Spread.

SOCS

Shape - Skewed left (Mean < Median), skewed right (Mean > Median), or fairly symmetric (Mean ~= Median)
Outliers - Discuss them if there are obvious ones
Center - Mean or Median
Spread - Range, IQR, or Standard Deviation
Note: Also be on the lookout for gaps, clusters, or other unusual features of the data set. Make observations!
