STA 2023 Exam 1

Probability

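=(UV-LV)/(Max-Min) for a uniformly distributed RV, where LV and UV are the lower and upper values of the interval of interest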

Expected

=mean=average

Multiplicative Rule

P(A and B) = P(A) x P(B), valid only when A and B are independent

Additive rule for disjoint events

P(A or B) = P(A) + P(B)

Additive rule

P(A or B) = P(A) + P(B) - P(A and B)

Conditional probabilities

P(A|B) = P(A and B)/P(B) or P(B|A) = P(A and B)/P(A)

Complement rule

P(not A) = 1-P(A)
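
The rules above can be checked on a small example. This is a minimal sketch assuming one roll of a fair six-sided die; the events A and B and the helper `prob` are illustrative choices, not part of the original cards.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (an illustrative assumption).
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event as favorable outcomes over total outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3

p_a_and_b = prob(A & B)
p_a_or_b = prob(A) + prob(B) - p_a_and_b   # additive rule
p_a_given_b = p_a_and_b / prob(B)          # conditional probability P(A|B)
p_not_a = 1 - prob(A)                      # complement rule

# A and B are not independent here, so the multiplicative rule does not apply.
independent = prob(A) * prob(B) == p_a_and_b
```

Here A and B overlap, so the general additive rule is needed; the disjoint version would double count the outcomes 4 and 6.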

Random variable

a variable that represents the numerical outcome(s) of a random phenomenon

Outliers

any observations that are significantly far away from the rest of the data points

Categorical

bar charts (bars do not touch) and pie charts

Normal

bell curve, mean=median=mode

Discrete RV's

binomial and non binomial, RV's which assume a finite number of outcomes, countable, probability of an exact value can be computed

Exact

binompdf (n,p,x)
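
For reference, the exact probability that a calculator's binompdf(n,p,x) returns can be sketched from the binomial formula. The function name `binom_pmf` and the example values of n and p are assumptions made for illustration.

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ B(n, p): choose which x trials succeed, then multiply."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example: exactly 3 successes in 10 trials with success probability 0.5.
exactly_three = binom_pmf(10, 0.5, 3)

# The probabilities of all possible counts 0..n add up to 1.
total = sum(binom_pmf(10, 0.5, k) for k in range(11))
```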

Graphical

categorical and quantitative

Numerically

center and spread

r2

coefficient of determination, percent of variation in y that is explained by x, between 0% and 100%, 1-r2= the fraction of the variation in y that is NOT explained by x, larger r2 is better

Binomial RV's

collection of yes/no (binary) outcomes

r

correlation coefficient, sign of the slope, measures strength and direction, between -1 and 1, not affected by units of x and y
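
A short sketch of how r and r2 can be computed by hand, assuming a small made-up data set; the helper name `correlation` and the data values are illustrative only.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_squared = r ** 2   # fraction of the variation in y explained by x

# Changing the units of x (for example, meters to centimeters) leaves r unchanged.
r_other_units = correlation([100 * x for x in xs], ys)
```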

Discrete

countable

Quantitative

data that is described using numbers, can be averaged

Categorical

data that is described using words or categories, qualitative

Probabilities

determined by the proportion of times the event(s) will occur in a long series of independent trials (law of large numbers)

Less than x times

doesn't include x

More than x times

doesn't include x

At least 1

equals 1 - none

Quantitative

histograms (bars usually touch), stem plots, box plots, and dot plots

At least x times

includes x

At most x times

includes x
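
The wording cards (less than, more than, at least, at most) translate into cumulative binomial probabilities. The sketch below assumes X ~ B(10, 0.3) and x = 4 purely for illustration; `binom_pmf` and `binom_cdf` are hypothetical helper names, with `binom_cdf` playing the role of a calculator's cumulative binomial function.

```python
from math import comb

def binom_pmf(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(n, p, x):
    """P(X <= x); evaluates to 0 when x is negative."""
    return sum(binom_pmf(n, p, k) for k in range(x + 1))

n, p, x = 10, 0.3, 4

less_than = binom_cdf(n, p, x - 1)       # fewer than x: x is not included
at_most = binom_cdf(n, p, x)             # at most x: x is included
more_than = 1 - binom_cdf(n, p, x)       # more than x: x is not included
at_least = 1 - binom_cdf(n, p, x - 1)    # at least x: x is included
at_least_one = 1 - binom_pmf(n, p, 0)    # at least 1 = 1 - P(none)
```

Note that "less than x" and "at least x" are complements, as are "at most x" and "more than x".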

IQR

interquartile range, Q3-Q1 where Q3= 75th percentile (the median of the 'top' half) and Q1= 25th percentile (the median of the 'bottom' half), gives the spread of the central (middle) 50% of the data set
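
A minimal sketch of the "median of each half" definition, assuming the overall median is left out of both halves when n is odd (textbooks and calculators differ on this convention) and that the data set has at least two values. The function name `iqr` is illustrative.

```python
from statistics import median

def iqr(data):
    """Interquartile range using the median-of-halves definition of Q1 and Q3."""
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]                  # bottom half
    upper = values[len(values) - half:]    # top half (skips the median if n is odd)
    return median(upper) - median(lower)

odd_example = iqr([1, 3, 5, 7, 9, 11, 13])   # Q1 = 3, Q3 = 11
even_example = iqr([1, 2, 3, 4])             # Q1 = 1.5, Q3 = 3.5
```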

Non binomial RV's

many types (distinguishing specific type is not required)

Center

mean, median, mode

Skewed right

mean > median > mode

Skewed left

mean < median < mode

Bimodal

mean=median

Rectangular/uniform

mean=median

Bell shaped with an outlier

mean>median

Continuous

measurable

Range

max - min, the measure of spread that is affected most by outliers

Least square regression line

minimizes the sum of the residuals squared

Disjoint

mutually exclusive, cannot occur together

Binomial distribution

each observation is binary, probability of success is constant, number of observations n is fixed, the n observations are independent, X~B(n,p)

Independent

occurrence of one does not affect the probability of the other

Uniformly distributed RV's

outcomes are equally likely

Value

probability, normalcdf(LV,UV,mean,SD), for z use 0 as mean and 1 as SD
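
The calculator call normalcdf(LV,UV,mean,SD) can be mirrored with Python's standard library. This is a sketch, not the calculator's actual implementation; the function name and default arguments are assumptions.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

# For z scores, leave mean = 0 and SD = 1.
within_one_sd = normalcdf(-1, 1)

# Same area stated in original units (mean 100, SD 15 are made-up values).
same_area = normalcdf(85, 115, mean=100, sd=15)
```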

Spread

range, variance, standard deviation, IQR

Standard deviation

represents the 'typical' distance from the mean, in units of data, measure of spread that is smaller for distributions where the points are clustered around the middle, cannot be negative

Variance

represents the 'typical' squared distance from the mean, in squared units, measure of spread around the mean, but its units are not the same as those of the data points
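
A small sketch tying variance and standard deviation together, assuming the population versions (dividing by n); a course may instead use the sample versions, which divide by n-1. The data values are made up.

```python
from statistics import mean, pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(data)
deviations = [x - center for x in data]   # signed distances from the mean

variance = pvariance(data)   # average squared deviation, in squared units
sd = pstdev(data)            # square root of the variance, in units of the data
```

The deviations sum to zero, which is why they are squared before averaging.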

Normally distributed RV's

should be told the population is normal, z score can be used

b

slope, as x increases by 1, y is predicted to increase/decrease by b

Mean

the 'average', the balancing point, deviations of the data points from the mean always add up to zero

Sensitivity

the condition is correctly determined to exist in the subject

Specificity

the condition is correctly determined to not exist in the subject

False positive

the condition is incorrectly determined to exist in the subject

False negative

the condition is incorrectly determined to not exist in the subject

Median

the middle ordered value, the 50th percentile, falls in the (n+1)/2 position, about 50% of the observations fall on either side of it, not very sensitive to outliers (robust)

Mode

the most frequently occurring number, the measure of center that represents the most common observation or class of observations

Statistics

the science of collecting, analyzing, interpreting, and presenting data

Non binomial

three or more outcomes, mean

Binomial

two outcomes, mean and SD

Continuous RV's

uniformly distributed and normally distributed, RV's which assume an infinite number of outcomes, measurable, probability of any exact value cannot be computed, probability is 0 (move on)

%

value, invNorm(area to the left,mean,SD), for z use 0 as mean and 1 as SD
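
Going the other way, from an area to a value, invNorm(area to the left,mean,SD) can be sketched with the standard library's inverse CDF. The function name `inv_norm` and the example numbers are assumptions for illustration.

```python
from statistics import NormalDist

def inv_norm(area_left, mean=0.0, sd=1.0):
    """Value with the given area (probability) to its left under the normal curve."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(area_left)

# For z scores, leave mean = 0 and SD = 1.
z_975 = inv_norm(0.975)

# The 50th percentile of any normal distribution is its mean.
middle = inv_norm(0.5, mean=100, sd=15)
```

The area must be strictly between 0 and 1; `inv_cdf` raises an error otherwise.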

Residuals

vertical distance between the points and the line, sum to 0, error=actual-predicted

a

y intercept, when x=0, y is predicted to be a
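
The regression cards (a, b, residuals, least squares line) fit together as in the sketch below. The data values and the helper name `least_squares` are made up for illustration.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimizes the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y intercept: predicted y when x = 0
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]   # actual - predicted
```

For this data the line is roughly y = 2.2 + 0.6x, and the residuals sum to zero up to floating-point rounding.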

