Stats W21 Midterm

Set theory inequality (the first and last inequalities become equalities if A and B are disjoint)

0 <= P(AB) <= P(AUB) <= P(A) + P(B)

Axioms of Probability

1) Chances are always at least zero: for any event A, P(A) >= 0. 2) The chance that something happens is 100%: P(S) = 100%. 3) If two events cannot both occur at the same time (they are disjoint or mutually exclusive), the chance that either one occurs is the sum of the chances of each: if AB = {}, P(A U B) = P(A) + P(B).

Correlation coefficient (r)

= (X1*Y1 + X2*Y2 + ... + Xn*Yn)/n, where X and Y are in standard units. Measures how nearly the data fall on a straight line (a nonlinear curve is a bad summary of association; even if the association is strong, if it is nonlinear, r can be small or 0). If all the points fall on a straight line with positive slope, r = 1 (negative slope, r = -1). Even if two variables are perfectly correlated, that does not mean there is a causal connection. Always -1 <= r <= 1 (r < 0 for a negative-slope cluster, r > 0 for a positive-slope cluster); r is near 0 if the data do not cluster along a straight line.
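
A minimal sketch in plain Python (hypothetical data, not from the course) of this formula: convert each list to standard units and average the products.

import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]           # hypothetical data
y = [2.1, 3.9, 6.2, 8.1, 9.8]

def standard_units(values):
    # convert a list to standard units: (value - mean) / SD (divide by n)
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

xs, ys = standard_units(x), standard_units(y)
r = sum(a * b for a, b in zip(xs, ys)) / len(xs)   # average of products of SUs
print(round(r, 4))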

SU

= (original value - mean)/SD, or for a random variable, (X - E(X))/SE(X); a list converted to standard units has new mean 0 and new SD 1

Original value

= (value in SU)*SD + mean. To find a normal approximation: convert the endpoints to SUs and find the area under the normal curve between those two points. A secular trend is a linear association (trend) with time. Not a good approximation if (1) the association is nonlinear, (2) the data are heteroscedastic, or (3) there are outliers.
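
A minimal sketch in plain Python (hypothetical mean, SD, and endpoints) of a normal approximation: convert the endpoints to standard units and take the area under the standard normal curve between them.

import math

def normal_cdf(z):
    # area under the standard normal curve to the left of z
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean, sd = 100.0, 15.0          # hypothetical mean and SD
lo, hi = 85.0, 115.0            # hypothetical endpoints

z_lo = (lo - mean) / sd         # convert endpoints to standard units
z_hi = (hi - mean) / sd
approx_chance = normal_cdf(z_hi) - normal_cdf(z_lo)
print(round(approx_chance, 4))  # about 0.6827 for +/- 1 SD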

Expected value of Geometric Distribution

= 1/p

IQR

= 75th percentile - 25th percentile = range of the middle 50% of the data; resistant/insensitive to extremes. IQR = 0 if the 25th and 75th percentiles are equal (for example, when at least the middle half of the numbers in the list are the same value).

A & B are independent if P(AB) =

= P(A) * P(B)

P(AUB)

= P(A) + P(B) - P(AB)

Conditional probability of A given B

= P(A|B) = P(AB)/P(B) = P(B|A)*P(A) / (P(B|A)*P(A) + P(B|A^c)*P(A^c)) (Bayes' rule) = (P(ABC) + P(ABC^c))/P(B) (partitioning AB on C)
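
A minimal sketch in plain Python (hypothetical rates) checking that P(AB)/P(B) and Bayes' rule give the same conditional probability.

# hypothetical numbers: A = "has condition", B = "test is positive"
p_A = 0.01                      # P(A)
p_B_given_A = 0.95              # P(B|A)
p_B_given_notA = 0.10           # P(B|A^c)

p_AB = p_B_given_A * p_A                                  # P(AB)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)      # total probability
print(round(p_AB / p_B, 4))                               # P(A|B) = P(AB)/P(B)
print(round(p_B_given_A * p_A / p_B, 4))                  # same value via Bayes' rule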

Probability of Geometric Distribution

= P(X = x) = ((1-p)^(x-1))*p

Point of averages (red square)

= a measure of the center of a scatterplot (mean(x), mean(y))

Vertical residual

= the difference between the value of Y and the height of the regression line: (measured Y) - (estimated Y)

Extrapolation

= estimating the value of Y using a value of X bigger/smaller than any observed

Expected value of Hypergeometric Distribution

= n(G/N)

Probability of binomial distribution

= nCk(p)^k(1-p)^(n-k)
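
A minimal sketch in plain Python (hypothetical n and p) of this formula, with checks that the probabilities sum to 1 and the EV equals n*p.

from math import comb

n, p = 10, 0.3                      # hypothetical number of trials and chance of success

def binomial_pmf(k):
    # chance of exactly k successes in n independent trials
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

print(round(binomial_pmf(3), 4))                                 # P(X = 3) ~ 0.2668
print(round(sum(binomial_pmf(k) for k in range(n + 1)), 6))      # probabilities sum to 1
print(round(sum(k * binomial_pmf(k) for k in range(n + 1)), 6))  # EV = n*p = 3.0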

Regressing Y on X -> (predicted Y in SU)

= r * (measured X in SU)

Regressing X on Y -> (predicted X in SU)

= r * (measured Y in SU)

Slope of regression line

= r*SDy/SDx, where |r| <= 1; the regression line for regressing Y on X is not as steep as the SD line when |r| < 1 (and SDx, SDy > 0).

Standard deviation

= sqrt((sum of (x - mean)^2)/n). If mean = 0, SD = RMS. SD = 0 -> all numbers in the list are equal, so IQR = 0 and range = 0.

Root mean square (rms)

= sqrt((sum of x^2)/(number of entries))
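
A minimal sketch in plain Python (hypothetical list) tying the two formulas above together: the SD is the rms of the deviations from the mean.

import math

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical list

def rms(values):
    # root mean square: sqrt of the average of the squares
    return math.sqrt(sum(v ** 2 for v in values) / len(values))

mean = sum(data) / len(data)
sd = rms([v - mean for v in data])   # SD = rms of deviations from the mean
print(round(sd, 4))                  # 2.0 for this list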

Standard error of Geometric Distribution

= sqrt(1-p)/p

Exhaust

A collection is exhaustive of A if every element of A is in at least one of the sets

Regression Line

Passes through the point of averages; the line for which the rms of the vertical residuals is smallest. Independent variable = the variable that is regressed upon (x-axis); dependent variable = the variable being regressed (y-axis).

Common Fallacies of Relevance

Positively Relevant; Ad Hominem (personal attack); Bad Motive; Tu Quoque (look who's talking); Two Wrongs Make a Right; Ad Misericordiam (appeal to pity); Ad Populum (bandwagon: it is moral because it is common, not everyone can be wrong); Straw Man; Red Herring; Equivocation; Ad Baculum

Equally likely outcomes

Probability assignments depend on the assertion that no particular outcome is preferred over any other by Nature. Probability of each outcome is 100%/(n possible outcomes). Relies on natural symmetries

Frequency theory

Probability is the limit of the relative frequency with which an event occurs in repeated trials; repeat the trials enough times under ideal conditions and the percentage of the time the event occurs will converge to a limit

Types of data

Qualitative data = categorical (gender, zip code, type of climate) or ordinal (hot, warm, cold). Quantitative data = discrete (countable, ex: annual number of sunny days) or continuous (no minimum spacing between the values, ex: temperature).

Geometric Distribution

The number of random draws with replacement from a 0-1 box until the first time a ticket labeled "1" is drawn is a random variable with a geometric distribution with parameter p=G/N, where G is the number of tickets labeled "1" in the box and N is the total number of tickets in the box
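
A minimal sketch in plain Python (hypothetical p) of this distribution: the pmf (1-p)^(x-1)*p, with numerical checks of EV = 1/p and SE = sqrt(1-p)/p.

import math

p = 0.25                                   # hypothetical chance of drawing a "1"

def geometric_pmf(x, p):
    # chance the first "1" appears on draw x (x = 1, 2, 3, ...)
    return (1 - p) ** (x - 1) * p

xs = range(1, 2000)                        # enough terms for the tail to be negligible
ev = sum(x * geometric_pmf(x, p) for x in xs)
var = sum((x - ev) ** 2 * geometric_pmf(x, p) for x in xs)
print(round(ev, 4), round(1 / p, 4))                             # both ~ 4.0
print(round(math.sqrt(var), 4), round(math.sqrt(1 - p) / p, 4))  # both ~ 3.4641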

Inconsistency

A. Not A. "Nobody goes there anymore. That place is too crowded."

Bad Motive

Addresses motives of person to attack them

Positively Relevant

Adds weight to assertion

Normal Approximation

Approximates probability by the area under part of a special curve, the normal curve (in SU)

Expected value

As the number of trials grows, the chance that the number of successes falls within a fixed range of the EV decreases, while the chance that the percentage of successes falls within a fixed range of the expected percentage increases. E(X) = x1*P(X = x1) + x2*P(X = x2) + x3*P(X = x3) + ... The EV of the sample sum of n random draws, with or without replacement, from a box of labeled tickets is n*(average of the labels on all the tickets in the box) (if the draws are without replacement, the number of draws cannot exceed the number of tickets in the box). If the random variables X and Y are independent, then E(X * Y) = E(X) * E(Y).
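
A minimal sketch in plain Python (hypothetical 0-1 box) checking by simulation that the EV of the sample sum matches n*(average of the labels in the box).

import random

box = [1, 1, 0, 0, 0]            # hypothetical box of labeled tickets
n = 10                           # number of draws with replacement
trials = 100_000

sums = [sum(random.choice(box) for _ in range(n)) for _ in range(trials)]
print(round(sum(sums) / trials, 3))          # simulated EV of the sample sum
print(n * sum(box) / len(box))               # n * (average of the labels) = 4.0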

Central Limit Theorem

Asserts that the normal approximations to the probability distributions of the sample sum and sample mean improve as the number of draws grows, no matter what numbers are on the tickets. The normal curve approximates well if the sample size is large and p is 50% or not too close to 0% or 100%. Accuracy does not depend on the number of tickets in the box or on the mean or SD of the tickets.

Median

At least half the data are less than or equal to the median, and at least half are greater than or equal to the median. ***When there is an even number of data, choose the left of the two middle values.*** Histogram: the median is where the area is split evenly in half. Harder to skew: you must corrupt half the data to make the median arbitrarily large or small. (ex: whether a country is affluent, typical salary at a job)

Straw Man

Attack the more vulnerable claim as if it refutes the original

Informal fallacies (error in reasoning)

Non sequitur of relevance: He says X is true. He does Y. Anyone who does Y is a bad person. Therefore, X is false. (If A then B. A. Therefore, C.) Non sequitur of evidence: All Ys are Zs. Mary says X is a Y. Therefore, X is a Z. (Needs the added premise: if Mary says X is a Y, then X is a Y.)

P(AB) ? P(A)

P(AB) <= P(A)

If A is a subset of B, P(AB) = ? and P(AUB) = ?

P(AB) = P(A); P(AUB) = P(B)

SU for Random Variables

(X - E(X))/SE(X)

Weak analogy

X is similar to y in some regards. Therefore, everything that is true for x is true for y.

Binomial Probability Histogram

The area (in bins) is closest to the area under the normal curve when p is close to 50% and far from 0% and 100%. As the sample size increases, the normal approximation becomes more accurate. The mean and SD of the ticket numbers do not influence how large a sample is needed; skewness does.

Ad Hominem

attack person rather than reasoning

Partition

break up a complicated set into non-overlapping pieces, without double counting

Fallacious

deductive reasoning that is incorrect

Valid

deductive reasoning that is mathematically correct; when premises are true, conclusion must be true

Red Herring

distraction from the real topic

Standard Error of Affine Transformation

does not depend on the additive constant b -> if Y = aX + b, SE(Y) = |a|*SE(X)

Homoscedasticity

equal scatter

Heteroscedasticity

unequal scatter: the amount of scatter depends on where you take the slice

Interpolation

estimating within actual range

Two Wrongs Make a Right

fine to do something because someone else did

Ad Populum

it is moral because it is common, not everyone can be wrong

Markov's Inequality for Random Variables

limits the probability that a nonnegative random variable exceeds any multiple of its EV: for a > 0, P(X >= a) <= E(X)/a

Chebyshev's Inequality for Random Variables

limits the probability that a random variable differs from its EV by multiples of its SE: for every k > 0, P(|X - E(X)| >= k*SE(X)) <= 1/k^2

Combinations

nCk = n!/k!(n-k)!

Permutations

nPk = n!/(n-k)!
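
A minimal sketch in plain Python evaluating both counting formulas and checking them against the standard-library helpers math.comb and math.perm.

import math

n, k = 10, 3
n_choose_k = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
n_perm_k = math.factorial(n) // math.factorial(n - k)

print(n_choose_k, math.comb(n, k))   # 120 120
print(n_perm_k, math.perm(n, k))     # 720 720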

Tu Quoque

person is wrong because they are a hypocrite

Ad Misericordiam

pleading with extenuating circumstances

Questionable cause

post hoc ergo propter hoc (after this, therefore because of this), giving coincidences special significance.

Independent

two events that can occur in the same trial. The probability of their intersection is the product of their probabilities. The probability of their union is less than the sum of their probabilities, unless at least one of the events has probability zero.

Mutually exclusive

two events that cannot both occur in the same trial. The probability of their intersection is zero. The probability of their union is the sum of their probabilities. The occurrence of one is incompatible with the occurrence of the other. ex: P(A U B) is largest when A and B are mutually exclusive

Equivocation

use fact that word can have more than one meaning

Binomial distribution

Draws with replacement; the chance of success must be the same in every trial

Football-shaped graphs

work well with r and are summarized well by mean of x, mean of y, SD of x, SD of y

equation of regression line

y = (r*SDy/SDx)*x + [mean(Y) - (r*SDy/SDx)*mean(X)]. ***If the regression line was computed correctly, the point of averages of the residual plot will be on the x-axis, and the residuals will not have a trend (a horizontal pattern is good): the correlation coefficient between the residuals and X will be zero. If the residuals have a trend and their average is not zero, then the slope of the regression line was computed incorrectly.*** A residual plot shows heteroscedasticity, nonlinear association, or outliers iff the original scatterplot does. Special cases: r = 0 -> the line is horizontal and the slope = 0; r = 1 -> all points fall on a line with positive slope and the regression line = the SD line.
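
A minimal sketch in plain Python (hypothetical data) computing the regression line from r, the SDs, and the means, then checking that the residuals average to zero and are uncorrelated with x.

import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]            # hypothetical data
y = [2.0, 4.1, 5.9, 8.2, 9.8]

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / len(v))

r = mean([(a - mean(x)) / sd(x) * (b - mean(y)) / sd(y) for a, b in zip(x, y)])
slope = r * sd(y) / sd(x)
intercept = mean(y) - slope * mean(x)

residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]
print(round(mean(residuals), 10))                                           # ~0: residuals average to zero
print(round(mean([(a - mean(x)) * e for a, e in zip(x, residuals)]), 10))   # ~0: no trend with x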

Cardinality

# of elements it contains

Probability of Hypergeometric Distribution

= GCk * (N-G)C(n-k) / NCn
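
A minimal sketch in plain Python (hypothetical N, G, n) of this pmf using math.comb, checking that the probabilities sum to 1 and the EV matches n*(G/N).

from math import comb

N, G, n = 20, 8, 5                         # hypothetical population size, number of "1"s, sample size

def hypergeometric_pmf(k):
    # chance the simple random sample of size n contains exactly k tickets labeled "1"
    return comb(G, k) * comb(N - G, n - k) / comb(N, n)

ks = range(0, n + 1)
print(round(sum(hypergeometric_pmf(k) for k in ks), 6))                  # 1.0
print(round(sum(k * hypergeometric_pmf(k) for k in ks), 6), n * G / N)   # both 2.0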

Probability of Negative Binomial Distribution

(k-1)C(r-1) * p^(r-1) * (1-p)^(k-r) * p = (k-1)C(r-1) * p^r * (1-p)^(k-r)

Standard error of Negative Binomial Distribution

(sqrt(r(1-p)))/p
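
A minimal sketch in plain Python (hypothetical p and r) of the negative binomial pmf, with numerical checks of EV = r/p and SE = sqrt(r(1-p))/p.

from math import comb, sqrt

p, r = 0.3, 2                               # hypothetical success chance and target number of "1"s

def neg_binom_pmf(k):
    # chance the r-th "1" appears on draw k (k = r, r+1, ...)
    return comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)

ks = range(r, 3000)                         # enough terms for the tail to be negligible
ev = sum(k * neg_binom_pmf(k) for k in ks)
var = sum((k - ev) ** 2 * neg_binom_pmf(k) for k in ks)
print(round(ev, 4), round(r / p, 4))                          # both ~ 6.6667
print(round(sqrt(var), 4), round(sqrt(r * (1 - p)) / p, 4))   # both ~ 3.9441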

Strategies for Counting

1) divide into smaller, non-overlapping subsets 2) divide by 2 for double counting 3) make a tree

P(not A) = ?

100% - P(A)

Estimating percentiles from histograms

25th = the smallest number that is at least as large as 25% of the data; 50th = the smallest number that is at least as large as half the data; 75th = the smallest number that is at least as large as 75% of the data. In general: choose the smallest number at least as big as the given percentage of the data. pth percentile of a histogram: the approximate point on the horizontal axis such that the area under the histogram to the left of the point is p%.

Common formal fallacies

A or B. Therefore, A. (It could be B.)
A or B. A. Therefore, not B. (Affirming the Disjunct)
Not both A and B are true. Not A. Therefore, B. (Denying the Conjunct; both can be false)
If A then B. B. Therefore, A. (Affirming the Consequent)
If A then B. Not A. Therefore, not B. (Denying the Antecedent)
If A then B. C. Therefore, B. (Non sequitur of evidence; C sounds like A)
If A then B. Not C. Therefore, not A. (Non sequitur of relevance; B sounds like C)
If A then B. A. Therefore, C. (Non sequitur of relevance; C sounds like B)
If A then B. Not B. Therefore, not C. (Non sequitur of relevance)

Valid Rules of Reasoning

A or not A. (Law of the Excluded Middle)
Not (A and not A).
A. Therefore, A or B.
A. B. Therefore, A and B.
Not A. Therefore, not (A and B).
A or B. Not A. Therefore, B. (Denying the Disjunct)
Not (A and B). Therefore, (not A) or (not B). (de Morgan)
Not (A or B). Therefore, (not A) and (not B). (de Morgan)
If A then B. A. Therefore, B. (Affirming the Antecedent)
If A then B. Not B. Therefore, not A. (Denying the Consequent)

Hypergeometric Distribution

A random sample without replacement of size n from a population of N units. It gives for each k the chance that the sample sum of the labels on the tickets equals k, for a simple random sample of size n from a box of N tickets of which G are labeled "1" and the rest are labeled "0."

Histograms

Base of a bin = class interval. Area of a bin = fraction of data in the class interval = (# observations in the class interval) / (total # of observations). Height of a bin = (relative frequency) / (width of the class interval) = (fraction of data in the class interval) / (width of the class interval).
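
A minimal sketch in plain Python (hypothetical class intervals and counts) turning a frequency table into histogram heights, so that each bin's area equals its fraction of the data.

# hypothetical frequency table: (left endpoint, right endpoint, count), left included / right excluded
table = [(0, 10, 5), (10, 20, 15), (20, 40, 20), (40, 80, 10)]
total = sum(count for _, _, count in table)

for left, right, count in table:
    fraction = count / total                   # area of the bin
    height = fraction / (right - left)         # height = fraction / width of class interval
    print(f"[{left}, {right}): area={fraction:.3f}, height per unit={height:.4f}")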

RMS error of residuals of Y against X = sqrt(1 - r^2) * SD(Y)

Basically an SD; it is the rms of the vertical residuals from the regression line

The Graph of Averages

Divides a scatterplot into class intervals of the horizontal (x) variable and plots the average of the Y values in each interval against the midpoint of the interval; not a line but a cluster of points

Expected value of binomial distribution

EV = np

The SD Line

Goes through the point of averages (a single point). Has slope equal to SDy/SDx if the correlation coefficient r is greater than or equal to zero, and -SDy/SDx if r is negative. When r > 0, most values of Y are above the SD line on the left and below the SD line on the right. When r < 0, most values of Y are below the SD line on the left and above the SD line on the right.

Mean

Histogram: the mean is where the histogram would balance. Mean = (sum of data)/(# of data); it has the smallest rms difference from the data. *Changing one datum can make the mean arbitrarily large or small.* Ex: how much a family can afford to spend on housing

Mode

Histogram: the highest bump; the most common value (if all values occur equally often, every value is a mode)

Inappropriate appeal to authority

If A then B. C. Therefore, B All animals with rabies go crazy. Jessie says my cat has rabies. Thus, my cat will go crazy.

Slippery slope.

If A then B. If B then C. If C then D, etc. Eventually, Z. So, you must prevent A.

Chebyshev's Inequality (for lists)

If the mean of a list of numbers is M and the standard deviation of the list is SD, then for every positive number k, [the fraction of numbers in the list that are k*SD or further from M] <= 1/k^2 Inside a range is at least (1-1/k^2); outside a range is at most (1/k^2) **Use whichever produces the smallest number (more restrictive) **

Markov's Inequality (for lists)

If the mean of a list of numbers is M, and the list contains no negative numbers, then [the fraction of numbers in the list that are greater than or equal to x] <= M/x (multiply by the number of entries to bound the actual count)
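
A minimal sketch in plain Python (hypothetical list) checking both list inequalities: Markov's bound M/x on the fraction >= x, and Chebyshev's bound 1/k^2 on the fraction that is k SDs or further from the mean.

import math

data = [0, 1, 1, 2, 3, 3, 4, 10]                 # hypothetical nonnegative list
n = len(data)
M = sum(data) / n
SD = math.sqrt(sum((v - M) ** 2 for v in data) / n)

x = 6.0                                          # Markov: fraction >= x is at most M/x
frac_ge_x = sum(v >= x for v in data) / n
print(frac_ge_x, "<=", round(M / x, 4))

k = 2.0                                          # Chebyshev: fraction k*SD or further from M is at most 1/k^2
frac_far = sum(abs(v - M) >= k * SD for v in data) / n
print(frac_far, "<=", 1 / k ** 2)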

Ad Baculum

If you do/don't do something, something bad will happen

Common Fallacies of Evidence

Inappropriate appeal to authority Appeal to ignorance False dichotomy Loaded question Questionable cause Slippery slope Hasty generalization Weak analogy Inconsistency

2 Types of Reasoning

Inductive: inherently uncertain; generalizes from experience (sound inductive reasoning requires correct deductive reasoning). Deductive (aka logic): thinking mathematically.

False dichotomy

It starts with a premise that is an artificial "either-or". It is possible to do both.

Appeal to ignorance

Lack of evidence that a statement is false is not evidence that the statement is true.

Frequency Tables

Lists the frequency (number) or relative frequency (fraction) of observations that fall in various class intervals, based on a chosen endpoint convention (usually include the left boundary and exclude the right)

Affine Transformations

For Y = aX + b: Mode/Median/Mean of Y = a*(mode/median/mean of X) + b; Range/SD of Y = |a|*(range/SD of X) <- not affected by b; IQR of Y = a*(IQR of X) if a > 0

Regression effect

Second score is less extreme than the first (students landing planes)

Skewness and modes

Skew left: mean < median. Skew right: mean > median. Unimodal: consists of only one "bump" (distributions with more bumps are bimodal/multimodal).

Hasty generalization

Some x are (sometimes) A. Therefore, most x are (always) A. Sample could be biased.

Loaded question

Statements that presuppose something. Did you know that the sun goes around the Earth?

Residual plots (x1,e1), (x2,e2),...,(xn,en)

Tells us whether it is appropriate to use a linear regression and whether the regression was computed correctly. The vertical residual is e1 = y1 - (a*x1 + b), where y1 is the measurement and (a*x1 + b) is the estimated y value. Points above the regression line are > 0 on the residual plot. The residuals should average to zero and have no trend if the regression was done correctly. It is easier to see heteroscedasticity, nonlinear association, and outliers on a residual plot than on a scatterplot.

Association

A property of 2 or more variables (not the same as causation); there is association if the scatter of one variable within slices of the other is smaller than its overall SD. (-) association: larger-than-average values of one variable go with smaller-than-average values of the other

Expected value of Negative Binomial Distribution

r/p

Sound

reasoning is valid and based on true premises (valid & unsound = factually incorrect b/c one of the premises is false)

Standard error of binomial distribution

sqrt (n(p(1-p)))

Standard error of Hypergeometric Distribution

sqrt((N-n)/(N-1)) * sqrt(n) * sqrt((G/N)*(1-(G/N)))

Negative Binomial Distribution

the distribution of the number of draws k it takes to get a ticket labeled "1" for the rth time

