ECO 045 Final

Measures of distribution shape (and outliers)​: Empirical rule

if we know more about the distribution, we can go further than Chebyshev's (Ex: bell curve or normal distribution)

selection bias

in an experiment, unintended differences between the participants in different groups

Population has a Normal Distribution

in this case, the sampling distribution of x ̅ is normally distributed for any sample size

t statistic

indicates the distance of a sample mean from a population mean in terms of the estimated standard error

normal probability density function (standard normal probability distribution)

describes a symmetric, bell shaped curve; completely defined by the mean and variance (standard deviation)

Measures of association between two variables: Interpreting covariance

descriptive measure of linear association between variables

Frame

list of elements from which the sample will be selected **we could sample from a finite or an infinite population

Quantitative variable: Frequency Distribution

must be careful in defining classes (or categories) - Step 1: determine number of classes;​ - Step 2: determine the width of each class or category;​ - Step 3: beware of limits!​

Measures of location: Mean

the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores

expected value

the average of each possible outcome of a future event, weighted by its probability of occurring

geometric mean

the mean of n numbers expressed as the n-th root of their product

Hypothesis Testing (P-Value Approach): If the P-value is greater than...

then the null hypothesis is not rejected

​Events and their probabilities​: Probability of an Event

the sum of the probabilities of the sample points (or experimental outcomes) in the event​

zscore formula

z = (x-μ)/σ, or equivalently x = μ + zσ

Properties of point estimators: notation

θ=population parameter θ ̂=sample statistic or point estimator of θ

correlation btwn random variables x and y

ρ_xy=σ_xy/(σ_x σ_y )

Variance for Poisson probability table

σ^2=μ

If the population is too small relative to the sample, we need to add a finite population correction factor(Standard Deviation of p ̅)

σ_p ̅ =√((N-n)/(N-1))* √((p(1-p))/n) **where N is the population size.

Standard Deviation of p ̅

σ_p ̅ =√((p(1-p))/n) ** where n is the sample size and σ_p ̅ = standard error

Standard Deviation of x ̅

σ_x̄ = σ/√n **where σ is the population standard deviation and n is the sample size

Sampling Distribution of x ̅ Example: In the managers example, the population standard deviation was equal to 3999.2 and the sample size was 30. Hence, the standard error of the mean is equal to:

σ_x ̅ =σ/√n=3999.2/√30≈730.15

If the population is too small relative to sample, we need to add a finite population correction factor:

σ_x ̅ =√((N-n)/(N-1)) * (σ/√n) **where N is the population size
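
The two standard-error formulas for x̄ can be checked with a few lines of Python. This is a minimal sketch: the function name `standard_error_mean` and the population size N = 2500 are assumptions made for illustration, while σ = 3999.2 and n = 30 come from the managers example.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    If a finite population size N is supplied, the finite population
    correction factor sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Managers example: sigma = 3999.2, n = 30
se_infinite = standard_error_mean(3999.2, 30)

# Same sample drawn from a hypothetical finite population of N = 2500
se_finite = standard_error_mean(3999.2, 30, N=2500)

print(round(se_infinite, 2))  # about 730.15
print(round(se_finite, 2))    # slightly smaller than the uncorrected value
```

The correction factor is always below 1 when n > 1, so the corrected standard error is never larger than σ/√n.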

​Events and their probabilities​: Event

​ Collection of sample points (or experimental outcomes) (Ex: A = At least 28 students are present = {29, 30, 32})

Experiment, sample space, counting rules: sample space

​ The set of all possible experimental outcomes.​ ​ (Examples: {1,2,3,4,5,6}, {Heads, Tails}, {Yes, No, Surprised Look Followed By Uncontrollable Laughter})

Poisson Probability Example: You observe a call center during regular business hours for, say, 100 days. The average number of calls received in a 5-minute interval is equal to 12. If properties 1 and 2 are valid, the number of calls received during a 5-minute interval follows a Poisson distribution 2) What is the probability that we observe 4 calls in a 2.5-minute interval?

= 0.1338

Poisson Probability Example: Suppose the number of thunders on the coast of Scotland follows a Poisson distribution with an average of 10 occurrences every 60 minutes. What is the probability of observing exactly 4 thunders in any given half-hour?

= 0.1754 or 17.54%
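
Both Poisson examples follow the same recipe: rescale the mean to the interval of interest, then apply f(x) = μ^x e^(-μ)/x!. The sketch below assumes a helper named `poisson_pmf` (not part of the course material) and uses only the standard library.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the interval mean is mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)

# Thunder example: 10 per 60 minutes, so the half-hour mean is 5
thunder = poisson_pmf(4, 10 * 30 / 60)

# Call-center example: 12 per 5 minutes, so the 2.5-minute mean is 6
calls = poisson_pmf(4, 12 * 2.5 / 5)

print(round(thunder, 4))  # about 0.1755
print(round(calls, 4))    # about 0.1339
```

The card answers 0.1754 and 0.1338 are these same values truncated rather than rounded.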

Binomial Experiment Example: Suppose that Lehigh has 5,000 undergraduates, of which 1,000 have taken or are currently taking ECO45. From a list of emails, we randomly select (with replacement) a student and check if she/he has taken (or is taking) ECO45. We repeat this experiment 3 times. What is the chance that exactly 1 student is found to have had this amazing experience?

= 0.384
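
The 0.384 above comes from the binomial formula with n = 3, x = 1 and p = 1000/5000 = 0.2. A short sketch, assuming a helper named `binomial_pmf`:

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

p_success = 1000 / 5000            # share of undergraduates who took ECO45
prob = binomial_pmf(1, 3, p_success)

print(round(prob, 3))  # 0.384
```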

Population mean: σ known: test statistic for hypothesis tests about a population mean

compute the test statistic z=(x̄-μ_0)/(σ⁄√n); find the critical value for the chosen significance level; compare the test statistic with the critical value

Experiment, sample space, counting rules: random experiment

A process that generates well-defined outcomes. Moreover, on any single trial (or repetition) the outcome is determined by chance. (Examples: rolling a die; flipping a coin; asking someone on a date)

Expected value (or mean) for a random variable: measure of central location

a weighted average of all possible values for the RV, where weights are the associated probabilities • E(x) = μ = Σxf(x) • Ex: x= # of heads after we flip a fair coin 3 times > E(x) = 0*(1/8) +1*(3/8) + 2*(3/8)+3*(1/8) = (3+6+3)/8 = 12/8 = 1.5

Summarizing Data for Two Variables: Graphical Displays

We can use side-by-side or stacked bar graphs for categorical variables​

Conditional Probability: Venn diagrams

a diagram made of overlapping circles that is used to show relationships between groups

Poisson probability table

a table that we can use under certain conditions that will make calculating probabilities a little easier when using the Poisson Distribution.

matched samples

a technique whereby the participants in two groups are identical in terms of a third variable

Classical, relative frequency, and subjective methods: subjective methods

a type of probability derived from an individual's personal judgment or own experience about whether a specific outcome is likely to occur

Categorical variable

a variable that names categories (whether with words or numerals) {ex: race, sex, age group, and educational level}

ordinal categorical variable

categorical variables that can be put in a meaningful order

nominal categorical variable

categories do not have a natural order

Bayes Theorem P(A|B)

P(A|B) = P(B|A) P(A) / P(B)

Binomial Probability Distribution Formula

P(x) = (nCx) p^x (1-p)^(n-x) **where x is the number of successes; n is the number of trials; p is the probability of success in any given trial

Conditional Probability: multiplication law

provides a way to compute the probability of the intersection of two events

Difference between population means, σ_1 and σ_2 known : Example 2: A study set out to investigate differences in quality between two training centers. The measure of quality is the score in a standardized test. The difference in quality is measured by the mean scores of the two groups. Two independent random samples were collected, of 30 individuals in groups 1 and 40 individuals in group 2. The respective sample means are x ̅_1=82 and x ̅_2=78. Assume that the population standard deviations of the scores is equal to 10 for both groups. With a level of significance of 5%, can you reject the null hypothesis of no difference in means?

(1) H_0: μ_1-μ_2 = 0; H_a: μ_1-μ_2 ≠ 0 (2) Assume H_0 is true; derive the sampling distribution N(0, σ_(x̄_1-x̄_2)), where σ_(x̄_1-x̄_2) = √(σ_1^2/n_1 + σ_2^2/n_2) = √(10^2/30 + 10^2/40) ≈ 2.42
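
The remaining steps of this test (test statistic, p-value, decision) can be sketched numerically. The variable names are illustrative; the inputs are the ones stated in the example, and `statistics.NormalDist` from the standard library supplies the normal CDF.

```python
import math
from statistics import NormalDist

n1, n2 = 30, 40
xbar1, xbar2 = 82, 78
sigma1 = sigma2 = 10
alpha = 0.05
d0 = 0  # hypothesized difference under H_0

# Standard error of the difference in sample means
se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

# Test statistic and two-tailed p-value
z = (xbar1 - xbar2 - d0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject = p_value <= alpha
print(round(se, 2), round(z, 2), round(p_value, 3), reject)
```

With these inputs z is about 1.66 and the two-tailed p-value is just under 0.10, so the null hypothesis of no difference is not rejected at the 5% level.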

# of experimental outcomes providing exactly "x" successes in "n" trials

(nCx) = n! / (x!(n-x)!)

Cumulative Distribution Function for the Exponential(Exponential Probability Distribution)

(the cumulative probability of obtaining a value for the exponential random variable of less than or equal to some value denoted by x_0) P(x≤x_0) = 1-e^(-x_0⁄μ) **where x_0 is a fixed number. This gives us the probability of the random variable assuming any value at or below x_0. It's the same information as in the probability tables we've been using!

Covariance of two random variables (x and y)

σ_xy = (x-E(x))(y-E(y))*f(x,y) + ..., summed over every (x, y) pair, where f(x,y) is the joint probability of the pair

Poisson Probability Distribution formula

f(x) = (μ^x e^(-μ))/x! **where e (Euler's number) is equal to 2.71828...; μ is the mean or expected number of occurrences in an interval; and x is the number of occurrences

Experimental Design and ANOVA: Assumptions for Analysis of Variance

1. For each population, the response variable is normally distributed. > If sample sizes are equal, ANOVA results are not sensitive to this assumption. 2. The variance of the response variable, denoted σ^2, is the same for all the populations. 3. The observations must be independent.

Properties of a Poisson experiment

1. The probability of an occurrence is the same for any two intervals of same length (e.g. 30 minutes or 50 miles); 2. The occurrence in any interval is independent of the occurrence in any other interval.

Experimental Design and ANOVA: EXAMPLE: A chemical plant is investigating the efficiency of 3 different methods - A, B, and C - of producing a filtration system. The three methods define three populations of interest: (a) the population of employees who use method A, (b) the population that uses method B, and (c) the population that uses method C.

1. The production method is the independent variable or factor. a. This is a single-factor experiment (more complicated experiments are also allowed) 2. Since the factor has three levels (methods A, B, and C), we say this experiment has three distinct treatments or treatment arms. 3. The dependent or response variable is the number of filtration systems produced per unit of time. 4. In a completely randomized design, treatments are assigned randomly to experimental units. a. In this example, we could select, say, 3 workers at random from the chemical plant and randomly assign each of them to a different production method.

Hypothesis Testing Steps

1. Is proposition Q true? 2. First, assume Q is true. 3. What would you expect to see if Q was indeed true? 4. Collect evidence systematically. 5. Ask yourself: is it compatible with Q being true? Or does it mean Q is not very likely? >>> 1. specify the null and alternative hypotheses 2. choose an appropriate statistical test 3. specify the significance level α (e.g., .05) 4. obtain the p-value from the test 5. compare the p-value with α and make a conclusion.

Comparing the variances: the F test

> MSTR is an unbiased estimate of σ^2 only if the null is true; otherwise it overestimates σ^2 > MSE is an unbiased estimate of σ^2 whether the null or the alternative is true Test procedure: 1. Check the ratio between MSTR and MSE 2. If the ratio is close to 1, we cannot reject the null 3. If the ratio is far above 1, we can reject the null 4. But how close is close? It turns out the ratio of the two estimates follows a known probability distribution, the F distribution 5. Use the F distribution to find critical values and/or p-values to assess the hypotheses F = MSTR/MSE

normal distribution (continuous probability distribution)

A function that represents the distribution of variables as a symmetrical bell-shaped graph.

Measures of location

A statistic that describes a location within a data set. Measures of central tendency describe the center of the distribution.

Calculating the probability of a Type II error

Compute the probability of a Type II error for a hypothesis test of a population mean

Expected return of a certain portfolio

E(ax+by) = a*E(x) + b*E(y)

Standard Normal Probability Distribution Examples 1 + 2

Example 1: let z be a random variable with a standard normal probability distribution. What is the probability that z assumes the value 0.5 or lower? P(z ≤ 0.5) = 0.69146 or 69.15% Example 2: What is the probability that z assumes the value -0.5 or lower? P(z ≤ -0.5) = P(z ≥ 0.5) = 1-P(z ≤ 0.5) = 1-0.69146 = 0.30854 or 30.85%
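
Table lookups like these can be reproduced with `statistics.NormalDist` from the Python standard library. A minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Example 1: P(z <= 0.5)
p_low = std_normal.cdf(0.5)

# Example 2: P(z <= -0.5), which by symmetry equals 1 - P(z <= 0.5)
p_neg = std_normal.cdf(-0.5)

print(round(p_low, 5))  # 0.69146
print(round(p_neg, 5))  # 0.30854
```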

Probability functions (continuous probability distribution)

Functions that assign probabilities to the values of a random variable and so determine the information in a probability distribution. The probability values must sum to one, and each probability value must lie between zero and one (inclusive).

joint probability distribution

Gives probability of all combinations of values of all variables or a probability distribution for two (or more) random variables

EXAMPLE: The Department of Agriculture (DA) routinely audits farmers about the use of fungicides. A particular type of fungicide has a maximum allowed concentration of 8 micrograms per unit of eggplant. Suppose the DA collects a random sample of 100 eggplants with a mean of 8.24 micrograms of the controlled substance. Assume the population standard deviation is known and equal to 1.5 micrograms. Do you have enough evidence to conclude, at the 1% significance level, that farmers are using more than the maximum allowed?

H_0: μ ≤ 8; H_a: μ > 8. Assume H_0 is true with equality: μ = 8
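
Carrying this upper-tail test through to a decision takes only a few lines. The sketch below uses the numbers stated in the example; the variable names are illustrative, and `statistics.NormalDist` provides the normal CDF and its inverse.

```python
import math
from statistics import NormalDist

mu0 = 8          # hypothesized mean under H_0 (with equality)
xbar = 8.24      # sample mean
sigma = 1.5      # known population standard deviation
n = 100
alpha = 0.01

z = (xbar - mu0) / (sigma / math.sqrt(n))

# Upper-tail test: p-value is the area to the right of z
p_value = 1 - NormalDist().cdf(z)
z_critical = NormalDist().inv_cdf(1 - alpha)

reject = p_value <= alpha
print(round(z, 2), round(p_value, 4), round(z_critical, 3), reject)
```

Here z is 1.6, the p-value is about 0.055, and the 1% critical value is about 2.326, so the sample does not give enough evidence to reject H_0 at the 1% level.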

Difference between population means, σ_1 and σ_2 known : We can also do hypothesis testing. Let D_0 denote the difference between means:

Lower-tail test: H_0: μ_1-μ_2 ≥ D_0, H_a: μ_1-μ_2 < D_0; Upper-tail test: H_0: μ_1-μ_2 ≤ D_0, H_a: μ_1-μ_2 > D_0; Two-tailed test: H_0: μ_1-μ_2 = D_0, H_a: μ_1-μ_2 ≠ D_0

How to go from any normal to a standard normal distribution(prob)

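For x ~ N(a, b), with mean μ = a and standard deviation σ = b, standardize with z = (x-μ)/σ. Ex: x = a gives z_a = (a-a)/b = 0; x = a+b gives z_b = ((a+b)-a)/b = 1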

Hypergeometric probability distribution example: In a classroom of 30 students, 10 of them scored 90 or higher in a final exam. The instructor forms a group of three students by randomly selecting, without replacement, among the class. What is the probability that exactly one student in this randomly selected group scored 90 or higher?

N = 30, n = 3, r = 10; f(1) = (10C1)(20C2)/(30C3) = (10 × 190)/4060 ≈ 0.468
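
The hypergeometric calculation can be verified with `math.comb`. A small sketch, assuming a helper named `hypergeom_pmf`; the arguments follow the card's notation (N population size, r successes in the population, n draws, x successes drawn).

```python
import math

def hypergeom_pmf(x, N, r, n):
    """Probability of x successes in n draws without replacement."""
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)

# 30 students, 10 scored 90+, group of 3, exactly one high scorer
prob_one = hypergeom_pmf(1, N=30, r=10, n=3)

print(round(prob_one, 4))  # about 0.468
```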

Simpson's Paradox

Or why we should always be skeptical and dig deeper

Predictive Analytics

Predictions about future behavior or relationship between variables (ex.: regression, machine learning)

Complement of an event (A and A^c)

Refers to the event "not A" or all outcomes that are NOT the event > the Complement of an event is all the other outcomes (not the ones we want) > all together the Event and its Complement make all possible outcomes

Population mean: σ known: Rejection rule using p-value

Reject H_0 if p-value ≤ α

Dealing with small samples EXAMPLE: Your boss is thinking of rolling out a training program for thousands of employees. Before deciding, she wants an estimate of the time it takes to complete the training. You have access to a random sample of 20 employees who have completed the training in a pilot. Can you use that (small) sample to construct a confidence interval?

STEP 1: compute the sample mean and sample standard deviation. x̄ = 51.5, s = 6.84 STEP 2: find the t-value corresponding to α⁄2 = 2.5% and n-1 = 19 degrees of freedom. STEP 3: construct the confidence interval.
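
The three steps can be sketched as below. The Python standard library has no t distribution, so the critical value 2.093 (upper-tail area 0.025, 19 degrees of freedom) is hard-coded from a standard t table; that value and the variable names are assumptions of this sketch, not part of the original example.

```python
import math

n = 20
xbar = 51.5      # sample mean
s = 6.84         # sample standard deviation

# t-value for alpha/2 = 0.025 with n - 1 = 19 degrees of freedom,
# taken from a standard t table (assumed here, not computed)
t_crit = 2.093

margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(round(margin, 2), round(lower, 2), round(upper, 2))
```

Under these assumptions the 95% interval is roughly 51.5 ± 3.2.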

Population

Set of all elements of interest in a study

Prescriptive Analytics

Statistical techniques to directly aid in decision-making {ex.: optimization, pricing models}

Exponential Probability and the Poisson Distribution

The Poisson distribution (discrete) and the Exponential distribution (continuous) are closely related

Between-treatments estimate of σ^2

The general formula for computing the between-treatments estimate of the population variance, also known as mean square due to treatments, is: MSTR = [∑_(j=1)^k n_j (x̄_j - x̿)^2] / (k-1)

Within-treatments estimate of σ^2

The general formula for computing the within-treatments estimate of the population variance, also known as mean square due to error, is: MSE = [∑_(j=1)^k (n_j-1) s_j^2] / (n_T-k)

Conditional Probability: Independent events

The outcome of one event does not affect the outcome of the second event

ANOVA and the completely randomized design: FORMULAS

The overall sample mean is: x̿ = (∑_(j=1)^k ∑_(i=1)^(n_j) x_ij) / n_T, where n_T = n_1+n_2+...+n_k. When the sample sizes are all equal, the formula reduces to: x̿ = (∑_(j=1)^k x̄_j) / k
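
To see how the overall mean, MSTR, MSE and the F ratio fit together, here is a small one-way ANOVA sketch. The three samples are illustrative numbers chosen for this example (they do not appear in the cards), and the names `samples`, `mstr`, `mse` and `f_stat` are likewise assumptions.

```python
from statistics import mean, variance

# Hypothetical output (units per week) for three production methods
samples = {
    "A": [58, 64, 55, 66, 67],
    "B": [58, 69, 71, 64, 68],
    "C": [48, 57, 59, 47, 49],
}

k = len(samples)                                  # number of treatments
n_T = sum(len(obs) for obs in samples.values())   # total observations

# Overall sample mean: sum of every observation divided by n_T
grand_mean = sum(sum(obs) for obs in samples.values()) / n_T

# Between-treatments estimate of the variance (MSTR)
mstr = sum(len(obs) * (mean(obs) - grand_mean) ** 2
           for obs in samples.values()) / (k - 1)

# Within-treatments estimate of the variance (MSE);
# statistics.variance is the sample variance s_j^2
mse = sum((len(obs) - 1) * variance(obs)
          for obs in samples.values()) / (n_T - k)

f_stat = mstr / mse
print(grand_mean, mstr, round(mse, 2), round(f_stat, 2))
```

For these numbers MSTR = 260 and MSE is about 28.33, giving F of about 9.18, which would then be compared with an F critical value for k-1 = 2 and n_T-k = 12 degrees of freedom.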

Types of Discrete Probability Distributions

Uniform, binomial, Poisson, hypergeometric

Standard Deviation a certain portfolio (or variance of a linear combo of 2 Rand Vars x and y)

Var(ax + by) = a^2 Var(x) + b^2 Var(y) + 2ab*σ_xy ==> then take the square root to get the standard deviation

variance for hypergeometric probability distribution

Var(x)=σ^2=n(r⁄N)(1-r⁄N)((N-n)⁄(N-1))

Variance for Uniform Probability Distribution

Var(x) = (b-a)^2/12

Measures of association between two variables: Pearson Product Moment Correlation

a bivariate, correlational measure that indicates the degree to which two quantitative variables are related

uniform probability distribution

a continuous probability distribution for which the probability that the random variable will assume a value in any interval is the same for each interval of equal length

Exponential Probability Distribution

a continuous probability distribution that describes the interval (of time, length) between occurrences

normal probability distribution

a continuous probability distribution. Its probability density function is bell-shaped and determined by its mean and standard deviation.

Probability DENSITY functions (continuous probability distribution)

a function of a continuous random variable, whose integral across an interval gives the probability that the value of the variable lies within the same interval

Scatter Diagram (or plot)

a graphical depiction of the relationship between two variables

Quantitative variable: A Dot Plot

a graphical device that summarizes data by the number of dots above each data value on the horizontal axis

standard normal distribution

a normal distribution with a mean of 0 and a standard deviation of 1

Random variables (discrete probability distributions)

a numerical description of the outcomes of an experiment 1) case 1: numbers are natural > Ex: roll a die twice. Possible outcomes: 36; Random variable (x): sum of the faces; possible values: 2 to 12 2) case 2: we can assign numbers ourselves • Ex: student assessment (good = 0, extra good = 1, excellent = 2, super excellent = 3)

discrete uniform probability distribution

a probability distribution for which each possible value of the random variable has the SAME PROBABILITY • f(x) = 1/n, where n - # of values r.v. may assume

Bivariate Distributions

a probability distribution involving two random variables • Ex: sum of numbers in two die rolls and result of a (fair) coin flip (0 is T, 1 is H) • Ex: number of rainy days in a year and yearly return of a commodity index

Poisson Probability Distribution

a probability distribution showing the probability of x occurrences of an event over a specified interval of time or space

Binomial Probability Distribution

a probability distribution showing the probability of x successes in n trials of a binomial experiment

DISCRETE random variable (discrete probability distributions)

a random variable that may assume either a FINITE number of values or an INFINITE SEQUENCE of values - (-3, -2, 1, 7, 10); (1,2,3,4,5,....) • Ex: # of classes attended by a randomly selected student • Ex: # of galaxies in the universe

confidence interval

a range of values so defined that there is a specified probability that the value of a parameter lies within it • x ̅: ± margin of error • p ̅: ± margin of error • Case 1: σ is known • Case 2: σ is not known

Simple random sample (finite population)

a simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected **in practice, not so easy. Example: mammography trials in the 1960s.

ONE-tailed hypothesis test

a statistical test in which the critical area of a distribution is one-sided so that it is either greater than or less than a certain value, but not both. If the sample being tested falls into the one-sided critical area, the alternative hypothesis will be accepted instead of the null hypothesis

Variance of a discrete random variable: measure of variability

a weighted average of the squared differences between possible values for RV and the expected value, where weights are the associated probabilities • Var(x)=σ^2=∑(x-μ)^2 f(x) • Ex: x= # of heads after we flip a fair coin 3 times > Var(x) = (0-1.5)^2 * (1/8) + (1-1.5)^2 * (3/8) + (2-1.5)^2 * (3/8) + (3-1.5)^2 * (1/8) = 9/4 * 1/8 + 1/4 * 3/8 + 1/4 * 3/8 + 9/4 * 1/8 = 24/32 = 3/4 or 0.75
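
The coin-flip example can be checked by enumerating all eight outcomes and applying E(x) = Σxf(x) and Var(x) = Σ(x-μ)²f(x) directly. A minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 8 equally likely outcomes of three fair coin flips (1 = heads)
pmf = {}
for outcome in product([0, 1], repeat=3):
    heads = sum(outcome)
    pmf[heads] = pmf.get(heads, Fraction(0)) + Fraction(1, 8)

# Probability-weighted average of the values
expected = sum(x * f for x, f in pmf.items())

# Probability-weighted average of squared deviations from the mean
var = sum((x - expected) ** 2 * f for x, f in pmf.items())

print(pmf)       # probabilities 1/8, 3/8, 3/8, 1/8 for 0..3 heads
print(expected)  # 3/2
print(var)       # 3/4
```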

Poisson Distribution EXAMPLE: Consider a Poisson distribution with mean equal to 3. a. Write the appropriate Poisson probability function. Recall that f(x)=(μ^x e^(-μ))/x! b. Compute f(2). c. Compute f(1). d. Compute P(x>=2)

a) f(x) = (3^x e^-3)/x! > f(0) = 3^0 e^-3/0! = 0.0498 b) f(2) = 3^2 e^-3/2! = 0.2240 or 22.4% c) f(1) = 3^1 e^-3/1! = 0.1494 or 14.9% d) P(x>=2) = 1-P(x<2) = 1-P(x=0 U x=1) = 1-(P(x=1) + P(x=0)) = 1-(0.1494 + 0.0498) = 0.801 or 80.1%

the probability density function (normal probability function)

an equation used to compute probabilities of continuous random variables

and > *

mutually exclusive events

events that cannot happen at the same time

Normal Approximation to a Binomial Distribution Example: the internal audit team of a large company knows that there is, on average, a 20% chance that any client interaction will be non-compliant with a particular corporate policy. Suppose the audit team decided to audit a random sample of 100 such interactions. What is the chance that exactly 15 of the audited interactions will be labeled non-compliant?

f(15) = (100C15) * (0.2)^15 * (0.8)^85 = 0.04806 or 4.81%
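
The answer above is the exact binomial probability. For comparison, the normal approximation with a continuity correction treats "exactly 15" as the interval from 14.5 to 15.5 under a normal curve with μ = np and σ = √(np(1-p)). The sketch below computes both; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p, x = 100, 0.2, 15

# Exact binomial probability
exact = math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Normal approximation: mean n*p, standard deviation sqrt(n*p*(1-p))
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx_dist = NormalDist(mu, sigma)

# Continuity correction: P(14.5 <= x <= 15.5)
approx = approx_dist.cdf(x + 0.5) - approx_dist.cdf(x - 0.5)

print(round(exact, 4), round(approx, 4))
```

Both np = 20 and n(1-p) = 80 are at least 5, so the approximation applies; it gives roughly 0.046 against the exact 0.048.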

Hypergeometric probability distribution formula

f(x) = (rCx)((N-r)C(n-x)) / (NCn) **where x is the number of successes (having scored 90+); n is the number of trials (3 selected students); N is the number of elements in the population (30 students); and r is the number of successes in the population (10 students are known to have scored 90+).

Exponential Probability Distribution formula (It's a continuous probability distribution that describes the interval (of time, length) between occurrences)

f(x) = (1/μ) e^(-x⁄μ) **where μ is the expected value and x ≥ 0 is the random variable (e.g., length of time between occurrences)

uniform probability density distribution formula

f(x) = 1/(b-a) for a ≤ x ≤ b; f(x) = 0 for any other x

standard normal distribution formula

f(z) = (1/√(2π)) e^(-z^2/2)

Boxplot

graphical display based on five-number summary

Observation

The value assigned to only one element is called an observation.

Mean for Normal Approximation to a Binomial Distribution

mean = n*p = μ

inferential statistics

numerical data that allow one to generalize- to infer from sample data the probability of something being true of a population

descriptive statistics

numerical data used to measure and describe characteristics of groups. Includes measures of central tendency and measures of variation.

The sampling distribution of p ̅ can be approximated by a normal distribution whenever:

n×p≥5 and n×(1-p)≥5

or (union) > +

sample proportion of p ̅

p ̅=x/n *where x is the number of observations of interest (e.g., number of managers that did the training) and n is the sample size **Taking a sample of size n and calculating the sample proportion p ̅ is thus a binomial experiment, with a trial of size n and a probability of success equal to p (the population proportion)

Interval estimate of a population proportion FORMULA

p̄ ± z_(α⁄2) × √((p̄ × (1-p̄))/n) **where (1-α) is the confidence coefficient (95% in our previous example) and z_(α⁄2) is the z-value that provides an area equal to α⁄2 in the upper tail of the standard normal distribution.

Standard Deviation for Normal Approximation to a Binomial Distribution

sqrt(n*p(1-p)) = σ

Variance of a Random Variable

for each value, square its deviation from the expected value, multiply by its probability, and sum the results

Normal Approximation to a Binomial Distribution

suppose you're studying a binomial experiment. Whenever np≥5 and n(1-p)≥5, then the normal distribution is a good continuous approximation to the discrete binomial distribution

sample variance

s^2 = ∑(x_i-x̄)^2 / (n-1)

quantitative variable

take numerical values and represent some kind of measurement

Central limit theorem

the distribution of sample averages tends to be normal regardless of the shape of the process distribution > as the size n of a simple random sample increases, the shape of the sampling distribution of x̄ tends toward being normally distributed

Properties of point estimators: EFFICIENCY

the point estimator with lower standard error is said to be more efficient

Sampled population

the population from which the sample is drawn

hypergeometric probability distribution

the probability distribution that is applied to determine the probability of x successes in n trials when the trials are not independent 1. The trials are not independent; 2. Probability of success changes from trial to trial. • Ex: In a classroom of 30 students, 10 of them scored 90 or higher in a final exam. The instructor forms a group of three students by randomly selecting, without replacement, among the class. What is the probability that exactly one student in this randomly selected group scored 90 or higher?

Conditional Probability: Bayes' Theorem

the probability of an event occurring based upon other event probabilities

Type I and Type II Errors: Level of significance

the probability of making a Type I error when the null hypothesis is true as an equality

Conditional Probability

the probability that one event happens given that another event is already known to have happened

Hypothesis Tests

use a known distribution to determine whether a hypothesis of no difference (the null hypothesis) can be rejected

Binomial Probability Table

used to calculate probabilities instead of using the binomial distribution formula. The number of trials (n) is given in the first column. The number of successful events (x) is given in the second column.

Finite Population Correction Factor

used when you sample without replacement from more than 5% of a finite population. It's needed because under these circumstances, the Central Limit Theorem doesn't hold and the standard error of the estimate (e.g. the mean or proportion) will be too big.

interval estimation of a population mean

we establish a range of potential sample statistics, usually, the mean, given sample information, and given the notion of repetitive sampling as outlined in Central Limit Theorem > estimate an unknown parameter using an interval of values that is likely to contain the true value of that parameter

when is diversification useful?

when correlation of returns is negative

Interval estimate of a population mean: σ unknown formula

x̄ ± t_(α⁄2) × s/√n **where s is the sample standard deviation, (1-α) is the confidence coefficient (95% in our previous example) and t_(α⁄2) is the t-value that provides an area equal to α⁄2 in the upper tail of the t-distribution with n-1 degrees of freedom (n is the sample size)

Test statistic for hypothesis tests about a Population mean: σ known

z=(x ̅-μ_0)/(σ⁄√n)

Inference about two populations

• Compare population parameters of two different populations a. Ex: the mean starting salaries of male and female college graduates; b. Ex: the proportion of individuals who speak more than one language in countries A and B; • As before, we can do so via confidence intervals or hypothesis testing. • Also as before, the statistical inference depends on the assumption about the population standard deviation (known or not known)

Type I and Type II Errors

• Either H_0 is true, or H_a is true, but NEVER both. • Ideal world: a) Accept H_0 when it is in fact true b) Reject H_0 when H_a is in fact true • BUT, we are working with samples; always room for ERROR

Sampling Distribution of p ̅

• p ̅ is a random variable, since any one sample will have a (somewhat) different proportion of the characteristic of interest. • The probability distribution of p ̅ is called the sampling distribution **The sampling distribution of p ̅ is the probability distribution of all possible values of the sample proportion p ̅

Random sample (infinite population)

A simple random sample of size n from an infinite population is a sample selected such that the following conditions are satisfied: 1. Each element selected comes from the same population. 2. Each element is selected independently. **Ex: how would you select a random sample of customers arriving at a restaurant. Is this a finite population? How to ensure that each element is selected independently?

Sample

A subset of the population

continuity correction factor

A value of .5 that is added to or subtracted from a value of x when the continuous normal distribution is used to approximate the discrete binomial distribution.

Developing null and alternative hypotheses- Case 1: Research hypotheses

Alternative hypothesis defined first

Experiment, sample space, counting rules: permutation

An arrangement, or listing, of objects in which order is important *n = # of items in the set *r = # of elements selected from the set

variables

Anything that can take on different values is called a variable. Therefore, number of gifted students is a variable since it has different values. Other examples of variables could be number of students who graduate from college, income of senior citizens, types of health insurance plans people enrolled in.

Descriptive analytics

Data visualization and summaries

percentiles

Divide the data set into 100 equal parts. An observation at the Pth percentile is higher than P percent of all observations.

Properties of point estimators: CONSISTENCY

Does the point estimator get closer to the population parameter as the sample size increases?

Expected Value of p ̅

E(p ̅ )=p **where p is the population proportion

Expected Value of x ̅ (unbiased)

E(x ̅ )=μ **where μ is the population mean

Expected value for binomial distribution

E(x) = μ = np

Expected Value for Uniform Probability Distribution

E(x)=(a+b)/2

Expected value for hypergeometric probability distribution

E(x)=μ=n(r⁄N)

Probability theory

Enables the user to make decisions that take into consideration conditions of risk

CONTINUOUS random variable (discrete probability distributions)

could be any value in an INTERVAL • measurements: temp, weight, height, ... P([108,109]) • Ex: weight of a tuna in a Tokyo market (108.743212.. kg)

Exponential Probability Distribution > a continuous probability distribution that describes the interval (of time, length) between occurrences - Example: suppose that x represents the loading time for a truck in a dock (or the interval of time between two occurrences, loading a truck). Suppose further that x follows an exponential distribution with an expected value of 15 minutes.

f(x) = (1/μ) * e^(-x/μ) = (1/15) * e^(-x/15)
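
With μ = 15 minutes, probabilities for the loading time come from the exponential CDF, P(x ≤ x_0) = 1 - e^(-x_0/μ). The cut-offs of 6 and 18 minutes below are hypothetical values chosen for illustration, as is the helper name `exponential_cdf`.

```python
import math

def exponential_cdf(x0, mu):
    """P(x <= x0) for an exponential random variable with mean mu."""
    return 1 - math.exp(-x0 / mu)

mu = 15  # expected loading time in minutes

# Hypothetical questions: loading takes 6 minutes or less,
# and loading takes between 6 and 18 minutes
p_within_6 = exponential_cdf(6, mu)
p_between = exponential_cdf(18, mu) - exponential_cdf(6, mu)

print(round(p_within_6, 4))  # about 0.3297
print(round(p_between, 4))   # about 0.3691
```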

Measures of distribution shape (and outliers)​: z-score (or standardized value)

the number of standard deviations that a given value x is above or below the mean

Population mean: σ known: p-value approach

the researcher determines the exact probability of obtaining the observed sample difference, under the assumption that the null hypothesis is correct

Standard deviation of a random variable

the square root of the variance

Hypothesis Testing (P-Value Approach): If the P-value is less than (or equal to)...

then the null hypothesis is rejected in favor of the alternative hypothesis

Normal Distributions with the same SD, but different means

these curves have the same shapes but are located at different positions on the x axis

Interval estimation: σ known: FORMULA

x ̅±z_(α⁄2)×σ/√n ***where (1-α) is the confidence coefficient (95% in our previous example) and z_(α⁄2) is the z-value that provides an area equal α⁄2 in the upper tail of the standard normal probability distribution.

Interval estimate of a population mean: σ known formula

x̄ ± z_(α⁄2) × σ/√n **where (1-α) is the confidence coefficient (95% in our previous example) and z_(α⁄2) is the z-value that provides an area equal to α⁄2 in the upper tail of the standard normal probability distribution

covariance of random variables x and y

σ_xy=[Var(x+y)-Var(x)-Var(y)]⁄2 or σ_xy=∑_(i,j)[x_i-E(x_i )][y_j-E(y_j )]f(x_i,y_j )

population variance

σ²

variance for binomial distribution

σ² = np(1-p)

Difference between population means, σ_1 and σ_2 known : Example 1: Graystone Department Stores, Inc. operates one store in the inner city (1) and one store in the suburbs (2). The manager has noticed that the two stores usually sell a different mix of products and conjectured that different demographics are causing it. The manager set out to examine if the average age of customers is different in the two locations. Two independent random samples are collected: 36 clients from store 1 and 49 clients from store 2. The respective sample means, and populations standard deviations are: x ̅_1=40,σ_1=9,x ̅_2=35, σ_2=10. Construct the 95% confidence interval for the difference in means.

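Let D_0 = μ_1-μ_2. (1) Point estimate: x̄_1-x̄_2 = 40-35 = 5. (2) Interval estimate: (x̄_1-x̄_2) ± margin of error. (3) Sampling distribution of x̄_1-x̄_2: normal, N(D_0, σ_(x̄_1-x̄_2)), with standard error σ_(x̄_1-x̄_2) = √(σ_1²/n_1 + σ_2²/n_2).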
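
The interval for the Graystone example can be worked through numerically. The sketch below assumes the usual standard error for two independent samples with known standard deviations, √(σ_1²/n_1 + σ_2²/n_2); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values given in the Graystone example
n1, xbar1, sigma1 = 36, 40, 9    # inner-city store
n2, xbar2, sigma2 = 49, 35, 10   # suburban store

point_estimate = xbar1 - xbar2
# Standard error of the difference between two independent sample means
std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = NormalDist().inv_cdf(0.975)  # z_(alpha/2) for a 95% interval
margin = z * std_error

lower, upper = point_estimate - margin, point_estimate + margin
print(round(margin, 2), round(lower, 2), round(upper, 2))
```

The margin of error comes out near 4.06, so the interval runs from roughly 0.94 to 9.06 years.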

Summarizing Data for Two Variables

- Investigating the relationship between two variables - We can do it for categorical or quantitative data, or a combination of the two (As with single-variable summaries of quantitative data, we must be careful with defining classes) - Tabulations or graphical displays

Quantitative variable: Stem-and-Leaf display

- Not as widely used​ - Impractical with any large dataset

Quantitative variable: The Histogram

- One of the most common ways of summarizing data​ - X-axis: variable of interest, divided in classes (or bins)​ - Y-axis: frequency or relative frequency​

Measures of distribution shape (and outliers)​: Detecting Outliers

- Unusually large or unusually small values in a dataset. Method 1: anything with a z-score greater than, say, 3 in absolute value. Method 2: one-and-a-half times the IQR: Lower limit = Q1 - 1.5*IQR; Upper limit = Q3 + 1.5*IQR
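
A minimal sketch of Method 2, with the upper limit measured from the third quartile. The helper name `iqr_outliers` and the sample data are hypothetical. Quartile conventions differ between textbooks and software, so the limits may not match a hand calculation exactly.

```python
from statistics import quantiles

def iqr_outliers(data):
    """Flag values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _median, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    flagged = [x for x in data if x < lower or x > upper]
    return flagged, (lower, upper)

# Hypothetical data with one unusually large value
sample = [10, 12, 13, 14, 15, 16, 18, 50]
outliers, limits = iqr_outliers(sample)
print(outliers, limits)
```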

General Idea of Hypothesis Testing

- make an initial assumption - collect evidence (data) - based on the available evidence, decide whether or not the initial assumption is reasonable. Once we have written the null and alternative hypotheses and decided on the significance level, the general idea of hypothesis testing is: 1. Assume the null hypothesis holds with equality; 2. Derive the sampling distribution based on 1; 3. Ask yourself: how likely is it that your sample estimate came from that sampling distribution? i. If it's very unlikely, then we have reason to believe the null is not true; ii. If it's relatively likely, we don't have reason to reject the null hypothesis. We verify 3 using a test statistic

Relationship between sample size and the sampling distribution

1) First: E(x ̅ )=μ regardless of the sample size, IF we're dealing with a random sample. 2) Second: the sample size affects the dispersion of the sampling distribution. The larger the sample size, the lower is the standard error.

Addition Law (or Probability of the Union of Two Events)

1) If A and B are two events in a probability experiment, then the probability that either one of the events will occur is: P(A or B)=P(A)+P(B)−P(A and B) > P(A∪B)=P(A)+P(B)−P(A∩B) 2) If A and B are two mutually exclusive events, P(A∩B)=0. Then the probability that either one of the events will occur is: P(A or B)=P(A)+P(B) (Ex: Out of 30 students, 21 completed the hw, 15 missed no class, and 9 did both > P(A)+P(B)−P(A∩B) > 21/30 + 15/30 - 9/30 = 27/30 = 90%)
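
The homework example can be checked with exact fractions. A small sketch; the variable names are illustrative.

```python
from fractions import Fraction

students = 30
p_hw = Fraction(21, students)         # P(A): completed the homework
p_attendance = Fraction(15, students) # P(B): missed no class
p_both = Fraction(9, students)        # P(A and B)

# Addition law: subtract the overlap so it is not counted twice
p_either = p_hw + p_attendance - p_both
print(p_either, float(p_either))
```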

Five-number summaries

1) Minimum​ 2) Q1​ 3) Median (Q2)​ 4) Q3​ 5) Maximum​

EXAMPLES of CONTINUOUS random variables

1) Rand Exp: Customer visits a web page > Rand Var: Time customer spends on web page in minutes > Possible Values for the Rand Var: x ≥ 0 2) Rand Exp: fill a soft drink can > Rand Var: # of ounces > Possible Values for the Rand Var: 0 ≤ x ≤ 12.1 3) Rand Exp: Test a new chemical process > Rand Var: Temp when the desired reaction takes place (min temp = 150; max temp = 212) > Possible Values for Rand Var: 150 ≤ x ≤ 212 4) Rand Exp: Invest $10K in the stock market > Rand Var: Value of investment after 1 yr > Possible Values for Rand Var: x ≥ 0

EXAMPLES of DISCRETE random variables

1) Rand Exp: flip a coin > Rand Var: face of coin showing > Possible Values for the Rand Var: 1 if heads; 0 if tails 2) Rand Exp: roll a die > Rand Var: # of dots showing on top of die > Possible Values for the Rand Var: 1,2,3,4,5,6 3) Rand Exp: Contact 5 customers > Rand Var: # of customers who place an order > Possible Values for Rand Var: 0,1,2,3,4,5 4) Rand Exp: Operate a health care clinic for one day > Rand Var: # of patients who arrive > Possible Values for Rand Var: 0, 1, 2, 3, .... 5) Rand Exp: Offer a customer the choice of two products > Rand Var: product chosen by customer > Possible Values for Rand Var: 0 if none; 1 if choose product A; 2 if choose product B

Interval estimation: σ unknown - EXAMPLE: 95% confidence interval for the credit card debts of the US population.

1) STEP 1: compute sample mean and sample standard deviation. x̄=$9,312 s=$4,007 2) STEP 2: find the t-value corresponding to α⁄2=0.025 and n-1=69 degrees of freedom
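
The remaining step, building the interval x̄ ± t × s/√n, might look like the sketch below. Python's standard library has no t distribution, so the t-value is hard-coded as approximately 1.995, the value a t table gives for an upper-tail area of 0.025 with 69 degrees of freedom. Treat that number as an assumption read from a table, not something computed here.

```python
from math import sqrt

xbar = 9312   # sample mean from STEP 1
s = 4007      # sample standard deviation from STEP 1
n = 70        # sample size implied by n - 1 = 69 degrees of freedom

# Assumption: approximate t table value for area 0.025 and 69 d.f.
t_value = 1.995

margin = t_value * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(round(margin), round(lower), round(upper))
```

The margin of error is about $955, so the interval is roughly $8,357 to $10,267.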

Step 1) Population mean: σ known: EXAMPLE: FTC conducts periodical statistical studies of specific products. Suppose FTC is investigating Hilltop Coffee's claim that a can contains 3 pounds of coffee. For the FTC, the company is complying if the average weight is at least 3.

1) Start by defining the null and alternative hypotheses: H_0:μ≥3 H_a:μ<3

Uniform Probability Distribution EXAMPLE: Suppose x represents the travel time of a train from Chicago to Pittsburgh. The trip can last anywhere from 400 to 440 minutes. Suppose we have enough data to assert that any 1-minute interval has the same chance of occurring. 1. x is a continuous random variable; 2. x is said to follow a uniform probability distribution; 3. the probability density function is: f(x) = 1⁄40 for 400≤x≤440, and f(x) = 0 for any other x

1) What is the probability that the train ride takes at least 405 minutes and at most 420 minutes? P(405 <= x <= 420) = 1/40 * (420-405) = 15/40 = 37.5% 2) What is the probability it takes at least 400 minutes and at most 440 minutes? P(400 <= x <= 440) = 1/40 * (440-400) = 40/40 = 1 3) What is the probability it takes exactly 425 minutes? P(x=425) = 1/40 * (425-425) = 0 **It's the same idea for every continuous probability distribution.
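
The three answers are all "height times width" under a flat density. A minimal sketch, with the hypothetical helper `uniform_prob`:

```python
def uniform_prob(lo, hi, a=400.0, b=440.0):
    """P(lo <= x <= hi) for x uniformly distributed on [a, b]."""
    # Clip the requested interval to the support of the distribution
    lo, hi = max(lo, a), min(hi, b)
    if hi <= lo:
        return 0.0  # a single point (or empty interval) has zero area
    return (hi - lo) / (b - a)

p1 = uniform_prob(405, 420)  # between 405 and 420 minutes
p2 = uniform_prob(400, 440)  # the whole range
p3 = uniform_prob(425, 425)  # exactly 425 minutes
print(p1, p2, p3)
```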

Conditions for Normal (continuous) Approx to a Binomial Distribution (discrete)

1) condition 1: np >= 5 2) condition 2: n(1-p) >= 5. Then Binomial ~ N(μ, σ) approximately, with μ = np and σ = √(np(1-p))

General idea of hypothesis testing:

1. Assume the null hypothesis holds with equality; 2. Derive the sampling distribution based on 1; 3. Ask yourself: how likely is it that your sample estimate came from that sampling distribution? i. If it's very unlikely, then we have reason to believe the null is not true; ii. If it's relatively likely, we don't have reason to reject the null hypothesis. **We verify 3 using a test statistic

Properties of Normal Probability Distribution

1. Every single normal distribution is defined by two parameters: mean and standard deviation; 2. The mean is equal to the median and the mode, and it marks the highest point in the graph; 3. The mean could be any numerical value; 4. The distribution is symmetric (skewness is zero). Moreover, x can be any value in the real line (the graph never touches the axis); 5. Standard deviation determines how flat and wide the normal curve is; 6. Probabilities are, as usual, determined by the area under the curve; 7. For the normal distribution: a) 68.3% of the values lie within 1 standard deviation of the mean b) 95.4% of the values lie within 2 standard deviations of the mean c) 99.7% of the values lie within 3 standard deviations of the mean
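
The three percentages in property 7 can be reproduced from the standard normal cumulative distribution. A short sketch using the standard library's `NormalDist`:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the curve within k standard deviations of the mean
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.1%}")
```

Because any normal distribution can be standardized, the same areas apply whatever the mean and standard deviation are.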

Difference between population means, σ_1 and σ_2 known

1. Population 1 has mean μ_1 2. Population 2 has mean μ_2 3. Take a simple random sample of size n_1 of population 1, and a simple random sample of size n_2 of population 2 4. For now, we assume that the population standard deviations, σ_1 and σ_2 , are known 5. We will focus on inference about the difference between the means: μ_1-μ_2

Difference between population means, σ_1 and σ_2 UNknown

1. Population 1 has mean μ_1 2. Population 2 has mean μ_2 3. Take a simple random sample of size n_1 of population 1, and a simple random sample of size n_2 of population 2 4. We use the sample standard deviations, s_1 and s_2, to estimate the population standard deviations. Consequently, the sampling distribution will follow a t-distribution, rather than a normal distribution 5. We will focus on inference about the difference between the means: μ_1-μ_2

Properties of a Binomial Experiment

1. The experiment consists of n identical trials; 2. Two outcomes are possible on each trial (success or failure); 3. The probability of each outcome does not change from trial to trial (p is the constant probability of success); 4. The trials are independent.

Step 2) Population mean: σ known: EXAMPLE: Suppose FTC randomly selects a sample of 36 cans of coffee and calculates the mean weight in the sample. How much lower than 3 pounds should the sample mean be for FTC to reject the null hypothesis, given a chosen significance level? **Note: the significance level is the acceptable probability of a Type I error: rejecting the null hypothesis when it is in fact true (as an equality).

2) In this example, x ̅=2.92 and we can assume σ=0.18. Moreover, the FTC director is willing to accept a 1% chance of being wrong and punishing the company when in fact it did nothing wrong. We set α=1%.
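
With these numbers the lower-tail test can be carried out directly. The sketch below computes the test statistic z = (x̄ - μ_0)/(σ/√n) and its p-value; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 3.0     # hypothesized mean weight (pounds)
xbar = 2.92    # sample mean
sigma = 0.18   # population standard deviation (assumed known)
n = 36         # number of cans sampled
alpha = 0.01   # significance level

# Distance of the sample mean from mu_0 in standard errors
z = (xbar - mu_0) / (sigma / sqrt(n))
# Lower-tail test: p-value is the area to the left of z
p_value = NormalDist().cdf(z)
reject_null = p_value <= alpha
print(round(z, 2), round(p_value, 4), reject_null)
```

The statistic is about -2.67 with a p-value near 0.0038, which is below 1%, so the null hypothesis is rejected.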

Calculating the probability of a Type II error EXAMPLE: A company needs to decide to accept or return a large shipment of batteries. The company wants to accept the shipment if average life is at least 120 hours. The company selects a sample of 36 batteries for testing. Assume that α=0.05 and σ=12. IF the true population mean is 115, what is the chance that we will (mistakenly) fail to reject the null; in other words, what are the chances that we'll make a Type II error?

H_0: μ ≥ 120 H_a: μ < 120 What is the largest value of the sample mean that would still lead us to reject H_0?
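
One way to finish the battery example is to find the cutoff for the sample mean first, then ask how likely it is to land above that cutoff when the true mean is 115. The sketch below follows that route; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, alpha = 120, 12, 36, 0.05
true_mean = 115

std_error = sigma / sqrt(n)
z_alpha = NormalDist().inv_cdf(1 - alpha)

# Lower-tail test: reject H_0 when the sample mean is at or below this value
critical_xbar = mu_0 - z_alpha * std_error

# Type II error: the sample mean falls above the cutoff even though mu = 115
beta = 1 - NormalDist(mu=true_mean, sigma=std_error).cdf(critical_xbar)
print(round(critical_xbar, 2), round(beta, 3))
```

The cutoff is about 116.71 hours and the probability of a Type II error is a little under 20%. Table-based hand calculations give a slightly different third decimal because z is rounded.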

ANOVA and the completely randomized design

H_0: μ_1=μ_2=...=μ_k H_a: Not all population means are equal, where μ_j=mean of the jth population. Consider a simple random sample from each population and let: x_ij=value of observation i for treatment j; n_j=number of observations for treatment j; x̄_j=sample mean for treatment j; s_j^2=sample variance for treatment j; s_j=sample standard deviation for treatment j

Experiment, sample space, counting rules: combination

How many combinations of, say, q elements can we form starting from Q>q elements? Here the order of the elements doesn't matter: {q1,q2} is the same as {q2,q1}. (Ex: suppose there is a party with 12 people, and you need to trace every two-person contact. How many cases do you need to examine? > (12 2) = 12!/(2!(12-2)!) = 66) *n = # of items in the set (Q above) *r = # of elements selected from the set (q above)
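
The party example can be verified two ways, with the factorial formula n!/(r!(n-r)!) and with Python's built-in `math.comb`:

```python
from math import comb, factorial

n, r = 12, 2  # 12 guests, contacts traced two at a time

# Factorial formula for combinations
by_formula = factorial(n) // (factorial(r) * factorial(n - r))
# Built-in equivalent
by_builtin = comb(n, r)
print(by_formula, by_builtin)
```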

Classical, relative frequency, and subjective methods: Relative frequency​

How often something happens divided by the total number of trials or observations

Population mean: σ known

Hypothesis testing about a population mean • If population distribution is normal, tests are exact • If we can't say if population is normally distributed, tests are approximations.

Determining the sample size for a hypothesis test: population mean : On the relationship between α, β and the sample size n

I. Once two of the three values are known, the other can be computed; II. For a given level of significance α, increasing the sample size will reduce β; III. For a given sample size, decreasing α will increase β, whereas increasing α will decrease β;

Experiment, sample space, counting rules: product rule

If an experiment can be described as a sequence of k steps with n1, n2, ..., nk possible outcomes respectively, the total number of experimental outcomes is n1*n2*...*nk (ex: toss two coins and roll a die > outcome 1 = {H,H,3}, outcome 2 = {H,T,3} > # of outcomes = 2*2*6 = 24)
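
Enumerating the sample space is a quick way to confirm the count in the coin-and-die example. A sketch using `itertools.product`; the variable names are illustrative.

```python
from itertools import product

coin = ("H", "T")
die = range(1, 7)

# Every outcome is one (first coin, second coin, die face) triple
sample_space = list(product(coin, coin, die))
print(len(sample_space), sample_space[:3])
```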

Interval Estimation EXAMPLE: Let's revisit a problem from the previous class. What's strange about it? Suppose you don't have access to the entire population of 2,500 managers, but only to a randomly selected sample of size 50. Your boss wants to know if the sample average would be close enough to the population average of $51,800, say within $500 on each side, with at least 80% chance. How would you answer?

If you know the population average, why on earth do I need to get a sample and compute the sample average???

Population does have a Normal Distribution

In this case, the sampling distribution of x ̅ is normally distributed for any sample size

Properties of point estimators: UNBIASEDNESS

Is the expected value of the sample statistic equal to the population parameter? The sample statistic θ ̂ is an unbiased estimator of the population parameter θ if E(θ ̂ )=θ where: E(θ ̂ )=the expected value of the sample statistic θ ̂

Classical, relative frequency, and subjective methods: Classical

Appropriate when it makes sense to assume all experimental outcomes are equally likely > every outcome of the experiment has the same chance of occurring (with n outcomes, each has probability 1/n)

Difference between population means, σ_1 and σ_2 known: hypothesis testing

Let D_0 denote the hypothesized difference between the means. Lower Tail Test: H_0:μ_1-μ_2≥D_0 H_a:μ_1-μ_2<D_0. Upper Tail Test: H_0:μ_1-μ_2≤D_0 H_a:μ_1-μ_2>D_0. Two-Tailed Test: H_0:μ_1-μ_2=D_0 H_a:μ_1-μ_2≠D_0

Quantitative variable: Cumulative Distribution (e.g. CDF)

Like a histogram, but frequencies are cumulative

normal probability table

Lists z-scores and corresponding percentiles

Population mean: σ known: One-Tailed Test

Lower Tail Test H_0:μ≥μ_0 H_a:μ<μ_0 Upper Tail Test H_0:μ≤μ_0 H_a:μ>μ_0

Developing null and alternative hypotheses- Case 2: Assumption to be challenged

Null hypothesis defined first; we don't have a particular result in mind

Exponential Probability and the Poisson Distribution Example 1: Suppose that the number of potholes in any 4-mile stretch of a highway follows a Poisson distribution with mean 10. Hence:

Poisson: f(x) = (μ^x e^(-μ))/x! = (10^x e^(-10))/x!

Rejection rule for a lower-tail test: critical value approach

Reject H_0 if z ≤ -z_α, where -z_α is the critical value; that is, the z value that provides an area of α in the lower tail of the standard normal distribution

Steps When Dealing with small samples

STEP 1: compute sample mean and sample standard deviation. STEP 2: find the t-value corresponding to α⁄2 and n-1 degrees of freedom. STEP 3: construct confidence interval.

TWO-tailed hypothesis test

a test in which the null hypothesis is rejected in favor of the alternative hypothesis if the evidence indicates that the population parameter is either smaller or larger than a hypothesized value

Measures of distribution shape (and outliers)​: Chebyshev's Theorem

at least (1 - 1/z^2) of the data must be within z standard deviations of the mean, where z is any value greater than 1
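
The bound is easy to tabulate for a few values of z. A minimal sketch, with the hypothetical helper `chebyshev_lower_bound`:

```python
def chebyshev_lower_bound(z):
    """Minimum share of the data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem applies only for z > 1")
    return 1 - 1 / z**2

bounds = {z: chebyshev_lower_bound(z) for z in (2, 3, 4)}
for z, share in bounds.items():
    print(f"z = {z}: at least {share:.1%}")
```

These are guarantees for any distribution, which is why they are much looser than the percentages the empirical rule gives for bell-shaped data.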

Measures of association between two variables: Covariance

descriptive measure of linear association between variables

Poisson Probability Example: You observe a call center during regular business hours for, say, 100 days. The average number of calls received in a 5-minute interval is equal to 12. If properties 1 and 2 are valid, the number of calls received during a 5-minute interval follows a Poisson distribution 1) What is the probability that we observe 8 calls in a 5-minute interval?

f(x)=(12^x e^(-12))/x!
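
Plugging x = 8 into the formula answers the question. A sketch with the hypothetical helper `poisson_pmf`:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean is mu."""
    return mu**x * exp(-mu) / factorial(x)

# Exactly 8 calls in a 5-minute interval, with a mean of 12
p_eight_calls = poisson_pmf(8, 12)
print(round(p_eight_calls, 4))
```

The probability is about 0.0655, or roughly 6.6%.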

Population does not have a Normal Distribution

in these cases, we can invoke the CENTRAL LIMIT THEOREM: for random samples of size n from a population, the sampling distribution of the sample mean x ̅ can be approximated by a normal distribution as the sample size becomes large

Probability

likelihood that a particular event will occur

probability distribution of a discrete random variable

model for population data a) assign probabilities to an outcome or collection of outcomes b) usually written as a function > f(x) = probability

Categorical variable: Frequency Distribution

most common way of summarizing categorical variables​ - Step 1: determine number of classes;​ - Step 2: determine the width of each class or category;​ - Step 3: beware of limits!

Understanding ANOVA

• The closer the sample means are to one another, the weaker the evidence against the null. • In other words, if the variability among the sample means is small, it supports H_0; if the variability is large, it supports H_a. • As before, we start by assuming H_0 is true and derive the relevant test statistic under that assumption. • For ANOVA, the evidence will come from a comparison of the within-samples and between-samples variability.

population proportion

ratio of members of a population with a particular characteristic to the total members of the population

bivariate empirical discrete probability/ bivariate probabilities distribution or joint probabilities

represents the joint probability distribution of a pair of random variables > Each row in the table represents a value of one of the random variables (call it X) and each column represents a value of the other random variable (call it Y)

Normal Distributions with same means, but different standard deviations

the larger the standard deviation, the more dispersed, or spread out, the distribution is

Measures of location: Weighted Mean

the mean obtained by assigning each observation a weight that reflects its importance

Sampling Distribution

• Every time we draw a random sample, we get different point estimators. If we consider taking a sample to be an experiment, then the point estimators are random variables. • As such, the point estimators have an expected value and variance, and can have an associated probability distribution. This is the ALL-IMPORTANT sampling distribution

Population mean: σ unknown: test stat

• For the σ known case, the test statistic has a standard normal distribution • For the σ unknown, however, the test statistic follows the t distribution > like a normal but has more variability in small samples.

Dealing with small samples

• If the population distribution is normal, then the sampling distribution is exact and can be used for any sample size. • Otherwise, n≥30 is adequate. • If you only have access to smaller samples, say 15 or 20, check: 1. Is it roughly symmetric? Yes 2. Too much skewness? No 3. Outliers? No

Hypothesis testing and decision-making

• If you reject the null, you have strong evidence in favor of the alternative AND you know the chances of having made a mistake (significance level; Type I error). > Rejecting the null usually leads to a clear decision. • If you fail to reject the null, however, we don't know (yet) how confident we should be in "accepting H_0". a) That is, we haven't calculated (yet) the chances of a Type II error: accepting the null hypothesis when the alternative is in fact true. b) Therefore we say, "failure to reject the null", instead of "accept the null". Not rejecting the null often does not lead to a decision, but to a stalemate.

Interval estimation: σ unknown

• When we don't know σ, we can't use it! (duh!) • We will therefore use a point estimate for it: the sample standard deviation, s. • BUT, when we must estimate σ, the interval estimate is based on a different distribution: the t distribution

Properties of point estimators

• notation • unbiasedness • efficiency • consistency

Sampling Distribution of x ̅

• x ̅ is a random variable, since any one sample will have a (somewhat) different mean. • The probability distribution of x ̅ is called the sampling distribution. • The sampling distribution of x ̅ is the probability distribution of all possible values of the sample mean x ̅

