STA 2023 Unit 2

confidence interval

is a range of values with an associated probability or confidence level C. The probability quantifies the chance that the interval contains the true population parameter

mean of a random variable

is a weighted average of the possible values of X, reflecting the fact that all outcomes might not be equally likely.

variance of a random variable

is a weighted average of the squared deviations (X − µX)² of the variable X from its mean µX.

test statistic

is based on the statistic that estimates the parameter. For a test about a population mean, it is the standardized sample mean z = (x̄ − µ0)/(σ/√n).

The mean of the sampling distribution

is equal to the population mean μ.

The probability of making a Type II error

is labeled beta

Type II error

is made when we fail to reject the null hypothesis and the null hypothesis is false (incorrectly keep a false H0).

Type I error

is made when we reject the null hypothesis and the null hypothesis is actually true (incorrectly reject a true H0).

critical value z*

is related to the chosen confidence level C. C is the area under the standard Normal curve between −z* and z*

power of a test of hypothesis with fixed significance level α

is the probability that the test will reject the null hypothesis when the alternative is true. In other words, it is the probability that the data gathered in an experiment will be sufficient to reject a wrong null hypothesis.

The probability of making a Type I error

is the significance level alpha

binomial probability

is the number of ways of arranging k successes among n observations, multiplied by the probability of any one specific arrangement of the k successes: P(X = k) = (n choose k) p^k (1 − p)^(n − k)
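
The formula above can be checked numerically. This is a minimal sketch, not part of the original card: the function name `binomial_pmf` is an illustrative choice, and the example uses 10 fair coin flips (n = 10, p = 0.5).

```python
from math import comb

def binomial_pmf(n: int, p: float, k: int) -> float:
    """P(X = k) for X ~ B(n, p): (n choose k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 5 heads in 10 fair coin flips.
p_five_heads = binomial_pmf(10, 0.5, 5)
print(p_five_heads)

# A complete set of probabilities (k = 0..n) should add up to 1.
total = sum(binomial_pmf(10, 0.5, k) for k in range(11))
print(total)
```

`math.comb` computes the binomial coefficient n! / (k! (n − k)!) directly, so no factorials need to be written out by hand.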

The standard deviation of the sampling distribution

is σ/√n, where n is the sample size.

Averages are more/less variable than individual observations.

less

mean of binomial distribution

µ = np

margin of error formula

m = z* σ/√n

statistical significance doesn't tell you about the ____ of the effect, only whether there is one

magnitude

Let X = your bank balance at the start of the day, and let Y = your bank balance at the end of the same day. And let us suppose that your starting balances vary with mean $2500 and standard deviation $37, while your ending balances vary with mean $3700 and standard deviation $34, and the two variables are correlated with ρ=0.75. Let us suppose that instead I will give you $60 cash at the start of each day. On average, how much will you start out with each day? What is the standard deviation of those amounts?

µ(b + X) = b + µX = 60 + 2500 = $2,560. Adding a constant does not change the spread: σ²(b + X) = σ²X, so σ(b + X) = σX = $37.

tests of significance

a method of statistical inference that assesses evidence for a claim about a population

confidence intervals

a method of statistical inference that estimates the value of a population parameter

independent events use the ______ rule

multiplication rule P(BBB)= P(B) x P(B) x P(B)

Let X = your bank balance at the start of the day, and let Y = your bank balance at the end of the same day. And let us suppose that your starting balances vary with mean $2500 and standard deviation $37, while your ending balances vary with mean $3700 and standard deviation $34, and the two variables are correlated with ρ=0.75. On average, how much is deposited per day? What is the standard deviation of daily deposits

µ(Y − X) = µY − µX = 3700 − 2500 = $1,200. σ²(Y − X) = σ²Y + σ²X − 2ρσXσY = (34)² + (37)² − (2)(0.75)(34)(37) = 638, so σ(Y − X) = √638 ≈ $25.26.
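
The arithmetic for the daily deposit can be reproduced with a short Python sketch. The variable names are illustrative; the numbers are the ones given in the card.

```python
from math import sqrt

# Values from the card: X = starting balance, Y = ending balance.
mu_x, sd_x = 2500.0, 37.0
mu_y, sd_y = 3700.0, 34.0
rho = 0.75  # correlation between X and Y

# Mean of the difference Y - X (the daily deposit).
mean_deposit = mu_y - mu_x

# Variance of a difference of correlated variables:
# var(Y - X) = var(Y) + var(X) - 2 * rho * sd_x * sd_y
var_deposit = sd_y**2 + sd_x**2 - 2 * rho * sd_x * sd_y
sd_deposit = sqrt(var_deposit)

print(mean_deposit, var_deposit, round(sd_deposit, 2))
```

Because the two balances are positively correlated, the standard deviation of the difference is smaller than it would be for independent variables (where the 2ρσXσY term drops out).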

If A happens and it does NOT prevent B from happening

not disjoint

A company tests whether the mean volume of tea in their bottles is 500 ml, as stated on the label. Here, the company would likely be more concerned that the bottles contain less than advertised, which would likely lead to consumer complaints of false advertising. Thus we test:

one-sided test: H0 : µ = 500 ml Ha : µ < 500 ml

test of statistical significance

tests a specific hypothesis using sample data to decide on the validity of the hypothesis.

a SMALL p value implies

that random variation because of the sampling process alone is not likely to account for the observed difference.

If A happens and it DOES alter the chances that B will happen

dependent

finite sample spaces deal with ____ data

discrete

discrete or continuous & sample space: the exact number of characters printed on a randomly selected page of this packet

discrete S= {1,2,3,4...}

discrete or continuous & sample space: the number of eyelashes on a randomly selected giraffe

discrete S= {1,2,3,4...}

complement rule

P(A^c) = 1 − P(A), where A^c is the complement of A. The complement rule states that the probability of an event not occurring is 1 minus the probability that it does occur.

A complete set of probabilities always add up to

1

the total area under a probability histogram is always

1 just as the total area under a density curve is always 1

how to calculate the variance of a discrete random variable

σ²X = (x1 − µX)² p1 + (x2 − µX)² p2 + ..., where pi is the probability of the value xi

You are in charge of quality control in your food company. You sample randomly four packs of cherry tomatoes, each labeled 1/2 lb. (227 g). The average weight from your four boxes is 222 g. H0 : µ = 227 g (µ is the average weight of the population of packs) Ha : µ ≠ 227 g (µ is either larger or smaller) What is the probability of drawing a random sample such as yours if H0 is true? Find the P Value, is it significant? Do you reject the null hypothesis?

P-value = 0.0456 (see notes). Yes, it is significant because P < 0.05. Yes, you reject the null hypothesis.
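
The card does not state the population standard deviation, so the calculation cannot be reproduced from the card alone. The sketch below ASSUMES σ = 5 g, a value chosen here only because it gives z = −2 and a two-sided P-value close to the one on the card; treat it as an illustration, not as the course's worked solution.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 227.0   # hypothesized mean weight under H0 (g)
x_bar = 222.0  # observed sample mean (g)
n = 4          # number of packs sampled
sigma = 5.0    # ASSUMPTION: population sd in g; not stated on the card
alpha = 0.05   # significance level

# Standardize the sample mean under H0.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-sided P-value: area in both tails beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))
reject_h0 = p_value <= alpha

print(z, round(p_value, 4), reject_h0)
```

The computed P-value differs from 0.0456 in the fourth decimal place because the card's value comes from a rounded table entry.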

how to find the mean of a discrete random variable

(x1 × probability 1) + (x2 × probability 2) + ...

random variable

a numerical outcome of a random process; many random variables can be defined for one random process. ex) rolling 2 dice (random process); R.V.: sum of the two faces, with values 2 to 12

the probability of any single value under a density curve is

0

P-value of ___ or less is considered significant

0.05

Let X = your bank balance at the start of the day, and let Y = your bank balance at the end of the same day. And let us suppose that your starting balances vary with mean $2500 and standard deviation $37, while your ending balances vary with mean $3700 and standard deviation $34, and the two variables are correlated with ρ=0.75. Let us suppose that I will give you 5% of your starting balance every day. On average how much will I give you every day? What is the standard deviation of those amounts?

µ(0.05X) = 0.05µX = 0.05(2500) = $125. σ²(0.05X) = (0.05)²(37)² ≈ 3.42, so σ(0.05X) = √3.42 ≈ $1.85.
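
Both bank-balance variations (adding $60, or taking 5% of the balance) are linear transformations a + bX. A minimal sketch of the rules, with a hypothetical helper name `linear_transform`:

```python
def linear_transform(mu_x, sd_x, a, b):
    """Mean and standard deviation of a + b*X, given the mean and sd of X.

    The mean shifts and scales: a + b * mu_x.
    The sd only scales: |b| * sd_x (adding a constant does not change spread).
    """
    return a + b * mu_x, abs(b) * sd_x

# 5% of the starting balance each day: a = 0, b = 0.05.
gift_mean, gift_sd = linear_transform(2500, 37, a=0, b=0.05)

# $60 cash added to the starting balance each day: a = 60, b = 1.
cash_mean, cash_sd = linear_transform(2500, 37, a=60, b=1)

print(gift_mean, gift_sd, cash_mean, cash_sd)
```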

Let us suppose that the left eye spherical measurements on eyeglass prescriptions are Normally distributed with mean 0 diopters and standard deviation 1 diopter. We will randomly select 35 people and calculate their mean left eye spherical measurement. (c) What is the probability that the sample mean turns out to be within 0.4 diopters of the population mean?

1 − 0.0082 − 0.0082 = 0.9836 ≈ 98%
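
The same probabilities can be computed without a table. This sketch uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative. Because it does not round z to −2.4 first, the results differ slightly from the table-based values on the card.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 0.0, 1.0, 35

# Sampling distribution of the sample mean: N(mu, sigma / sqrt(n)).
sampling_dist = NormalDist(mu, sigma / sqrt(n))

# P(sample mean < -0.4)
p_below = sampling_dist.cdf(-0.4)

# P(-0.4 < sample mean < 0.4)
p_within = sampling_dist.cdf(0.4) - sampling_dist.cdf(-0.4)

print(round(p_below, 4), round(p_within, 4))
```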

the power of a test is

1 − β.

the population parameter "mu" must be within roughly ___ standard deviations from the sample average, in ___ of all samples.

2; 95%

Decrease σ.

A larger variance σ² implies a larger spread of the sampling distribution, σ/√n. Thus, the larger the variance, the lower the power. The variance is in part a property of the population, but it is possible to reduce it to some extent by carefully designing your study.

law of large numbers

As the number of randomly drawn observations in a sample increases, the mean of the sample gets closer and closer to the population mean "mu". only applies to really large numbers

confidence interval to test a two-sided hypothesis.

C = 1 - α.

In a large population of adults, the mean IQ is 112 with standard deviation 20. Suppose 200 adults are randomly selected for a market research campaign. The distribution of the sample mean IQ is: A) Exactly normal, mean 112, standard deviation 20 B) Approximately normal, mean 112, standard deviation 20 C) Approximately normal, mean 112 , standard deviation 1.414 D) Approximately normal, mean 112, standard deviation 0.1

C) Approximately normal, mean 112 , standard deviation 1.414

If A happens and it DOES prevent B from happening?

disjoint and dependent

The FDA tests whether a generic drug has an absorption extent similar to the known absorption extent of the brand-name drug it is copying. Higher or lower absorption would both be problematic, thus we test:

two-sided test: H0 : µgeneric = µbrand Ha : µgeneric ≠ µbrand

mean and margin of error for confidence interval

Mean ± m, where m is called the margin of error: µ is within x̄ ± m. Ex: 120 ± 6. Equivalently, as the two endpoints of an interval: µ is within (x̄ − m) to (x̄ + m). Ex: 114 to 126.

statistical inference

Methods for drawing conclusions about a population from sample data

Increase α

More conservative significance levels (lower α) yield lower power. Thus, using an α of .01 will result in lower power than using an α of .05.

Let us suppose that the left eye spherical measurements on eyeglass prescriptions are Normally distributed with mean 0 diopters and standard deviation 1 diopter. We will randomly select 35 people and calculate their mean left eye spherical measurement. (a) Write down the distribution of the sample mean.

N(0, 1/√35)

the sample mean distribution is N()

N(μ, σ/√n)

discrete or continuous & sample space: the exact square footage of a randomly selected apartment

continuous S = {0 ≤ x < ∞}

The fees in a sample of 292 bankruptcy cases were examined. x̄ = $1078 and s = $592. What is the distribution of the sample means x̄? Find the middle 95% of the sample means distribution.

Normal (mean μ, standard deviation σ/√n), estimated as N($1078, $34.6) using x̄ for μ and s for σ. The middle 95% is roughly ± 2 standard deviations from the mean, or $1078 ± 2 × $34.6: approximately ($1008.80, $1147.20).
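
A short sketch of the calculation, using the sample values from the card as estimates of μ and σ. The variable names are illustrative, and the endpoints differ by a few cents from the card because the standard error is not rounded to 34.6 first.

```python
from math import sqrt

x_bar, s, n = 1078.0, 592.0, 292

# Standard deviation of the sampling distribution, estimated by s / sqrt(n).
standard_error = s / sqrt(n)

# Middle 95%: roughly 2 standard deviations on either side of the mean.
lower = x_bar - 2 * standard_error
upper = x_bar + 2 * standard_error

print(round(standard_error, 1), round(lower, 2), round(upper, 2))
```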

A coin is flipped 10 times. Each outcome is either a head or a tail. The variable X is the number of heads among those 10 flips, our count of "successes." Find B (n,p)

On each flip, the probability of success, "head," is 0.5. The number X of heads among 10 flips has the binomial distribution B(n = 10, p = 0.5).

The probability that you obtain two heads OR two tails when flipping two coins

P(HH or TT) = P(HH) + P(TT) = 0.25 + 0.25 = 0.50

discrete data

data that can only take on a limited number of values.

Let us suppose that 17% of all hospital admissions are for gun-related injuries. X = the number of gun-related admissions in 29 randomly selected hospital admissions is a Binomial random variable. What is the probability that more than 5 but less than 12 are gun-related among 29 randomly selected hospital admissions?

P(X = 6) + P(X = 7) + P(X = 8) + ... + P(X = 11) = P(X ≤ 11) − P(X ≤ 5) = binomcdf(29, .17, 11) − binomcdf(29, .17, 5) = 0.3683
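
For readers without a calculator that has `binomcdf`, the same cumulative probability can be sketched in Python. The helper name `binomial_cdf` is an illustrative choice that mirrors the calculator function's argument order; it is not a library call.

```python
from math import comb

def binomial_cdf(n: int, p: float, k: int) -> float:
    """P(X <= k) for X ~ B(n, p), summing the binomial probabilities 0..k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# P(5 < X < 12) = P(X <= 11) - P(X <= 5) for X ~ B(29, 0.17)
prob = binomial_cdf(29, 0.17, 11) - binomial_cdf(29, 0.17, 5)
print(round(prob, 4))
```

The key step is translating "more than 5 but less than 12" into the whole-number range 6 through 11 before subtracting the two cumulative probabilities.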

What is the probability, if we pick one woman at random, that her height will be some value X? For instance, between 68 and 70 inches P(68 < X < 70)? N(64.5, 2.5)

Probability= 0.0669
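
The area between 68 and 70 inches can be computed as a difference of two cumulative probabilities. A minimal sketch with the standard-library `statistics.NormalDist`, reading N(64.5, 2.5) as mean 64.5 and standard deviation 2.5:

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)

# P(68 < X < 70) = P(X < 70) - P(X < 68)
p_between = heights.cdf(70) - heights.cdf(68)
print(round(p_between, 4))
```

Equivalently, standardize both endpoints (z = 1.4 and z = 2.2) and subtract the two table areas.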

small p values and null hypothesis

REJECT the null hypothesis: the true property of the population is significantly different from what was stated in H0.

A basketball player shoots three free throws. What is the number of baskets made?

S = { 0, 1, 2, 3 }

A basketball player shoots three free throws. What are the possible sequences of hits (H) and misses (M)?

S = { HHH, HHM, HMH, HMM, MHH, MHM, MMH, MMM } Note: 8 elements, 2³

If you flip two coins, and the first flip does not affect the second flip, what are the sample space and the probability of each of these events?

S = {HH, HT, TH, TT}. The probability of each of these events is 1/4, or 0.25.

ex) coin flip: what is the sample space? What is the probability of heads and of tails?

S= {H,T} P of heads= 0.5 P of tails= 0.5

p value

Tests of statistical significance quantify the chance of obtaining a random sample result at least as extreme as the one observed if the null hypothesis were true. This P-value is a way of assessing the "believability" of the null hypothesis given the evidence provided by a random sample.

binomial distribution

The distribution of the count X of successes in the binomial setting is the binomial distribution with parameters n and p: B(n,p). The parameter n is the total number of observations. The parameter p is the probability of success on each observation. The count of successes X can be any whole number between 0 and n

binomial coefficient

The number of ways of arranging k successes in a series of n observations (with constant probability p of success) is the number of possible combinations (unordered sequences): (n choose k) = n! / (k! (n − k)!)

null hypothesis, H0

The statement being tested in a test of significance

S= sample space

This is a set, or list, of ALL possible outcomes of a random process.

significance level, alpha

This value is decided arbitrarily before conducting the test.

central limit theorem

When randomly sampling from any population with mean μ and standard deviation σ, when n is large enough, the sampling distribution of x̄ is approximately Normal: x̄ ~ N(μ, σ/√n).
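
A small simulation can make the theorem concrete. This is a sketch under stated assumptions: the population is a right-skewed exponential distribution with μ = 1 and σ = 1, and the sample size, number of repetitions, and seed are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 50              # size of each sample
num_samples = 2000  # number of repeated samples

# Exponential population with rate 1: mean mu = 1, sd sigma = 1 (not Normal).
mu, sigma = 1.0, 1.0

# Draw many samples and record each sample mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

observed_mean = mean(sample_means)   # should be close to mu
observed_sd = stdev(sample_means)    # should be close to sigma / sqrt(n)
theoretical_sd = sigma / sqrt(n)

print(observed_mean, observed_sd, theoretical_sd)
```

Even though individual observations are strongly skewed, the simulated sample means cluster around μ with a spread near σ/√n, which is what the theorem predicts.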

discrete random variable

X has a finite number of possible values

continuous random variable

X takes all values in an interval (infinite)

higher confidence C implies

a larger margin of error m (thus less precision in our estimates).

probability

of any outcome of a random phenomenon can be defined as the proportion of times the outcome would occur in a very long series of repetitions.

probability distribution

of a random variable X tells us what values X can take and how to assign probabilities to those values.

Increasing the sample size

decreases the spread of the sampling distribution and therefore increases power. But there is a tradeoff between gain in power and the time and cost of testing a larger sample

continuous random variables can be represented by (1)

density curve

event

a subset of the sample space

Think of the random process of repeatedly selecting a two-kid family and recording the number of girls in each family. Let X = the number of girls in a randomly chosen two-kid family. a) what are the possible values of x? (b) Let us assume that the probability of getting a girl is 0.6 for all families. Calculate the probability of getting each number of girls that you wrote down above. (c) Construct a probability distribution table for X. (d) Calculate the mean (e) Calculate the variance (f) calculate the standard deviation

a) x = {0, 1, 2}
b) / c)
x    | 0     1     2
P(x) | 0.16  0.48  0.36
d) µX = 1.2
e) σ²X = (0 − 1.2)²(0.16) + (1 − 1.2)²(0.48) + (2 − 1.2)²(0.36) = 0.48
f) σX = √0.48 ≈ 0.69 (square root of the variance)
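
The steps for the two-kid family can be sketched in Python. The dictionary layout and variable names are illustrative choices; the probabilities follow from P(girl) = 0.6 and independent births.

```python
from math import sqrt

p_girl = 0.6

# Probability distribution of X = number of girls in a two-kid family.
distribution = {
    0: (1 - p_girl) ** 2,          # boy, boy
    1: 2 * p_girl * (1 - p_girl),  # girl-boy or boy-girl
    2: p_girl ** 2,                # girl, girl
}

# Mean: each value weighted (multiplied) by its probability.
mean_x = sum(x * p for x, p in distribution.items())

# Variance: squared deviations from the mean, weighted by probability.
var_x = sum((x - mean_x) ** 2 * p for x, p in distribution.items())

# Standard deviation: positive square root of the variance.
sd_x = sqrt(var_x)

print(mean_x, var_x, sd_x)
```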

disjoint events use ______ rule

addition rule P (A or B)= P(A) + P(B)

hypothesis

an assumption or a theory about the characteristics of one or more variables in one or more populations.

size of the effect

an important factor in determining power. Larger effects are easier to detect.
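
The four factors that affect power (σ, α, sample size, and effect size) can be explored numerically. This is a sketch for a one-sided z test of H0: µ = µ0 against Ha: µ > µ0 with known σ; the function name and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu_0, mu_a, sigma, n, alpha):
    """Power of a z test of H0: mu = mu_0 vs Ha: mu > mu_0 when the true mean is mu_a."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # H0 is rejected when the sample mean exceeds this cutoff.
    cutoff = mu_0 + z_crit * sigma / sqrt(n)
    # Power = P(sample mean > cutoff) under the alternative mean mu_a.
    return 1 - NormalDist(mu_a, sigma / sqrt(n)).cdf(cutoff)

# Hypothetical baseline, then one factor changed at a time.
base = power_one_sided_z(mu_0=0, mu_a=1, sigma=2, n=20, alpha=0.05)
larger_n = power_one_sided_z(mu_0=0, mu_a=1, sigma=2, n=40, alpha=0.05)
smaller_sigma = power_one_sided_z(mu_0=0, mu_a=1, sigma=1, n=20, alpha=0.05)
stricter_alpha = power_one_sided_z(mu_0=0, mu_a=1, sigma=2, n=20, alpha=0.01)
larger_effect = power_one_sided_z(mu_0=0, mu_a=2, sigma=2, n=20, alpha=0.05)

print(base, larger_n, smaller_sigma, stricter_alpha, larger_effect)
```

In this setup, a larger n, a smaller σ, and a larger effect each raise the power above the baseline, while the stricter α = 0.01 lowers it.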

discrete or continuous & sample space: the exact amount of gas it takes to drive from FSU to a certain off-campus home on a randomly selected day

continuous S = {0 ≤ x < ∞}

the individual outcomes of a random phenomenon are always

disjoint

mean of a random variable is also called

expected value of x

assigning probabilities empirically

from our knowledge of numerous similar past events ex) Mendel discovered the probabilities of inheritance of a given trait from experiments on peas without knowing about genes or DNA.

assigning probabilities theoretically

from our understanding of the phenomenon and symmetries in the problem ex) A six-sided fair die: each side has the same chance of turning up. ex) Genetic laws of inheritance based on the meiosis process

confidence interval in relation to null hypothesis

gives a black and white answer: Reject or don't reject H0. But it also estimates a range of likely values for the true population mean µ.

two-tail or two-sided test

has these null and alternative hypotheses: H0 : µ = [a specific number] Ha : µ ≠ [a specific number]

one-tail or one-sided test

has these null and alternative hypotheses: H0 : µ = [a specific number] Ha : µ < [a specific number] OR H0 : µ = [a specific number] Ha : µ > [a specific number]

Two events A and B are disjoint if

they have no outcomes in common and can never happen together. The probability that A or B occurs is then the sum of their individual probabilities.

If A happens and it does NOT alter the chances that B will happen

independent

how to find the standard deviation of a random variable

positive square root of the variance

The sampling distribution of a statistic is the

probability distribution of that statistic.

discrete random variables can be represented by (2)

probability distribution table; probability histogram

a lower confidence level C

produces a smaller margin of error m (thus better precision in our estimates).

p value in relation to null hypothesis

quantifies how strong the evidence is against the H0. But if you reject H0, it doesn't provide any information about the true population mean µ.

random process

series of independent trials where the outcome of each trial is unpredictable, but a pattern of outcomes shows up in the long run. ex) rolling dice

variance of binomial distribution

σ²X = np(1 − p)

standard deviation of binomial distribution

σX = √(np(1 − p))

sampling distribution of a statistic

the distribution of all possible values taken by the statistic when all possible samples of a fixed size n are taken from the population.

independent

the outcome of a new coin flip is not influenced by the result of the previous flip

alternative hypothesis , Ha

the statement we suspect is true instead of the null hypothesis

how do we find specific z* value?

use table of z/t values (table C)

If the P-value is greater than α (P > α)

we fail to reject H0.

If the P-value is equal to or less than α (P ≤ α)

we reject H0.

confidence interval formula

x̄ ± z* σ/√n
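
A minimal sketch of the confidence interval formula. The function name and the sample values (x̄ = 120, σ = 20, n = 40) are hypothetical; z* is found from the standard Normal distribution as the value with area C between −z* and z*.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence):
    """Return (z_star, margin, lower, upper) for x_bar +/- z* * sigma / sqrt(n)."""
    # Area C in the middle leaves (1 - C) / 2 in each tail.
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return z_star, margin, x_bar - margin, x_bar + margin

# Hypothetical data: sample mean 120, known sigma 20, sample size 40.
z90, m90, lo90, hi90 = z_confidence_interval(120, 20, 40, 0.90)
z95, m95, lo95, hi95 = z_confidence_interval(120, 20, 40, 0.95)
z99, m99, lo99, hi99 = z_confidence_interval(120, 20, 40, 0.99)

print(round(z95, 2), (lo95, hi95))
print(m90, m95, m99)
```

Comparing the three margins shows the tradeoff stated in the cards: a higher confidence level C gives a larger margin of error m.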

Let us suppose that 17% of all hospital admissions are for gun-related injuries. X = the number of gun-related admissions in 29 randomly selected hospital admissions is a Binomial random variable. What is the distribution of X?

X is B(29, 0.17)

Suppose that for FSU students, the probability that an individual is eating pizza at any given time is 40%. That means if I randomly sample an FSU student, the probability that they are eating pizza is 40%. Suppose we randomly sample 3 FSU students. Let X = the number among 3 randomly sampled students who are eating pizza. Construct the probability distribution for X.

x    | 0        1                2                3
P(x) | 0.6^3    3(0.4)(0.6^2)    3(0.4^2)(0.6)    0.4^3
     | 0.216    0.432            0.288            0.064

Let us suppose that 17% of all hospital admissions are for gun-related injuries. X = the number of gun-related admissions in 29 randomly selected hospital admissions is a Binomial random variable. Sample space of x?

x= {0,1,2,...,29}

Let us suppose that the left eye spherical measurements on eyeglass prescriptions are Normally distributed with mean 0 diopters and standard deviation 1 diopter. We will randomly select 35 people and calculate their mean left eye spherical measurement. (d) What is the 90th percentile of the sampling distribution of the sample mean?

z = 1.28 for the 90th percentile. 1.28 = (x̄ − 0)/(1/√35), so x̄ ≈ 0.216 diopters.

Let us suppose that the left eye spherical measurements on eyeglass prescriptions are Normally distributed with mean 0 diopters and standard deviation 1 diopter. We will randomly select 35 people and calculate their mean left eye spherical measurement. (b) What is the probability that the sample mean turns out to be less than -0.4 diopters?

z = (−0.4 − 0)/(1/√35) ≈ −2.4; P(Z < −2.4) = 0.0082, or about 0.82%

You invest 20% of your funds in Treasury bills and 80% in an "index fund" that represents all U.S. common stocks. Your rate of return over time is proportional to that of the T-bills (X) and of the index fund (Y), such that R = 0.2X + 0.8Y. Based on annual returns between 1950 and 2003: Annual return on T-bills: µX = 5.0%, σX = 2.9%. Annual return on stocks: µY = 13.2%, σY = 17.6%. Correlation between X and Y: ρ = −0.11. Find the mean and standard deviation of R.

µR = 0.2µX + 0.8µY = (0.2)(5) + (0.8)(13.2) = 11.56%. σ²R = σ²(0.2X) + σ²(0.8Y) + 2ρσ(0.2X)σ(0.8Y) = (0.2)²σ²X + (0.8)²σ²Y + 2ρ(0.2σX)(0.8σY) = (0.2)²(2.9)² + (0.8)²(17.6)² + (2)(−0.11)(0.2 × 2.9)(0.8 × 17.6) = 196.786. σR = √196.786 ≈ 14.03%.
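
The portfolio calculation can be reproduced with a few lines of Python. The variable names are illustrative; the weights, means, standard deviations, and correlation are the ones given in the card.

```python
from math import sqrt

w_x, w_y = 0.2, 0.8        # portfolio weights: T-bills, index fund
mu_x, sd_x = 5.0, 2.9      # annual return on T-bills (%)
mu_y, sd_y = 13.2, 17.6    # annual return on stocks (%)
rho = -0.11                # correlation between X and Y

# Mean of R = 0.2X + 0.8Y.
mean_r = w_x * mu_x + w_y * mu_y

# Variance of a weighted sum of correlated variables:
# var(R) = (w_x*sd_x)^2 + (w_y*sd_y)^2 + 2*rho*(w_x*sd_x)*(w_y*sd_y)
var_r = (w_x * sd_x) ** 2 + (w_y * sd_y) ** 2 + 2 * rho * (w_x * sd_x) * (w_y * sd_y)
sd_r = sqrt(var_r)

print(round(mean_r, 2), round(var_r, 3), round(sd_r, 2))
```

The negative correlation makes the last term negative, so the portfolio's variance is slightly lower than it would be if the two returns were independent.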

