COMP 233 up to final

Calculate the sample mode

The sample mode is the value with the highest frequency in a data set. A distribution may have one mode, more than one mode, or no mode at all. For grouped data, check the frequency of each class: the modal class is the class with the highest frequency. If the RV is continuous, the mode is the x value at which the curve reaches its peak (its highest point on the y axis).

How to calculate variance

Var(X)= E[X^2] - E[X]^2
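As a quick check of the shortcut formula above, here is a minimal Python sketch. The fair six-sided die and the helper name `variance_from_pmf` are assumptions chosen for illustration; they are not part of the course material.

```python
from fractions import Fraction

def variance_from_pmf(pmf):
    """Var(X) = E[X^2] - (E[X])^2 for a discrete PMF given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x**2 * p for x, p in pmf.items())
    return second_moment - mean**2

# Assumed example: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_variance = variance_from_pmf(die)
print(die_variance)  # 35/12
```

Exact fractions are used so the result can be compared with a hand calculation without rounding noise.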

Measures of Central Tendency

We're interested in a value that represents the center of the distribution: the sample mean, the sample median, or the sample mode.

What is Chebyshev's Inequality?

A special case of Markov's inequality. It requires the mean μ and the variance σ^2, and it bounds the probability of X being far away from its mean. Tail form: P{∣X−μ∣≥k} ≤ σ^2/k^2. Within-the-bound form: P{∣X−μ∣<k} ≥ 1−σ^2/k^2. In words, the probability that X is at least k away from its mean cannot be bigger than Var(X)/k^2. If the variance σ^2 is small, the probability of X being far from the mean is also small. If k is large, the probability of X deviating from the mean by at least k is small.
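The bound can be evaluated directly. This is a small sketch, not a definitive implementation: the function name `chebyshev_upper_bound` and the numbers (variance 4, distance 6) are assumptions for illustration.

```python
def chebyshev_upper_bound(variance, k):
    """Upper bound on P(|X - mu| >= k) given by Chebyshev's inequality."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A probability can never exceed 1, so cap the ratio.
    return min(1.0, variance / k**2)

# Assumed numbers: Var(X) = 4 (sigma = 2), distance k = 6 from the mean.
bound = chebyshev_upper_bound(4, 6)
print(bound)  # 4/36, about 0.111
```

For small k the ratio exceeds 1 and the bound says nothing useful, which is why the result is capped at 1.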

Sample mean

Sample mean: X' = (sum of x_i)/n. Population mean: μ. Expected value of the sample mean: E[X'] = μ, so the sample mean is an unbiased estimator of the population mean: on average, it hits the true population mean. Variance: Var(X') = σ^2/n, so the spread of the sample mean decreases as the sample size increases.

Properties of Expectation: Scaling

Scaling: If you multiply a random variable by a constant a, the expectation scales by a. E[aX+b] = aE[X] +b. For example, if E[X]=3, then E[2X]= 2E[X] =6

Class frequency table example

Steps: 1. Determine the number of classes: k ≈ sqrt(n), where n is the number of data values. 2. Calculate the range = max - min. 3. Class width = ⌈range/k⌉. 4. Use the left-end inclusion convention (e.g., [0, 5) includes 0 but excludes 5). Classes must be: 1. mutually exclusive 2. exhaustive 3. continuous (no gaps between consecutive classes).
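The steps above can be sketched in Python as follows. The function name `class_frequency_table` and the sample data are assumptions for illustration, and the sketch assumes integer data.

```python
import math

def class_frequency_table(data):
    """Count integer data into about sqrt(n) equal-width classes [lower, upper)."""
    n = len(data)
    k = max(1, round(math.sqrt(n)))              # number of classes, k ≈ sqrt(n)
    low, high = min(data), max(data)
    width = max(1, math.ceil((high - low) / k))  # class width = ceil(range / k)
    num_classes = (high - low) // width + 1      # enough classes to cover the max
    counts = {(low + i * width, low + (i + 1) * width): 0
              for i in range(num_classes)}
    for x in data:
        i = (x - low) // width                   # left end included, right end excluded
        counts[(low + i * width, low + (i + 1) * width)] += 1
    return counts

# Assumed data set with n = 9 values, so k = 3 and width = ceil(17/3) = 6.
table = class_frequency_table([1, 3, 4, 7, 8, 9, 12, 15, 18])
print(table)  # {(1, 7): 3, (7, 13): 4, (13, 19): 2}
```

Because the right end of each class is excluded, a maximum that lands exactly on a class boundary gets one extra class in this sketch.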

The Exponential Density Function

Suppose the number of events in an interval on the number line follows a Poisson distribution with mean λ>0. Then the lengths of the intervals between successive events are exponentially distributed with parameter λ>0.

PDF to CDF

Integrate the PDF from the lower end of its range up to x (so if the PDF is defined for x>=0, the bounds are [0, x]). The upper bound is always x, so use a different integration variable, such as t, inside the integral. Solving the integral gives the CDF as a function of x.

When to use which technique for binomial distribution approximations?

The Poisson approximation yields a good approximation when n is large and p is small. The Normal approximation can be shown to be quite good when np(1 − p) is large (typically >10).
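To see the Poisson approximation at work, the sketch below compares an exact binomial probability with its Poisson counterpart using λ = np. The values n = 1000, p = 0.002, and k = 3 are assumptions picked to satisfy "n large, p small".

```python
import math

def binomial_pmf(k, n, p):
    """Exact P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Assumed example: n large, p small, so lambda = n * p = 2.
n, p, k = 1000, 0.002, 3
exact = binomial_pmf(k, n, p)
approx = poisson_pmf(k, n * p)
print(exact, approx)
```

The two printed values should agree to about three decimal places for these inputs.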

Joint cumulative distribution function

The joint CDF F(x,y) for random variables X and Y is defined as F(x,y) = P(X≤x, Y≤y). This gives the probability that X is less than or equal to x and Y is less than or equal to y. As x→∞, F(x,y) approaches the marginal CDF of Y: lim_{x→∞} F(x,y) = F_Y(y). As y→∞, F(x,y) approaches the marginal CDF of X: lim_{y→∞} F(x,y) = F_X(x).

What's the memoryless property of geometric mass function?

The memoryless property means that the time you still have to wait doesn't depend on how long you've already waited: P(X > s + t | X > s) = P(X > t). Only the geometric (discrete) and exponential (continuous) distributions have this property.

Definition of probability

The probability of an event E, denoted as P(E), is a measure of the likelihood that E occurs.

Total probability rule

The rule says that the total probability of an event E can be calculated by splitting the sample space into the part where F happens (first term: P(E∣F)P(F)) and the part where F does not happen (second term: P(E∣F^c)P(F^c)), so P(E) = P(E∣F)P(F) + P(E∣F^c)P(F^c). When to use: when you want the probability of E, but E doesn't happen directly and instead depends on other factors.

What is the complement of an event E?

The set of outcomes in the sample space that are not in E. E^c = S\E E and E^c are mutually exclusive

How to find the probability of a continuous random variable falling within a range

To find the probability of a continuous random variable falling within a range, we integrate the PDF over that range.

Probability of the union of 3 events

Adding P(A), P(B), and P(C) counts the overlapping sections more than once, so we subtract the pairwise overlaps. BUT that removes the section common to all three events one time too many, so we add it back: P(A∪B∪C) = P(A)+P(B)+P(C) −P(A∩B)−P(A∩C)−P(B∩C) +P(A∩B∩C)

Variance of Sum of RVs

Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y). If X and Y are independent, Cov(X,Y) = 0, so Var(X+Y) = Var(X) + Var(Y).

Scaling of Variance

Var(aX) = a^2Var(X)

What are the types of statistical studies?

Descriptive Statistics: Summarizes data through organization, visualization, and measures (e.g., mean, frequency tables). Inferential Statistics: Uses probability theory to draw conclusions about populations from samples (e.g., predicting LED TV lifetimes).

Classical probability

Probability is calculated using the ratio of favorable outcomes to the total number of outcomes: P(E) = N(E)/N(S). N(E): number of outcomes in event E. N(S): total number of outcomes in the sample space S. EXAMPLE: F is the event of getting two numbers whose sum is 6. 1. Count the outcomes in F: F = {(1,5), (2,4), (3,3), (4,2), (5,1)}, so N(F) = 5. 2. Divide it by the size of the sample space: N(S) =

Probability vs statistics

Probability: predicts data from a given model; based on rules for computing probabilities. Statistics: gets data and then connects them to your model; creates models from experimental data; involves various metrics and is more art-like. In short, probability predicts data from a given model, while statistics builds models from experimental data.

What is the difference between qualitative and quantitative data?

Qualitative Data: Non-numerical categories (e.g., colors, blood types). Example: Toyota Corolla colors (White, Black, Silver). Quantitative Data: Numerical values (e.g., counts, measurements). Example: LED TV lifetimes (in months).

Poisson distribution

P(X=k) = e^(−λ) λ^k / k!. Understand the variables: λ is the average rate of occurrence (e.g., 1.2 defects per 12 metres). k is the number of defects we're interested in. e is the base of the natural logarithm (e≈2.71828).

Solve a marble problem

P = (ways to choose the marbles we want)/(total ways to choose x marbles). *Note: if there are several choices to make, MULTIPLY the numbers of ways.* *NOTE: P(at least one) = 1 - P(none), e.g., P(at least one red) = 1 - P(both blue).* ex: Both marbles red. A bag has 5 red marbles and 3 blue marbles. Two marbles are drawn at random without replacement. What is the probability both marbles are red? Ways to choose 2 red marbles: 5C2 = 10. Ways to choose 2 marbles: 8C2 = 28. So P = 10/28 = 5/14. Scenario: A bag has 3 red marbles and 2 blue marbles. Two marbles are drawn without replacement. Question: What is the probability the second marble is red? Consider blue first then red, and red first then red: P(second marble is red) = (2/5)(3/4) + (3/5)(2/4) = 3/5.
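Both marble examples can be checked with a few lines of Python. This is a sketch using exact fractions; the variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

# Both marbles red: 5 red and 3 blue, draw 2 without replacement.
p_both_red = Fraction(comb(5, 2), comb(8, 2))

# Second marble red: 3 red and 2 blue, draw 2 without replacement.
p_blue_then_red = Fraction(2, 5) * Fraction(3, 4)
p_red_then_red = Fraction(3, 5) * Fraction(2, 4)
p_second_red = p_blue_then_red + p_red_then_red

print(p_both_red)    # 5/14
print(p_second_red)  # 3/5
```

Note that P(second marble is red) equals the share of red marbles in the bag, 3/5, which is a useful sanity check.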

Presenting grouped data

1. Histograms 2. Ogives

Uniform Density Function

f(x) = 1/(b-a) for a ≤ x ≤ b, and 0 otherwise (review the slide)

Set operations

Commutative: E∪F = F∪E. Associative: (E∪F)∪G = E∪(F∪G). Distributive: (E∪F)G = EG ∪ FG.

Summary of discrete RV and continuous RV

PMF: probability mass function for discrete variable PDF: Probability density function for continuous variable

De Morgan's Law

(E∪F)^C = E^C F^C and (EF)^C = E^C ∪ F^C

Factorials reminder

0! = 1 n! = n(n-1)! if you want part of the factorial, add the rest as a multiplication of 1 (check picture)

Conditional probability

Conditional probability measures the likelihood of an event E, given that another event F has already occurred. It is defined as: P(E|F) = P(EF)/P(F) or P(E|F) = P(F|E)P(E)/P(F) Independent event: P(E|F) = P(E)

Binomial probability

1. Binary outcomes: normally success/failure. 2. Independent trials: the success or failure of one trial doesn't affect the success or failure of the next trial. 3. Fixed number of trials (n). 4. Same success probability on every trial (p). A binomial random variable X, with parameters (n, p), represents the number of successes in n independent Bernoulli trials (with the same p): P(X=k) = C(n, k) p^k (1−p)^(n−k).

How to solve a combinations/permutations problem

1. Does the order matter? (Yes = P, No = C) 2. Are duplicates allowed? (Yes = no factorial) 3. Count the number of possibilities. 4. Write down the number of boxes (positions to fill). 5. If it is a combination, remove the duplicates. SPECIAL CASE: Permutation with repetition: possibilities!/(repetition_1!*repetition_2!*...), which gives, for example, the total number of ways to arrange the letters of a word. Items that must be next to each other: count them as a group/single unit, find the number of permutations, and multiply by the number of permutations possible within the unit.

Calculate sample median

1. Sort the data. 2. Odd n: the middle value, at position (n+1)/2. Even n: the average of the two middle values, at positions n/2 and n/2 + 1. If the RV is continuous: take the x value at half of the maximum y in an ogive.
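A minimal sketch of the median rule for raw data, assuming a non-empty list of numbers; the function name `sample_median` is illustrative.

```python
def sample_median(values):
    """Median of raw data: middle value (odd n) or mean of the two middle values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # 1-based position (n+1)/2
    return (ordered[mid - 1] + ordered[mid]) / 2  # 1-based positions n/2 and n/2 + 1

odd_median = sample_median([5, 1, 3])      # sorted: 1, 3, 5
even_median = sample_median([4, 1, 3, 2])  # sorted: 1, 2, 3, 4
print(odd_median, even_median)  # 3 2.5
```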

Calculate percentiles

1. Sort the data from low to high. 2. Count the number of values (n). 3. Select the p*(n+1)-th value. If p*(n+1) is not a whole number, then go halfway between the two adjacent values.
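The percentile rule above can be sketched as follows. The function name `percentile`, the clamping at the ends of the data, and the sample values are assumptions added for illustration; p is given as a fraction between 0 and 1.

```python
def percentile(values, p):
    """Return the 100p-th percentile using the p*(n+1) position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1)                     # 1-based position in the sorted data
    nearest = round(position)
    if abs(position - nearest) < 1e-9:         # whole number: take that value
        index = min(max(nearest, 1), n)        # clamp to a valid position
        return ordered[index - 1]
    lower = min(max(int(position), 1), n - 1)  # clamp so both neighbours exist
    return (ordered[lower - 1] + ordered[lower]) / 2

scores = [2, 4, 6, 8, 10, 12, 14]              # assumed data, n = 7
median = percentile(scores, 0.5)               # position 4
first_quartile = percentile(scores, 0.25)      # position 2
halfway = percentile([1, 2, 3, 4], 0.5)        # position 2.5
print(median, first_quartile, halfway)  # 8 4 2.5
```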

Mutually exclusive vs independent

Two special cases of events: 1. Mutually exclusive: both events can't happen at the same time, so P(AB) = 0 and you can add the probabilities together. ex: getting an A (10%) or a B (20%) in an exam; the probability of getting an A or B is 30%. 2. Independent: the result of one doesn't affect the other, so P(AB) = P(A)*P(B). For the union of independent events, use P(A∪B∪C) = P(A)+P(B)+P(C) −P(A∩B)−P(A∩C)−P(B∩C) +P(A∩B∩C), with each intersection computed as a product. ex: flipping a coin and throwing a die.

What's a binomial random variable?

A binomial random variable, X, represents the number of successes in n independent Bernoulli trials. X = sum of n Bernoulli variables (each 0 or 1).

What is the negative binomial distribution?

A negative binomial random variable, X, represents the number of trials until the r-th success: P(X=k) = C(k−1, r−1) p^r (1−p)^(k−r). k: the total number of trials (including the r successes). r: the number of successes we want. p: the probability of success on a single trial.

Population versus Sample Data

A sample is a subgroup of a population. Use samples to infer information about the target population.

What is an event ?

A set of outcomes of a probability experiment, i.e., a subset of the sample space: E⊆S. It is only the set of outcomes we are interested in. Example: if we are only interested in whether horse 3 won, E = {all outcomes of S that start with a 3}.

What is sample space?

All the outcomes of an experiment. We denote the sample space by S. Ex: rolling a cubic die once: S = {1, 2, 3, 4, 5, 6}. Ex: if the experiment consists of the running of a race among seven horses having post positions 1, 2, 3, 4, 5, 6, 7: S = {all orderings of (1, 2, 3, 4, 5, 6, 7)}. Discrete: the outcomes can be counted (finite or countably infinite). Continuous: all the possible outcomes within a range.

What is the normal density function?

Also called a Gaussian distribution or the "bell curve". Properties of the normal density function: maximum at x=μ; symmetric about x=μ; inflection points at μ±σ; never touches 0. Mean (μ): center of the distribution. Standard deviation (σ): spread of the distribution.

Stem and Leaf Plot

An efficient way of organizing a small or moderate-sized data set: the stems are the tens digits and the leaves are the ones digits.

Ogives

An ogive is the graph of a non-decreasing function: a graphical representation of the cumulative frequencies of a dataset. It always starts at a cumulative frequency of 0, then plots each upper class limit against its cumulative frequency. X-Axis (Horizontal Axis): represents class boundaries (or upper class limits). Y-Axis (Vertical Axis): represents cumulative frequencies. Note: an ogive with relative frequencies is the same but with y-range [0, 1].

How to calculate covariance

Cov(X, Y) = E[XY] - E[X]E[Y] Note that Cov(X,Y) = Cov(Y,X) Cov(X,X) = Var(X)

Geometric distribution

Definition: This distribution models the number of trials until the first success/error. P(X=k) = p(1−p)^(k−1). p: the probability of success/error on a single trial. k: the number of trials (or attempts) needed to achieve the first success.

What are the two types of frequency tables, and when are they used?

Categorical FT: For qualitative or small-range quantitative data (e.g., car colors, server requests). Grouped (Class) FT: For large-range quantitative data (e.g., TV lifetimes, exam grades). Guidelines: Equal-width intervals, mutually exclusive, 5-10 classes.

Mutually exclusive event

E and F have no outcome in common, it is impossible for both events to occur simultaneously. EF = empty set

What is an impossible event

E contains no outcomes from S

determine E[X], E[Y], E[X^2], E[Y^2], and E[XY]

E[X]: the sum of i*p(i) E[Y]: the sum of j*p(j) E[X^2]: The sum of i^2*p(i) E[Y^2]: The sum of j^2*p(j) E[XY]: The sum of i*j*p(i,j) **Do this for all the permutations of i,j, so all cases of the table**
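As an illustration of summing over every cell of the table, here is a minimal Python sketch. The joint PMF values are invented for the example (an assumption, not course data), and the last line uses the shortcut Cov(X, Y) = E[XY] - E[X]E[Y].

```python
from fractions import Fraction

# Assumed joint PMF table p(i, j); keys are (i, j) pairs, values are probabilities.
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

# Each expectation visits every cell (i, j) of the table once.
e_x = sum(i * p for (i, j), p in joint.items())
e_y = sum(j * p for (i, j), p in joint.items())
e_x2 = sum(i**2 * p for (i, j), p in joint.items())
e_y2 = sum(j**2 * p for (i, j), p in joint.items())
e_xy = sum(i * j * p for (i, j), p in joint.items())

covariance = e_xy - e_x * e_y
print(e_x, e_y, e_xy, covariance)  # 1/2 5/8 3/8 1/16
```

Summing i*p(i, j) over the whole table gives the same E[X] as first computing the marginal p(i) and then summing i*p(i).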

Histograms

Each bar represents a class interval (e.g., 0-5, 5-10). The height of a bar represents the frequency of the respective class. Bar labels: either the left endpoint value or the entire class interval. Note: relative frequency histograms follow the same principle as frequency histograms; one simply uses relative frequencies, instead of ordinary frequencies, to determine the height of each vertical bar.

The n-th moment of a random variable

First moment: E[X]. After that, it's called the n-th moment of X: E[X^n]. To calculate it, use the same formula as for E[X] but raise each value to the power n (e.g., the sum of x^n * p(x) in the discrete case).

The central limit theorem

For large n, the sum/mean of n i.i.d. variables with mean μ and variance σ^2 is approximately normally distributed. Notation: N(mean, variance). Sum: X_1+...+X_n ∼ N(nμ, nσ^2), so the standard deviation is σ*sqrt(n). Use this when you're interested in the total (e.g., total yearly claims of all customers). Mean: X' ∼ N(μ, σ^2/n), so the standard deviation is σ/sqrt(n). Use this when you're interested in the sample mean (e.g., average income, average test score, average weight). REMEMBER THE Z-SCORE: Z = (value − mean)/(standard deviation), using the mean and standard deviation of the sum or of the sample mean, as appropriate. Rule of thumb: the sample size n should be at least 30, but a sample size of 5 will often suffice for the approximation to be valid.

How to know when to use geometric, binomial, negative binomial or poisson?

Geometric distribution: models the probability of the first success occurring on the k-th trial. Use its expected value when we want to know the number of trials needed to get the first success. Binomial distribution: use it when you are interested in the number of successes in a fixed number of trials. Negative binomial distribution: use it when you are interested in the number of trials needed to achieve a specified number of successes. Poisson distribution: use it when you are interested in the number of events that occur in a fixed interval of time or space, given a known average rate. It also approximates binomial probabilities in the case of large n and small p.

How to know what the covariance of X and Y looks like?

If X tends to be above its mean when Y is above its mean, and below its mean when Y is below its mean: Cov(X, Y) > 0. If X tends to be above its mean when Y is below its mean, and below its mean when Y is above its mean: Cov(X, Y) < 0.

Properties of Expectations: Independence of X and Y

E[X+Y] = E[X] + E[Y] always holds, whether or not X and Y are independent. If X and Y are independent, then also E[X⋅Y] = E[X]⋅E[Y].

Generalized Basic Principle of Counting

If the first experiment can occur in m ways and the second in n ways, the total number of outcomes is m×n. More generally, if there are r experiments such that: 1. The first experiment has n_1 possible outcomes. 2. For each outcome of the first experiment, the second experiment has n_2 possible outcomes. 3. For each outcome of the first two experiments, the third experiment has n_3 possible outcomes, and so on. Then the total number of outcomes across all r experiments is n_1×n_2×⋯×n_r. Basically, just multiply the numbers of outcomes of the experiments.

What is the weak law of large numbers?

If you do something lots and lots of times, the average of your results will get closer to the true average.

What is a probability experiment

An experiment whose result cannot be predicted with certainty: if you do the experiment multiple times (in the exact same way), you may get a different result each time. This adds an element of randomness.

Graph for Quantitative Data in Categorical FT

Line graph, bar graph, frequency polygon, pie chart

Probability of the union of 2 events

P(E∪F) = P(E) + P(F) − P(EF). We can see this with a Venn diagram: when you take P(E) + P(F), you count P(EF) twice, so you need to subtract it once.

Cumulative probability

P(X <= a) = F(a) = integral of the PDF from -inf to a. P(a < X <= b) = F(b) - F(a) = integral of the PDF from a to b. P(X > a) = 1 - P(X <= a).

Intersection conditional probability (P(TC))

P(TC) = P(T)*P(C|T) Independent event: P(EF) = P(E) *P(F)

How to tell if X and Y are independent?

P(X, Y) = P(X) * P(Y) This means that the joint probability mass function (PMF) P(X=x,Y=y) must equal the product of the marginal PMFs P(X=x)and P(Y=y) for all possible values of x and y

Property of Probability

Non-negativity: 0 ≤ P(E) ≤ 1. Probability values always fall between 0 and 1 (inclusive). Certainty: P(S) = 1. The probability of the entire sample space (S) is 1. General Addition Rule: P(E∪F) = P(E) + P(F) − P(E∩F), EXCEPT IF MUTUALLY EXCLUSIVE, THEN P(E∪F) = P(E) + P(F).

Permutations

ORDER IS IMPORTANT. Use a permutation if you want to arrange r objects out of n total objects. Formula: P(n, r) = n!/(n-r)!, where n is the total number of objects and r is the number of objects to arrange in a certain order. If r = n: n!. If r = 0: 1. ex: You have 5 students, and you want to arrange 3 of them in order for a photo: P(5, 3) = 5!/(5-3)! = 60. If some objects are identical (permutation with repetition), divide n! by the factorial of each repeat count. ex: How many words can we make with the letters of "pepper"? 3 p, 2 e, 1 r = 6 characters, so n = 6, r1 = 3, r2 = 2, r3 = 1, and the count is 6!/(3!*2!*1!) = 60.
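Both permutation examples can be verified in Python. This is a sketch: `math.perm` is the standard-library call for P(n, r), while `distinct_arrangements` is a hypothetical helper written for this card.

```python
from collections import Counter
from math import factorial, perm

def distinct_arrangements(word):
    """Number of distinct orderings of the letters: n! / (r1! * r2! * ...)."""
    total = factorial(len(word))
    for repeat_count in Counter(word).values():
        total //= factorial(repeat_count)  # divide out orderings of identical letters
    return total

photo_orders = perm(5, 3)                        # arrange 3 of 5 students
pepper_words = distinct_arrangements("pepper")   # 6! / (3! * 2! * 1!)
print(photo_orders, pepper_words)  # 60 60
```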

Combinations

ORDER IS NOT IMPORTANT. The number of ways to choose r objects from n distinct objects. Idea: count as if the order mattered (using a permutation), then divide out the r! orderings of each selection, since the order does not matter. C(n, r) = P(n, r)/r! = n!/((n-r)!*r!). C(n, 0) = C(n, n) = 1.

Odds of an Event happening

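Odds(A) = P(A)/P(A^C). Remember P(A^C) = 1 - P(A), so Odds(A) = P(A)/(1 - P(A)). The odds tell how much more likely it is that A occurs than that it does not occur.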

What is the probability that at least 5 (i.e., 5 or more) bits are received until the first error? HINT: Geometric distribution for P(X>= 5)

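P(X>=5) = sum of P(X=k) for k from 5 to infinity = sum of p(1−p)^(k−1) = (1−p)^4, where p is the probability of an error on a single bit. Shortcut: X>=5 means the first 4 bits are all received without error.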
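The closed form can be compared with the infinite sum numerically. In this sketch the error probability p = 0.1 is an assumption for illustration, and the sum is truncated at a large number of terms.

```python
def geometric_tail(p, k):
    """P(X >= k) for a geometric X (trials until first error): first k-1 trials have no error."""
    return (1 - p) ** (k - 1)

def geometric_tail_by_sum(p, k, terms=10_000):
    """Same probability as a truncated sum of the PMF p(1-p)^(j-1), j = k, k+1, ..."""
    return sum(p * (1 - p) ** (j - 1) for j in range(k, k + terms))

p_error = 0.1  # assumed per-bit error probability
closed_form = geometric_tail(p_error, 5)
summed = geometric_tail_by_sum(p_error, 5)
print(closed_form, summed)  # both about 0.6561
```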

Bayes formula

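Bayes' formula: P(A|B) = P(B|A)P(A)/P(B). We know: 1. how often B happens given that A happens, written P(B|A); 2. how likely A is on its own, written P(A); 3. how likely B is on its own, written P(B). Basically, FIND P(A|B) when we know P(B|A). If P(B) is not given, get it from the total probability rule: P(B) = P(B|A)P(A) + P(B|A^c)P(A^c).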
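A small numeric sketch of Bayes' formula, with P(B) obtained from the total probability rule. All the probabilities below are invented for illustration (an assumption), as are the function names.

```python
def total_probability(p_b_given_a, p_b_given_not_a, p_a):
    """P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)."""
    return p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A)P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Assumed inputs: P(A) = 0.01, P(B|A) = 0.9, P(B|A^c) = 0.05.
p_a, p_b_given_a, p_b_given_not_a = 0.01, 0.9, 0.05
p_b = total_probability(p_b_given_a, p_b_given_not_a, p_a)
p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(p_b, p_a_given_b)  # about 0.0585 and 0.1538
```

Even with P(B|A) = 0.9, P(A|B) stays small here because A itself is rare.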

What is Markov inequality?

We only know E[X], and X must be nonnegative. Markov's inequality bounds the probability of X being at least a (for a>0): P(X≥a) ≤ E[X]/a. If the mean E[X] is small, the probability of X being large is also small. If a is large, the probability of X reaching a is small. Example: suppose X is a nonnegative random variable with E[X]=5. We want to find an upper bound on P(X≥10). Using Markov's inequality: P(X≥10) ≤ E[X]/10 = 5/10 = 0.5. So the probability that X is greater than or equal to 10 is at most 0.5.

When can we use the normal approximation to the binomial?

What is a binomial random variable? It is a random variable X that counts the number of successes in n independent trials, where each trial has success probability p. When np(1−p) > 10, you can use the CLT: X is approximately N(np, np(1−p)), with E[X] = np, Var(X) = np(1-p), and std dev = sqrt(np(1-p)). DON'T FORGET the continuity correction: P(X≤k) ≈ P(Y≤k+0.5), where Y is the approximating normal variable. Similarly, P(X≥k) ≈ P(Y≥k−0.5).
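The quality of the normal approximation with continuity correction can be checked against the exact binomial sum. In this sketch, n = 100, p = 0.5, and k = 55 are assumed values, and the normal CDF is built from the standard-library `math.erf`.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Assumed example: np(1-p) = 25 > 10, so the approximation should be good.
n, p, k = 100, 0.5, 55
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
exact = binomial_cdf(k, n, p)
approx = normal_cdf(k + 0.5, mu, sigma)  # continuity correction: k + 0.5
print(exact, approx)
```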

Joint probability MASS function

X and Y are discrete. The sum of p over all x and y is 1: ∑_i ∑_j p(x_i, y_j) = 1

How to use the standardization formula?

Z = (X−μ)/σ, where X is the value we're interested in (e.g., 73 inches), μ is the mean (70 inches), and σ is the standard deviation (3 inches). Once we have the Z-score, we can use the standard normal table (Z-table), which gives P(Z≤z), the probability that a standard normal variable is less than or equal to z. Symmetry rules: P{Z > z} = P{Z < -z} and P{Z < -z} = 1 - P{Z < z}. **Find z first; don't apply these rules to the unstandardized X, e.g. to P(X > -1).** ex: (a) Less than 73 inches tall. Calculate the Z-score: Z = (73−70)/3 = 1.0. From the Z-table, P(Z≤1.0) = 0.8413, so the probability that a man is less than 73 inches tall is 84.13%. (b) Between 68 and 74 inches tall. Calculate the Z-scores for 68 and 74: Z_68 = (68−70)/3 = −0.67 and Z_74 = (74−70)/3 = 1.33. From the Z-table: P(Z≤−0.67) = 1 - P{Z < 0.67} = 1 - 0.7486 = 0.2514 and P(Z≤1.33) = 0.9082. Then P(68<X<74) = P(Z≤1.33) − P(Z≤−0.67) = 0.9082 − 0.2514 = 0.6568, so the probability that a man is between 68 and 74 inches tall is 65.68%.
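The height example can be reproduced without a Z-table. This sketch builds the standard normal CDF from `math.erf`; the function names are illustrative assumptions. Because it does not round z to two decimals, part (b) comes out slightly different from the table-based answer.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z, i.e. what a Z-table lists."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_cdf(x, mu, sigma):
    """Standardize first, Z = (x - mu) / sigma, then look up P(Z <= z)."""
    return standard_normal_cdf((x - mu) / sigma)

mu, sigma = 70, 3  # mean and standard deviation of height, in inches
p_less_73 = normal_cdf(73, mu, sigma)
p_between = normal_cdf(74, mu, sigma) - normal_cdf(68, mu, sigma)
print(round(p_less_73, 4))  # 0.8413
print(round(p_between, 3))  # close to the table-based 0.6568
```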

What is an expected/expectation value?

The expectation (or expected value) of a random variable is like the "average" value you would expect if you repeated an experiment many times. AKA weighted average. In the discrete case: sum all possible outcomes multiplied by their probabilities, E[X] = sum of x*p(x). ex: rolling a die: E[X] = 1⋅1/6 + 2⋅1/6 + ⋯ + 6⋅1/6 = 3.5. In the continuous case: E[X] = integral of x*f(x) dx over the range where the PDF f(x) is defined; set the bounds of the integral, plug in the given f(x), then solve the integral.

Probability of a sub-event

if E⊆F, then: P(E)≤P(F) A sub-event E cannot have a probability larger than the event F it is contained within.

What is the Reproductive Property of the Normal Distribution?

If X_1, X_2, ..., X_n are independent normal random variables, then their sum is also normal: its expected value is the sum of their respective expected values, and its variance is the sum of their respective variances.

Line graph

The y-coordinate of a point represents the (relative) frequency of the category. The x-coordinate of a point represents the category.

Let E, F, G be three events. Find expressions for events built from E, F, G (only, at least, at most)

Only: complement everything else. ex: only E: EF'G'; only E and G: EF'G. At least x events happen: union all the possibilities. ex: at least two of the events occur: EF ∪ EG ∪ FG. At most x events happen: it is not the case that x+1 (or more) events happen simultaneously. ex: at most two events happen: (EFG)' = E' ∪ F' ∪ G'. ex: at most one event happens: no two events happen together, (EF ∪ EG ∪ FG)' = E'F' ∪ E'G' ∪ F'G'.

How to calculate correlation?

Corr(X,Y) = Cov(X,Y)/(σ_X*σ_Y), i.e., the covariance divided by the product of the standard deviations. It provides a measure of the strength and direction of the linear relationship between two random variables X and Y. Always between -1 and 1. A positive correlation indicates that Y tends to increase as X increases. A negative correlation indicates that Y tends to decrease as X increases. Corr(X,Y)=1: perfect positive linear relationship; as X increases, Y increases linearly. Corr(X,Y)=−1: perfect negative linear relationship; as X increases, Y decreases linearly. Corr(X,Y)=0: no linear relationship; X and Y are uncorrelated (but not necessarily independent).

Sample Standard Deviation (S)

sqrt(S²)

How to calculate standard deviation?

sqrt(Var(X))

Calculate the sample mean

Sum of all x divided by the number of observations. The mean is rounded to one more decimal place than occurs in the data. If there are frequencies involved: (sum of all x*frequency)/(total frequency), or the sum of all x*relative frequency. With grouped data: 1. Find the midpoint of each class interval, (lower + upper)/2. 2. Multiply each midpoint by its class frequency. 3. Sum these products and divide by the total number of observations.

Sample variance (S²)

s² = Σ(x_i − x̄)² / (n − 1). Shortcut: s² = [Σ x_i² − (Σ x_i)²/n] / (n − 1).
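The definition and the shortcut give the same number, which a short Python sketch can confirm. The data values and function names are assumptions for illustration.

```python
def sample_variance(xs):
    """s^2 from the definition: squared deviations from the mean over (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """s^2 from the shortcut: [sum(x^2) - (sum x)^2 / n] / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

data = [2, 4, 4, 5, 7, 8]  # assumed sample, mean 5
s2 = sample_variance(data)
s2_shortcut = sample_variance_shortcut(data)
print(s2, s2_shortcut)  # 4.8 4.8
```

The sample standard deviation is then the square root of either result.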

What's a Bernoulli Random Variable?

It's a way to describe something that has only two possible outcomes: success (usually represented as 1), with P(1) = p, and failure (usually represented as 0), with P(0) = 1-p = q. E[Y] = p. Var(Y) = p*(1-p) = p*q.

How to find marginal probabilities of X and Y with a joint probability function?

Integrate the joint PDF over the other variable, using that variable's bounds and treating the remaining variable as a constant, so that you get a function without the other variable. To find f_X(x), integrate the joint PDF f_{X,Y}(x,y) over all possible values of y. So the bounds come from Y and the final function is in terms of x only.

