StatQuest Chapters 01-03

P-value and Effect Size

A p-value does not tell us the effect size, that is, how large the difference between Drug A and Drug B is. It only tells us how likely it is that results like these could occur by chance.

What is data leakage?

Reusing the same data for Training and Testing is called Data Leakage.

What is discrete data?

Discrete data can only take particular, separate values (examples include the number of apples or hair colour).

R-squared value calculation with the Sum of the Squared Residuals (SSR)

R^2 = (SSR(mean) - SSR(fitted line)) / SSR(mean). If you get, for example, 0.7, it means that there was a 70% reduction in the size of the Residuals between the mean and the fitted line.
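
The formula above can be checked with a few lines of Python. This is a minimal sketch: the function names and the data are made up for illustration, and the "fitted" values stand in for predictions from some fitted line.

```python
def ssr(observed, predicted):
    """Sum of the Squared Residuals between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))


def r_squared(observed, fitted):
    """R^2 = (SSR(mean) - SSR(fitted line)) / SSR(mean)."""
    mean_y = sum(observed) / len(observed)
    ssr_mean = ssr(observed, [mean_y] * len(observed))
    ssr_fit = ssr(observed, fitted)
    return (ssr_mean - ssr_fit) / ssr_mean


# Made-up data: observed values and the predictions of a hypothetical fitted line.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]

r2 = r_squared(observed, fitted)
print(r2)  # 0.8
```

Here SSR(mean) is 5.0 and SSR(fitted line) is 1.0, so the fitted line reduces the Residuals by 80%.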

p-values

A 0.05 threshold for p-values means that 5% of the experiments, where the only differences come from weird, random things, will generate a p-value smaller than 0.05. We use p-values to give us a sense of how much confidence we should put in the predictions that our models make.

R-squared value in a small dataset

Because a small amount of random data can have a high (close to 1) R², any time we see a trend in a small dataset, it's difficult to have confidence that a high R² value is not due to random chance.

What is continuous data?

Continuous data are not restricted to defined, separate values. They are always numeric (for example, a specific height).

How is the error calculated in cross-validation?

Each fold (iteration) results in a slightly different fitted line, and each fitted line gives a different prediction error. We can average these errors, or we can compare them to the errors made by another method.
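
A rough sketch of the per-fold errors described above. The helper names, the way points are assigned to folds (every k-th point), and the data are assumptions made for illustration; the error for each fold is the sum of squared residuals on its Testing points.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_errors(xs, ys, k):
    """Return the Testing sum of squared residuals for each of the k folds."""
    points = list(zip(xs, ys))
    errors = []
    for fold in range(k):
        # Every k-th point, starting at `fold`, is held out for Testing.
        test_idx = set(range(fold, len(points), k))
        train = [p for i, p in enumerate(points) if i not in test_idx]
        test = [p for i, p in enumerate(points) if i in test_idx]
        slope, intercept = fit_line(*zip(*train))
        errors.append(sum((y - (slope * x + intercept)) ** 2 for x, y in test))
    return errors


# Made-up data that roughly follows a straight line.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

fold_errors = k_fold_errors(xs, ys, k=3)
average_error = sum(fold_errors) / len(fold_errors)
print(fold_errors, average_error)
```

The same `average_error` could be computed for a second method on the same folds to compare the two.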

false positive

Getting a small p-value when there is no difference is called a False Positive. Using a threshold of 0.001 would get a False Positive only once in every 1,000 experiments. Using a threshold of 0.2 means we're willing to get a False Positive 2 times out of 10
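
The link between the threshold and the False Positive rate can be illustrated with a small simulation. This is a sketch under stated assumptions: both groups are drawn from the same normal distribution (so there is truly no difference), and the p-value comes from a two-sided z-test with a known standard deviation, chosen here only because it needs nothing beyond the standard library. The function names, sample sizes, and seed are made up.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(a, b, sigma=1.0):
    """Two-sided p-value for a difference in means, assuming a known sigma."""
    standard_error = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(threshold, n_experiments=2000, n=20, seed=0):
    """Fraction of no-difference experiments whose p-value is below threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]  # same distribution as a
        if z_test_p_value(a, b) < threshold:
            hits += 1
    return hits / n_experiments


rate_005 = false_positive_rate(0.05)  # expected to land near 0.05
rate_020 = false_positive_rate(0.2)   # expected to land near 0.2
print(rate_005, rate_020)
```

The exact fractions vary with the seed, but they should land close to the chosen thresholds.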

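Example of a Poisson distribution

If you read, on average, 10 pages of this book in an hour, then you can use the Poisson Distribution to calculate the probability of reading exactly 8 pages in the next hour: p(x | λ) = (e^(-λ) * λ^x) / x!, with λ = 10 and x = 8.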
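
Plugging λ = 10 and x = 8 into the formula above gives a concrete number. A minimal sketch (the function name is made up):

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """p(x | lambda) = (e^(-lambda) * lambda^x) / x!"""
    return exp(-lam) * lam ** x / factorial(x)


p_8 = poisson_pmf(8, 10)
print(round(p_8, 4))  # 0.1126
```

So with an average of 10, the chance of seeing exactly 8 is about 11%.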

P-value with a different explanation

In practice, a commonly used threshold is 0.05. It means that if there's no difference between Drug A and Drug B, and if we did this exact same experiment a bunch of times, then only 5% of those experiments would result in the wrong decision.

What is the problem with using histograms?

Histograms struggle with continuous data because they can be very sensitive to the size of the bins. Continuous probability distributions (for example, the Normal Distribution) are used for continuous data instead.

What is the purpose of a histogram?

It makes it easy to see the trends in the data.

What are discrete probability distributions?

They arise when a random variable can only take on a countable number of values. In other words, the variable can only take on specific, distinct values. For example, if we roll a fair six-sided die, the possible values are 1, 2, 3, 4, 5, and 6. Each of these values has an equal probability of occurring, which is 1/6.

What is a Poisson distribution?

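It is a discrete probability distribution. It is used to calculate the probability of a given number of events occurring in a fixed interval of time or space, given the average rate λ at which the events occur.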

Mean Squared Error (MSE)

One way to compare the two models that may be fit to different-sized datasets is to calculate the Mean Squared Error (MSE), which is simply the average of the SSR.
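
A small sketch of why the average helps when datasets differ in size. The data are made up: the large dataset repeats the small one ten times, so the residuals are identical and only the amount of data changes.

```python
def ssr(observed, predicted):
    """Sum of the Squared Residuals."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted))


def mse(observed, predicted):
    """Mean Squared Error: the SSR divided by the number of observations."""
    return ssr(observed, predicted) / len(observed)


small_obs, small_pred = [1.0, 2.0, 3.0], [1.5, 2.5, 3.5]
# Same residuals as the small dataset, but ten times as many points.
large_obs, large_pred = small_obs * 10, small_pred * 10

ssr_small, ssr_large = ssr(small_obs, small_pred), ssr(large_obs, large_pred)
mse_small, mse_large = mse(small_obs, small_pred), mse(large_obs, large_pred)

print(ssr_small, ssr_large)  # 0.75 7.5
print(mse_small, mse_large)  # 0.25 0.25
```

The SSR grows tenfold with the extra data, while the MSE stays the same.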

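R-squared value calculation with Mean Squared Error

MSE = SSR / n, where n is the number of observations. Then R^2 = (MSE(mean) - MSE(fitted line)) / MSE(mean), which gives the same value as the SSR version because the division by n cancels.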

The problem with mean squared error (MSE)

The mean squared error (MSE) can be hard to interpret because it depends on the scale of the data. MSE is the average squared difference between the predicted values and the actual values, so its units are the square of the units of the data. For example, if the data are measured in thousands, the MSE will be much larger than if the same data are measured in ones. Therefore, it is not meaningful to compare MSEs across datasets with different scales. One option is to report the root mean squared error (RMSE) or the mean absolute error (MAE), which have the same units as the original data and are easier to interpret, although they still depend on the scale of the data. Another is to normalize the data before calculating MSE, so that the results are on a comparable scale and can be compared across datasets.

Why is the threshold 0.05?

The most common threshold is 0.05 because trying to reduce the number of False Positives below 5% often costs more than it's worth.

When is the binomial distribution used?

The Binomial Distribution is useful for anything that has binary outcomes (wins and losses, yeses and noes, etc.).

What is the binomial distribution?

The binomial distribution is a type of probability distribution that helps us understand the likelihood of a certain number of successes in a fixed number of independent trials.

Explanation of the binomial distribution formula: p(x | n, p) = (n choose x) * p^x * (1-p)^(n-x)

The probability that we meet x = 2 people who prefer pumpkin pie, given that we ask n = 3 people and the probability of someone preferring pumpkin pie is p = 0.7.
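
The pumpkin pie numbers can be plugged straight into the formula. A minimal sketch (the function name is made up):

```python
from math import comb


def binomial_pmf(x, n, p):
    """p(x | n, p) = (n choose x) * p^x * (1 - p)^(n - x)"""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


p_two_of_three = binomial_pmf(2, 3, 0.7)
print(round(p_two_of_three, 3))  # 0.441
```

There are 3 ways to pick which 2 of the 3 people prefer pumpkin pie, and each way has probability 0.7 * 0.7 * 0.3 = 0.147, giving 0.441.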

What is the likelihood in the Normal (Gaussian) Distribution?

The y-axis represents the Likelihood of observing any specific value in the distribution. The maximum likelihood occurs at the mean of the Normal Distribution.
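
As a quick illustration, the height of a normal curve can be evaluated with the standard library's `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are made-up values.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, standard deviation 15.
dist = NormalDist(mu=100, sigma=15)

likelihood_at_mean = dist.pdf(100)
likelihood_below = dist.pdf(85)   # one standard deviation below the mean
likelihood_above = dist.pdf(115)  # one standard deviation above the mean

print(likelihood_at_mean, likelihood_below, likelihood_above)
```

The value at the mean is the largest of the three, and the two values at equal distances from the mean match because the curve is symmetrical.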

Leave-One-Out Cross Validation (LOOCV)

This method uses all but one point for Training and the one remaining point for Testing, and it iterates until every single point has been used for Testing.
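
A minimal sketch of the leave-one-out loop. To keep it short, the "model" here is simply the mean of the Training points, which is an assumption made for illustration; a fitted line or any other model could take its place. The function name and data are made up.

```python
def loocv_squared_errors(values):
    """Hold out each point once and predict it with the mean of the rest."""
    errors = []
    for i, held_out in enumerate(values):
        training = values[:i] + values[i + 1:]    # all but one point
        prediction = sum(training) / len(training)
        errors.append((held_out - prediction) ** 2)
    return errors


data = [2.0, 4.0, 6.0]
errors = loocv_squared_errors(data)
print(errors)  # [9.0, 0.0, 9.0]
```

Each point is used for Testing exactly once, so three points produce three errors.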

What is a probability distribution?

A probability distribution is a mathematical model that describes how likely each possible value of a random variable is. To estimate the probability of future events, we often use the normal distribution curve, also known as the bell curve. This model is symmetrical and bell-shaped, with the highest point at the mean, or average value, and the standard deviation describing how spread out the data are. By identifying the mean and standard deviation, we can estimate the probability of events falling within a certain range of values. This is useful when we do not have enough data to make accurate predictions based on past occurrences.

P > 0.05 is the probability that the null hypothesis is true.

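No. A p-value is the probability of getting results at least as extreme as the ones observed, assuming the null hypothesis is true. A p-value greater than 0.05 only means that we fail to reject the null hypothesis at the 0.05 threshold. It is not the probability that the null hypothesis is true.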

Is R² related to Pearson's correlation coefficient?

Yes! If you can calculate Pearson's correlation coefficient, ρ (the Greek character rho) or r, for a relationship between two things, then the square of that coefficient is equal to R². In other words, ρ² = r² = R², and now we can see where R² got its name.
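
The relationship can be checked numerically for a straight line fitted by least squares. This is a sketch with made-up data and function names: it computes Pearson's r directly, then computes the R-squared value from the squared residuals around the mean and around the fitted line.

```python
def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def r_squared_of_fitted_line(xs, ys):
    """R-squared of the least-squares line: (SSR(mean) - SSR(fit)) / SSR(mean)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr_mean = sum((y - mean_y) ** 2 for y in ys)
    ssr_fit = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return (ssr_mean - ssr_fit) / ssr_mean


# Made-up data with a strong, but not perfect, linear trend.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
r2 = r_squared_of_fitted_line(xs, ys)
print(r ** 2, r2)  # the two values agree up to floating-point rounding
```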

What is standard deviation?

A computed measure of how much scores vary around the mean score.

The problem with the Sum of the Squared Residuals (SSR)

The SSR, although awesome, is not super easy to interpret because it depends, in part, on how much data you have. With more data there are more residuals to add up, so the SSR tends to be larger even when the model fits just as well.

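What is cross-validation, and which problem does it solve?

It is a method for deciding which points are used for Training and which for Testing. The data are divided into blocks (folds), and each block takes a turn as the Testing data while the remaining blocks are used for Training. It solves the problem of not knowing in advance which points are best for Testing, and it avoids testing on the same data used for Training.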

Standard deviation in the normal distribution

The standard deviation is helpful because normal curves are drawn such that about 95% of the measurements fall between +/- 2 Standard Deviations around the Mean.
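
The "about 95%" figure can be verified with the standard library's `statistics.NormalDist`. A minimal sketch using the standard normal curve (mean 0, standard deviation 1); the same fraction holds for any mean and standard deviation.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between -2 and +2 standard deviations.
within_two_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(round(within_two_sd, 4))  # 0.9545
```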

