R101 Confidence Intervals and Hypothesis Testing

margin of error for proportion

MOE <- z * SE

margin of error (MOE)

z* × SEM

print results for proportion

cat("Biden support:", p_hat, "\n")
cat("Standard Error:", SE, "\n")
cat("Margin of Error:", MOE, "\n")
cat("95% Confidence Interval: [", lower_bound, ",", upper_bound, "]\n")

how to print results for means

cat("Sample Mean:", sample_mean, "\n")
cat("Sample Standard Deviation:", sample_sd, "\n")
cat("Sample Size:", n, "\n")
cat("Standard Error of the Mean:", sem, "\n")
cat("Critical Value:", critical_value, "\n")
cat("Margin of Error:", margin_of_error, "\n")
cat("95% Confidence Interval: [", lower_bound, ",", upper_bound, "]\n")

how to find the critical value for a __% confidence level

confidence_level <- ___
alpha <- 1 - confidence_level
critical_value <- qnorm(1 - alpha / 2)
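
As an illustration of the recipe above, here is a minimal sketch with the blank filled in as 0.95. The value 0.95 is only an example, not part of the original card; it gives the familiar critical value of about 1.96.

```r
# Critical value z* for a two-sided 95% confidence interval
confidence_level <- 0.95                 # example value chosen for illustration
alpha <- 1 - confidence_level            # total probability left in both tails
critical_value <- qnorm(1 - alpha / 2)   # upper alpha/2 quantile of N(0, 1)
cat("Critical value:", critical_value, "\n")
```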

set.seed()

Makes results reproducible when code generates random values: with the same seed, the same random values are produced each time the code is run.
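
A small sketch of that behaviour: resetting the same seed reproduces the same draws. The seed 123 is arbitrary and the variable names are illustrative.

```r
set.seed(123)              # any fixed integer works; 123 is arbitrary
first_draw <- rnorm(5)     # five random normal values

set.seed(123)              # resetting the seed restarts the same random sequence
second_draw <- rnorm(5)

# TRUE when both draws are exactly the same values
print(identical(first_draw, second_draw))
```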

conclusion if p value is greater than alpha

Fail to reject the null hypothesis. The data do not provide sufficient evidence, at significance level alpha, to conclude that there is an effect or relationship.

how to find confidence interval for proportions with given confidence level

n <- ___
df <- read.csv("___")  # file path left blank
p_hat <- sum(df$target_variable > 100) / n  # use >, <, or != as the question requires
SE <- sqrt((p_hat * (1 - p_hat)) / n)
# Critical value for 99% confidence level (alpha = 0.01, so alpha / 2 = 0.005)
z <- qnorm(1 - 0.005)
MOE <- z * SE
lower_bound <- p_hat - MOE
upper_bound <- p_hat + MOE
cat("Sample Proportion:", p_hat, "\n")
cat("Standard Error:", SE, "\n")
cat("Margin of Error:", MOE, "\n")
cat("99% Confidence Interval: [", lower_bound, ",", upper_bound, "]\n")
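
The same steps can be tried without a data file. This sketch assumes a hypothetical survey with 540 "yes" answers out of 1000; both numbers are made up for illustration.

```r
# Hypothetical counts (not from the original card)
n <- 1000
successes <- 540

p_hat <- successes / n                  # sample proportion
SE <- sqrt(p_hat * (1 - p_hat) / n)     # standard error of the proportion
z <- qnorm(1 - 0.01 / 2)                # 99% confidence, so alpha = 0.01
MOE <- z * SE                           # margin of error

lower_bound <- p_hat - MOE
upper_bound <- p_hat + MOE
cat("99% Confidence Interval: [", lower_bound, ",", upper_bound, "]\n")
```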

how to find confidence level from sample data

n <- _____
df <- read.csv("___")  # file path left blank
p_hat <- sum(df$Spending > 150) / n  # use >, <, or != as the question requires
desired_moe <- _____
SE <- sqrt((p_hat * (1 - p_hat)) / n)
z <- desired_moe / SE
confidence_level <- 2 * pnorm(z) - 1
cat("Sample Proportion of spendings > $150:", p_hat, "\n")
cat("Standard Error:", SE, "\n")
cat("Desired Margin of Error:", desired_moe, "\n")
cat("Z-score required:", z, "\n")
cat("Confidence Level:", confidence_level * 100, "%\n")
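
A sketch of why `2 * pnorm(z) - 1` gives the confidence level: feeding that level back through `qnorm` should return the same margin of error. The values of `n`, `p_hat`, and `desired_moe` below are hypothetical.

```r
# Hypothetical inputs (not from the original card)
n <- 400
p_hat <- 0.30
desired_moe <- 0.04

SE <- sqrt(p_hat * (1 - p_hat) / n)
z <- desired_moe / SE                     # z-score implied by the desired MOE
confidence_level <- 2 * pnorm(z) - 1      # central area between -z and +z

# Round trip: the critical value for that level times SE gives the MOE back
recovered_moe <- qnorm((1 + confidence_level) / 2) * SE
cat("Confidence Level:", confidence_level * 100, "%\n")
cat("Recovered Margin of Error:", recovered_moe, "\n")
```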

how to find sample proportion

n <- _____
dataset_data <- read.csv("___")  # file path left blank
p_hat <- sum(dataset_data$variable_of_interest == "wanted_value") / n  # substitute the real column and value

how to find sample size

n <- nrow(sample_df)

finding critical value Z*

qnorm(p, mean = 0, sd = 1, lower.tail = TRUE)

conclusion if p value is smaller than alpha

Reject the null hypothesis. The data provide sufficient evidence, at significance level alpha, that the effect or relationship being studied is not due to random chance alone.
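
The two decision rules (p-value greater than alpha, p-value smaller than alpha) can be sketched as a single comparison. The test statistic 2.1 and alpha of 0.05 are hypothetical, and a two-sided z test is assumed.

```r
alpha <- 0.05                              # significance level (illustrative)
z_stat <- 2.1                              # hypothetical test statistic

# Two-sided p-value under the standard normal distribution
p_value <- 2 * (1 - pnorm(abs(z_stat)))

decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
cat("p-value:", p_value, "->", decision, "\n")
```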

standard error of the mean (SEM)

s/sqrt(n)

how to find standard error of the mean

sem <- sample_sd / sqrt(n)

critical value for __% confidence for proportion

z <- ____

z test from aggregates

z_test_from_agg(mean1, mean2, sd1, sd2, N1, N2)
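
`z_test_from_agg` appears to be a course-provided helper, and its implementation is not shown in this set. As a rough sketch of what a two-sample z test from summary statistics usually computes, here is a hypothetical stand-in named `z_from_aggregates`. The name, the return format, and the example numbers are all assumptions.

```r
# Hypothetical stand-in; the course's z_test_from_agg may differ in details
z_from_aggregates <- function(mean1, mean2, sd1, sd2, n1, n2) {
  se_diff <- sqrt(sd1^2 / n1 + sd2^2 / n2)   # standard error of the difference
  z <- (mean2 - mean1) / se_diff             # z statistic for mean2 - mean1
  list(
    z = z,
    p_one_sided = 1 - pnorm(z),              # H1: mean2 > mean1
    p_two_sided = 2 * (1 - pnorm(abs(z)))    # H1: mean2 != mean1
  )
}

# Made-up aggregates for illustration
result <- z_from_aggregates(mean1 = 100, mean2 = 104,
                            sd1 = 15, sd2 = 15, n1 = 50, n2 = 50)
cat("z:", result$z, " two-sided p-value:", result$p_two_sided, "\n")
```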

permutations test

- Permutation(df, 'CAT', 'NUM', N, 'v1', 'v2')
- df is the name of the data frame
- CAT is a categorical variable
- NUM is the numerical variable
- N is the number of permutations
- v1 and v2 are two values of the CAT variable
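
`Permutation` looks like a course-provided helper whose source is not shown here. The sketch below is a hypothetical re-implementation of the general idea (shuffle the group labels, recompute the difference in means, and see how often the shuffled difference is at least as large as the observed one). The function name `permutation_sketch` and the toy data are assumptions.

```r
# Hypothetical sketch; the course's Permutation() helper may differ
permutation_sketch <- function(df, cat_col, num_col, n_perm, v1, v2) {
  subset_df <- df[df[[cat_col]] %in% c(v1, v2), ]   # keep only the two groups
  groups <- subset_df[[cat_col]]
  values <- subset_df[[num_col]]

  # Observed difference in means: group v2 minus group v1
  observed <- mean(values[groups == v2]) - mean(values[groups == v1])

  # Differences obtained after randomly shuffling the group labels
  perm_diffs <- replicate(n_perm, {
    shuffled <- sample(groups)
    mean(values[shuffled == v2]) - mean(values[shuffled == v1])
  })

  # One-sided p-value for H1: mean of v2 > mean of v1
  list(observed = observed, p_value = mean(perm_diffs >= observed))
}

set.seed(1)
toy <- data.frame(                                   # simulated toy data
  grp = rep(c("a", "b"), each = 30),
  val = c(rnorm(30, mean = 10, sd = 2), rnorm(30, mean = 13, sd = 2))
)
res <- permutation_sketch(toy, "grp", "val", 2000, "a", "b")
cat("Observed difference:", res$observed, " p-value:", res$p_value, "\n")
```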

interpretation of confidence interval

- at __% confidence, the true ______ falls between ___ and ____
- if we repeated this process multiple times, __% of intervals would contain the true ________
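
The "repeated process" reading can be checked with a small simulation: build many 95% intervals from fresh samples and count how many contain the true mean. The population parameters, sample size, and number of repetitions below are arbitrary choices for illustration.

```r
set.seed(2024)
true_mean <- 50      # simulated population mean (assumed)
true_sd <- 10        # simulated population standard deviation (assumed)
n <- 100             # size of each sample
reps <- 2000         # number of repeated samples
z <- qnorm(0.975)    # critical value for 95% confidence

covered <- replicate(reps, {
  x <- rnorm(n, mean = true_mean, sd = true_sd)
  moe <- z * sd(x) / sqrt(n)
  # TRUE when this sample's interval contains the true mean
  (mean(x) - moe <= true_mean) && (true_mean <= mean(x) + moe)
})

coverage <- mean(covered)
cat("Share of intervals containing the true mean:", coverage, "\n")
```

The share should land close to 0.95, though not exactly on it, because the simulation itself is random.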

rnorm()

- generates a random sample from a normal distribution
- rnorm(n, mean, sd)

relationship between confidence level and width of confidence interval

- higher confidence leads to wider interval
- lower confidence leads to narrower interval
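
A quick sketch of this relationship: with the standard error held fixed, the interval width grows with the confidence level because the critical value grows. The standard error of 2 is a made-up value.

```r
sem <- 2                                         # hypothetical standard error
conf_levels <- c(0.90, 0.95, 0.99)

critical_values <- qnorm(1 - (1 - conf_levels) / 2)
widths <- 2 * critical_values * sem              # full width = 2 * MOE

print(data.frame(confidence = conf_levels, width = widths))
```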

pnorm()

- returns the probability p = P(X <= q), the CDF value over (-infinity, q]
- pnorm(q, mean, sd)

qnorm()

- returns the quantile q for a given probability p (the inverse of pnorm)
- qnorm(p, mean, sd)

dnorm()

- returns the probability density at point x
- dnorm(x, mean, sd)
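
The four normal-distribution functions (rnorm, pnorm, qnorm, dnorm) side by side, as a minimal sketch using the standard normal. The inputs 1.96 and 0 are only convenient examples.

```r
set.seed(7)
x <- rnorm(1000, mean = 0, sd = 1)   # random sample of 1000 values

p <- pnorm(1.96, mean = 0, sd = 1)   # P(X <= 1.96), roughly 0.975
q <- qnorm(p, mean = 0, sd = 1)      # inverse of pnorm: returns 1.96 again
d <- dnorm(0, mean = 0, sd = 1)      # density at 0, equal to 1 / sqrt(2 * pi)

cat("pnorm(1.96):", p, "\n")
cat("qnorm(p):", q, "\n")
cat("dnorm(0):", d, "\n")
```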

z test

- z_test(df, "CAT", "NUM", "v1", "v2")
- df is the name of the data frame
- CAT is a categorical variable
- NUM is the numerical variable
- v1 and v2 are two values of the CAT variable

standard error for proportion

SE <- sqrt((p_hat * (1 - p_hat)) / n)

how to find two tailed proportion tests

library(stats)
df <- read.csv("___")  # file path left blank
milk_transactions <- df[df$Milk == "Yes", ]  # keep only transactions that include Milk
p_hat <- mean(milk_transactions$Bread == "Yes")
n <- nrow(milk_transactions)
se <- sqrt(p_hat * (1 - p_hat) / n)
z <- qnorm(0.99)  # 98% two-sided interval: 1% in each tail, so use 0.99
ci_lower <- p_hat - z * se
ci_upper <- p_hat + z * se
cat("The 98% confidence interval for the proportion of transactions that buy Bread when they buy Milk is:", ci_lower, "to", ci_upper, "\n")

how to find confidence level/interval given margin of error

library(stats)
airbnb_df <- read.csv("___")  # file path left blank
set.seed(123)  # for reproducibility
sample_indices <- sample(nrow(airbnb_df), ___)  # ___ = number of rows to sample
sample_df <- airbnb_df[sample_indices, ]
sample_mean <- mean(sample_df$price, na.rm = TRUE)
sample_sd <- sd(sample_df$price, na.rm = TRUE)
n <- nrow(sample_df)
sem <- sample_sd / sqrt(n)
desired_margin_of_error <- _____
critical_value <- desired_margin_of_error / sem
confidence_level <- 2 * pnorm(critical_value) - 1
lower_bound <- sample_mean - desired_margin_of_error
upper_bound <- sample_mean + desired_margin_of_error
cat("Sample Mean:", sample_mean, "\n")
cat("Sample Standard Deviation:", sample_sd, "\n")
cat("Sample Size:", n, "\n")
cat("Standard Error of the Mean:", sem, "\n")
cat("Critical Value:", critical_value, "\n")
cat("Desired Margin of Error:", desired_margin_of_error, "\n")
cat("Corresponding Confidence Level:", confidence_level * 100, "%\n")
cat("Confidence Interval with Desired Margin of Error: [", lower_bound, ",", upper_bound, "]\n")

how to find sample size necessary to achieve confidence level for given margin of error

library(stats)
airbnb_df <- read.csv("___")  # file path left blank
set.seed(123)  # for reproducibility
sample_indices <- sample(nrow(airbnb_df), ___)  # ___ = number of rows to sample
sample_df <- airbnb_df[sample_indices, ]
sample_sd <- sd(sample_df$price, na.rm = TRUE)
desired_margin_of_error <- __
confidence_level <- _____
alpha <- 1 - confidence_level
z_score <- qnorm(1 - alpha / 2)
required_sample_size <- (z_score * sample_sd / desired_margin_of_error)^2
cat("Sample Standard Deviation:", sample_sd, "\n")
cat("Critical Value:", z_score, "\n")
cat("Desired Margin of Error:", desired_margin_of_error, "\n")
cat("Required Sample Size:", ceiling(required_sample_size), "\n")
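
The formula comes from solving MOE = z * s / sqrt(n) for n. A worked sketch with made-up inputs (standard deviation 50, margin of error 5, 90% confidence) shows the rounding step and confirms that the rounded-up sample size meets the target.

```r
# Hypothetical inputs (not from the original card)
sample_sd <- 50
desired_margin_of_error <- 5
confidence_level <- 0.90

alpha <- 1 - confidence_level
z_score <- qnorm(1 - alpha / 2)

# Solve MOE = z * sd / sqrt(n) for n, then round up to a whole observation
required_sample_size <- ceiling((z_score * sample_sd / desired_margin_of_error)^2)

# Margin of error actually achieved with the rounded-up sample size
achieved_moe <- z_score * sample_sd / sqrt(required_sample_size)
cat("Required Sample Size:", required_sample_size, "\n")
cat("Achieved Margin of Error:", achieved_moe, "\n")
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the desired one.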

how to find sample mean and standard deviation

library(stats)
dataset_df <- read.csv("___")  # file path left blank
set.seed(123)  # for reproducibility
sample_df <- dataset_df[sample(nrow(dataset_df), ___), ]  # ___ = number of rows to sample
sample_mean <- mean(sample_df$target_variable, na.rm = TRUE)  # substitute the real column name
sample_sd <- sd(sample_df$target_variable, na.rm = TRUE)

confidence interval for proportion

lower_bound <- p_hat - MOE
upper_bound <- p_hat + MOE

how to find confidence interval

lower_bound <- sample_mean - margin_of_error
upper_bound <- sample_mean + margin_of_error

how to find margin of error

margin_of_error <- critical_value * sem

confidence interval

mean ± MOE
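
Putting the mean-related cards together (sample mean, standard deviation, SEM, critical value, margin of error, interval), here is an end-to-end sketch. The data are simulated with `rnorm` instead of read from a file, so the variable `prices` and its parameters are assumptions.

```r
set.seed(123)
prices <- rnorm(200, mean = 120, sd = 40)   # simulated data standing in for a real column

sample_mean <- mean(prices)
sample_sd <- sd(prices)
n <- length(prices)

sem <- sample_sd / sqrt(n)                  # standard error of the mean
critical_value <- qnorm(1 - 0.05 / 2)       # 95% confidence
margin_of_error <- critical_value * sem

lower_bound <- sample_mean - margin_of_error
upper_bound <- sample_mean + margin_of_error
cat("95% Confidence Interval: [", lower_bound, ",", upper_bound, "]\n")
```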

alternative hypothesis

mean(df[df$CAT == v2, ]$NUM) > mean(df[df$CAT == v1, ]$NUM)

