Statistics

What is a test statistic?

A test statistic is a standardized value that is calculated from sample data during a hypothesis test (ex. z-score, t-score).

What is a Bernoulli trial?

In the theory of probability and statistics, a Bernoulli trial (or binomial trial) is a random experiment with exactly two possible outcomes, "success" and "failure", in which the probability of success is the same every time the experiment is conducted.

What is a Bayes estimator?

A Bayes estimator is an estimator of an unknown parameter θ that minimizes the expected loss over all observations x of X. In other words, it estimates the unknown parameter in the way that loses the least accuracy on average. Your data may be represented by the function f(x|θ), where θ has a prior distribution. However, you don't know the actual value of θ, so you have to estimate it. An estimator of θ is a real-valued function δ(X1, ..., Xn). The loss function L(θ, a) is also a real-valued function of θ, where a is our estimate: L(θ, a) tells us how much we lose by using a as an estimate when the true value of the parameter is θ. There are different possible loss functions. For instance, the squared error loss function is L(θ, a) = (θ − a)², and the absolute error loss function is L(θ, a) = |θ − a|.
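
For instance, under squared error loss the Bayes estimator is the posterior mean. A minimal sketch in Python, using a made-up conjugate beta-binomial coin-flip example (the prior and the observed counts are hypothetical):

    # Estimate a coin's success probability theta under squared error loss
    # L(theta, a) = (theta - a)^2; the Bayes estimator is the posterior mean.
    alpha_prior, beta_prior = 2, 2   # hypothetical Beta(2, 2) prior on theta
    successes, failures = 7, 3       # hypothetical observed flips

    # The beta prior is conjugate to the binomial likelihood, so the
    # posterior is Beta(alpha_prior + successes, beta_prior + failures).
    alpha_post = alpha_prior + successes
    beta_post = beta_prior + failures

    bayes_estimate = alpha_post / (alpha_post + beta_post)   # posterior mean
    print(f"Bayes estimate of theta: {bayes_estimate:.3f}")  # 0.643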

What is a chi-square test?

A chi-squared test, also written as χ2 test, is a statistical hypothesis test that is valid to perform when the test statistic is chi-squared distributed under the null hypothesis. Test statistics that follow a χ2 distribution occur when the observations are independent and normally distributed, which assumptions are often justified under the central limit theorem. There are two types of chi-square tests: the independence test and the goodness of fit test.

What is a confidence interval?

A confidence interval gives us a range of values that is likely to contain the population parameter. The confidence interval is generally preferred over a point estimate, as it tells us how likely this interval is to contain the population parameter. This likelihood or probability is called the confidence level or confidence coefficient and is represented by 1 − α, where α is the level of significance.
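
For example, a 95% confidence interval for a mean can be computed from the t-distribution. A minimal sketch with scipy (the data are made up):

    import numpy as np
    from scipy import stats

    data = np.array([4.1, 5.3, 4.8, 5.0, 4.6, 5.2, 4.9])  # hypothetical sample
    mean = data.mean()
    sem = stats.sem(data)  # standard error of the mean

    # Confidence level 1 - alpha = 0.95, so alpha = 0.05
    low, high = stats.t.interval(0.95, df=len(data) - 1, loc=mean, scale=sem)
    print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")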

What is a critical value?

A critical value is a point (or points) on the scale of the test statistic beyond which we reject the null hypothesis; it is derived from the significance level α of the test.

What is a Poisson distribution?

A discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space. A distribution is Poisson when the following assumptions are valid:
- Any successful event does not influence the outcome of another successful event (events are independent).
- The average rate of success is constant, so the probability of success over an interval depends only on the interval's length.
- The probability of a success in an interval approaches zero as the interval becomes smaller.
An example that may follow a Poisson distribution is the number of phone calls received by a call center per hour throughout the day.
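
Sticking with the call-center example, scipy can evaluate Poisson probabilities directly. A small sketch (the hourly rate of 10 calls is made up):

    from scipy import stats

    lam = 10  # hypothetical average of 10 calls per hour

    print(stats.poisson.pmf(8, mu=lam))   # P(exactly 8 calls)   ~ 0.113
    print(stats.poisson.sf(15, mu=lam))   # P(more than 15 calls) ~ 0.049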

What is linear regression? What do the terms p-value and coefficient mean?

Linear regression is a good tool for quick predictive analysis. To examine a linear relationship between variables, we build a linear regression, which fits the line of best fit between them and can help conclude whether these factors have a positive or negative relationship. The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect) and is the result of a t-test. A low p-value (< 0.05) indicates that you can reject the null hypothesis. In other words, a predictor that has a low p-value is likely to be a meaningful addition to your model, because changes in the predictor's value are related to changes in the response variable. Regression coefficients represent the mean change in the response variable for one unit of change in the predictor variable while holding other predictors in the model constant.
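
A quick sketch of fitting a linear regression and reading off coefficients and p-values, using statsmodels on simulated data (the true slope and intercept are made up for the example):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    x = rng.normal(size=100)
    y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=100)  # true intercept 1, slope 2

    X = sm.add_constant(x)        # add the intercept term
    model = sm.OLS(y, X).fit()
    print(model.params)           # estimated intercept and slope
    print(model.pvalues)          # t-test p-value for each coefficient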

What is a normal distribution?

A normal, or Gaussian, distribution is when data is distributed around a central value without any bias to the left or right and takes the form of a bell-shaped curve. Statistically speaking, the mean, mode, and median all fall at the center and are essentially the same.

What is the difference between two-tailed and one-tailed tests?

A two-tailed test is one that can test for differences in both directions. For example, a two-tailed 2-sample t-test can determine whether the difference between group 1 and group 2 is statistically significant in either the positive or negative direction. A one-tailed test can only assess one of those directions.

What is the difference between type I vs type II error?

A type I error occurs when the null hypothesis is true, but is rejected. A type II error occurs when the null hypothesis is false, but erroneously fails to be rejected.

What is Barnard's test?

Barnard's test is an exact test used in the analysis of 2×2 contingency tables. It examines the association of two categorical variables and is a more powerful alternative to Fisher's exact test. Barnard's test differs from Fisher's exact test in how it calculates the p-value. Like Fisher's test, it is used in A/B testing for experiments where the data follow a binomial distribution.

What is a statistical interaction?

Basically, an interaction is when the effect of one factor (input variable) on the dependent variable (output variable) differs among levels of another factor.

What is correlation and covariance?

Both correlation and covariance establish the relationship and measure the dependency between two random variables. Each describes the systematic relation between a pair of random variables, wherein a change in one variable is accompanied by a corresponding change in the other. Correlation is considered the better technique for measuring and estimating the quantitative relationship between two variables, since it is the normalized value of covariance and is much easier to interpret.

What is cluster sampling?

Cluster sampling is a technique used when it becomes difficult to study the target population spread across a wide area and simple random sampling cannot be applied. Cluster Sample is a probability sample where each sampling unit is a collection or cluster of elements. For example, a researcher wants to survey the academic performance of high school students in Japan. He can divide the entire population of Japan into different clusters (cities). Then the researcher selects a number of clusters depending on his research through simple or systematic random sampling.

What are probability distributions?

For any given random process there is both a range of values that are possible and a likelihood that a single draw from the random process will take on one of those values. Probability distributions provide the likelihood for all possible values of a given process.

How do you state the null hypothesis and alternate hypothesis?

If there is an expectation of what is to occur, that expectation is the alternate hypothesis. It should be converted from words into a mathematical comparison between some parameter and a value. The null hypothesis simply states what happens if the alternate hypothesis is not true. If there is no expectation of what is to occur, the null hypothesis states that the experiment makes no difference, and the alternate hypothesis states that it does.

What does it mean to be unbiased in Statistics?

If your statistic neither systematically underestimates nor overestimates a population parameter (its expected value equals the parameter), then that statistic is said to be unbiased.

What is the z-test?

In a z-test, the sample is assumed to be normally distributed. A z-score is calculated with population parameters such as the population mean and population standard deviation, and it is used to validate the hypothesis that the sample drawn belongs to the same population. Null: the sample mean is the same as the population mean. Alternate: the sample mean is not the same as the population mean. The statistic used for this hypothesis test is called the z-score.

What are t-distributions?

In probability and statistics, Student's t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown.

When you perform a t-test for a single study, you obtain a single t-value. However, if we drew multiple random samples of the same size from the same population and performed the same t-test, we would obtain many t-values and could plot a distribution of all of them. This type of distribution is known as a sampling distribution. Fortunately, the properties of t-distributions are well understood in statistics, so we can plot them without having to collect many samples! A specific t-distribution is defined by its degrees of freedom (DF): the sample size minus 1 for a one-sample test, or the total sample size minus 2 for a two-sample test. Therefore, different t-distributions exist for every sample size.

T-distributions assume that you draw repeated random samples from a population where the null hypothesis is true. You place the t-value from your study in the t-distribution to determine how consistent your results are with the null hypothesis. The distribution plots the probability density function (PDF), which describes the likelihood of each t-value. If you take a t-value and place it in the context of the correct t-distribution, you can calculate the probabilities associated with that t-value: how common or rare your t-value is under the assumption that the null hypothesis is true. If that probability is low enough, you can conclude that the effect observed in your sample is inconsistent with the null hypothesis, and the evidence in the sample data is strong enough to reject the null hypothesis for the entire population. The area under the curve beyond the observed t-value (in the tail or tails being tested) is the p-value. As the DF increases, the probability density in the tails decreases and the distribution becomes more tightly clustered around the central value.
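
This behavior is easy to see numerically: as the degrees of freedom grow, tail probabilities shrink and the critical value approaches the normal distribution's 1.96. A small sketch with scipy:

    from scipy import stats

    for df in (5, 10, 30, 100):
        crit = stats.t.ppf(0.975, df)   # two-sided 5% critical value
        tail = stats.t.sf(2.0, df)      # P(T > 2.0), the upper-tail area
        print(f"df={df:3d}  critical t={crit:.3f}  P(T > 2) = {tail:.4f}")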

What is a multinomial distribution?

In probability theory, the multinomial distribution is a generalization of the binomial distribution. For example, it models the probability of counts for each side of a k-sided die rolled n times. For n independent trials each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.

What is a point estimate and what are common point estimation methods?

In simple terms, any statistic can be a point estimate. A statistic is an estimator of some parameter in a population. The sample standard deviation (s) is a point estimate of the population standard deviation (σ). The sample mean is a point estimate of the population mean, μ. In more formal terms, the estimate occurs as a result of point estimation applied to a set of sample data. Points are single values, in contrast to interval estimates, which are ranges of values. Some common point estimation methods are:
- The Method of Moments: often not very accurate and has a tendency to be biased.
- Maximum Likelihood: uses a distribution model and chooses the parameter values that maximize a likelihood function.
- Bayes Estimators: minimize the average risk (an expectation over random variables).
- Best Unbiased Estimators: several unbiased estimators can be used to approximate a parameter.

What Are Confounding Variables?

In statistics, a confounder is a variable that influences both the dependent variable and the independent variable. For example, if you are researching whether a lack of exercise leads to weight gain, lack of exercise is the independent variable and weight gain is the dependent variable. A confounding variable here would be any other variable that affects both of these variables, such as the age of the subject.

What is the Law of Large Numbers?

It is a theorem that describes the result of performing the same experiment a large number of times. This theorem forms the basis of frequency-style thinking. It says that the sample mean, the sample variance, and the sample standard deviation converge to the quantities they are trying to estimate as the sample size grows.

What is survivorship bias?

It is the logical error of focusing on the aspects that survived some process while casually overlooking those that did not because of their lack of prominence. This can lead to wrong conclusions in numerous ways.

What is maximum likelihood estimation?

Maximum likelihood is a way to find the most likely probability distribution to explain a set of observed data. MLE takes known probability distributions (like the normal distribution) and compares data sets to those distributions in order to find a suitable match for the data. A family of distributions can have an infinite number of possible parameters. Maximum likelihood estimation is one way to find the parameters of the population that is most likely to have generated the sample being tested. How well the data match the model is known as "goodness of fit." MLE chooses the model parameters based on the values that maximize the likelihood function. The likelihood of a sample is the probability of getting that sample, given a specified probability distribution model, and the likelihood function is a way to express that probability: the parameters that maximize the probability of getting that sample are the maximum likelihood estimators.
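
A minimal sketch of MLE in practice: fit a normal model to data by numerically minimizing the negative log-likelihood, then check against the closed-form answers. The simulated data and starting values are arbitrary:

    import numpy as np
    from scipy import optimize, stats

    rng = np.random.default_rng(1)
    sample = rng.normal(loc=5.0, scale=2.0, size=500)  # hypothetical data

    def neg_log_likelihood(params):
        mu, sigma = params
        return -np.sum(stats.norm.logpdf(sample, loc=mu, scale=sigma))

    result = optimize.minimize(neg_log_likelihood, x0=[0.0, 1.0],
                               bounds=[(None, None), (1e-6, None)])
    print(result.x)                     # numerical MLE of (mu, sigma)
    print(sample.mean(), sample.std())  # closed-form MLEs agree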

Why are Gaussian distributions often used to model data?

Most machine learning datasets are not necessarily from a Gaussian distribution; data is generated by the real world. Distributions enter only because we pick what is hopefully a fairly close approximation to what we hope or think the data really look like in general. We often use the Gaussian distribution to underlie our statistical/machine learning algorithms, usually implicitly assuming that the data are generated by this distribution. The Gaussian distribution is statistically simple to work with, and the central limit theorem makes it a natural default, since sums and averages of many independent effects tend toward it.

What is R-squared and how do you interpret it?

R-squared is a statistical measure of how close the data are to the fitted regression line. It is the percentage of the response variable variation that is explained by a linear model. R-squared cannot determine whether the coefficient estimates and predictions are biased, and it does not indicate whether a regression model is adequate. You can have a low R-squared value for a good model, or a high R-squared value for a model that does not fit the data!

If your R-squared value is low but you have statistically significant predictors, you can still draw important conclusions about how changes in the predictor values are associated with changes in the response value. If you want precise predictions, then a high R-squared is desired. However, a high R-squared does not guarantee a good fit either. If the regression line systematically over- and under-predicts the data (bias) at different points along the curve, this is a bad fit. This is happening if you can see patterns in the residuals-versus-fits plot rather than randomness, which serves as a reminder of why you should always check the residual plots. It can occur when your model is missing important predictors, polynomial terms, or interaction terms.

R-squared does not provide a formal hypothesis test for the relationship; the F-test of overall significance determines whether the relationship is statistically significant. Finally, every time you add a predictor to a model, the R-squared increases, even if due to chance alone; it never decreases. Consequently, a model with more terms may appear to have a better fit simply because it has more terms.

What is selection bias?

Selection (or "sampling") bias occurs in an "active" sense when the sample data gathered and prepared for modeling has characteristics that are not representative of the true, future population of cases the model will see. That is, active selection bias occurs when a subset of the data is systematically (i.e., non-randomly) excluded from analysis.

What does the term "statistically significant" mean?

Simply put, if you have a significant result, it means that your results likely did not happen by chance.

What is systematic sampling?

Systematic sampling is a statistical technique where elements are selected from an ordered sampling frame. In systematic sampling, the list is progressed in a circular manner, so once you reach the end of the list, you continue from the top again. Systematic sampling is used when an ordered list of all members of the population is available. The process involves selecting every kth individual on the list, starting from a randomly selected point.
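
A sketch of selecting every kth element with a random start (the population here is just a stand-in index list):

    import numpy as np

    population = np.arange(1000)   # hypothetical ordered sampling frame
    n = 50                         # desired sample size
    k = len(population) // n       # sampling interval

    start = np.random.default_rng(2).integers(k)  # random start in [0, k)
    sample = population[start::k]                 # every k-th element
    print(len(sample), sample[:5])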

What is the chi-square test of independence?

The chi-square test of independence determines whether there is a statistically significant relationship between categorical variables in a contingency table. It is a hypothesis test that answers the question: do the values of one categorical variable depend on the values of another categorical variable? The null hypothesis is that there is no relationship between the categorical variables; the alternate hypothesis is that there is a relationship.

The test works by comparing the distribution that you observe to the distribution that you would expect if there were no relationship between the categorical variables. If your observed distribution is sufficiently different from the expected distribution (no relationship), you can reject the null hypothesis and infer that the variables are related. A p-value that is less than or equal to your significance level indicates there is sufficient evidence to conclude that the observed distribution is not the same as the expected distribution, so a relationship exists between the categorical variables. The degrees of freedom for a contingency table is (number of rows − 1) × (number of columns − 1). Describing the relationship between the categorical variables involves comparing the observed count to the expected count in each cell, so you can see which cells are higher or lower than expected under the null hypothesis.
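
A minimal sketch with scipy, using a made-up 2×3 contingency table:

    import numpy as np
    from scipy import stats

    observed = np.array([[30, 10, 20],    # hypothetical counts; rows and
                         [20, 25, 15]])   # columns are two categorical variables

    chi2, p, dof, expected = stats.chi2_contingency(observed)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")  # dof = (2-1)*(3-1) = 2
    print(expected)  # counts expected if the variables were independent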

What are F-statistics and F-tests?

The F-statistic is simply a ratio of two variances. More specifically, F-statistics are based on the ratio of mean squares; the term "mean squares" is simply an estimate of population variance that accounts for the degrees of freedom (DF) used to calculate that estimate. In general, an F-statistic is a ratio of two quantities that are expected to be roughly equal under the null hypothesis, which produces an F-statistic of approximately 1. An F-test can assess the equality of variances; however, by changing the variances that are included in the ratio, the F-test becomes a very flexible test. For example, you can use F-statistics and F-tests to test the overall significance of a regression model, to compare the fits of different models, to test specific regression terms, and to test the equality of means (ANOVA).

What is the Kruskal Wallis test?

The Kruskal-Wallis test is the nonparametric alternative to the one-way ANOVA (F-test). It is used when the assumptions for ANOVA aren't met (like the assumption of normality). It is sometimes called the one-way ANOVA on ranks, as the ranks of the data values are used in the test rather than the actual data points. The test determines whether the medians of two or more groups are different. Like most statistical tests, you calculate a test statistic and compare it to a distribution cut-off point; the test statistic used here is called the H statistic. The hypotheses for the test are: Null: the population medians are equal. Alternate: the population medians are not equal. The test assumes one independent variable with two or more levels (independent groups); it is more commonly used when you have three or more levels. For two levels, consider using the Mann-Whitney U test instead. The Kruskal-Wallis test will tell you if there is a significant difference between groups, but it won't tell you which groups are different. For that, you'll need to run a post hoc test.

What is the Mann-Whitney U test?

The Mann-Whitney U test is the nonparametric equivalent of the two-sample t-test. While the t-test makes an assumption about the distribution of the population (i.e., that the samples came from normally distributed populations), the Mann-Whitney U test makes no such assumption. The test compares two populations. The null hypothesis for the test is that the probability is 50% that a randomly drawn member of the first population will exceed a member of the second population; in other words, that the two samples come from populations with the same median. The result of performing a Mann-Whitney U test is a U statistic.

What is the t-score? What are Student's t-tests?

The Student's t-test is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis. The t-test tells you how significant the differences between groups are; in other words, it lets you know if those differences (measured in means) could have happened by chance. It is similar to the z-test, but it is more useful for sample sizes below 30 or when the population standard deviation is unknown. The t-score is a ratio of the difference between two groups to the difference within the groups. The larger the t-score, the more difference there is between groups; the smaller the t-score, the more similarity there is between groups. A t-score of 3 means that the groups are three times as different from each other as they are within each other. When you run a t-test, the bigger the t-score, the more likely it is that the results are repeatable. For example, the t-score is used in estimating the population mean from a sampling distribution of sample means when the population standard deviation is unknown. It is also used along with the p-value when running a hypothesis test, where the p-value is the probability that the results from your sample data occurred by chance. For a one-sample test, the formula is t = (x̄ − μ) / (s / √n), where x̄ is the sample mean, μ is the population mean, and the denominator s / √n is the estimated standard error (s is the sample standard deviation).
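
A quick sketch computing a one-sample t-score by hand and confirming it against scipy (the sample and hypothesized mean are made up):

    import numpy as np
    from scipy import stats

    sample = np.array([5.2, 4.8, 5.5, 5.0, 4.7, 5.3, 5.1, 4.9])  # hypothetical
    mu0 = 4.8  # hypothesized population mean

    se = sample.std(ddof=1) / np.sqrt(len(sample))  # estimated standard error
    t_manual = (sample.mean() - mu0) / se

    t_scipy, p = stats.ttest_1samp(sample, popmean=mu0)
    print(t_manual, t_scipy, p)  # the two t-values agree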

What are some alternative metrics to R-squared, and what advantages do they offer?

The adjusted R-squared is a modified version of R-squared that has been adjusted for the number of predictors in the model. The adjusted R-squared increases only if a new term improves the model more than would be expected by chance; it decreases when a predictor improves the model by less than expected by chance. The standard error of the regression, S, represents the average distance that the observed values fall from the regression line. Conveniently, it tells you how wrong the regression model is on average, using the units of the response variable. Unlike R-squared, you can use the standard error of the regression to assess the precision of the predictions, such as checking whether predictions on average fall within ±2S of the observed values.

What is method of moments? What are its pros and cons?

The method of moments is a way to estimate population parameters, like the population mean (first moment) or the population standard deviation (second moment). The basic idea is that you take known facts about the population and extend those ideas to a sample. For example, the estimated mean of a population is the sample mean, and the estimated variance of a population is the sample variance. The same idea is applied to higher-order moments like skewness and kurtosis. The advantage of this method is that it is simple and can be performed by hand. The disadvantages are that it is not very accurate, especially for small sample sizes, and that it does not take into account all of the available information in the sample.

What is the z-score?

The number of standard deviations from the mean a data point is. More technically, it's a measure of how many standard deviations below or above the population mean a raw score is. A z-score is also known as a standard score and can be placed on a normal distribution curve. You can use your knowledge of normal distributions (like the 68-95-99.7 rule) or a z-table to determine what percentage of the population will fall below or above your result. For each significance level in the confidence interval, the z-test has a single critical value, which makes it more convenient than the Student's t-test, whose critical values are defined by the sample size (through the corresponding degrees of freedom). The z-score formula for a data point x is z = (x − μ) / σ. The z-score formula for a sample mean is z = (x̄ − μ) / (σ / √n), where x̄ is the sample mean; this z-score tells you how many standard errors there are between the sample mean and the population mean.

How do you use t-tests in regression analysis?

The t-test is used on each coefficient to test if it is statistically significant. The null hypothesis is that the coefficient is equal to zero. Under the assumption this null hypothesis is true, we can generate a t-distribution. We then compute a t-value: t = (estimated coefficient − 0) / standard error, where the standard error measures the variability of the coefficient estimate based on the spread of the data points around the fitted line. This t-value falls somewhere on the t-distribution, and we can compute the p-value: the probability that this t-value would be generated by chance given that the coefficient is zero. If the t-value falls outside the critical t-statistic for a significance level of 0.05, we can reject the null hypothesis and say this coefficient is very likely a predictor of the response variable. If the p-value is greater than 0.05, there is a good chance the coefficient is actually zero.

What are the different t-tests? When should you use each t-test?

There are three main types of t-test: An independent samples t-test (two-sample t-test) compares the means of two groups. A paired sample t-test compares means from the same group at different times; the samples are dependent. A one-sample t-test tests the mean of a single group against a known mean. Choose the paired t-test if you have two measurements on the same item, person, or thing. You should also choose this test if you have two items that are being measured under the same unique condition. For example, you might be measuring car safety performance in vehicle research and testing by subjecting the cars to a series of crash tests: although the manufacturers are different, you are subjecting them to the same conditions. With an independent samples t-test, you're comparing the means of two different samples. For example, you might test two different groups of students from two universities on their English skills. If you take a random sample from each group separately and they have different conditions, your samples are independent and you should run an independent samples t-test (also called between-samples or unpaired-samples). The one-sample t-test compares the mean of your sample data to a known value. For example, you might want to know how your sample mean compares to the population mean. You should run a one-sample t-test when you don't know the population standard deviation and you have a small sample size.

What does it mean for a test to be non-parametric?

When the word "non-parametric" is used in stats, it doesn't quite mean that you know nothing about the population. It usually means that you know the population data does not have a normal distribution.

What is a p-value?

When you perform a hypothesis test in statistics, a p-value can help you determine the strength of your results. The p-value is a number between 0 and 1, and its value denotes the strength of the evidence against the claim on trial, which is called the null hypothesis. A low p-value (≤ 0.05) indicates evidence against the null hypothesis, which means we can reject it. A high p-value (> 0.05) indicates weak evidence against the null hypothesis, which means we fail to reject it. A p-value right at the 0.05 cutoff is marginal and could go either way. Formally, the p-value is the probability of observing a test statistic at least as extreme as the one computed from the sample, under the assumption that the null hypothesis is true.

What is the difference between t-test and z-test?

z-tests are used when we have large sample sizes (n > 30) and population standard deviation is known, whereas t-tests are most helpful with a smaller sample size (n < 30) and population standard deviation is unknown.

What is an exponential distribution?

- Describes the interval of time between events
- Widely used for survival analysis, such as the expected life of a machine

What is the process for performing a t-test?

1. Specify the null hypothesis.
2. Specify the alternate hypothesis.
3. Identify parameters. For a common one-sample t-test, this would be the population mean, the sample mean, the sample standard deviation, and the sample size.
4. Compute the t-score.
5. Based on whether the problem is one- or two-tailed, find the critical t-value in the t-table for the given significance level and degrees of freedom, and construct a confidence interval. There is a high probability that t-values for a population where the null hypothesis is true will fall in this interval.
6. Compare the t-score from step 4 to the critical t-value from step 5. If the t-score is higher than the critical t-value, you can reject the null hypothesis, since the p-value (the area under the t-distribution beyond the t-score) is smaller than the significance level (the area under the t-distribution beyond the critical t-value).
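
A sketch of these steps for a two-tailed one-sample test (the data, hypothesized mean, and significance level are made up):

    import numpy as np
    from scipy import stats

    # Steps 1-2: H0: mu = 100, Ha: mu != 100 (two-tailed)
    sample = np.array([102, 98, 105, 103, 99, 104, 101, 107])  # hypothetical
    alpha, mu0 = 0.05, 100

    # Steps 3-4: identify parameters and compute the t-score
    n = len(sample)
    t_score = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))

    # Step 5: two-tailed critical value at alpha with n - 1 degrees of freedom
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

    # Step 6: compare
    print(f"t = {t_score:.3f}, critical t = {t_crit:.3f}")
    print("Reject H0" if abs(t_score) > t_crit else "Fail to reject H0")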

What is the Binomial Probability Distribution?

A distribution where only two outcomes are possible and the probabilities of success and failure do not change across trials. The two outcomes need not be equally likely. Each trial is independent. A total of n trials are conducted, with each trial having a probability p of success and (1 − p) of failure.

What is the significance level?

A probability level needs to be set such that the chance of a Type I error occurring is established. This level is called the significance level and is denoted by alpha (α). A lower α means that the test is very stringent; a relatively higher α means that the test is not so strict. For example, α = 0.05 signifies a 5% chance of rejecting the null hypothesis when it is true.

What is ANOVA?

Analysis of variance (ANOVA) can determine whether the means of three or more groups are different. ANOVA uses F-tests to statistically test the equality of means. In one-way ANOVA, the F-statistic is the ratio: F = (variation between sample means) / (variation within the samples). The null hypothesis is that the group means are equal, so to reject it we need a high F-value.

A single F-value is hard to interpret on its own, so we use the F-distribution to calculate probabilities. For one-way ANOVA, the ratio of the between-group variability to the within-group variability follows an F-distribution when the null hypothesis is true. When you perform a one-way ANOVA for a single study, you obtain a single F-value. However, if we drew multiple random samples of the same size from the same population and performed the same one-way ANOVA, we would obtain many F-values and could plot a distribution of all of them. Because the F-distribution assumes that the null hypothesis is true, we can place the F-value from our study in the F-distribution to determine how consistent our results are with the null hypothesis and to calculate probabilities. The probability we want is that of observing an F-statistic at least as high as the value our study obtained; this is the p-value (the area under the F-distribution to the right of the observed F-value). If this probability is low enough, we can conclude that our data is inconsistent with the null hypothesis. In short, ANOVA uses the F-test to determine whether the variability between group means is larger than the variability of the observations within the groups.
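
A minimal one-way ANOVA sketch with scipy (three hypothetical groups):

    from scipy import stats

    group1 = [85, 86, 88, 75, 78, 94, 98, 79]  # hypothetical measurements
    group2 = [91, 92, 93, 85, 87, 84, 82, 88]
    group3 = [79, 78, 88, 94, 92, 85, 83, 85]

    f_stat, p = stats.f_oneway(group1, group2, group3)
    print(f"F = {f_stat:.3f}, p = {p:.4f}")
    # A large F (between-group variability much larger than within-group)
    # and a small p-value argue against equal group means.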

What is the chi-square goodness-of-fit test?

The chi-square goodness-of-fit test is a non-parametric test for categorical variables that is used to find out whether the observed values of a given phenomenon differ significantly from the expected values. Here "goodness of fit" refers to comparing the observed sample distribution with the expected probability distribution, so the population does NOT need to be normally distributed; the reference distribution is provided in the problem statement. The test determines how well a theoretical distribution fits the empirical distribution. The sample data are divided into intervals (classes), and the number of points that fall into each interval is compared with the expected number of points in that interval. The null hypothesis assumes that there is no significant difference between the observed and the expected values. The p-value is the probability that a chi-square statistic with k degrees of freedom is as extreme as the chi-square statistic you computed; if this p-value is less than the significance level 0.05, we reject the null hypothesis and conclude that there is a significant difference between the observed and the expected frequencies. The degrees of freedom is the number of classes in the categorical variable minus 1. You will need to compute the expected frequency counts for the sample, as your null hypothesis distribution, using the information given in the problem statement.
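
For example, testing whether a die is fair from 120 rolls. A sketch with scipy (the observed counts are made up):

    import numpy as np
    from scipy import stats

    observed = np.array([25, 17, 15, 23, 24, 16])  # hypothetical counts per face
    expected = np.full(6, 120 / 6)                 # fair die: 20 per face

    chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")       # df = 6 - 1 = 5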

What is sampling? What are some sampling methods?

Data sampling is a statistical analysis technique used to select, manipulate and analyze a representative subset of data points to identify patterns and trends in the larger data set being examined. Simple random sampling is used to randomly select subjects from the whole population. Stratified sampling is when subsets of the data sets or population are created based on a common factor, and samples are randomly collected from each subgroup.

What is Fisher's exact test?

Fisher's exact test of independence is a statistical test used when you have two nominal variables and want to find out whether the proportions for one nominal variable differ among the values of the other nominal variable. For experiments with small numbers of participants (under around 1,000), Fisher's test is more accurate than the chi-square test or G-test. To get a result for this test, calculate the probability of getting the observed data under the null hypothesis that the proportions are the same for both sets. Fisher's exact test uses a contingency table to display the different outcomes for an experiment. It is commonly used in A/B testing for binomial outcomes (e.g., click-through rate). An alternative is Barnard's test.
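
A sketch of Fisher's exact test on a made-up A/B click-through table:

    from scipy import stats

    table = [[12, 188],   # variant A: 12 clicks, 188 non-clicks (hypothetical)
             [25, 175]]   # variant B: 25 clicks, 175 non-clicks (hypothetical)

    odds_ratio, p = stats.fisher_exact(table)
    print(f"odds ratio = {odds_ratio:.3f}, p = {p:.4f}")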

What is hypothesis testing?

Hypothesis testing is a form of inferential statistics that allows us to draw conclusions about an entire population based on a representative sample. The framing of a hypothesis test is always: does the data I have support my idea, or could my data just be due to chance? The way scientists quantify "due to chance" is by assessing the likelihood of observing their data given that their idea is wrong. The process is as follows:
1. Quantify the size of the apparent effect by choosing a test statistic.
2. Define a null hypothesis, which is a model of the system based on the assumption that the apparent effect is not real.
3. Compute a p-value, which is the probability of seeing the apparent effect if the null hypothesis is true.
4. Interpret the result. If the p-value is low, the effect is said to be statistically significant, which means that it is unlikely to have occurred by chance. In that case we infer that the effect is more likely to appear in the larger population.

How do you use F-tests in regression analysis?

In general, an F-test in regression compares the fits of different linear models. You might get a regression model that achieves a higher R^2 value than another regression model, but this could be due to chance; to determine whether there is a statistically significant difference between the models, you perform an F-test. The F-value itself doesn't say much, but the p-value does: it tells you the probability of getting the F-value if the null hypothesis is true. The F-test of overall significance is a specific form of the F-test. It compares a model with no predictors to the model that you specify. A regression model that contains no predictors is also known as an intercept-only model, which simply predicts the average of the response variable. The hypotheses for the F-test of overall significance are as follows. Null hypothesis: the fit of the intercept-only model and your model are equal. Alternative hypothesis: the fit of the intercept-only model is significantly worse than that of your model. If the p-value for the F-test of overall significance is less than your significance level, you can reject the null hypothesis and conclude that your model provides a better fit than the intercept-only model.

How do you determine degrees of freedom and why are they important?

In statistics, the degrees of freedom (DF) indicate the number of independent values that can vary in an analysis without breaking any constraints. It is an essential idea that appears in many contexts throughout statistics including hypothesis tests, probability distributions, and regression analysis. Degrees of freedom encompasses the notion that the amount of independent information you have limits the number of parameters that you can estimate. Typically, the degrees of freedom equal your sample size minus the number of parameters you need to calculate during an analysis. Degrees of freedom also define the probability distributions for the test statistics of various hypothesis tests. For example, hypothesis tests use the t-distribution, F-distribution, and the chi-square distribution to determine statistical significance. Each of these probability distributions is a family of distributions where the degrees of freedom define the shape. Hypothesis tests use these distributions to calculate p-values. So, the DF directly link to p-values through these distributions!

What is A/B testing?

It is a user experience research methodology. A/B tests consist of a randomized experiment with two variants, A and B. It includes application of statistical hypothesis testing or "two-sample hypothesis testing" as used in the field of statistics. A/B testing is a way to compare two versions of a single variable, typically by testing a subject's response to variant A against variant B, and determining which of the two variants is more effective. A/B tests are useful for understanding user engagement and satisfaction of online features, such as a new feature or product. Large social media sites like LinkedIn, Facebook, and Instagram use A/B testing to make user experiences more successful and as a way to streamline their services. Multivariate testing or multinomial testing is similar to A/B testing, but may test more than two versions at the same time or use more controls.

What is the Central Limit Theorem and why is it important?

Suppose that we are interested in estimating the average height among all people. Collecting data for every person in the world is impossible, but while we can't obtain a height measurement from everyone in the population, we can still sample some people. The question then becomes: what can we say about the average height of the entire population given a single sample? The Central Limit Theorem addresses this question exactly. Formally, it states that if we sample from a population using a sufficiently large sample size, the means of those samples (the sampling distribution of the mean) will be normally distributed (assuming true random sampling). What's especially important is that this will be true regardless of the distribution of the original population. A minimum sample size of 30 is standard. Because of the central limit theorem, many test statistics are approximately normally distributed for large samples.
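
A quick simulation makes the theorem concrete: even for a heavily skewed population, the means of repeated samples cluster into an approximately normal shape. A sketch with numpy (the population and sample sizes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(42)
    population = rng.exponential(scale=2.0, size=100_000)  # skewed, not normal

    # Draw many samples of size 30 and record each sample's mean
    sample_means = [rng.choice(population, size=30).mean() for _ in range(5000)]

    # The sampling distribution of the mean is approximately normal,
    # centered on the population mean with spread sigma / sqrt(n).
    print(np.mean(sample_means), population.mean())              # close
    print(np.std(sample_means), population.std() / np.sqrt(30))  # close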

What is the three sigma rule?

The empirical rule states that for a normal distribution, nearly all of the data will fall within three standard deviations of the mean. The empirical rule can be broken down into three parts: 68% of data falls within the first standard deviation from the mean. 95% fall within two standard deviations. 99.7% fall within three standard deviations.

What is the difference between a t-test and an F-test?

The t-test is used to test whether two samples have the same mean, under the assumption that the samples come from normal distributions. An F-test is used to test whether two samples have the same variance, under the same assumptions.

What are the assumptions required for linear regression?

There are four major assumptions:
1. There is a linear relationship between the dependent variable and the regressors, meaning the model you are creating actually fits the data.
2. The errors or residuals of the data are normally distributed and independent from each other.
3. There is minimal multicollinearity between explanatory variables.
4. Homoscedasticity: the variance around the regression line is the same for all values of the predictor variable.

What is the Wilcoxon Signed Rank test?

Two slightly different versions of the test exist: The Wilcoxon signed rank test compares your sample median against a hypothetical median. The Wilcoxon matched-pairs signed rank test computes the difference between each set of matched pairs, then follows the same procedure as the signed rank test to compare the sample against some median. The null hypothesis for this test is that the medians of two samples are equal. It is generally used as a non-parametric alternative to the one-sample t test or paired t test.

What is Welch's t-test?

Welch's t-test, or unequal variances t-test, is a two-sample test which is used to test the hypothesis that two populations have equal means. It is an adaptation of Student's t-test and is more reliable when the two samples have unequal variances and/or unequal sample sizes. It is commonly used in A/B testing for experiments involving normally distributed data.
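
In scipy, Welch's test is a flag away from the standard two-sample t-test. A sketch on made-up groups with unequal sizes:

    from scipy import stats

    a = [12.1, 13.4, 11.8, 12.9, 13.1, 12.5]              # hypothetical group A
    b = [14.2, 15.8, 13.1, 16.4, 14.9, 15.2, 13.8, 15.5]  # hypothetical group B

    # equal_var=False selects Welch's t-test instead of Student's
    t_stat, p = stats.ttest_ind(a, b, equal_var=False)
    print(f"t = {t_stat:.3f}, p = {p:.4f}")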

How do you deal with non-normal distributions in hypothesis testing?

You have several options for handling non-normal data. Many tests, including the one-sample z-test, t-test, and ANOVA, assume normality. You may still be able to run these tests if your sample size is large enough (usually over 20 items). You can also choose to transform the data with a function, forcing it to fit a normal model. However, if you have a very small sample, a sample that is skewed, or one that naturally fits another distribution type, you may want to run a non-parametric test: one that doesn't assume the data fit a specific distribution type. Non-parametric tests include the Wilcoxon signed rank test, the Mann-Whitney U test, and the Kruskal-Wallis test.

