Chapter 12: Data-Based and Statistical Reasoning

Assume that blonde hair and blue eyes are independent recessive traits. If one parent is a carrier for each gene while the other parent is homozygous recessive for both genes, what is the probability that the first two offspring will both have blonde hair and blue eyes?

6.25% Because one parent is homozygous for both traits, we are only concerned with the other parent. This parent has a 50% chance of transmitting each independent trait, and thus a 25% chance of transmitting both (1/2 x 1/2 = 1/4 ). This probability is the same for both pregnancies because they are independent events; thus, the probability that both children exhibit both traits is: 1/4 x 1/4 = 1/16 = 6.25%
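The arithmetic in this answer can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative and not part of the original card.

```python
from fractions import Fraction

# The carrier (heterozygous) parent passes on the recessive allele
# with probability 1/2 for each gene.
p_recessive_per_gene = Fraction(1, 2)

# The two genes are independent, so multiply to get both traits in one child.
p_child_both_traits = p_recessive_per_gene * p_recessive_per_gene

# The two pregnancies are independent events, so multiply again.
p_two_children = p_child_both_traits ** 2

print(p_child_both_traits)      # 1/4
print(p_two_children)           # 1/16
print(float(p_two_children))    # 0.0625, i.e. 6.25%
```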

What is the median of the following data set? 7,17,53,23,4,2,4

7
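As a quick check, the median can be found by sorting the values and taking the middle one. The sketch below assumes the usual rule that, for an odd number of values, the middle position is (n + 1)/2; it then compares the result with Python's built-in `statistics.median`.

```python
import statistics

data = [7, 17, 53, 23, 4, 2, 4]

# Step 1: list the values in increasing order.
ordered = sorted(data)                  # [2, 4, 4, 7, 17, 23, 53]

# Step 2: for an odd n, the median sits at position (n + 1) / 2 (1-indexed).
position = (len(ordered) + 1) // 2      # 4
median_value = ordered[position - 1]    # convert to a 0-indexed lookup

print(median_value)                     # 7
print(statistics.median(data))          # 7
```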

Exhaustive outcomes

A group of outcomes is said to be exhaustive if there are no other possible outcomes. For example, flipping heads or tails are said to be exhaustive outcomes of a coin flip; these are the only two possibilities.

Type II error

A type II error occurs when we incorrectly fail to reject the null hypothesis. In other words, a type II error is the likelihood that we report no difference between two populations when one actually exists. The probability of a type II error is sometimes symbolized by β.

Which of the following outliers would most likely be the easiest to correct?

A typographical error in data transfer Because the error is in data transfer, the original source of data can be consulted to allow for the inclusion of the correct data point. An error in instrument calibration may introduce bias; while this should not affect the standard deviation of a sample, it would certainly affect the mean. The instrument would have to be recalibrated, and the relevant data points would have to be measured again to correct for this type of outlier, eliminating (B). A skewed distribution is one that has a long tail. In this case, it may be more challenging to determine if a particular value is an outlier or simply a value in the long tail of the distribution. Repeated sampling or a large sample size is usually required to determine if a sample is truly skewed, eliminating (C). An anomalous result is challenging to interpret, and how to correct for the result may be unclear. In some cases, the result should be inflated or weighed more heavily to reflect its significance; in other cases, it should be interpreted as a regular value. In still other cases, it is appropriate to drop the anomalous result. This decision should ideally be made before the study even begins, but this still certainly requires more consideration than simply checking a result from one's original data set, eliminating (D).

How is the p-value calculated during a hypothesis test?

After the test statistic is calculated, a computer program or table is consulted to determine the p-value of the statistic.

Can data that do not follow a normal distribution be analyzed with measures of central tendency and measures of distribution? Why or why not?

Any distribution can be mathematically or procedurally transformed to follow a normal distribution by virtue of the central limit theorem, which is beyond the scope of the MCAT. Regardless, a distribution that is not normal may still be analyzed with these measures.

It is known that crickets increase their rate of chirping in a direct linear relationship with temperature until a maximum chirping rate is reached. Which of the following graphs best represents this relationship?

B (look at answer and choices) The question stem indicates that there is a linear relationship, so we know that we are looking for a straight line before a plateau. We also know that linear relationships are represented on linear plots. (B) matches both criteria because the axes show constant intervals. Constant ratios, as shown in (C) and (D), are seen in semilog plots like these, as well as log-log plots.

Bar Charts

Bar charts are used for categorical data, which sort data points based on predetermined categories. The bars may then be sorted by increasing or decreasing bar length. The length of a bar is generally proportional to the value it represents. Wherever possible, breaks should be avoided in the chart because of the potential to distort scale. To that end, be wary of graphs that contain breaks; they may be enlarging the difference between bars.

What is the difference between normal or skewed distributions and bimodal distributions?

Bimodal distributions have two peaks, whereas normal or skewed distributions have only one.

Confidence Intervals

Confidence intervals are essentially the reverse of hypothesis testing. With a confidence interval, we determine a range of values from the sample mean and standard deviation. Rather than finding a p-value, we begin with a desired confidence level (95% is standard) and use a table to find its corresponding z- or t-score. When we multiply the z- or t-score by the standard deviation, and then add and subtract this number from the mean, we create a range of values. For example, consider a population for which we wish to know the mean age. We draw a sample from that population and find that the mean of the sample is 30, with a standard deviation of 3. If we wish to have 95% confidence, the corresponding z-score (which would be provided on Test Day) is 1.96. Thus, the range is 30 − (3)(1.96) to 30 + (3)(1.96), or 24.12 to 35.88. We can then report that we are 95% confident that the true mean age of the population from which this sample is drawn is between 24.12 and 35.88.
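The worked example can be reproduced in a few lines. This sketch follows the card's simplified calculation exactly (z-score times the standard deviation); many statistics texts first divide the standard deviation by the square root of the sample size, a refinement that is outside this card. Variable names are illustrative.

```python
sample_mean = 30    # mean age of the sample
sample_sd = 3       # standard deviation of the sample
z_95 = 1.96         # z-score for 95% confidence (provided on Test Day)

# Multiply the z-score by the standard deviation, then add and subtract
# the result from the mean to get the two ends of the range.
margin = z_95 * sample_sd
lower = sample_mean - margin
upper = sample_mean + margin

print(round(lower, 2), round(upper, 2))   # 24.12 35.88
```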

Causation

Correlation does not necessarily imply causation; we must avoid this assumption when there is insufficient evidence to draw such a conclusion.

Skewed Distributions

Distributions are not always symmetrical. A skewed distribution is one that contains a tail on one side or the other of the data set. On the MCAT, skewed distributions are most often tested by simply identifying their type. This is often an area of confusion for students because the visual shift in the data appears opposite the direction of the skew. A negatively skewed distribution has a tail on the left (or negative) side, whereas a positively skewed distribution has a tail on the right (or positive) side. Because the mean is more susceptible to outliers than the median, the mean of a negatively skewed distribution will be lower than the median, while the mean of a positively skewed distribution will be higher than the median.

How do exponential and parabolic curves differ in shape?

Exponential and parabolic curves both have a steep component; however, exponential curves have horizontal asymptotes and become flat on one side while parabolic curves are symmetrical and have steep components on both sides of a center point.

True or False: Statistical significance is sufficient criteria to enact policy change.

False. As discussed in the last chapter, there must be practical (clinical), as well as statistical significance for a conclusion to be useful.

Confidence

Finally, the probability of correctly failing to reject a true null hypothesis (reporting no difference between two populations when one does not exist) is referred to as confidence.

Independent Events Calculation

For independent events, the probability of two or more events occurring at the same time is the product of their probabilities alone. For example, the probability of getting heads on a coin flip twice in a row is the same as the probability of getting heads the first time times the probability of getting heads the second time, or 0.5 × 0.5 = 0.25.

Test statistic and p-value

From the data collected, a test statistic is calculated and compared to a table to determine the likelihood that that statistic was obtained by random chance (under the assumption that our null hypothesis is true). This is our p-value.

Histograms

Histograms present numerical data rather than discrete categories. Histograms are particularly useful for determining the mode of a data set because they are used to display the distribution of a data set.

How do hypothesis tests and confidence intervals differ?

Hypothesis tests are used to validate or invalidate a claim that two populations are different, or that one population differs from a given parameter. In a hypothesis test, we calculate a p-value and compare it to a chosen significance level (α) to conclude if an observed difference between two populations (or between a population and the parameter) is significant or not. Confidence intervals are used to determine a potential range of values for the true mean of a population.

A hypothesis test was correctly conducted and the experimenter failed to reject the null hypothesis. Which of the following must be true? I. The p-value was greater than α. II. A type I error did not occur. III. The power of the study was too small.

I and II only A type I error occurs when the null hypothesis is incorrectly rejected. Because we failed to reject the null hypothesis, this could not have occurred. Statement I is correct. If we failed to reject the null hypothesis, then the p-value must be greater than the significance level. Statement III is incorrect because we lack information about power in the question stem. In addition, a study could be extremely well-powered and still fail to reject the null hypothesis if no difference truly exists between two populations.

The following histogram: I. contains a bimodal distribution. II. should be analyzed as two separate distributions. III. contains one mode.

I, II, and III Because the histogram contains two peaks with a valley in between, it is a bimodal distribution. The color separation of two distinct populations provides evidence that there is a qualitative difference in the data between the two peaks; thus, the data should be analyzed according to gender. There is indeed only one mode, at 5'6". This is the measurement with the largest number of corresponding data points.

If the p-value is greater than α in a given statistical test, what is the outcome of the test?

If the p-value is greater than α, then we fail to reject the null hypothesis.

Log-Log Graph

In some cases, both axes can be given a different axis ratio to create a linear plot. When both axes use a constant ratio from point to point on the axis, this is termed a log-log graph. Note that the difference between these three plot types (linear, semilog, and log-log) is based on the labeling of the axes. Therefore, it is crucial to pay attention to the axes on Test Day to be able to interpret a graph correctly.

Normal distribution

In statistics, we most often work with a normal distribution. Even when we know that this is not quite the case, we can use special techniques so that our data will approximate a normal distribution. This is very important because the normal distribution has been solved, in the sense that we can transform any normal distribution to a standard distribution with a mean of zero and a standard deviation of one, and then use the newly generated curve to get information about probability or percentages of populations. The normal distribution is also the basis for the bell curve seen in many scenarios, including exam scores on the MCAT. The mean, median, and mode are at the center of the distribution. Approximately 68% of the distribution is within one standard deviation of the mean, 95% within two, and 99% within three.

Independence

Independence is a condition of events wherein the outcome of one event has no effect on the outcome of the other.

Interquartile Range (IQR)

Interquartile range is related to the median, first, and third quartiles. Quartiles, including the median (Q2), divide data (when placed in ascending order) into groups that comprise one-fourth of the entire set.

Linear graphs

Linear graphs show the relationships between two variables. They generally involve two direct measurements and, strictly speaking, do not have to be a straight line. The shape of the curve on this type of graph may be linear, parabolic, exponential, or logarithmic. a) linear b) parabolic c) exponential d) logarithmic The axes of a linear graph will be consistent in the sense that each unit will occupy the same amount of space (the distance from 1 to 2 to 3 to 4 on each axis remains the same size). As with bar graphs, be wary of scale and breaks in axes.

What type of data relationship is least likely to require transformation into a semilog or log-log plot?

Linear relationships can be analyzed without any data or axis transformation into semilog or log-log plots.

Measures of central tendency

Measures of central tendency are those that describe the middle of a sample. How we define middle can vary. Is it the mathematical average of the numbers in the data set? Is it the result in a data set that divides the set into two, with half the sample values above this result and half the sample values below? Both of these data can be important, and the difference between them can also provide useful information on the shape of a distribution.

mutual exclusivity

Mutual exclusivity is a condition wherein two outcomes cannot occur simultaneously.

Outliers

Outliers typically result from one of three causes: (1) a true statistical anomaly (e.g., a person who is over seven feet tall); (2) a measurement error (for example, reading the centimeter side of a tape measure instead of inches); (3) a distribution that is not approximated by the normal distribution (e.g., a skewed distribution with a long tail). When an outlier is found, it should trigger an investigation to determine which of these three causes applies. If there is a measurement error, the data point should be excluded from analysis. However, the other two situations are less clear. If an outlier is the result of a true measurement but is not representative of the population, it may be weighted to reflect its rarity, included normally, or excluded from the analysis, depending on the purpose of the study and preselected protocols. The decision should be made before a study begins, not once an outlier has been found. When outliers are an indication that a data set may not approximate the normal distribution, repeated samples or larger samples will generally demonstrate if this is true.

Probability of two independent events co-occurring

P(A and B) = P(A) × P(B)

Pie Charts/Circle Charts

Pie or circle charts are used to represent relative amounts of entities and are especially popular in demographics. They may be labeled with raw numerical values or with percent values. The primary downside to pie charts is that as the number of represented categories increases, the visual representation loses impact and becomes confusing.

Table pros and cons

Pros: Categorical data can be presented without comparison. Does not require estimation for calculations. Cons: Disorganized or unrelated data may be presented together.

Pie Chart Pros and Cons

Pros: Easily constructed; useful for categorical data with a small number of categories. Cons: Easily overwhelmed with multiple categories. Difficult to estimate values with circles.

Box plot pros and cons

Pros: Information-dense; can be useful for comparison. Cons: May not highlight outliers or mean value of a data set. Only useful for numerical data.

Bar Graph Pros and Cons

Pros: Multiple organization strategies. Good for large categorical data sets. Cons: Axes are often misleading because of sizeable breaks.

Graph pros and cons

Pros: Provide information about relationships. Useful for estimation. Cons: Axis labels and logarithmic scales require careful interpretation.

Map pros and cons

Pros: Provide relevant and integrated geographic and demographic information. Cons: May only be used to represent at most two variables coherently.

Which of the following measures of distribution is most useful for determining probabilities?

SD Standard deviation is the most common measure of distribution. It is the most closely linked to the mean of a distribution and can be used to calculate p-values, which are probabilities (specifically, p-values are the probability that an observed difference between two populations is due to chance).

Semilog Graphs

Semilog graphs are a specialized representation of a logarithmic data set. They can be easier to interpret because the otherwise curved nature of the logarithmic data is made linear by a change in the axis ratio. In semilog graphs, one axis (usually the x-axis) maintains the traditional unit spacing. The other axis assigns spacing based on a ratio, usually 10, 100, 1000, and so on. The multiples may be of any number as long as there is consistency in the ratio from one point on the axis to the next.

Assume the likelihood of having a male child is equal to the likelihood of having a female child. In a series of ten live births, the probability of having at least one boy is equal to:

Simplify this question by rewording it as the probability of not having all girls. Having at least one boy and having all girls are mutually exclusive events, and no other possibilities can occur. Thus, the probability of having all girls is (0.5)^10 and the probability of having at least one boy is 1 - (0.5)^10, or 99.90%.
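The complement calculation above can be verified directly. This is a small sketch under the card's stated assumption that boys and girls are equally likely; the variable names are illustrative.

```python
p_girl = 0.5    # assumed equal likelihood of a boy or a girl
births = 10

# "All girls" and "at least one boy" are mutually exclusive and exhaustive,
# so their probabilities must add up to 1.
p_all_girls = p_girl ** births
p_at_least_one_boy = 1 - p_all_girls

print(p_all_girls)                    # 0.0009765625 (1/1024)
print(f"{p_at_least_one_boy:.2%}")    # 99.90%
```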

In a sample of hospital patients, the mean age is found to be significantly lower than the median. Which of the following best describes this distribution?

Skewed left The mean is to the left of the median, which implies that the tail of the distribution is on the left side; therefore, this distribution is skewed left. It would be expected that there would be a low plateau on the left side of the distribution, which accounts for the shift in the mean.

Bimodal Distributions

Some distributions have two or more peaks. A distribution containing two peaks with a valley in between is called bimodal, as shown in Figure. It is important to note that a bimodal distribution, strictly speaking, might have only one mode if one peak is slightly higher than the other. However, even when the peaks are of two different sizes, we still call the distribution bimodal. If there is sufficient separation of the two peaks, or a sufficiently small amount of data within the valley region, bimodal distributions can often be analyzed as two separate distributions. On the other hand, bimodal distributions do not have to be analyzed as two separate distributions either; the same measures of central tendency and measures of distribution can be applied to them as well.

Standard Deviation

Standard deviation is the most informative measure of distribution, but it is also the most mathematically laborious. It is calculated relative to the mean of the data. Standard deviation is calculated by taking the difference between each data point and the mean, squaring this value, dividing the sum of all of these squared values by the number of points in the data set minus one, and then taking the square root of the result.
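The steps in this definition translate directly into code. The sketch below uses a hypothetical helper name, `sample_standard_deviation`, and borrows the data set 1, 2, 3, 9, 10 that appears on another card in this set (where σ is given as 4.18).

```python
import math

def sample_standard_deviation(values):
    """Standard deviation computed step by step, dividing by n - 1."""
    n = len(values)
    mean = sum(values) / n
    # Difference between each data point and the mean, squared.
    squared_diffs = [(x - mean) ** 2 for x in values]
    # Sum of squares divided by (n - 1), then the square root.
    return math.sqrt(sum(squared_diffs) / (n - 1))

example = [1, 2, 3, 9, 10]
sd = sample_standard_deviation(example)
print(round(sd, 2))   # 4.18
```

For this data set the mean is 5, the squared differences are 16, 9, 4, 16, and 25 (sum 70), and 70 / 4 = 17.5, whose square root is about 4.18.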

Why would the average difference from the mean be an inappropriate measure of distribution?

The average difference from the mean will always be zero, because the positive and negative differences cancel out. This is why, in calculations of standard deviation, we always square the difference from the mean and then take the square root at the end: it forces all of the values to be positive numbers, which will not cancel out to zero.

Based on the county-level map below, which of the following statements best represents the data about elderly individuals? (Note: The darker the shade of green, the higher the percentage of elderly persons in the county.)

The center of the United States tends to have a larger proportion of elderly people. With data about percentages, we can only draw conclusions about percentages. Thus any information about number of people, as in (A), is incorrect. This map shows us that a higher percentage of the residents in the middle of the country are elderly in comparison to other parts of the country. There are, of course, exceptions to this rule, including Florida, the Pacific Coast, and parts of Appalachia, which are all in the top category. Even so, there appears to be a clustering of counties with a high percentage of elderly individuals in the middle of the country. We also cannot say that most of the population is elderly in any place on this map because we are not given actual values for the percentages. There may be a plurality, but there is insufficient information to posit a majority, eliminating (B). The map gives no indication of migration patterns, so we can also eliminate (D).

Interquartile Range

The interquartile range is then calculated by subtracting the value of the first quartile from the value of the third quartile: IQR = Q3 − Q1. The interquartile range can be used to determine outliers. Any value that falls more than 1.5 interquartile ranges below the first quartile or above the third quartile is considered an outlier.

What types of data sets are best analyzed using the mean as a measure of central tendency?

The mean is the best measure of central tendency for a data set with a relatively normal distribution. The mean performs poorly in data sets with outliers.

How do the mean, median, and mode compare for a right-skewed distribution?

The mean of a right (positively) skewed distribution is to the right of the median, which is to the right of the mode.

Median (median position equation)

The median value for a set of data is its midpoint, where half of the data points are greater than the value and half are smaller. In data sets with an odd number of values, the median will actually be one of the data points. In data sets with an even number of values, the median will be the mean of the two central data points. To calculate the median, a data set must first be listed in increasing fashion. Median position = (n + 1)/2, where n = number of data values.

Mode

The mode, quite simply, is the number that appears the most often in a set of data. There may be multiple modes in a data set, or, if all numbers appear equally, there can even be no mode for a data set. When we examine distributions, the peaks represent modes. The mode is not typically used as a measure of central tendency for a set of data, but the number of modes and their distance from one another is often informative. If a data set has two modes with a small number of values between them, it may be useful to analyze these portions separately or to look for other variables that may be responsible for dividing the distribution into two parts.

Z and T-tests

The most common hypothesis tests are z- or t-tests, which rely on the standard distribution or the closely related t-distribution.

Probability of at least one of two independent events occurring

The probability of at least one of two events occurring is equal to the sum of their initial probabilities, minus the probability that they will both occur.
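As a short illustration, the rule can be written as a function. The helper name `p_either` is hypothetical, and the sketch assumes the two events are independent, so that the probability of both occurring is the product of the individual probabilities.

```python
def p_either(p_a, p_b):
    """P(A or B) = P(A) + P(B) - P(A and B), for independent A and B."""
    p_both = p_a * p_b          # independence: P(A and B) = P(A) x P(B)
    return p_a + p_b - p_both

# Probability of at least one heads in two fair coin flips.
print(p_either(0.5, 0.5))       # 0.75
```

Subtracting the overlap prevents the outcome in which both events occur from being counted twice.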

Power

The probability of correctly rejecting a false null hypothesis (reporting a difference between two populations when one actually exists) is referred to as power, and is equal to 1 − β.

Using SD to determine outlier

The standard deviation can also be used to determine whether a data point is an outlier. If the data point falls more than three standard deviations from the mean, it is considered an outlier.

SD and normal distributions

The standard deviation relates to the normal distribution as well: approximately 68% of values fall within one SD of the mean, 95% within two SD, and 99% within three SD.
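These percentages can be checked against the standard normal curve using Python's standard library. This is a minimal sketch; `fraction_within` is a hypothetical helper name. The computed values are slightly more precise than the rounded figures on the card.

```python
from statistics import NormalDist

# Standard normal distribution: mean of zero, standard deviation of one.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(fraction_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```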

A new medication for heart failure is being developed and has had a statistically significant effect on contractility in clinical trials. Which of the following would NOT likely cause the drug to be held back from common use?

The study had low power to detect a difference. If a study has low power, it is more difficult to get results that are statistically significant. Therefore, if the results are still statistically significant even with low power, then there is likely a large effect size that makes the effect clinically significant. If the value of α used in the study was 0.5, then statistically significant results do not mean much— traditionally, α = 0.05 or a smaller probability is used, eliminating (A). Concerns about toxicity should always limit the use of a drug, eliminating (B). A statistically significant result is only of interest if it also represents a clinically significant improvement, eliminating (C).

Type I error

The value of α is the level of risk that we are willing to accept for incorrectly rejecting the null hypothesis. This is also called a type I error. In other words, a type I error is the likelihood that we report a difference between two populations when one does not actually exist.

Calculating Quartiles

Q1: To calculate the position of the first quartile (Q1) in a set of data sorted in ascending order, multiply n by 1/4. If this is a whole number, the quartile is the mean of the value at this position and the value at the next highest position. If this is a decimal, round up to the next whole number and take that as the quartile position. Example: with 8 numbers, 8 × 1/4 = 2, so Q1 is the mean of the 2nd and 3rd numbers. Q3: To calculate the position of the third quartile (Q3), multiply n by 3/4. Again, if this is a whole number, take the mean of the value at this position and the next. If it is a decimal, round up to the next whole number and take that as the quartile position. Example: with 8 numbers, 8 × 3/4 = 6, so Q3 is the mean of the 6th and 7th numbers. (n = number of values)
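The position rule on this card can be sketched as a small function. The helper name `quartile` and the eight-value data set are hypothetical, chosen to match the card's example of n = 8. Note that this follows the card's method specifically; statistics libraries often use different interpolation rules and may return slightly different quartiles.

```python
import math

def quartile(sorted_values, fraction):
    """Quartile by the card's position rule; fraction is 0.25 for Q1, 0.75 for Q3."""
    n = len(sorted_values)
    position = n * fraction                 # 1-indexed position
    if position == int(position):
        k = int(position)
        # Whole number: mean of the value at this position and the next one.
        return (sorted_values[k - 1] + sorted_values[k]) / 2
    # Decimal: round up and take the value at that position.
    k = math.ceil(position)
    return sorted_values[k - 1]

data = sorted([1, 2, 3, 4, 5, 6, 7, 8])     # hypothetical data, n = 8
q1 = quartile(data, 0.25)                   # mean of 2nd and 3rd values
q3 = quartile(data, 0.75)                   # mean of 6th and 7th values
print(q1, q3, q3 - q1)                      # 2.5 6.5 4.0
```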

True or False: Power is the probability of correctly rejecting the null hypothesis.

True. Power is the probability that the individual rejects the null hypothesis when the alternative hypothesis is true for the population.

True or False: Two variables that are causally related will also be correlated with each other.

True. While two variables that are correlated are not necessarily causally related, all variables that are causally related must be correlated in some way (direct relationship, inverse relationship, or otherwise).

Interpreting Tables

Unlike with graphs, you should only take a brief moment to glance at the title of a table before approaching Test Day questions. Tables are more likely to contain disjointed information than either charts or graphs because they often contain categorical data or experimental results. Tables that do not have unusual data values (zeroes, outliers, changes in a trend, and so on) should be approached especially briefly. When a table does contain significant organization (for example, listing results progressively), this structure is likely to be relevant while answering questions. For example, a trend that suddenly appears or disappears will often require an explanation.

significance level (alpha)

We then compare our p-value to a significance level (α); 0.05 is commonly used. If the p-value is greater than α, then we fail to reject the null hypothesis, which means that there is not a statistically significant difference between the two populations. If the p-value is less than α, then we reject the null hypothesis and state that there is a statistically significant difference between the two groups. Again, when the null hypothesis is rejected, we state that our results are statistically significant.
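The decision rule can be written as a tiny function. The helper name `decide` is hypothetical. The card does not say what happens when the p-value exactly equals α; treating that case as a failure to reject is an assumption made here.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level, as described on this card."""
    if p_value < alpha:
        return "reject the null hypothesis (statistically significant)"
    # Assumption: p_value == alpha is grouped with "fail to reject".
    return "fail to reject the null hypothesis"

print(decide(0.03))   # reject the null hypothesis (statistically significant)
print(decide(0.20))   # fail to reject the null hypothesis
```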

Exhaustiveness

When a set of outcomes is exhaustive, there are no other possible outcomes.

Slope (m)

Where both the shape of the graph and the graph type are linear, we should be able to calculate the slope of the line. Slope (m) is the change in the y-direction divided by the change in the x-direction for any two points: m = Δy/Δx = (y2 − y1)/(x2 − x1)
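A short sketch of the slope calculation, using a hypothetical helper `slope` and two made-up points:

```python
def slope(point_1, point_2):
    """Change in y divided by change in x between two (x, y) points."""
    (x1, y1), (x2, y2) = point_1, point_2
    return (y2 - y1) / (x2 - x1)

print(slope((1, 2), (3, 8)))   # 3.0
```

Because the relationship is linear, any two points on the line give the same result. A vertical line (x1 equal to x2) has no defined slope, and this sketch would raise a ZeroDivisionError in that case.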

How do range and standard deviation generally relate to one another mathematically? Is this relationship accurate for the data set used earlier in this section (1, 2, 3, 9, 10; σ = 4.18)?

Where the data are not available, the range can be approximated as four times the standard deviation. For this data set, the relationship fails. The range is 9, which is only a little more than twice the standard deviation. This is because the data set does not fall in a normal distribution.
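The comparison in this answer can be checked numerically. This sketch uses `statistics.stdev`, which divides by n − 1; the variable names are illustrative.

```python
import statistics

data = [1, 2, 3, 9, 10]

data_range = max(data) - min(data)   # largest value minus smallest value
sd = statistics.stdev(data)          # sample standard deviation
ratio = data_range / sd

print(data_range, round(sd, 2), round(ratio, 2))   # 9 4.18 2.15
```

The range is about 2.15 standard deviations here, well short of the factor of four used in the approximation.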

Are there any outliers on the following box plot?

Yes; both 1575 and 2600 are outliers. Outliers can be determined with respect to the interquartile range, Q3 − Q1. The interquartile range for this box plot is 2280 − 2075, or 205. Values that are 1.5 × IQR below Q1 or above Q3 are considered outliers. 2075 − 1.5 × 205 is approximately 2075 − 300, or 1775 (actual = 1767.5). Therefore, 1575 is an outlier. 2280 +1.5 × 205 is approximately 2580 (actual = 2587.5). Therefore, 2600 is also an outlier.
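The exact cutoffs quoted in this answer can be reproduced as follows. This is a minimal sketch using the quartile values from the answer; the names `lower_fence` and `upper_fence` are illustrative labels for the two cutoffs.

```python
q1, q3 = 2075, 2280
iqr = q3 - q1                        # 205

# Values more than 1.5 x IQR below Q1 or above Q3 count as outliers.
lower_fence = q1 - 1.5 * iqr         # 1767.5
upper_fence = q3 + 1.5 * iqr         # 2587.5

candidates = [1575, 2600]
outliers = [x for x in candidates if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence, outliers)   # 1767.5 2587.5 [1575, 2600]
```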

The following titration curve is an example of:

a sigmoidal relationship on a linear graph. The first term in the answer choices describes the shape of the curve. While we did not discuss sigmoidal curves in this chapter specifically, they do show up in other places in science—in particular, for enzymes, cooperative binding, and titrations. Sigmoidal curves are S-shaped. The second term refers to the type of plot. Because the axes have the same scale throughout, this is a linear graph. Note that even though the y-axis represents logarithmic changes in H+ concentration (pH = −log [H+]), the actual unit that is used is pH points, which increase linearly in this graph.

Null hypothesis

always a hypothesis of equivalence. In other words, the null hypothesis says that two populations are equal, or that a single population can be described by a parameter equal to a given value

As the confidence level increases, a confidence interval:

becomes wider. To increase the confidence level, one must increase the size of the confidence interval to make it more likely that the true value of the mean is within the range. Therefore, the confidence interval must become wider. 99% confidence is wider than 95% confidence

Hypothesis Testing

begins with an idea about what may be different between two populations

Mutually exclusive outcomes

cannot occur at the same time. One cannot flip both heads and tails in one throw, or be both ten and twenty years old. The probability of two mutually exclusive outcomes occurring together is 0%.

Correlation Coefficient

correlation relationships can be quantified with a correlation coefficient, a number between -1 and +1 that represents the strength of the relationship. A correlation coefficient of +1 indicates a strong positive relationship, a value of -1 indicates a strong negative relationship, and a value of zero indicates no apparent relationship.
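The card does not say which coefficient is meant. As an assumption, the sketch below computes Pearson's r, one common choice, for two made-up data sets to show the +1 and −1 extremes. The helper name `pearson_r` is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2, 4, 6, 8, 10]))   # 1.0  (variables rise together)
print(pearson_r(xs, [10, 8, 6, 4, 2]))   # -1.0 (one rises as the other falls)
```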

Dependent Events

do have an impact on one another, such that the order changes the probability

Compare the method of determining outliers from the interquartile range and from the SD

from interquartile range: Outliers can be defined as data points more than 1.5 × IQR below Q1 or above Q3. from standard deviation: defined as data points more than 3σ above or below the mean. The cutoff values calculated through the two methods are likely to be different, and the selection of one method over the other is one of preference and study design. In general, the use of the standard deviation method is superior.

Independent Events

have no effect on one another. If you roll a die and get a 3, and then pick it up and roll it again, the probability of getting a 3 on the second roll will be the same. Independent events can occur in any order without impacting one another.

Mean (average) -outlier

is calculated by adding up all of the individual values within the data set and dividing the result by the number of values. The mean may be a parameter or a statistic (as is true of all of the measures of central tendency), depending on whether we are discussing a population or a sample. Mean values are a good indicator of central tendency when all of the values tend to be fairly close to one another. Having an outlier (an extremely large or extremely small value compared to the other data values) can shift the mean toward one end of the range. For example, the average income in the United States is about … but half of the population makes less than … In this case, the small number of extremely high-income individuals in the distribution shifts the mean to the high end of the range.

Alternative hypothesis

may be nondirectional (that the populations are not equal) or directional (for example, that the mean of population A is greater than the mean of population B).

The direction of skew in a sample is determined by its tail not the bulk of the distribution

negative skew = tail on lower end; positive skew = tail on higher end

Results of hypothesis testing

pg before 12.5 concept review

Range

range = x_max − x_min. The range of a data set is the difference between its largest and smallest values. Range does not consider the number of items in the data set, nor does it consider the placement of any measures of central tendency. Range is therefore heavily affected by the presence of data outliers. In cases where it is not possible to calculate the standard deviation for a normal distribution because the entire data set is not provided, it is possible to approximate the standard deviation as one-fourth of the range.

Correlation

refers to a connection— direct relationship, inverse relationship, or otherwise—between data. If two variables trend together, that is as one increases so does the other, there is a positive correlation. If two variables trend in opposite directions (one increases as the other decreases) there is a negative correlation.

Box Plot (box and whisker)

used to show the range, median, quartiles, and outliers for a set of data. The box of a box-and-whisker plot is bounded by Q1 and Q3; Q2 (the median) is the line in the middle of the box. The ends of the whiskers correspond to maximum and minimum values of the data set. Alternatively, outliers can be presented as individual points, with the ends of the whiskers corresponding to the largest and smallest values in the data set that are still within 1.5 × IQR of the box (that is, no more than 1.5 × IQR below Q1 or above Q3). Box-and-whisker plots are especially useful for comparing data because they contain a large amount of data in a small amount of space, and multiple plots can be oriented on a single axis.

A 95% confidence interval will fall within what distance from the mean?

±2σ Approximately 95% of values fall within two standard deviations (±2σ) of the mean for a normal distribution. A confidence interval is constructed using the same values. Approximately 68% of the values are within one standard deviation, and 99% are within three standard deviations, eliminating the other answer choices.

Which of the following values corresponds to the probability of a type I error?

α Type I error is the probability of mistakenly rejecting the null hypothesis. We set the type I error level by selecting a significance level (α).

