Biostats Quizzes-final

"experimental unit" & "observational unit"

"experimental unit" is randomized to the treatment regimen and receives the treatment directly. "observational unit" has measurements taken on it. In most clinical trials, the experimental units and the observational units are one and the same, namely, the individual patient

Which one of the following statements about variance is correct? (i) The variance of a given parameter is expressed in squared units of that parameter. (ii) The variance of a given parameter is expressed in the same unit as the parameter. (iii) The variance of a given parameter is unitless. (iv) The variance of a given parameter is expressed in the root-squared unit of that parameter. a. iv b. i c. iii d. ii

(i) The variance of a given parameter is expressed in squared units of that parameter.

parallel design

A parallel design refers to a study in which patients are randomized to a treatment and remain on that treatment throughout the course of the trial. This is a typical design. In contrast, with a crossover design patients are randomized to a sequence of treatments and they cross over from one treatment to another during the course of the trial. Each treatment occurs in a time period with a washout period in between. Crossover designs are of interest because, with each patient serving as their own control, there is potential for reduced variability. However, there are potential problems with this type of design. There should be investigation into possible carryover effects, i.e., the residual effects of the previous treatment affecting the subject's response in the later treatment period. In addition, only conditions that are likely to be similar in both treatment periods are amenable to crossover designs. Acute health problems that do not recur are not well-suited for a crossover study. We will study crossover design in a later lesson.

By noting the mean of a dataset, some information about the distribution of the data points can be known. True False

False Response Feedback: Read (section titled "Basic concepts of descriptive statistics"): Measures of central tendency have their applicability. Table 2 shows the indication for the application of each measure. Taking two sets of random values, the first being 88, 89, 90, 91 and 92 and the second 30, 70, 90, 120 and 140, both sets have a mean of 90. By noting only the mean, one does not perceive the information about the rest of the values and, therefore, it is necessary to use dispersion measures to realize that the data in the two groups are not equal. https://www.sciencedirect.com/science/article/pii/S0104001417300167
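The two sets quoted in the feedback can be checked directly. The sketch below is only an illustration; the variable names are made up here, and the population standard deviation (`pstdev`) is an assumed choice of dispersion measure, since the feedback does not name one.

```python
from statistics import mean, pstdev

# The two sets of values quoted in the response feedback.
tight = [88, 89, 90, 91, 92]
wide = [30, 70, 90, 120, 140]

mean_tight, mean_wide = mean(tight), mean(wide)

# Assumption: population standard deviation as the dispersion measure.
sd_tight, sd_wide = pstdev(tight), pstdev(wide)

print(f"means: {mean_tight} vs {mean_wide}")
print(f"standard deviations: {sd_tight:.2f} vs {sd_wide:.2f}")
```

Both means come out equal, while the second set has a much larger standard deviation, which is why the mean alone says little about how the values are distributed.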

factors

In experimental design terminology, factors are variables that are controlled and varied during the course of the experiment. For example, treatment is a factor in a clinical trial with experimental units randomized to treatment. Another example is pressure and temperature as factors in a chemical experiment.

one-way designs

Most clinical trials are structured as one-way designs, i.e., only one factor, treatment, with a few levels.

community intervention trial

One exception to this is a community intervention trial in which communities, e.g., geographic regions, are randomized to treatments. For example, communities (experimental units) might be randomized to receive different formulations of a vaccine, whereas the effects are measured directly on the subjects (observational units) within the communities. The advantages here are strictly logistical - it is simply easier to implement in this fashion. Another example occurs in reproductive toxicology experiments in which female rodents are exposed to a treatment (experimental units) but measurements are taken on the pups (observational units).

two-way design & An incomplete factorial design

Temperature and pressure in the chemical experiment are two factors that comprise a two-way design in which it is of interest to examine various combinations of temperature and pressure. Some clinical trials may have a two-way factorial design, such as in oncology where various combinations of doses of two chemotherapeutic agents comprise the treatments. An incomplete factorial design may be useful if it is inappropriate to assign subjects to some of the possible treatment combinations, such as no treatment (double placebo). We will study factorial designs in a later lesson.

arithmetic mean

The finite population mean of X1, X2, ..., XN is defined as the sum of the values Xi divided by the population size N. Typically, in a non-survey setting, the population mean is estimated by taking a simple random sample of the finite population, x1, x2, ..., xn, summing the values and dividing by the sample size n. This is often referred to as the arithmetic mean.

For a large survey, a weighted arithmetic mean instead of an arithmetic mean should be used to estimate the population mean because each surveyed individual could represent a different number of people in the population, depending on the sampling scheme. True False

True

In a nominal scale, it serves no statistical purpose to order the variables in a particular way. True False

True

Consider the following scenario: If σ = 1.2 and μ = 11.7 for variable X with normal distribution, what is the approximate value for P(X > 12.5)? a. 0.25 b. 0.75 c. 0.68 d. 0.94 e. -0.75

a. 0.25 Response Feedback: Read (Normal Applications, section titled "Example: Male Foot Length"): Comments: By solving the above example, we inadvertently discovered the quartiles of a normal distribution! P(Z < -0.67) = 0.2514 tells us that roughly 25%, or one quarter, of a normal variable's values are less than 0.67 standard deviations below the mean. P(Z < +0.67) = 0.7486 tells us that roughly 75%, or three quarters, are less than 0.67 standard deviations above the mean. And of course the median is equal to the mean: since the distribution is symmetric, the median is 0 standard deviations away from the mean. http://bolt.mph.ufl.edu/6050-6052/unit-3b/normal-random-variables/applications/
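As a rough check of this answer, the tail probability can be computed with Python's standard library instead of a printed table. This is a sketch; the variable names are illustrative and not part of the quiz.

```python
from statistics import NormalDist

# Values given in the question.
mu, sigma = 11.7, 1.2
x = 12.5

# Standardize: how many standard deviations x lies above the mean.
z = (x - mu) / sigma

# Upper-tail probability P(X > x) for a normal variable.
p_upper = 1 - NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.2f}, P(X > {x}) = {p_upper:.4f}")
```

The z-score is about 0.67, so the result is close to the 0.2514 table value quoted in the feedback, i.e. approximately 0.25.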

A friend of yours scored in the 86th percentile on his IQ test. What does this mean? a. 86% of all scores are similar to or smaller than his score b. 14% of all scores are similar to or smaller than his score. c. 86% of all scores are equal to his score. d. He must have scored 86% on his test.

a. 86% of all scores are similar to or smaller than his score

Which of the following aspects is NOT part of the basics of statistics? a. Propagation of errors b. Confidence intervals c. Hypothesis testing d. Sampling distributions

a. Propagation of errors

To which step of a research project does determining the sample size belong? a. Study design b. Data collection c. Interpretation d. Data analysis

a. Study design

Fill in the blank with the CORRECT option: An event with a probability of 1 is _________________. a. certain to happen b. unlikely to happen with 1% chance c. likely to happen d. certain not to happen

a. certain to happen

Consider the following scenario: A researcher wants to investigate whether or not there is a difference in the amount of cavities seen by x-ray after brushing one year with toothpaste A versus toothpaste B. Which one(s) of the following statements is true? (i) The response variable is the number of cavities observed. (ii) The explanatory variable is the type of toothpaste used (either A or B). a. i and ii b. ii only c. i only d. neither i nor ii are correct options

a. i and ii Response Feedback: Review the examples in lesson "1.1.2 - Explanatory & Response Variables". https://newonlinecourses.science.psu.edu/stat200/lesson/1/1.1/1.1.2/

Consider the following scenario: A researcher wants to show that people living in the northern hemisphere weigh more than those living in the southern hemisphere. Which one(s) of the following statements is correct? (i) The null hypothesis is: The people in the southern hemisphere weigh less than those in the northern hemisphere. (ii) The alternative hypothesis is one-sided. (iii) The explanatory variable is the region where they live (northern or southern hemisphere). a. ii and iii only b. i, ii and iii c. ii only d. i and ii only e. i only

a. ii and iii only Response Feedback: See Examples 11.2 to 11.8 in lesson "11.2 Setting the Hypotheses: Examples". https://newonlinecourses.science.psu.edu/stat100/node/63/

Fill in the blank with the correct option: In a positively skewed distribution the mean has a value that is ___________________. a. larger than the median b. equal to the median c. less than the mode d. less than the median

a. larger than the median Response Feedback: Read (section 4.2.3, page 68): For symmetric distributions, the mean and the median coincide. For unimodal skewed (asymmetric) distributions, the mean is farther in the direction of the "pulled out tail" of the distribution than the median is. Read (section 4.3.3, page 82): In a skewed distribution we expect to see the median pushed in the direction of the shorter whisker. If the longer whisker is the top one, then the distribution is positively skewed (or skewed to the right, because higher values are on the right in a histogram). If the lower whisker is longer, the distribution is negatively skewed (or left skewed). http://www.stat.cmu.edu/~hseltman/309/Book/chapter4.pdf

Consider the following scenario: You need to plot a histogram, and the value with the most decimal places is 7.893; the smallest value in the set is 0.4. Which starting point would you optimally choose for your intervals? a. 0.395 b. 0.3995 c. 0.4 d. 0.39

b. 0.3995
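One way to arrive at 0.3995 is the convention of starting half a unit below the smallest value, at one more decimal place than the most precise value in the data. The sketch below assumes that convention (the card itself does not state the rule) and uses `Decimal` to avoid floating-point noise; the names are illustrative.

```python
from decimal import Decimal

smallest = Decimal("0.4")
most_precise = Decimal("7.893")

# Number of decimal places in the most precise value (3 for 7.893).
places = -most_precise.as_tuple().exponent

# Assumed convention: subtract 5 at the next decimal place, i.e. 0.0005 here.
offset = Decimal(5) * Decimal(10) ** -(places + 1)
start = smallest - offset

print(start)
```

Because the starting point carries one more decimal place than any data value, no observation can fall exactly on an interval boundary.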

Consider the following scenario: You randomly select 11 college students to determine how many times per week they skip breakfast. The responses are 1, 2, 2, 3, 4, 4, 5, 5, 6, 7, and 7. What percentage of these students skip breakfast more than 2 but less than 5 times a week? a. 23.9% b. 27.2% c. impossible to determine given the above information d. 63.6%

b. 27.2% Response Feedback: See example 1.17 (page 32) https://openstax.org/details/books/introductory-statistics
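The percentage can be verified by counting the responses strictly between 2 and 5. This is a small sketch with illustrative variable names.

```python
# Responses from the 11 students in the question.
responses = [1, 2, 2, 3, 4, 4, 5, 5, 6, 7, 7]

# "More than 2 but less than 5" excludes both endpoints.
in_range = [r for r in responses if 2 < r < 5]
percentage = 100 * len(in_range) / len(responses)

print(f"{len(in_range)} of {len(responses)} students = {percentage:.2f}%")
```

Three of the eleven students qualify, i.e. 3/11, which is about 27.27%; the quiz option truncates this to 27.2%.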

What does a 90% confidence level mean? a. The true mean has a 10% error. b. 9 times out of 10, the true mean would be contained within the confidence interval. c. 9 times out of 10, there is only a 10% error on the true mean. d. 10% of all people constructing a similar confidence interval could get similar results.

b. 9 times out of 10, the true mean would be contained within the confidence interval. Response Feedback: Read (Section 8.1, page 445): Explanation of 90% Confidence Level Ninety percent of all confidence intervals constructed in this way contain the true mean statistics exam score. For example, if we constructed 100 of these confidence intervals, we would expect 90 of them to contain the true population mean exam score. https://openstax.org/details/books/introductory-statistics

Of what kind of scale is a car speedometer an example? a. An interval scale b. A ratio scale c. An ordinal scale d. A nominal scale

b. A ratio scale Response Feedback: Read (section 2.3, page 14): Once you have determined that a variable is quantitative, it is often worthwhile to further classify it into discrete (also called counting) vs. continuous. Here the test is the midway test. If, for every pair of values of a quantitative variable the value midway between them is a meaningful value, then the variable is continuous, otherwise it is discrete. Typically discrete variables can only take on whole numbers (but all whole numbered variables are not necessarily discrete).

What is the simple bivariate non-graphical exploratory data analysis method? a. Simple tabulation b. Cross-tabulation c. Square-block tabulation d. Bivariate tabulation

b. Cross-tabulation Response Feedback: Read (section 4.4.1 page 90): Cross-tabulation is the basic bivariate non-graphical EDA technique. http://www.stat.cmu.edu/~hseltman/309/Book/chapter4.pdf

Consider the following scenario: You want to investigate if more than 43% of school-age children in Mexico break a bone in their body before the age of 10 years. What are Ho and Ha? a. Ho: p = 0.43, Ha: p > 0.43 b. Ho: p ≤ 0.43, Ha: p > 0.43 c. Ho: p ≥ 0.43, Ha: p < 0.43 d. Ho: p > 0.43, Ha: p < 0.43 e. Ho: p ≠ 0.43, Ha: p ≤ 0.43

b. Ho: p ≤ 0.43, Ha: p > 0.43 Response Feedback: Read (9.1 - Null and Alternative Hypotheses, page 503): H0 always has a symbol with an equal in it. Ha never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis. See Example 9.4 in section 9.1 (Null and Alternative Hypothesis), page 504. https://openstax.org/details/books/introductory-statistics

Which one of the following options describes the standard error of the mean? a. It is the value of the variance divided by the square root of the sample size. b. It is the value of the standard deviation divided by the square root of the sample size. c. It is the same as the standard deviation. d. It is the value of the standard deviation divided by the sample size.

b. It is the value of the standard deviation divided by the square root of the sample size.

What does "inference" mean in the context of statistics? a. Accounting for positive interference b. Making a statement about a population from a single study and its variability c. Determining the standard deviation of key statistical parameters d. Identifying the trends in a set of data

b. Making a statement about a population from a single study and its variability

Which of the following is NOT a property of statistics? a. Inductive inference with probability b. Vectorial and matrices calculations c. Data exploration and analysis

b. Vectorial and matrices calculations

Fill in the blank with the CORRECT option: The p-value measures the likelihood that any observed ____________ between groups is due to a chance event. a. equality b. difference c. parity d. probability

b. difference Response Feedback: Read (What does P value Mean?): The P value is defined as the probability under the assumption of no effect or no difference (null hypothesis), of obtaining a result equal to or more extreme than what was actually observed. The P stands for probability and measures how likely it is that any observed difference between groups is due to chance. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4111019/

Consider the following scenario: A researcher wants to show that science and engineering students spend more time in the library than the rest of the students at a given campus. Which one(s) of the following statements is correct? (i) The null hypothesis is: Engineering and science students spend the same amount of time in the library as other students on the campus. (ii) The alternative hypothesis is one-sided. (iii) The explanatory variable is the amount of hours spent in the library. a. i only b. i and ii only c. i, ii, and iii d. iii only e. ii only

b. i and ii only Response Feedback: See Examples 11.2 to 11.8 in lesson "11.2 Setting the Hypotheses: Examples". https://newonlinecourses.science.psu.edu/stat100/node/63/

Consider the following scenario: You randomly ask 6 college students how many times per day they brush their teeth. The results are 2, 3, 1, 2, 1, 1. Which one(s) of the following options correctly depicts the relative frequency these college students brush their teeth only once per day? (i) 3/6 (ii) 0.50 (iii) 3 (iv) 33% a. iv only b. i and ii only c. iii only d. ii only e. i only

b. i and ii only Response Feedback: See the example in the section titled "Frequency" (page 28) https://openstax.org/details/books/introductory-statistics
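A relative frequency is a count divided by the total number of observations. The sketch below builds the full relative frequency table for the six responses; `Counter` and `Fraction` are simply convenient standard-library choices, and the names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Times per day each of the 6 students brushes their teeth.
results = [2, 3, 1, 2, 1, 1]

counts = Counter(results)
n = len(results)

# Relative frequency = frequency / number of observations.
relative = {value: Fraction(count, n) for value, count in counts.items()}

for value in sorted(relative):
    print(value, relative[value], float(relative[value]))
```

For "once per day" the frequency is 3, so the relative frequency is 3/6 = 0.50. The value 3 on its own is the frequency, not the relative frequency.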

Consider the following scenario: You want to know the probability of throwing a "4" every time when you randomly throw a die 3 times. Which one(s) of the following statements is correct? (i) The variable has the values of 0, 1, and 2. (ii) P(rolling a "4" three times) = 1/6 x 1/6 x 1/6 (iii) The number of times "4" is thrown is considered to be a discrete variable. a. i, ii, and iii b. ii and iii only c. ii only d. i only e. i and ii only

b. ii and iii only Response Feedback: Read: • See "EXAMPLE: Observational" in "Unit 3B: Random Variables" (http://bolt.mph.ufl.edu/6050-6052/unit-3b/) • See "EXAMPLE: Flipping a Coin Twice" in "Discrete Random Variable" (http://bolt.mph.ufl.edu/6050-6052/unit-3b/discrete-random-variables/) What is the probability distribution of X, where the random variable X is the number of tails appearing in two tosses of a fair coin? We first note that since the coin is fair, each of the four outcomes HH, HT, TH, TT in the sample space S is equally likely, and so each has a probability of 1/4. (Alternatively, the multiplication principle can be applied to find the probability of each outcome to be 1/2 * 1/2 = 1/4.)
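The multiplication principle in statement (ii) can be cross-checked by enumerating every outcome of three throws. This is a sketch assuming a fair six-sided die; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Multiplication principle for three independent throws of a fair die.
p_four = Fraction(1, 6)
p_three_fours = p_four ** 3

# Cross-check by brute force over all 6 * 6 * 6 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=3))
favourable = [o for o in outcomes if o == (4, 4, 4)]
p_enumerated = Fraction(len(favourable), len(outcomes))

# The number of fours is a count, so it takes the discrete values 0, 1, 2, 3.
possible_counts = sorted({o.count(4) for o in outcomes})

print(p_three_fours, p_enumerated, possible_counts)
```

The enumeration also shows why statement (i) is wrong: the count of fours can be 0, 1, 2, or 3, not just 0, 1, and 2.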

In which one(s) of the following scenarios could you perform a two-way ANOVA test? (i) A study of mercury content in 3 different species of fish by 2 different analytical laboratories. (ii) A study of the deterioration caused by acid rain and by road salt along a particular section of a highway. (iii) A study of the effect of sleep deprivation in white and Latino populations. a. i only b. ii only c. ii and iii only d. i and ii only e. iii only

b. ii only Response Feedback: See the examples outlined in 11.1 - Introduction to Analysis of Variance (https://newonlinecourses.science.psu.edu/stat500/node/65/) and 11.2 - Two-way ANOVA (https://newonlinecourses.science.psu.edu/stat500/node/216/). Read (11.2 - Two-way ANOVA): In multi-factor experiments, combinations of treatments are applied to experimental units. Applying this to our greenhouse example, we have worked with a single factor, fertilizer, and examined differences among the fertilizer types. However, the researcher is also interested in the growth of different species of plant. Species is a second factor, making this a multifactor experiment. But... those of you with green thumbs say sometimes different fertilizers are more effective on different species of plants! Now we come to the idea that in a factorial design, each level of every treatment is combined with each level of all other treatments.

Consider the following scenario: You are plotting a histogram from a dataset of 347 points. As a rule of thumb, how many intervals would you start with in a first version of the plot? a. 35 b. 10 c. 19 d. Not enough data to decide

c. 19 Response Feedback: Read (section 2.2, page 79): A guideline that is followed by some for the width of a bar or class interval is to take the square root of the number of data values and then round to the nearest whole number, if necessary. For example, if there are 150 values of data, take the square root of 150 and round to 12 bars or intervals.
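The square-root guideline from the feedback is easy to apply in code. The helper name `sqrt_rule_bins` is made up for this sketch.

```python
import math

def sqrt_rule_bins(n_values: int) -> int:
    """Rule-of-thumb interval count: square root of n, rounded to a whole number."""
    return round(math.sqrt(n_values))

print(sqrt_rule_bins(347))  # the dataset in the question
print(sqrt_rule_bins(150))  # the example from the feedback
```

The square root of 347 is about 18.6, which rounds to 19. This is only a starting point for a first version of the plot; the interval count is usually adjusted after looking at the histogram.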

Under what kind of statistics can range, interquartile range (IQR), and standard deviation be classified? a. Measures of quantity b. Measures of frequency c. Measures of spread d. Measures of center

c. Measures of spread Response Feedback: Read the section titled "Summary" (page 16): We can now summarize distributions of quantitative variables numerically. - The 5-number summary displays the min, Q1, median, Q3, and max. - Measures of center include the mean and median. - Measures of spread include the range, IQR, and standard deviation. http://www.stat.ncsu.edu/people/reiland/courses/st101/chap4_quant-data.pdf

Fill in the blanks with the correct options: The median is a measure of the __________ and is a major component of the ___________. a. center, histogram b. highest frequency, pie chart c. center, box plot d. central tendency, bar graph

c. center, box plot Response Feedback: Read (section 4.3.3, pages 79-82): Here you can see that the boxplot consists of a rectangular box bounded above and below by "hinges" that represent the quartiles Q3 and Q1 respectively, and with a horizontal "median" line through it. Symmetry is appreciated by noticing if the median is in the center of the box

Fill in the blank with the correct option: When conducting hypothesis testing that compares two independent population proportions, __________. a. the number of positive results must be greater than the number of negative results b. each sample must be randomly sampled and the sample size must be at least 5 c. each sample must contain 5 positive and 5 negative results d. the samples should preferably be independent of each other e. the population should be at least 5 times larger than the sample population

c. each sample must contain 5 positive and 5 negative results Response Feedback: Read (10.3 - Comparing Two Independent Population Proportions, page 573-574). When conducting a hypothesis test that compares two independent population proportions, the following characteristics should be present: 1. The two independent samples are simple random samples that are independent. 2. The number of successes is at least five, and the number of failures is at least five, for each of the samples. 3. Growing literature states that the population must be at least ten or 20 times the size of the sample. This keeps each population from being over-sampled and causing incorrect results. https://openstax.org/details/books/introductory-statistics

Fill in the blank with the CORRECT option: To calculate the sample size required for a particular study, one must first ________________. a. perform a chi-square test on the data b. define the statistical hypotheses c. establish the targeted margin of error d. impose a limit on sample size given the availability of the resources (funding, time, material, etc.)

c. establish the targeted margin of error Response Feedback: See page 2 (Issues in Estimating Sample Size for Confidence Intervals Estimates): In order to determine the sample size needed, the investigator must specify the desired margin of error. It is important to note that this is not a statistical issue, but a clinical or a practical one. http://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_power/BS704_Power2.html

Which one(s) of the following statements concerning standard normal variables is correct? (i) The mean of a random variable is sometimes referred to as the expected value. (ii) Standard deviation is equal to the square of the variance. (iii) A binomial continuous variable can only have two potential results. (iv) A standard normal table can be used to determine the probability associated with a normal random variable. a. i only b. iii and iv only c. i, iii and iv only d. ii and iii only e. iii only

c. i, iii and iv only Response Feedback: Read (Summary (Unit 3B - Random Variables), section titled "Random Variables"): Center: The center of a random variable is measured by its mean (which is sometimes also referred to as the expected value). Spread: The spread of a random variable is measured by its variance, or more typically by its standard deviation (the square root of the variance). Read (Summary (Unit 3B - Random Variables), section titled "Binomial Random Variables"): The binomial random variable is defined in a random experiment that consists of n independent trials, each having two possible outcomes (called "success" and "failure"). Read (Summary (Unit 3B - Random Variables), section titled "Continuous Random Variables"): Another way to find probabilities associated with the normal random variable is using the standard normal table. http://bolt.mph.ufl.edu/6050-6052/unit-3b/summary-unit-3b-random-variables/

Consider the following scenario: The school board wants to investigate if the students at elementary school A take more sick days than the students at elementary school B. You randomly select 7 student records from each school, and each population has a normal distribution. You determine the sample mean and sample standard deviation for both samples but you do NOT know the population standard deviation. The school board believes that students in school A take more sick days. You perform a hypothesis test at the 5% significance level (95% confidence). Which one(s) of the following statements is correct? (i) Ho: µschool A ≠ µschool B, Ha: µschool A = µschool B. (ii) This is a two-tailed test. (iii) If p = 0.01, you should reject Ho. (iv) A Student's t-distribution can be used to perform the hypothesis testing. a. i and ii only b. i, ii, and iii only c. iii and iv only d. ii, iii, and iv only e. iv only

c. iii and iv only Response Feedback: See Examples 10.1 and 10.2 in section 10.1 - Two Population Means with Unknown Standard Deviation (pages 563-566). Read (10.1 - Two Population Means with Unknown Standard Deviation, page 563): When both sample sizes n1 and n2 are five or larger, the Student's t approximation is very good. Notice that the sample variances (s1)^2 and (s2)^2 are not pooled. (If the question comes up, do not pool the variances.) https://openstax.org/details/books/introductory-statistics

Consider the following scenario: When conducting hypothesis testing using the critical value approach, Ho: μ = 2.1 and Ha: μ > 2.1. Considering a sample size of 10 and the data in the following table (which contains excerpts from the quantiles of the t-distribution), which one of the following statements is correct?
Quantile A = 0.90, quantile B = 0.95, quantile C = 0.975, quantile D = 0.99
df    QA      QB      QC      QD
9     1.383   1.833   2.262   2.821
10    1.372   1.812   2.228   2.764
11    1.363   1.796   2.201   2.718
(i) At a significance level of 0.025, if t* > 2.228, the null hypothesis should be rejected. (ii) At a significance level of 0.10, if t* > 1.363, the null hypothesis should be rejected. (iii) At a significance level of 0.99, if t* < 2.718, the null hypothesis should be rejected. (iv) At a significance level of 0.05, if t* > 1.833, the null hypothesis should be rejected. (v) At a significance level of 0.05, if t* > 2.228, the null hypothesis should be rejected. a. ii b. iii c. iv d. v e. i

c. iv Response Feedback: See similar types of problems in lesson "3.1 - Hypothesis Testing (Critical value approach)" https://newonlinecourses.science.psu.edu/statprogram/node/137/
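The reasoning behind this answer is that a sample of size 10 has 10 - 1 = 9 degrees of freedom, and a right-tailed test at significance level α uses the (1 - α) quantile. The sketch below encodes the table excerpt from the question, keyed by α for convenience; the function names and the example statistic 2.0 are hypothetical.

```python
# Table excerpt from the question: df -> {alpha: (1 - alpha) quantile of t}.
T_TABLE = {
    9: {0.10: 1.383, 0.05: 1.833, 0.025: 2.262, 0.01: 2.821},
    10: {0.10: 1.372, 0.05: 1.812, 0.025: 2.228, 0.01: 2.764},
    11: {0.10: 1.363, 0.05: 1.796, 0.025: 2.201, 0.01: 2.718},
}

def critical_value(sample_size: int, alpha: float) -> float:
    """Right-tailed critical value for a one-sample t-test (df = n - 1)."""
    df = sample_size - 1
    return T_TABLE[df][alpha]

def reject_null(t_star: float, sample_size: int, alpha: float) -> bool:
    """Reject Ho (in favour of Ha: mu > mu0) when t* exceeds the critical value."""
    return t_star > critical_value(sample_size, alpha)

print(critical_value(10, 0.05))    # 1.833, as in statement (iv)
print(reject_null(2.0, 10, 0.05))  # hypothetical t* = 2.0
```

The other statements pick the wrong row (df = 10 or 11 instead of 9) or pair the threshold with the wrong significance level.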

Fill in the blank with the CORRECT option: It is generally considered that committing a Type II error is _____________ committing a Type I error. a. more risky than b. the same as c. less risky than

c. less risky than Response Feedback: Read (Type I and type II errors): Making some level of error is unavoidable because fundamental uncertainty lies in a statistical inference procedure. As allowing errors is basically harmful, we need to control or limit the maximum level of errors. Which type of error is more risky between type I and type II errors? Traditionally, committing type I error has been considered more risky, and thus more strict control of type I error has been performed in statistical inference. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4534731/

Fill in the blank with the correct option: The null hypothesis for a hypothesis test that compares two independent population proportions is usually ________________. a. pA ≠ pB b. pA ≤ pB c. pA = pB d. pA ≥ pB

c. pA = pB Response Feedback: Read (10.3 - Comparing Two Independent Population Proportions, page 574). The difference of two proportions follows an approximate normal distribution. Generally, the null hypothesis states that the two proportions are the same. That is, H0: pA = pB. https://openstax.org/details/books/introductory-statistics

Fill in the blank with the CORRECT option: Assuming all other conditions are similar, the decrease of Type I error levels translates into ______________ Type II error levels. a. no change in b. the decrease of c. the increase of

c. the increase of Response Feedback: Read (Relationship and affecting factors on type I and type II errors): 1. Related change of both errors Type I and type II errors are closely related. If all other conditions are the same, the reduction of Type I error level accompanies the increase of type II error level. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4534731/

Fill in the blank with the correct option: The variance is _____________. a. the square root of the standard deviation b. the same as the standard deviation c. the square of the standard deviation d. the interval defined by the standard deviation

c. the square of the standard deviation Response Feedback: Read (section 4.2.4, page 70): The variance is the mean of the squares of the individual deviations. The standard deviation is the square root of the variance. http://www.stat.cmu.edu/~hseltman/309/Book/chapter4.pdf

Consider the following scenario: Two events (C and D) are not mutually exclusive. P(C) = 0.40, P(D) = 0.60, P(C and D) = 0.25. What is the probability that either event D or C occurs or that both occur at the same time? a. 0.35 b. not possible to calculate given the lack of information c. 1.00 d. 0.75 e. 0.20

d. 0.75 Response Feedback: Read the section titled "Let's Summarize": 4. The General Addition Rule (#5) states that for any two events, • P(A or B) = P(A) + P(B) - P(A and B), where, by P(A or B) we mean P(A occurs or B occurs or both). In the special case of disjoint events, events that cannot occur together, the General Addition Rule can be reduced to the Addition Rule for Disjoint Events (#4), which is • P(A or B) = P(A) + P(B). Note: ONLY use this when you are CONVINCED the events are disjoint (they do NOT overlap). http://bolt.mph.ufl.edu/6050-6052/unit-3/module-6/
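Applying the General Addition Rule to the numbers in the question gives the stated answer. The function name is illustrative.

```python
def p_either(p_a: float, p_b: float, p_both: float) -> float:
    """General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

# Values from the question: P(C) = 0.40, P(D) = 0.60, P(C and D) = 0.25.
p_c_or_d = p_either(0.40, 0.60, 0.25)
print(round(p_c_or_d, 2))
```

Without subtracting the overlap, the sum would be 1.00 (option c), which double-counts the cases where both events occur.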

Of what kind of scale is the Celsius temperature scale an example? a. A ratio scale b. An ordinal scale c. A nominal scale d. An interval scale

d. An interval scale Interval scales are numerical scales in which intervals have the same interpretation throughout. As an example, consider the Fahrenheit scale of temperature. The difference between 30 degrees and 40 degrees represents the same temperature difference as the difference between 80 degrees and 90 degrees. This is because each 10-degree interval has the same physical meaning (in terms of the kinetic energy of molecules).

Fill in the blank with the CORRECT option: The Bonferroni correction can be used as a way around the following issue; when the number of tests in a statistical analysis ___________, so does the chance of a ________________. a. goes down, Type II error b. goes down, Type I error c. goes up, Type II error d. goes up, Type I error

d. goes up, Type I error Response Feedback: Read (Introduction): The Bonferroni correction was proposed to circumvent the problem that as the number of tests increases, so does the likelihood of a type I error, i.e., concluding that a significant difference is present when it is not. Hence, if a null hypothesis (Ho) is true and p ≤ 0.05 is used as the test criterion for all tests, a significant difference will be observed by chance one in 20 trials. If 20 tests are performed, and Ho is true for all 20 tests, however, it can be shown that the chance of at least one test being statistically significant is not p = 0.05 but p = 0.64. http://onlinelibrary.wiley.com/doi/10.1111/opo.12131/full
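The 0.64 figure in the feedback follows from 1 - (1 - α)^m for m tests. The sketch below assumes the tests are independent (the usual assumption behind this calculation), and also shows the standard Bonferroni adjustment of dividing α by the number of tests; the names are illustrative.

```python
alpha = 0.05
n_tests = 20

# Chance of at least one false positive when Ho is true for every test,
# assuming the tests are independent.
familywise_error = 1 - (1 - alpha) ** n_tests

# Bonferroni correction: test each hypothesis at alpha / n_tests instead.
alpha_bonferroni = alpha / n_tests
familywise_corrected = 1 - (1 - alpha_bonferroni) ** n_tests

print(f"uncorrected: {familywise_error:.2f}")
print(f"per-test threshold: {alpha_bonferroni:.4f}")
print(f"corrected: {familywise_corrected:.3f}")
```

With the corrected per-test threshold, the overall chance of at least one Type I error stays just under the original 0.05.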

In which one(s) of the following conditions is the use of the Bonferroni correction appropriate? (i) When a single test of the "universal null hypothesis" (Ho) (i.e. meaning that all tests lack significance) is necessary to the study (ii) When the researcher must absolutely avoid a Type II error. (iii) When a large number of tests are performed without premeditated hypotheses in an effort to uncover any significant outcomes. a. i only b. ii only c. ii and iii only d. i and iii only e. i, ii, and iii

d. i and iii only Response Feedback: Read (Concluding remarks and advice): 2. A Bonferroni correction should be considered if: • a single test of the 'universal null hypothesis' (Ho) that all tests are not significant is required. • it is imperative to avoid a type I error. • a large number of tests are carried out without preplanned hypotheses in an attempt to establish any results that may be significant. http://onlinelibrary.wiley.com/doi/10.1111/opo.12131/full

Which one of the following statements correctly describes the meaning of a 95% confidence interval? (i) The values within the confidence interval are within 1.96 standard deviations of the population mean. (ii) The values within the confidence interval are within 1.96 standard deviations of the sample mean. (iii) The values within the confidence interval are within 1.645 standard deviations of the population mean. (iv) The values within the confidence interval are within 1.645 standard deviations of the sample mean. a. iii only b. ii only c. neither i, ii, iii, nor iv are correct statements d. i only e. iv only

d. i only Response Feedback: Read (Confidence Interval on the Mean): The value of 1.96 is based on the fact that 95% of the area of a normal distribution is within 1.96 standard deviations of the population mean. http://onlinestatbook.com/2/estimation/mean.html

Consider the following scenario: A 90% confidence interval for a population proportion was determined to be [0.49, 0.87]. Which one of the following statements regarding this confidence interval is correct? (i) There is a 90% probability that the population proportion lies outside the range of 0.49 and 0.87. (ii) The authors are 90% confident that 0.49 < p < 0.87 (iii) The authors are confident that 90% of the time, the sample proportion lies between the values 0.49 and 0.87. a. ii b. i c. neither i, ii, nor iii are correct options d. iii

a. ii Response Feedback: Read (4.2.1 - Interpreting Confidence Intervals): Example: Correlation Between Height and Weight At the beginning of the Spring 2017 semester a sample of World Campus students were surveyed and asked for their height and weight. In the sample, Pearson's r = 0.487. A 95% confidence interval was computed of [0.410, 0.559]. The correct interpretation of this confidence interval is that we are 95% confident that the correlation between height and weight in the population of all World Campus students is between 0.410 and 0.559. In other words, we are 95% confident that [0.410≤ρ≤0.559] https://newonlinecourses.science.psu.edu/stat200/lesson/4/4.2/4.2.1

Consider the following scenario: A researcher wants to show that eating an apple a day reduces the number of times a person visits the doctor per year. Which one(s) of the following statements is true? (i) The null hypothesis is: Eating an apple a day increases the number of times a person visits the doctor per year. (ii) The alternative hypothesis is two-sided. (iii) The response variable is the number of visits to the doctor. a. ii only b. i and ii only c. i only d. iii only e. i, ii, and iii

d. iii only Response Feedback: See Examples 11.2 to 11.8 in lesson "11.2 Setting the Hypotheses: Examples". https://newonlinecourses.science.psu.edu/stat100/node/63/

Fill the blank with the CORRECT option: The difference between a parallel and a cross-over design is that _____________________. a. in a parallel design, the patients are submitted to more than one treatment at a time b. in a cross-over design, the patients remain on a particular treatment for the duration of the trial c. in a parallel design, the patients serve as their own control thus reducing variability d. in a cross-over design, the patients switch from one treatment to another at some point during the trial

d. in a cross-over design, the patients switch from one treatment to another at some point during the trial Response Feedback: Read (3.3 - Experimental Design Terminology): A parallel design refers to a study in which patients are randomized to a treatment and remain on that treatment throughout the course of the trial. This is a typical design. In contrast, with a crossover design patients are randomized to a sequence of treatments and they cross over from one treatment to another during the course of the trial. Each treatment occurs in a time period with a washout period in between. Crossover designs are of interest since with each patient serving as their own control, there is potential for reduced variability. https://newonlinecourses.science.psu.edu/stat509/node/21/

What statistical values are included in the "5-number summary"? a. range, Q1, median, Q3, mean b. min, max, Q1, Q3, range c. IQR, median, mean, min, max d. min, max, median, Q1, Q3

d. min, max, median, Q1, Q3 Response Feedback: Read the section titled "5 Number Summary" (page 12): 5-Number Summary minimum Q1 median Q3 maximum http://www.stat.ncsu.edu/people/reiland/courses/st101/chap4_quant-data.pdf

Which one(s) of the following statements concerning variability is correct? (IQR = interquartile range) i) If Q1 = 3 and Q2 = 5, IQR = 4. ii) 93% of the data lie at or below the 7th percentile. iii) If Q1 = 3 and Q4 = 5, IQR = 2 a. i and ii only b. i only c. i, ii, and iii d. neither i, ii, nor iii are correct statements e. ii only

d. neither i, ii, nor iii are correct statements Response Feedback: Read (2.2 - Measures of Variability), section titled "Measures of Variability": B. Interquartile range (IQR) In order to talk about interquartile range, we need to first talk about percentiles. The pth percentile of the data set is a measurement such that after the data are ordered from smallest to largest, at most, p% of the data are at or below this value and at most, (100 - p)% at or above it. Thus, the median is the 50th percentile. Fifty percent of the data values fall at or below the median. Also, Q1 = lower quartile = the 25th percentile and Q3 = upper quartile = the 75th percentile. The interquartile range is the difference between upper and lower quartiles and denoted as IQR. IQR = Q3 - Q1 = upper quartile - lower quartile = 75th percentile - 25th percentile. https://newonlinecourses.science.psu.edu/stat500/node/13/
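
Worked example of the 5-number summary and IQR = Q3 - Q1: a minimal sketch on a made-up data set. Quartile conventions differ between textbooks and software; this one assumes the "inclusive" method of Python's statistics.quantiles, so other tools may report slightly different quartiles.

```python
from statistics import quantiles

# Hypothetical data, already sorted for readability
data = [2, 4, 6, 8, 10]

# n=4 splits the data into quartiles and returns [Q1, median, Q3]
q1, median, q3 = quantiles(data, n=4, method="inclusive")

five_number = (min(data), q1, median, q3, max(data))
iqr = q3 - q1

print(five_number)  # (2, 4.0, 6.0, 8.0, 10)
print(iqr)          # 4.0
```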

What are the names given to the two types of statistical hypotheses? a. null and progressive b. significant and alternative c. true and untrue d. null and alternative e. probable and improbable

d. null and alternative Response Feedback: Read (section titled "Basic concepts of inferential statistics"): The two statistical hypotheses are: null and alternative hypotheses. Null hypothesis refers to the absence of effect or association. Alternative hypothesis states that there is a difference between at least two populations studied and when positive states that there is difference between the groups analyzed. https://www.sciencedirect.com/science/article/pii/S0104001417300167

Geometric Means

When data are highly skewed, geometric means can be used. A geometric mean, unlike an arithmetic mean, minimizes the effect of very high or low values, which could bias the mean if a straight average (arithmetic mean) were calculated. The geometric mean is computed from a log-transformation of the data (it is the exponential of the mean of the logs) and is expressed as the N-th root of the product of N numbers.
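
Worked example: a minimal sketch showing that the N-th root of the product and the exponential of the mean log give the same value. The data are made up, and the geometric mean is only defined here for positive values.

```python
from math import exp, log, prod
from statistics import mean

# Hypothetical, highly skewed positive data
values = [1.0, 10.0, 100.0]

arithmetic = mean(values)                        # pulled up by the large value
geo_root = prod(values) ** (1 / len(values))     # N-th root of the product
geo_log = exp(mean(log(v) for v in values))      # back-transformed mean of logs

print(arithmetic)          # 37.0
print(round(geo_root, 6))  # 10.0
print(round(geo_log, 6))   # 10.0
```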

Consider the following scenario: If σ = 2 and μ = 14 for a normal random variable, what is the probability of having a value more than 20? a. 0.50 b. 0.025 c. 0.003 d. 0.68 e. 0.0015

e. 0.0015 Response Feedback: Read (Normal Random Variables, section titled "The Standard Deviation Rule for Normal Random Variables"): Comment • Notice that the information from the rule can be interpreted from the perspective of the tails of the normal curve: o Since 0.68 is the probability of being within 1 standard deviation of the mean, (1 - 0.68) / 2 = 0.16 is the probability of being further than 1 standard deviation below the mean (or further than 1 standard deviation above the mean.) o Likewise, (1 - 0.95) / 2 = 0.025 is the probability of being more than 2 standard deviations below (or above) the mean. o And (1 - 0.997) / 2 = 0.0015 is the probability of being more than 3 standard deviations below (or above) the mean. http://bolt.mph.ufl.edu/6050-6052/unit-3b/normal-random-variables/
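
Worked check: 20 is (20 - 14) / 2 = 3 standard deviations above the mean, so the Standard Deviation Rule gives (1 - 0.997) / 2 = 0.0015. The sketch below compares that rounded rule with the tail area computed from the normal distribution itself; variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, cutoff = 14, 2, 20

z = (cutoff - mu) / sigma                           # standard deviations above the mean
rule_tail = (1 - 0.997) / 2                         # Standard Deviation Rule approximation
exact_tail = 1 - NormalDist(mu, sigma).cdf(cutoff)  # tail area from the normal CDF

print(z)                     # 3.0
print(round(rule_tail, 4))   # 0.0015
print(round(exact_tail, 5))  # 0.00135
```

The quiz answer uses the rounded rule; the exact tail area is slightly smaller.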

Consider the following scenario: A study examined the number of days a student is absent from classes during the academic year. If σ = 2.5 and μ = 12.1, how many days was Tom absent if his z-score is -1.4? a. 15.6 b. 4.2 c. Cannot calculate based on the information provided d. 11 e. 8.6

e. 8.6 Response Feedback: Read (Standard Normal Distribution, section titled "Standardizing Values"): EXAMPLE: Foot Length (a) What is the standardized value for a male foot length of 8.5 inches? How does this foot length relate to the mean? z = (8.5 - 11) / 1.5 = -1.67. This foot length is 1.67 standard deviations below the mean. (b) A man's standardized foot length is +2.5. What is his actual foot length in inches? If z = +2.5, then his foot length is 2.5 standard deviations above the mean. Since the mean is 11, and each standard deviation is 1.5, we get that the man's foot length is: 11 + 2.5(1.5) = 14.75 inches. http://bolt.mph.ufl.edu/6050-6052/unit-3b/normal-random-variables/standard-normal-probabilities/
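
Worked check: rearranging z = (x - μ) / σ gives x = μ + zσ. A minimal sketch with the numbers from the question; the function name is illustrative.

```python
def raw_value(mu: float, sigma: float, z: float) -> float:
    """Convert a z-score back to the original scale: x = mu + z * sigma."""
    return mu + z * sigma


tom_days = raw_value(mu=12.1, sigma=2.5, z=-1.4)
print(round(tom_days, 1))  # 8.6
```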

Fill in the blank with the correct option: A symmetrical bell-shaped curve in which the majority of the points can be found around the centre is said to follow ____________. a. a biased distribution b. a two-tailed distribution c. a skewed-distribution d. a categorical distribution e. a normal distribution

e. a normal distribution Response Feedback: Read (5.3 The Normal Curve): The predictable pattern of interest is a type of symmetry where much of the distribution of the data is clumped around the center and few observations are found on the extremes. Data that has this pattern are said to be bell-shaped or have a normal distribution. https://newonlinecourses.science.psu.edu/stat100/node/83/

Fill in the blank with the correct option: When conducting hypothesis testing for matched or paired samples, __________. a. the difference between pairs must be normally distributed regardless of the sample size b. the two samples are independent c. participants are randomly paired with each other d. the size of the samples is generally large e. differences are calculated between pairs

e. differences are calculated between pairs Response Feedback: Read (10.4 - Matched or Paired Samples, page 578-579). When using a hypothesis test for matched or paired samples, the following characteristics should be present: 1. Simple random sampling is used. 2. Sample sizes are often small. 3. Two measurements (samples) are drawn from the same pair of individuals or objects. 4. Differences are calculated from the matched or paired samples. 5. The differences form the sample that is used for the hypothesis test. 6. Either the matched pairs have differences that come from a population that is normal or the number of differences is sufficiently large so that distribution of the sample mean of differences is approximately normal. https://openstax.org/details/books/introductory-statistics
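
Worked example of characteristics 4 and 5: a minimal sketch that computes the differences and then a one-sample t statistic on them. The before/after values are made up, and the test is against a mean difference of zero.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements on the same four subjects
before = [10, 12, 9, 11]
after = [12, 15, 10, 13]

# The differences form the sample used for the hypothesis test
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_diff = mean(differences)
sd_diff = stdev(differences)             # sample standard deviation (n - 1)
t_stat = mean_diff / (sd_diff / sqrt(n)) # compared against t with n - 1 df

print(differences)  # [2, 3, 1, 2]
print(mean_diff)    # 2
```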

Which one(s) of the following statements correctly describes the margin of error? (i) It is the width of the confidence interval. (ii) It is dependent upon the standard error. (iii) It is dependent upon the mean. a. iii only b. i only c. ii only d. i, ii, and iii e. i and ii only

e. i and ii only Response Feedback: Read (4.2 - Introduction to Confidence Intervals): The center of a confidence interval is the sample statistic, such as a sample mean or sample proportion. This is also known as the point estimate. The width of our confidence interval is also known as the margin of error. The margin of error is the amount that is subtracted from and added to the point estimate to construct the confidence interval. The margin of error will depend on two factors: (1) the level of confidence; and (2) the value of the standard error. https://newonlinecourses.science.psu.edu/stat200/lesson/4/4.2

Which one(s) of the following assumptions is required for estimating the confidence interval on the difference between the means of two groups? (i) The variance is identical between the two groups. (ii) The groups follow a t-distribution. (iii) The groups were sampled together. a. i and iii only b. ii and iii only c. iii only d. i and ii only e. i only

e. i only Response Feedback: Read (Difference between Means): In order to construct a confidence interval, we are going to make three assumptions: 1. The two populations have the same variance. This assumption is called the assumption of homogeneity of variance. 2. The populations are normally distributed. 3. Each value is sampled independently from each other value. http://onlinestatbook.com/2/estimation/difference_means.html

Consider the following scenario: A population of 1000 is normally distributed, with a mean of 53, and the population standard deviation is 5. A researcher randomly selects 50 samples with replacement. Which one(s) of the following statements is correct? (i) The standard deviation on the sample means is 5. (ii) The variance of the sample means would increase if the sample size was changed to 25. (iii) By the central limit theorem, the distribution should be normal with a sample mean of 53. a. iii only b. i and ii only c. ii only d. i only e. ii and iii only

e. ii and iii only Response Feedback: Read: For the random samples we take from the population, we can compute the mean of the sample means: μx̅ =μ and the standard deviation of the sample means: σx̅ =σ/√n Read (section "Central Limit Theorem with a Normal Population"): If we were to take samples of n=5 instead of n=10, we would get a similar distribution, but the variation among the sample means would be larger. In fact, when we did this we got a sample mean = 75 and a sample standard deviation = 3.6. http://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/BS704_Probability12.html
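
Worked check of statements (i) and (ii): a minimal sketch applying σx̅ = σ/√n to the numbers in the question. The function name is illustrative.

```python
from math import sqrt


def sd_of_sample_means(sigma: float, n: int) -> float:
    """Standard deviation of the sample means: sigma / sqrt(n)."""
    return sigma / sqrt(n)


se_n50 = sd_of_sample_means(5, 50)
se_n25 = sd_of_sample_means(5, 25)

print(round(se_n50, 3))  # 0.707, not 5, so (i) is false
print(se_n25)            # 1.0, larger than for n = 50, so (ii) is true
```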

Which one(s) of the following statements is correct regarding confidence intervals? (i) If the confidence level increases, the value becomes more precise and thus the confidence interval decreases. (ii) If you increase the sample size, the confidence interval increases because the error bound for a population mean (EBM) increases. (iii) When determining a confidence interval for a population proportion, the sample size multiplied by the proportion of positive results must be greater than 5 and the sample size multiplied by the proportion of negative results must be greater than 5. (iv) If the population mean is not known, a Student's t-distribution table can be used to find EBM with n-1 degrees of freedom. a. ii and iii only b. i only c. ii only d. i and iii only e. iii only

e. iii only Response Feedback: Read (Section 8.1, page 449): Summary: Effect of Changing the Confidence Level • Increasing the confidence level increases the error bound, making the confidence interval wider. • Decreasing the confidence level decreases the error bound, making the confidence interval narrower. Read (Section 8.1, page 450): Summary: Effect of Changing the Sample Size • Increasing the sample size causes the error bound to decrease, making the confidence interval narrower. • Decreasing the sample size causes the error bound to increase, making the confidence interval wider. Read (Section 8.2, page 454): When calculating the error bound, a probability table for the Student's t-distribution can also be used to find the value of t. The table gives t-scores that correspond to the confidence level (column) and degrees of freedom (row); the t-score is found where the row and column intersect in the table. Read (Section 8.3, page 457): The confidence interval can be used only if the number of successes np′ and the number of failures nq′ are both greater than five. https://openstax.org/details/books/introductory-statistics
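
Worked example of the summaries above: a minimal sketch, assuming a known population standard deviation so that the error bound is z·σ/√n. The inputs are made up and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def error_bound(confidence: float, sigma: float, n: int) -> float:
    """Two-sided error bound z * sigma / sqrt(n), sigma assumed known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


def proportion_interval_ok(n: int, p_hat: float) -> bool:
    """Check that successes n*p' and failures n*q' both exceed five."""
    return n * p_hat > 5 and n * (1 - p_hat) > 5


# Hypothetical inputs: sigma = 5, n = 50, then n = 200
ebm_90 = error_bound(0.90, sigma=5, n=50)
ebm_95 = error_bound(0.95, sigma=5, n=50)
ebm_95_big_n = error_bound(0.95, sigma=5, n=200)

print(ebm_90 < ebm_95)        # True: higher confidence, wider interval
print(ebm_95_big_n < ebm_95)  # True: larger sample, narrower interval
print(proportion_interval_ok(20, 0.1))  # False: only 2 expected successes
```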

Fill in the blank with the correct option: Statistical significance means ________________. a. the effect is due to chance b. that the null hypothesis should be accepted c. the effect is substantial d. the effect is important e. the effect is not zero

e. the effect is not zero Response Feedback: Read (Significance Testing): When the null hypothesis is rejected, the effect is said to be statistically significant. For example, in the Physicians' Reactions case study, the probability value is 0.0057. Therefore, the effect of obesity is statistically significant and the null hypothesis that obesity makes no difference is rejected. It is very important to keep in mind that statistical significance means only that the null hypothesis of exactly no effect is rejected; it does not mean that the effect is important, which is what "significant" usually means. When an effect is significant, you can have confidence the effect is not exactly zero. Finding that an effect is significant does not tell you about how large or important the effect is. Thus, finding that an effect is statistically significant signifies that the effect is real and not due to chance. http://onlinestatbook.com/2/logic_of_hypothesis_testing/significance.html

Which one(s) of the following options is an example of a probability sample? i) Simple random sample ii) Convenience sample iii) Judgment sample iv) Systematic sample

i and iv only

Which one of the following statements regarding the value of the mean and median within a normal distribution is correct? (i) The mean and median are always identical. (ii) The mean and median are most likely vastly different. (iii) The mean and median are most likely similar.

iii Read (5.3 The Normal Curve): Example 5.3. Normal Curves Since a normal distribution is a type of symmetric distribution, you would expect the mean and median to be very close in value. https://newonlinecourses.science.psu.edu/stat100/node/83/

In a positively skewed distribution the mean has a value that is ___________________.

larger than the median Read (section 4.2.3, page 68): For symmetric distributions, the mean and the median coincide. For unimodal skewed (asymmetric) distributions, the mean is farther in the direction of the "pulled out tail" of the distribution than the median is.

Weighted arithmetic means

A sample weight is a measure of the number of people in the population represented by a sampled person. To obtain an unbiased estimate of the population mean, based on data from the NHANES 1999-2002 sample, it is necessary to take a weighted arithmetic mean.
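
Worked example: a minimal sketch of a weighted arithmetic mean, Σ(w·x) / Σw. The values and weights are made up for illustration and are not NHANES data.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical: two sampled people representing 1000 and 3000 people
values = [10.0, 20.0]
weights = [1000, 3000]

print(sum(values) / len(values))       # 15.0 (unweighted)
print(weighted_mean(values, weights))  # 17.5 (weighted)
```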

Fill in the blank with the correct option: For the same confidence level, the Z value for a one-sided confidence interval is ______ that for a two-sided confidence interval. a. smaller than b. larger than c. the same as

smaller than Response Feedback: See "One-sided Confidence Intervals", slide 24. https://dspace.mit.edu/bitstream/handle/1721.1/72947/15-075-spring-2003/contents/lecture-notes/lec6_chap6.pdf
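
Worked check at the 95% level: a minimal sketch using the standard normal quantile function. A one-sided interval puts all of α in one tail, while a two-sided interval splits it, so the two-sided critical value sits further out.

```python
from statistics import NormalDist

confidence = 0.95
standard_normal = NormalDist()

# One-sided: all of alpha = 0.05 sits in a single tail
z_one_sided = standard_normal.inv_cdf(confidence)
# Two-sided: alpha is split, 0.025 in each tail
z_two_sided = standard_normal.inv_cdf(1 - (1 - confidence) / 2)

print(round(z_one_sided, 3))  # 1.645
print(round(z_two_sided, 3))  # 1.96
```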

The median is said to be robust because ______________.

very high or very low values often have no effect on it

