HTH 320 Exam #2

-Alpha levels and p-values are rooted in the central limit theorem. -The central limit theorem states that given large enough samples from a population, the distribution of sample means taken from this population will approach a normal distribution (bell curve or normal curve).

-__________ and __________s are rooted in the central limit theorem. -The __________ limit theorem states that given large enough __________ from a population, the distribution of sample means taken from this population will approach a __________ distribution (__________ curve or __________ curve).
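
The central limit theorem card above can be checked with a small simulation. This is only a sketch under stated assumptions: the skewed exponential population, the sample size of 50, the 2000 repetitions, and the helper name `sample_means` are all illustrative choices, not part of the course material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable


def sample_means(draw, sample_size, n_samples):
    """Return the means of n_samples samples, each made of sample_size draws."""
    return [
        statistics.fmean(draw() for _ in range(sample_size))
        for _ in range(n_samples)
    ]


# Assumed population: exponential with mean 1 and standard deviation 1 (clearly not normal).
means = sample_means(lambda: random.expovariate(1.0), sample_size=50, n_samples=2000)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# Theory says the sample means centre on the population mean (1.0)
# with a spread close to 1 / sqrt(50), roughly 0.141.
print(round(mean_of_means, 3), round(sd_of_means, 3))
```

A histogram of `means` would look close to a bell curve even though the individual draws are strongly skewed.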

CRONBACH ALPHA RELIABILITY COEFFICIENT:

A coefficient of reliability relating to the internal consistency of a questionnaire or test. -reliability statistics

VARIABLE:

A condition or factor that is present in a research design.

NORMAL DISTRIBUTION:

A frequency distribution in which the data form a symmetrical bell shaped curve partitioned into standardized distributions.

FREQUENCY DISTRIBUTION:

A graph depicting the frequency of data point occurrences.

Z SCORE:

A measure of how many standard deviations an individual data point is from the mean.

VALIDITY:

A measure of how well a test or measurement instrument measures what it is supposed to measure.

ODDS RATIO:

A comparison of the odds of an event in one group with the odds of the same event in another group. Odds are the ratio of the chances that an event will occur to the chances that it will not. (The odds of drawing an ace out of a deck of cards are 4/48.)

KURTOSIS:

A measure of the shape (flatness or peakedness) of a frequency distribution.

WILCOXON'S SIGNED RANK TEST:

A non-parametric version of the paired t-test.

CHI-SQUARE GOODNESS OF FIT:(x^2)

A nonparametric statistical technique used to compare expected versus observed outcomes.

CHI SQUARE TEST OF INDEPENDENCE: (x^2 )

A nonparametric statistical test used to determine if responses from two groups are independent.

FRIEDMAN TEST:

A nonparametric version of a one-way ANOVA with repeated measures.

MANN-WHITNEY U TEST: (Mann-Whitney-Wilcoxon, Wilcoxon rank-sum test, or Wilcoxon-Mann-Whitney test).

A nonparametric version of the independent-samples t-test.

PARAMETER:

A numerical value summarizing the population.

SAMPLE STATISTIC:

A numerical value summarizing the sample.

STUDENT'S T-TEST (or t-test):

A parametric statistical technique measuring the difference between means of two independent samples.

WELCH TEST:

A parametric version of a t-test used when the two data sets have unequal variances and unequal sample sizes.

NORMAL DISTRIBUTION: BINOMIAL DISTRIBUTION:

A probability distribution describing data that are limited to two outcomes (yes/no; heads/tails)

CATEGORICAL VARIABLE:

A qualitative variable associated with categories.

CASE:

A single value from a data set.

ANOVA: Analysis of variance.

A statistical procedure examining variations between two or more sets of interval or ratio data.

TYPES OF STATISTICAL ANALYSES ANCOVA: Analysis of covariance.

A statistical procedure examining variations between two or more sets of interval or ratio data while controlling for variables that may confound the data.

CORRELATION:

A statistical technique examining the degree of the relationship between statistical variables which tend to be associated in a way not expected by chance.

LINEAR REGRESSION:

A statistical technique for modeling the relationship between a dependent variable (y) and one or more explanatory variables (X).

PAIRED T-TEST:

A statistical test examining the means and variances of two related samples.

REGRESSION:

A statistical test that determines the degree of relationship between variables.

F STATISTIC (F):

A statistical test to determine the relationship between the variances of two or more samples.

KOLMOGOROV-SMIRNOV TEST:

A statistical test of normality, used to determine if the shape of a data set approaches a normal distribution. Used when the number of subjects is greater than fifty.

SHAPIRO-WILK:

A statistical test used to determine if the shape of a data set approaches a normal distribution. Used when the number of subjects is less than fifty.

MANOVA:

A statistical test used to evaluate the relationship between three or more levels of an independent variable and two or more dependent variables.

DOUBLE BLIND:

A study where neither the participants nor the researchers know who is in the control and treatment groups.

SAMPLE:

A subset of the population.

NORMAL DISTRIBUTION: CENTRAL LIMIT THEOREM:

A theorem that states that, given a large enough sample size, the distribution of sample means drawn from a population will approach a normal distribution.

BONFERRONI POST HOC:

A type of post hoc test. - how we adjust for inflated alpha

DEGREES OF FREEDOM (df):

A value derived from the number of subjects and the number of levels within the independent variables within a study.

RELIABILITY CO-EFFICIENT:

A value indicating the repeatability of a data set.

QUANTITATIVE VARIABLE:

A variable associated with numerical values that have a numerical meaning. These variables respond to typical measurement properties.

CONTROL VARIABLE (COVARIATE):

A variable that is controlled by the researcher in an attempt to prevent such a variable from unduly influencing a dependent variable.

ANALYSIS OF COVARIANCE (ANCOVA) · The ANCOVA is used to examine a statistical design that would employ an ANOVA, with an attempt to control for a covariate that may be affecting the results of a simple ANOVA. For example, you may suspect that pre-treatment exercise may affect post-treatment cholesterol levels. The pre-treatment exercise values can be used as covariates. · There needs to be a linear relationship between the covariate and the dependent variable at all levels. · Results will be similar to ANOVA.

ANALYSIS OF COVARIANCE (ANCOVA) · The ANCOVA is used to examine a statistical _________ that would employ an ANOVA, with an attempt to control for a _________ that may be affecting the results of a simple _________. For example, you may suspect that pre-treatment exercise may affect post-treatment cholesterol levels. The pre-treatment exercise values can be used as covariates. · There needs to be a _________ relationship between the _________ and the _________ variable at all levels. · Results will be similar to _________.

ANALYSIS OF VARIANCE (ANOVA) ONE WAY ANOVA: · Used to evaluate the differences between three or more independent means. · This acronym can be confusing. The ANOVA is comparing the variances and the means within each group and the variances between each group. · One dependent variable · One independent variable with three or more levels · Determine if your data are normally distributed o Visual inspection of Q-Q plots or other graphical methods can be done. However, this type of interpretation is difficult. o Use the Kolmogorov-Smirnov test, or the Shapiro-Wilk test if sample sizes are fewer than 50. · Use Levene's test to determine if the data have equality of variances. Example of Levene's test: p. 97

ANALYSIS OF VARIANCE (ANOVA) ONE WAY ANOVA: · Used to evaluate the differences between _________ or more _________ means. · This acronym can be confusing. The ANOVA is comparing the _________ and the _________ within each group and the _________ between each group. · One _________ variable · One _________ variable with _________ or more _________ · Determine if your data are _________ distributed o Visual inspection of Q-Q Plots or other graphical methods can be done. However, this type of interpretation is _________. o Use the _________ test, or the _________ test if sample sizes are fewer than 50. · Use _________ test to determine if the data have equality of variances.

When using multiple data sets: The question researchers try to answer is "given the means and standard deviations for each data set, would we find this amount of difference between data sets more than 5% of the time?" If so, the statistical analyses would produce a p-value greater than .05. -If this divergence of data sets occurs less than 5% of the time, researchers may postulate that this divergence may be due to the unique characteristics of each data set.

When using multiple data sets: The question researchers try to answer is "given the __________ and standard __________ for each data set, would we find this amount of __________ between data sets more than 5% of the time?" If so, the statistical analyses would produce a p-value __________ than .05. -If this divergence of data sets occurs __________ than 5% of the time, researchers may postulate that this divergence may be due to the unique __________ of each data set.

· Independent T-test o Used to compare two independent samples o t = (M1 - M2) / square root of [(SD1^2/n1) + (SD2^2/n2)]

· Independent T-test o Used to compare two _________ samples t= (_________ - _________) / _________ of (_________^2/_________) + (_________^2/_________)
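
The independent t-test formula on this card can be traced step by step in code. The sketch below is illustrative only: the function name `independent_t` and the two small groups are made-up assumptions, and it returns the t statistic without a p value.

```python
import math
import statistics


def independent_t(sample1, sample2):
    """t = (M1 - M2) / sqrt(SD1^2/n1 + SD2^2/n2), as written on the card."""
    m1, m2 = statistics.fmean(sample1), statistics.fmean(sample2)
    # statistics.variance returns the sample variance, i.e. SD^2
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    n1, n2 = len(sample1), len(sample2)
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (m1 - m2) / standard_error


# Hypothetical scores for two independent groups
group_a = [5, 6, 7, 8, 9]   # mean 7, variance 2.5
group_b = [3, 4, 5, 6, 7]   # mean 5, variance 2.5

t = independent_t(group_a, group_b)
print(t)  # (7 - 5) / sqrt(0.5 + 0.5) = 2.0
```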

· Note that a one-tailed test places all of the alpha percentage (.05) at one tail, not split between two tails. So, given the same df, the values needed for significance with a one-tailed test are lower than those associated with a two-tailed test. · Note that given the same alpha and the same tail assignment, as the degrees of freedom increase, the threshold for the t statistic decreases. · SPSS analyses will produce an exact p value. So, tables like this are no longer necessary.

· Note that a one-tailed test places all of the alpha percentage (.05) at _________ tail, not _________ between two tails. So, given the same df, the values needed for significance with a one-tailed test are _________ than those associated with a two-tailed test. · Note that given the same alpha and the same tail assignment, as the degrees of freedom increase, the threshold for the t statistic _________. · SPSS analyses will produce an exact p value. So, tables like this are no longer necessary.

· One sample t-test (Z test): o Comparison between a population and sample. o Not found very often in the literature because population data are not readily available. o Difference between the z-test and t-test: § t-statistic uses the sample standard deviation to estimate standard error § z-statistic uses the population standard deviation: (Mean - population mean) / SD

· One sample t-test (_________): o Comparison between a _________ and _________. o Not found very often in the literature because _________ data is not readily available. o Difference between the z-test and t-test: § t-statistic uses standard _________ to estimate standard _________ § z-statistic uses the _________ standard _________ _________ - population _________ / _________

· The ANOVA Table is used to determine if the regression line that has been determined as the best fit is useful (statistically significant) when predicting the dependent variable.

· The ANOVA Table is used to determine if the _________ line that has been determined as the best fit is useful (statistically _________) when predicting the _________ variable.

· The Coefficients Table provides the necessary values to construct a regression equation. · The linear regression equation: Y' = bX + a · Y' = the dependent variable (predicted score) · b = Beta (the slope of the line) · a = the Y intercept. · X = explanatory variable ex: 107

· The Coefficients Table provides the necessary _________ to construct a _________ equation. · The linear regression equation: _________ · Y' = the _________ variable (predicted score) · B = _________ (the _________ of the line) · a = the Y _________. · X = _________ variable
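
To make the regression equation Y' = bX + a concrete, here is a minimal least-squares sketch. It assumes a single explanatory variable; the helper `fit_line` and the data points are hypothetical and were chosen so the answer is easy to check by hand.

```python
import statistics


def fit_line(x, y):
    """Return (b, a): the least-squares slope and Y intercept for Y' = bX + a."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope of the line
    a = mean_y - b * mean_x    # Y intercept
    return b, a


# Hypothetical data that lie exactly on y = 2x + 1
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

b, a = fit_line(x, y)
predicted = b * 6 + a  # predicted score Y' for X = 6
print(b, a, predicted)
```

In a statistics package the same two numbers would be read from the Coefficients Table rather than computed by hand.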

· Dependent T-Test

· Used to compare two related samples

·Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value: A diagnostic test can have four possible outcomes: o True positive o False positive o True negative o False negative

·Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value: A diagnostic test can have four possible outcomes: o True _________ o False _________ o True _________ o False _________

"Suppose 0.8% of women who get mammograms have breast cancer (8/1000). In (approximately) 90% of women with breast cancer, the mammogram will correctly detect it. (That's the statistical power of the test. This is an estimate, since it's hard to tell how many cancers are missed if we don't know they're there.) However, among women with no breast cancer at all, about 7% will get a positive reading on the mammogram, leading to further tests and biopsies and so on. If you get a positive mammogram result, what are the chances you have breast cancer? The answer is 9%. This is referred to as the positive predictive value. If you administer questions like this one to statistics students and scientific methodology instructors, more than a third fail. If you ask doctors, two thirds fail. They erroneously conclude that a p<0.05 result implies a 95% chance that the result is true - but as you can see in these examples, the likelihood of a positive result being true depends on what proportion of hypotheses tested are true. And we are very fortunate that only a small proportion of women have breast cancer at any given time." - calc info behind this p. 59

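The 9% figure in the mammogram passage follows from Bayes' rule. The sketch below only re-does that arithmetic with the numbers quoted in the passage (0.8% prevalence, 90% detection, 7% false positives); the function name is an illustrative assumption.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Share of positive test results that are true positives."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Numbers quoted in the passage
ppv = positive_predictive_value(prevalence=0.008, sensitivity=0.90, false_positive_rate=0.07)
print(f"{ppv:.1%}")  # a little over 9%
```

Per 1000 women: about 7 true positives (0.9 x 8) against about 69 false positives (0.07 x 992), so only roughly 7 of every 77 positive results reflect cancer.
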
"The more mammograms a woman has, the more likely she is to have a false positive result that will require follow-up tests. Studies have shown the chances of having a false positive result after 10 yearly mammograms are about 50 to 60 percent. Getting a false positive result can cause fear and worry (and result in more testing). However, this does not outweigh the benefit of mammography for most women. The goal of mammography is to find as many cancers as possible, not to avoid false positive results."

"The more mammograms a woman has, the more likely she is to have a ___________ positive result that will require ___________ tests. Studies have shown the chances of having a false positive result after 10 yearly mammograms are about _____ to ______ percent. Getting a ___________ positive result can cause fear and worry (and result in more testing). However, this does not ___________ the benefit of mammography for most women. The goal of mammography is to find as many cancers as possible, not to ___________ false positive results."

CORRELATED T-TEST (Paired sample, repeated measures)

Used to determine whether the mean difference between paired observations is statistically different from zero.

(Probability sampling techniques) Some of the more common techniques include the following: Convenience sampling (Non-Probability Sampling) occurs when the samples from a population are selected based on convenience without attention to randomization. Cluster sampling occurs when the population is divided into clusters or groups. Clusters are then randomly selected. Random sample occurs when members of a population have an equal chance at being selected. Random number generators are often used to select subjects. Stratified random sampling is similar to cluster sampling. A population is stratified by some criteria. Random selection from each stratum is then conducted. Systematic sampling refers to an attempt at random sampling where an ordered system of selection is determined. Voluntary response sample occurs when members of a population are given the opportunity to become subjects. Snowball Sampling occurs when the researcher starts with a subject or a group of subjects who are asked to recommend other potential subjects who possess the necessary qualifications to serve as a subject.

(Probability sampling techniques) Some of the more common techniques include the following: __________ sampling (Non-__________ Sampling) occurs when the samples from a population are selected based on __________ without attention to __________. __________ sampling occurs when the population is divided into __________ or __________. Clusters are then __________ selected. __________ sample occurs when members of a population have an __________ chance at being selected. Random number __________ are often used to __________ subjects. __________ __________ sampling is similar to cluster sampling. A population is __________ by some criteria. __________ selection from each __________ is then conducted. __________ sampling refers to an attempt at __________ sampling where an __________ system of selection is determined. __________ __________ sample occurs when members of a population are given the __________ to become subjects. __________ Sampling occurs when the researcher starts with a __________ or a __________ of subjects who are asked to __________ other potential subjects who possess the necessary __________ to serve as a subject.
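
Four of the sampling techniques listed on this card can be sketched with Python's `random` module. Everything concrete here is an assumption made for illustration: the population of 100 subject IDs, the two strata, the cluster size of 20, and the sample sizes.

```python
import random

rng = random.Random(42)               # seeded generator so the sketch is repeatable
population = list(range(1, 101))      # hypothetical subject IDs 1..100

# Random sample: every member has an equal chance of being selected.
simple = rng.sample(population, 10)

# Systematic sampling: random starting point, then every k-th member.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified random sampling: stratify by some criterion,
# then select randomly from each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [s for members in strata.values() for s in rng.sample(members, 5)]

# Cluster sampling: divide the population into clusters,
# then randomly select whole clusters.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [s for cluster in chosen_clusters for s in cluster]

print(len(simple), len(systematic), len(stratified), len(cluster_sample))
```

Convenience, voluntary response, and snowball samples are not shown because they do not involve random selection.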

-we have statistical significance when the p-value is less than the alpha level set by the researcher. -We also know that the p-value only addresses the probability of finding such a sample as it relates to the population distribution. -If our p-value is less than .05, we can conclude that given a null hypothesis that is true, we would only find such a sample data set less than five percent of the time. -We also know that the likelihood of finding statistical significance increases as our sample size increases; the larger the sample, the smaller the measures of central tendency between data sets need to be to reach statistical significance.

-we have statistical significance when the p-value is _________ than the alpha level set by the researcher. -We also know that the p-value only addresses the probability of _________ such a sample as it relates to the _________ distribution. -If our p-value is less than .05, we can conclude that given a null hypothesis that is _________, we would only find such a _________ data set less than five percent of the time. -We also know that the likelihood of finding statistical significance increases as our sample size _________; the _________ the sample, the _________ the measures of central tendency between data sets need to be to reach statistical significance.

MANCOVA:

A MANOVA design that attempts to control confounding variables.

RELATIVE RISK:

Comparison of the probability of an event occurring in different groups. (The probability of drawing an ace out of a deck of cards is 4/52.)
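
Relative risk and odds ratio are easy to confuse, so a side-by-side calculation may help. The counts below (20 of 100 exposed and 10 of 100 unexposed people developing the outcome) are hypothetical, and the function names are illustrative.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Ratio of the probability of the event in each group."""
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


def odds_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Ratio of the odds (events : non-events) of the event in each group."""
    odds_exposed = exposed_events / (exposed_total - exposed_events)
    odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
    return odds_exposed / odds_unexposed


# Hypothetical 2x2 table
rr = relative_risk(20, 100, 10, 100)        # 0.20 / 0.10
odds_ratio_value = odds_ratio(20, 100, 10, 100)  # (20/80) / (10/90)
print(rr, odds_ratio_value)

# The card examples: probability of an ace is 4/52, odds of an ace are 4/48.
ace_probability = 4 / 52
ace_odds = 4 / 48
```

With these counts the relative risk is 2.0 while the odds ratio is 2.25; the two measures drift further apart as the event becomes more common.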

ANOVA Where: F = ANOVA Coefficient = MST / MSE · MST = Mean sum of squares due to treatment · MSE = Mean sum of squares due to error · df = With this particular experimental design, there are three groups that result in two degrees of freedom. The between groups df refer to the number of data sets - 1. The within groups degrees of freedom is based on a formula using the number of subjects. · These results indicate the probability of obtaining the observed F-value if the null hypothesis is true. The p-value exceeds .05. Therefore, we do not have reason to suspect mean differences other than by chance.

ANOVA Where: _________= ANOVA Coefficient = _________ / _________ MST = Mean sum of _________ due to _________ MSE = Mean sum of _________ due to _________. df = With this particular experimental design, there are _________ groups that result in _________ degrees of freedom. The between groups df refer to the _________ of data sets - 1. The within groups degrees of freedom is based on a formula using the number of _________. · These results indicate the probability of obtaining the observed _________ if the null hypothesis is _________. The p-value exceeds .05. Therefore, we do not have reason to suspect mean _________ other than by chance
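
The F = MST / MSE calculation can be followed on a tiny data set. This is a sketch, not the textbook's example: the three groups are made up, the function name `one_way_anova` is illustrative, and the within-groups df is taken as the number of subjects minus the number of groups.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)

    # Sum of squares due to treatment (between groups)
    ss_treatment = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares due to error (within groups)
    ss_error = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1               # number of data sets - 1
    df_within = len(all_values) - len(groups)  # number of subjects - number of groups

    mst = ss_treatment / df_between
    mse = ss_error / df_within
    return mst / mse, df_between, df_within


# Three hypothetical groups of three subjects each
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_between, df_within = one_way_anova(groups)
print(f_value, df_between, df_within)  # MST = 6/2 = 3, MSE = 6/6 = 1
```

The p value that a statistics package reports is the probability of an F at least this large under the null hypothesis, given these two df values.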

POPULATION:

All persons or objects meeting specific parameters established by the researcher.

ONE-TAILED TEST:

Alpha level is applied to only one tail of the normal distribution. The researcher is expecting any differences to occur in only one direction.

LEVENE'S TEST:

An F-test used to determine if the variances from data sets are significantly different. -test for homogeneity or equality of variance (including ANOVA)

BONFERRONI ADJUSTMENT:

An adjustment to the Alpha level due to multiple statistical tests applied to the same sample. The alpha level is divided by the number of statistical analyses being conducted.
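
The Bonferroni adjustment is a single division, shown here as a minimal sketch. The five p values are hypothetical and the function name is illustrative.

```python
def bonferroni_alpha(alpha, n_tests):
    """Divide the alpha level by the number of statistical analyses conducted."""
    return alpha / n_tests


adjusted_alpha = bonferroni_alpha(0.05, 5)   # .05 / 5 = .01

# Hypothetical p values from five tests on the same sample
p_values = [0.004, 0.020, 0.030, 0.008, 0.049]
significant = [p < adjusted_alpha for p in p_values]
print(adjusted_alpha, significant)
```

All five p values are below .05, but only two remain significant once the alpha level is adjusted for multiple tests.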

RANDOM:

An event or pattern in which all outcomes are equally likely.

NORMAL DISTRIBUTION: -SKEWNESS:

Any deviation in lateral symmetry from a normal distribution.

NORMAL DISTRIBUTION: -KURTOSIS:

Any deviation in the flatness or peakedness from a normal distribution.

As a general rule of thumb, we are looking for a relative risk of 3 or more [before accepting a paper for publication], particularly if it is biologically implausible or if it's a brand new finding. -If you see a 10-fold relative risk and it's replicated and it's a good study with biological backup, like we have with cigarettes and lung cancer, you can draw a strong inference. If it's a 1.5 relative risk, and it's only one study and even a very good one, you scratch your chin and say maybe. -Taubes described Harvard epidemiologist Dimitrios Trichopoulos as suggesting that a study should show a four-fold increased risk, and the late Sir Richard Doll of Oxford University as suggesting that a single epidemiologic study would not be persuasive unless the lower limit of its 95% confidence interval excluded 3.0. -Breslow and Day, two respected cancer researchers, noted in a publication of the World Health Organization, that: "[r]elative risks of less than 2.0 may readily reflect some unperceived bias or confounding factor, those over 5.0 are unlikely to do so." -The caveat makes sense, but it clearly was never intended to be some sort of bright-line rule for people too lazy to look at the actual studies and data. Unfortunately, not all epidemiologists are as capable as Breslow and Day, and there are plenty of examples of spurious RR > 5, arising from biased or confounded studies. -Sir Richard Doll and Sir Richard Peto expressed a similarly skeptical view about RR < 2, in assessing the causality of associations: "when relative risk lies between 1 and 2 ... problems of interpretation may become acute, and it may be extremely difficult to disentangle the various contributions of biased information, confounding of two or more factors, and cause and effect."

As a general rule of thumb, we are looking for a relative risk of ________ or more [before ________ a paper for publication], particularly if it is biologically implausible or if it's a brand new finding. -If you see a 10-fold relative risk and it's replicated and it's a good study with biological backup, like we have with cigarettes and lung cancer, you can draw a ________ inference. If it's a 1.5 relative risk, and it's only one study and even a very good one, you scratch your chin and say ________. -Taubes described Harvard epidemiologist Dimitrios Trichopoulos as suggesting that a study should show a ________-fold increased risk, and the late Sir Richard Doll of Oxford University as suggesting that a ________ epidemiologic study would not be persuasive unless the lower limit of its 95% confidence interval excluded ________. -Breslow and Day, two respected cancer researchers, noted in a publication of the World Health Organization, that: "[r]elative risks of less than ________ may readily reflect some unperceived bias or confounding factor, those over ________ are unlikely to do so." -The caveat makes sense, but it clearly was never intended to be some sort of bright-line rule for people too lazy to look at the actual studies and data. Unfortunately, not all epidemiologists are as capable as Breslow and Day, and there are plenty of examples of spurious RR > 5, arising from biased or confounded studies. -Sir Richard Doll and Sir Richard Peto expressed a similarly skeptical view about RR < 2, in assessing the causality of associations: "when relative risk lies between 1 and 2 ... problems of interpretation may become acute, and it may be extremely difficult to disentangle the various contributions of ________ information, confounding of two or more factors, and cause and effect."

Because of the many factors that can adversely affect the statistical outcome of a research study, there is always a degree of chance that researchers take. -Researchers decide on an alpha level prior to data analysis. In most cases, alpha is set at .05 (p<.05). Again, note that in order to declare statistical significance, the p value must be less than the alpha value. -This alpha value indicates the maximum threshold of probability of obtaining a particular set of data that the researcher is willing to accept, given the null hypothesis is correct.

Because of the many factors that can adversely affect the statistical outcome of a research study, there is always a degree of __________ that researchers take. -Researchers decide on an __________ level prior to data analysis. In most cases, alpha is set at _____ (p<._____). Again, note that in order to declare statistical significance, the __________ value must be less than the __________ value. -This alpha value indicates the __________ threshold of probability of obtaining a __________ set of data that the researcher is willing to accept, given the null hypothesis is __________.

Binomial distribution Bernoulli Trial: (one of two outcomes) Using a random number generator, a data set of 100 coin flips was recorded: Coin flip (1 = Heads; 2 = Tails). Notice the patterns of heads. At one point heads occurred nine times in a row. Is the coin biased? We can calculate a z score and, using a z table, determine if 54/100 heads is statistically significant at the .05 level. Null Hypothesis (Ho) = 50/50 chance. Observed: 54/100 = .54. Variance of the proportion: p(1-p)/n = .5(.5)/100 = .0025. sd = square root of variance = .05. z = (proportion - prob.)/sd = (.54 - .5)/.05 = 0.80. We compare this to the known quantity of a two-tailed z for significance at the .05 level (1.96). z does not equal or exceed 1.96, so we fail to reject (accept) the null.

Binomial distribution _________ Trial: (one of two outcomes) Using a random number generator, a data set of 100 coin flips was recorded: Coin flip (1 = Heads; 2 = Tails). Notice the patterns of heads. At one point heads occurred nine times in a row. Is the coin biased? We can calculate a z score and, using a z table, determine if 54/100 heads is statistically significant at the .05 level. Null Hypothesis (Ho) = 50/50 chance. Observed: 54/100 = .54. Variance of the proportion: _________ = .5(.5)/100 = .0025. sd = square root of _________ = .05. z = (_________ - _________)/_________ = (.54 - .5)/.05 = 0.80. We compare this to the known quantity of a two-tailed z for significance at the .05 level (1.96). z does not equal or exceed 1.96, so we _________ the null.
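
The coin-flip arithmetic on this card can be reproduced with the standard library. The sketch below assumes the normal approximation the card itself uses; the function name `proportion_z` is illustrative, and the two-tailed p value is an extra that the card does not compute.

```python
import math
from statistics import NormalDist


def proportion_z(successes, n, p_null):
    """z = (observed proportion - null probability) / sd of the proportion."""
    observed = successes / n
    sd = math.sqrt(p_null * (1 - p_null) / n)   # sqrt(.5 * .5 / 100) = .05
    return (observed - p_null) / sd


z = proportion_z(54, 100, 0.5)               # (.54 - .5) / .05 = 0.80
critical = NormalDist().inv_cdf(0.975)       # two-tailed cutoff at the .05 level, about 1.96
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(round(z, 2), round(critical, 2), round(p_two_tailed, 3))
print("reject null" if abs(z) >= critical else "fail to reject null")
```

Because 0.80 is well below 1.96, 54 heads in 100 flips gives no reason to suspect a biased coin.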

CHI SQUARE (X2) The two most common chi square tests (X2) are: X2 goodness of fit and X2 independence (association). The Chi-square statistic (X2) is used to determine if distributions of categorical variables differ from one another. P-values are calculated. Must have a minimum of five responses per cell to run a Chi-square test. Fisher's test is used if a cell size is less than 5. Reported as the chi-square statistic (χ2), for example: χ2(2) = 12.580, p = .001.

CHI SQUARE (X2) The two most common chi square tests (X2) are: _________ _________ The Chi-square statistic (X2) is used to determine if distributions of _________ variables _________ from one another. _________ are calculated. Must have a minimum of _________ responses per _________ to run a Chi-square test. Fisher's test is used if a cell size is _________ than ___. Reported as the chi-square statistic (χ2), for example: χ2(2) = 12.580, p = .001.

CORRELATION (r) Correlation tests are used to determine how strong of a relationship exists between two sets of data. The data need to be continuous. The relationship can be positive or negative. Values will be between -1 and +1. Correlations do not indicate causation. Pearson correlation is a parametric test. - use numbers, more realistic Spearman correlation is a nonparametric test. - rank order

CORRELATION (____) Correlation tests are used to determine how _______ of a _____________ exists between two sets of data. The data need to be _______. The relationship can be _______ or _______. Values will be between ____ and ____. Correlations do not indicate _______. Pearson correlation is a _______ test. - use numbers, more _______ Spearman correlation is a _______ test. - _______ order
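
The difference between Pearson (actual numbers) and Spearman (rank order) shows up clearly on a curved but perfectly increasing relationship. The sketch below uses made-up data; the helper names are illustrative, and the simple `ranks` function assumes there are no tied values.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation computed from the raw values."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest) upward; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank orders."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y always rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Pearson r is high but below 1 because the points do not fall on a straight line, while Spearman rho is exactly 1 because the rank order agrees perfectly.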

Calculating Effect Size · To calculate an effect size (d, or Cohen's d) for a paired-samples t-test: divide the mean difference by the standard deviation of the difference. d = M / sd

Calculating Effect Size · To calculate an effect size ( _________) for a paired-samples t-test: divide the _________ difference by the standard _________ of the _________. d= ___ / ______
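
For the paired-samples case, d = M / sd uses the mean and standard deviation of the difference scores. A minimal sketch with hypothetical before and after measurements (the function name is illustrative):

```python
import statistics


def paired_cohens_d(before, after):
    """Mean of the paired differences divided by their standard deviation."""
    differences = [a - b for b, a in zip(before, after)]
    return statistics.fmean(differences) / statistics.stdev(differences)


# Hypothetical repeated measures on four subjects
before = [10, 12, 14, 16]
after = [12, 13, 17, 18]    # differences: 2, 1, 3, 2

d_paired = paired_cohens_d(before, after)
print(round(d_paired, 2))   # mean difference 2 divided by sd of about 0.82
```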

Calculating and reporting an effect size · As previously stated, p values do not tell us about the size of any effect. · Another way to look at effect size is to think about whether or not the significant differences found should be considered as "clinically different". If a drug can reduce systolic blood pressure and the p value associated with the study is less than .05, we still need to consider if the reduction is clinically significant. This is a judgment call. · In an effort to help the researcher make a judgment about the size of any effect, Cohen's d is often calculated. · Cohen's d = the mean difference between the groups divided by the pooled standard deviation. · Effect sizes are most commonly interpreted as follows: 0.2 = small, 0.5 = medium, 0.8 = large. Cohen's d is only used when the researcher has determined that homogeneity of variance is present. Effect sizes are subject-specific and therefore must be viewed with caution.

Calculating and reporting an effect size · As previously stated, p values do not tell us about the size of any _________. · Another way to look at effect size is to think about whether or not the significant differences found should be considered as "_________ different". If a drug can reduce systolic blood pressure and the p value associated with the study is less than .05, we still need to consider if the reduction is _________ significant. This is a _________ call. · In an effort to help the researcher make a judgment about the size of any effect, _________ is often calculated. · Cohen's d = the mean _________ between the groups _________ by the _________ standard deviation. · Effect sizes are most commonly interpreted as follows: Effect Size Strength _________ small _________ medium _________ large Cohen's d is only used when the researcher has determined that homogeneity of variance is _________. Effect sizes are _________-specific and therefore must be viewed with _________
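
Cohen's d for two independent groups, with the 0.2 / 0.5 / 0.8 interpretation from this card, can be sketched as follows. The data are hypothetical, the function names are illustrative, and the label "below small" for values under 0.2 is an added assumption, not part of the card.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Mean difference between the groups divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_sd


def effect_size_label(d):
    """Apply the card's cutoffs: 0.2 small, 0.5 medium, 0.8 large."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"  # assumed label for values under 0.2


# Hypothetical groups with equal variances (2.5 each)
group_a = [5, 6, 7, 8, 9]
group_b = [3, 4, 5, 6, 7]

d = cohens_d(group_a, group_b)
print(round(d, 2), effect_size_label(d))  # 2 / sqrt(2.5), about 1.26
```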

Chi Square goodness-of-fit test (Pearson's Goodness of Fit Test) The chi-square goodness-of-fit test is a nonparametric test used to determine whether observed sample frequencies differ significantly from known frequencies. The chi-square goodness-of-fit test is used with one dichotomous, nominal or ordinal categorical variable. The chi-square goodness-of-fit test is often used with Likert scale data. The chi-square goodness-of-fit test is comparing observed outcomes to known expected outcomes. Since expected outcomes are known, only observed outcomes are collected. ex: p. 74

Chi Square goodness-of-fit test (_________ Goodness of Fit Test) The chi-square goodness-of-fit test is a _________ test used to determine whether _________ sample frequencies differ significantly from _________ frequencies. The chi-square goodness-of-fit test is used with one _________, nominal or ordinal _________ variable. The chi-square goodness-of-fit test is often used with _________ scale data. The chi-square goodness-of-fit test is comparing _________ outcomes to known _________ outcomes. Since expected outcomes are _________, only observed outcomes are _________.
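
A minimal sketch of the arithmetic behind the goodness-of-fit statistic, assuming the usual Pearson form (the sum of (observed - expected)^2 / expected over the categories). The function name and the counts are made up for illustration and do not come from the course text.

```python
def chi_square_gof(observed, expected):
    """Pearson chi-square statistic for observed vs. known expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical Likert-style counts for three response categories.
observed = [30, 50, 20]   # what was collected
expected = [25, 50, 25]   # known expected frequencies

statistic = chi_square_gof(observed, expected)
degrees_of_freedom = len(observed) - 1

print(statistic, degrees_of_freedom)  # 2.0 2
```

With three categories there are 2 degrees of freedom, and the standard chi-square table value at the .05 level is 5.991, so a statistic of 2.0 would not reach significance.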

Chi-square test for independence (or association). The χ² test for independence is applied when you have two categorical variables from a single population. Example: Do males and females have different preferences when shown an introductory video on skin cancer?

Chi-square test for independence (or association). The χ² test for independence is applied when you have two _________ variables from a _________ population. Example: Do males and females have different _________ when shown an _________ video on skin cancer?

NOMINAL SCALE:

Classification of non-numerical data.

Common Assumptions associated with t-tests · Assumptions: o Random sampling o Interval or ratio data o Normality o Homogeneity of Variance

Common Assumptions associated with t-tests · Assumptions: o _________ sampling o _________ or _________ data o _________ o _________ of Variance

Confidence Intervals: -The 95% confidence interval (CI) indicates the observed range in which 95/100 sample statistic values would fall. -CI also symbolizes cumulative incidence, a biostatistical measure. -The CI is always accompanied by a confidence level, usually the 95% confidence level. This confidence level is similar to selecting .05 as an alpha level. For example, we may find a study reporting a Relative Risk = 1.75 (95% CI = 1.11-2.34). We can interpret this RR as indicative of a 75% increase in risk, and a 95% confidence that the population increase in risk is somewhere between 11% and 134%. -When we are testing the null hypothesis using a t-test, we can also use the CI of the mean difference to determine if we have statistical significance. -If the lower and upper values of the 95% CI do not straddle zero, we can conclude that we have significance at the .05 level. -It is much more common to use p-values instead of CIs to determine statistical significance. However, when dealing with odds ratios and relative risks, the CIs are used in concert with p-values. (See the section on odds ratios and relative risks)

Confidence Intervals: -The 95% confidence interval (CI) indicates the observed __________ in which 95/100 sample statistic values would fall. -CI also symbolizes __________ incidence, a biostatistical measure. -The CI is always accompanied by a confidence _________, usually the 95% confidence level. This confidence level is similar to selecting _______ as an alpha level. For example, we may find a study reporting a Relative Risk = 1.75 (95% CI = 1.11-2.34). We can interpret this RR as indicative of a ______% increase in risk, and a _____% confidence that the population increase in risk is somewhere between ______% and ______%. -When we are testing the null hypothesis using a _________, we can also use the CI of the mean difference to determine if we have _________ significance. -If the lower and upper values of the 95% CI do not straddle _________, we can conclude that we have significance at the .05 level. -It is much more common to use _________ instead of CIs to determine statistical significance. However, when dealing with odds ratios and relative risks, the _________ are used in concert with p-values. (See the section on odds ratios and relative risks)
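
The relative risk example on this card can be worked through in a few lines. This is only a sketch: the helper names are hypothetical, and the mean-difference interval at the end is an invented illustration of the "does the CI straddle zero" check.

```python
def percent_change_from_ratio(ratio):
    """Turn a risk ratio into a percent change in risk (1.75 -> 75.0)."""
    return round((ratio - 1.0) * 100, 1)


def straddles_zero(ci_lower, ci_upper):
    """True when zero lies inside the interval (no significance at that level)."""
    return ci_lower <= 0.0 <= ci_upper


# Values taken from the card: RR = 1.75 (95% CI = 1.11-2.34).
print(percent_change_from_ratio(1.75))  # 75.0
print(percent_change_from_ratio(1.11))  # 11.0
print(percent_change_from_ratio(2.34))  # 134.0

# Hypothetical 95% CIs for a mean difference from a t-test.
print(straddles_zero(0.8, 4.2))    # False, so significant at the .05 level
print(straddles_zero(-1.0, 3.0))   # True, so not significant
```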

Correlation Coefficient A correlation analysis produces a correlation coefficient (r). Coefficient value = strength of association: 0.1 - 0.3 = small correlation; 0.3 - 0.5 = medium/moderate correlation; > 0.5 = large/strong correlation.

Correlation Coefficient A correlation _______ produces a correlation _______ (____). Coefficient value = strength of association: _______ - _______ = small correlation; _______ - _______ = medium/moderate correlation; > _______ = large/strong correlation.

Determining Significance of Skewness and Kurtosis · There is no gold standard for assessing the peakedness and symmetry of a data set. · Using descriptive statistics, a z score for skewness and kurtosis can be calculated by dividing the statistical value by the standard error. · Alphas for skewness and kurtosis assessments are usually set at .01 · Because these calculations lead to standardized values, we can compare these values to our normal distribution curve at the .01 level (2.58). None of the values below exceed 2.58 so we do not have significance. -if the values exceeded +/- 2.58 at the .01 level, we would have significance

Determining Significance of Skewness and Kurtosis · There is no gold standard for assessing the _________ and _________ of a data set. · Using descriptive statistics, a _________ for skewness and kurtosis can be calculated by dividing the _________ value by the standard _________. · Alphas for _________ and _________ assessments are usually set at ._________ · Because these calculations lead to _________ values, we can compare these values to our _________ curve at the .01 level (2.58). None of the values below exceed 2.58 so we _________ significance. -if the values exceeded +/- 2.58 at the .01 level, we _________ significance
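
The z-score check described on this card is a single division followed by a comparison with 2.58. A small sketch, with hypothetical function names and made-up skewness and kurtosis values standing in for a real descriptive statistics table:

```python
def z_from_standard_error(statistic, standard_error):
    """z score = statistic value divided by its standard error."""
    return statistic / standard_error


def is_significant(z, cutoff=2.58):
    """True when |z| exceeds the cutoff (2.58 corresponds to the .01 level)."""
    return abs(z) > cutoff


# Hypothetical values as they might appear in a descriptive statistics table.
skew_z = z_from_standard_error(0.85, 0.25)    # about 3.4
kurt_z = z_from_standard_error(-0.40, 0.50)   # about -0.8

print(is_significant(skew_z))  # True: skewness is significant
print(is_significant(kurt_z))  # False: kurtosis is within +/- 2.58
```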

Do we really need to worry about skewness and kurtosis? · Skewness and kurtosis values are very dependent on the sample size. · The larger the sample, the smaller the skewness and kurtosis values. · Extreme skewness or kurtosis values are red flags when using parametric tests.

Do we really need to worry about skewness and kurtosis? · Skewness and kurtosis values are very dependent on the sample _________. · The _________ the sample, the _________ the skewness and kurtosis values. · Extreme skewness or kurtosis values are red flags when using _________ tests.

Does the regression model fit the data? R is the correlation coefficient. R indicates the correlation strength between the variables. R^2 value = the proportion of variance explained by the independent variable. This value is an indicator of how well your prediction equation will perform. Regression analysis produces an ANOVA. The ANOVA indicates if the predictive power is significant.

Does the regression model fit the data? _________ is the correlation coefficient. R indicates the correlation _________ between the variables. R^2 value = the _________ of variance explained by the _________ variable. This value is an _________ of how well your prediction equation will _________. Regression analysis produces an _________. The _________ indicates if the predictive power is significant.

EFFECT SIZE · Calculating effect size for an ANOVA result is often done by calculating omega squared (ω2), or eta squared (η2). · ω2 = [SSb - (dfb)(MSw)] / (SSt + MSw) · Using the ANOVA printout: o SSb: between groups sum of squares o dfb: between groups degrees of freedom o MSw: within groups mean square o SSt: total sum of squares

EFFECT SIZE · Calculating effect size for an ANOVA result is often done by calculating _________ squared (ω2), or _________ squared (η2). · ω2 = [SSb - (dfb)(MSw)] / (SSt + MSw) · Using the ANOVA printout: o _________: between groups sum of squares o _________: between groups degrees of freedom o _________: within groups mean square o _________: total sum of squares
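
As a rough illustration, omega squared can be computed directly from the four ANOVA printout values named on this card. The function name and the numbers are invented for the example (three groups, so dfb = 2).

```python
def omega_squared(ss_between, df_between, ms_within, ss_total):
    """omega^2 = (SSb - dfb * MSw) / (SSt + MSw), using ANOVA printout values."""
    return (ss_between - df_between * ms_within) / (ss_total + ms_within)


# Hypothetical one-way ANOVA printout:
# SSb = 80, dfb = 2, MSw = 5, SSt = 215
effect = omega_squared(ss_between=80, df_between=2, ms_within=5, ss_total=215)
print(round(effect, 3))  # 0.318
```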

EXAMPLE OF INDEPENDENT T-TEST Continuous variables were analyzed by means of Student's t-test (independent t-test) and the Wilcoxon rank-sum test. Categorical variables were analyzed with the use of Fisher's exact test, and the Holm procedure was used to correct for testing associations with multiple clinical variables. Sensitivity, specificity, and negative and positive predictive values were calculated with the use of established methods. Two-sided P values of less than 0.05 were considered to indicate statistical significance. Confidence intervals for proportions are reported as two-sided exact binomial 95% confidence intervals.

Each statistical test carries with it a chance probability, usually set at five percent (.05). Multiple tests inflate the alpha level. This is due to a percentage chance with each individual test. If a study design requires multiple statistical tests, the alpha level should be adjusted to account for this inflation. -For example, if a researcher examined the effects of three different diets on cholesterol, and ran three t-tests, each t-test would carry with it an alpha. These alphas would be cumulative. A Bonferroni Adjustment (alpha / number of tests) would indicate that in order to reach significance at the .05 level, the p-values would have to be lower than .0167.

Each statistical test carries with it a __________ probability, usually set at __________ percent (.__________). Multiple tests __________ the alpha level. This is due to a percentage __________ with each individual test. If a study design requires multiple statistical tests, the alpha level should be __________ to account for this __________. -For example, if a researcher examined the effects of three different diets on cholesterol, and ran three t-tests, each t-test would carry with it an __________. These alphas would be __________. A Bonferroni Adjustment (__________ / __________ ) would indicate that in order to reach significance at the .05 level, the p-values would have to be lower than .__________
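
The three-diet example can be checked with a short sketch. The Bonferroni division comes straight from the card; the familywise error formula, 1 - (1 - alpha)^k, is an added assumption that treats the tests as independent, and both function names are hypothetical.

```python
def bonferroni_alpha(alpha, number_of_tests):
    """Per-test alpha after a Bonferroni adjustment (alpha / number of tests)."""
    return alpha / number_of_tests


def familywise_error(alpha, number_of_tests):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** number_of_tests


print(round(bonferroni_alpha(0.05, 3), 4))   # 0.0167
print(round(familywise_error(0.05, 3), 4))   # 0.1426
```

Under that independence assumption, three unadjusted tests at .05 carry roughly a 14% chance of at least one false positive, which is the inflation the adjustment is meant to offset.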

Effect Size: -How do we determine if a significant difference is clinically significant? -This measure (d, Cohen's d, D, or delta) is often referred to as the standardized effect size, calculated as the mean of group A minus the mean of group B divided by the pooled standard deviation. Interpretation of effect size has been defined as: 0.2 = small effect 0.5 = medium effect 0.8 = large effect -It would be extremely rare to find two samples that do not have any overlap. An effect size of .8 is an indication of 50% overlap. -The larger the effect size the smaller the overlap; the smaller the effect size the larger the overlap.

Effect Size: -how do we determine if a significant difference is _________ significant -This measure (d, Cohen's d, D, or delta) is often referred to as the standardized _________ size, calculated as the _________ of group ___ minus the _________ of group ___ divided by the pooled _________ deviation. Interpretation of effect size has been defined as: 0.__ = small effect 0.__ = medium effect 0.__ = large effect -It would be extremely rare to find two samples that do not have any _________. An effect size of .8 is an indication of ____% overlap. -the larger the effect size the _________ the overlap; the smaller the effect size the _________ the overlap

Effect size associated with a t-test · Remember that a p-value does not tell us anything about the size of an effect. · To calculate the effect size, divide the mean difference between the groups by the pooled standard deviation. d = |M1 - M2| / S pooled ( || means the absolute value: a negative value becomes a positive value) S pooled = square root of { [s1^2 (n1 - 1) + s2^2 (n2 - 1)] / (n1 + n2 - 2) }

Effect size associated with a t-test · Remember that a p-value does not tell us anything about the _________ of an _________. · To calculate the effect size, divide the _________ difference between the groups by the _________ standard _________. d = _________ / _________ ( || means the _________ value: a negative value becomes a _________ value) S pooled = square root of { [s1^2 (n1 - 1) + s2^2 (n2 - 1)] / (n1 + n2 - 2) }
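
A minimal sketch of the two formulas on this card, the pooled standard deviation and Cohen's d. The function names and the blood pressure figures are hypothetical and chosen so the arithmetic is easy to follow.

```python
import math


def pooled_sd(s1, n1, s2, n2):
    """S pooled = sqrt([s1^2 (n1 - 1) + s2^2 (n2 - 1)] / (n1 + n2 - 2))."""
    numerator = s1 ** 2 * (n1 - 1) + s2 ** 2 * (n2 - 1)
    return math.sqrt(numerator / (n1 + n2 - 2))


def cohens_d(m1, m2, s1, n1, s2, n2):
    """d = |M1 - M2| / S pooled."""
    return abs(m1 - m2) / pooled_sd(s1, n1, s2, n2)


# Hypothetical systolic blood pressure data: two groups of 25, both with SD = 10.
d = cohens_d(m1=130, m2=122, s1=10, n1=25, s2=10, n2=25)
print(d)  # 0.8, a large effect by the 0.2 / 0.5 / 0.8 guideline
```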

Example of a Linear Regression Output · The Model Summary Table indicates the correlation (R). · The R2 value indicates the amount of total variation explained by the independent variable. · Adjusted R2 is an estimate of the explained variance expected in the population.

Example of a Linear Regression Output · The Model Summary Table indicates the _________ (_________). · The R2 value indicates the amount of _________ _________ explained by the _________ variable. · Adjusted R2 is an estimate of the _________ _________ expected in the population.

From a statistical standpoint, Relative Risks (RR) and Odds Ratios (OR) are deemed significant at the .05 level if the RR and/or OR is above 1.0 and the 95% confidence interval (CI) limits are both above 1.0; or, the RR and/or OR is below 1.0 and the 95% confidence interval (CI) limits are both below 1.0. However, in an article titled General Causation and Epidemiological Measures of Risk Size, Nathan Schachtman, Esq. stated that: -The Texas courts have adopted a rule that plaintiffs must offer a statistically significant study, with a risk ratio (RR) greater than two, to show general causation. A RR ≤ 2 can be a strong practical argument against specific causation in many cases.

From a statistical standpoint, Relative Risks (RR) and Odds Ratios (OR) are deemed ________ at the .05 level if the RR and/or OR is above ________ and the 95% confidence interval (CI) limits are both above ________; or, the RR and/or OR is ________ 1.0 and the 95% confidence interval (CI) limits are both ________ 1.0. However, in an article titled General Causation and Epidemiological Measures of Risk Size, Nathan Schachtman, Esq. stated that: -The Texas courts have adopted a rule that plaintiffs must offer a statistically significant study, with a risk ratio (RR) greater than ________, to show general causation. A RR ≤ 2 can be a strong practical argument against specific ________ in many cases.
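
The rule on this card (both CI limits on the same side of 1.0) can be written as a small classifier. This is only a sketch: the function name and labels are hypothetical, and the second and third intervals are made-up values.

```python
def classify_ratio(ci_lower, ci_upper):
    """Classify an RR or OR by where its 95% CI sits relative to 1.0."""
    if ci_lower > 1.0:
        return "significant increase"
    if ci_upper < 1.0:
        return "significant decrease"
    return "not significant"


print(classify_ratio(1.11, 2.34))  # significant increase
print(classify_ratio(0.40, 0.85))  # significant decrease
print(classify_ratio(0.90, 1.60))  # not significant
```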

CONFIDENCE INTERVAL (CI): -95%

Given a sample from a population, this indicates a range in which the population mean is believed to be found. Usually expressed as a ____% .

Homogeneity of Variance: · The equality of variances is sometimes referred to as homogeneity of variance or homoscedasticity. · Levene's test: An F test used to assess the equality of variances between data sets. · A p-value will be produced. · Interpret the p-value against the .05 level. · If the p-value is less than .05 the data are violating the assumption of homogeneity of variance. · Often, statistical outputs will produce p-values to be used when the assumption of homogeneity of variance is met and when the assumption is not met. · In the example in the next table, the Levene's test is significant at the .01 level. Therefore, we conclude that the variances are statistically different. We must then use the row titled Equal Variances Not Assumed when interpreting the t-test.

Homogeneity of Variance: · The equality of variances is sometimes referred to as _________ of variance or homoscedasticity. · _________: An F test used to assess the equality of variances between data sets. · A _________ will be produced. · Interpret the p-value against the .05 level. · If the p-value is _______ than .05 the data are _________ the assumption of homogeneity of variance. · Often, statistical outputs will produce _________ to be used when the assumption of homogeneity of variance is met and when the _________ is not met. · In the example in the next table, the Levene's test is significant at the .01 level. Therefore, we conclude that the variances are statistically _________. We must then use the row titled Equal Variances _________ when interpreting the t-test.

How Do We Know if Assumptions Are Violated? Table: Tests of Normality · The Kolmogorov-Smirnov test and the Shapiro-Wilk test are commonly used to assess normality of the data. · Both tests produce p values. · If any of the p values are less than .01, the data are not normally shaped and the researcher should use the appropriate nonparametric test.

How Do We Know if Assumptions Are Violated? Table: Tests of _________ · The _________ test and the _________ test are commonly used to assess normality of the data. · Both tests produce _________ values. · If any of the p values are less than _________, the data are _________ shaped and the researcher should use the appropriate _________ test.

INDEPENDENT T-TEST · The independent-samples t-test is used to determine if a difference exists between the means of two independent groups on a continuous dependent variable. · This test is also known by a number of different names, including the independent t-test, independent-measures t-test, Student t-test, between-subjects t-test and unpaired t-test. · Assumptions o A single continuous dependent variable that consists of interval or ratio data, but a case has been made to use t-tests with ordinal data. o A single categorical independent variable that consists of two independent groups. o There cannot be any relationship between the participants in each group. Data from group one cannot be influenced by the members in group two, and vice versa. o Homogeneity of Variance assumptions are met. The difference between variances for both populations is statistically insignificant. o Normal distribution assumptions are met. o The two independent samples are equal in size. · An independent-samples t-test will produce a p-value. The p-value is compared to the alpha value, usually .05. · If homogeneity of variance assumptions are not met, use a Welch's t-test (does not require equality of variance). · There are also t-tests to use with unequal sample sizes.

INDEPENDENT T-TEST · The independent-samples t-test is used to determine if a _________ exists between the _________ of two independent groups on a _________ dependent variable. · This test is also known by a number of different names, including the independent t-test, independent-_________ t-test, _________ t-test, _________-_________ t-test and _________ t-test. · Assumptions o A single continuous _________ variable that consists of _________ or _________ data, but a case has been made to use t-tests with _________ data. o A single _________ independent variable that consists of two _________ groups. o There cannot be any _________ between the participants in each group. Data from group one cannot be _________ by the members in group two, and vice versa. o _________ of Variance assumptions are met. The difference between variances for both populations is statistically _________. o Normal distribution _________ are met. o The two independent samples are _________ in size. · An independent-samples t-test will produce a _________. The p-value is compared to the _________ value, usually .05. · If homogeneity of variance assumptions are not met, use a _________ t-test (does not require equality of variance). · There are also t-tests to use with _________ sample sizes.

If a statistical test indicates that the probability of obtaining a particular data set was equal to or greater than the alpha level set by the researcher, the null hypothesis is accepted. If the statistical test indicates that the probability of obtaining a particular data set was less than the alpha level, the null hypothesis is rejected. But we must exercise caution. We cannot conclude that we proved or disproved a hypothesis. If the statistical test indicates that the probability of obtaining a data set was less than the alpha level (i.e. .05), it simply means that the data set would randomly occur less than 5 percent of the time.

If a statistical test indicates that the probability of obtaining a particular data set was equal to or greater than the alpha level set by the researcher, the null hypothesis is __________. If the statistical test indicates that the probability of obtaining a particular data set was less than the alpha level, the null hypothesis is __________. But we must exercise __________. We cannot conclude that we __________ or __________ a hypothesis. If the statistical test indicates that the probability of obtaining a data set was less than the alpha level (i.e. .05), it simply means that the data set would __________ occur __________ than 5 percent of the time.

Important Facts to Remember: The research method does not allow us to determine, without any doubt, that there is or is not a real difference. We can only talk about probabilities associated with differences we find. Researchers always run the risk of false findings. P values simply measure the degree of surprise associated with a finding. P values do not indicate effect size. Alpha levels are arbitrary. Too little data or too much data can produce false findings. Tainted data sets produce tainted results. (Garbage in, garbage out) Inflated alphas may occur with multiple comparisons. Statistical power analyses are useful when determining the size of a data set. But, power calculations are not common. Confounding variables are usually present and uncontrolled. Replications of results are not common. There is widespread misuse of statistical methods and interpretation of results. The Bayesian approach can be difficult to understand.

Important Facts to Remember: The research method does _________ allow us to determine, without any doubt, that there is or is not a real _________. We can only talk about _________ associated with differences we find. Researchers always run the risk of _________ findings. P values simply measure the ___________ of surprise associated with a ___________. P values do not indicate ___________ size. Alpha levels are ___________. Too little data or too much data can produce ___________ findings. Tainted data sets produce ___________ results. (Garbage in, ___________ out) Inflated alphas may occur with ___________ comparisons. Statistical power analyses are useful when determining the ___________ of a data set. But, power calculations are not ___________. Confounding variables are usually present and ___________. Replications of results are not ___________. There is widespread ___________ of statistical methods and interpretation of results. The Bayesian approach can be difficult to ___________.

In Order For Regression to be Useful: Must have a moderate to strong correlation ( .6 to .7) Must have a significant ANOVA.

In Order For Regression to be Useful: Must have a _________ to _________ correlation ( .__ to .__) Must have a _________ ANOVA.

Kruskal-Wallis H Test This is a test used with rank order data stemming from three or more groups. Example: attitudes (Likert Scale) towards salary across three groups (LPNs, RNs and NPs). Realize that if each statement is analyzed, each analysis carries with it an alpha level, usually .05, that will become inflated with each additional statement analysis. Thus, it is important for the survey instrument to not only be valid and reliable, but also be limited to the minimum number of statements needed to address the hypothesis. Likert responses that all address the same question can be summed and the sums treated as interval data. However, this process is complicated and beyond the scope of this text.

Kruskal-Wallis H Test This is a test used with _________ order data stemming from _________ or more groups. Example: attitudes (Likert Scale) towards salary across three groups (LPNs, RNs and NPs). Realize that if each statement is analyzed, each analysis carries with it an _________ level, usually .05, that will become _________ with each additional statement analysis. Thus, it is important for the survey instrument to not only be _________ and reliable, but also be limited to the _________ number of statements needed to address the hypothesis. Likert responses that all address the same question can be summed and the sums treated as _________ data. However, this process is complicated and beyond the scope of this text.

LINEAR REGRESSION · First tests for a correlation between the variables. · Produces a line of best fit. Given a graph of all data points, the regression analysis produces a line that lies as close as possible to all data points. · This line is then used to create a regression equation. · Produces an ANOVA used to determine if the prediction equation is significant. · The equation allows for the prediction of y using x. For example, if we are interested in predicting weight (y) given height (x), the regression equation allows for a particular height to be placed in the equation. · Used for predicting one variable based on one or more existing variables. o Are cholesterol levels explained by cholesterol consumption? o Are triglyceride levels explained by cholesterol consumption? · The value you are predicting is the dependent variable and the value you know is the independent variable. · Determine how much of the variation in the dependent variable is explained by the independent variable. · Often, your goal is not to make predictions, but to determine whether differences in your independent variable can help explain the differences in your dependent variable.

LINEAR REGRESSION · First tests for a _________ between the variables. · Produces a line of best _________. Given a graph of all data points, the _________ analysis produces a line that lies as close as possible to all _________ points. · This line is then used to create a _________ equation. · Produces an _________ used to determine if the prediction equation is _________. · The equation allows for the _________ of ___ using x. For example, if we are interested in predicting weight (y) given height (x), the regression equation allows for a particular height to be placed in the equation. · Used for predicting one _________ based on one or more _________ variables. o Are cholesterol levels explained by cholesterol consumption? o Are triglyceride levels explained by cholesterol consumption? · The value you are predicting is the _________ variable and the value you know is the _________ variable. · Determine how much of the variation in the _________ variable is explained by the _________ variable. · Often, your goal is not to make _________, but to determine whether _________ in your independent variable can help explain the _________ in your dependent variable.
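
To make the height and weight example concrete, here is a sketch of an ordinary least-squares line of best fit with its R^2. The data are invented, the function names are hypothetical, and the ANOVA significance test mentioned on the card is not included.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Proportion of variation in y explained by x."""
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical data: height in inches (x) and weight in pounds (y).
heights = [60, 62, 64, 66, 68]
weights = [120, 130, 135, 150, 155]

slope, intercept = fit_line(heights, weights)
print(slope, intercept)                                      # 4.5 -150.0
print(round(r_squared(heights, weights, slope, intercept), 3))  # 0.976
print(intercept + slope * 65)                                # 142.5, predicted weight at 65 inches
```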

Mann-Whitney U Test (Wilcoxon-Mann-Whitney test) This is a test used with rank order data coming from two groups. Example: Does the amount of exercise, indicated via a Likert Scale, differ based on gender?

Mann-Whitney U Test (Wilcoxon-Mann-Whitney test) This is a test used with _________ order data coming from _________ groups. Example: Does the amount of exercise, indicated via a Likert Scale, differ based on gender?

More power is needed if alpha is set below .05 (p < 0.01 or < 0.001). If the difference between data sets is small, more power is required to detect such a difference. To increase power from 0.80 to 0.90, multiply N by 1.33. To move alpha from α = 0.05 to 0.01 multiply N by 1.5. To move from 0.80 power and α = 0.05 to 0.90 power and α = 0.01, double N. As you can see, raising the power of a study will require significant increases in the sample size. Power is most often set between 0.80-0.90. Specifically, power is an indication of the ability of the experimental design to detect an effect of a specified size. In the examples below, note that the researchers defined the specific effect size, the desired power and the alpha level. Too little power can lead to a Type II error.

More power is needed if alpha is set below ._________ (p < 0.01 or < 0.001). If the difference between data sets is _________, _________ power is required to detect such a difference. To increase power from 0.80 to 0.90, multiply N by _________. To move alpha from α = 0.05 to 0.01 multiply N by _________. To move from 0.80 power and α = 0.05 to 0.90 power and α = 0.01, _________ N. As you can see, raising the power of a study will require significant _________ in the sample size. Power is most often set between _________-_________. Specifically, power is an indication of the ability of the experimental design to _________ an effect of a _________ size. In the examples below, note that the researchers defined the specific effect size, the desired power and the alpha level. Too little power can lead to a _________ error.
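
The "double N" rule of thumb follows from combining the other two multipliers on this card, since 1.33 x 1.5 is about 2. A quick sketch, with a hypothetical starting sample size of 100:

```python
POWER_MULTIPLIER = 1.33   # power 0.80 -> 0.90 (rule of thumb from the card)
ALPHA_MULTIPLIER = 1.5    # alpha 0.05 -> 0.01 (rule of thumb from the card)

combined_multiplier = POWER_MULTIPLIER * ALPHA_MULTIPLIER
print(round(combined_multiplier, 3))  # 1.995, roughly "double N"

starting_n = 100  # hypothetical sample size at 0.80 power and alpha = 0.05
print(round(starting_n * combined_multiplier, 1))  # 199.5, so about 200 subjects
```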

Adjusted OR:

Most ORs are generated through logistic regression. If confounding variables are suspected, the logistic regression can be programmed to treat these variables as covariates. Adjusted ORs have taken into account the confounding variables identified by the researcher.

Most test statistics are reduced to the difference between measures of central tendency divided by the size of these differences due to random error. -Usually this is reflected as the mean difference (numerator) divided by a modified version of variance (denominator). The denominator is also affected by the size of the samples.

Most test statistics are reduced to the _________ between measures of __________ tendency divided by the __________ of these differences due to __________ error. -Usually this is reflected as the __________ difference (__________) divided by a modified version of __________ (__________). The denominator is also affected by the size of the __________.

NNT (Number Needed to Treat): Is it worthwhile to treat? How many patients need to be treated before you achieve a positive result? Also see NNH (Number Needed to Harm). NNT = 1/ARR. ARR (absolute risk reduction) = improvement rate in intervention group - improvement rate in control group

NNT (Number Needed to Treat): Is it worthwhile to treat? How many patients need to be treated before you achieve a positive result? Also see NNH (Number Needed to Harm). NNT = ____/_________. ARR (absolute risk reduction) = _________ rate in _________ group - _________ rate in _________ group
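
A worked example of the NNT arithmetic, assuming the standard definition NNT = 1 / ARR with the rates expressed as proportions. The function name and the improvement rates are hypothetical.

```python
def number_needed_to_treat(intervention_rate, control_rate):
    """NNT = 1 / ARR, where ARR = intervention rate - control rate."""
    arr = intervention_rate - control_rate
    if arr <= 0:
        raise ValueError("the intervention shows no improvement over control")
    return 1 / arr


# Hypothetical trial: 50% improve with treatment, 25% improve without it.
print(number_needed_to_treat(0.50, 0.25))  # 4.0
```

In this invented trial, four patients would need to be treated for one additional patient to improve.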

NON PARAMETRIC ONE-WAY ANOVA Kruskal-Wallis H test · Nonparametric version of a One-way ANOVA · Comparison of medians · Much less information about the data is provided. · Assessment of boxplots can be used as evidence of similar score distributions. p. 102

NON PARAMETRIC ONE-WAY ANOVA _________ H test · Nonparametric version of a One-way ANOVA · Comparison of _________ · Much _________ information about the data is provided. · Assessment of _________ can be used as evidence of similar score distributions.

Ninety five percent of the area under the bell curve is contained within +/- 1.96 standard deviations. Five percent of the remaining area lies outside of +/-1.96 standard deviations, with 2.5 percent of the area residing in each end of the curve (at the tails). So, if our sample mean is converted to a z score, and that z score is equal to +/- 1.96, we can conclude that this z score would only randomly occur 5 percent of the time. Recall that the p-value indicates the probability level for a statistical occurrence. If we compare a sample mean to the population mean and the sample mean is greater than +/-1.96 standard deviations away from the population mean, we can conclude that there was less than a 5% chance that this difference randomly occurred. It is this logic that allows researchers to postulate that something else may have caused the sample mean to be so different. -If the researcher decides that there is little chance that any change from a treatment can result in a movement of the mean in one direction only, the alpha can be loaded on only one tail of the bell curve. This would result in a .05 alpha area depicted in the graph below.

Ninety five percent of the area under the bell curve is contained within +/- __________ standard deviations. __________ percent of the remaining area lies outside of +/-1.96 standard deviations, with 2.5 percent of the area residing in each end of the curve (at the __________). So, if our sample __________ is converted to a __________ , and that z score is equal to +/- 1.96, we can conclude that this z score would only randomly occur ___ percent of the time. Recall that the p-value indicates the __________ level for a __________ occurrence. If we compare a sample __________ to the population __________ and the sample mean is __________ than +/-1.96 standard deviations away from the population mean, we can conclude that there was less than a 5% chance that this difference __________ occurred. It is this logic that allows researchers to postulate that something else may have caused the sample mean to be so __________. -If the researcher decides that there is little __________ that any change from a treatment can result in a movement of the mean in one direction only, the __________ can be loaded on only one _______ of the bell curve. This would result in a .05 alpha area depicted in the graph below.
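
The areas quoted on this card can be checked against the standard normal distribution in Python's standard library (`statistics.NormalDist`, available from Python 3.8). The variable names are just for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area between -1.96 and +1.96 standard deviations.
central_area = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# Area in one tail beyond +1.96.
one_tail_area = 1 - standard_normal.cdf(1.96)

# z cutoff when the whole .05 alpha is loaded on one tail.
one_tailed_cutoff = standard_normal.inv_cdf(0.95)

print(round(central_area, 3))       # 0.95
print(round(one_tail_area, 3))      # 0.025
print(round(one_tailed_cutoff, 3))  # 1.645
```

Loading the full .05 on one tail lowers the cutoff from 1.96 to about 1.645, which is why a one-tailed test reaches significance with a smaller z score.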

Spearman Correlation:

Nonparametric measure of statistical dependence between two variables.

ONE-WAY MANOVA The one-way multivariate analysis of variance (MANOVA) is an extension of the ANOVA with the addition of at least one more dependent variable, for a total of two or more dependent variables. Reporting a significant MANOVA result: F(3, 68) = 20.554, p < .001; partial η² = .627 A significant MANOVA result will be followed by univariate ANOVAs. For any univariate ANOVAs that are significant, appropriate post hoc tests will follow (Tukey, Bonferroni, etc.) Assumptions are numerous and beyond the scope of this test.

ONE-WAY MANOVA The one-way _________ analysis of variance (MANOVA) is an extension of the ANOVA with the addition of another _________ variable, for a total of two _________ variables. Reporting a significant MANOVA result: F(3, 68) = 20.554, p < .001; partial η2 = .627 A significant MANOVA result will be followed by _________ ANOVAs. For any univariate ANOVAs that are significant, appropriate_________ will follow (Tukey, Bonferroni, etc.) Assumptions are numerous and beyond the scope of this test.

Other Ways To Test For Normal Distributions: · Histograms: It is acceptable to simply eyeball histograms generated from the descriptive statistics. If the data approach a normal distribution shape, the researcher can assume that the assumption has been met. However, this type of casual evaluation should be avoided. · Q-Q Plots: A Q-Q plot will produce a diagonal line with the data points dispersed along this line. The closer the data points are to the line, the more normal the distribution. Again, this is an eyeball test. · Shapiro-Wilk Test: This test is designed to evaluate the shape of the data. A p-value will be produced for each set of data. Interpret the p-value against the .05 level. If the p-value is less than .05, the data are not normally shaped. Nonparametric tests would then be used. The Shapiro-Wilk test is used with small sample sizes, usually less than N=50. (See Table below) · Kolmogorov-Smirnov Test: This test is designed to evaluate the shape of the data. A p-value will be produced for each set of data. Interpret the p-value against the .05 level. If the p-value is less than .05, the data are not normally shaped. Nonparametric tests would then be used. The Kolmogorov-Smirnov test is used with large samples of N=50 or more. · Skewness and Kurtosis z-scores: Included in the descriptive statistics table are skewness and kurtosis values and their standard error values. Simply divide the value by the standard error to produce a z-score. (See Table below) Most sources indicate a statistical significance level be set at .01 (z-score of ±2.58). If the z-score is within ±2.58, the data are treated as normally distributed. · If normality is violated, researchers sometimes transform the data. But in most cases, a non-parametric test is used.

Other Ways To Test For Normal Distributions: · _________: It is acceptable to simply eyeball these generated from the descriptive statistics. If the data approach a normal distribution shape, the researcher can assume that the assumption has been met. However, this type of casual evaluation should be avoided. · _________: will produce a diagonal line with the data points dispersed along this line. The closer the data points to the line, the more normal the distribution. Again, this is an eyeball test. · _________: This test is designed to evaluate the shape of the data. A p-value will be produced for each set of data. Interpret the p-value against the .05 level. If the p-value is less than .05 the data are not normally shaped. Nonparametric tests would then be used. This test is used with small sample sizes, usually less than N=50. (See Table below) · _________:This test is designed to evaluate the shape of the data. A p-value will be produced for each set of data. Interpret the p-value against the .05 level. If the p-value is less than .05 the data are not normally shaped. Nonparametric tests would then be used. This is used with large samples of N=50 · _________: Included in the descriptive statistics table are these values and their stand error values. Simply divide the _________ by the standard _________ to produce a z-score. (See Table below) Most sources indicate a statistical significance level be set at .01 (z-score of ±2.58). If the z-score is within ±2.58, the data are _________ distributed. · If normality is violated researchers sometimes transform the data. But in most cases, a non-parametric test is used.

Pearson Correlation SPSS Output The Kolmogorov-Smirnov and Shapiro-Wilk tests are used to determine if the data sets have a normal shape. The result of the Shapiro-Wilk test is most commonly used. Examine the Sig. column. These values indicate the p values. Note that all p values are greater than .05, indicating no significant deviation from normal. p. 71

Pearson Correlation SPSS Output The _______ and _______ tests are used to determine if the data sets have a normal shape. The result of the _______ test is most commonly used. Examine the Sig. column. These values indicate the _______. Note that if all p values are greater than ._______, indicate no _______ deviation from normal.

Pearson's Chi-square Test for Independence Example: A physician's office staff is trying to decide if male and female patients differ in their choice between two instructional videos. Ho: The proportion of patients that select each video is independent of gender. Chi Square Formula: for each cell, (Observed frequency - Expected frequency)^2 / Expected frequency, summed over all cells: $\chi^2 = \sum (O - E)^2 / E$ If the computed test statistic is large, then the observed and expected values are not close and the model is a poor fit to the data.
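
The chi-square statistic for the video example can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked example: the counts are hypothetical, the function name is made up, and no continuity (Yates) correction is applied.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if row and column variables were independent.
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = male, female; columns = video A, video B.
observed = [[30, 20],
            [20, 30]]

chi2, df = chi_square_independence(observed)
print("chi-square:", chi2, "df:", df)
```

Every expected count in this made-up table is 25, so each cell contributes (5)^2 / 25 = 1 to the statistic. The result is then compared with the critical value for the given degrees of freedom (3.841 for df = 1 at the .05 level).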

Pearson's Chi-square Test for Independence Chi Square Formula: (_________ frequencies - _________ frequencies)^2 / Expected, summed over all cells: $\chi^2 = \sum (O - E)^2 / E$ If the computed test statistic is _________, then the observed and expected values are not close and the model is a _________ fit to the data.

Possible Statistical Outcomes There is no real difference and we find no significant difference. There is no real difference but we find a significant difference. (Type I error) There is a real difference but we fail to find a significant difference. (Type II Error) There is a real difference and we find a significant difference.

Possible Statistical Outcomes There is no real difference and we find _________ significant difference. There is no real difference but we _________ a significant difference. (_________error) There is a real difference but we _________ to find a significant difference. (_________ Error) There is a real difference and we _________ a significant difference.

Power: -The statistical power of a study is an indication of the study design's capability of avoiding false negatives, or making a Type II error, when the researchers are trying to detect a specific effect size. -As we have seen, researchers are trying to create a healthy balance between the possibilities of Type I and Type II errors. -Researchers use the expected variance, the alpha level, and the magnitude of the effect size to determine how many subjects are needed to reach a certain power. -Power is not reported as often as it should be reported. This is in part due to researchers selecting sample sizes based on convenience. -Any study incorporating small sample sizes and failing to report the power of the study should be viewed with caution.

Power: -The statistical power of a study is an indication of the study design's capability of avoiding _________ negatives, or making a _________ error, when the researchers are trying to detect a specific _________ size. -As we have seen, researchers are trying to create a healthy balance between the possibilities of _________ and _________ errors. -Researchers use the expected _________, the _________ level, and the magnitude of the _________ size to determine how many subjects are needed to reach a certain power. -Power is _________ reported as often as it should be reported. This is in part due to researchers selecting sample sizes based on _________. -Any study incorporating small sample sizes and failing to report the power of the study should be viewed with _________.

Problems with Convenience Sampling Convenience sampling is one of the most common types of sampling used in the research process. However, as stated by Mora (2010) "doing statistical testing in a convenience sample is pointless since the assumptions about probability sampling are violated

Problems with Convenience Sampling Convenience sampling is one of the most __________ types of sampling used in the research process. However, as stated by Mora (2010) "doing statistical testing in a convenience sample is __________ since the assumptions about probability sampling are __________

Problems with Null Hypothesis Significance Testing: -Researchers want to know if the null hypothesis is true or false. NHST tells us the probability of obtaining our data set given the null hypothesis is true. Note the subtle difference between the objective and the output. -All data sets are different. As we begin to understand how we use the number of subjects and the number of comparisons to determine the cut off for reaching statistical significance, we will find that given a large enough sample, researchers will eventually reach significance. In order to gain any useful information from NHST, researchers must ensure that: -Randomization has occurred. -Samples are of a reasonable size, neither so small nor so large that they generate meaningless results. -A limited number of variables are being examined. -The selected alpha level takes into consideration the type of research and hypotheses being tested. -P-Values are accompanied by measures of effect size and/or confidence intervals.

Problems with Null Hypothesis Significance Testing: -Researchers want to know if the null hypothesis is __________ or __________. NHST tells us the __________ of __________ our data set given the null hypothesis is __________. Note the subtle difference between the __________ and the __________. -All data sets are different. As we begin to understand how we use the number of subjects and the number of comparisons to determine the cut off for reaching statistical significance, we will find that given a __________ enough sample, researchers will eventually reach __________ in order to gain any useful information from NHST, researchers must assure that: -__________ has occurred. -Samples are of a reasonable size neither too __________, nor too __________ to generate meaningless results. -A __________ number of variables are being examined. -The selected __________ level takes into consideration the type of research and __________ being tested. -P-Values are accompanied by measures of effect __________ and/or __________ intervals.

Problems: Convenience sampling cannot be taken to be representative of the population. Convenience samples do not produce representative results. Convenience samples often contain outliers that can adversely affect statistical testing.

Problems: Convenience sampling cannot be taken to be __________ of the population. Convenience samples do not produce __________ results. Convenience samples often contain __________ that can __________ affect statistical testing.

RELATIVE RISKS AND ODDS RATIOS A great deal of medical research examines the association between a condition and a possible risk factor. Odds Ratios (OR) are reported when the experimental design involves a retrospective approach (case-control). "OR's were developed to deal with case-control studies examining conditions that are rare or slow developing.....where prospective studies would be impractical." Boslaugh, S. (2013) Statistics in a Nutshell. Relative risks (RR) are reported when the experimental design involves a prospective approach (cohort). RRs and ORs are interpreted as follows: o Values from .01 to .99 indicate a decrease in risk or odds. o Values from 1.01 to infinity indicate an increase in risk or odds. o The value of 1.0 indicates no increase or decrease in risk or odds. o An RR or OR of 1.3 indicates a 30% increase in the risk or odds. o An RR or OR of 2.0 indicates a 100% increase in risk or odds. We would say that the risk or odds doubled. o An RR or OR of 0.7 indicates a 30% decrease in risk or odds. 95% confidence intervals (CI) are always reported with RRs and ORs. The 95% confidence interval indicates the range containing the population RR or OR. The researcher is 95% confident in the accuracy of this range. If the range contains the value 1.0, the RR or OR is not significant. What Does the Law Say about Relative Risks? Numerous sources suggest that ORs and RRs should be at least 2.0 before being considered as an indication of a relationship. Use Caution When Interpreting Odds Ratios · When the risk of the condition is very low, an odds ratio will produce a similar value to the RR. · As risk rises, the odds ratio will produce an increasingly exaggerated estimate of the relative risk (further from 1.0 than the RR).
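
The caution about odds ratios can be seen with two small tables. This is a minimal Python sketch under stated assumptions: the cell labels (a = exposed with the condition, b = exposed without, c = unexposed with, d = unexposed without), the function names, and all counts are hypothetical, and the confidence interval uses the common approximation based on the standard error of ln(OR).

```python
import math


def relative_risk(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of the condition in the exposed group divided by odds in the unexposed."""
    return (a * d) / (b * c)


def odds_ratio_ci95(a, b, c, d):
    """Approximate 95% confidence interval built on the ln(OR) scale."""
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio(a, b, c, d))
    return math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)


# Hypothetical counts: (a, b, c, d).
common = (20, 80, 10, 90)    # condition is common: risks of 20% vs 10%
rare = (2, 998, 1, 999)      # condition is rare: risks of 0.2% vs 0.1%

for label, table in (("common", common), ("rare", rare)):
    low, high = odds_ratio_ci95(*table)
    print(label,
          "RR =", round(relative_risk(*table), 3),
          "OR =", round(odds_ratio(*table), 3),
          "95% CI for OR =", (round(low, 3), round(high, 3)))
```

Both tables have a relative risk of 2.0, but the odds ratio sits close to the RR only when the condition is rare. If the printed interval contains 1.0, the OR would not be considered significant.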

RELATIVE RISKS AND ODDS RATIOS A great deal of medical research examines the association between a _________ and a possible _________ factor. Odds Ratios (OR) are reported when the experimental design involves a _________ approach (_________-control). "OR's were developed to deal with _________ studies examining conditions that are _______ or _______ developing.....where prospective studies would be _________." Boslaugh, S. (2013) Statistics in a Nutshell. Relative risks (RR) are reported when the experimental design involves a _________ approach (_________). RRs and ORs are interpreted as follows: o .01 to .99 values indicate a _________ in risk or odds. o 1.01-infinity indicates an _________ in risk or odds. o The value of 1.0 indicates no _________ or _________ in risk or odds. o A RR or OR of 1.3 indicates a _________% _________ in the risk or odds. o An RR or OR of 2.0 indicates a _________% _________ in risk or odds. We would say that the risk or odds _________. o An RR or OR of 0.7 indicates a _________% _________ in risk or odds. If the range contains the value 1.0, the RR or OR is _________. _________ are always reported with RRs and ORs. The 95% confidence interval indicates the _________ containing the _________ RR or OR. The researcher is _________% confident in the accuracy of this _________. What Does the Law Say about Relative Risks? Numerous sources infer that OR's and RR's should be at least _________ before being considered as an _________ of a relationship. Use Caution When Interpreting Odds Ratios · When the relative risk is very _________, an odds ratio will produce a similar value to the _________. · As risk rises, the odds ratio will produce an _________ of the true risk.

RELIABILITY Reliability tests produce a reliability coefficient. The coefficient value (R) will lie between 0-1. **A capital R also symbolizes regression. Common interpretations of reliability coefficients: o .90 and above: Excellent reliability o .80 - .90 Very good for a classroom test o .70 - .80 Good o .60 - .70 Somewhat low o .50 - .60 Low o .50 or below- Questionable reliability -ex. p. 67

RELIABILITY Reliability tests produce a reliability _______. The coefficient value (_______) will lie between _______. **A capital R also symbolizes _______. Common interpretations of reliability coefficients: o .90 and above: _______ reliability o .80 - .90 _______ for a classroom test o .70 - .80 _______ o .60 - .70 Somewhat _______ o .50 - .60 _______ o .50 or below- _______ reliability

ORDINAL SCALE:

Rank ordering without concern for equal distances between ranks.

SAMPLING Sampling is a key component of a research project. If not done properly, research results can become meaningless. Sampling techniques are numerous. Some of the more common techniques include the following:

SAMPLING Sampling is a __________ component of a research project. If not done properly, research results can become __________. Sampling techniques are __________.

STATISTICAL SIGNIFICANCE:

When the p-value is equal to or less than the alpha value.

SURVEYS Prior to using a survey instrument, the validity and reliability of the instrument must be measured. Existing survey instruments should report the validity and reliability associated with the instrument. If the survey instrument is being used for the first time, the initial assessment should be a measure of the validity and reliability of the instrument. A vast majority of surveys use a Likert scale that involves subjects selecting from predetermined choices that are linked to a numerical value. For example, a common Likert scale survey statement associated with student evaluations might read, These numerical values are ordinal in nature. The values represent nothing more than a rank order. Therefore, descriptive statistics, and a non-parametric analysis of the responses, usually a form of Chi Square or a rank based test such as the Mann-Whitney-Wilcoxon (MWW) test or a Kruskal-Wallis test, is the typical statistical protocol.

SURVEYS Prior to using a survey instrument, the _________ and _________ of the instrument must be measured. Existing survey instruments should report the _________ and _________ associated with the instrument. If the survey instrument is being used for the first time, the _________ assessment should be a measure of the validity and reliability of the instrument. A vast majority of surveys use a _________ scale that involves subjects selecting from _________ choices that are linked to a _________ value. For example, a common Likert scale survey statement associated with student evaluations might read, These numerical values are _________ in nature. The values represent nothing more than a _________ order. Therefore, _________ statistics, and a non-parametric analysis of the responses, usually a form of _________ or a rank based test such as the _________ test or a _________ test, is the typical statistical protocol.

TP: true positive
TN: true negative
FP: false positive
FN: false negative
Sensitivity (SENS) = TP / (TP+FN)
Specificity (SPEC) = TN / (FP+TN)
Prevalence = (TP+FN) / (TP+FN+FP+TN)
Predictive value positive = TP / (TP+FP)
Predictive value negative = TN / (FN+TN)
Positive Likelihood Ratio = SENS / (1-SPEC)
Negative Likelihood Ratio = (1-SENS) / SPEC
Pre-test Probability = Prevalence
Pre-test Odds = Pre-test Prob / (1 - Pre-test Prob)
Post-test Odds = Pre-test Odds x Likelihood Ratio
Post-test Probability = Post-test Odds / (1 + Post-test Odds)
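
These formulas chain together, which is easiest to see by running them on one table. The following Python sketch is illustrative only: the function name and the counts are hypothetical, and it assumes specificity is below 1 so the positive likelihood ratio is defined.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Apply the screening-test formulas to one 2x2 table of counts."""
    sens = tp / (tp + fn)
    spec = tn / (fp + tn)
    prevalence = (tp + fn) / (tp + fn + fp + tn)
    ppv = tp / (tp + fp)
    npv = tn / (fn + tn)
    lr_pos = sens / (1 - spec)        # undefined when specificity is exactly 1
    lr_neg = (1 - sens) / spec
    pre_test_odds = prevalence / (1 - prevalence)
    post_test_odds = pre_test_odds * lr_pos
    post_test_prob = post_test_odds / (1 + post_test_odds)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "prevalence": prevalence,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "post_test_prob": post_test_prob,
    }


# Hypothetical screening results for 1000 patients.
metrics = diagnostic_metrics(tp=80, fn=20, fp=90, tn=810)
for name, value in metrics.items():
    print(name, round(value, 3))
```

When the pre-test probability is set to the prevalence in the same table, the post-test probability after a positive result works out to the predictive value positive, which is a useful check on the arithmetic.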

TP: true _________ TN: true _________ FP: false _________ FN: false _________ _________ = TP / (TP+FN) _________ = TN / (FP+TN) _________ = (TP+FN) / (TP+FN+FP+TN) _________ = TP / (TP+FP) _________ = TN / (FN+TN) _________ = SENS / (1-SPEC) _________ = (1-SENS) / SPEC _________ = Prevalence _________= Pre-test Prob / (1 - Pre-test Prob) _________ = Pre-test Odds x Likelihood Ratio _________ = Post-test Odds / (1 + Post-test Odds)

Sensitivity = true positives / (true positives + false negatives) Specificity = true negatives / (true negatives + false positives) Positive Predictive Value = a / (a + b) (true pos / (true pos + false pos)) Negative Predictive Value = d / (c + d) (true neg / (false neg + true neg)) 100% specificity correctly identifies all patients without the disease. 70% specificity correctly reports 70% of patients without the disease as true negatives, but 30% of patients without the disease are incorrectly identified as test positive (false positives). "The lower the pretest probability of a disease, the lower the predictive value of a positive test will be regardless of the sensitivity and specificity."

Sensitivity = true _________/true _________ + false _________ Specificity = true _________ / true _________+ false _________ Positive Predictive Value a/ a+b (true _________/ true _________+false _________) Negative Predictive Value d/c+d (true _________/false _________+ true _________) 100% specificity _________ identifies all patients _________ the disease. 70% specificity _________ reports 70% of patients as true _________, but 30% patients without the disease are _________ identified as test positive (_________ positives). "The lower the pretest _________ of a disease, the lower the _________ value of a positive test will be regardless of the _________ and _________." _________

Sensitivity: Detecting a true positive. (true pos / (true pos + false neg)) Specificity: Detecting a true negative. d / (d + b) (true neg / (true neg + false pos))

Sensitivity: Detecting a true _________ (true _________/ _________+ _________) Specificity: Detecting a true _________ d/d+b (_________/_________+_________)

Some factors affecting power: -alpha -the magnitude of the effect or the true difference -the standard deviations of the distributions -the sample size
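
Each of these factors can be varied in a small calculation. The sketch below is a simplified illustration, assuming a two-tailed one-sample z test with a known standard deviation; the function name and the example numbers are hypothetical and are not taken from the course material.

```python
from statistics import NormalDist


def z_test_power(true_difference, sd, n, alpha=0.05):
    """Power of a two-tailed one-sample z test (known standard deviation)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)        # 1.96 when alpha = .05
    # How many standard errors the true mean sits from the null mean.
    shift = true_difference / (sd / n ** 0.5)
    # Probability that the sample z score lands in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


baseline = z_test_power(true_difference=5, sd=15, n=36)
print("baseline power:", round(baseline, 3))
print("larger sample (n=72):", round(z_test_power(5, 15, 72), 3))
print("larger effect (10):", round(z_test_power(10, 15, 36), 3))
print("more spread (sd=30):", round(z_test_power(5, 30, 36), 3))
print("stricter alpha (.01):", round(z_test_power(5, 15, 36, alpha=0.01), 3))
```

Power rises with a larger sample or a larger true difference and falls with a larger standard deviation or a stricter alpha, matching the four factors listed.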

Some factors affecting power: -_________ -the _________ of the _________ or the true _________ -the standard _________ of the distributions -the sample _________

Statistical Packages There are numerous statistical packages used to examine data sets (SPSS, SAS, R, etc). The use of the Statistical Package for the Social Sciences (SPSS) is commonly found throughout the health sciences literature. The outputs presented in this chapter were generated through the use of SPSS. Note that SPSS outputs label the test statistic value as "Statistic", the degrees of freedom as "df", and the p value as "Sig."

Statistical Packages There are _______ statistical packages used to examine data sets (_______, _______, _______, etc). The use of the Statistical Package for the Social Sciences (SPSS) is commonly found throughout the health sciences _______. The outputs presented in this chapter were generated through the use of SPSS. Note that SPSS outputs label the test statistic value as "_______", the degrees of freedom as "_______", and the p value as "_______."

CONFOUNDMENT:

Statistical confusion caused by a variable(s) that is/are not controlled by the researcher.

POST HOC TESTS:

Statistical tests designed to compare mean differences after a significant analysis of variance (ANOVA) is found.

CATEGORIES OF STATISTICAL TESTS: -PARAMETRIC TESTS:

Statistical tests that are designed to be used with data that are determined to be normally shaped and normally distributed.

CATEGORIES OF STATISTICAL TESTS: -NONPARAMETRIC TESTS:

Statistical tests that are distribution free; data do not have to conform to defined distribution characteristics.

Statistics and probability are not intuitive. We tend to jump to conclusions. We tend to make overly strong conclusions from limited data. We have incorrect intuitive feelings about probability. We see patterns in random data. We don't realize coincidences are common.

Statistics and probability are not _________. We tend to jump to _________. We tend to make overly strong conclusions from _________ data. We have _________ intuitive feelings about probability. We see patterns in _________ data. We don't realize _________ are common.

Strength of association · The chi-square test does not indicate the effect size, or magnitude of the association. · Two measures that do provide measures of effect size are Phi (φ) and Cramer's V. The strength of association results may appear in the literature as: φ = 0.375, p= .02. · Phi and Cramer's V are indications of the strength of the association between the two variables. They are a form of symmetric measure.
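
Both measures are simple rescalings of the chi-square statistic. The following Python sketch uses the usual textbook formulas, φ = sqrt(χ² / N) and V = sqrt(χ² / (N × (k - 1))) where k is the smaller of the number of rows and columns; the function names and the input values are hypothetical.

```python
import math


def phi_coefficient(chi2, n):
    """Phi for a 2x2 table: square root of chi-square divided by sample size."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, rows, cols):
    """Cramer's V for a table of any size; equals phi for a 2x2 table."""
    return math.sqrt(chi2 / (n * (min(rows, cols) - 1)))


# Hypothetical result: chi-square of 4.0 from a 2x2 table of 100 subjects.
phi = phi_coefficient(4.0, 100)
v = cramers_v(4.0, 100, rows=2, cols=2)
print("phi:", round(phi, 3), "Cramer's V:", round(v, 3))
```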

Strength of association · The chi-square test does not indicate the _________ size, or _________ of the association. · Two measures that do provide measures of effect size are _________ (φ) and _________ V. The strength of association results may appear in the literature as: φ = 0.375, p= .02. · Phi and Cramer's V are indications of the _________ of the _________ between the two variables. they're a form of _________ measures

TWO-WAY ANOVA · A two-way ANOVA is used when the experimental design contains two independent variables. · For example, we may be interested in examining critical thinking knowledge differences across gender and alcohol consumption. · The assumptions for a one-way ANOVA apply to the two-way ANOVA. · The SPSS version of a two-way ANOVA will provide: o descriptive statistics o Levene's test results o The result of the simple main effect. The simple main effect (Univariate test) provides p values for the two independent variables · The two-way ANOVA will determine whether either of the two independent variables or their interaction is statistically significant. · The table labeled Tests of Between-Subjects Effects is used to determine significance. · The Levene's test p value of .071 is not significant at the .05 level, so we can assume that the variances are not significantly different. · Therefore, we can proceed with the parametric two-way ANOVA. · The independent variables are: Gender (2) and Alcohol (2). · The dependent variable is critical thinking test score.

TWO-WAY ANOVA · A two-way ANOVA is used when the experimental design contains two _________ variables. · For example, we may be interested in examining critical thinking knowledge differences between gender and alcohol consumption. · The assumptions for a one-way ANOVA apply to the two-way ANOVA · The SPSS version of a two-way ANOVA will provide: o _________ statistics o _________ test results o This result of the simple main effect. The simple main effect (_________ test) provides p values for the two _________ variables · The two-way ANOVA will determine whether either of the two independent variables or their interaction are statistically _________. · The table labeled Tests of _________-Subjects Effects is used to determine _________. · The Levene's test p value of .071 is not significant at the .05 level, so we can assume that the variances are _________ significantly different. · Therefore, we can proceed with the _________ two-way ANOVA. · The independent variables are: Gender (2) and Alcohol (2). · The dependent variable is critical thinking test score.

TYPE I ERROR:

Rejecting the null hypothesis when it should not be rejected. (False Positive)

TYPE II ERROR:

Failing to reject the null hypothesis when it should be rejected. (False Negative)

Test for Homogeneity (Equality) of Variance · The SPSS generated results from a Levene's test for equality of variances are presented below. · The first two columns indicate the Levene's F value and the p value (Sig). A p value of .09 is not significant at the .05 level, so variances are assumed to be equal.

Test for Homogeneity (Equality) of Variance · The SPSS generated results from a _________ test for equality of variances is presented below. · The first two columns indicate the Levene's ____ value and the _________ (_________). A p value of .09 is _________ at the .05 level, so variances are assumed to be _________.

REPEATED MEASURES DESIGN:

Tests designed to evaluate two or more data sets that come from the same subject sample.

Tests of Normality These tests reveal a non-significant deviation from normality. The Pearson (parametric) test is appropriate. Shapiro-Wilk: Kolmogorov-Smirnov: -ex: p. 70

Tests of Normality These tests reveal a _______ deviation from normality. The _______ (parametric) test is appropriate. _______: _______:

Tests of Normality · As discussed in Chapter 6, a common assumption of parametric statistical techniques requires that data sets approach normal distributions. · Skewness and kurtosis values indicate the shape of a data set. · An examination of the skewness and kurtosis values allows for a determination of normality to be made. · Two of the most common tests used to assess normality are the Kolmogorov-Smirnov test (used with 50+ subjects) and the Shapiro-Wilk test (used with fewer than 50 subjects). · The Anderson-Darling test and the Cramér-von Mises test are also seen in the literature. · Often, stem and leaf plots and Q-Q plots are used to assess normality. · Each test has its own strengths and weaknesses. All are useful when interpreting the normality of the data sets. · As seen in the following table, a p value (sig) is produced for each data set. · A significant p-value, usually less than .01, indicates a data set that is not normal in shape. · In the example below, Data Set 1 has been found to be abnormal in shape. Therefore, statistical tests requiring normality cannot be used.

Tests of Normality · As discussed in Chapter 6, a common assumption of parametric statistical techniques requires data sets approach _________ distributions. · _________ and _________ values indicate the shape of a data set. · An examination of the skewness and kurtosis values allows for a determination of _________ to be made. · Two of the most common tests used to assess skewness and kurtosis are the _________ test (used with 50+ subjects), the _________ (used with less than 50 subjects). · The _________ test and the _________ test are also seen in the literature. · Often, stem and leaf plots and QQ plots are used to assess _________. · Each test has its own strengths and weaknesses. All are useful for use when interpreting _________ of the data sets. · As seen in the following table, a p value (_________) is produced for each data set. · A significant p-value, usually less than ._____, indicates a data set that is _________ in shape. · In the example below, Data Set 1 has been found to be abnormal in shape. Therefore, statistical tests requiring normality _________ be used.

TWO-TAILED TEST:

The Alpha level set by the researcher is halved and then applied to both tails of the normal distribution.

The Pearson correlation or the Spearman rank-order correlation Pearson correlation is a parametric test. Spearman correlation is a nonparametric test. Both tests calculate a coefficient, ( ρ ...pronounced "rho") ρ (rho) is a measure of the strength and direction of the association between two variables. 10.29 Example: The admissions committee for an Occupational Therapy program is interested in determining the correlation between GRE scores and GPA.
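
The difference between the two coefficients shows up when both are computed on the same data. The Python sketch below is illustrative only: the GRE and GPA values are invented (they are not the data from Example 10.29), the helper names are hypothetical, and the ranking helper assumes there are no tied values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Invented scores for five applicants.
gre = [300, 310, 320, 305, 330]
gpa = [3.0, 3.4, 3.5, 3.2, 3.9]

print("Pearson:", round(pearson_r(gre, gpa), 3))
print("Spearman:", round(spearman_rho(gre, gpa), 3))
```

Because the two invented lists have exactly the same rank order, the Spearman coefficient reaches 1 while the Pearson coefficient stays slightly below it; Spearman responds only to the ordering, not to the distances between scores.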

The Pearson correlation or the Spearman rank-order correlation Pearson correlation is a _______ test Spearman correlation is a _______ test. Both tests calculate a _______, ( ___ ...pronounced "_______") ____ (rho) is a measure of the _______ and _______ of the association between _______ variables. 10.29 Example: The admissions committee for an Occupational Therapy program is interested in determining the correlation between GRE scores and GPA.

QUARTILE:

The data points contained in an area equal to one quarter of the entire data set.

TREATMENT GROUP:

The group of subjects exposed to the treatment.

CONTROL GROUP:

The group of subjects that are not subjected to any treatment.

EXPERIMENTAL GROUP:

The group of subjects that receive a treatment.

NULL HYPOTHESIS:

The hypothesis that states there is no statistical difference between the data sets.

KRUSKAL-WALLIS TEST:

The non-parametric equivalent to the one-way ANOVA.

The normal curve is sectioned by standard deviations. If we take random samples from a population, the area under the bell curve associated with the sample will be representative of the probabilities associated with the likelihood of obtaining a particular sample mean. For example, 68 percent of the means of random samples of the population will fall within plus or minus one standard deviation of the population mean. -Examination of the percentage of area contained within two standard deviations +/- indicates that 95.4 percent of the sample means exist within this area of the normal curve. -The probability of obtaining a mean value residing outside +/- two standard deviations is only 4.6 percent.
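
The percentages quoted for one and two standard deviations can be reproduced from the standard normal curve. This is a minimal Python sketch; the function name is made up for illustration.

```python
from statistics import NormalDist


def area_within(k: float) -> float:
    """Share of the normal curve lying within +/- k standard deviations."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 1.96, 2):
    inside = area_within(k)
    print(f"+/- {k} SD: {inside:.3%} inside, {1 - inside:.3%} outside")
```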

The normal curve is sectioned by __________ __________. If we take random samples from a population, the area under the bell curve associated with the sample will be representative of the __________ associated with the likelihood of obtaining a particular sample __________. For example, __________ percent of the means of random samples of the population will fall within plus or minus __________ standard deviation of the population mean. -Examination of the percentage of area contained within __________ standard deviations +/- indicates that __________ percent of the sample means exist within this area of the normal curve. -The probability of obtaining a mean value residing __________ +/- two standard deviations is only __________ percent.

NUMBER NEEDED TO HARM (NNH):

The number of individuals that need to be treated before one additional harmful effect occurs.

NUMBER NEEDED TO TREAT (NNT):

The number of individuals that need to be treated before one additional positive benefit occurs.
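NNT and NNH are commonly computed as the reciprocal of the absolute difference in event rates between the treated and untreated groups. The sketch below assumes that convention; the event rates are hypothetical and the function name is an assumption made for illustration.

```python
def number_needed(rate_untreated, rate_treated):
    """Reciprocal of the absolute difference between two event rates (proportions from 0 to 1)."""
    difference = abs(rate_untreated - rate_treated)
    if difference == 0:
        raise ValueError("Event rates are equal, so the number needed is undefined.")
    return 1 / difference

# Hypothetical rates: a bad outcome occurs in 20% of untreated and 15% of treated subjects.
nnt = number_needed(0.20, 0.15)

# Hypothetical rates: a side effect occurs in 2% of untreated and 6% of treated subjects.
nnh = number_needed(0.02, 0.06)

print("NNT:", round(nnt, 1))
print("NNH:", round(nnh, 1))
```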

The p-value is compared to the alpha level set by the researcher. -the p-value does not mean the probability that the null hypothesis is correct. It (the p-value) indicates the probability of obtaining our data, assuming the null is correct -Researchers will reject the null hypothesis because the probability of obtaining the particular data fell below the alpha level. But this does not mean that a real difference was proven. -A significant p value does not tell us anything about the magnitude, importance or practicality of the differences.

The p-value is compared to the __________ level set by the researcher. -the __________ does NOT mean the __________ that the null hypothesis is __________. It (the p-value) indicates the __________ of __________ our data, assuming the __________ is correct -Researchers will __________ the null hypothesis because the probability of obtaining the particular data fell below the __________ level. But this does not mean that a real __________ was proven. -A significant p value does not tell us anything about the __________, __________ or __________ of the differences.

P VALUE:

The probability of attaining the sample data set when the null hypothesis is true. -indicates the degree of extremeness associated with the data set, assuming the null hypothesis is true. -probability value

POWER:

The probability of correctly rejecting the null hypothesis, that is, rejecting it when it is false. Expressed as a value between 0.0 and 1.0.

ALPHA:

The probability of falsely rejecting the null hypothesis given that the null hypothesis is true (Type I error, or false positive).

INCIDENCE:

The probability of occurrence of a factor in a population within a specified period of time.

PREVALENCE:

The proportion of a population possessing a specific characteristic in a given time period.

DEPENDENT VARIABLE:

The variable that is being measured.

INDEPENDENT VARIABLE(S):

The variable(s) that may influence the dependent variables.

There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical models; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research. Published research findings are sometimes refuted by subsequent evidence, with ensuing confusion and disappointment. Refutation and controversy is seen across the range of research designs, from clinical trials and traditional epidemiological studies [1-3] to the most modern molecular research [4, 5]. There is increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims [6-8]. However, this should not be surprising. It can be proven that most claimed research findings are false. Here I will examine the key factors that influence this problem and some corollaries thereof.

There is increasing concern that most current published research findings are ___________. The probability that a research claim is true may depend on study ___________ and bias, the ___________ of other studies on the same question, and, importantly, the ___________ of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be ___________ when the studies conducted in a field are ___________, when effect sizes are ___________; when there is a ___________ number and ___________ preselection of tested relationships; where there is greater ___________ in designs, definitions, outcomes, and analytical models; when there is greater financial and other interest and ___________; and when more teams are involved in a scientific field in chase of statistical ___________. Simulations show that for most study designs and settings, it is more likely for a research claim to be ___________ than ___________. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing ___________. In this essay, I discuss the implications of these problems for the conduct and interpretation of research. Published research findings are sometimes refuted by subsequent evidence, with ensuing confusion and disappointment. Refutation and controversy is seen across the range of research ___________, from clinical trials and traditional epidemiological studies [1-3] to the most modern molecular research [4, 5]. There is increasing concern that in ___________ research, false findings may be the majority or even the vast majority of published research ___________ [6-8]. However, this should not be surprising. It can be proven that most claimed research findings are ___________. Here I will examine the key factors that influence this problem and some corollaries thereof.

Hazard and Hazard-Ratios:

These analyses look at risk over a specific time period.

Cox Proportional Hazard: Positive coefficients indicate a worse prognosis and negative coefficients indicate a better prognosis.

This analysis uses regression to predict a cumulative hazard over a specific time period. This type of regression produces coefficients that relate to hazard. ___________ coefficients indicate a worse prognosis and ___________ coefficients indicate a better prognosis.
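The sign rule for Cox coefficients can be seen by converting a coefficient to a hazard ratio, which is the exponential of the coefficient. The coefficients below are hypothetical and do not come from a fitted model; this is only a sketch of the conversion.

```python
import math

def hazard_ratio(coefficient):
    """Convert a Cox regression coefficient to a hazard ratio."""
    return math.exp(coefficient)

# Hypothetical coefficients for illustration only.
for coef in (0.5, 0.0, -0.5):
    hr = hazard_ratio(coef)
    if hr > 1:
        meaning = "higher hazard (worse prognosis)"
    elif hr < 1:
        meaning = "lower hazard (better prognosis)"
    else:
        meaning = "no change in hazard"
    print(f"coefficient {coef:+.1f} -> hazard ratio {hr:.2f}: {meaning}")
```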

VALIDITY There are several types of validity assessment tools (face, concurrent, construct, etc.). The most common assessment of validity is face validity. Face validity is often simply described. There is not a value associated with validity unless a statistical evaluation, for example, a correlation, is presented as evidence.

VALIDITY There are several types of validity assessment tools (_______, _______, _______, etc.). The most common assessment of validity is _______ validity. Face validity is often simply _______. There is not a _______ associated with validity unless a statistical _______, for example, a _______, is presented as evidence.

INTERVAL/ RATIO SCALE: Ratio data possess an absolute zero value. Interval data do not possess an absolute zero.

Values that express consistent magnitude between each value. _______ data possess an absolute zero value. _______ data do not possess an absolute zero.

CONTINUOUS VARIABLE:

Variables that are on the interval or ratio level.

WHAT TO DO IF T-TEST ASSUMPTIONS ARE NOT MET MANN WHITNEY U TEST: Nonparametric Test for Two Independent Means · Tests the medians. · Mann-Whitney U test holds the following assumptions: o The populations do not follow any specific parameterized distributions. o The populations of interest have the same shape. o The populations are independent of each other. ("Asymptotic" means that the p-value approaches the real value as sample size increases.) -The Asym. Sig (p value) = .01. Given alpha set at .05, the decision is to reject the null.

WHAT TO DO IF T-TEST ASSUMPTIONS ARE NOT MET MANN WHITNEY U TEST: Nonparametric Test for _________ Independent _________ · Tests the _________. · Mann-Whitney U test holds the following assumptions: o The populations do not _________ any specific _________ distributions. o The populations of interest have the same _________. o The populations are _________ of each other. ("_________" means that the p-value approaches the real value as sample size increases.) -The Asym. Sig (p values)= .01 Given alpha set at .05, the decision is to _________ the null.
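For intuition about what the Mann-Whitney U test measures, the sketch below computes the U statistic by comparing every value in one sample with every value in the other. It does not compute the asymptotic p-value that SPSS reports, and the two samples are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): pairwise wins for each sample, with ties counted as half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical scores from two independent groups.
group_one = [12, 15, 11, 18, 14]
group_two = [22, 19, 24, 17, 21]

u_one, u_two = mann_whitney_u(group_one, group_two)
print("U for group one:", u_one)
print("U for group two:", u_two)
print("Test statistic (smaller U):", min(u_one, u_two))
```

A U value near zero for one group means almost all of its values rank below the other group's values.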

What Affects the CI? The population standard deviation The sample size The level of confidence (Alpha)

What Affects the CI? The _________ standard _________ The sample _________ The level of _________ (_________)
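A minimal sketch of how the three factors on this card change the width of a confidence interval for a mean, assuming the population standard deviation is known so the z-based formula (z times sigma divided by the square root of n) applies. The input values are hypothetical.

```python
import math
from statistics import NormalDist

def ci_half_width(sigma, n, confidence=0.95):
    """Half-width of a z-based confidence interval for a mean with known population SD."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    return z * sigma / math.sqrt(n)

# Hypothetical baseline: sigma = 10, n = 100, 95% confidence.
print("baseline:        ", round(ci_half_width(10, 100), 2))
print("larger SD:       ", round(ci_half_width(20, 100), 2))        # wider interval
print("larger sample:   ", round(ci_half_width(10, 400), 2))        # narrower interval
print("higher confidence:", round(ci_half_width(10, 100, 0.99), 2))  # wider interval
```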

What Are We to Do? Do our very best to understand research methods and the limitations of these methods. Recognize that we all have biases that need to be removed. Recognize that the research method is far from perfect. Recognize that the research method is misunderstood by consumers, the media and many of the researchers. Realize that conflicting results are not only common but predictable based on the tenuous foundations supporting both Frequentist and Bayesian approaches. Accept that the best we can do with research findings is to talk about probabilities.

What Are We to Do? Do our very best to ________ research methods and the ________ of these methods. Recognize that we all have ________ that need to be removed. Recognize that the research method is far from ________. Recognize that the research method is ________ by consumers, the ________ and many of the ________. Realize that conflicting results are not only ________ but predictable based on the tenuous foundations supporting both ________ and ________ approaches. Accept that the best we can do with research findings is to talk about ________.

What Do We Mean by "Significance"?: -When a statistical test indicates significance at the .05 level, we can assume that the likelihood of difference between data sets occurring by chance is low. -As stated by Schmitz (2007), "The term "significant" does not mean "a really important finding," or that a particularly large difference or relationship was found. A finding that falls below .01 ("highly significant") is not necessarily larger, smaller, or more important than one that falls just below .05 ("significant")."

What Do We Mean by "Significance"?: -When a statistical test indicates significance at the .05 level, we can assume that the likelihood of difference between data sets occurring by chance is __________. -As stated by Schmitz (2007), "The term "significant" does not mean "a really __________ finding," or that a particularly __________ difference or relationship was found. A finding that falls below .01 ("__________ significant") is _______ necessarily larger, smaller, or more __________ than one that falls just below .05 ("__________")."

What Happens when the ANOVA Reaches Significance? Post Hoc Tests (Multiple Comparison Procedures) · The significant ANOVA tells us that there is a difference, but the ANOVA does not tell us where that difference occurs. o Is the difference between data set one and two? o Is the difference between data set two and three? o Is the difference between data set one and three? o Or, are all three data sets significantly different from each other? · A post hoc (after the fact) analysis follows a significant finding. · There are several post hoc analyses from which to choose. · Selection of a post hoc test is dependent on the type of data sets and is beyond the scope of this book. · In the example below, A Tukey HSD test is used. · Other common post hoc tests: Scheffé, Dunn, and Bonferroni.

What Happens when the ANOVA Reaches Significance? Post Hoc Tests (_________ Comparison Procedures) · The significant ANOVA tells us that there is a _________, but the ANOVA does not tell us _________ that difference occurs. o Is the _________ between data set one and two? o Is the _________ between data set two and three? o Is the difference between data set one and three? o Or, are all three data sets _________ different from each other? · A _________ (_________ the fact) analysis follows a significant finding. · There are several post hoc analyses from which to choose. · Selection of a post hoc test is dependent on the _________ of data sets and is beyond the scope of this book. · In the example below, A _________ test is used. · Other common post hoc tests: _________, _________, and _________.
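The pairwise questions listed on this card can be enumerated directly. The sketch below uses a Bonferroni adjustment (one of the post hoc approaches named on the card, not the Tukey HSD test used in the source's example) to show how the per-comparison alpha shrinks as the number of comparisons grows. The group labels are placeholders.

```python
from itertools import combinations

def bonferroni_pairs(groups, alpha=0.05):
    """List every pairwise comparison and the Bonferroni-adjusted alpha for each one."""
    pairs = list(combinations(groups, 2))
    adjusted_alpha = alpha / len(pairs)
    return pairs, adjusted_alpha

# Placeholder labels for three data sets.
pairs, adjusted_alpha = bonferroni_pairs(["one", "two", "three"])

for first, second in pairs:
    print(f"compare data set {first} with data set {second}")
print("per-comparison alpha:", round(adjusted_alpha, 4))
```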

What affects the test statistic? Size of the samples Differences between the measures of central tendencies (means or medians) Variances associated with the data sets

What affects the test statistic? Size of the __________ Differences between the measures of __________ tendencies (__________ or __________) __________ associated with the data sets
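To see the three factors at work, the sketch below computes a two-sample t statistic from summary values using the unequal-variance (Welch) form. All numbers are hypothetical; the point is only the direction in which the statistic moves.

```python
import math

def t_statistic(mean_a, mean_b, var_a, var_b, n_a, n_b):
    """Two-sample t statistic from summary statistics (unequal-variance form)."""
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    return (mean_a - mean_b) / standard_error

# Hypothetical baseline: means 52 and 50, variance 25 in each group, 20 subjects per group.
baseline = t_statistic(52, 50, 25, 25, 20, 20)
bigger_samples = t_statistic(52, 50, 25, 25, 200, 200)   # same difference, larger n
bigger_difference = t_statistic(56, 50, 25, 25, 20, 20)  # larger mean difference
bigger_variance = t_statistic(52, 50, 100, 100, 20, 20)  # noisier data

print("baseline:         ", round(baseline, 2))
print("larger samples:   ", round(bigger_samples, 2))
print("larger difference:", round(bigger_difference, 2))
print("larger variance:  ", round(bigger_variance, 2))
```

The same two-point difference gives a much larger t statistic with 200 subjects per group than with 20, which is why a small difference can reach statistical significance in a large sample.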

Random patterns happen all of the time. Small samples can sometimes hide a significant pattern. We can have large differences that are not deemed statistically significant. We can have small differences that are deemed statistically significant. The larger the sample size the more likely we will have a significant finding, simply due to the largeness of the sample.

_________ patterns happen all of the time. Small samples can sometimes hide a _________ pattern. We can have large differences that are not deemed statistically _________. We can have small differences that are deemed statistically _________. The larger the sample size the more likely we will have a _________ finding, simply due to the largeness of the sample.

Spearman Correlation SPSS output (nonparametric)

tests reveal a significant deviation from normality. Therefore, this test is appropriate.

DISCRETE:

These variables are usually obtained by counting.

· (t) refers to the t statistic. This is the value produced by the t-statistic formula. · (df) refers to the degrees of freedom. · (Sig. two-tailed) refers to the p value, given a two tailed test. · The p value (p = .005) indicates the probability of obtaining the t value of 3.00 if the null hypothesis is correct. · Using the T-Table, with 78 degrees of freedom and alpha set at .05, the t statistic (3.00) is greater than the table value of 1.99. Therefore, the t-statistic is significant at the .05 level. · The 95% confidence intervals indicate that we are 95% confident that the real mean difference lies somewhere between the two values presented. Note that the mean difference value from the data resides within the 95% confidence intervals. · In the example found in this table, a researcher would summarize the results as follows: · There was a statistically significant difference, t (78) = 3.00, p = .005. · The 95% confidence interval of the difference refers to a range indicating 95% certainty that the population mean difference would be found within the range. · Referring to the 95% confidence interval can be much more useful to the researcher when interpreting the results. · Notice that if the variances are not assumed to be equal, the degrees of freedom are reduced. Given the t-test formula, a reduction in the degrees of freedom will affect the p value, given a certain t statistic value. · See the following T Distribution table for values associated with one and two-tailed t tests.

· (_________) refers to the t statistic. This is the value produced by the t-statistic formula. · (_________) refers to the degrees of freedom. · (_________) refers to the p value, given a two tailed test. · The p value (p = .005) indicates the probability of _________ the _____ value of 3.00 if the null hypothesis is _________. · Using the T-Table, with 78 degrees of freedom and alpha set at .05, the t statistic (3.00) is greater than the table value of 1.99. Therefore, the t-statistic is _________ at the .05 level. · The 95% confidence intervals indicate that we are 95% confident that the real _________ difference lies somewhere between the two values presented. Note that the mean _________ value from the data _________ within the 95% confidence intervals. · In the example found in this table, a researcher would summarize the results as follows: · There was a statistically significant difference, t (____) = _________, p = ._________. · The 95% confidence interval of the difference refers to a _________ indicating 95% certainty that the population mean difference would be found within the _________. · Referring to the 95% confidence interval can be much more useful to the researcher when _________ the results. · Notice that if the variances are not assumed to be _________, the degrees of freedom are _________. Given the t-test formula, a reduction in the degrees of freedom will affect the _____ value, given a certain _________ value. · See the following T Distribution table for values associated with one and two-tailed t tests.

· A visual assessment of histograms, stem and leaf plots, and QQ plots is sometimes used as a way to assess the shape of a data set. Stem and Leaf Plot (2 Examples) · Notice the sideways shape of the data. Make a visual assessment as to whether or not this shape resembles a normal distribution. Normal Q-Q Plots · Make a visual assessment about the relationship between the data points and the line. Are the data points close to the line or scattered away from the line?

· A visual assessment of _________, _________, and _________ is sometimes used as a way to assess the _________ of a data set. Stem and Leaf Plot (2 Examples) · Notice the sideways shape of the data. Make a visual assessment as to whether or not this shape resembles a _________ distribution. Normal Q-Q Plots · Make a visual assessment about the relationship between the data points and the line. Are the data points close to the line or scattered _________ from the line?

