IPE
What is the nominal scale?
"name" -providing only information about a difference. -Each subject can belong in only one classification -E.g., male/female, republican/democrat/independence
Q1. If I wanted to increase the power of my study, I would: (a) Recruit as many subjects as possible. (b) Use an alpha level of 0.01 instead of 0.05. (c) Both (a) and (b)
(a) Recruit as many subjects as possible.
What is the correlation matrix?
-A common method for reporting a number of correlations presented in table form.
What are populations and samples?
-A population: an all-inclusive group. -A sample: a representative subset of the population that contains all of the essential elements of that population.
What is theoretical probability?
-A traditional way of expressing the statistical likelihood of an event occurring. p = (the # of ways an event in question can occur) divided by (the # of all possible events)
What is the coefficient of variation?
-CV = (SD/Mean) x 100 -Used to compare variability between two or more distributions whose means or units of measurement are not the same. e.g., body height vs. weight
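A minimal sketch of the CV formula above. The height and weight scores are made-up illustrative numbers (an assumption, not course data), and `stdev` here is the sample SD (n - 1 in the denominator):

```python
from statistics import mean, stdev

def coefficient_of_variation(scores):
    """CV = (SD / Mean) x 100, expressed as a percentage."""
    return stdev(scores) / mean(scores) * 100

# Hypothetical scores for the same five people (illustrative only).
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 62, 70, 81, 92]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(f"CV height = {cv_height:.1f}%, CV weight = {cv_weight:.1f}%")
```

Because CV is a percentage of the mean, the two values can be compared even though height is in cm and weight is in kg.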
What is validity?
-Indicates how well a test measures what is intended to be tested. E.g., validating a new evaluation tool to the best accepted evaluation tool -New test vs. gold standard on the same subjects -E.g., Caliper for skinfold vs. DXA scan
What is random sampling?
-Random sampling is an impartial process that results in an unbiased sample of a population. -Every person/observation in the particular population has an "equal probability" of being selected for the sample.
What is correlation?
-Relationship between two or more variables.
What is the mode?
-The most frequent score in a distribution. -If the distribution has two modes, it is bimodal. -not used very often
What is variance?
-The square of SD (SD²).
What are the measures of variability?
-Variance -Confidence intervals -Standard deviation -Coefficient of variation -Range
What is population specificity?
-a prediction equation is only sound or valid for subjects with traits nearly identical to those of the subjects used in the development of an equation.
Confounding variable/factor
-An extraneous variable that affects the relation between the independent and dependent variables
what is the mean?
-the arithmetic average of a distribution of scores -the most commonly used measure of central tendency. add up and divide
what is the median?
-the midpoint in a distribution, in which 50% of the scores lie above and 50% below. -good to use for a small N and some outliers
Q. A researcher developed a dynamic thoracolumbar orthosis/brace for patients with scoliosis. He wants to determine if the dynamic torso brace would better improve standing balance in scoliosis patients compared to the conventional torso brace. The researcher recruited two groups of subjects (one group wearing the dynamic torso brace and another group wearing the conventional torso brace) and recorded their balance performance while wearing the brace assigned. 1. P-I-O terms? 2. Null and alternative hypotheses for a one-tailed test? 3. For the statistical analysis, he set the significance level at 0.05 (α=0.05). He performed a one-tailed test and got a p value of 0.08 (p=0.08). What is his decision on accepting or rejecting the null hypothesis? 4. Later, he changed his mind and decided to set the significance level at 0.10 (α=0.10). Then, what is his decision on accepting or rejecting the null hypothesis?
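1. P = Patients with scoliosis I = dynamic torso brace (compared with the conventional torso brace) O = standing balance 2. H0 = balance with the dynamic brace is less than or equal to balance with the conventional brace HA = balance with the dynamic brace is better than with the conventional brace 3. p = 0.08 > α = 0.05: accept (fail to reject) H0, reject HA 4. p = 0.08 < α = 0.10: reject H0, accept HA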
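The decision rule behind parts 3 and 4 can be sketched as follows. The function name `decide` is just an illustrative helper, and "fail to reject H0" is what these notes call accepting H0:

```python
def decide(p_value, alpha):
    """Reject H0 when p < alpha; otherwise fail to reject H0."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

p = 0.08  # p value from the brace study
decision_05 = decide(p, alpha=0.05)
decision_10 = decide(p, alpha=0.10)
print("alpha = 0.05:", decision_05)
print("alpha = 0.10:", decision_10)
```

The same p value leads to opposite decisions, which is why the alpha level should be chosen before the data are analyzed.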
•Q: We want to determine the incidence of diabetes in people aged 30 to 50 years in Massachusetts. We obtain a random sample of 500 people and measure their fasting blood sugar level. We find that M = 85 mg/dL and SD = 20 mg/dL. 1)What is the probability that any one subject selected at random would have a blood sugar reading over 125 mg/dL? 2)How many subjects in our sample (subject pool) would have a blood sugar reading over 125 mg/dL? 3)How many people can we expect or predict to be in the population of 2 million (aged 30 to 50 years in Massachusetts) with a blood sugar reading over 125 mg/dL?
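1. The mean (85) is the middle of the curve and each SD is 20, so 125 is 2 SD above the mean: z = (125 - 85) / 20 = 2. The area above +2 SD is about 2 + 0.5 = 2.5%. 2. 2.5% x 500 = 0.025 x 500 = 12.5, i.e., about 12 or 13 subjects 3. 2 million x 2.5% = 2,000,000 x 0.025 = 50,000 people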
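A short check of the arithmetic for the blood sugar question, as a sketch. It assumes blood sugar is normally distributed; `p_rule` is the rounded 2.5% figure from the normal curve, while `p_exact` is the exact upper-tail area that Python's standard library gives for z = 2:

```python
from statistics import NormalDist

mean_bs, sd_bs = 85, 20                 # mg/dL, from the question
cutoff = 125                            # mg/dL

z = (cutoff - mean_bs) / sd_bs          # z = (X - M) / SD
p_exact = 1 - NormalDist().cdf(z)       # exact area above z
p_rule = 0.025                          # rounded area above +2 SD

in_sample = p_rule * 500                # expected count in the sample
in_population = p_rule * 2_000_000      # expected count in the population

print(f"z = {z}, exact tail = {p_exact:.4f}")
print(f"sample: {in_sample}, population: {in_population:,.0f}")
```

The exact tail area is a little under 2.5%, so the rounded answers (12.5 and 50,000) slightly overestimate the counts.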
What are the 5 measurement scales?
1. "Nominal" Scale 2. "Ordinal" Scale 3. Absolute Scale 4. Interval Scale 5. Ratio Scale
What is multiple regression?
A linear regression performed on two or more independent variables. E.g., •Vertical jump height can be predicted by? Leg strength, height, weight, muscle mass, gender, age, anthropometric measurements
Q4. Which measures can provide the information regarding the variability of your data set? Range, standard deviation, coefficient of variation, variance, confidence interval?
ALL
Q1. An investigator in Europe has developed a new method to estimate % body fat. He wants to see how well his new method compares to a DEXA scan, the "gold standard". He measures % body fat by both methods on student volunteers. What type of statistic should he use to determine whether his new method gives values that are good estimates of % body fat (i.e., How does it compare to the gold standard?) Regression, Correlations? Q2. What kind of information is he assessing? Reliability, Validity, Objectivity, Specificity?
Q1. Correlation, because he is only looking for the r value between the two methods. If he wanted to predict DEXA values from the new measurement, he would use regression. Q2. Validity
Q3. If the researcher fails to reject the null hypothesis when there really is a difference, this is an example of a: (a) one-tailed test (b) two-tailed test (c) type I error (d) type II error
D. type II error
If you increase the level of significance (α), what happens to the power?
If all other things are held constant, then as α increases, so does the power of the test. This is because a larger α means a larger rejection region for the test and thus a greater probability of rejecting the null hypothesis. That translates to a more powerful test.
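A rough sketch of how α and sample size move power. It assumes a one-sample, one-tailed z-test with a known SD (a textbook approximation, not necessarily the formula used in the course); the effect size of 0.5 and the sample sizes are illustrative values:

```python
from math import sqrt
from statistics import NormalDist

def power_one_tailed_z(effect_size, n, alpha):
    """Approximate power of a one-sample, one-tailed z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # edge of the rejection region
    # Probability that the test statistic lands in the rejection region
    # when the true standardized difference equals effect_size.
    return 1 - std_normal.cdf(z_crit - effect_size * sqrt(n))

power_alpha_01 = power_one_tailed_z(0.5, n=20, alpha=0.01)
power_alpha_05 = power_one_tailed_z(0.5, n=20, alpha=0.05)
power_bigger_n = power_one_tailed_z(0.5, n=40, alpha=0.05)

print(f"alpha 0.01, n 20: {power_alpha_01:.2f}")
print(f"alpha 0.05, n 20: {power_alpha_05:.2f}")
print(f"alpha 0.05, n 40: {power_bigger_n:.2f}")
```

With everything else fixed, the stricter α gives the lowest power, and doubling the sample size gives the highest.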
-Why does lowering the alpha level to a stricter, more rigorous value reduce the chance of making a type I error but increase the probability of making a type II error?
Lowering the alpha level reduces the type 1 error because the alpha level is the chance of a type 1 error occurring. A stricter alpha also shrinks the rejection region, so it is harder to reject a null hypothesis that is actually false, and the chance of a type 2 error goes up.
Which is the better measure of central tendency when there are outliers: mean or median?
Median, because it is not as affected by extreme values
What is standard deviation?
-The most frequently used measure of variability. -Provides information on the "average algebraic distance" from the mean of each score in a distribution.
Q: To assess the meaningfulness or magnitude of a difference between two means (e.g., difference between PRE and POST-intervention, difference between two groups), what info should we examine?
The effect size. To get the effect size we need: -Group means -Sample size (n) -Standard deviation
Does statistical significance mean practical or clinical significance?
No. The magnitude of the difference (the effect size) is what matters for practical and clinical application
Q. A researcher wants to investigate how exercise duration affects people's short-term memory. She hypothesizes that longer duration of the exercise can better improve the short-term memory. She plans to let the subjects do six bouts of the 5-min jogging (total exercise time = 30 minutes). She plans to conduct a memory test before and after each bout of the exercise. •Q1. What are the independent and dependent variables? •Q2. The "learning effects" from the repeated memory tests will be a: independent variable, dependent variable, confounding variable?
Q1. IV = exercise duration DV = memory performance Q2. Confounding variable
What is random sampling vs group randomization?
Random sampling -is an impartial process that results in an unbiased sample of a population. -Every person/observation in the particular population has an "equal probability" of being selected for the sample. Group randomization -ensures similar subject characteristics for the experimental and control groups. Random sampling is broad (who gets into the study), whereas group randomization is specific (which group each sampled subject is assigned to).
What is a type 1 error?
Rejecting null hypothesis when it is true "false positive"
What are the 3 other uses of correlation?
Reliability Objectivity Validity
What is the effect size?
Represents the standardized difference between two means. Allows comparison between studies using different dependent variables because it puts data in standard deviation units. Formula: ES = |M1 - M2| / SD, where | | means absolute value
What information is needed to compute the power level?
Sample size Group means Standard deviations
What is the critical value?
The boundary value of the test statistic that separates non-significant from significant results
How is the effect size different from the p value?
The effect size is the main finding of a quantitative study. While a P value can inform the reader whether an effect exists, the P value will not reveal the size of the effect.
What information is needed in planning research?
The effect size or group means Standard deviation Significance level One or two tailed hypothesis
If the significance level (α) is decreased, what happens to the type 1 error?
Type 1 error decreases
If the significance level (α) is increased, what happens to the type 1 error?
Type 1 error increases
If the significance level (α) is increased, what happens to the type 2 error?
Type 2 error decreases
If the significance level (α) is decreased, what happens to the type 2 error?
Type 2 error increases
What are nonparametric statistics?
Used for nominal or ordinal data, or when the data do not follow a normal distribution (ABNORMAL distribution)
What is a Pearson Correlation?
Used to measure how well two variables are related to each other. •Interpretation: -Significance (different from zero or not?) -Positive or Negative Relationship (direction) -Strength:
What is the descriptive use of statistics?
Used when measuring a trait or characteristic of a group without an intention to generalize that statistic beyond that group
Example of multiple regression: Y´ = 35 + 1.0X1 - 0.02X2 + 0.10X3 + 0.50X4 where Y´ =risk score of coronary heart disease X1 = smoking score, X2= physical activity score X3= stress rating, X4 = abdominal girth
A 1.0-point increase in the smoking score increases the predicted heart disease risk score by 1.0 point, with the other predictors held constant. If abdominal girth increases by 1, the predicted risk score increases by 0.5. The r value isn't provided, so we know the direction of each relationship but not its strength.
What is the inferential use of statistics?
When one makes generalizations or inferences from a smaller group to a larger group. Ex: I want to study the typical physical activity level in people living in the greater Boston area.
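The equation from the example can be tried out directly. In this sketch the coefficients come from the example; the predictor scores plugged in are hypothetical values chosen only to show the arithmetic:

```python
def chd_risk_score(smoking, activity, stress, girth):
    """Y' = 35 + 1.0*X1 - 0.02*X2 + 0.10*X3 + 0.50*X4 (from the example)."""
    return 35 + 1.0 * smoking - 0.02 * activity + 0.10 * stress + 0.50 * girth

# Hypothetical person (illustrative scores, not real data).
baseline = chd_risk_score(smoking=10, activity=100, stress=20, girth=90)
# Change one predictor at a time, holding the others constant.
more_smoking = chd_risk_score(smoking=11, activity=100, stress=20, girth=90)
more_girth = chd_risk_score(smoking=10, activity=100, stress=20, girth=91)

print(f"baseline risk score: {baseline:.2f}")
print(f"+1 smoking changes it by {more_smoking - baseline:.2f}")
print(f"+1 girth changes it by {more_girth - baseline:.2f}")
```

Each coefficient is the change in the predicted score for a one-unit change in that predictor; the negative sign on physical activity means more activity lowers the predicted risk.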
What are confidence intervals?
a range of values within which there can be some degree of confidence that the true parameter is likely to fall
Q1. If I wanted to increase the power of my study, I would: (a) Recruit as many subjects as possible. (b) Use an alpha level of 0.01 instead of 0.05. (c) Both (a) and (b)
(a) Recruit as many subjects as possible. Option (b) would lower the power: when alpha goes down (fewer type I errors), the rejection region shrinks, so it is harder to reject a false null hypothesis.
reject Ho means
accept Ha
If p > α
accept (fail to reject) the null (H0), reject the alternative (HA)
-Why does lowering the alpha level to a stricter, more rigorous value reduce the chance of making a type I error but increase the probability of making a type II error?
The alpha level is the probability of a type I error, so lowering alpha lowers the type I error directly. With everything else held constant, type I and type II errors trade off: as one goes down, the other goes up.
What is the Ratio scale?
"equal intervals" + "an absolute zero point" (absence of the characteristic being measured) -Most complex, with all of the elements of the interval scale plus an absolute zero point (ex: percentage of body fat, 0% bf = no body fat)
Small p value =
farther apart (less overlap between the two distributions)
Is the effect size of 0.8 small, medium, or large?
large
Is the effect size of 0.5 small, medium, or large?
medium
Big p value =
more overlap
What is the relationship between negative and positive correlation?
Negative and positive only tell us the direction. A negative correlation can be stronger than a positive correlation. The strength of a correlation depends on the absolute value of r, | r | (how close the points are to the line of best fit).
What does the effect size of 0 mean?
no difference between the two means
A researcher tested a new intervention program in 100 female older adults to examine the effectiveness of the new intervention in increasing muscle mass %. The following lists the data and statistical information. What do you think about the effectiveness of this new intervention? How do you conclude? Average pre-training muscle mass %: 25% (mean = 25%) Average post-training muscle mass %: 28% (mean = 28%) Overall standard deviation: 15% (std = 15%) Significance level (α) set at 0.05 p-value = 0.03
Statistically significant (p = 0.03 < α = 0.05), but the effect size = |25 - 28| / 15 = 0.2, which is small, so the change is not clinically or practically significant
If you lower the level of significance (α), what happens to the power level?
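Working the numbers from the question above, as a sketch. The cut-offs in `label` follow the 0.2 / 0.5 / 0.8 benchmarks in these notes; how values between the benchmarks are labeled is an assumption made for illustration:

```python
def effect_size(mean_1, mean_2, sd):
    """ES = |M1 - M2| / SD."""
    return abs(mean_1 - mean_2) / sd

def label(es):
    """Benchmarks: 0.2 small, 0.5 medium, 0.8 large (thresholds assumed)."""
    if es >= 0.8:
        return "large"
    if es >= 0.5:
        return "medium"
    return "small"

es = effect_size(25, 28, 15)     # pre = 25%, post = 28%, SD = 15%
significant = 0.03 < 0.05        # p value compared with alpha

print(f"effect size = {es:.2f} ({label(es)}), significant = {significant}")
```

The result is statistically significant but the effect is small, which is the usual pattern when a large sample (here 100 subjects) detects a minor difference.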
power level decreases
What is the p value?
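the probability of getting a result at least as extreme as the one observed if the null hypothesis is true (not the probability that the null hypothesis itself is true)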
what value represents negative correlation?
r = -0.01 to -1.0 -As the value for one variable increases, the value for the other variable tends to decrease.
what value represents positive correlation?
r = 0.01 to 1.0 -As the value for one variable becomes larger, the value for the other variable also tends to increase. -Correlation does not imply causation.
what is no correlation?
r=0
accept Ho means
reject Ha
If p < α
reject the null (H0) accept alternative (HA)
What graphs are used for correlation?
scatterplot •The size of correlations can also be estimated by the degree to which the data in a scatterplot are clustered around a line running through them, called the line of best fit or regression line.
Is the effect size of 0.2 small, medium, or large?
small
What is a correlation coefficient?
a statistical measure (r) that tells whether two variables are correlated, in which direction, and how strongly
What information is needed to calculate a sample size?
the effect size or group means and standard deviation from the pilot data. the level of significance (i.e., alpha level). the study power we plan to achieve.
What is the alpha level?
the probability of making a type I error ("false positive")
What are parametric statistics?
type of inferential statistics used for interval or ratio data that follow a NORMAL distribution
What is a type 2 error?
when a false null hypothesis is accepted, or we fail to reject false null hypothesis. "false negative"
What is the interval scale?
•"equal intervals" -Has all of the properties of ordinal measures plus equality of units -Shows difference, direction, and amount of difference -No absolute zero (ex: a temperature of 0 does not mean no temperature)
What is the Ordinal scale?
•"order, rank" -Demonstrates rank by showing difference and direction of difference. -No equality of units (2nd place could be just a bit behind 1st or could be 1000x behind) Ex: pain rated 1-10 in a doctor's office; 10 is worse than 5, but we don't know how much worse
What is reliability?
•The consistency of measurements (e.g., intra-rater reliability). -Same test on the same subjects several times
What is study power?
•The probability of rejecting the null hypothesis when the null hypothesis is false (i.e., HA is true). •Power = 1 - β, where β (the beta level) is the probability of making a type 2 error
What is the central tendency? What are the 3 values used?
•A central or typical value that best represents all the "scores" in a distribution: -mean -median -mode
What is the normal curve?
•A statistical and theoretical model that is used to visualize data, to interpret distributions of scores, and to make predictions and probability statements. -Must be perfectly symmetrical -Bell-shaped -Unit normal curve: •Identical mean, median, and mode (i.e., midpoint, 0) •Is divided into SD units which have a size equaling 1.0 (SD = 1) •The total area underneath the curve = 1 (100%)
What is hypothesis testing?
•A strict statistical process that is built on making probability statements for two possible states of actuality.
Two tailed vs One tailed hypothesis?
•If the investigator cannot hypothesize about the direction of the outcome (could be this, could be that), a two-tailed hypothesis test must be used. -H0: No difference -HA: There is a difference -E.g., no difference in muscle mass between the keto and high-protein diets (H0); there is a difference in muscle mass between the diets, in either direction (HA) •If the investigator has a strong sense about the direction of the outcome (believes it is this, not that), use a one-tailed hypothesis test. E.g., -H0: Experimental ≤ Control -HA: Experimental > Control -high-protein diet is better than keto (HA) -keto is better than or equal to the high-protein diet (H0) •Together, H0 and HA must cover 100% of the possible outcomes, with nothing left out.
What are non-normal distributions and curves?
•Non-normal distributions: Atypical distributions that occur when a distribution of scores will not fit the model of the normal curve, particularly in shape. -Negatively skewed -Positively skewed
What is probability?
•Statistical decisions are not exact but are made with a certain probability or chance of being right or wrong. •Probability and the Normal Curve -The applications of the normal curve include the ability to make predictions and probability statements.
What are null and alternative hypotheses?
•The Null and Alternative Hypotheses (typical) -The null hypothesis (H0) is often a statement of no difference or no relationship. -The alternative hypothesis (HA) is the logical state of reality that must exist if the null hypothesis is not true. •Together, H0 and HA account for 100% of the probability.
What are the equations for the following Pearson correlation values -Size/amplitude of r -Coefficient of determination -Common variance
•The size/amplitude of r = | r | •Coefficient of determination = r² •Common variance = r² x 100 (%)
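A small sketch of these three quantities. The `pearson_r` helper is written out by hand for illustration, and the leg strength and jump height scores are made-up numbers (an assumption, not course data):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five subjects (illustrative only).
leg_strength = [50, 60, 70, 80, 90]
jump_height = [30, 34, 33, 40, 43]

r = pearson_r(leg_strength, jump_height)
size_of_r = abs(r)                   # | r |
determination = r ** 2               # r squared
common_variance = r ** 2 * 100       # percent of shared variance

# Flipping the sign of one variable flips the direction, not the strength.
r_negative = pearson_r(leg_strength, [-h for h in jump_height])

print(f"r = {r:.3f}, r^2 = {determination:.3f}, "
      f"common variance = {common_variance:.1f}%")
print(f"negative case: r = {r_negative:.3f}, |r| = {abs(r_negative):.3f}")
```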
What is the Z score?
•Used to convert a raw score from a distribution into units of the normal distribution curve called SD units. •z = (X - M) / SD where X is the raw score and M is the mean of the distribution.
What is simple linear regression?
•Using the correlation between variables to make predictions. E.g., leg muscle strength vs. max jump height •Linear Regression Equation (line of best fit): Y´ = bX + a, where b is the slope and a is the intercept (don't need to know) •Using the correlation between leg muscle strength and jump height, we can predict the jump height
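A minimal sketch of fitting Y´ = bX + a by least squares and using it to predict. The `fit_line` helper and the strength and jump scores are illustrative assumptions, not data from the course:

```python
def fit_line(x, y):
    """Least-squares slope b and intercept a for Y' = bX + a."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept
    return b, a

# Hypothetical scores for five subjects (illustrative only).
leg_strength = [50, 60, 70, 80, 90]
jump_height = [30, 34, 33, 40, 43]

b, a = fit_line(leg_strength, jump_height)
predicted = b * 75 + a            # predicted jump height for strength = 75

print(f"Y' = {b:.2f}X + {a:.2f}")
print(f"predicted jump height at X = 75: {predicted:.1f}")
```

For these made-up scores the slope is 0.32 and the intercept is 13.6, so a leg strength of 75 predicts a jump height of 37.6.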
What are common sampling methods/procedures?
•Using random numbers •Systematic counting: using some type of list or inventory of subjects in a population, e.g., phone book, a list of licensed drivers, an index of registered voters, etc. •Stratified random sampling: use when a population is believed to have distinct subgroups and we want to have the appropriate representation from each subgroup.
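The three procedures above can be sketched with Python's standard `random` module. The population list, the two age strata, and the 60/40 split are hypothetical, and the seed is fixed only so the sketch is repeatable:

```python
import random

rng = random.Random(42)  # fixed seed for a repeatable illustration
population = [f"person_{i:03d}" for i in range(100)]
sample_size = 10

# 1) Using random numbers: every person has an equal chance of selection.
simple = rng.sample(population, k=sample_size)

# 2) Systematic counting: random starting point, then every k-th name on the list.
step = len(population) // sample_size
start = rng.randrange(step)
systematic = population[start::step]

# 3) Stratified random sampling: sample each subgroup in proportion to its size.
strata = {"under_40": population[:60], "40_plus": population[60:]}
stratified = []
for members in strata.values():
    share = round(sample_size * len(members) / len(population))
    stratified.extend(rng.sample(members, k=share))

print(len(simple), len(systematic), len(stratified))
```

In the stratified sample, 6 of the 10 people come from the larger subgroup and 4 from the smaller one, matching their shares of the population.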
What is the relationship between validity and reliability?
•Validity (accuracy) ≠ Reliability (consistency) •Must be evaluated independently. •Validity assessment is more important •High validity must have high reliability •High reliability can have low validity
inductive reasoning
•based on making a conclusion or generalization on a limited number of observations. Proceeding from the specific to the general. Example: if we study 40 patients with lung cancer and we found that all of them have been smoking.... =Patients with lung cancer are smokers =Smoking would lead to lung cancer development
What is objectivity?
•The reliability or consistency of measurements between different test administrators (i.e., inter-rater reliability). -Same test on the same subjects by different researchers
Deductive reasoning
•proceeds from the general to a specific case. -Compare with reality (to conduct experiments) -The foundation of modern science Example: Concept/theory: Aerobic exercise is good for cardiopulmonary function Hypothesis: Treadmill walking improves cardiopulmonary function
What is effect size?
•represents the standardized difference between two means. ES = |M1 - M2| / SD •allows comparison between studies using different dependent variables because it puts data in standard deviation units.
Dependent variable
•the behavior that is measured to determine whether it is affected by the independent variable. e.g., outcome measures (effect)
What is the absolute scale?
•the number of events or observations (i.e., frequency counts), counted in whole units.
Independent variable
•the one that is manipulated or controlled by the researcher (Cause)
What is the level of significance?
•the α level -The statistical reference point that is selected for the purpose of accepting or rejecting the null hypothesis. -In our area, researchers typically use α = 5% or 1% (p < 0.05 or p < 0.01) for making statistical decisions. -Defines how rarely a result must occur by chance before we attribute it to a real difference. p < 0.05 (significant) vs. p > 0.05 (not significant)