NSG522 EpiStats Midterm
Height recorded in centimeters should be displayed using SELECT ALL THAT APPLY - Bar chart. - Box-and-whisker plot. - Stem-and-leaf plot. - Histogram.
- Box-and-whisker plot. - Stem-and-leaf plot. - Histogram.
A variable was coded as disease present, disease not present, or disease status unknown. When performing the descriptive statistics, which of the following should be used? SELECT ALL THAT APPLY - Median - Percentages - Proportions - Frequencies - Mode
- Percentages - Frequencies
The decision to use the mean or median is based upon: (SELECT ALL THAT APPLY) -Which statistical tests will be used for hypothesis testing. -Whether the data are normally distributed. -The guidelines of the journal where the data will be published. -The presence or absence of outliers.
-Whether the data are normally distributed. -The presence or absence of outliers.
68% of observations fall within....
1 standard deviation of the mean
What are the first two steps of data analysis?
1. Enter clean, correct data 2. Perform univariate analysis (descriptive statistics and graphic display)
What are the three most commonly reported measures of spread?
1. SD 2. IQR 3. Epidemiological range
What are the 5 measures of central location? Which two are most commonly *reported*?
1. mean 2. median 3. mode 4. geometric mean 5. midrange. Most commonly reported: mean and median
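The five measures above can be sketched with Python's standard `statistics` module. The sample values are hypothetical and were chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical sample, not taken from the course material.
values = [2, 4, 4, 8, 16]

mean = statistics.mean(values)                      # arithmetic average
median = statistics.median(values)                  # middle value of the sorted data
mode = statistics.mode(values)                      # most frequently occurring value
geometric_mean = statistics.geometric_mean(values)  # nth root of the product of n values
midrange = (min(values) + max(values)) / 2          # halfway between min and max

print(mean, median, mode, round(geometric_mean, 2), midrange)
```

Note how the single large value (16) pulls the mean above the median, which is the reason the choice between the two depends on outliers.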
What are the seven measures of spread?
1. statistical range 2. epidemiological range 3. percentiles 4. quartiles 5. interquartile range 6. standard deviation 7. coefficient of variation
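A minimal sketch of these measures of spread, using a hypothetical sample. Quartile values depend on the interpolation method; this example assumes the default "exclusive" method of `statistics.quantiles`, and other software may give slightly different Q1 and Q3.

```python
import statistics

# Hypothetical sample chosen so that min = 30 and max = 50.
values = [30, 35, 40, 45, 50]

statistical_range = max(values) - min(values)       # one number: max minus min
epidemiological_range = (min(values), max(values))  # two numbers: min and max
q1, q2, q3 = statistics.quantiles(values, n=4)      # quartiles (Q2 is the median)
iqr = q3 - q1                                       # interquartile range
sd = statistics.stdev(values)                       # sample SD (n - 1 denominator)
cv = sd / statistics.mean(values) * 100             # coefficient of variation, in percent

print(statistical_range, epidemiological_range, (q1, q3), iqr)
print(round(sd, 2), round(cv, 1))
```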
The maximum value is the ____th percentile.
100th (100% of values fall at or below it)
95% of the observations fall within...
2 standard deviations of the mean
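The 68% and 95% figures are rounded values that hold for a normal distribution. As a quick check, assuming a standard normal curve, the exact proportions can be computed with `statistics.NormalDist`; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = share_within(1)  # roughly 0.68
within_2_sd = share_within(2)  # roughly 0.95

print(f"{within_1_sd:.4f} within 1 SD, {within_2_sd:.4f} within 2 SD")
```

The 95% figure is an approximation: exactly 95% of a normal distribution lies within about 1.96 standard deviations of the mean.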
Match the following variables to the correct level of measurement. Temperature in Fahrenheit Disease status (present or absent) Weight in ounces Race, self-reported Quality of life, on a scale of 1 to 5 Hemoglobin A1C Time in minutes Level of Education (< High School, High School Graduate, Some College, College Graduate) Oxygen saturation in percent Blood type A. Interval B. Nominal C. Ratio D. Ordinal
A-Temperature in Fahrenheit B-Disease status (present or absent) C-Weight in ounces B-Race, self-reported D-Quality of life, on a scale of 1 to 5 C-Hemoglobin A1C C-Time in minutes D-Level of Education (< High School, High School Graduate, Some College, College Graduate) C-Oxygen saturation in percent B-Blood type
T/F: The measure of central location determines which measure of spread should be reported.
True
T/F: Data integrity refers to the accuracy of the data.
True.
T/F: The mean is the arithmetic average of the observations.
True.
T/F: The identifier is not a variable and is not analyzed or described.
True. It is arbitrary.
T/F: There is very little difference between ratio and interval level data.
True. They can be treated pretty much the same way for most statistical operations and are collectively referred to as "scale" variables.
Which element of a box and whisker plot shows outliers?
Tukey fences
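A sketch of how Tukey fences flag outliers. The cards do not state the multiplier, so this assumes the conventional 1.5 x IQR rule; the data and the `tukey_fences` helper are hypothetical.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default "exclusive" method
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical sample with one unusually large value.
values = [30, 32, 35, 38, 40, 42, 45, 48, 50, 95]

lower, upper = tukey_fences(values)
outliers = [v for v in values if v < lower or v > upper]

print(lower, upper, outliers)
```

Any observation outside the fences is plotted as an individual point beyond the whiskers.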
_____ refers to the accuracy of measurement, and that it measured what it was supposed to measure.
Validity
For continuous data, if there are no outliers, do you need to perform a normality test?
Yes. The data must have no outliers AND be normally distributed in order to determine that you need to report the mean rather than the median.
Height and weight are examples of ____ measurements, heart rate and blood pressure are examples of ____ measurements, and glucose and cholesterol are examples of ____ measurements.
anthropometric physiologic biochemical
The two types of ____ variables are nominal and ordinal, and the two types of _____ variables are ratio and interval.
categorical continuous
Description of continuous data includes number of observations, measures of _____ and ____, as well as graphic display, which includes ____, _____, and _____.
central location, spread; histogram, stem-and-leaf plot, box-and-whisker plot
Variables are defined according to ____ and ____.
concept (what it is) operation (how and with what it is measured)
Descriptive epi is to analytic epi like _____ is to _____.
distribution determinants
Epidemiological measurements focus on the _____ (frequency and pattern) and _____ (factors and causes) of health issues.
distribution determinants
Epidemiology is the study of the _____ and _____ of health-related states or events in specified _____, and the application of this knowledge to the control of health _____.
distribution determinants populations problems
The goal of _____ is to identify factors associated with the development of disease and injury as well as factors that prevent illness and promote wellness
epidemiology
There can only be one _____ variable.
independent
The _____ variable is the one manipulated by the researcher, and the ____ variable is the one affected (outcome variable).
independent dependent
Continuous variables with equal intervals between values but *without* an absolute zero are ____ variables, such as temperature or date of diagnosis.
interval
Data is organized into a ____ ____, also known as a rectangular file.
line list
If the distribution is normally distributed AND there are no outliers, which measure of central location is reported?
mean
If the distribution is NOT normally distributed OR there are one or more outliers, which measure of central location is reported?
median
The number of hospital-acquired infections would be a performance _____ for the hospital.
metric
Standard summary for continuous variables....
n; mean or median; if mean, SD and epidemiological range; if median, IQR and epidemiological range (best practice: also Q1 and Q3)
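The reporting rule above can be sketched as a small function. `summarize` is a hypothetical helper; the `is_normal` and `has_outliers` flags are assumed to come from a separate normality test and outlier check, which are not performed here.

```python
import statistics

def summarize(values, is_normal, has_outliers):
    """Pick the standard summary for a continuous variable."""
    summary = {"n": len(values), "epi_range": (min(values), max(values))}
    if is_normal and not has_outliers:
        # Normal AND no outliers: report the mean with the SD.
        summary["center"] = ("mean", statistics.mean(values))
        summary["spread"] = ("SD", statistics.stdev(values))
    else:
        # Otherwise: report the median with the IQR (plus Q1 and Q3).
        q1, _, q3 = statistics.quantiles(values, n=4)
        summary["center"] = ("median", statistics.median(values))
        summary["spread"] = ("IQR", q3 - q1)
        summary["q1"], summary["q3"] = q1, q3
    return summary

# Hypothetical sample.
values = [30, 35, 40, 45, 50]
print(summarize(values, is_normal=True, has_outliers=False))
print(summarize(values, is_normal=False, has_outliers=False))
```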
What is the statistical summary data to run for continuous data?
n, mean, median, mode, SD, (statistical) range, IQR, Q1, Q3, min, max
Categorical variables with no ranking, such as "characteristics of silicone" are ____ variables.
nominal
The standard deviation is the measure of the average distance between the _____ and the _____.
observations mean
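To make the "average distance" idea concrete, the sample standard deviation can be computed by hand. Strictly, it is the square root of the average squared distance from the mean, with n - 1 in the denominator for a sample. The data are hypothetical.

```python
import math
import statistics

# Hypothetical sample.
values = [30, 35, 40, 45, 50]

mean = sum(values) / len(values)
squared_distances = [(x - mean) ** 2 for x in values]         # distance of each observation from the mean, squared
sample_variance = sum(squared_distances) / (len(values) - 1)  # n - 1 denominator for a sample
sd = math.sqrt(sample_variance)

# The hand calculation should agree with the library function.
print(round(sd, 4), round(statistics.stdev(values), 4))
```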
Categorical variables that are ranked are ____ variables, such as stages of cancer, pain score, or letter grade.
ordinal
The last two core epidemiological concepts are ____ and ____.
policy linkages
Epidemiology focuses on ______ health rather than the individual.
population
Categorical variables are (qualitative/quantitative) and continuous variables are (qualitative/quantitative).
qualitative quantitative
Continuous variables *with* an absolute zero and equal intervals between values are ____ variables, such as heart rate and weight.
ratio
Descriptive statistics is also known as data ____, because it can summarize large amounts of data in just a few numbers or a simple graphic display
reduction
According to the basic structure of a data file, the (rows/columns) contain the records, observations, or cases, and the (rows/columns) contain the variables.
rows columns
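A rough illustration of this layout, using a hypothetical three-record line list. All field names and values are invented for the example; the first column holds a non-personal record identifier, which is not analyzed.

```python
# Each dict is one row (a record); each key is one column (a variable).
line_list = [
    {"record_id": "01", "height_cm": 162.5, "disease_status": "present"},
    {"record_id": "02", "height_cm": 175.0, "disease_status": "absent"},
    {"record_id": "03", "height_cm": 168.3, "disease_status": "unknown"},
]

n_records = len(line_list)  # number of rows

# The identifier is excluded because it is not a variable to be described.
variables = [name for name in line_list[0] if name != "record_id"]

# Reading down one column gives all observations for one variable.
heights = [row["height_cm"] for row in line_list]

print(n_records, variables, heights)
```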
The decision of whether to use the mean or median is based on the _____ and the _____.
shape of distribution presence of outliers
If you report the mean, which measures of spread should be reported?
standard deviation and epidemiological range
What is Q2?
the median
_____ refers to the frequency and pattern of health related states and events.
Distribution
_____ epidemiology quantifies the association between exposures and outcomes to determine causal relationships.
Analytic
How is the epidemiological range reported?
As two numbers: the largest and smallest values, e.g., 50, 30
What percentage of scores fall between the first and third quartile? A. 25% B. 50% C. 68% D. 75% E. 100%
B. 50%
What type of graphic display would be the best for a variable measured at the ordinal level? A. Bar chart B. Histogram C. Pie chart D. Any of the above would be appropriate.
B. Histogram
When reporting the mean for a variable, what measure of spread should be reported? A. The statistical range B. The standard deviation C. The interquartile range D. Any/all of the above would be appropriate.
B. The standard deviation
A nurse practitioner student is completing his final practicum in the outpatient clinic of the health care organization that is part of the university. As a part of the course requirements for the final term, the student must complete a quality improvement project. The student decides to examine whether the staff routinely assess vaccination status for healthy adults. Since this is a quality improvement project for the clinic and a course requirement for the student, does the student need any additional approvals to review patient records for this project? A. No, only research projects require approval. B. Yes, the student needs to obtain appropriate permission to examine the data. C. No, this is a required student project and it is being conducted at the student's own university medical center. D. Yes, the student's project must undergo a full IRB review and approval.
B. Yes, the student needs to obtain appropriate permission to examine the data.
Which graph displays the most information?
Box and whisker. It depicts the overall and central distribution, quartiles, and outliers.
Outliers can be identified using a (SELECT ALL THAT APPLY) -Box-and-whisker plot with Tukey fences. -Histogram. -Stem-and-leaf plot. -Bar chart. -Any/all of the above.
Box-and-whisker plot with Tukey fences.
The best way to determine whether data are normally distributed is to: A. Construct a box-and-whisker plot with Tukey fences. B. Construct a histogram. C. Perform a Shapiro-Wilk test. D. Perform a test for outliers.
C. Perform a Shapiro-Wilk test.
Testing a new piece of equipment to ensure that the equipment consistently provides that same measurement when the same variable is measured multiple times is an example of: A. Accuracy B. Validity C. Reliability D. All of the above
C. Reliability
What percentage of observations fall within ± 2 standard deviations of the mean? A. 25% B. 50% C. 68% D. 95% E. 99%
D. 95%
When analyzing the variable gender, which was measured in 5 categories (male, female, transgender male, transgender female, other), the data should be displayed using a: A. Histogram. B. Box-and-whisker plot. C. Pie chart. D. Bar chart. E. Line chart.
D. Bar chart.
Quality of life is measured on a scale from 1 (extremely poor quality) to 5 (best possible quality). The best graphic display of these measurements would be a A. Bar chart. B. Stem-and-leaf plot. C. Box-and-whisker plot. D. Histogram
D. Histogram
In a box-and-whisker plot, the length of the box is equal to the: A. Midrange. B. Epidemiological range. C. Statistical range. D. Interquartile range.
D. Interquartile range
The most frequently occurring number in a set of numbers is the: A. Median. B. Midrange. C. Frequency. D. Mode. E. 50th percentile.
D. Mode
The operational definition of a variable: A. Explains what the variable means. B. Specifies the precision of the measurements. C. Describes the validity of the variable. D. Specifies how a variable should be measured.
D. Specifies how a variable should be measured
The dependent variable is: A. The intervention being studied. B. The experimental variable. C. The variable manipulated by the researcher. D. The outcome variable.
D. The outcome variable.
_____ statistics describes data, and _____ statistics makes inferences.
Descriptive inferential
______ epidemiology describes the frequency and pattern of data. Frequency is the number of events, and pattern is the 3 Ws: ____, ____, and ____.
Descriptive who where when
______ refer to the physical, biological, social, cultural, economic and behavioral factors that influence health.
Determinants
What is the statistical range?
Difference between the largest and smallest values, e.g., 50 - 30 = 20
A data plan must address: A. How the data will be measured B. How the data will be recorded C. How the data will be entered into an electronic file D. How the data will be examined for accuracy E. All of the above
E. All of the above
T/F: The best place to store data from a clinical project conducted by 2 students and their advisor would be in a cloud-based system such as Google Docs so that all team members can easily access the data and be sure that they are using the most current version of the data.
False
T/F: The frequency and relative frequency presented as a percentage should both be reported when describing interval level data.
False
T/F: A conceptual definition defines how a variable will be measured, and includes the equipment and procedures that are used in the measurement.
False (Operational)
T/F: There can only be one dependent variable.
False (independent)
T/F: The validity of a weight measurement obtained with an electronic scale refers to the repeatability of the measurement.
False (reliability)
T/F: Inferential statistics are used to organize, summarize, and report data
False. Descriptive.
T/F: When a continuous variable has a normal distribution and only 1 small outlier, the mean and standard deviation should be used to describe the variable.
False. Median for any outliers.
T/F: Bar charts and histograms can both be used to describe interval level data.
False. Never bar charts.
T/F: The first column of a data file usually contains a personal identifier.
False. Non-personal record identifier (01, 02, 03, etc. or randomly generated number or letter combination)
T/F: Measures of central location and spread are used for both categorical and continuous data.
False. Only continuous.
T/F: The first step in any data analysis, no matter how large, how complex, or how important, is to compare the data.
False. The first step is to describe the data.
T/F: The mean is the number in the middle of the observations, when the observations are arranged from smallest to largest.
False. The mean is the average.
T/F: A student is planning on conducting a quality improvement project on a unit in the university hospital where the student is enrolled. Since the project is required by the university and the university owns the hospital, the student does not need to obtain additional permission for the project.
False. The student needs institutional approval from the hospital.
A bar chart or histogram for nominal or ordinal data would include which statistic on the Y axis?
Frequency (n)
What are the descriptive statistics and graphic display for nominal data?
Frequency or count (n) Relative frequency (%) Bar chart
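A minimal sketch of these descriptive statistics for a nominal variable, using hypothetical disease-status codes and `collections.Counter`.

```python
from collections import Counter

# Hypothetical nominal data: one disease-status code per record.
statuses = ["present", "absent", "present", "unknown", "present"]

frequency = Counter(statuses)  # count (n) per category
n = len(statuses)

# Relative frequency as a percentage of all records.
relative_frequency = {
    category: count * 100 / n for category, count in frequency.items()
}

for category, count in frequency.items():
    print(f"{category}: n = {count}, {relative_frequency[category]:.0f}%")
```

These counts are the values that would go on the Y axis of the bar chart.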
What are the descriptive statistics and graphic display for ordinal data?
Frequency or count (n) Relative frequency (%) Histogram (or bar chart if unavailable)
What does a Z score tell you?
How many standard deviations away from the mean a value is. If it is positive, the value is above the mean. If it is negative, the value is below the mean.
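As a worked example, a Z score is the value minus the mean, divided by the standard deviation. The mean of 100 and SD of 15 below are hypothetical numbers chosen for easy arithmetic.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical distribution with mean 100 and SD 15.
z_above = z_score(130, mean=100, sd=15)  # positive: above the mean
z_below = z_score(85, mean=100, sd=15)   # negative: below the mean

print(z_above, z_below)
```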
If you report the median, which measures of spread should be reported?
IQR (also best practice to report Q1, Q3) and epidemiological range (max and min)
Research requires approval from the _____.
IRB (Institutional Review Board)
(Ordinal/nominal) data can also include cumulative frequency and cumulative relative frequency.
Ordinal
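Because ordinal categories have an order, running totals are meaningful. A short sketch, assuming hypothetical quality-of-life scores on a 1 to 5 scale:

```python
from collections import Counter
from itertools import accumulate

# Hypothetical ordinal data: quality-of-life scores from 1 (worst) to 5 (best).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
levels = [1, 2, 3, 4, 5]  # categories listed in rank order

counts = Counter(scores)
frequency = [counts[level] for level in levels]
cumulative_frequency = list(accumulate(frequency))  # running total of counts
cumulative_relative_frequency = [
    c * 100 / len(scores) for c in cumulative_frequency  # running total, in percent
]

for level, f, cf, crf in zip(
    levels, frequency, cumulative_frequency, cumulative_relative_frequency
):
    print(f"score {level}: n = {f}, cumulative n = {cf}, cumulative % = {crf:.0f}")
```

For nominal data the category order is arbitrary, so a running total of this kind would carry no meaning.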
What are the commonly reported quartiles?
Q1 and Q3
What is the IQR?
Q3 minus Q1 (reported as one value, best practice to include Q1 and Q3)
______ is about the precision of measurements.
Reliability
_____ means that the measurement can be repeated or reproduced. It does not mean that it is _____.
Reliability accurate
The first four core epidemiological functions, ______, _____, ______, and _____, are used to _____ the distribution and determinants of health, as well as evaluate the ______ of health interventions and services.
SAFE: surveillance analytic studies field investigations evaluation describe/identify efficacy
What normality test do you use for univariate continuous data?
Shapiro-Wilk
Which measure of spread identifies the individual measure of spread in the data set?
Standard deviation
_____ refers to systematic surveillance, observation, experimentation and use of a scientific approach
Study
The epidemiological range is: A. The difference between 2 standard deviations above and below the mean. B. The minimum and maximum values. C. The difference between the 25th and 75th percentiles. D. The difference between the largest and smallest values.
B. The minimum and maximum values.