RS HDS Chapter 13 Research and Data Analysis

Which organization makes patient-centered outcomes research one of its main priorities? Options: CDC, AHRQ, WHO, HHS

AHRQ

Data list or input

Data list or input includes the name of the variable, the type of the variable (such as string or numeric), how long the variable is, and how many decimal places to keep in the number (see example p. 357)

Pareto Chart contd

Example: A billing manager collected data over a period of time to determine the causes of claim denials for Medicare inpatient stays. After reviewing the chart, the billing manager can see that coding errors are where the department should start in order to reduce Medicare denials. The cumulative line indicates the overall effect if all of the areas were to be addressed. see pic

True or false: Although the CDC provides a great deal of data and reports, it does not provide funding for researchers to conduct their research.

False

True or false: Informed consent is only required for exempt research studies.

False

Patients must be involved in the design of the research study if the researcher is to receive funding from which of the following organizations? Options: PCORI, CDC, WHO, AHA

PCORI

True or false: Once the review category is determined, the researcher must then submit their research protocol to the IRB for review and approval.

True

True or false: PCORI, the Patient-Centered Outcomes Research Institute, provides funding to researchers to perform research that is patient centered and patient engaged.

True

True or false: The IRB determines whether research conducted on human subjects is appropriate and protects the participant's rights.

True

True or false: The major difference between WHO and the CDC and AHRQ is that WHO focuses its efforts across the world, whereas the CDC and AHRQ focus on the United States.

True

Which organization's primary focus is improving healthcare across the world? Options: CDC, WHO, AHRQ, Joint Commission

WHO

skewed distribution

An asymmetrical but generally bell-shaped distribution whose mode, or most frequent value, lies off to one side. The mean is sensitive to extreme values (outliers) and gravitates in their direction, which produces a long tail on that side. Data sometimes do not follow a normal distribution and are pulled toward one tail of the curve (p. 362).

continuous variable

A variable that can take any numerical value, including the fractional values between one whole number and the next

Qualitative variable

Categorical variables; all are discrete and include both nominal and ordinal variables. Nominal: a number is assigned to a specific category, such as 1 = male and 2 = female. Ordinal: a number is assigned to rank a category in an ordered series, but it does not indicate the magnitude of the difference between any two data points (examples: CSAT scorecard, patient questionnaire).

ratio variable

The most common quantitative variables used in healthcare; examples include length in inches or meters and weight in lb or kg

interval variable

Variables that have equal units with an arbitrary zero point; for example, temperature

Presentation methods

bar graphs charts tables

Variability

in a set of numbers, how widely dispersed the values are from each other and from the mean (range, variance, standard deviation)

Missing values

Missing values are cases in which no number or value has been assigned to the variable. Missing values can be displayed in the output and can be recoded by the user if necessary. One may need to recode or add a number if it was missed during data entry (see example p. 357).

quantitative variable

numeric variables that can be classified as discrete or continuous.

Stem and Leaf Plot Example

A stem and leaf plot is constructed on the number of discharges across cities in one particular state for MS-DRG 39, extracranial procedures w/o CC or MCC.
• The data are ranked from the smallest to the largest number of discharges: 14, 14, 15, 18, 18, 21, 24, 25, 27, 27, 29, 31, 32, 33, 34, 43, 45, 51, 66, 67
• To develop the stem and leaf plot, each number is separated into two parts.
o The first digit (the tens digit) is listed once in the stem column, and the last digit (the ones digit) is placed in the leaf column.
o The first number, 14, is separated so that 1 goes in the stem column and 4 is the first entry in the leaf column.
o For the second 14, the stem (1) is already listed, so its 4 becomes the second entry in the leaf column (44).
• Summary: the completed stem and leaf plot shows the distribution of the data set.
o The lowest value in the distribution is 14
o The highest value in the distribution is 67
o There are six observations in the 20s group
o The largest number of discharges is 67
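The construction described on this card can be sketched in a few lines of Python. This is only an illustration: the function name `stem_and_leaf` is made up for the example, and it assumes two-digit whole numbers like the discharge counts on the card.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem) and collect the ones digits (leaves)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # 14 -> stem 1, leaf 4
        plot[stem].append(leaf)
    return dict(plot)

# Discharge counts from the card, already ranked smallest to largest.
discharges = [14, 14, 15, 18, 18, 21, 24, 25, 27, 27,
              29, 31, 32, 33, 34, 43, 45, 51, 66, 67]

plot = stem_and_leaf(discharges)
for stem in sorted(plot):
    print(stem, "|", "".join(str(leaf) for leaf in plot[stem]))
# 1 | 44588
# 2 | 145779
# 3 | 1234
# 4 | 35
# 5 | 1
# 6 | 67
```

The row for stem 2 has six leaves, which matches the six observations in the 20s group.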

output

Output includes statistics that can be generated from the data that are collected and entered into the software or spreadsheet. These include descriptive statistics (frequency tables, percentiles, graphs, measures of central tendency) as well as advanced statistics (see example p. 357).

Steps to Analyze Information

(p. 363)
1. Know your objectives or the purpose of the data analysis
o What problem are you trying to solve?
• This includes obtaining the study objectives or specific aims for a quality, safety, or healthcare outcomes study
• An example would be the comparison of nosocomial infection rates at a facility to established CDC standards
• If the rate is higher than the standard, there is an identified problem that needs corrective action to comply with accepted practice
2. Know your audience
3. Understand how the data were collected and for what purpose
4. Recognize the different data types, since these dictate how to analyze the data
5. Start with basic types of data analysis and work up to more sophisticated analysis, if appropriate
6. Develop an interdisciplinary team of an HIM professional, a statistician or epidemiologist, an information technologist, and administration
o The team examines the methods used to analyze the data early so that feedback on the approaches is incorporated from multiple potential users of the information

Types of Data or Variables

see pic

Statistical packages

Statistical packages can be used to facilitate the data collection and analysis processes. -They simplify the statistical analysis of data and are often used in addition to spreadsheet software. Presentation software is used to build slides when presenting a specific topic, idea, research data, or any type of information. -Presentation of data and information is an important function of HIT ---LOS and nosocomial infection rate are often reported

Standard Deviation

the measure of variability that is used most often and displays how data are related to the mean

Value labels

Value labels assign a descriptive label to each coded value of a variable and appear in the output for easy interpretation. Example: Coder Status 1 = advanced, 2 = intermediate, 3 = beginner (see example p. 357)

discrete variable

Variables that can take on a finite number of values, usually whole numbers or numbers that can be counted

Bar chart notes

When constructing bar charts, it is important to know the audience, keep the chart simple, and make it clear, colorful, and concise. The main goal is to present the data succinctly so that it is clear and easy to understand. This includes providing a title, axes labels, a legend, a number within or above the bars, percentages, and appropriate colors to distinguish between groups.

One-variable bar chart

• Easiest bar chart to build • Displays a bar to represent the amount of the specific category see pic

two-variable bar chart

• The two-variable bar chart can further distinguish or classify additional variables. • This example includes both the number of healthcare facilities in the region and the number of trauma units in each of those settings. see pic

stacked bar chart

• Used when demonstrating a comparison of the proportion of two things • This example demonstrates the proportion of the number of trauma units in relation to the number of healthcare facilities see pic

horizontal bar chart

• Used when label titles are long and therefore more difficult to read • Used when sorting the data from the smallest amount at the top to the largest amount at the bottom see pic

horizontal bar chart

• Used when titles get longer • Used if data is shown from the smallest at the top to the largest at the bottom • Used when showing proportions of two areas see pic

Pareto Chart

•A Pareto chart is similar to bar chart oHighest ranking value is listed as the first column and the next highest ranking is second, and so on, to the lowest ranking. •Created by Vilfredo Pareto and based on his theory that "the significant few things will generally make up 80 percent of the whole, while the trivial many will make up about 20 percent" (Productivity-Quality Systems 2015). •Used to analyze data about the frequency or causes of problems in a process. •Shows data in terms of arranging it into categories and then ranking each category according to its importance. •Displays a cumulative line that shows the overall effect of each of the categories that make up the whole. •Useful in quality improvement processes
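As a minimal sketch of the ranking and cumulative line described here, the numbers behind a Pareto chart can be computed as follows. The denial categories and counts are hypothetical, invented only for illustration, and `pareto_table` is an illustrative helper name.

```python
def pareto_table(counts):
    """Rank categories from highest to lowest count and add a cumulative percent."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, count in ranked:
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows

# Hypothetical claim-denial counts (not taken from the textbook).
denials = {
    "Eligibility": 15,
    "Coding errors": 45,
    "Other": 5,
    "Missing documentation": 25,
    "Late filing": 10,
}

rows = pareto_table(denials)
for category, count, cumulative in rows:
    print(f"{category:<22} {count:>3} {cumulative:>6}%")
```

The first row is the category to address first, and the last cumulative value always reaches 100 percent because every category is included.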

Z score

•A Z-score is a standardized unit that provides the relative position of any observation in the distribution
o The Z-score is the number of standard deviations that the observed value lies away from the mean, μ.
o Transforming the raw observations to Z values makes it possible to make comparisons between distributions.
o Z-score = (observation x - mean μ) / standard deviation σ, that is, Z = (x - μ) / σ
o Z-scores represent the number of standard deviations above or below the mean, so a Z-score of -2.5 represents a score that is 2.5 standard deviations below the mean.

Measures of Variability: Range

•A measure of variability between the smallest and largest observations in a frequency distribution •Quick and easy to do, but not that useful since it only considers extremes and not the entire sample of data values oThe range in the systolic blood pressure data set (109, 116, 120, 140, 190) is: 190 - 109 = 81

Percentiles

•A measure used in descriptive statistics that shows the value below which a given percentage of scores in a given group of scores fall. o For example, the 40th percentile is the score below which 40 percent of the other scores in a given group of scores fall. If a score is in the 95th percentile, it is higher than 95 percent of the other scores. o A percentile can be broken up into quartiles. Quartiles are values that break up a list of numbers into quarters such as: •25th percentile or first quartile •50th percentile or second quartile •75th percentile, or third quartile Example: see image • Demonstrates how the age of 250 individuals is categorized in 25th, 50th, and 75th percentiles. • One can see that age 36 is at the 25th percentile, age 45 is at the 50th percentile, and age 53.25 is at the 75th percentile. • This shows that age 36 is the age below which 25 percent of the other ages fall, 45 is the age below which 50 percent of the other ages fall, and 53.25 is the age below which 75 percent of the other ages fall within this particular group of subjects. • This demonstrates that given the ages of the subjects, the majority are not considered elderly, since elderly would include those equal to or over the age of 65.
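The 250 ages behind the example are not reproduced on this card, so as a rough sketch the quartile calculation can be tried on the 20 discharge counts from the stem and leaf card. The choice of `method="inclusive"` is an assumption; statistical packages use several quartile conventions, and the cut points can differ slightly between them.

```python
import statistics

# Discharge counts from the stem and leaf example, ranked smallest to largest.
discharges = [14, 14, 15, 18, 18, 21, 24, 25, 27, 27,
              29, 31, 32, 33, 34, 43, 45, 51, 66, 67]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(discharges, n=4, method="inclusive")
print(q1, q2, q3)  # 20.25 28.0 36.25
```

The second quartile (50th percentile) is the same value as the median of the data set.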

Mixed-Methods

•A mixed-methodology approach includes using both quantitative and qualitative data in a research study design. o Research questions that focus on real life, multilevel perspectives, across many cultures o Using multiple methods (for example, intervention trials and in-depth interviews) o Integrating these multiple methods or combining them to extract the strengths of each o Focusing the research within philosophical and theoretical positions

Normal Distribution

•A theoretical family of continuous frequency distributions characterized by a symmetric bell-shaped curve, with an equal mean, median, and mode •Half of the observations fall above the mean and half below it

Bar Charts

•Are the simplest charts used to describe qualitative, categorical, or discrete variables such as nominal or ordinal data. •Bars may be vertical, where the value is represented by the height of the bar, or horizontal, where the value is represented by the length of the bar. •There are several types of bar charts.

Normal Distribution: Properties

•Bell-shaped curve is symmetrical about the mean and extends infinitely in both directions (positive and negative). o Total area under the curve equals 1, so the area of each half of the curve is equal to .50. o The area within one standard deviation of the mean is about 68.27 percent of the total, within two standard deviations about 95.45 percent, and within three standard deviations about 99.73 percent. o The distribution is defined by two parameters: the mean, μ, and the standard deviation, σ.
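The areas under the curve can be checked numerically. The sketch below uses the standard identity that the area within k standard deviations of the mean equals erf(k / √2); `area_within` is an illustrative helper name. Some printed tables show 68.26 and 99.74 percent, which differ from the computed values only by rounding.

```python
import math

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```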

Healthcare Research Organizations

•Centers for Disease Control and Prevention (CDC) is a US government organization oMission is to collaborate with the public to create the expertise, information, and tools people and communities need to protect their health, through health promotion, prevention of disease, injury and disability, and preparedness for new health threats. •World Health Organization (WHO) is the directing and coordinating authority on international health within the United Nations system •Agency for Healthcare Research and Quality (AHRQ) is a federal agency within HHS oMission is to make healthcare safer, higher quality, more accessible, equitable, and affordable, and to work within HHS and with other partners to make sure that the evidence is understood and used.

Introduction

•Data are more abundant and important than ever before. •Health information management (HIM) professionals are the bridge between data and information. •HIM professionals take data, present it clearly, and provide it to those who will use it to make important decisions.

Stem and Leaf Plots

•Data can be organized so that the shape of a frequency distribution is revealed

Scatter Charts

•Demonstrates a relationship between two variables. o One variable is plotted on the x-axis, and the other is plotted on the y-axis. • Strong relationship between two variables is seen when the data comes closer to forming a straight line. •When both variables increase and decrease at the same time, the scatter chart will show a positive relationship. •When one variable increases and the other variable decreases, the scatter chart will display a negative relationship. • Main goal is to illustrate nonlinear relationships between variables. •Used by researchers to determine quickly whether further calculations are needed o If the scatter plot demonstrates nonlinear relationships, then no further calculations, such as correlation or regression statistics, are needed.

Research Methodologies: Quantitative Studies, continued

•Descriptive studies: Research that is exploratory in nature and generates new hypotheses from the data collected
•Correlational studies: Similar to descriptive studies, except that a correlational study determines whether a relationship may exist between two variables
- The purpose of performing correlational studies is to determine which variables are connected in some way.
- Correlation does not equal causation.
•Retrospective studies: Conducted by reviewing records and asking the subjects to recall past events in order to determine the presence or absence of the independent variable under study
- This type of study is also called an analytic study because it tries to determine causation, or whether an independent variable produced the dependent variable.
- Example: case-control study
•Prospective study: A cohort (a group of people banded together or treated as a group) is followed to determine whether a particular characteristic or risk factor may be causing the disease or outcome under study
- A well-known example is the Framingham Heart Study.
- The prospective study starts with subjects who have the risk factors (exposed group) and are free from the disease and compares them to individuals without the risk factor (unexposed group) who are also free of the disease.
•Experimental study: Strives to establish cause and effect; it entails exposing participants to different interventions in order to compare the result of these interventions with the outcome
- eligibility of appropriate participants
- randomization (an indiscriminate method of assignment), which is important in effectively testing whether the intervention actually made a difference
- ethics
- An experimental study can also be referred to as a clinical trial.
•Quasi-experimental study: Similar to the experimental study, except that randomization of participants is not included, the independent variable may not be manipulated by the researcher, and there may not be a control or comparison group.
o Can be performed over time and may not include individual participants but whole healthcare systems
- Example: Kaiser Permanente study to examine the impact of EHR implementation on outcomes of diabetes patients.
•Qualitative research designs involve collecting types of data that reflect a participant's perceptions, attitudes, or feelings about a certain subject.
o The methods used to collect qualitative data can include observations, focus groups, case studies, informal conversational interviews, and in-depth interviews.
- Collect robust types of data
•Grounded theory is a research method that enables the researcher to develop a theory that is substantiated or confirmed by the data.
o It is a systematic method that can use multiple methods (both quantitative and qualitative findings) and pull them all together to develop a theory.
- Uses multiple methods to determine a new theory
•Ethnography is a methodology in which the researcher delves into a particular culture or organization in great detail in order to learn everything there is to know about it and to develop new hypotheses.
- It is not objective and includes the opinions of the researcher.

Measures of Variability

•Examine the spread of different values around the measures of central tendency. oRange oVariance oStandard deviation

Institutional Review Board: Exempt (the research activities most closely related to HIM)

•Exempt research activities: o Research conducted in an educational setting involving normal education practices, such as testing different teaching methods o Research that uses tests, interviews, or observations, unless the subjects are identifiable and the research poses risks to them o Collection or study of existing data or specimens, if publicly available or de-identified

Institutional Review Board: Expedited

•Expedited research includes those studies that pose only minimal risk to human subjects o Examples include those studies that collect information on human subjects that is identifiable and may include sensitive information such as identifiable health information on subjects that are HIV positive.

Z Score: Example

•For example, if a prospective employee scores 95 percent on a billing exam, with an employee average of 80 percent and a standard deviation of 5, the Z-score from the given formula will be: (95 - 80) / 5 = 3 •This means that the score of 95 percent is 3 standard deviations above the mean.
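The same calculation as a short Python sketch; `z_score` is an illustrative helper name, not part of any library.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Billing exam example: score 95, employee average 80, standard deviation 5.
z = z_score(95, 80, 5)
print(z)  # 3.0
```

A score of 67.5 on the same exam would give `z_score(67.5, 80, 5)`, which is -2.5, or 2.5 standard deviations below the mean.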

Institutional Review Board: Full Board Approval

•Full board approval is required for those studies that do not fall under exempt or expedited

Descriptive Statistics

•Include frequencies, percentiles, measures of central tendency (mean, median, and mode) and measures of variability (range, variance, and standard deviation).

Structure and Use of Health Information and Healthcare Outcomes

•Individual data: Healthcare data that is housed within the EHR, or data collected from a case study, a focus group of individuals, or during an interview or survey •Comparative data: When individual data is organized numerically and collated to evaluate against standards or benchmarks •Aggregate data: Individual, comparative, or other multiple sources of data are compiled and analyzed in order to draw conclusions about a specific topic or area

Institutional Review Board (IRB)

•Major focus of the IRB is to ensure that the research contains all the appropriate protections for the human subjects involved oThree major categories for IRB review and approval: exempt, expedited, and full board approval - The exempt category covers the research activities most closely related to HIM. •Protects human subjects as they are involved in research activities. •Institutional Review Board (IRB) determines whether research conducted on human subjects is appropriate and protects the participant's rights.

Measures of Central Tendency

•Measures of central tendency oMean (average) - the average of a group of values oMedian (the middlemost point) - when values are ranked, the median is the value with the same number of values above and below it oMode (most frequent value) - the value that occurs most frequently in a given set of observations or values

Variables Used with Measures of Central Tendency

•Modes are used mostly with nominal variables •Medians are used mainly for ordinal or ranked variables •Mean is used primarily with continuous or quantitative variables, such as interval or ratio.

Tables

•One type of tool used to display data; can include both numbers and text. •Organize and categorize data oExamines detail of a specific concept, category, or response. •The key to building a table is that it stands alone, so anyone reading it can understand the information displayed. •Every table should be composed of the following: o Table legend or title o Column titles o The body of the table that includes the actual data o Lines that divide certain parts of the table o Footnote or reference citation if the table text was taken from an article or other source

Histogram

•Represents the frequency distribution of numerical data •Used with continuous data that is part of a frequency distribution •Differs from a bar graph because histograms use continuous data oNo spaces between the bars oEach bar has a class interval at its base and the frequency or percentage of cases in that class interval at its height

Bubble Charts

•Similar to scatter chart except that it compares three data variables. • Figure 13.15 takes the data on personal income and adds two variables • Cost of hospital admission • Percent of the cost of hospital admission in relation to personal income • The larger the bubble, the more that patient had to pay for the hospital admission in relation to personal income

Pie Charts

•Simple graphs that use "slices of pie" to explain numerical proportion •Depict a breakdown of numerical data elements by percentages •Should not be used to compare data elements or when using many data elements because the slices of the pie can become too small to interpret •Good when explaining simple types of data, broken down into percentages see image

Measures of Central Tendency: Mean

•The average of a group of values •Numerator: Add each observation (i) from the first observation (i = 1) up to and including the last observation •Denominator (n): The total number of observations •For example, calculate the mean length of stay for cardiac patients for the month of January. The lengths of stay for these six patients were 5, 8, 6, 4, 7 and 3. oAdd the days of stay together: 5 + 8 + 6 + 4 + 7 + 3 = 33 total days oThen, divide by 6, the total number of patients: 33/6 = 5.5 days oThe mean stay of days per patient is 5.5 days
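The length-of-stay example works out the same way in Python; this is simply the arithmetic above written as code.

```python
import statistics

# Lengths of stay (days) for the six cardiac patients in January.
stays = [5, 8, 6, 4, 7, 3]

total_days = sum(stays)               # numerator: sum of all observations
mean_stay = total_days / len(stays)   # denominator: number of observations
print(total_days, mean_stay)          # 33 5.5

# The standard library gives the same answer.
print(statistics.mean(stays))         # 5.5
```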

Frequency and Percentile Frequency

•The number of times something occurs in a particular population or sample over a specific period of time Example: • If a researcher wanted to determine how often subjects considered themselves a leader in information governance (IG), they could ask the subjects whether they consider themselves a leader in IG and then count how many of the subjects said yes and how many said no. • They could then build a frequency table based on this question and its results. If there are 200 subjects in the study, the frequency table may look like table 13.7.
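A frequency table like the one described can be built with a simple count. Table 13.7 is not reproduced on this card, so the split of 120 "yes" and 80 "no" answers below is hypothetical and used only to show the mechanics.

```python
from collections import Counter

# Hypothetical answers from 200 subjects (not the textbook's actual counts).
responses = ["yes"] * 120 + ["no"] * 80

counts = Counter(responses)
total = sum(counts.values())

# Each row holds the frequency and the percentage of all subjects.
table = {answer: (n, 100 * n / total) for answer, n in counts.items()}
for answer, (n, pct) in table.items():
    print(f"{answer:<4} {n:>4} {pct:>5.1f}%")
```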

Measures of Central Tendency: Median

•The value with the same number of values above and below it •The middlemost value when the values are arranged in numerical order oFor an odd number of observations, the median is the middle number in an ordered set of numbers; for an even number of observations, it is the mean or average of the middle two numbers •For example, the systolic blood pressure of five patients is 140, 190, 120, 116, 109 oThe first step in computing the median is to rank these values from lowest to highest: 109, 116, 120, 140, 190 •The median is 120 since it is the middlemost value when counting from left to right and also from right to left oIf one more systolic blood pressure value is added to this data set, for example 140, the new data set will be: 109, 116, 120, 140, 140, 190 •The new median will be (120 + 140) / 2 = 260 / 2 = 130
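Both cases, odd and even numbers of observations, can be checked with Python's standard library, which sorts the values before picking the middle.

```python
import statistics

five = [140, 190, 120, 116, 109]   # odd count: the middle value is the median
six = five + [140]                 # even count: average of the two middle values

print(statistics.median(five))  # 120
print(statistics.median(six))   # 130.0
```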

Measures of Central Tendency: Mode

•The value that occurs most frequently in a given set of observations or values o In the systolic blood pressure scores example (109, 116, 120, 140, 140, 190), the mode is 140 because it occurs more than any of the other values o Sometimes data sets have two modes and are then called bimodal
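A short sketch of finding the mode in Python. The first list is the data set from this card; the second list is a hypothetical variation, made up to show what a bimodal result looks like.

```python
import statistics

pressures = [109, 116, 120, 140, 140, 190]
print(statistics.mode(pressures))  # 140

# Hypothetical bimodal data set: 116 and 140 each occur twice.
bimodal = [109, 116, 116, 140, 140, 190]
print(statistics.multimode(bimodal))  # [116, 140]
```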

Line Graph

•Used to display continuous data and to show changes or trends of the data over time •The x-axis designates time (such as month, day, or year) and the y-axis shows the quantity of the plotted data •Used when one has many different data points to plot or more than one set of data to be plotted oMultiple lines can be put on one graph for very useful comparisons. see pic

Measures of Variability: Variance and Standard Deviation

•Variance: The average of the squared deviations from the mean. oSymbol is σ² for populations and s² for samples. •Standard deviation: A measure of variability that describes the deviation from the mean of a frequency distribution in the original units of measurement; the square root of the variance oDisplays how data are related to the mean. •Variance and standard deviation can be cumbersome to compute by hand, but statistical applications make it easy to generate results automatically. •Example for systolic blood pressure data set: 109, 116, 120, 140, 190. The variance and standard deviation are both fairly large values, which means there is great variability of the systolic blood pressure scores around the mean. This makes sense since the mean of these five values is 135 and the scores range from 109 to 190, which demonstrates a large amount of variability.
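The card does not show the computed values, so here is a minimal sketch that works them out for the five systolic readings (whose mean is 135). Both the population (σ², σ) and sample (s², s) versions are shown, because the card does not say which one the textbook example reports.

```python
import statistics

pressures = [109, 116, 120, 140, 190]

value_range = max(pressures) - min(pressures)   # 190 - 109
pop_var = statistics.pvariance(pressures)       # divides by n
sample_var = statistics.variance(pressures)     # divides by n - 1
pop_sd = statistics.pstdev(pressures)           # square root of pop_var
sample_sd = statistics.stdev(pressures)         # square root of sample_var

print(value_range)                            # 81
print(pop_var, round(pop_sd, 2))              # 862.4 29.37
print(sample_var, round(sample_sd, 2))        # 1078 32.83
```

Either way, a standard deviation of roughly 30 against a mean of 135 confirms the large variability described on the card.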

Research Methodologies: Quantitative Studies

•Quantitative studies are those in which the data collected are collated numerically with descriptive, inferential, or predictive statistics. oMay include open-ended or qualitative-type responses by participants in the study. oDescriptive studies oCorrelational studies oRetrospective studies oProspective studies oExperimental studies •Randomization •Ethics oQuasi-experimental studies

Frequency Polygon

•Displays a frequency distribution using continuous data in a line form •Single data point placed at the midpoint of the interval is used to mark the specific number of observations within that interval •Each point is then connected by a line •Frequency polygons differ from line graphs in that frequency polygons display the entire frequency distribution (counts) of the continuous variable oLine graph plots only the specific data points over time

