hca 465 chapter 5

coefficient alpha

aka "Cronbach's alpha"; a measure of internal consistency among a group of items (e.g., survey, test, or interview questions) -allows researchers to determine how well the items measure different aspects of the same topic -values range from 0 to 1; values closer to 1 indicate higher internal consistency than values closer to 0
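
A minimal sketch of how coefficient alpha can be computed, assuming the standard formula alpha = (k / (k - 1)) x (1 - sum of item variances / variance of total scores), which this card does not state. The function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one list per respondent, each holding k item scores."""
    k = len(scores[0])
    # Variance of each item (column) across respondents.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 4 respondents answering 3 items on a 1-5 scale.
responses = [
    [1, 2, 1],
    [2, 3, 2],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Respondents who score high on one item here also score high on the others, so the result lands close to 1.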

inter-rater reliability

aka "inter-observational reliability"; the scoring or observations remain consistent regardless of the person who is doing it -investigators use a checklist to standardize and increase reliability by ensuring evaluators are collecting the same type of observational data

internal validity

focuses on the rigor of the study; its primary types are face validity, criterion validity, construct validity, and content validity

measurement bias

happens during data collection by researchers or subjects because of systematic errors in measurement

relationship between internal and external validity

-INTERNAL validity is more critical than external -without internal validity, research is not testing what it purports to measure (however, as the study inclusion criteria become more selective, the results become less generalizable)

pilot test

-essential for both quantitative and qualitative studies -it is a complete dress rehearsal before data collection; involves every aspect, including environment, data collection, content, and outcomes of a study -involves conducting a preliminary test of data collection tools and procedures to identify and eliminate problems -when executed correctly, pilot tests save time and resources during the actual data collection process

ways to reduce measurement errors

-pilot test -data collection training -double data entry -statistical consultation -triangulate data collection

how to conduct a pilot study by categories

-sample of respondents: recruit a small number (fewer than 10) of individuals with characteristics similar to the actual sample to test the method of recruitment. If it is difficult to recruit for the pilot test, then recruitment methods for the actual study need to be revised.
-data collection: helps researchers determine if revisions are needed in the actual instrument or in the data collection procedures. Steps: 1) create the exact environment that will be used for the actual data collection 2) when possible, observe participants while they complete surveys during the pilot test 3) after the surveys or interviews are complete, ask the participant a few questions
-data analysis: enter the data and conduct a pilot test of the data analysis procedures
-outcome: conduct the actual study

benefits of pilot testing

-taking the time to conduct a thorough pilot test is always worth the time and money, because once data is collected, it is too late to fix mistakes that could prove to be fatal flaws in the entire study -when reporting results, investigators document the pilot study and revisions made based on results, adding authenticity

randomized controlled trial designs

-the gold standard of research design -participants are randomly assigned to either a treatment or placebo group -participants have similar characteristics (age, gender, or length of diagnosis) -because of random assignment, researchers can draw conclusions with confidence if one group is significantly different at the end of the study -controlled study (one treatment and one placebo group)

what 3 areas should one investigate when looking for systematic errors?

1) environment 2) observation 3) drift *the problem with each type of error is even when the error is recognized, it is difficult to determine when it began in the data collection process and how much the systematic error influenced actual results

factors that influence generalizability (external validity)

1) population (also known as selection and treatment): when population selection is too specific, treatment is matched to a specific sample and is not applicable to a wider population
2) environment: studies conducted in particular spaces/settings
3) temporal/sequential factors: the time of year/season a study was conducted in can affect the results
4) participants: animal-to-human links; human-to-human links; gender bias; racial bias; cultural and ethnocentric bias
5) testing and treatment interaction: if participants learn from the pretest, they may be less likely to learn as much from the treatment
6) reactive arrangements (Hawthorne effect): if individuals change their behavior when observed (threat to internal validity), results are not generalizable to real-world conditions (threat to external validity)
7) multiple treatment conditions: the same individuals exposed to multiple treatments; because multiple treatments may create an artificial setting that does not exist in the real world, results may not be generalizable
*additionally, researchers and evaluators must be aware of all possible threats that affect INDEPENDENT variables

Validity

the extent to which a test measures what it purports to measure

split-half technique

a technique to measure homogeneity by dividing the entire test or survey into two equal halves (e.g. odd numbered and even-numbered questions). -The two forms are administered to the same individuals. -if the odd-numbered questions yield the same results as the even-numbered, the entire test is deemed reliable
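
A minimal sketch of the split-half idea, assuming the two halves are compared with a Pearson correlation of each respondent's odd-item and even-item totals (the card only says the halves should "yield the same results"). The helper names and the scores are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half(scores):
    """Correlate odd-numbered and even-numbered item totals per respondent."""
    # Questions 1, 3, 5, ... sit at list indexes 0, 2, 4, ...
    odd_totals = [sum(row[0::2]) for row in scores]
    even_totals = [sum(row[1::2]) for row in scores]
    return pearson(odd_totals, even_totals)

# Hypothetical data: 4 respondents, 4 items scored 1-5.
survey = [
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 1],
    [4, 4, 4, 5],
]
r = split_half(survey)
print(round(r, 2))
```

A correlation near 1 suggests the two halves rank respondents the same way; a value near 0 suggests the items are not measuring the same thing.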

Reactive Arrangements (Hawthorne Effect)

applies to external validity as well; if the individuals change their behavior when observed (threat to internal validity), results are not generalizable to real-world conditions (threats to external validity)

what is reliable research based on?

based on several basic assumptions, including randomized controlled trial designs and adequate sample sizes, and is free of known bias

observation

can change due to observer fatigue, time of day, room temperature, training of different observers, attitude of participants, and various other human behavior variables

Test-retest technique

checking the stability of an instrument by giving it to an individual and then giving it to her again after a certain period of time -involves two assumptions: 1) the item or observation does not change over time 2) the time between the first and second administration is long enough

reliability

consistency of the instrument or survey being used, not the respondents -related to consistency or the ability to repeat results -important because if a test's results are different each time, the test is faulty and can lead to inaccurate conclusions -can be established in one of two ways: stability and internal consistency

external validity

deals with generalizability; if evaluation or research is repeated with different populations, situations, times, or environments, the results are expected to be the same -a threat to this might explain how generalizations are incorrect -repeating evaluation methods and research in different populations is the best way to assess generalizability

controlled study

defined as one group receiving treatment while another, similar group receives no treatment -may be randomized from original groups of participants, or may be similar groups at a different location

systematic errors

errors that are consistent in the same direction, so that no matter how many times the experiment is repeated, the same errors occur -validity is influenced by these kinds of errors -these errors introduce inaccuracy to the measurement, cause bias in the data, and diminish the extent to which the test is measuring what it purports to measure -problematic to detect and eliminate -it is not possible to reduce the effect of these kinds of errors through statistical methods

double data entry

evaluators and researchers enter a portion of the data twice and then compare the two entries -if there are multiple errors, investigators go back to the original data and determine why the errors are occurring
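
A minimal sketch of the comparison step in double data entry, assuming each entry pass is stored as a dictionary keyed by record ID. The function name `compare_entries`, the field names, and the values are hypothetical.

```python
def compare_entries(first, second):
    """Return (record_id, field, first_value, second_value) per mismatch."""
    mismatches = []
    # Only records keyed in during both passes can be compared.
    for record_id in sorted(first.keys() & second.keys()):
        a, b = first[record_id], second[record_id]
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches

# Hypothetical data: the score for record 101 was transposed on the second pass.
entry_1 = {101: {"age": 34, "score": 12}, 102: {"age": 51, "score": 9}}
entry_2 = {101: {"age": 34, "score": 21}, 102: {"age": 51, "score": 9}}

for mismatch in compare_entries(entry_1, entry_2):
    print(mismatch)
```

Each mismatch points back to a record that should be checked against the original paper form or source file.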

relationship with similar measures

ex: researchers are interested in developing a shorter depression scale for adolescents. Adolescents are asked to complete both the older, longer depression scale with established validity and the newly developed, shorter depression scale. If the measurements from the established depression scale are similar to the measurements from the new scale, then researchers conclude there is evidence of construct validity.

face validity

examines how the test appears *NOT based on theory but merely the appearance of the test (potential ease of completion, comprehension, and readability) -can be compared to "showing up in the correct outfit" or "looking the part"; if the survey does not look appealing, individuals are less likely to complete it -also refers to the logical sense of the survey (do the questions relate to the subject being addressed?)

content validity

how well a test measures the specific content it is intended to measure

testing and treatment interaction

if participants learn from the pretest, they may be less likely to learn as much from the treatment -if a participant is sure that they received a perfect score on the pretest, they are less likely to listen intently during the class, because they already know the information

relationship between reliability and validity

if the scale is VALID, the results should be the same when the same person is tested on it over and over, because it measures what it is intended to measure. HOWEVER, the reverse is not always true: a reliable scale is not necessarily valid. Reliability is about consistency and repeatability.

types of validity

internal validity (face validity, criterion-related validity, construct validity, content validity); external validity (

statistical consultation

investigators consult with statisticians to seek assistance with entering data and determine ways to reduce and/or measure data errors

intervention bias

involves how intervention groups are treated differently than control groups if the researchers involved know which group is which

Is it more important for a test to be reliable or valid?

it is more important for a test to be valid than reliable

criterion-related validity

measures one topic in two different ways *most common in having a written test and then a test that applies the skill (like a written test for your permit and a behind-the-wheel driving test for your license) *it is important that the learner knows the didactic information as well as how to perform the skill

inter-rater reliability score

(number of concurrences / number of opportunities for concurrence) x 100 ex: investigators use a predetermined checklist to observe 80 patients in the clinic waiting room; according to their checklist, they agree on 68 out of the 80 ratings. The calculation is (68 / 80) x 100 = 0.85 x 100 = 85% -therefore, the investigators agree 85% of the time, or their observations have a reliability of 85%
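
The score formula on this card can be sketched in a few lines. The function name `percent_agreement` and the "yes"/"no" rating labels are assumptions for illustration; the 80 ratings and 68 agreements come from the card's example.

```python
def percent_agreement(rater_a, rater_b):
    """Concurrences divided by opportunities for concurrence, times 100."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same observations")
    concurrences = sum(a == b for a, b in zip(rater_a, rater_b))
    return concurrences * 100 / len(rater_a)

# Rebuild the card's example: 80 ratings, agreement on 68 of them.
rater_a = ["yes"] * 80
rater_b = ["yes"] * 68 + ["no"] * 12

score = percent_agreement(rater_a, rater_b)
print(f"{score:.0f}%")  # 85%
```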

random errors

occur by chance and are inconsistent across respondents
-these errors increase or decrease results in an unpredictable manner; therefore, researchers and evaluators have no control over their occurrence
they influence consistency in several ways:
1) participating INDIVIDUALS may change from day 1 to day 2
2) TASKS may change between day 1 and day 2
3) if there is a small sample of participating individuals, outside forces have a greater effect on the outcome
*each of these makes it harder to know whether the experiment reliably did what it was intended to do
4) these errors may also occur in written tests; there are 3 examples:
a) if the test is TOO SHORT, individual scores are based more on chance and luck than knowledge
b) if the test is NOT GRADED precisely the SAME WAY for each student, scores become inconsistent
c) if the test is NOT ADMINISTERED CORRECTLY
Ways to reduce these kinds of errors:
1) have a larger sample size; these kinds of errors are less influential in either direction when the sample size increases
2) average scores over a larger sample size to reduce these errors through statistical methods
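
A small simulation can illustrate why averaging over a larger sample reduces random error while a systematic error (a constant bias) stays put. This is a sketch under stated assumptions: the true value, noise level, bias, and seed are all hypothetical, and random error is modeled as Gaussian noise.

```python
import random

def mean_error(n, bias=0.0, noise_sd=5.0, true_value=100.0, seed=0):
    """Average n noisy readings and return how far the mean is from the truth."""
    rng = random.Random(seed)
    # Each reading = truth + constant bias (systematic) + noise (random).
    readings = [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]
    return sum(readings) / n - true_value

small = mean_error(10)                 # random error only, small sample
large = mean_error(10_000)             # random error only, large sample
biased = mean_error(10_000, bias=2.0)  # large sample plus a constant bias

print(f"n=10:     {small:+.2f}")
print(f"n=10000:  {large:+.2f}")
print(f"biased:   {biased:+.2f}")
```

With the large sample, the random noise averages out to nearly zero, but the biased run still sits about 2 units away from the true value no matter how many readings are averaged.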

selection bias

occurs if specific individuals or groups are purposely omitted from the investigation

bias

occurs in research and evaluations from numerous causes including selection, measurement, and intervention.

relationship with experimental measures

patients are asked to complete two anxiety scales: one with established validity and a newly developed scale. In the midst of potential anxiety, patients should score higher on both scales. If both scales show similar levels of anxiety, researchers should conclude evidence of construct validity

environment

refers to how the setting changes over the course of research -can be a source of systematic errors ex: if the research was conducted outside, the temperature, humidity, wind speed, or heat index may have caused variations in data and results that are difficult to pinpoint

cultural and ethnocentric bias

research conducted on one specific cultural group is not generalizable to other cultures. For example, Spanish-speaking individuals do not all share the same cultural background (i.e., there are differences among the cultures in Puerto Rico, Mexico, Nicaragua, Honduras, Venezuela, and Spain) -religions can also contribute to cultural differences among various populations

racial bias

research conducted only on African Americans, for example, does not yield generalizable results to other racial groups. In today's world of linking genetics to health conditions, issues of racial inclusion are more important than ever.

gender bias

research including only men or only women is not generalizable to the non-represented gender group. The same is true for studies that include only heterosexual individuals and fail to recognize LGBTQ individuals.

comparison scores among defined groups

researchers create a new dexterity aptitude test with the hypothesis that adults have varying degrees of dexterity. *more than 1000 adults complete the new test. If results show that adults in specific fields (e.g. surgeons, dentists, musicians) score higher on the new dexterity aptitude test than adults in non-dexterity applications, there is evidence of construct validity

human-to-human links

researchers use college students for many studies because the students are a convenient sample. However, results are questioned because data gathered from college students may or may not be generalizable to other groups of young adults not attending college.

internal consistency

the extent to which each question in a survey is related to the same topic; also called "homogeneity" -in quantitative research, researchers use a split-half technique to measure homogeneity -Test A and Test B scores should be approximately equal, with all questions measuring concepts related to the material covered in the tested chapter

randomized

the research participants are randomly assigned to either a treatment group or a non-treatment group (placebo group) -allows researchers to draw conclusions with confidence if one group is significantly different at the end of the study
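
Random assignment can be sketched as a shuffle followed by an even split. The function name `randomize`, the participant IDs, and the fixed seed (used only so the example is repeatable) are assumptions for illustration.

```python
import random

def randomize(participants, seed=None):
    """Shuffle participants and split them into treatment and placebo groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "placebo": shuffled[half:]}

# Hypothetical roster of 20 participant IDs.
roster = [f"P{i:02d}" for i in range(1, 21)]
groups = randomize(roster, seed=42)

print(len(groups["treatment"]), len(groups["placebo"]))  # 10 10
```

Because chance alone decides who lands in each group, characteristics such as age or gender tend to balance out across the two groups as the sample grows.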

multiple treatment conditions

the same individuals exposed to multiple treatments *because multiple treatments may create an artificial setting that does not exist in the real world, results may not be generalizable

construct validity

used to measure a concept that is not actually observable *can be established by exploring relationships with similar measures, experimental measures, and comparison scores among defined groups

threats to internal validity

things that confuse or confound test and survey results and overall findings
9 key threats:
1) history: an event happens during research that influences the behavior of participating individuals
2) maturation: the natural changes that occur over time with individuals
3) testing: differences noted from pretest to posttest that can be attributed to students becoming familiar with the test
4) instrumentation: measures changes in respondent performance that cannot be credited to the treatment or intervention
5) regression: some respondents performing well on pretests and poorly on posttests or vice versa, merely by chance. Sometimes manifests as "regression to the mean," in which data from these high-pretest to low-posttest performances cancel each other out, and the overall score is similar to the average
6) ceiling effect and floor effect:
**ceiling effect: when all participating individuals perform extremely well on pretest and posttest, making it difficult to determine any changes the intervention may have had
**floor effect: when individual performance starts out low and remains low; leads investigators to think that individuals are unresponsive to treatment, when in fact the low performance may be caused by a factor outside the intervention or treatment altogether
7) attrition: individuals lost from the study; if a large number of individuals leave the study for a variety of reasons, results may reflect more about the individuals who stayed in the study than the treatment conditions of the study
8) selection: when participating individuals are different at the onset of the study; it can make it difficult to know whether the type of condition or the treatment used influenced the results
9) Hawthorne effect: improving performance when you are aware that you are being watched

triangulate data collection

to increase reliability and validity, researchers and evaluators choose to collect similar data using more than one method

determining adequate sample size

-too small: results are inconclusive and significant differences among groups are statistically harder to determine -too large: cost, feasibility, and time become problematic *researchers strive for an ideal sample size that adequately represents the target population. However, a larger sample is preferred over a smaller one because of the increased precision and accuracy of the study

data collection training

training data collectors for consistency (e.g., detailed checklists, machine measurement calibration checks, inter-rater comparisons to avoid creating errors)

measurement errors

two types of errors that influence the results of surveys, tests, and instruments: random errors and systematic errors

dependent variables

variables that are measured as outcomes and are not manipulated by researchers

independent variables

variables that researchers manipulate and control

advantages and disadvantages of estimating reliability

ways to establish reliability: stability --> test-retest
*pros: single rater is adequate; no need to train teams of raters; less expensive and time consuming
*cons: often difficult to recruit the same respondents to respond twice; individuals may not respond as seriously the second time
ways to establish reliability: internal consistency -->
quantitative: split-half forms
*pros: respondents take both surveys at the same time; no need to recruit respondents twice
*cons: need to create a large pool of items
qualitative: inter-rater or inter-observation reliability
*pros: best for observational research, especially when video recording is used; possible option: one rater reviewing video at two different times
*cons: expensive, time consuming, requires a team of raters for best results

drift

when evidence suggests that the data are slowly moving in one direction -ex: may occur if the machine used for lab results is not calibrated each day prior to running complete blood count samples
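
One simple way to look for drift, offered as an assumption rather than a method from this card, is to fit a least-squares line to repeated readings of the same control sample and check whether the slope differs from zero. The function name `slope` and the daily readings are hypothetical.

```python
def slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

# Hypothetical daily readings of one control sample on an uncalibrated machine.
daily_control = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5]

print(round(slope(daily_control), 3))  # 0.1
```

A slope near zero suggests stable readings; a steady positive or negative slope suggests the instrument is moving in one direction and needs recalibration.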

animal-to-human links

when researchers use rats or other animals to test specific drugs, it is questionable whether humans will react to the new drug in the same way as rats

stability

when the results of a survey or instrument are consistent over time.

