CITI Internet-Based Research SBE
The internet can be used as a research tool or as the object of a study. Which of the following examples best describes an investigator using the internet as a research tool?
An investigator uses his Facebook wall to post a URL link to a survey he is hosting on SurveyMonkey.
To minimize potential risks of harm, a researcher conducting an on-line survey can:
Design the survey so that no direct or indirect identifiers are collected.
Consent to participate in research is an ongoing process. Which of the following strategies would help ensure that participation in a survey about a sensitive personal topic remains voluntary throughout a study?
Designing the survey so that subjects are not forced to answer one question before going to the next.
Researchers endeavoring to conduct an on-line study should consider that there are some potential risks of harm to subjects unique to internet-based research. One of these risks is:
Individuals may post private identifiable information about themselves on-line without intending it to be public and available to researchers.
Which of the following on-line research strategies raises the most concerns regarding the ethical principle of respecting the autonomy of research subjects and the corresponding federal regulations requiring informed consent?
A researcher proposes to join a moderated support group for cancer survivors posing as a survivor. She plans to insert comments to see how the members respond.
The internet as a research tool and as the object of study
Researchers can use the internet to conduct research with human subjects in two ways. They can:
- Use the internet as a research tool
- Use the internet as the environment to study human behavior
Researchers may use the internet to help them conduct their studies. Examples of internet research tools include...
Researchers may use the internet as a research setting in which they merely observe people's online behavior, often requiring little or no interaction with subjects. Examples of internet research environments include...
Researchers also can use computer programs to extract information from people's profiles on social networking websites or from MMORPGs. This method of data collection is known as "data scraping." Although data scraping may provide researchers a wealth of useful data with little effort, it presents several ethical issues that require consideration regarding informed consent, data security, and privacy.
Internet-based research: applying the federal regulations and ethical principles
The federal regulations and ethical principles should be applied to internet-based research.
Applying the Federal Definition of Research with Human Subjects to Internet-Based Research
According to 45 CFR 46, Subpart A (the Common Rule), "human subject" and "research" are defined separately. However, both definitions must be met to qualify as human subjects research: a systematic investigation designed to contribute to generalizable knowledge, involving living individuals about whom a researcher either obtains information through intervention or interaction with the individual and uses, studies, or analyzes that information, or obtains, uses, studies, analyzes, or generates identifiable private information (Protection of Human Subjects 2018).
Living Individuals
Asking subjects to complete an online survey or studying behavior in an online support group would meet the definition of research with human subjects because there are live humans about whom information is being collected and the researcher is interacting or intervening with them. However, determining whether you are collecting information about or interacting with "live human beings" may not always be clear in internet-based research. For example, suppose a researcher wants to study avatars in virtual worlds. Avatars are characters that represent a "real world" user online, such as on MMORPGs or online communities, and are used to interact and engage within the virtual world. Users can generally customize the physical appearance of their avatars, as well as their avatars' personality and character information. On some online platforms, it is understood that the information making up an avatar's character is fictional, such as avatars on the MMORPGs World of Warcraft and Final Fantasy. However, on some online platforms, such as web forums and groups, it may not be clear whether that information is fictional (TechTerms.com 2013).
Furthermore, researchers need to be aware that online landscapes may also make use of Non-Player Characters, also known as Non-Person Characters or Non-Playable Characters (NPCs). Although NPCs may look like an avatar, and in some instances be considered a "virtual human," they are not avatars because they are designed and controlled by a computer through artificial intelligence. If you obtain consent from an avatar, does the avatar become a human subject, by default, because it is controlled by a human? Do you need to obtain consent to use an avatar's information even if it is likely not to be representative of the human user?
Interacting/Intervening
As already noted, there are several ways in which researchers can use the internet as a research tool or as the object of study. Generally, when researchers actively engage and interact with individuals online to collect data, it is likely they are using the internet as a research tool. Conversely, if researchers collect data from individuals by merely observing the way people interact or behave online, it is likely they are using the internet as the object of study. But the degree to which researchers interact and intervene with subjects also may vary. Determining how the data are collected may help guide other aspects of the study's implementation, such as informed consent, privacy, and confidentiality protections. For example, the way subjects consent to an online survey is not the same as the way they consent to a Skype interview. Similarly, the way researchers apply privacy and confidentiality protections will differ depending on whether the information is collected from a private online support group or from a public blog.
Private Information
According to the Common Rule (Protection of Human Subjects 2018), private information is information that an individual can reasonably expect will not be observed or recorded.
In internet-based research, issues about private information usually arise when the internet is used as a research setting. One method to determine whether content on a webpage is public or private is to consider how a researcher can access the material. If the material could be accessed only by registering an account and logging in, then one could argue that the content on that page is private, whereas any material that could be accessed without an account or login could be considered public. However, this method may not be applicable to all websites or services. For example, some websites (such as the online music search engine Grooveshark) allow users to access their content for a limited time, or up to a certain number of "hits" (requests to view or access a file or page). After users have reached the site's threshold, the site might ask or require users to log in to continue accessing content. Do websites with these types of limitations provide researchers with information that is considered public or private?
Another method is to consider what the users' expectations are about the privacy of their online behavior. People use the internet to share information about themselves every day through different platforms like YouTube, Facebook, Instagram, Pinterest, Twitter, Yahoo, Strava, FitBit, LinkedIn, and many more. Users of these sites and services can share identifiable information about themselves, such as their full legal names, birthdates, email addresses, Global Positioning System (GPS) coordinates, job titles, and employers. However, users may not know, or may lose sight of the fact, that the information shared may be public by virtue of it being posted on the internet. What responsibilities do researchers have in determining their subjects' expectations of privacy?
If it is reasonable to expect that some individually identifiable information available online is public rather than private, collection of that information may be compared to collecting information about people from a local newspaper. Institutional Review Board (IRB) review would not be required in this case because there is no expectation that information published about us in newspapers is private.
However, individuals may have privacy expectations that are at odds with the technology they are using. For example, people in online support groups may consider their communications within the group to be private even if the group is publicly accessible. Their expectation may be that their communications within the group are not to be observed and recorded for purposes such as marketing or research. This expectation is not consistent with the reality that many online support groups are public and that millions of people can access these groups' online communications.
Users of social networking sites, such as Facebook, may or may not understand the privacy settings available to them. They may assume that information scraped from their profiles by researchers was, in fact, private. Conversely, users may have a thorough understanding of the privacy settings they apply to their profiles, but fail to realize that Facebook could change the privacy settings outlined in its Statement of Rights and Responsibilities without notice (Facebook 2012). Do researchers have any obligations to naïve users of these kinds of sites when assessing what is a reasonable expectation of privacy?
Individually Identifying Information
Protecting subjects' privacy as well as the confidentiality of identifiable study data requires considering the ways people identify themselves when using the internet. For example, individuals' internet identities may be very different from their "real" (or offline) identities.
Although researchers may not use a subject's real name, that subject's online username may serve as that subject's internet identity. In this scenario, a subject's online name is considered a direct identifier. Similarly, if a researcher reports on the features and attributes of an individual's MMORPG character (for example, the character's race, job classification, equipment, weaponry, and ranking), this may indirectly identify the individual. Should researchers afford individuals' online identities the same protections as their offline identities?
There are other re-identification methods unique to the internet. For example, direct quotes taken from a blog post or forum response can be retrieved easily using a simple Google search and thus may reveal the identity of the author. A series of unique data points also may allow subject re-identification, especially if these data points appear in more than one dataset. In addition, information that may not seem sensitive or identifiable at the time of collection may become sensitive or identifiable, inadvertently, as technological advances help researchers link, combine, and analyze data (Thompson 2012).
One unique data point that could, conceivably, be used to re-identify someone is an Internet Protocol (IP) address. An IP address is a unique identification number assigned to every computer connected to the internet, and is used by websites to send and retrieve information. It is said the internet "runs on IP addresses," which is perhaps the reason why it is quite simple to trace an IP address. In fact, there are several sites that provide users with instructions on how to trace an IP address in a few steps. Though it is not always possible to track down a specific individual using an IP address alone, the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule considers IP addresses direct identifiers, as does the European Union (OHRP 2010; HHS 2012b).
Some online research service providers, such as the survey tool-builder Qualtrics, offer researchers the option to be given their data without an IP address as an added measure to protect the confidentiality and privacy of respondents.
Applying Ethical Principles to Internet-Based Research
Respecting the autonomy of subjects and minimizing potential risks of harm to subjects are the ethical principles most relevant to internet-based research.
1. How can researchers respect the autonomy of subjects in internet-based research?
Respecting the autonomy of subjects requires researchers to give prospective subjects adequate information to make an informed decision before agreeing to participate in a study. This is done through the informed consent process, which includes:
- Providing subjects with the information they need in order to decide whether to take part in a study
- Documenting that the information was provided and that the subject volunteered to take part in the study
Read more about information provided in online consent forms...
Documenting consent online is quite different from documenting consent in person. Documenting consent in person often consists of a signed form that researchers and subjects keep in their possession. The challenge in designing a consent process for online studies is that they usually do not provide an opportunity for the researchers and subjects to directly interact. Fortunately for researchers, the Office for Human Research Protections (OHRP) allows them to obtain an electronic signature as a way to document consent (HHS 2012a). HHS regulations permit consent to be documented in a tangible format (written or in writing), which includes paper or electronic formats (Protection of Human Subjects 2018). An electronic signature can be an embedded image of a subject's signature, a handwritten signature produced by a tablet computer, or an encrypted digital signature (Thompson 2012).
Nonetheless, the Common Rule allows IRBs to waive the requirement to document consent from subjects, as long as the study presents no more than minimal risk of harm to subjects and the research does not involve procedures requiring consent outside the context of participation in the study. However, waiving the requirement to document consent does not waive the requirement for providing a consent process. One method of providing a consent process is to design an online consent form that includes a "live button" that subjects can click to demonstrate their consent. This version of an online consent form should include a statement to the effect of, "Clicking below indicates that I have read the description of the study and I agree to participate in the study." This way, subjects are actively demonstrating their consent to participate. Note that this process does not produce a signed consent form for subjects or researchers to store as part of their research records.
Other methods for securing consent for online studies include obtaining an electronic signature, as noted above, or sending a paper consent form to the subject via mail. Although the latter method produces a signed consent form, it also requires that researchers collect personally identifiable information about their subjects that may not be necessary (Thompson 2012).
Securing consent online also raises issues regarding children. In the U.S., each state determines the age of legal majority, and the age of legal majority differs from country to country. Unfortunately, researchers recruiting via the internet cannot know where respondents reside or whether respondents are 15 or 51. Researchers may use informal age verification measures, such as asking subjects to provide their age or year of birth, or stating that subjects must be adults to participate. However, these techniques do not guarantee honest responses or compliance.
Researchers could make use of several age verification methods that may prevent minors from participating in research, such as age verification services (AVS). Nonetheless, internet research experts argue that these products, though robust, are not infallible (Thompson 2012). Thus, in the absence of a foolproof method for determining the ages of internet users, researchers and IRBs should carefully consider these issues when reviewing proposed online studies.
Other issues related to respecting the autonomy of subjects in internet-based research include incomplete disclosure, deception, and the complete absence of an informed consent process. The internet offers researchers compelling opportunities to study behavior unmediated by the presence of an observer. The internet also provides unique opportunities for observational research in private settings. For example, a researcher can register as a member of a closed group with relative ease to observe interactions among the members, while concealing his or her identity as a researcher, or use public blog posts about a group of individuals' private lives for research purposes without the individuals' consent.
An IRB also may approve studies in which the informed consent process is completely waived, if it determines that the waiver is justified. For example, if a researcher wanted to study behavior and interactions among users of online virtual worlds, such as Second Life, the IRB could waive the requirement to obtain consent because seeking consent might alter subjects' behavior, so that requiring a consent process would make the research impracticable. In these types of studies, researchers "lurk" to collect data. In some instances, researchers may even adopt a false identity or create a fictitious website in order to observe private behavior.
Nonetheless, the relative ease of collecting data without consenting subjects calls for more discussion about applying ethical guidelines to research conducted in internet environments. As discussed earlier, other issues pertaining to respecting the autonomy of subjects include whether researchers bear the responsibility to interpret their subjects' expectations of privacy and how to recognize those expectations.
2. How can researchers minimize potential risks of harm to research subjects in internet-based research?
The greatest risk of harm to subjects taking part in social and behavioral sciences research is the inadvertent disclosure of private identifiable information that could damage their reputations, employability, or insurability, or subject them to criminal or civil liability. Several factors need to be considered when assessing potential risks. In most internet-based research, the primary risk of harm is loss of confidentiality.
Re-identification of Data
Some re-identification methods are unique to internet-based research. Two examples of re-identifying presumably de-identified datasets are the Harvard University study "Taste, Ties, and Time (T3)" in 2008 and the America Online (AOL) search data leak in 2006.
Read more about the T3 study and how an individual at another organization was able to re-identify the study subjects because the dataset included enough unique indirect identifiers...
Learn more about the AOL search data leak and how journalists were able to identify unique users...
Data Storage
How the data are stored also raises confidentiality issues. Researchers can expect more challenges as technologies such as cloud computing continue to develop. Cloud computing is the storage and delivery of resources, software, and information over an online network, rather than as a physical product (Mell and Grance 2011). Resources, software, and information are uploaded to and stored on "cloud servers" in the U.S. and abroad.
Users can then access and share these files remotely, and with relative ease. Encrypting data before uploading it to a cloud server is a recommended security measure.
Data Mining
New technologies are paving the way for careers and professions in data mining. Data mining is a field of computer science that involves methods such as artificial intelligence, statistics, database systems management, and data processing to detect patterns from large data sets (Chakrabarti et al. 2006). Academic institutions now offer training to individuals who want to use statistical methods to extract useful information from large, multiple datasets that can be used for research purposes, including marketing (Boston University 2012; Northwestern University 2012). It is important to consider the source of the data before mining it. In data mining, researchers may use data scraping techniques to collect information from Facebook or public blog posts without consent.
Unauthorized Access
If email communications are used to recruit subjects, they may not be secure and may provide a record of subjects' identities. Similarly, hacking and unauthorized computer access have become everyday possibilities that researchers need to anticipate when conducting research in the digital age. Encrypting data while it is stored on a researcher's server is a recommended security measure. It is for these reasons that researchers should collaborate with information technology professionals to develop a data security plan. However, even the most careful security procedures used by a researcher can be voided if the subjects' computers are not secure.
Describing Risks to Subjects
As mentioned in the previous section, the informed consent process should describe the extent, if any, to which the confidentiality of subjects' personally identifiable information will be maintained. It is clear that ensuring the confidentiality of personally identifiable information is challenging in internet-based research.
Some internet-based research experts have identified several "best practices" for describing confidentiality protections to subjects. Some of these best practices include an explanation of how data are transmitted from the subject to the researcher, how the researcher will maintain and secure the data, and a discussion to emphasize that there is no way to guarantee absolute confidentiality (Thompson 2012; Office for the Protection 2012).
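The confidentiality points above (IP addresses and usernames as direct identifiers, and re-identification through unique combinations of data points) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not a method described in the module: the field names, the sample records, and the two helper functions are all hypothetical, and a real data security plan should be developed with information technology professionals.

```python
from collections import Counter

# Hypothetical field names; a real survey export will use different ones.
DIRECT_IDENTIFIERS = {"ip_address", "username", "email"}
INDIRECT_IDENTIFIERS = ("birth_year", "zip_code", "job_title")


def strip_direct_identifiers(records):
    """Return copies of the records without any direct-identifier fields."""
    return [
        {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}
        for record in records
    ]


def unique_combinations(records, fields=INDIRECT_IDENTIFIERS):
    """Return indirect-identifier combinations that occur exactly once.

    A combination held by a single respondent could single that person out
    if the same data points also appear in another dataset.
    """
    counts = Counter(tuple(record.get(field) for field in fields) for record in records)
    return [combo for combo, count in counts.items() if count == 1]


# Made-up sample responses (203.0.113.0/24 is a documentation-only IP range).
responses = [
    {"ip_address": "203.0.113.5", "username": "user_a", "birth_year": 1985,
     "zip_code": "02139", "job_title": "nurse", "answer": 4},
    {"ip_address": "203.0.113.9", "username": "user_b", "birth_year": 1985,
     "zip_code": "02139", "job_title": "nurse", "answer": 2},
    {"ip_address": "203.0.113.7", "username": "user_c", "birth_year": 1990,
     "zip_code": "60601", "job_title": "teacher", "answer": 5},
]

cleaned = strip_direct_identifiers(responses)
risky = unique_combinations(cleaned)
print(risky)
```

Removing direct identifiers is only a first step. Any combination flagged by the second function would still need attention, for example by reporting broader categories, before the data could reasonably be treated as de-identified.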
Introduction
The internet, with an estimated 2.4 billion users worldwide as of 2013, has much to offer researchers both as a research tool and as the object of study. Let us consider some ways the internet can be used in research...
However, researchers must deal with ethical issues unique to the internet environment. These issues, such as the difficulty of confirming the adult status of online subjects, the "personhood" of pseudonymous online identities, and the potential re-identification of de-identified data, do not have ready solutions. Researchers must become conversant with new technologies, particularly when sensitive information must be safeguarded. Researchers also must become familiar with privacy and confidentiality policies offered by private companies or other agencies that conduct or host online research tools and services, such as online survey builders, to ensure they are adequately protective. This module will raise some issues for discussion and offer guidelines for protecting subjects and data in consideration of emerging technologies and the rapidly evolving uses of the internet.
Learning Objectives
By the end of this module, you should be able to:
- Identify some of the ways in which social, behavioral, and educational researchers are using new internet technologies.
- Discuss the application of the federal definition of research with human subjects to internet research.
- Discuss how ethical principles that guide research with human subjects can be applied to internet research.
- Identify some of the issues that must be addressed when designing a research study that uses the internet.
Which of the following methods could be considered a "best practice" in terms of informing respondents how their answers to an on-line survey about personal information will be protected?
The investigator uses the informed consent process to explain how respondent data will be transmitted from the website to his encrypted database without ever recording respondents' IP addresses, but explains that on the internet confidentiality cannot be absolutely guaranteed.
Summary
Though the internet has much to offer researchers, those using it have several issues to consider before designing an online research study. These issues include user expectations about the privacy of their information, informed consent processes, the potential for re-identification of collected data, and the absence of any absolute guarantee of confidentiality for personally identifiable information. Navigating these issues becomes more complex given the fluid nature of the online world and the rapid development of new internet technologies. It is for these reasons that experts in the field of internet research ethics have described internet-based research as a "moving target."
