Data and Society Combined

ideal user assumption

-data generated by people who express themselves honestly through their personal accounts -fails to hold under many circumstances

deontology (Salganik)

-focuses on ethical duties independent of their consequences -respect for persons rooted in this -focus on means

develop a code of conduct (10 rules)

-internalizing debates over ethics is key for successful research -public attention to ethical use of data shouldn't be avoided -provides guidance in peer review -visible case of unethical research can bring problems to an entire field

law enforcement (Brayne)

becomes involved once criminal incident occurs

big data (Herschel)

capturing, storing, sharing, evaluating, acting upon info from humans and devices

dragnet surveillance (Brayne)

collect data on everyone, increased monitoring of groups previously exempt

predictive, reactive or explanatory

data are increasingly used for ________, rather than _________ purposes

stratified surveillance (Brayne)

differentially surveilling individuals according to their risk score

human subject (def 1)

living individual who participates in research investigation as a recipient of an item regulated by the FDA, as a control, or on whose specimen an investigational device is used

merged

previously separate data systems are ___________

utilitarianism and big data (Herschel)

-acts and rules assessed to see where good and bad weighed on scale -would have to quantify plusses and minuses of big data consequences -ambiguity inherent in trying to identify pros and cons

justice

-addresses distribution of burdens and benefits of research -should not be the case that one group in society bears the costs of research while another reaps its benefits

justice (Salganik)

-addresses distribution of burdens and benefits of research -one group shouldn't bear costs while another reaps benefits -researchers shouldn't be allowed to intentionally prey on the powerless -views of this around 1990 shifted from protection to access -often interpreted to raise questions about appropriate compensation

LAPD case study (Brayne)

-at forefront of data analytics - invests heavily in data collection and analysis -involved in multiple high-profile scandals in 90s; in response, LAPD and Department of Justice entered a consent decree mandating creation and oversight of new data-driven employee risk management system -2011 began using platform for compiling and analyzing massive and disparate data -shift from traditional to big data surveillance associated w migration of law enforcement operations toward intelligence activities -use risk scores and field interview cards -shift from reactive to proactive, problem-oriented policing strategies

minimal risk standard

-attempts to benchmark risk of particular study against risks participants undertake in their daily lives -can make decision if something meets this standard even if don't know the absolute level of risk

Richard Herschel, Virginia M. Miori

-big data enables collection and use of massive amounts of data from man and machine -data characterized in terms of volume, variety, velocity, veracity, variability, complexity -four ethical theories talked about: kantianism, utilitarianism, social contract theory, virtue theory -digital media increasingly more data intensive and media rich -big data requires examination of those that have control over it bc can be used to target and manipulate people

more data are coming

-big data grow into more domains -also reach into past as libraries digitize their collections -linkages between different big data sources will become more common

models will become more generic

-creating generic models and making them available to public -let researchers use pretrained machine learning models on their own data -use big data to make most effective models, then make those models standard for processing unstructured data -generic not always better than specified

big data (Brayne)

-data environment characterized by it being vast, fast, disparate, and digital -analysis of large amounts of info -high frequency observations, fast data processing -comes from wide range of institutional sensors, merging of previously separate data

beneficence (Salganik)

-do not harm - shouldn't injure person regardless of possible benefits to others -maximize possible benefits and minimize possible harms -researchers should do risk/ benefit analysis and make decisions about whether risks and benefits strike appropriate ethical balance

virtue ethics

-emphasizes moral character rather than duties, rules, or the consequences of actions -character and actions of the people who deploy and use big data -also considering the intended and unintended effects of their actions on others

kantianism (Herschel)

-ethical theory concerned not w what we do but what we should do -dutifulness reflects good will -dutiful person acts way they do bc of moral rule -rules are paramount -everyone held to same standard and there are clear guidelines for appropriate behavior -not outcome that matters but rule behind the action

respect for law and public (Salganik)

-explicitly encourages researchers to take wider view, include law in considerations -compliance: researchers should attempt to identify and obey relevant laws, contracts, terms of service -transparency-based accountability: researchers should be clear about goals, methods, results at all stages of research, take responsibility -the IRB is a floor, not a ceiling: just filling out forms and following rules isn't enough, ethical responsibility still lies w researcher

consequentialism (utilitarianism) (Salganik)

-focuses on taking actions that lead to better states in the world -beneficence rooted in this -focus on ends

consider the strengths and limitations of your data (big does not automatically mean better) (10 rules)

-ground datasets in proper context including conflicts of interest -during data acquisition important to understand source and rules and regulations of data gathered -being mindful of data's context lets you clarify when data and analysis are working or not -sensitive to potential multiple meanings of data

engage with the broader consequences of data and analysis practices (10 rules)

-how might big data research lessen environmental impact of data analytics work -should researchers take lead in asking cloud storage to shift to sustainable energy -big data research has societal-wide effects

different data are coming

-image processing can analyze pictures now -tools to analyze these data increasingly being made available

practice ethical data sharing (10 rules)

-in some cases sharing data is expectation and key part of research -asking participants for broad, not narrowly structured consent, makes it easier to share data -even if broad consent gained should still consider best interest of participant -people followed by data clouds collected under mandatory terms of service -burden of ethical use and sharing placed on researcher

respect for persons (Salganik)

-individuals should be treated as autonomous -individuals w diminished autonomy should be entitled to additional protections -researchers shouldn't do things to people w/o consent -get informed consent when possible

qualitative approaches to big data

-innovative approaches weaving together qualitative methods and computational approaches -searching and sorting archives

design your data and systems for auditability (10 rules)

-internal auditing processes flowing easily into audit systems keep track of factors that could contribute to problems -automated testing processes for assessing outcomes can strengthen research -clearly document when decisions are made and backtrack to earlier dataset if needed

Sarah Brayne

-intersection of two structural developments: growth of surveillance and rise of big data -adoption of big data analytics facilitates amplification of prior surveillance practices -data used for predictive purposes, rather than reactive or explanatory -automatic alert systems make it possible to surveil very large number of people -social consequences of big data surveillance for law and social inequality -some individuals, groups, areas more surveilled than others, different populations surveilled for different purposes -two theories for why many institutions adopted big data surveillance: technical/rational perspective and institutional perspective -US criminal justice surveillance increased dramatically - incarceration, parole and probation -data-driven decision-making has become systematically incorporated into law enforcement practices in recent decades

guard against the reidentification of your data (10 rules)

-when datasets thought to be anonymized are combined w other variables, unexpected reidentification can result -metadata associated w digital activity useful in identifying individuals -difficult to recognize these vulnerable points
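
One rough way to look for such vulnerable points is to count how many records share each combination of quasi-identifiers. The sketch below is only an illustration: the records, the field names, and the choice of quasi-identifiers are all hypothetical, and a real audit would also need to consider what outside datasets could be linked in.

```python
from collections import Counter

# Assumed quasi-identifiers: fields that are not names but can single people out.
QUASI_IDENTIFIERS = ("zip_code", "birth_year", "sex")

# Hypothetical "anonymized" records: names removed, quasi-identifiers kept.
records = [
    {"zip_code": "08540", "birth_year": 1985, "sex": "F", "diagnosis": "asthma"},
    {"zip_code": "08540", "birth_year": 1985, "sex": "F", "diagnosis": "flu"},
    {"zip_code": "08540", "birth_year": 1962, "sex": "M", "diagnosis": "diabetes"},
    {"zip_code": "08542", "birth_year": 1990, "sex": "F", "diagnosis": "migraine"},
]


def key_of(record, fields=QUASI_IDENTIFIERS):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[f] for f in fields)


def unique_records(rows, fields=QUASI_IDENTIFIERS):
    """Return records whose quasi-identifier combination appears only once."""
    counts = Counter(key_of(r, fields) for r in rows)
    return [r for r in rows if counts[key_of(r, fields)] == 1]


def k_anonymity(rows, fields=QUASI_IDENTIFIERS):
    """Return the size of the smallest group sharing the same quasi-identifiers."""
    counts = Counter(key_of(r, fields) for r in rows)
    return min(counts.values())


at_risk = unique_records(records)
print(len(at_risk), "of", len(records), "records are unique on", QUASI_IDENTIFIERS)
print("k =", k_anonymity(records))
```

A record that is unique on these fields could be matched to a named person by anyone holding another dataset with the same fields, even though no name appears here.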

automated alerts

________ make it possible to surveil many people

human subject (def 2)

a living individual about whom a researcher obtains data through intervention or interaction w the individual or identifiable private information

respect for a person

all persons have moral worth and should be treated w dignity

power analysis

allows researchers to calculate sample size they need to reliably detect effect of given size
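
As a rough illustration, the calculation can be sketched for one common case: a two-sided comparison of two group means, using the normal approximation n = 2 * ((z_alpha + z_power) / d)^2 per group. The function name, the default alpha and power values, and the use of Cohen's d as the effect size are assumptions made for this example; exact t-based calculations give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a given effect.

    effect_size is Cohen's d (difference in means / pooled standard deviation).
    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller effects need much larger samples.
for d in (0.2, 0.5, 0.8):
    print(d, sample_size_per_group(d))
```

The required sample grows with the inverse square of the effect size, so halving the effect a study aims to detect roughly quadruples the number of participants.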

stakeholder

any group of individuals who can affect or are affected by the achievement of the organization's objective

big data hubris

belief that volume can solve all problems

technical/rational perspective (Brayne)

big data is means to improve efficiency through improving prediction, filling analytic gaps, effectively allocating scarce resources

virtue (Herschel)

character trait that is well entrenched in possessor, makes that person good

dragnet surveillance practices

collect data on everyone, rather than merely individuals under suspicion

surveillance (Brayne)

collection, recording, classification of info about people, processes, institutions

convenience census

complete record of certain set of individuals or behaviors matching certain criteria

variability (Herschel)

data flows can be inconsistent w periodic peaks

predictive purposes

data increasingly used for ____________

complexity (Herschel)

data is structured and unstructured from multiple sources

walled garden approach (Salganik)

data shared w people who meet certain criteria and who agree to be bound by certain rules

query-based system (Brayne)

databases to which users submit requests for info in form of a search

direct police contact

datasets now have info on individuals who haven't had __________

direct police contact

datasets now include info on individuals who have not had any __________

moral virtues (Herschel)

deep-seated habits or dispositions formed through repetition of virtuous actions over time

intellectual virtue (Herschel)

derived from reasoning and truth

unintended use paradox (Herschel)

different sets of data that wouldn't previously have been considered as having privacy concerns being combined in ways that threaten privacy

risk score

discretionary assessments of risk are supplemented and quantified using __________

risk scores

discretionary assessments of risk supplemented and quantified with __________

virtue ethics (Herschel)

emphasizes moral character rather than duties, rules, or consequences

deontological argument for informed consent (Salganik)

focus on researcher's duty to respect autonomy of participants

intelligence (Brayne)

fundamentally predictive: gathering data, identifying suspicious patterns, locations, etc.

alert-based system (Brayne)

get real-time notifications when certain variables are present in the data
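
The contrast with a query-based system can be sketched in a few lines: a query runs only when a user submits a search, while an alert rule is set up once and then checked against every record as it arrives. Everything below (the record fields, the rule names, the `AlertSystem` class) is a hypothetical illustration, not a description of any real law enforcement platform.

```python
# Hypothetical observations; the field names are illustrative assumptions.
database = [
    {"plate": "ABC123", "area": "north", "hour": 2},
    {"plate": "XYZ789", "area": "south", "hour": 14},
]


def query(db, **criteria):
    """Query-based: return stored records matching a user-submitted search."""
    return [r for r in db if all(r.get(k) == v for k, v in criteria.items())]


class AlertSystem:
    """Alert-based: standing rules are checked as each new record arrives."""

    def __init__(self):
        self.rules = {}  # rule name -> predicate over a record

    def add_rule(self, name, predicate):
        self.rules[name] = predicate

    def ingest(self, db, record):
        """Store the record and return the names of all rules it triggers."""
        db.append(record)
        return [name for name, pred in self.rules.items() if pred(record)]


alerts = AlertSystem()
alerts.add_rule("watched plate", lambda r: r["plate"] == "ABC123")
alerts.add_rule(
    "late-night read in north",
    lambda r: r["area"] == "north" and r["hour"] < 5,
)

# A query returns results only when someone asks.
print(query(database, area="south"))

# An alert fires on its own as soon as matching data comes in.
print(alerts.ingest(database, {"plate": "ABC123", "area": "north", "hour": 3}))
```

Because the rules run on every incoming record without anyone initiating a search, the number of people who can be monitored is no longer limited by how many searches users have time to run.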

context-relative informational norms

govern flow of info in specific settings -actors (subject, sender, recipient) -attributes (types of info) -transmission principles (constraints under which info flows) -a difference in any of these three yields a different set of norms for the situation

consequentialist argument for informed consent (Salganik)

helps prevent harm to participants

mass surveillance in the US

law enforcement databases now include facial recognition of 117 million people (about 1 in 2 adults)

staged trials

move up step by step (ex. testing effectiveness of new drug)

Palantir

one of premier platforms for compiling and analyzing massive and disparate data by law enforcement and intelligence agencies

data from multiple platforms will become standard

possible and easier for researchers to perform studies on different platforms

informational risk (Salganik)

potential for harm from disclosure of info

anonymization (Salganik)

process of removing obvious personal identifiers, much less effective than people realize

systematically surveil

proliferation of automated alerts makes it possible to ________ unprecedentedly large number of people

beneficence

researchers should undertake two separate processes: a risk/ benefit analysis and then a decision about whether the risks and benefits strike an appropriate ethical balance

ethical-response surveys

researchers present brief description of proposed research project then ask: -if someone you cared about was candidate participant would you want them to be included -do you believe researchers should be allowed to proceed w this experiment

common rule (Salganik)

set of regulations governing human subjects research

ethics

study of what it means to do the right thing

surveillance

systematic investigation or monitoring of the actions or communications of one or more persons

dataveillance

systematic use of personal data systems in the investigation or monitoring of the actions or communications of one or more persons

function creep (Brayne)

tendency of data collected for one purpose to be used for another, unintended one

research ethics

this suggests that researchers should make their studies as small as possible

respect for law and public

transparency-based accountability - researchers should be clear about their goals, methods, and results at all stages of their research and take responsibility for their actions

risk/ benefit analysis (Salganik)

understanding and improving risks and benefits of study

confirmation bias (Herschel)

when data selectively used to confirm preexisting viewpoint while disregarding data that refutes it

mass surveillance in the UK

-one CCTV (closed circuit television) per 12 people

David Lazer and Jason Radford

-issue w data is who and what get represented -certain big data can be vulnerable to changes in data generation process -scale of big data creates illusion they contain all relevant info on all relevant people -Twitter has become 'model organism' for social media scholars -generalizability is a question of reference -results from one pop. of users don't necessarily apply to another -fix this by using data from multiple sources to validate findings -big data systems susceptible to various kinds of error and misappropriation -more and different data are coming, models will become more generic, data from multiple platforms will become standard -qualitative approaches to big data -methodological integration - big data increasingly integrated w existing research methods in sociology

debate the tough, ethical choices (10 rules)

-many big data ethical issues outside of governance mandate of IRBs -debate issues w groups of peers -precondition of formal ethics rules is capacity to have such debates -if debate done well provides means to understand ethical issues from range of perspectives

IRB and data science

-may involve human subject as individual or may affect much wider group of people -moves ethical inquiry away from traditional harms like physical pain to less tangible concepts such as info privacy impact and data discrimination -fundamentally changes our understanding of research data to be infinitely connectable, indefinitely re-purposable, continuously updatable and easily removed from the context of collection

kantianism and big data (Herschel)

-organizations w big data not respecting autonomy of people, using personal data to further self-interest -by default, people's privacy compromised for gain of another -no one truly has ability to determine how their data is actually shared and used -should everyone assent to rule that states everyone's info can be shared w/o their permission -challenge of rights and fair treatment of individual

IRB origin

-originally developed in direct response to research abuse: -post-WWII doctors' trial -Tuskegee syphilis study -participants weren't told they had the disease and, even after a cure became available, weren't told or treated

social contract theory (Herschel)

-person's moral and/ or political obligations dependent upon contract or agreement people have made to form the society in which they live -people understand that must cooperate and agree to follow certain guidelines to gain benefits of social living -chose rationality over natural selfish instincts

acknowledge that data are people and can do harm (10 rules)

-places difficulty of disassociating data from specific individuals front and center -seemingly benign data can have sensitive and private info -data seemingly having nothing to do w people could impact their lives in unexpected ways -harm also when datasets about pop-wide effects used to shape lives or stigmatize groups

recognize that privacy is more than a binary value (10 rules)

-privacy is contextual and situational -depends on nature of data, context in which created and obtained, expectations and norms of those affected -social media utilizing locations to push info or tracking it for intelligence has been seen as breaches of privacy -privacy extends to groups

American Statistical Association's ethical guidelines

-professional integrity and accountability -integrity of data and methods -responsibilities to the science/public -instituted for protection and support of statistician

Operation LASER

-program to identify and deter people likely to commit crimes -premised on idea that small percentage of high-impact players are disproportionately responsible for most violent crime -list distributed to patrolmen w orders to monitor and stop the pre-crime suspects as often as possible -at each contact officers fill out field interview card

institutional perspective (Brayne)

-questions assumption that organizational structures stem from rational processes -role of culture - organizations operate in technically ambiguous fields, adopt big data due to wider beliefs of what organizations should be doing -big data may confer legitimacy

know when to break these rules (10 rules)

-recognize when is appropriate to stray (ex. in times of natural disaster may be important to temporarily put aside questions to serve larger public good) -review regulatory expectations and legal demands associated w protection of privacy -ethics often about finding good or better, but not perfect

utilitarianism (Herschel)

-right or wrong based on consequences of act or rule -right act is one that produces greatest happiness for community or society -wrong act decreases total happiness of affected parties -right moral rule of conduct is one where if adopted by everyone will have greatest net increase in happiness

mass surveillance

-surveillance of large groups of people -reason for investigation or monitoring is to identify individuals who belong to some particular class of interest to the surveillance organization

mass dataveillance

-systematic use of personal data systems in investigation or monitoring of actions or communications of groups of people -reason for investigation or monitoring is to identify individuals who belong to some particular class of interest to surveillance organization

utilitarianism theory

-theory of the good is fundamental -look at greater good/ greater benefits -examines right or wrong based on the consequences of an act or rule -the right one is one that produces the greatest happiness for a community or society -a wrong act decreases total happiness of the affected parties -focus on ends

deontological theory

-theory of the right is fundamental -choices should be made based on the rules -everyone is treated w dignity -obligation is independent of value -obligation based on reason alone -treat people as ends in themselves, never only as means to an end

Matt Salganik

-uncertainty about appropriate conduct of digital-age social research -ethical uncertainty had chilling effect preventing ethical research from happening -if can develop ethical norms and standards shared by researchers and public can harness capabilities of digital age in responsible and beneficial ways -norms around abstract concepts like privacy still actively debated, no uniform consensus -four principles: respect for persons, beneficence, justice, respect for law and public interest
