Global Data Law
Monaco Cloud, the Monegasque sovereign cloud set for launch in 2021 (Extended Monaco, September 2020).
"Based on sector-leading Amazon Web Services (AWS) technology, the Monegasque sovereign cloud will serve as the basis for the development and creation of new digital services in the Principality, including those related to the smart city, e-health, e-education and e-government."
Apple, Learn more about iCloud in China mainland, https://support.apple.com/en-us/HT208351
iCloud in Mainland China is operated by a local company to comply with Chinese regulations. Therefore all data stored with iCloud is subject to the terms and conditions of iCloud operated by that company. If you're not a Chinese citizen, you can edit your country/region setting to continue using iCloud under the current terms and conditions.
Passive Personality Principle of Jurisdiction
In limited circumstances, a state may exercise jurisdiction over acts committed abroad by non-nationals when those acts injure a national of the state.
World Wide Web Consortium (W3C)
Maintains the standards that define how information on the web is presented (in which formats) and accessed (through a web browser).
Packet-switching
Most messages are too big to send all at once, so they are broken into packets that are sent separately, usually along multiple paths. Packet switching is flexible and resilient: packets can be rerouted around "traffic jams" to travel more efficiently.
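The split-and-reassemble idea can be sketched in a few lines (a toy illustration with made-up function names; real protocols such as TCP/IP add headers, sequencing, checksums, and retransmission):

```python
import random

def to_packets(message: bytes, size: int) -> list[tuple[int, bytes]]:
    """Break a message into numbered packets of at most `size` bytes."""
    return [(i, message[i:i + size]) for i in range(0, len(message), size)]

def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the original message from packets arriving in any order."""
    return b"".join(chunk for _, chunk in sorted(packets))

msg = b"most messages are too big to send all at once"
packets = to_packets(msg, 8)
random.shuffle(packets)  # simulate packets taking different paths and arriving out of order
assert reassemble(packets) == msg
```

Because each packet carries its own position number, the network is free to route each one independently, which is the flexibility and resilience described above.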
Australian Government Independent National Security Legislation Monitor, "Trust but verify: A report concerning the Telecommunications and Other Legislation Amendment (Assistance and Access) Act 2018 and related matters" (2020) Paragraphs 5.38 - 5.54 (Encryption).
"Encryption is a branch of cryptography designed so that transmitted data is only intelligible to those authorised to decrypt, and thus make intelligible, that data, whether that is ultimately viewed as text, voice, images or in some other format." Encryption has long been a key part of communication systems, protecting data from unauthorized access; examples include online banking and sensitive data stored by governments. The main recent change is that more and more data is encrypted by default ("ubiquitous encryption"). Asymmetric encryption aims to remove the opportunity for an unauthorized person to intercept the transmission of a private key that would allow them to decrypt messages. However, encryption is never 100% guaranteed to be secure, because systems must allow authorized people to access the data, which provides a potential route for others.
- Data at rest → data stored on a device (e.g., a smartphone), computer server, or other equipment. Encrypted to prevent someone who physically accesses the device from seeing the data.
- Data in transit → data being moved from one place to another across a telecommunications network or other systems. May be encrypted to prevent unauthorized people with access to intermediate systems from seeing the content of the data being sent.
Encryption can be implemented at a number of levels: by the network service provider, by the application service provider, or end-to-end (encrypted all the way from the sender to the intended recipient).
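The core point that the same secret makes data unintelligible and then intelligible again (symmetric encryption, as typically used for data at rest) can be shown with a deliberately insecure toy stream cipher. All names here are illustrative; real systems use vetted algorithms such as AES, not this sketch:

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive a pseudorandom byte stream from the key (SHA-256 in counter mode)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice with the same key decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

plaintext = b"account balance: 1,234.56"
ciphertext = xor_cipher(b"shared secret", plaintext)
assert ciphertext != plaintext                                # unintelligible without the key
assert xor_cipher(b"shared secret", ciphertext) == plaintext  # authorized party decrypts
```

The sketch also illustrates the report's caveat: anyone who obtains the key is "authorized" as far as the math is concerned, which is why key distribution (and asymmetric encryption) matters.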
ROXANA VATANPARAST, The Infrastructures of the Global Data Economy: Undersea cables and International Law, 61 Harvard International Law Journal Frontiers (2020)
"Ninety-nine percent of global data moves through undersea cables." (p. 1) The way data is transported and where it travels is not neutral; it is determined by decision-makers, engineers, technology, and history (many undersea cables are laid along the same paths as colonial-era telegraph lines). Undersea cables were laid in the 19th century and transformed political thought by shrinking space and time and allowing for the emergence of newly unified "political communities" such as nations. Undersea cables led to desires to colonize some island nations, and created new territorial and economic tensions in SE Asia, where the natural resources used for insulating cables were sourced. "Limitations on state claims of sovereignty in the high seas gave significant leeway for the laying of cables on the seabed, as authorized by the United Nations Convention on the Law of the Sea ("UNCLOS") and customary international law. At the same time, state claims of sovereignty and partnerships with private corporations in early developments of telegraphic cable infrastructures paved the way for corporations to play a significant role in having control over undersea cables today. This was due to the fact that some states did not want supranational oversight or regulation by international organizations or foreign state-owned cables to come into their sovereign territorial space, including their territorial space in the sea." (p. 8)
Angelina Fisher and Thomas Streinz, Confronting Data Inequality, IILJ Working Paper 2021/1, Section II.B.
"Some have suggested establishing new ownership rights over data for "data creators" to facilitate contracting over data and to incentivize data generation. This idea, however, ignores the not-IP-like incentive structure under which most data gets generated and rewards those who have treated data essentially as a res nullius: "things that belong to no one but can be claimed by whoever catches them first." (p. 36) Reiterates the general explanation of copyright law from the previous reading: "certain categories of data can be subject to copyright — if the general standard for creative works is satisfied — and compilations of data (databases) can be subject to copyright, too, if they constitute intellectual creations," but most databases do not qualify. The EU's sui generis database right didn't have any measurable impact, but the rule hasn't been changed: it is hard to take away property rights once given. Trade secrecy laws provide an easy way to control data, because the way data is commercialized means theoretically any data could have commercial value (see Scassa for a summary of TS). Data ownership is often asserted through contracts such as licensing agreements, and protected via "digital rights management" technologies. Regular property laws apply to physical infrastructures, but ownership of the digital components varies. Proprietary control of software is no longer the norm; open source software is very common. Infrastructural control is key, not just who owns the data itself. Ex: control over web APIs allows players to cut others off from accessing data.
Paul Schwartz, Legal Access to the Global Cloud, 118 Columbia Law Review 1681 (2018), 1689-1699 (discussing different models of cloud computing: I.A and I.B.).
"This Article develops and distinguishes between three models of cloud computing to provide greater clarity for courts when evaluating international data access requests. These models are the Data Shard, Data Localization, and Data Trust clouds." "Federal electronic surveillance law for domestic law enforcement consists of three statutes: the Wiretap Act, the Stored Communications Act, and the Pen Register Act. Of these, the SCA is the most relevant to access to cloud information." (p. 1690) "In Microsoft Ireland, the Second Circuit held that the SCA did not obligate Microsoft to give the government information stored in an extraterritorial data center. After accepting the government's appeal from this decision, the Supreme Court ultimately declared it mooted by the enactment of the CLOUD Act, which, given its extraterritorial reach, would have obligated Microsoft to give the government that information. Unlike the Second Circuit in Microsoft Ireland, the Eastern District of Pennsylvania in Google Pennsylvania held that the SCA required a cloud provider to supply the government with information distributed in its global network." (pp. 1690-91) "[T]hese cases illuminate how different models of cloud management can encourage different conclusions regarding the scope of the same legal authority." (p. 1692) Microsoft Ireland turned on the location of the data, whereas Google Pennsylvania turned on where the access would take place. The different types of cloud systems are relevant to the difference in decisions: "Google Pennsylvania involved a Data Shard cloud, a type of cloud in which the cloud provider stores information both globally and domestically. It breaks data into small components, or shards, which the system routes around the globe, with different bits shifted between various locations. In contrast, Microsoft Ireland involved a Data Localization cloud, a type of cloud in which information is stored extraterritorially."
(p. 1693)
- Data Shard cloud = the company stores information in the cloud in multiple international locations (like Google).
- Data Localization cloud = the company stores information in a cloud restricted to a single country or region (e.g., Amazon Web Services regional storage; Microsoft Ireland involved this model).
- Data Trust cloud = a Data Manager oversees the network hardware and software, while a Data Trustee has the exclusive ability to access the data. "Here, we reach the opposite pole from the Data Shard model, which relies on networked intelligence and ignores national boundaries."
data localization (per Svantesson)
'Data localisation' refers to a mandatory legal or administrative requirement directly or indirectly stipulating that data be stored or processed, exclusively or non-exclusively, within a specified jurisdiction.
Dan Svantesson, Data localisation trends and challenges, OECD Digital Economy Papers No. 301, pp. 8-22 (Part II: Data localisation).
'Data localisation' refers to a mandatory legal or administrative requirement directly or indirectly stipulating that data be stored or processed, exclusively or non-exclusively, within a specified jurisdiction. (p. 8) Data localization involves legal and administrative requirements: states may require localization by law or may merely incentivize data controllers and processors to move data/activities into their jurisdiction (for this report, strategies that rely solely on an incentive-based approach do not constitute data localization). Data localization can be mandated either directly or indirectly. Any type of data can be subject to localization requirements, but laws tend to be specific about which types of data, actors, or sectors fall within their scope. Data is required to be stored or processed on physical servers or digital storage units within the specified jurisdiction. Note that placing conditions on transborder data transfers is not the same as banning transfers (the latter is data localization, the former is not). Exclusive data localization → no copy of the data may leave the jurisdiction. Non-exclusive → a copy must stay within the jurisdiction. Both types require maintaining a technical infrastructure in the jurisdiction in question. Trends: there is a clear indication that countries and regions see value in having data stored and processed locally; some elements of data localization exist in 40 jurisdictions. The OECD circulated a questionnaire to government officials: 38% of countries (11) said they have provisions in data governance and privacy regulations concerning data localization, and some only required specific types of personal data to follow the localization requirement. 10 countries in the questionnaire also said that data localization was one of the main challenges to data flows.
Conclusions from the OECD review: need for a clear definition of data localization; some countries see benefits in, and have adopted, data localization; some forms of data localization are uncontroversial while other forms are seen as problematic (so adopting one form doesn't mean endorsing the other); data localization is one of the main challenges to transborder data flows. Industry attitude → data localization makes it difficult for companies to realize the potential of digitalization. (See the example of how wind turbine companies need data from foreign servers to maximize productivity, p. 12.) But some still see it as useful, and there are sectoral differences. Consumers see value in having data stored and processed locally. (Survey: 73% wanted data and personal info to be stored on a server in their own country.) Why data localization? As a means of control for economic and geopolitical advantages; for cybersecurity; to limit cyber espionage; for law enforcement/cybercrime (access to evidence, surveillance); to protect personal data; to ensure government access; for physical security of servers (cyber-resilience); to facilitate claims of jurisdiction over data. However, it is questionable whether localization actually improves domestic investment or cybersecurity. With regard to jurisdiction, "the idea that a country has jurisdiction to regulate all that occurs in its territory for no other reason than that it - in some form - occurs in its territory simply does not fit with the society of today, characterised as it is, by constant, fluid, and substantial cross-border interaction and data flows, not least via the Internet." (p. 17) Data localization may also be an (ineffective) measure to ensure data sovereignty.
Concerns with data localization include: restriction of data flows; running counter to the borderless reality of the internet; economic and social impacts (e.g., cost of local infrastructure and employment); potentially entrenching the power of dominant companies that can afford to comply with multiple data localization requirements; effects on privacy and use as a tool of political repression; data protectionism and impediments to cross-border trade; and feasibility (technical limitations, interoperability, etc.).
Thomas Streinz, Data governance in International Economic Law: Non-territoriality of Data and Multi-Nationality of Corporations (excerpted draft paper).
- The GDPR makes an important move by ignoring where a corporation is based.
  - The only way for a company to avoid GDPR compliance is to give up the European market.
  - This reduces arbitrage issues.
- But the GDPR regulates too much for Europe to actually enforce it.
  - Thus, the regime relies on multi-national corporations' compliance with its law instead of comprehensive enforcement.
Frederik Zuiderveen Borgesius, Jonathan Gray and Mireille van Eechoud, Open Data, Privacy, and Fair Information Principles Towards a Balancing Framework (excerpt)
- Releasing government data that contains personal information may threaten privacy and related rights and interests.
- A maximalist approach to publishing public sector information as open data might imply that a public sector body should not impose any conditions on accessing or re-using public sector information. A more moderate view is that public sector bodies should be allowed to impose conditions for access and re-use if this is required to protect privacy interests.
  - Three types of disclosure with different degrees of openness:
    - (i) Restricted access: third parties can query the dataset to conduct analysis
    - (ii) Restricted use: licenses limit how the data can be used
    - (iii) Open data: no restrictions
David Serwadda, Paul Ndebele, M. Kate Grabowski, Francis Bajunirwe and Rhoda K. Wanyenze, Open data sharing and the Global South--Who benefits?
- The Scholarly Publishing and Academic Resources Coalition defines open data as being "freely available on the internet permitting any user to download, copy, analyze... without financial, legal or technical barriers other than those inseparable from gaining access to the internet itself."
- African scientists have expressed concern that open data compromises national ownership and reopens the gates for "parachute research" (i.e., Northern researchers absconding with data to their home countries).
Katharina Pistor, Rule by Data: The End of Markets?, 83 Law and Contemporary Problems 101-124 (2020).
- Today, controlling vast amounts of data and determining who can access them and at what price is an expression of economic power. And by making access to digital platforms contingent on individual data producers releasing their data, platform owners have effectively created a self-replenishing well for the source of their power.
- However, Big Tech has stopped short of claiming full-throttled property rights protection from the state. Not only do tech companies not need this kind of support because they have technological means at their disposal to govern access to the data they have amassed, but they have also benefited from the ambiguity that has surrounded data ownership as they have moved to enclose and extract data from billions of individuals. In effect, Big Tech treated the data it captured as res nullius, or wild animals: things that belong to no one, but can be claimed by whoever catches them first.
- In short, the business of data is first and foremost about control, because only control over massive amounts of data can be effectively monetized. The value of data is not revealed by the price others are willing to pay for data points or even the sum of all these data points, but by the processing and analytical capacity of the data controller. This capacity depends on scale; that is, on the volume of data already captured and on future access to data.
- Proposed solutions:
  - First, data producers should be given a claim to the economic returns not on their individual data, but on the database, in a prorated fashion.
  - The interests in data could be transferred to a (public) trust. This trust would have a right to a share in the earnings the company derived from the data and would have the task of channeling these earnings to the data producers. The trust would also exercise voting rights on behalf of the data producers.
James Grimmelmann, INTERNET LAW: CASES & PROBLEMS (11th ed. 2021) (Technical Primer: Computers).
A bit is a piece of information represented by a 1 or 0; a byte is a set of 8 bits. This representation is called binary. Binary is converted to and from text through a system called ASCII, or another called Unicode. Sound waves and the color of each pixel of an image are likewise broken down and represented by specific combinations of bits. All computers contain a CPU (central processing unit) that carries out computations, and memory, which stores bits for the CPU to use. RAM (random access memory) stores the programs that are being actively used, while the hard drive stores things long term. Bits can also represent instructions: "object code" or "machine language" means instructions that the computer understands directly. A computer program is thus also a form of data. Programming languages let programmers express, in words, sets of instructions for a particular computation (an algorithm), which are then translated into object code. Operating systems allow: hardware independence, powerful features, consistent look and feel, multitasking, multiple users, security.
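The text-to-bits pipeline described above can be traced directly in Python:

```python
# 'A' is code point 65 (0x41) in both ASCII and Unicode.
data = "A".encode("utf-8")       # text -> bytes
assert data == b"\x41"
bits = format(data[0], "08b")    # one byte -> its 8 bits
assert bits == "01000001"
# The same 8 bits could just as well be a number or a machine instruction;
# their meaning depends on how the computer is told to interpret them.
```

UTF-8 (a Unicode encoding) deliberately matches ASCII for the first 128 characters, which is why the same byte works under both systems here.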
Michael Veale, Sovereignty, privacy, and contact tracing protocols, in Data Justice and COVID-19: Global Perspectives 34 (Linnet Taylor et al (eds), 2020).
A group of researchers concerned about the potential to misuse Bluetooth-based Covid-19 contact tracing apps developed a decentralised open protocol and codebase called Decentralised Privacy-Preserving Proximity Tracing (DP-3T), which used cryptographic methods to enable smartphone owners to be notified if they had a significant contact event ... with a later diagnosed individual, but without requiring a centralised database or persistent identifiers. (Compare apps like Singapore's, which "effectively broadcast an ID Card that only the state can read.") After becoming concerned that the European government consortium they had partnered with was using their system to slip its own centralized approach into development, the universities behind DP-3T resigned and the consortium collapsed. Governments then entered into public-private partnerships with tech giants (i.e., Apple and Google) to run state-sponsored Covid-19 apps. Apple and Google announced Exposure Notification, which allowed apps made by national public health authorities to use Bluetooth in the background, although with conditions. The code (based on DP-3T) did not allow an app to obtain a list of all the identifiers the phone had seen, preventing centralization. There is a lot of coercive pressure for centralization and interoperability (e.g., France was angry at Apple, calling its stance an attack on French sovereignty). There's a tension here between privacy concerns and how much power these tech companies have.
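A heavily simplified sketch of the decentralized idea behind DP-3T (this is an illustrative toy, not the actual protocol; the real design uses a PRF/PRG construction, rotates keys daily, and handles many more cases):

```python
import hashlib

def daily_key(seed: bytes, day: int) -> bytes:
    """Each day's secret key is derived by hashing the previous day's key."""
    key = seed
    for _ in range(day):
        key = hashlib.sha256(key).digest()
    return key

def ephemeral_ids(key: bytes, n: int = 4) -> list[bytes]:
    """Short rotating identifiers broadcast over Bluetooth; no persistent ID."""
    return [hashlib.sha256(key + bytes([i])).digest()[:16] for i in range(n)]

# Alice's phone broadcasts ephemeral IDs; Bob's phone merely records what it heard.
alice_seed = b"alice-device-secret"
heard_by_bob = ephemeral_ids(daily_key(alice_seed, day=3))[:2]

# If Alice is later diagnosed, she publishes only her day keys. Bob's phone
# recomputes the IDs and matches them locally: no central database of contacts,
# and no identifier that links Alice across days before she chooses to publish.
published_key = daily_key(alice_seed, day=3)
recomputed = set(ephemeral_ids(published_key))
assert all(eph in recomputed for eph in heard_by_bob)
```

The contrast with a centralized design is that here the matching happens on Bob's phone; no server ever learns who met whom, which is exactly the property the DP-3T authors feared the consortium would remove.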
Territoriality principle of Jurisdiction
A state has absolute (but not exclusive) power to regulate conduct that occurs within its own territory. When should the state regulate beyond its territory?
Protective Principle of Jurisdiction
A state may exercise jurisdiction over conduct by non-nationals outside its territory that threatens its security (such as counterfeiting the state's money or fabricating its visa documents), so long as that conduct is generally recognized as criminal by states in the international community.
Universality Principle of Jurisdiction
A state may exercise jurisdiction over conduct outside its territory if that conduct is universally dangerous to states and their nationals ("universal character of the crimes").
Nationality principle of Jurisdiction
A state may exercise jurisdiction over its nationals and over their conduct even when they are physically outside the state's territory
Why mandate data localization?
- Access to data
  - Facilitate it (for domestic government/law enforcement)
  - Prevent it (by foreign government/law enforcement)
  - Most countries localize certain types of data (e.g., health, passport, national security-related data)
- Economic/competition advantage
- Facilitate claims of jurisdiction
  - Facilitate enforcement
  - Facilitate assertion of jurisdiction to prescribe
- Assertion of sovereignty
  - Reclaiming power (e.g., developing countries, Indigenous Data Sovereignty)
Salome Viljoen, Data as Property?, PHENOMENAL WORLD (October 16, 2020).
According to critics of informational capitalism, datafication is both a process of production and a form of injustice. "Data governance" as defined in this article is a way to "respond to the injustices of informational capitalism" through legislating. Two types of reforms: 1. Propertarian reforms seek to create stronger property and labor rights. These reforms include user-ownership laws, trying to protect those who create data for no pay 2. Dignitarian reforms push back against the commodification of data altogether. "Proposed reforms along these lines grant individuals meaningful capacity to say no to forms of data collection they disagree with, to determine the fate of data collected about them, and to grant them rights against data about them being used in ways that violate their interests." There isn't much evidence that propertarian reforms actually work in terms of redistributing wealth or solving inequalities. Also, "Paying data subjects at the point of collection does nothing to address uses of such data that may violate the civil rights of others and amplify existing social oppression." For example, ICE buying data. But, dignitarian critiques of propertarianism overlook its egalitarian appeal, and focus too much on datafication itself rather than the circumstances, motivations and economic pressures surrounding it. An alternative is to conceive of data in terms of a democratic resource that affects a whole group rather than a single individual. "Democratic data governance schemes consider the relational nature of data: information about one individual is useful (or harmful) precisely because it can be used to infer features about—and thus make decisions affecting—others."
Oscar Borgogno and Giuseppe Colangelo, Data Sharing and Interoperability: Fostering Innovation and Competition Through APIs, 35 COMPUTER LAW & SECURITY REVIEW (2019).
Adopting open and standardized APIs is a critical part of ensuring interoperability and data sharing, and plays a key role in promoting competition and innovation in the realm of AI and the IoT (internet of things; definition from Wikipedia: "physical objects (or groups of such objects) that are embedded with sensors, processing ability, software, and other technologies that connect and exchange data with other devices and systems over the Internet or other communications networks"). However, so far EU regulation on the subject has been inadequate and inconsistent. Microsoft, Google, Twitter and FB (Meta) launched an open source data sharing initiative in 2018, demonstrating that data portability is an area of interest for tech giants. Both AI and IoT require a huge amount of continuously available data to function optimally. The more open and interoperable various sources of data are, the easier it is to create "data infrastructures capable of gathering and streaming a vast array of data as a sort of modern pipeline." Therefore, the EU has sought to encourage industry to use standardized, open APIs. "The regulatory approach adopted by the EU reflects the idea that the antitrust enforcement toolbox is inadequate to tackle effectively the need to ensure access to data. The scope of competition law is limited by the fact that it can be invoked only to gain access to a dataset held by a dominant firm, on a case-by-case basis." (p. 3) Open APIs enable external players to access certain datasets. In addition to creating interoperability, they are themselves monetizable: "APIs adoption generates decreases in operating costs as well as increases in sales, market capitalisation, intangible assets, and net income." (p. 4) The EU wants a large-scale regulation encouraging standardized open API use, but so far the issue has only been tackled through heterogeneous legislation.
See: personal data portability under the GDPR (hindered by lack of interoperability between platforms); the European Commission proposal for a regulation on the free flow of non-personal data; and account data portability under the payment services directive (PSD2). As a result, there is now legislative inconsistency, which may lead companies to adopt their own non-standardized APIs to comply with the laws (making the problem worse). Competition law may be an alternative to regulation for promoting data sharing: the essential facility doctrine (EFD) and data pooling agreements. EFD = "a firm which is a monopolist has a duty to share its facilities with everyone asking for access, including competitors." Case law shows that EFD can be applied in the data market context, but it is meant for exceptional circumstances, so it does not help with ordinary data competition issues. Data pooling agreements = "an agreement between different firms to license specific data sets to a central administrator in order to fully exploit its whole value by means of big data analytics." Pooling reduces transaction costs and the data can be anonymized. Downsides: it is voluntary and difficult to assess from an antitrust perspective. FRAND (fair, reasonable, and non-discriminatory) terms are likely not a solution; there is little agreement on their contents or how they operate.
GDPR, article 17 (right to erasure) Directive 96/9/EC of 11 March 1996 on the legal protection of databases, OJ L 77/20 of 27 March 1996.
Allows EU citizens to withdraw consent to their data being held and to demand that it be erased, subject to certain limitations.
What is an API?
An application programming interface (API) allows application developers to bypass traditional web pages and interact directly with the underlying service through function calls. APIs generally serve as gateways that enable access to and interaction with different software programs, software components, databases, or even hardware.
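The "gateway" role can be sketched with a minimal in-process example (all names here are hypothetical): callers interact only through function calls that return a chosen format, never with the underlying data store, which is the kind of infrastructural control over access described in these readings:

```python
import json

class WeatherAPI:
    """Hypothetical service: the API is the only gateway to the data store."""

    def __init__(self):
        # Internal store -- not directly reachable by callers.
        self._store = {"oslo": {"temp_c": 4, "raw_sensor_log": "(internal)"}}

    def get_temperature(self, city: str) -> str:
        record = self._store.get(city.lower())
        if record is None:
            return json.dumps({"error": "unknown city"})
        # The provider decides which fields are exposed, and in what format.
        return json.dumps({"city": city, "temp_c": record["temp_c"]})

api = WeatherAPI()
response = json.loads(api.get_temperature("Oslo"))
assert response["temp_c"] == 4
assert "raw_sensor_log" not in response  # only selected fields cross the gateway
```

Because every request passes through functions the provider defines, the provider can change, meter, or revoke access at any time, which is why control over web APIs lets dominant players cut others off from data.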
Estonian Ministry of Economic Affairs and Communications and Microsoft Corporation, Implementation of the Virtual Data Embassy Solution: Summary Report of the Research Project on Public Cloud Usage for Government (3.2 - Policy & Legal Environment).
Analysis of the legality and implementation of Estonia's virtual data embassy, which is a project for securely storing data outside of Estonia's territorial borders. "Project's three core findings are: i) The Virtual Data Embassy is consistent with Estonia's existing domestic legal framework, with certain caveats; ii) Migrating and running the selected government services is technically feasible; and, iii) The government needs to be flexible to benefit from the latest technological advances and protections to ensure digital continuity." Therefore, nonsensitive government data can legally be held in a private company's cloud services The government cloud includes: "i) maintenance of data backups and live services within Estonia's borders (Government Operated Cloud); ii) backups at physical Estonian embassy locations or dedicated data centers in allied countries chosen by the government (Physical Data Embassy); and iii) backups of non-sensitive data in private companies' public cloud (Virtual Data Embassy)."
Federated data system
Data stays stored on the controller's own system: the controller is not willing to transfer the full dataset, but will give access to some of it while controlling what you get, how you use it, and in what form. One way to arrange this is by contract, where someone wants access to the data but the controller doesn't want to give access to the whole set.
Code as regulator (Lessig)
Code = "The instructions embedded in the software or hardware that makes cyberspace what it is." Code is the "built environment" of social life in cyberspace: its architecture.
As technology use increases, data ownership remains a concern (Sept 14, 2021) https://www.restaurantbusinessonline.com/technology/technology-use-increases-data-ownership-remains-concern
Restaurants lag behind the hotel industry in using and collecting data on customer preferences and habits. Due to the pandemic, a spike in takeout/delivery app use has created more data, but the restaurants usually don't have access to it, and they want access.
Infrastructure as regulation
As a set of relations → organizational, technical, social
Richard A. Epstein, Property Rights and Governance Strategies: How Best To Deal With Land, Water, Intellectual Property, and Spectrum, 14 Colorado Law & Technology Journal 181 (2016).
Asks how legal rules of property should be adapted to deal with different forms of resources.
- Res communis → property that is open to all in the state of nature; a communal resource open to everyone (e.g., water, beaches, air)
- Res nullius → property that is owned by no one, but that can be acquired by a single individual through occupation or possession of some sort (e.g., land, chattels, animals)
- How should things be placed in either category? In ancient legal systems: by natural reason and by custom (accumulated practices over time).
- How to manage res communis? Public governance. Sometimes property must deviate from res communis because we need public ownership of things that must be improved without allowing access to be closed off (e.g., a river).
- Preconditions for property based on land: scarcity; uniqueness/non-fungibility; land doesn't move, unlike water; land is easy to divide.
- Features of water (a river): running and needs to be replenished; navigation rights; the upstream/downstream dynamic empowers those upstream; many uses have common interests (navigation, irrigation, fishing, etc.).
- IP: a legal right to exclude from use of the commons; intangible; the goal is to enlarge the commons by fostering innovation and revealing its secrets.
- Spectrum: limited number of frequencies; frequencies need to be allocated to avoid interference; the government must pay if it reallocates.
- Data: hard to localize in motion; small units are not very valuable; can be copied.
Benedict Kingsbury, Preface, in Cambridge Handbook of Technical Standardization (Vol. 2): Further Intersections of Public and Private Law xv-xviii (Jorge L. Contreras, ed. 2019).
Traditionally, technical standardization policy has been motivated by the desire to achieve efficiency through interoperability. Science and technology studies and Foucauldian power-knowledge frameworks expand on this by asking how standards interact with power. Introduces how societal and state interests affect, and are affected by, the process of standardization. Private standards development organizations (SDOs) promote the following principles for technical standardization: voluntary, open, balanced, transparent, consensus-based. SDOs need to raise revenue and rely on: copyrighting standards/trademarks and licensing fees; membership fees plus privileged access for members; certification of compliance.
Jonathan Gray, Towards a Genealogy of Open Data. Paper given at the General Conference of the European Consortium for Political Research in Glasgow, 3-6th September 2014.
Asks where the idea of open data came from. There were clear precedents in the 1990s and 2000s for later initiatives to maximize the reliability of, and minimize restrictions on, public data. The Office of Management and Budget's Circular No. A-130 was a key policy document that called for departments to minimize the cost and maximize the usefulness of gov't info; it enshrined the principle of unrestricted use of federal government information. Open source advocates inherited from these debates a strong contingent that "consider economic growth and wealth creation to be one of the primary reasons for lending support to open data policies" (p. 8). Also a thread of advocacy that argues that gov't should not seek to provide services that can be provided by non-governmental actors. (See O'Reilly's "government as a platform" on pp. 10-11.) A particular focus of open data advocacy has been on enabling non-governmental actors to innovate with information from the public sector. (The role of the state is to provide raw data for services to be delivered by others.) Open data has also been presented as an instrument for public reform; advocates cite gov't efficiency and cost savings as a reason to support open data. (See UK example, pp. 14-15.) Current uses of open data also draw from open source, open access, and civic hacking communities. Especially used in scientific and tech communities, e.g., "open geodata" or geospatial data. There is also a discourse around the potential for open data in journalism and advocacy to enhance news coverage.
Martin Hayward, Diplomatic Immunity for Data and Bahrain's Data Embassy Law (PrivSec.Report, July 10, 2020).
Bahrain "extended its innovative scope in 2018 by implementing new legislation that allows foreign parties to store their data in data centres located in Bahrain under what is known as a 'data embassy'." "Under Article 3 of the Cloud Law, the data stored in data centres by overseas consumers of cloud services in Bahrain will be subject to the domestic law in the 'Foreign State' where the relevant consumer resides (or is incorporated in cases of legal persons). They will therefore be subject to the jurisdiction of that Foreign State's courts, and other competent authorities."
Mariano-Florentino Cuellar and Aziz Z. Huq, Toward the Democratic Regulation of AI Systems: A Prolegomenon, University of Chicago Public Law Working Paper (9 Nov 2020, available on SSRN).
Because AI is defined in so many different ways, the authors suggest using the term "AI systems," defined as "a sociotechnical embodiment of public policy codified in an appropriate computational learning tool and embedded in a specific institutional context." In other words, some mechanism that takes data and performs the task asked of it (such as modeling the data) independently. It usually learns how to do the task by practicing on "training" datasets. A focus on policy is better than a focus on rights because instead of centering the individual, it examines HOW a group of actors pursues a particular goal, and what the biases may be in the outcomes. Characteristics: 1. "these systems are at least ostensibly designed to add either private or social value by facilitating decisions or operations in particular settings." 2. "many applied settings where AI systems are embedded involve collective decision making." Think social media algorithms that determine what gets discussed. 3. "an AI tool will generally include a user interface designed to abstract analytical conclusions and facilitate interaction." 4. Often, "neither the interface nor output is likely to supply the information necessary for someone with technical expertise to evaluate performance or to facilitate comparisons to some sort of 'ground truth.'" 5. "AI systems have...some capacity to adapt and improve performance over time." The reading discusses examples of bias in AI training on pages 7, 10, and 12. Issues with legislating AI include: 1. AI is spread across many different industries and contexts. A more robust state is needed, but a more robust state using AI could also lead to overreach (China). 2. Zuboff: human agency is undermined by AI's ability to shape behaviors. The authors think this concern is overstated. Increased public education on AI and greater transparency are needed.
DLA Piper, Regulation of cloud services in China - What does it mean for your China business?, https://blogs.dlapiper.com/privacymatters/china-regulation-of-cloud-services-in-china-what-does-it-mean-for-your-china-business/ [Chinese version available at: https://www.miit.gov.cn/cms_files/filemanager/oldfile/miit/n1278117/n1648113/c5381374/part/5381378.doc]
China passed new rules to regulate cloud services, adding to existing data localization laws, online censorship restrictions, and the requirement for mandatory telecoms and internet licenses, which can only be held by Chinese-owned companies. Foreign companies previously got around foreign ownership restrictions on cloud services in the PRC by partnering with local Chinese cloud service providers (CSPs), but these arrangements may no longer be allowed. Other requirements (not a full list; see the paper for all; didn't include them all because some are very specific): CSPs will be required to supervise users of the service; cloud service platforms must be located in China; and any connection to overseas networks can only be made via government (MIIT) approved international gateways.
Lawrence Lessig, CODE 2.0 (2006) (What Things Regulate) Chapter 7.
Code is the "built environment" or "architecture" of cyberspace. Therefore, it is a regulator that could be one (among other) threats to liberty that we don't fully understand yet. Regulation = the various constraints that impact the behavior of the individual. Laws, norms, and the market are all forms of regulation. So is the technology of the regulated thing, or the technology affecting its production/creation; this is the "architecture." Each of these constraints/regulators imposes different costs on the regulated individual. These four types of constraint all apply to cyberspace. Examples: law (copyright law, defamation law, obscenity laws, etc.); norms (how to behave on different forums and social media sites); markets (pricing structures; busy sites get advertisers); architecture ("The software and hardware that make cyberspace what it is constitute a set of constraints on how you can behave": think passwords, encryption vs. no encryption, etc.). EXAMPLE: How do you prevent theft of car radios? You could change the law to increase the penalty to life in prison. You could also change the architecture of the radios so each works only with one car. The latter is much lower cost. Law can change architecture and social norms. This is indirect, as opposed to direct, regulation. The legislator should consider what effects could come from a particular policy and how it would impact the other regulators. A possible issue with indirectness is that it misdirects responsibility and creates misunderstandings. New York v. United States = Congress can't indirectly order state legislatures to create certain laws. But indirect regulation isn't inherently bad; think of requiring streets to have speed bumps to regulate the speed of cars. The issue is that governments should be transparent about their motives.
Case C-131/12, Google Spain SL v. Agencia Española de Protección de Datos (AEPD) and Mario Costeja Gonzales, 2014 E.C.R 317 ("Google Spain"), as excerpted.
The Court finds that a search engine operator (Google) is the controller of data processing that involves personal information, regardless of the fact that it does not specifically seek out personal information and that the processing is done automatically. The purposes and means of the processing are determined at least in part by the operator; therefore, the operator is responsible. Secondly, the Court finds that the data processing was carried out in the context of Google's operations within the territory of Spain, which means that EU Directive 95/46 applies, and therefore Google has to delete the indexed search results about Mr Costeja González.
James Grimmelmann, Internet Law: Cases & Problems (11th ed., 2021) 40-45 (Technical Primer: Cryptography).
Cryptography is the science of secret communications. A person can encrypt a message so that one person can decrypt it but another cannot. Cryptographers get better security by creating families of codes that use the same encryption algorithm but have many different possible secret keys: someone sends an encrypted message, and the receiver needs the key to decrypt it. The problem with rotate-by-N and other traditional ciphers is that they are easy to break with a brute-force attack (an attacker just tries all the possible keys) and that the ciphertext mimics the patterns of the language being encrypted. The Advanced Encryption Standard (AES) disguises these patterns. It encrypts messages 128 bits at a time, using a key of up to 256 bits, and mixes up the bytes of the plaintext by scrambling them, adding them together, and rearranging them. It's easy to reverse with the key, but "infeasible" to break. (So far.) Public-key cryptography: encryption and decryption use different keys. Only the decryption key needs to be private; the encryption key can be public. (Analogy: the public key is like a lock blueprint. Anyone can use it to manufacture locks that only the person with the private key can open.) In some systems you can use a private key to "sign" messages (digital signature). Only the person with the private key can make the signature, but anyone with the public key can check that whoever signed it knew the private key. Applications of digital signatures and related tools: authentication (to let people know you generated a document), integrity/error detection (lets you know if someone altered the document), watermarks, hash values (to check for differences), and steganography (hiding messages within messages). Encryption systems have the problem of how to distribute keys: unless the parties already have a secure way to communicate, it can be difficult. One way is to use third parties to verify identities (for example, certificate authorities).
Example of cryptography in practice: Transport Layer Security (TLS), a protocol for clients and servers to communicate securely. (For the full process, see p. 44; it combines RSA and AES.) TLS is the basis for HTTPS, the encrypted version of HTTP.
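The rotate-by-N weakness described in the primer can be sketched in a few lines of Python. This is a toy illustration only (the cipher, the crib word, and the function names are my own, not anything from Grimmelmann's text):

```python
# Toy "rotate by N" (Caesar) cipher and a brute-force attack on it.
# With only 26 possible keys, an attacker can simply try them all
# and look for a known word (a "crib") in the decrypted output.

def rotate(text, n):
    """Shift each letter forward by n positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + n) % 26 + base))
        else:
            out.append(ch)  # leave spaces and punctuation alone
    return "".join(out)

def brute_force(ciphertext, crib):
    """Try all 26 keys; return (key, plaintext) pairs containing the crib."""
    return [
        (key, rotate(ciphertext, -key))
        for key in range(26)
        if crib in rotate(ciphertext, -key)
    ]

ciphertext = rotate("attack at dawn", 3)
print(ciphertext)                         # dwwdfn dw gdzq
print(brute_force(ciphertext, "attack"))  # recovers key 3 and the plaintext
```

Modern ciphers like AES resist this by using keyspaces far too large to enumerate (2^128 keys or more) and by producing ciphertext that is statistically indistinguishable from random noise.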
Catherine D'Ignazio and Lauren Klein, DATA FEMINISM, Chapter 6: The Numbers Don't Speak For Themselves (excerpts).
Data Feminism asserts that data are not neutral or objective and looks at how data is constructed by, and constructs, social relations. The major problem with most downloaded data/APIs is that there's no context. (This is an especially big problem with open data.) It's the responsibility of the person evaluating the data to make sure that the context behind the data is taken into account. Examples of practices that attempt to restore context: data biographies, datasheets for datasets (documents outlining how data was collected, maintained, etc.), and data user guides (narrative portraits of a dataset). Who should be responsible for providing context?
The World's Most Valuable Resource is No Longer Oil, But Data, The Economist (May 6, 2017).
Data as a commodity is developing into a massive industry, prompting questions about who controls the flow of data. Alphabet, Amazon, Apple, Facebook, and Microsoft seem to dominate the industry; should they be broken up? Their control of data gives them enormous power. The abundance of data changes competition: more data → improved products → more users → more data generated. Access to data protects companies from being blindsided by rivals. Because these tech giants can surveil nearly everything, they can see when a new product gains traction and copy it or buy it out ("shoot-out" acquisitions). But traditional antitrust wouldn't work; one firm would dominate again because of network effects. Offers two new ideas to counteract this: (1) antitrust authorities need to take into account the extent of firms' data assets when assessing the impact of deals, and (2) give more control to those who provide data (e.g., force companies to reveal to consumers what info they hold and how much money they make from it, or mandate sharing of certain data with consent).
What are data models?
Data modelling is a process for identifying the things (entities) that a dataset aims to capture and selecting the key properties (attributes) that should be captured to describe those entities in a meaningful way. It requires deciding how entities and attributes relate to each other (relationships), and how their information content should be formally codified within a dataset.
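Those three steps (entities, attributes, relationships) can be sketched in code; the entity and attribute names below are invented purely for illustration:

```python
# A tiny data model: two entities (Author, Dataset), their attributes,
# and one relationship (a Dataset is linked to one or more Authors).
from dataclasses import dataclass, field

@dataclass
class Author:                     # entity
    name: str                     # attribute

@dataclass
class Dataset:                    # entity
    title: str                    # attribute
    year: int                     # attribute
    authors: list = field(default_factory=list)  # relationship: Dataset -> Authors

# Codifying one record under this model:
d = Dataset("Road traffic counts", 2021, [Author("City of Example")])
print(d.authors[0].name)          # the relationship lets us traverse entities
```

The same modelling decisions appear in any formal codification, whether as SQL tables and foreign keys, XML schemas, or classes like these.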
Theresa Scassa, Data Ownership (CIGI Papers No. 187 - September 2018).
Data ownership provides a basis for control, but there should be some limits. 3 types of data: 1. Representative data involve some kind of measurement, such as age or traffic density. 2. "Implied data are those read into an absence, such as inferences drawn about a person's voting preferences based on his or her online activity." 3. "Derived data are those that are 'produced from other data.'" Data are not facts, nor are they neutral; they are the product of human decisions. Ownership under existing laws: Copyright law covers original artistic or literary works, which includes some computer programs, charts, maps, menus, etc. Facts and ideas are NOT covered, but an original selection or arrangement of facts IS. Europe has a sui generis database right, but the US/Canada do not. (Example cases re: copyrighted facts on p. 8.) Copyright requires a human author, hence concerns over AI authorship (though courts have found authorship in cases using algorithms). Confidential information/trade secrets: does NOT create a property right. Requires that the material is not generally known, has commercial value, and the holder has taken steps to keep it secret (p. 11). "In Canada, the law of confidential information is based on a number of different areas of law, including contract law, the law of fiduciary relationships, equity and tort, all of which tend to emphasize relationships between individuals" (p. 12). Personal information generally isn't ownable, though the GDPR leans towards quasi-ownership. Discussion of ownership rights: personal data ownership (getting paid for sharing your own data); a sui generis data ownership right (from the EU, generally unpopular); data sovereignty (mostly an indigenous movement aimed at reclaiming control over data governance). Author: copyright law evolves slowly and is too one-size-fits-all, but it is better to move slowly than to jump into creating a sui generis right, which is too complex and legislators could easily get it wrong.
Rob Kitchin, THE DATA REVOLUTION: BIG DATA, OPEN DATA, DATA INFRASTRUCTURES & THEIR CONSEQUENCES (2014) Chapter 1.
Describes the qualities of data: usually representative in nature, but data can also be implied or derived, and can change over time. Can be recorded or stored in analogue form or encoded in digital form as bits. Good-quality data are: discrete and intelligible (each datum is individual, separate and separable, and clearly defined); aggregative; supplemented with metadata; and capable of being linked to other data sets to provide additional insights. Two main methods of data generation: direct capture through measurement or observation, or production as a byproduct of some other main function. Data processing: world → data (abstraction of the world) → information (linked abstractions) → knowledge (organized information) → wisdom (applied knowledge).
Fleur Johns, Data Territories: Changing Architectures of Association in International Law, 47 Netherlands Yearbook of International Law 107 (2016), Sections 5.2 (pp. 114-116) and 5.4 (pp. 125-127).
Discussion of data and territoriality. Summary of territoriality: "territoriality erects and maintains boundary marks and invests them with some hallowedness; second, territoriality effects a presumptive division of resources, including a distribution of lawful authority; and third, territoriality engenders a sense of relational placement and, in many instances, evokes fealty to that placement, or a sense of its relative obduracy." Territoriality in international law is now changing, and may be replaced, by datafication of territory (as in, mapping out and creating representations of territory in data--think google maps, ocean sensors, satellites etc) and territorialization of data (think arrangements for confidential data sharing between states → characterizing data as accessible only to those of certain nationalities or on certain territories).
US Federal Trade Commission (FTC), California Company Settles FTC Allegations It Deceived Consumers about use of Facial Recognition in Photo Storage App (January 11, 2021).
Everalbum provides a cloud service for storing and managing photos. It introduced a facial recognition feature that was automatically enabled for many users and misled users about that fact, as well as lying about whether photos were actually deleted when users deleted their accounts. Everalbum used the data obtained through facial recognition to develop models and algorithms, which it was ordered to delete
Software as a Service (SaaS)
Example: a service accessed through the internet but running on another computer (e.g., Gmail).
United Kingdom, Taskforce on Innovation, Growth and Regulatory Reform (May 2021), paragraph 18, proposal 7.2 (paragraphs 217-227).
Focused on growth and removing "unnecessary regulatory burdens." Criticizes the GDPR as overly complex and says it overwhelms people with "consent requests they cannot understand." Suggests "the creation of regulatory architecture that enables 'Data Trusts' or 'Data Fiduciaries.'" Suggests removing Article 22 of the GDPR in order to avoid hampering AI development.
Human Rights Watch: Joint Statement on Russia's 'Sovereign Internet Bill' (April 24, 2019) https://www.hrw.org/news/2019/04/24/joint-statement-russias-sovereign-internet-bill
HRW asks President Putin not to sign the sovereign internet bill. The bill "provides that the Internet Service Providers (ISPs) should connect with other ISPs, or "peer," at Internet exchange points (IXes) approved by the authorities, and that these IXes should not allow unapproved ISPs to peer. The bill would also create a centralised system of devices capable of blocking Internet traffic. The bill requires ISPs to install the devices, which the government would provide free of charge, in their networks." "Further, the bill creates a national domain name system (DNS)— a system that acts as the address-book for the Internet by allowing anyone to look up the address of the server(s) hosting the URL of a website they are looking for... [which would allow the authorities to] answer any user's request for a website address with either a fake address or no address at all." "The bill contravenes standards on freedom of expression and privacy protected by the International Covenant on Civil and Political Rights (ICCPR) and the European Convention on Human Rights (ECHR), to which Russia is a party. Both treaties allow states to limit freedoms to protect national security but impose clear criteria for such limitations to be valid."
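The DNS "address book" role, and the two manipulations the statement describes (answering with a fake address or no address at all), can be sketched with a toy resolver. The table, names, and addresses below are invented for illustration; real DNS resolution involves a hierarchy of name servers:

```python
# Toy resolver: a DNS-like "address book" mapping names to IP addresses.
# A centrally controlled resolver could block a name or answer with a fake.
honest_table = {"example.org": "93.184.216.34"}

def resolve(name, table, blocked=(), fakes=None):
    """Look up a name; simulate blocking (no answer) or a fake answer."""
    if name in blocked:
        return None                 # no address at all
    if fakes and name in fakes:
        return fakes[name]          # a fake address chosen by the operator
    return table.get(name)          # the honest answer, if the name is known

print(resolve("example.org", honest_table))                           # honest
print(resolve("example.org", honest_table, blocked={"example.org"}))  # blocked
print(resolve("example.org", honest_table,
              fakes={"example.org": "10.0.0.1"}))                     # spoofed
```

The point of the sketch: whoever operates the resolver that users are required to query controls what every lookup returns, which is why a mandatory national DNS raises the censorship concerns described above.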
Christopher Kuner (2020) "The GDPR and International Organizations", AJIL Unbound, 114, 15-19.
IOs often process and transfer personal data to fulfil their mandates; therefore, IOs need to implement rules to protect the processing of personal data. (Can have life-and-death consequences.) A number of IOs have their own data protection rules, such as the Red Cross (ICRC), UNHCR, INTERPOL, WFP, and the UN. Two main positions with regard to IOs: (1) GDPR does not apply → the GDPR seems to equate IOs with third countries as entities subject to a body of law other than EU law, which seems to indicate that IOs are outside the scope of the GDPR. The EU is also bound to observe international law in its entirety. The European Commission has also stated informally that the GDPR does not apply to IOs directly, but does apply to data transfers from the EU to IOs. (2) Application of the GDPR to IOs should be determined under its material and territorial scope, in light of immunities and the status of international law in the EU → the GDPR lists exemptions to its material scope and does not mention IOs; it could have mentioned them if it was meant to exclude them. The CJEU has also found that EU law can take precedence over international law when EU fundamental rights (including data protection) are involved. Also, the lack of an agreed definition of IOs suggests that the GDPR was not intended to exclude all IOs per se. Privileges and immunities of IOs outside the UN derive primarily from bilateral agreements b/w IOs and (member) states. The GDPR does not mention privileges and immunities, and not all member states have granted them → thus the need to rely on general EU law to clarify the application of the GDPR. Looking at enforcement, IOs typically have immunity from legal process, which would include hard enforcement (legal enforcement based on an order by a data protection authority (DPA) or a court). However, soft enforcement (informal pressure that actors in the public and private sectors can exert against IOs to force them to adopt the standards of the GDPR) can be more difficult to resist.
For example, an EU agency requiring an IO to comply with the GDPR to receive funding. Provisions of the GDPR (e.g., Arts. 46, 49, Recital 112) provide for the transfer of personal data to IOs in some cases. IOs should implement data protection policies anyway: doing so reduces the possibility of misuse, builds trust, and increases accountability. It can also help prevent the erosion of privileges and immunities by demonstrating that IOs have put in place alternate mechanisms to protect personal data even when the law may not apply to them directly.
How to challenge/defend web scraping?
Intellectual property law (copyright, database rights); data protection and privacy law (personal data); contract law (terms of service); tort law (trespass to chattels); cybersecurity law (e.g., the Computer Fraud and Abuse Act); competition law (unfair competition doctrines).
ICANN
The Internet Corporation for Assigned Names and Numbers controls the internet's domain name system.
IETF
Internet Engineering Task Force controls Internet's foundational protocols
Angelina Fisher and Thomas Streinz, Confronting Data Inequality, IILJ Working Paper 2021/1, Parts I.C.-D. (Disentangling Infrastructures and Identifying Control over Infrastructure).
It's difficult to define infrastructures because they are inherently context-dependent and relational. Objects like fiber-optic cables or data centers become infrastructural only when considered in the context in which they are created and deployed. The technical dimension of data infrastructures consists of various components of digital infrastructure on different layers (e.g., hardware/software, localized or decentralized storage and computing, etc.). The social dimension looks at human and human-machine interactions, such as social practices, community norms, or individual behavior toward data objects. The organizational dimension (e.g., corporate actors) looks at the structure and processes of infrastructure, with a focus on governance structures. Data infrastructures exist in many forms: some are domain-specific while others are general; some are accessible by all and others are closed; some are for profit and others are for any purpose or non-commercial; some are managed by governments, others by non-profits or commercial entities. So how is it possible to regulate? Identifying control over infrastructure: control can be exercised over components or over infrastructure as a whole, and control over critical elements can be sufficient to secure overall control. The internet has been foundational in enabling control over expansive data infrastructures. It serves as the foundational infrastructure for cloud computing, which provides the overall organization for data infrastructures. Platform companies are also data infrastructures: they're not only dominant data holders but also consolidate control over data infrastructures by serving as intermediaries. For example, e-commerce platforms, social media platforms, search engines, and phone OSes generate huge amounts of data, which their owners hoard and use to exercise control over or acquire data infrastructures. (See examples of Amazon and Alibaba on p. 20; Facebook, YouTube, Twitter, Tencent on p. 21.) Cloud computing enables this control over data by establishing parameters for how data is produced, stored, and shared. Infrastructural control over data allows companies not only to extract their users' data but also to shape what data they gather from user behavior. Technical protocols and centralized control also define and structure spaces. (See FB gender example → behind the scenes, users are still organized into binary male/female for advertising.) Control over data infrastructures also gives control over the social, political, and economic organization of life, and is difficult to contest because these infrastructures are opaque and ubiquitous. (See farming example: farmers who want to use data-driven farming must supply data; data management platforms dictate the terms and also create farmers' dependency on particular platforms through proprietary software and by limiting the interoperability of sensors in equipment.) The more these technologies become diffuse and embedded, the more difficult it is to contest these infrastructures and remedy data inequality.
World Wide Web Consortium
Joint agreement among MIT, ERCIM, Keio University, and Beihang University. Primary focus: protocols and guidelines for key aspects of the web, ranging from HTML and CSS coding to web architecture, XML technology, web devices, and web browsing and authoring tools. Public and free to download.
Recording of class with Ralf Michaels via Panopto (Global Data Law, Feb 20, 2020; addressing foundational jurisdictional concepts and their potential relevance for data).
Jurisdiction is a normative issue, not a factual one. We aren't bound to territory; territory matters for normative, policy, international law, and domestic law reasons, et cetera. Jurisdiction = how far is a statute allowed to reach, NOT how far does a statute in fact reach. The presumption against extraterritoriality is a matter of statutory interpretation, not jurisdiction: the US may have the authority to reach beyond its borders and choose not to do so. 2 types of jdx: prescriptive (how far can you reach, i.e., what are you allowed to regulate?) and enforcement (is the US allowed to actually enforce those regulations?). The latter is more territorial and more restricted. Bases of jdx (classical international law): territoriality, effects doctrine, active personality, passive personality, protection principle. Personality = regulation of subjects. Intl law has moved towards more jdx over the persons and territory that states are trying to protect. No real reason to assume that territoriality is the basis of jdx. So why is it usually emphasized? Sense of connection between sovereignty and territory; close link with enforceability in practice; enables manipulation/regulatory arbitrage, where actors pick and choose the regulatory regime they are subjected to (ex: Google moving data from the EU to the US). Because of this emphasis, many laws still lean heavily on territoriality. But territory is not destiny; the law of jdx could change. Policy preferences: we might want companies to be able to choose a regime, in which case tying regimes to territory would be desirable. OR, we might want the regime to be based more on the nature of what we are regulating, in which case maybe data does not fit so well into a territorial regime.
Husovec, Martin, The Fundamental Right to Property and the Protection of Investment: How Difficult Is It to Repeal New Intellectual Property Rights?, TILEC Discussion Paper No. 2019-17, Pages 1-14 inclusive.
Key question: "how difficult is it to legislate away a new set of intellectual property rights once they are found to be incapable of delivering on their promises?" (p. 1). The paper looks at EU Charter Article 17 to discuss regulatory autonomy. There is a relationship between European human rights law and investment law: both are used in both contexts. Investments granted through IP rights constrain legislators due to the "inherent right to property." IP law is almost never overturned, and eventually becomes part of a canon of inherent entitlements. Therefore, IP law is a "very heavy handed tool of innovation policy." In the EU, much IP law experimentation is occurring at the EU, supranational level. The result is that there is less flexibility and it is difficult to repeal a bad law. The Database Directive is an example: while there were no positive impacts, there were no specific negative impacts either, and so the law has been left in place. Also, the EU prioritizes unity, even if the unification occurs behind a poor solution. Repealing any EU-level law will cause chaos on the national level; therefore, the EU regulator must show that a law is actually harmful before it can be repealed. This is "status quo inertia." This would be partially solved if EU legislators used Regulations instead of Directives: Regulations do not have to be incorporated into national law and are easier to repeal.
Amy Kapczynski, The Law of Informational Capitalism, 129 YALE LAW JOURNAL 1460, 1480-1515 (Parts II and III).
Legal realism, formulated in response to laissez-faire, powerfully repudiated its description of the relationship between law and markets. There was a lie at the heart of laissez-faire: markets do not in fact exist outside of law, so state regulation is not an unjust incursion into a natural order. (Big idea of Karl Polanyi.) Companies may be hungry for data as a means of price discrimination (e.g., Amazon shows each of us the price that we're individually willing to pay). Law has a fear of interfering with its "fetish" of innovation (Moore v. Regents, a case about a cell line for leukemia and the fear of destroying economic incentives for important research). Primary sources of power for Google, Facebook, the financial algorithmic sector, etc. are: trade-secret rights (IP scholars have fought hard against recognizing property rights in personal data or databases of personal information; copyright law does not cover facts, and databases can rarely be protected by copyright) and contract law (the blessing of "click-wrap" agreements). Flawed nature of consent: in the age of the sensing net and the internet of things, we have no idea what data is being collected, what it might mean to those who deploy it, nor what will be done with it.
Jennifer Raso and Nofar Sheffi, Data, in THE ROUTLEDGE HANDBOOK OF LAW AND SOCIETY (Mariana Valverde et al, eds.) Chapter 21.
Proposes thinking of data in several ways. Data as processing: data is a "known"; it is assumed as known and used to express or 'discover' an unknown (p. 113). (E.g., think about how computer programmers are given problems and write programs to help solve them. Data is an input and also an output: knowledge that can be acted on.) Data as organization: information systems archive and keep records, using a knowledge organization system to label, classify, and map data in order to locate and retrieve it. Data as abstraction: data is a symbolic notation that signifies an object or a subject. Data as units: data is also tiny units, or bits, which can be pieced together or broken down to make new data. Data as aggregation: data as big data, dynamic and interlinked datasets that constantly update. Data as resource: "Data is the new oil."
Open Letter to WFP re: Palantir Agreement (Feb 8, 2019).
Letter from a number of NGOs expressing concern about WFP's data partnership with Palantir (a private software company). Critiques the lack of transparency in the deal and delineates risks of concern, including: de-anonymization → when huge datasets are merged and analyzed, there is potential for a mosaic effect (linking data sets leads to revealing identities or significant info); bias → there might be biases baked into Palantir models; rights to data, models, and derivative analysis → the statement from WFP doesn't state what control means and what it covers; future costs → need to evaluate long-term costs; a future decision to end the partnership could be costly, as datasets would be difficult to extract from Palantir; undermining humanitarian principles → e.g., the Principles for Digital Development, including transparent, inclusive, and equitable use of tech; transparency and accountability → nothing has been transparently shared about the procurement process; should build in transparent checks and balances (e.g., third-party audits, contracts, etc.); no source documents for the checks and balances mentioned in the statement. Asks WFP to release the terms of the agreement, release info about the procurement and due diligence processes, establish an independent review panel, and take necessary steps to ensure privacy and security, such as limiting the ability of Palantir to apply models and analyses from WFP's data to other datasets, ensuring WFP is not locked in to the Palantir system, establishing a transparent grievance mechanism, establishing a clear protocol and agreement for termination, etc.
Microsoft Data Use Agreements - Backgrounder and FAQ
Microsoft created a set of licenses, similar to the Creative Commons licenses, to facilitate data sharing specifically. "Open Use of Data Agreement (O-UDA) - This agreement is intended for use by an individual or organization that owns or has the rights to distribute data for unrestricted uses. This is a "one-to-many" agreement and is intended for use with data for which there is no privacy or confidentiality concern." "Computational Use of Data Agreement (C-UDA) - This agreement is intended for use with data sets that may include material not owned by the data-providing individual or organization, but where it may have been assembled from lawfully and publicly accessible sources. This agreement allows a provider to make the data publicly available only for "Computational Purposes" (activities necessary to enable the use of data for analysis by a computer, like machine learning). This is a "one-to-many" agreement and is intended for use with data for which there is no privacy or confidentiality concern." "Data Use Agreement for Open AI Model Development (DUA-OAI) - This agreement provides terms to govern the sharing of data by an organization with another for the purpose of allowing that second organization to use the data to train an AI model, where the trained model is open sourced. This is a "one-to-one" agreement and contemplates the sharing of data for which there may be privacy and/or confidentiality concerns." "Data Use Agreement for Data Commons (DUA-DC) - This agreement might be used by multiple parties in possession of large data sets pertaining to a particular subject matter who want to share the data sets through a common, Application Programming Interface (API)-enabled database. This multiparty agreement contemplates that each party will contribute data to a common database through agreed upon APIs and then access data from that database through other agreed upon APIs."
Melanie Mitchell, ARTIFICIAL INTELLIGENCE: A GUIDE FOR THINKING HUMANS (2019), excerpts from chapters 2 and 3.
Multilayer neural networks underlie modern AI. They are a highly simplified model loosely based on the human brain. They include three kinds of layers: an input layer, "hidden" or internal layers, and an output layer. A network's number of hidden layers determines how "deep" the network is. The hidden units have weighted connections in both directions, between them and the input and output layers. (The book includes a more detailed explanation of the math.) Training a neural network on datasets essentially means calibrating its connection weights so that each unit produces very low rates of error, allowing it to recognize new examples with greater accuracy. Focus in AI research has shifted towards narrow/"weak" AI, i.e., AI that can do one task very well, like Google Translate. General AI, which could integrate many functions to simulate human or superhuman intelligence (like in the movies), has been much harder to achieve.
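The input → hidden → output structure Mitchell describes can be sketched in a few lines of Python. This is a toy illustration, not an example from the book: the weights below are made up, and each unit simply applies a sigmoid to a weighted sum of the previous layer's values.

```python
import math

def sigmoid(x):
    # Squash a weighted sum into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # Each unit computes the sigmoid of a weighted sum of all inputs;
    # `weights` holds one list of weights per unit in this layer
    return [sigmoid(sum(w * i for w, i in zip(ws, inputs))) for ws in weights]

def forward(inputs, hidden_weights, output_weights):
    # Input layer -> hidden ("internal") layer -> output layer
    return layer(layer(inputs, hidden_weights), output_weights)

# Made-up weights: 2 inputs, 2 hidden units, 1 output unit
hidden_w = [[0.5, -0.6], [0.3, 0.8]]
output_w = [[1.0, -1.0]]
result = forward([1.0, 0.0], hidden_w, output_w)  # one number in (0, 1)
```

Training, in this picture, would mean repeatedly adjusting `hidden_w` and `output_w` until the outputs match labeled examples; here the weights are fixed, so the network just maps an input to a number between 0 and 1.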
Russia Internet: Law Introducing New Controls Comes into Force, BBC (Nov. 1, 2019). https://www.bbc.com/news/world-europe-50259597
New "sovereign internet law" gives Russian officials the ability to control internet traffic. The reason given is that it would improve cybersecurity. "It gives the Kremlin the possibility to switch off connections within Russia or completely to the worldwide web "in an emergency". It is up to the government to decide what constitutes a threat and what actions should be taken." "The law requires internet service providers to install network equipment - known as deep packet inspection (DPI) - capable of identifying the source of traffic and filter content." This would allow Russia to filter all internet traffic through certain state-controlled points, enabling direct censorship or shutting off connectivity altogether.
Mari E. Dugas, Global Encryption Regulation: A Risk Assessment Framework for Technology Companies Providing End-to-End Encryption (Global Data Law research paper).
Offers a risk assessment framework that global tech companies should use as they consider whether to expand their end-to-end encryption (E2EE) products globally. Encryption is a technological way of making data unreadable to another user or entity who does not possess a "key" to decrypt the data. E2EE is based on public-key protocols. A fundamental source of disagreement between lawmakers and product producers is what counts as feasible: governments view circumventing encryption as technically feasible, while providers of E2EE say it is impossible. (See slides for chart and above for specific company info.)
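The public-key idea behind E2EE can be illustrated with a toy Diffie-Hellman key exchange (a sketch with deliberately tiny, insecure numbers, not an example from Dugas's paper): the two parties derive the same secret key without that key ever crossing the network, which is what removes the interception opportunity that sharing a symmetric key creates.

```python
# Toy Diffie-Hellman key agreement. Real systems use very large primes
# or elliptic curves; these tiny numbers only illustrate the idea.
p, g = 23, 5                 # public modulus and generator (known to everyone)

alice_secret = 6             # never transmitted
bob_secret = 15              # never transmitted

alice_public = pow(g, alice_secret, p)   # sent over the network
bob_public = pow(g, bob_secret, p)       # sent over the network

# Each side combines its own secret with the other's public value
alice_key = pow(bob_public, alice_secret, p)
bob_key = pow(alice_public, bob_secret, p)

assert alice_key == bob_key  # same shared key, never sent anywhere
```

An eavesdropper sees only `p`, `g`, and the two public values; recovering the shared key from those is the hard problem the security rests on.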
Dillon Kraus, Global Encryption Regulation Strategies and Trends (Guarini Global Law & Tech research memo), as excerpted.
One way that countries regulate data is by regulating encryption. Because it's an infrastructural form of security, it's easier to regulate than other forms. There are two primary concerns motivating the regulation of encryption: (1) the need to enable free trade flows while maintaining reasonable consumer protections (achieved by promoting encryption) and (2) the desire to empower government access and control (centers on facilitating government access to encryption keys). Encryption is a method of controlling access to data. It insulates data from unwanted access but can also shield bad actors from scrutiny. Two broad methods of encryption. Symmetric key encryption → the same key is used for encryption and decryption, so to enable another party to decrypt a message, the sender must share the key. Asymmetric key encryption → the keys for encryption and decryption are different: the original party shares a public key, used to encrypt, while the private key used to decrypt is kept secret. Generally most programs now use the latter, so "whoever controls the key controls access to the encrypted information." There are several different modes for regulating encryption: Some nation states (e.g., U.S., EU, Brazil) explicitly require encryption in certain areas, such as the Federal Information Processing Standards (FIPS), HIPAA, or the Gramm-Leach-Bliley Act (financial institutions) in the U.S. These types of law typically do not seek to control the encryption method or key. "Implicit in these regulations is the idea that encryption is a common-sense best practice, and there is an assumption that it will be used with little prodding from national regulators." (p. 7). These types of laws aim to reconcile free data flows with privacy concerns. Other regulations implicitly recommend encryption, for example the California Consumer Privacy Act (CCPA), the New York Privacy Act, and the Personal Information Protection and Electronic Documents Act (PIPEDA) in Canada.
(E.g., by saying that failing to implement appropriate safeguards can result in penalties.) There are also regulations that explicitly restrict encryption, typically involving tensions between data holders and law enforcement. Law enforcement sometimes argues for exceptional access mechanisms that would allow for selective penetration of the encryption algorithm. This is generally seen as a bad idea - it creates a systemic vulnerability. Australia → Telecommunications and Other Legislation Amendment (Assistance and Access) Act: companies must provide a way to access encrypted data via a warrant process, but cannot be compelled to introduce a systemic vulnerability into software or hardware. UK - Investigatory Powers Act → requires communication service providers to have the ability to remove encryption they applied. India → draft laws appear to indicate movement in this direction. (For more examples, see pp. 12-15.) There are also laws that don't directly deal with encryption but conflict with it in some way. For example, the EARN IT Act in the US aims to make tech companies earn platform liability immunity by meeting a list of best practices set by a government agency, which could possibly include being compelled to break encryption. There's also debate about whether the Fifth Amendment could be used as protection against compelled decryption (in order to avoid self-incrimination). In Russia, the FSB Laws give the FSB authority over the information security field, including encryption. Effectively, if the FSB deems it necessary, encryption is not a protection. Tech companies play a role in encryption practices. Facebook states that it plans to use end-to-end encryption in its services. Google and Microsoft suggest encryption at all stages but have a flexible encryption approach.
AWS, Alibaba Cloud, and other cloud providers tend to offer client-side encryption (encrypting before uploading, so the client controls the encryption process and keys) and server-side encryption (encrypting data when it is uploaded and decrypting it when it is downloaded).
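The symmetric half of the memo's distinction can be made concrete with a toy cipher (a deliberately insecure XOR sketch, not any provider's actual scheme): because the same key both encrypts and decrypts, the sender must somehow get the key to the recipient, which is exactly the step that asymmetric encryption avoids.

```python
# Toy symmetric cipher: XOR each byte of the message with a repeating
# key. Insecure in practice -- the point is only that encryption and
# decryption use the SAME key, so the key must be shared.
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR is its own inverse: applying it twice with the same key
    # returns the original bytes.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"shared-secret"                      # must reach the recipient somehow
ciphertext = xor_cipher(b"meet at noon", key)
plaintext = xor_cipher(ciphertext, key)     # same key, same function
assert plaintext == b"meet at noon"
```

In the client-side model sketched above, the customer would run something like `xor_cipher` (with a real algorithm) before upload and keep `key`; in the server-side model, the provider holds the key and performs both steps.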
Infrastructure for Data
Physical (cables, data centers, etc.); informational (standards for data classification, formats, etc.)
ALI-ELI principles for a Data Economy - Data Transactions and Data Rights, pages 124-128; 156-158; 162-165; 166-170.
See article for US and EU cases/examples of each principle. Principle 16: Data rights. Data is a non-rivalrous resource, and data rights are not tied to ownership or property principles. Principle 21: Desistance from data activities with regard to co-generated data. A party should have the right to demand a data controller desist use of and/or delete data, if the data could reasonably cause harm to the party or if "the purpose of the data activities is inconsistent with the way that party contributed to the generation of the data" Principle 23: Economic share in profits derived from co-generated data. Parties usually are not entitled to profits made from data except for when there is a contract or agreement for the sharing of said profits, UNLESS: "(a) that party's contribution to the generation of the data (i) was sufficiently unique that it cannot, from an economic point of view, be substituted by contributions of other parties; or (ii) caused that party significant effort or expense; and (b) profits derived by the controller are exceptionally high; and (c) the party seeking an economic share was, when its contribution to the generation of the data was made, not in a position to bargain effectively for remuneration." "Principle 24: Justification for data rights and obligations (1) The law should afford data rights for the public interest, and for similar reasons independent of the share that the party to whom the rights are afforded had in the generation of the data, only if the encroachment on the controller's or any third party's legitimate interests is necessary, suitable and proportionate to the public interest pursued."
Civil Society Calls on International Actors in Afghanistan to Secure Digital Identity and Biometric Data Immediately (August 2021).
Statement from organizations (including Access Now and EFF) calling on international actors to secure digital identity and biometric data. As the Taliban takes over Afghanistan, there is a concern that actors will target human rights activists and other people as they gain control over information. There are at least 3 digital identity systems in Afghanistan: the Afghanistan Automated Biometric Identification System (maintained by the Afghan Ministry of Interior with support from the U.S.); the e-Tazkira electronic national identity card system (Afghan National Statistics and Information Authority); and the Handheld Interagency Identity Detection Equipment (U.S. military). This last one was seized by the Taliban. There are also other smaller biometric systems, such as those from UNHCR and WFP. IOs and other organizations have pushed for digital identity systems, but the flaws in them mean they can now be used by the Taliban to monitor and find people. People in Afghanistan have little to no control over their participation in these systems → required for access to public services, to vote, etc. The systems also contain sensitive info. This is why mandatory and centralized collection of extensive data (esp. biometric data) is always dangerous. Urges agencies and private actors who deal with digital identity tools in Afghanistan to: shut down systems and erase data; impose a moratorium on continued use of biometrics without human rights assessments; control and audit parties that have access to data; ensure there is no unrestricted access to data; move data off computer infrastructure physically located in Afghanistan; implement data leak protections; inform anyone whose data has been compromised or who may be at risk; publicly announce data breaches; etc.
LISA M. AUSTIN AND DAVID LIE, Safe Sharing Sites, 94 NYU LAW REVIEW 581 (2019).
Summary: "In this Article we argue that data sharing is an activity that sits at the crossroads of privacy concerns and the broader challenges of data governance surrounding access and use. Using the Sidewalk Toronto "smart city" proposal as a starting point for discussion, we outline these concerns to include resistance to data monopolies, public control over data collected through the use of public infrastructure, public benefit from the generation of intellectual property, the desire to broadly share data for innovation in the public interest, social—rather than individual—surveillance and harms, and that data use be held to standards of fairness, justice, and accountability." The Sidewalk Toronto example has two key features: 1. The de-identified data will be open by default and shared widely; 2. Creating a Civic Data Trust to "manage data in the public interest." The authors suggest "safe sharing sites" as a method of data sharing and data governance. A safe sharing site is a piece of infrastructure that allows an organization to let external players perform certain analyses on the data it holds without allowing anyone outside the organization to see the raw data. (Similar to a "federated system," but it would operate independently of the organizations that want to share data.) This is also similar to a "data trust" in the sense that it is a repeatable framework of terms and mechanisms. Data sharing implicates both individual privacy concerns and larger data governance concerns such as bias, algorithmic accountability, and profiling. But greater transparency and reviewability can resolve some of those concerns. Sidewalk Toronto proposes to deal with privacy concerns about personally identifiable information (PII) by de-identifying the data at the time of collection. The authors say this is insufficient, because re-identification is possible and de-identifying can warp the data. Furthermore, the line between what is and isn't PII is blurry.
One challenge would be legal complexity from having organizations from multiple jurisdictions. Safe sharing sites would include a registry of relevant data laws for the jurisdictions of each data-sharing organization, and each jurisdiction would decide who can access the data (for auditing and transparency purposes).
International Organization for Standardization (ISO)
Swiss private association (based in Geneva). Has published 22,955 international standards. The Agreement on Technical Barriers to Trade (TBT Agreement) of the World Trade Organization (WTO) → directs members to use international standards like those developed by ISO as the basis for domestic technical regulations. Standards are available for purchase for a fee.
Alexandra Giannopoulou, Understanding Open Data Regulation: An Analysis of the Licensing Landscape, in Open Data Exposed (Bastiaan van Loenen, Glenn Vancauwenberghe, & Joep Crompvoets eds., 2018)
Article 2.1 of the Berne Convention defines literary and artistic works, thus requiring that data pass the originality test to be copyrightable. Article 2.5 permits copyright in collections of data (also supported by the WIPO Copyright Treaty of 1996 and Article 10(2) of the Agreement on Trade-Related Aspects of Intellectual Property Rights of 1994). European Court of Justice: "it is only through the choice, sequence and combination of those words that the author may express his creativity in an original manner and achieve a result which is an intellectual creation." A sui generis database right exists in Europe; it gives the creator of a database a 15-year exclusive right in order to protect the economic investment put into creating the database (NOT creating the data). Public databases are theoretically also covered, but some courts have ruled against them. Transnational Open Data Licensing Models: Creative Commons: an American NGO that has created a set of free, easy-to-use copyright licenses. All licenses are based on four elements: Attribution (BY) (requires the licensee to indicate the name of the author on every distribution), No Derivatives (ND) (forbids altering the licensed material), Non-Commercial (NC) (forbids commercial use), and Share Alike (SA) (allows derivatives but requires the derivative work to be licensed under the same type of license as the original). "The licenses exist in three different formats or "layers". The license is first delivered as a summary of its core elements, called the commons deed or human readable license. Then, the second layer is the legally binding license called the legal deed, and third is the machine-readable license, which describes the permissions and restrictions of the license in a form of digital-rights expression making it easier to identify and manage the shared work."
Previous versions of CC licenses used "porting" to adapt the licenses to local jurisdictions, but now they are designed to be used internationally. CC0 = no rights reserved; it places material into the public domain. But that means other people don't have to respect the principles of open data. Separate from CC, the Public Domain Dedication and License (PDDL) = a waiver of all rights and claims in a database. "The innovation of the PDDL is the voluntary addition of community norms in the use of open data" by providing an example set of norms to follow. Open Data Commons created two database-specific licenses using the share-alike and attribution principles from Creative Commons: the Attribution License (ODC-By) and the Open Database License (ODC-ODbL). They are similar to the PDDL in that they are database-specific, but they are binding. Some countries (UK and France) have also created open data licenses. The UK's Open Government Licence only requires attribution and a link to the license (although the UK has also created a Non-Commercial version and a Charged Licence, which allows the government to charge for the license). The French Licence Ouverte/Etalab requires attribution and makes public sector data automatically open, with an exception for personal data. There has been fragmentation/creation of a range of types of data licenses, which is problematic because many of the licenses are incompatible. In recent years there has been an effort to create compatible licenses, but open data licenses are frequently not included (Creative Commons has a compatibility list that does not include open data licenses). Another issue is that open data (and other) licenses only work when there is an underlying sui generis right or copyright over the data.
James Alford, Can the California Consumer Privacy Act Curb Big Tech? https://www.theregreview.org/2020/12/17/alford-can-california-consumer-privacy-act-curb-big-tech/
The California Consumer Privacy Act (CCPA) → digital privacy law. Gives consumers privacy rights, including the right to access (request-and-respond requirement → the right to request data from a business, which is required to respond), the right to deletion, and the right to opt out of the sale of their data. The request-and-respond model originated with FOIA, but there have been many deficiencies, such as underfunding and understaffing, plus expansive exemptions. Alford argues that it is unlikely that the CCPA, even if it increases transparency, will rein in the personal data economy. Individual requesters are more likely to lead to delayed responses, and most people can't sort through data dumps and extract the important info. The CCPA does little to discourage trading data to join a social media platform or download a game; it obscures the harm posed by data aggregation. Alford considers an alternative, an "affirmative disclosure regime that would require entities to publish the type of data they have about a particular consumer, why and when they collected the data, and with whom they shared it" (p. 3), but it would face the same problems. Instead, he recommends a tailored tax regime on the tech industry at the point of data collection. This would force companies to carefully consider how, when, and why they collect data. If the tax were passed on to consumers → companies would have to convince them that the data collection is in their best interest.
Nadezhda Purtova, The law of everything. Broad concept of personal data and future of EU data protection law, 10 LAW, INNOVATION AND TECHNOLOGY 40 (2018).
The GDPR is becoming too broad and inclusive, running the risk of becoming impossible to comply with and therefore ignored. As our daily lives become more datafied, more and more data can be considered to be "personal" and therefore fall within the scope of the law. "in the age of the Internet of Things, datafication, advanced data analytics and data-driven decision-making, any information relates to a person in the sense of European data protection law." But the broad definition of personal data isn't a problem--"if all data has a potential to impact people and is therefore personal, all data should trigger some sort of protection against possible negative impacts." The issue is that the GDPR isn't scalable. There should be different levels of protection in different data processing situations. This article discusses the GDPR, the Data Protection Directive, and Article 29 Working Party ('WP29') opinion on the concept of personal data to demonstrate how the definition of personal data is very flexible and expansive. Case law: in Google Spain and Lindqvist, the court found that proportionality is adequately considered at the national implementation stage and therefore is secondary to issues of scope in the court's analysis. Breyer: a dynamic IP address was found to be personal data because "The website provider was found to have the means reasonably likely to be used to identify the website visitors on the basis of a dynamic IP address with the help of third parties, namely, the Internet service provider and the competent authority." Nowak: any information really means all information, not just that which is sensitive and private. SOLUTION: "seek remedies for 'information-induced harms' - understood broadly as any individual or public negative consequences of information processing - without a sentimental attachment" to the idea of "personal" data.
Jean-François Blanchette, Introduction: Computing's Infrastructural Moment, in: Christopher S. Yoo & Jean-François Blanchette (eds.), Regulating the Cloud: Policy for Computing Infrastructure (2015), chapter 1.
The cloud can be defined in many different ways (list on 2nd page), but its key features are increased centralization, increased processing power, increased integration between processing, storage, and networking, and increased urgency re: access to broadband. Author: "I propose instead that 'the Cloud' is shorthand for the moment where computing has become, both materially and symbolically, infrastructure; that is, a sociotechnical system that has become ubiquitous, essential, and foundational." Like other forms of infrastructure, the Cloud "developed incrementally, from the progressive laying down of its infrastructural components, including data centers, fiber cables, economic models, and regulatory frameworks. Such incremental development means that early-stage design choices persist, often with unforeseen consequences, and become increasingly difficult to correct as the infrastructure becomes ubiquitous, its functionality expands, and the nature of the traffic it serves evolves." (p. 3). The Cloud has become increasingly integral to other forms of infrastructure. It is also expected to keep up with enormous rates of growth. "The Cloud thus emerges at the historical confluence of several long-standing technical traditions within computing: modularity, which has allowed cloud providers to create unprecedented amounts of computing power by merely pooling together massive numbers of low-cost, off the shelf components; virtualization, which makes it possible to distribute, meter, and charge for these computing resources in highly granular and flexible ways while allowing continuity with legacy software designs; and distributed architectures, which allow for the partitioning of computing resources between mobile devices and data centers."
Three types of cloud: software as a service (SaaS), like Google Drive; infrastructure as a service (IaaS), which provides virtual machines, like Rackspace; and platform as a service (PaaS), like the Google App Engine, which offers direct access to the cloud's computing power. Two intersecting issues: criticality (we come to rely on the cloud) and reliability (therefore, it must be reliable). Privacy and liability are two more potential issues. The cloud is decentralized and delocalized, which creates potential legislative issues. For example, the switch from providing a coherent "thing" like a system to providing a service changes the type of contracts that are used. "Bauch coins the term body-data to recognize that electronic data, stored remotely in data centers and moving through digital networks, "contributes to the objecthood of something else (in this case, a human body) in a different location."
Aaron Perzanowski and Jason Schultz, The End of Ownership (2016), chapter 8: 'The Internet of Things You Don't Own'.
The internet of things refers to a range of devices that generally combine embedded software, network connectivity, microscopic sensors, and large-scale data analytics. So many objects are now computers, e.g., phones, cars, light bulbs, etc., and are therefore susceptible to the external limitations and controls of other digital goods. This has implications for ownership and consumer welfare (see the example of a smart thermostat being discontinued, pp. 140-141). Possible beginning of IoT → the iPhone as a walled garden approach: you can only install software Apple allows, can only configure settings Apple gives access to, etc. This sparked a practice of jailbreaking to install software, replace the operating system, and customize the phone. In the legal battle that followed, Apple argued that iPhone users are licensees, not owners, of the OS. The Copyright Office ruled in favor of the right to jailbreak phones but sidestepped the issue of ownership. Farm example: Monsanto sold the herbicide Roundup, which also damaged crops. It manufactured crops resistant to Roundup and "licensed [them] for a single season" to farmers, so farmers couldn't replant the following year; instead they had to buy new seeds. John Deere imposed a software layer between farmers and tractors in order to force farmers to have equipment serviced by authorized John Deere dealers. John Deere claimed it owned the code in tractors it sold to farmers and that farmers had an implied license for the life of the vehicle. This raises prices for farmers and also impacts innovation. Modern cars (GM, Mercedes-Benz): also contain tools that weaken the property interests of owners, including DRM that prevents repair and customization, software that monitors and controls your driving, and even restrictions on vehicle resale. The manufacturers again argued that the underlying computer software was licensed. Potential to be locked out of your car.
The Copyright Office gave a special DMCA exemption to allow vehicle owners to break DRM and access software for repair, security, and personal modification. But exemptions only last 3 years. Keurig: tried to prevent people from using non-Keurig pods in its coffee makers. Angry public reaction, and Keurig reversed course (slightly): it still blocked competitors but made a pod for people to use their own coffee grounds. Barbie: IoT Barbie can converse with a child and learn about them. It does this by recording the child and transmitting the recordings to a third-party cloud-based speech recognition service. The third party and unnamed partners can use the information about children's conversations for testing, enhancing tech, AI, etc. Pacemaker: a pacemaker ran on proprietary software. Manufacturers won't let patients look inside or test the devices they purchase, nor can patients read the data from their own device while at home or on the road; they can only access it from manufacturer-approved sources. The IoT threatens our sense of control over the devices we purchase.
JAMES GRIMMELMANN, INTERNET LAW: CASES & PROBLEMS (11th ed., 2021) 25-34 (Technical Primer: The Internet).
The key to every internet network is a protocol: a set of rules/specifications that describes how computers should interact and communicate with each other on the network. There are different protocols for different physical mediums. Physical mediums that connect two computers according to a protocol include ethernet, wi-fi, EDGE, fiber optic cables, etc. But most computers are not on a single shared network. The solution is the Internet Protocol (IP). Routing = sending information from computer A along a chain of computers (routers) to computer B using multiple types of physical medium, all of which have the ability to carry IP messages. The different mediums encode the message differently, but the contents stay the same. Every computer has a unique IP address, which allows each computer along the path to cross-reference the address with its "routing table" and decide which nearby computer to send the message along to. Packet-switching: most messages are too big to send all at once, so they are broken up into packets which are sent separately, usually along multiple paths. It is flexible, resilient, and allows the packets to reroute and avoid "traffic jams" to travel more efficiently. Besides IP, computers usually use a number of other protocols simultaneously, called a protocol stack. Example of a protocol stack used by a typical home computer: • Application (e.g., email, web, etc.) • Transport (TCP, transmission control protocol): keeps track of which packets it has received; if one is missing, it requests that the other computer resend it; also determines how fast packets are sent to match the available bandwidth • Network (IP) • Link (Ethernet) • Physical (category 5 cable). Applications: Servers are computers that host some kind of service (like email or a website). Clients are computers that request information from servers. When there is no clear distinction between which computers are acting as servers and which as clients, it is called "peer to peer."
Applications usually have their own protocols that are layered on top of the rest. "The Web" operates via two protocols: HTTP, which allows servers to send web pages to clients, and HTML, which determines what the websites actually look like and do. Domain-name system: when you enter a URL, the computer asks a domain-name server for the domain's IP address. If that server doesn't know the address, it redirects the computer to other name servers until it receives the IP.
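Grimmelmann's description of packet-switching can be sketched in a few lines of Python (an illustration only; real packets carry full IP headers rather than bare sequence numbers): the message is split into numbered packets, which may arrive out of order over different paths and are reassembled by sequence number.

```python
# Sketch of packet-switching: split a message into numbered packets,
# let them arrive out of order, and reassemble them at the destination.

def to_packets(message: str, size: int):
    # Tag each chunk with a sequence number so order can be restored
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets):
    # Sort by sequence number, then join the payloads back together
    return "".join(payload for _, payload in sorted(packets))

msg = "most messages are too big to send all at once"
packets = to_packets(msg, 8)
packets.reverse()  # simulate out-of-order arrival along different paths
assert reassemble(packets) == msg
```

TCP's bookkeeping, in the protocol-stack example above, is essentially what this sketch glosses over: noticing a missing sequence number and asking the sender to retransmit that packet.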
How does the law regulate data?
The law can regulate: activity (what is being done to data); an entity or person (directed at the "person" doing something to data); computers/machines.
World Food Programme, A statement on the WFP-Palantir partnership (Medium, Feb 7, 2019).
The private sector can offer digital analytical solutions to the World Food Programme (WFP). The WFP logistics and supply chain teams have worked with Palantir to purchase and deliver food. WFP holds beneficiary data in a secure system, hosted on UN premises, that is subject to regular independent stress tests and verification. WFP has also developed data privacy policies. Private partners must work with/agree to these principles to work with WFP. States that Palantir "understands" the WFP's commitment to data privacy values and has agreed to rules of engagement, including: no access to beneficiary information; does not provide data to WFP or collect data on WFP's behalf; only analyzes data not related to beneficiary info. WFP maintains strict control of systems and determines which data sets are provided and for what. Palantir treats WFP data as confidential, won't use it for commercial benefit, won't share it, and won't use it for data mining. WFP retains full control over the data, analysis, and derivative work. Each company WFP works with also goes through a due diligence process.
Data Localization Model
data is stored in one location
Angelina Fisher and Thomas Streinz, Confronting Data Inequality, IILJ Working Paper 2021/1, Section 2C (Data Rights).
The rights-based approach towards data emerged in the 1970s. The EU is a major player--see the 1995 Data Protection Directive and the GDPR. The US does not have any equivalent federal law, but "California passed a consumer privacy act (CCPA) in 2018, augmented in 2020 by the California Privacy Rights Act." Data protection and privacy laws are ineffective in challenging unequal control over data, partly because they are underenforced but partly by design. They focus on "personal data," which incentivizes finding ways to use nonpersonal data or to de-personalize data (by anonymizing it, using synthetic data, etc.). But there are still inequalities perpetuated by such "nonpersonal" data. Another issue is the focus on empowering individuals to challenge the use of their data or to delete their data. Individuals do not use that power, at least not in the quantities necessary to have much impact. Plus, the rule only applies to data the subject provided, not data that was inferred about the subject. Data portability = the right to transfer one's data from one platform to another; it goes a step beyond traditional data protection. But a right to data portability is not very useful when data is contextual and there is no rule about how the data should be transferred. Two ways data protection law can increase inequality: (1) companies use data protection concerns as an excuse to refuse to provide data for the public good (research etc.); (2) increased transaction costs may privilege richer entities' access to data.
Bradford Biddle, No Standard for Standards: Understanding the ICT Standards-Development Ecosystem, in The Cambridge Handbook of Technical Standardization Law (J. Contreras ed., 2017) 17-28.
There are "no standards for standards." Technological standards are created, maintained, and propagated in a variety of ways by a diverse set of actors.
Standards-setting organizations (SSOs) = inclusive superset of formal standards-development organizations (SDOs) and consortia.
Single promoter standards = intentionally created and promulgated by single companies for adoption by third parties. Examples include the I2C standard.
Standards-development organizations (SDOs) → formally recognized by some government authority. Examples include the ITU and ANSI. Types of SDOs:
• Big Three (Four) international SDOs: International Organization for Standardization (ISO), International Electrotechnical Commission (IEC), International Telecommunication Union (ITU), and JTC-1 (joint technical committee of ISO and IEC)
• Regional
• National
• Large private sector-led/small private sector-led
• SDOs of SDOs
Consortia are private sector-led organizations that create or otherwise support standards, but are not formally recognized by a government authority. There are two kinds. Incorporated consortia are distinct legal entities that are usually formed and funded by companies. Contractual consortia are defined by a contractual relationship among participants. Umbrella consortia are nonprofit incorporated consortia that host other consortia under a defined framework (e.g., the IEEE Industry Standards and Technology Organization (ISTO) and the Linux Foundation).
Definitions of open standards fall into two camps: (1) a process-oriented definition favored by traditional formal SDOs; formal standards are more likely to satisfy this definition. Example → the OpenStand principles: due process, broad consensus, transparency, balance, openness. (2) A royalty-free intellectual property-oriented definition favored by the open source community. Example → the Open Source Initiative's open standard requirements: availability, no agreements.
Biddle performed a study on a laptop computer and identified 251 interoperability standards on it: 44% developed by consortia, 20% single promoter standards, 36% formal SDOs. From this we can see that consortia and single promoter standards play a critical role in standardization. Consortia can also interact with formal SDOs in a number of ways, such as explaining and marketing new technologies for consumers. (See document for full list, p. 24.) However, many consortia standards are arguably not "open standards" under the process-oriented definition. ICT standardization is changing rapidly for several reasons. Functions that were previously implemented in hardware are now being implemented in software. Frustration with the slow pace of the standardization system also drives change. (Open source software can move faster than SSOs to solve interoperability problems.)
GDPR, Article 3 ("territorial" scope of application).
1. This Regulation applies to the processing of personal data in the context of the activities of an establishment of a controller or a processor in the Union, regardless of whether the processing takes place in the Union or not.
2. This Regulation applies to the processing of personal data of data subjects who are in the Union by a controller or processor not established in the Union, where the processing activities are related to: (a) the offering of goods or services, irrespective of whether a payment of the data subject is required, to such data subjects in the Union; or (b) the monitoring of their behaviour as far as their behaviour takes place within the Union.
3. This Regulation applies to the processing of personal data by a controller not established in the Union, but in a place where Member State law applies by virtue of public international law.
Microsoft Computational Use of Data Agreement (C-UDA)
This document is an example C-UDA with annotated explanations of some provisions.
Microsoft Open Use of Data Agreement (O-UDA)
This document is an example O-UDA with annotated explanations of some provisions. The O-UDA requires attribution, disclaimers and limitations on liability to be included with any redistributed data (does NOT pertain to results of processing).
World Economic Forum, "Federated Data Systems: Balancing Innovation and Trust in the Use of Sensitive Data" (White paper), Title page and pages 4-10
This paper agrees with the Borgogno article that fragmentation and lack of interoperability pose a threat to future innovation using data. However, it seeks to balance the need for protecting sensitive data (by siloing it apart from other data) with innovation's need for openness. The solution proposed is the use of federated data systems, in which organizations hold their own data separately. A researcher can use an API to send a query; the API "visits" each organization, processes the data, and then returns the answer to the query to the researcher, without the data ever leaving its original location. "Central to this approach is federated system architecture, with multiple interconnected nodes that align on shared principles and open standards to ensure security, interoperability and reliably high performance. While physical (and legal) entities are distributed around the world, they are logically connected through interfaces that allow seamless, authorized access to secure data." APIs and a common technical framework are essential here. "Another unique technical differentiator of federated systems is that the computation moves to the data (i.e. the data does not leave the organization). As such, the business, legal, technical and societal risks inherent with data transfer and/or centralization are greatly reduced." This decentralized method spreads out costs and reduces some of the risks of centralization (such as illegitimate access). It also enables local control/autonomy. Challenges: the main issue is coordinating many different stakeholders with different interests. Data access--who can access what data? Who pays for the maintenance of the system? Which organizations can join the federation? What happens when an org leaves? Is their data erased? What about the products of that data?
Further challenges: transparency; who owns the data?
Next steps:
• Development of robust and agile governance frameworks
• Development of open technical standards
• Focus on shared principles
• Accelerate pilots that deploy federated data systems
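The "computation moves to the data" idea can be sketched in a few lines (hypothetical data and interface; real federated systems add authentication, query auditing, and often privacy-preserving aggregation). Each organization's node runs the query locally and returns only an aggregate; the raw records never leave the node:

```python
# Sketch of a federated query: send the computation to each node,
# combine only the aggregate results.
from typing import Callable

class Node:
    """One organization in the federation, holding its own records."""
    def __init__(self, records: list[dict]):
        self._records = records  # raw data never leaves this object

    def run(self, query: Callable[[list[dict]], int]) -> int:
        return query(self._records)  # computation happens at the data

def federated_total(nodes: list[Node], query: Callable[[list[dict]], int]) -> int:
    """'Visit' each node with the query and combine the partial results."""
    return sum(node.run(query) for node in nodes)

org_a = Node([{"patients": 120}, {"patients": 80}])
org_b = Node([{"patients": 50}])

count_patients = lambda records: sum(r["patients"] for r in records)
assert federated_total([org_a, org_b], count_patients) == 250
```

The researcher gets the answer (250) without any node disclosing its underlying records, which is the risk-reduction the white paper emphasizes.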
Tire Data Ownership, Safety Explored (Sept. 17, 2021). https://www.vatanbir.org/ownership-of-tire-data-safety-explored/
Tire companies can put monitoring devices in tires, which they then sell to companies (usually delivery companies) that own fleets of trucks. The data from the tires is processed through the cloud and returned to the fleets, who (usually) can control who has access to the data. But who actually owns that data, the tire companies or the fleets? (Unclear.) As of the time of the article, the chips were read-only, not editable by the fleets.
Julie Cohen, BETWEEN TRUTH AND POWER: THE LEGAL CONSTRUCTIONS OF INFORMATIONAL CAPITALISM, Chapter 1 ("Everything Old Is New Again—Or Is It?"). The whole book is available at https://juliecohen.com/between-truth-and-power/.
Two fundamentally incorrect ideas about the relationship between technology and law: (1) law is standing in the way of progress in the form of technology; (2) technology fatally undermines the rule of law. Neither law nor technology is an immovable, monolithic force. The chapter focuses on power in the legal-institutional context, not in the abstract. Informational capitalism = "market actors use knowledge, culture, and networked information technologies as means of extracting and appropriating surplus value." We are moving away from an industrial economy and towards an informational economy. Neoliberalism is the primary school of thought associated with informational capitalism (p. 7 for discussion and definition). Neoliberal governmentality wants to reshape government in the image of markets. Law has a facilitative role in economic and ideological transformation. Law is already transforming in the information age in response to powerful interests: "Law for the information economy is emerging not via discrete, purposive changes, but rather via the ordinary, uncoordinated but self-interested efforts of information-economy participants and the lawyers and lobbyists they employ." (p. 9)
Microsoft Corp. v. United States
U.S. case. The U.S. government wanted access to email in a drug trafficking case; the servers were in Ireland, but the data could be accessed from the U.S. The government issued a warrant under the Stored Communications Act (SCA). Microsoft —> argued the presumption against extraterritoriality applies to U.S. warrants.
SDNY: the warrant does not violate the presumption against extraterritoriality.
2nd Circuit: reversed; the SCA does not authorize extraterritorial application.
Supreme Court oral arguments:
• Microsoft: executing the warrant is essentially a material/physical process; most of the action happens outside the U.S.
• Arguments for Microsoft: "floodgates" —> other countries might do the same; protecting U.S. citizens; obligation to another government; conflict with the territorial jurisdiction of Ireland; privacy violation in Ireland; the government could instead ask for assistance under an MLAT (mutual legal assistance treaty)
• Arguments for the government: Microsoft is a U.S. company; access/control; jurisdiction —> "effect" (crime); no comity issue
Not resolved in Court because the government passed the Clarifying Lawful Overseas Use of Data Act (CLOUD Act).
URL
Uniform Resource Locator (a standard for addressing resources on the internet, built on top of the DNS)
Dan Ciuriak, Unpacking the Valuation of Data in the Data-Driven Economy (April 27, 2019).
What does it mean when we refer to data as the "new oil" (i.e., the essential capital of the data-driven economy)? In IoT, big data is information related to the physical processes of a company's business activity, captured by its monitoring equipment. Acquisition is not transactional; rather, the capture, curation and classification of data is a library function (part of general business management). Basic information becomes data in this sense when the analysis of a set of data yields a pattern that is not present in the individual data points and "when actions taken on the basis of that information feedback to affect the process that generated the data in the first place." Ways in which data can be monetised:
• Exploiting information asymmetry → economies of scale; economies of scope (increase in value of data when it can be cross-referenced with other types of data); network externalities promote market concentration and market share capture; can lead to market failure/monopolies
• Enabling machine learning → allows for the acceleration of innovation; speed advantage over other firms
• Creation of machine knowledge capital → a factor of production that can be reproduced at essentially zero marginal cost and distributed globally easily; will capture tasks that formerly went to people, enabling market share capture
• Optimization of process → big data enables firms to improve business processes, reduce costs, and increase operating margins
• Capture of market surplus → big data on consumer preferences and habits allows companies to apply different prices based on each individual consumer's reservation price (example: Uber)
• Monetizing open data → the value of data is reflected in the market valuations of the firms that exploit it
• Strategic value → future military advantage
Conclusion: only the market value of data-driven firms appears to be a realistic approach to capturing the value of data.
Julie Cohen, Review of Zuboff's The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power, Surveillance & Society 17(1/2): 240-245.
Zuboff retheorizes data-driven algorithmic commercial surveillance as instrumentarian power. Zuboff argues that "because a principal modus operandi of surveillance capitalism involves brushing legal restrictions aside, surveillance capitalism is best characterized as an assertion of a right to lawless space." But the reviewer thinks this does not adequately capture the complexity of the relationship between law and surveillance capitalism. Cohen applies a Polanyian analysis (the move from agrarianism to industrial capitalism, in which labor, land, and money were reconceptualized as commodities, and the British legal system provided the framework for appropriation, enclosure, displacement, extraction, and accumulation of natural resources from farms to factories) to the present era by framing the move to informational capitalism in terms of three large-scale shifts: the propertization (or enclosure) of intangible resources, the dematerialization and datafication of the basic factors of industrial production, and the embedding (and rematerialization) of patterns of barter and exchange within information platforms (241). Legal institutions allow information economy actors to advance their goals by making economic rationality specific and detailed. The process of data extraction includes the legal construct of the public domain → the idea that there are raw materials that are there for the taking. (Creates a field for acts of appropriation.) Another legal tool is the terms-of-use agreement. Data and algorithms have immense commercial value and therefore become the target of strategies that attempt to shift them in such a way as to favor the needs and desires of surveillance capitalists in order to appropriate the data/algorithms. Terms-of-use agreements move personal info from the field of the public domain to enclosure. Dual-class stock ownership gives privileged groups of shareholders greater voting power. (Allows founders to retain authority over corporate strategy after IPO.)
Another tool is constitutional and statutory strategies that avoid accountability for harm (for example, the use of the First Amendment to protect information processing as speech). Tech firms work to invoke previous understandings of speech rights but also work to alter the ways those rights are understood. Regulatory institutions have also failed to constrain surveillance capitalism. One problem is an imbalance of resources between firms and regulators. New tech can also be used to define and achieve compliance with regulatory targets. There has also been a shift from liberal forms of governmentality under the industrial political economy to neoliberal governmentality under informational capitalism. Neoliberal governmentality brings market techniques and methods into government and subjects them to managerial oversight. Managerial regulation requires elites with tech and info skills to manage bureaucracy. This leads to the idea of an administrative state that sees new tech as a way to manage legal and regulatory processes, which enables surveillance capitalism. (See quote on right.) Surveillance capitalists have also encouraged regulation that favors self-regulation and self-certification over actual oversight.
Software
a set of instructions to perform a specific task, or a set of tasks, written in a specialized language intelligible to a computer
Protocol
a set of rules/specifications that describes how computers should interact and communicate with each other on the network
Data Feminism
a way of thinking about data, both their uses and their limits, that is informed by direct experience, by a commitment to action, and by intersectional feminist thought. Also considers how race, class, sexuality, ability, age, religion, geography, etc. are factors that together influence each person's experience and opportunities in the world
Data Trust Model
can separate who is managing the facility where the data is stored from who is getting access to the data. How is this enforced? Examples: passcodes, encryption. Creates a contract-based, trust-based structure
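The storage/access separation can be illustrated with a toy example (the XOR "cipher" below is NOT secure and is used only to make the idea concrete; real systems use vetted ciphers such as AES-GCM). The facility holds only ciphertext, so managing storage does not grant access; only the key holder under the trust arrangement can read the data:

```python
# Toy illustration of separating data custody from data access:
# the storage facility sees only ciphertext; the trust holds the key.
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy XOR 'encryption' -- illustrative only, not secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = secrets.token_bytes(16)  # held by the trust, not by the facility
facility_storage = xor_bytes(b"beneficiary record", key)  # what the host sees

assert facility_storage != b"beneficiary record"                   # host can't read it
assert xor_bytes(facility_storage, key) == b"beneficiary record"   # key holder can
```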
Platform as a Service (PaaS)
a program on someone else's computer that creates an environment in which you can create software
Routing
sending information from computer A along a chain of computers (routers) to computer B using multiple types of physical medium, all of which have the ability to carry IP messages. The different mediums encode the message differently, but the contents stay the same. Every computer has a unique IP address, which allows each computer along the path to cross-reference the address with its "routing table" and decide which nearby computer to send the message along to.
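The routing-table lookup described above can be sketched with the standard-library `ipaddress` module (the table entries and next-hop names here are made up for the example; real routers use the longest matching prefix, as below):

```python
# Sketch of a routing-table lookup: match the destination IP against the
# table and forward to the next hop with the longest matching prefix.
import ipaddress

ROUTING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"):  "router-A",
    ipaddress.ip_network("10.1.0.0/16"): "router-B",         # more specific route
    ipaddress.ip_network("0.0.0.0/0"):   "default-gateway",  # matches everything
}

def next_hop(destination: str) -> str:
    """Pick the neighbor to forward to, using longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTING_TABLE if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)  # longest prefix wins
    return ROUTING_TABLE[best]

assert next_hop("10.1.2.3") == "router-B"
assert next_hop("10.9.9.9") == "router-A"
assert next_hop("8.8.8.8")  == "default-gateway"
```

Each router along the path repeats this lookup against its own table, which is how a message hops from A to B without any single computer knowing the full route.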
Data Shard Model
store information in multiple pieces in many places
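A minimal sketch of the shard model (hypothetical in-memory "locations"; real systems add replication, indexes, and per-shard access control): a record is split into pieces held in different locations, so no single location holds the whole record, and the pieces must be reassembled to read it.

```python
# Minimal sketch of the data-shard model: split a record across several
# storage locations and reassemble it on read.
LOCATIONS = {"site-eu": {}, "site-us": {}, "site-asia": {}}  # name -> shard store

def store_sharded(key: str, data: bytes) -> None:
    """Split data into len(LOCATIONS) pieces, one per location."""
    sites = list(LOCATIONS)
    n = len(sites)
    for i, site in enumerate(sites):
        LOCATIONS[site][key] = data[i::n]  # every n-th byte goes to this site

def read_sharded(key: str) -> bytes:
    """Reassemble the pieces by interleaving them back together."""
    pieces = [LOCATIONS[site][key] for site in LOCATIONS]
    out = bytearray(sum(len(p) for p in pieces))
    for i, piece in enumerate(pieces):
        out[i::len(pieces)] = piece
    return bytes(out)

store_sharded("doc1", b"no single location holds the whole record")
assert read_sharded("doc1") == b"no single location holds the whole record"
```

Note that each site holds only an unreadable fragment (every third byte here), which is one reason sharding is discussed alongside the localization and trust models as a way to limit what any one custodian can see.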
Infrastructure as a Service (IaaS)
the basic computing functionalities that are made available through the internet, e.g., data storage and data processing