Generative AI
Foundation models and Gen AI
"General purpose AI model means an AI model, including when trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable to competently perform a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications. This does not cover AI models that are used before release on the market for research, development and prototyping activities" (Art. 3, par. 1, n. 63). Horizontal compliance obligations for general purpose AI models (called in the EP version foundation models ), in particular relating to the transparency of copyright-protected content used in the formation of the foundation models (rec. 105). Responsibility on the provider to understand if a specific GAI falls within high risk or low risk.
Other obligations
Data Curation Duties: LGAIM developers should bear certain data curation responsibilities, such as ensuring representativeness and balance between protected groups Addressing Discrimination: Discrimination risks should be addressed during both development and deployment stages to prevent discriminatory output. AI value chain: Discrimination in AI systems should be tackled at its roots, primarily within the training data, to prevent propagation down the AI value chain. Proactive Auditing: LGAIM developers should proactively audit training data sets for misrepresentations of protected groups and implement mitigation measures. Balancing Biases: Real-world training data should be complemented with synthetic data to balance historical and societal biases. For example, gender-neutral professions could be created by automatically exchanging gender-specific identifiers. Tailored Approach: Regulatory requirements should be tailored to the size and type of training material used by LGAIM developers, ensuring proportionate compliance efforts.
Critique: Risks
Narrower definitions of high-risk AI systems still pose challenges due to the versatility of large AI models. Providers cannot exclude high-risk uses entirely, as these models may be repurposed for various applications, some of which are high-risk. Large AI models like LGAIMs are inherently high-risk due to their versatility and potential for various applications. Compliance with high-risk obligations, such as establishing comprehensive risk management systems, becomes challenging. Analyzing risks for every possible application in high-risk cases is daunting and may border on the impossible for LGAIM providers. The obligation to address risks to health, safety, and fundamental rights across all potential uses adds complexity: FUNDAMENTAL RIGHT IMPACT ASSESSMENT
Transparency for users
Professional Users: Obligated to disclose LGAIM-generated or adapted content, especially in fields like journalism or academic research. Non-Professional Users: Not required to disclose AI use, except in cases of social media where AI detection tools can uncover harmful content. Enforcement and Technical Measures: - AI Office - Enforcement of transparency rules requires technical support like digital rights management and watermarks.
Operationalizing Principles
Recital emphasizing that obligations apply primarily to original GPAI providers. Legal presumption that further adaptations of the FM, excluding significant changes in risk profile, do not invoke new obligations. Corresponding recital clarifying that this presumption covers various modification techniques.
Critique: Complementarity with other regulations
The Digital Services Act (DSA) faces criticism for being outdated upon enactment due to limitations in its scope. The DSA applies only to intermediary services, such as Internet access providers, caching services, and hosting services. However, LGAIMs don't fit into these categories as they are not comparable to traditional hosting service providers. While users prompt LGAIMs for information, it's the AI system itself that generates the content, unlike user-provided data on social media platforms. As a result, LGAIMs fall outside the scope of the DSA, leaving their content generation uncovered by the act. However: - Even though LGAIM content generation isn't covered by the DSA, it may still be subject to content liability laws. - LGAIM outputs may be regulated under speech laws, similar to online comments made by human users. - The limitations of the DSA highlight the need for updated regulations that can effectively address emerging technologies like LGAIMs in the digital landscape.
Generative AGI
A new wave of AGI technologies, termed 'general purpose AI' or 'foundation models,' possess generative capabilities. Features: - Training Data: These models are trained on extensive unlabeled data, allowing for versatile applications with minimal fine-tuning. - Process: LGAIMs analyze training data as probability distributions, sampling and mixing to create new content. - Output: LGAIMs can produce various outputs like text, images, audio, or even video based on human input. - Data Source: Developers often use openly available internet data, which may be biased or of varying quality. - Risk: Generated content can inherit biases or be harmful; developers must employ curation techniques
Categories
Artificial intelligence (AI) encompasses various technologies, broadly categorized into 'artificial narrow intelligence' (ANI) and 'artificial general intelligence' (AGI). - ANI: Also known as weak AI, ANI technologies are specialized for specific tasks and operate within predefined environments, like image and speech recognition systems. - AGI: Referred to as strong AI, AGI technologies are designed to perform a wide range of tasks, think abstractly, and adapt to new situations. Recent advancements, including large language model (LLM) techniques, have accelerated AGI development.
Liability
Assumption of Responsibilities: Deployers take on provider responsibilities if they make substantial modifications to the model, as per Article 25(1)(b) of the AI Act. Generally, deployers should not bear provider responsibilities for merely fine-tuning a general-purpose model to avoid chilling effects on the AI value chain. The challenge lies in precisely defining what constitutes a "significant modification" to the model, balancing innovation and legal clarity. Rec 23 'substantial modification' means a change to an AI system after its placing on the market or putting into service which is not foreseen or planned in the initial conformity assessment carried out by the provider and as a result of which the compliance of the AI system with the requirements set out in Chapter III, Section 2 is affected or results in a modification to the intended purpose for which the AI system has been assessed
Risks of Gen AI
Black box Unreliability ('hallucinations') Potential for misuse New systemic risks (e.g. disinformation, bias, security and safety) Unexpected capabilities potentially Impact on labor market Negative environmental impact
Classification of general purpose AI, articles 51-55
Classification based on the risk posed by the GPAI. A GPAI is considered systemic risk if: - Has high impact capabilities assessed on the basis of appropriate technical tools and methodologies, including indicators and benchmarks - Based on a decision by the Commission, either ex officio or following a qualified referral by the Scientific Panel, that an IA model for general use has equivalent capacity or impact to those referred to in point a The classification of GPAI models as posing systemic risks will initially depend on the capacity, based either on a quantitative threshold of the cumulative amount of computation used for its training, measured in floating point operations (FLOPs), or on an individual designation decision of the Commission taking into account the criteria listed in Annex IXc (e.g. number of parameters, quality and security level). The threshold identified is 10^25 FLOPs.
Additional Obligations for Systemic-Risk Foundation Models (FMs), articles 51-55
Evaluation and Red Teaming: Providers must conduct model evaluations using advanced protocols and tools. Adversarial testing (red teaming) is necessary to identify and mitigate systemic risks. Risk Assessment and Mitigation: A comprehensive risk management system is required to assess and mitigate systemic risks. This includes preventing major accidents, disruptions of critical sectors, and serious consequences to public health and safety. Cybersecurity: Ensuring a high level of cybersecurity for both the AI model and its physical infrastructure is essential to safeguard against potential threats. Incident Reporting: Providers must report any incidents to the AI Office promptly, ensuring transparency and accountability in managing risks associated with systemic-risk foundation models.
Generative AI
Generative AI models are trained on large datasets from which they learn patterns and structure and then generate new synthetic content that has similar characteristics.
Transparency requirements for LGAIMs
LGAIM provider and deployers should report on: - Provenance and curation of training data. - Model performance metrics. - Incidents and mitigation strategies for harmful content. - Ideally, disclose greenhouse gas emissions for sustainability impact assessment.
Opportunities of Gen AI
Scale and speed Low cost mentry Domain adaptation and fine-tuning General purpose/multi-tasks Innovation and creativity Efficiency and economic growth Potential to solve global challenges (e.g., health, climate)
Critique: Competition
The current regulations for General-Purpose Artificial Intelligence Systems (GPAIS) could have adverse effects on the competitive landscape of LGAIMs. - Open-source developers, including those in research or philanthropic sectors, are affected by the AI Act's stringent requirements. - Compliance with high-risk obligations may be feasible only for large players like Google, Meta, or Microsoft/Open AI due to their resources. Smaller developers and SMEs may find it prohibitively costly to meet these requirements, potentially leading to market concentration. - This contradicts the aim of promoting innovation and competitiveness, as outlined in the AI Act. - Similar challenges have been observed with GDPR regulations, leading to anti-competitive concentration
Critique: too broad definitiom
The definition of GPAIS in Article 3 AI Act is overly broad. Inspired by foundation models and LGAIMs, GPAIS operate with vast parameters, training data, and compute power. While not yet approaching artificial general intelligence, LGAIMs are more versatile than traditional AI systems. Issues: - LGAIMs can solve tasks they weren't specifically trained for and cover a broader range of problems. - The concept of "generality" in GPAIS refers to their ability, tasks, or outputs. - The current definition in the AI Act risks including simple systems that lack significant generality - Only genuinely versatile AI systems should be classified as GPAIS.