🇪🇺✍️ EU AI Act: Systemic Risks in the First CoP Draft Comments ✍️🇪🇺

Community Article · Published December 12, 2024

As we navigate the evolving landscape of AI governance, the first draft of the Code of Practice offers an important step toward fostering transparency, accountability, and safe deployment of AI systems. This voluntary code is based on the requirements outlined in the EU AI Act and aims to guide the responsible development and use of general-purpose AI.

At Hugging Face, the leading platform for sharing and collaborating on AI models, systems, and datasets, we bring together a diverse community of participants in the development of AI technology, ranging from large organizations and SMEs to open-source contributors, users of the technology, and independent researchers. Through our participation in the Code of Practice drafting process, we aim to support this wide range of stakeholders, ensure that requirements remain inclusive of their different needs, and enable broader participation in the shaping of AI. To that end, we are publicly sharing our comments on the first draft of the Code of Practice to invite discussion and support transparency of the process; you can find them here in full.

In this blog post, we provide a high-level summary of these comments and elaborate further on the notion of systemic risks: we find that the current over-representation of remote and speculative risks, following definitions put forward by some large developers, excludes smaller entities and external stakeholders and is unlikely to effectively prevent harms.

On General Transparency Requirements

The transparency requirements for the AI Office and downstream providers (Measures 1 and 2) are generally headed in the right direction. Sufficiently transparent and detailed documentation, traceability, and accurate representation of a model’s performance and limitations are a cornerstone of responsible and secure deployment of the technology. This includes basic information about a model’s general architecture and running cost, the composition of its training data, and performance evaluation grounded in scientific principles such as replicability and comparability across systems.
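
To make the scope of these documentation categories concrete, here is a minimal, purely illustrative sketch (not taken from the draft Code itself) of how such information could be captured in a single machine-readable record; every field name below is a hypothetical choice of ours, not a term defined by the AI Act or the Code.

```python
# Illustrative only: a hypothetical structure for the documentation categories
# discussed above (architecture, running cost, training data composition,
# evaluations). Field names are our own assumptions, not Code of Practice terms.
from dataclasses import dataclass, field

@dataclass
class ModelDocumentation:
    model_name: str
    architecture_summary: str              # general architecture, size, modalities
    compute_and_running_cost: str          # training compute and inference cost estimates
    training_data_composition: dict        # source category -> approximate share of the mix
    evaluations: list = field(default_factory=list)       # replicable, comparable results
    known_limitations: list = field(default_factory=list)

doc = ModelDocumentation(
    model_name="example-gpai-7b",          # hypothetical model
    architecture_summary="decoder-only transformer, ~7B parameters, text-only",
    compute_and_running_cost="~1e23 training FLOP (estimate); inference cost per 1k tokens",
    training_data_composition={"web crawl": 0.7, "code": 0.2, "curated corpora": 0.1},
    evaluations=[{"benchmark": "example-eval", "score": 0.62, "protocol": "public, replicable"}],
    known_limitations=["not evaluated for high-stakes medical or legal use"],
)
```

The same record could, in principle, back both the public-facing documentation and the disclosures made to the AI Office, which is the direction our comments argue for.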

We do make some recommendations to improve the efficacy of the proposed measures. First, since several of the categories that are currently only disclosed to the AI Office are of direct relevance to both downstream providers and external stakeholders, subsequent drafts should direct more of this information toward them, including by encouraging public-facing documentation. Additionally, some of the language in specific categories could be adapted to better fit open and collaborative development settings, and we make recommendations to that effect.

On Requirements Related to Copyright

The requirements related to copyright (Measures 3, 4, and 5) also go in promising directions overall. The sub-measures’ focus on transparency and on converging on standards and common good practice through collaboration between different categories of stakeholders is particularly welcome. Fragmentation of processes for implementing TDM rights reservation and for handling copyright-related complaints would be broadly detrimental; it not only hurts the open and collaborative development of AI systems by making it harder for well-intentioned smaller actors to access the tools and standards they need, it also harms individual copyright holders, who would need to navigate a complex and unintelligible ecosystem of crawlers and relationships with individual GPAI developers.
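
As a concrete illustration of this fragmentation problem, consider rights reservation signals expressed through robots.txt, one of the machine-readable mechanisms currently in use. The sketch below, with entirely hypothetical crawler names, shows how a rights holder or a small developer would have to check the same page against every developer-specific crawler identity; a common standard would remove this burden.

```python
# Illustrative sketch: check whether hypothetical AI training crawlers may fetch a
# page, treating a robots.txt disallow as one possible TDM rights reservation signal.
# Fragmentation means this check must be repeated for every developer-specific
# user-agent string a rights holder wants to address.
from urllib.robotparser import RobotFileParser

CRAWLER_USER_AGENTS = ["ExampleAIBot", "AnotherTrainingCrawler"]  # hypothetical names

def tdm_reservation_status(robots_url: str, page_url: str) -> dict:
    parser = RobotFileParser()
    parser.set_url(robots_url)
    parser.read()
    # True means the crawler is allowed; False is read here as a reservation signal.
    return {agent: parser.can_fetch(agent, page_url) for agent in CRAWLER_USER_AGENTS}

if __name__ == "__main__":
    print(tdm_reservation_status("https://example.org/robots.txt",
                                 "https://example.org/article.html"))
```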

We do however have some specific concerns about sub-measures in that category. Sub-measure 3.1 requires developers to establish and implement a copyright policy. In our assessment, this risks leading to more fragmentation and to excluding open and collaborative developers of AI systems, as well as developers of AI components and datasets, who are both less able to absorb compliance costs than larger integrated developers and have to contend with many more use cases. We instead recommend focusing on providing common guidance on what constitutes an acceptable copyright policy at different stages of development. We also express concerns about requirements that are organizationally incompatible with the use or development of open datasets, which play an important role both in enabling development by smaller actors and in providing the clarity needed to develop informed copyright practices that benefit all stakeholders. Additionally, the relationship between the transparency requirements and the copyright transparency measure (Measures 1, 2, and 5), and their connection to the “sufficiently detailed training data summary” template, should be clarified.

On the Proposed Taxonomy of Systemic Risks

Our main concern with the current draft lies in the proposed taxonomy of systemic risks outlined in Measure 6. The taxonomy’s overall focus at this stage discounts many of the more likely risks in favor of a narrow set that includes remote and unlikely hazards, which makes it significantly more difficult to work on collaborative, evidence-based solutions to the more immediate concerns, and is thus especially damaging to smaller actors and distributed developers. In particular, the taxonomy nearly entirely misses what is likely to be the main vector of systemic harm for GPAI systems: as these systems become increasingly ubiquitous as universal digital infrastructure, “unintentional” harms caused by their immature or inappropriate commercial deployment at unprecedented scale are growing much more concerning.

This focus is detrimental both to smaller developers and to external stakeholders. On the one hand, the lack of scientific consensus or evidence base to direct joint efforts makes it significantly more difficult for small and medium developers to rely on open or collaborative solutions to develop effective risk mitigations, whereas larger developers are more likely to be able to dedicate resources to compliance requirements that reward performative effort more than actual results. On the other hand, with significant effort dedicated to these categories, mitigations of systemic risks that are better grounded in recent evidence are likely to go under-resourced, or to be inappropriately handled by an entirely model-level, context-agnostic approach.

At this (still relatively) early stage in the elaboration of the Code of Practice, we recommend taking a step back to fully reconsider what a sustainable and effective approach to mitigating systemic risks arising from GPAI use should look like, by prioritizing the elements of the current draft that foster transparency and evidence-based collaboration.

What is the current taxonomy missing?

Systemic risks in the AI Act are broadly defined as risks to public health, safety, society as a whole, or to entire domains of activity or communities, arising from “high-impact capabilities”. Recital 110 specifically mentions risks tied to “major accidents” or “disruptions of critical sectors”, as well as effects on democratic processes and information ecosystems. This brings to mind risks like the recent global outage caused by the CrowdStrike failure, or major threats to EU citizens’ access to essential services. These categories of risk arise from “high-impact capabilities”; for example, systems that are marketed as capable of producing secure software code for critical systems, or of summarizing an individual’s entire history and legal context well enough to assign unemployment or other benefits. They are supported by enough substantive evidence across fields of technology and social sciences to make them a significant and immediate concern. They can also be mitigated individually and collaboratively by signatories of the Code of Practice through design choices, robust and transparent documentation of a model’s performance and limitations for different categories of capabilities, and sufficient access for downstream developers to support in-context evaluation and mitigation approaches.

In contrast, the proposed taxonomy – currently a list without further structure – reflects a different view of systemic risks and of the role of GPAI developers in mitigating them. Of the six categories in the taxonomy, three rely primarily on implicit models of malicious users leveraging properties of the models sometimes described as “dangerous” (rather than “high-impact”) capabilities. These include contributions to CBRN risks – despite recent work showing that they remain a remote concern – and large-scale “persuasion and manipulation” phenomena – even though those are too context-dependent to be meaningfully evaluated or mitigated at the model level. One category in the list covers “loss of control” – a speculative notion without a clear definition or threat model at this time. The item on “automatic use of AI in research” is not in itself a risk or even a hazard, and could at best be seen as a risk factor for other risks. The categories of cyber-offence and discrimination at scale are more grounded in specific models of harm, but still raise significant questions as to how they relate to measurable model characteristics, especially outside of a specific deployment context. Overall, the starting list constitutes at best a narrow coverage of the risks the technology is likely to pose, and at worst a (partial) red herring poised to direct significant risk mitigation efforts toward inappropriate foundations.

It is easy to understand the appeal of those categories in the context of a Code of Practice drafting process that is tasked with finding consensus around measures that developers might agree to: since several of the larger GPAI developers have put out statements describing their safety strategies or preparedness frameworks that focus on those very topics, it can reasonably be assumed that they would be more willing to continue prioritizing them than other externally-defined categories. But following this approach comes at a cost to most concerned parties besides those few largest developers, and to the scientific foundations of the Code of Practice. First, the disconnect between the categories of harms developers currently focus on and the ones that are more likely to affect stakeholders outside of the technical development chain is likely to make the elaboration process less inclusive of the latter’s voices, as they have to fight harder to have their concerns recognized. Second, the focus on categories that are chosen, framed, operationalized, and measured by a few developers without external scrutiny or scientific consensus – often realized through elaborate scenarios that have little grounding in realistic current practical conditions – presents a risk to the integrity of the process, as the lack of transparency, explicit plausible harm mechanisms, or falsifiable claims precludes informed discussion or confrontation of perspectives across stakeholders with different priorities.

The systemic risk taxonomy in the Code of Practice for GPAI developers faces a difficult challenge: it needs to provide direction to shape a broad set of strategies that can help mitigate potential large-scale harms in the most efficient way possible, without leading to requirements developers will find unmanageable or disproportionately onerous. While basing the initial version of the taxonomy on categories of risk that some developers have shown willingness to discuss might appear to be a promising strategy to meet those goals, its effectiveness is limited by the lack of scientific consensus and of sufficiently broad external involvement to date in defining those risks.

How can the Code of Practice move forward?

The systemic risk taxonomy in the EU AI Act Code of Practice for GPAI developers can go further at a lower implementation cost to developers by focusing instead on supporting research into risk and risk mitigation by independent third parties. Such an approach would cast developers primarily as sources of transparent, timely, and reliable information to support this research, and as technical experts charged with adapting and implementing its general findings in their own specific technical contexts. Consensus-based and collaborative approaches to risk mitigation can cover a greater breadth of cases, lower the cost of developing risk mitigation strategies by avoiding duplication of effort, help level the playing field between larger actors and the small-to-medium ones who need to rely more on open and collaborative research, and foster greater prioritization of the more likely systemic risks. They can also lead to effective solutions faster than in-house development while leveraging access to more relevant expertise – as long as sufficient information about design decisions and early information on model properties are shared in a timely manner, which should be the focus of the commitments from the signatories. The Code of Practice can move in a direction that better supports this approach by focusing the next draft on the following:

Process: put the horse back before the cart. Some of the sub-measures in Measures 9 and 10 already call for more grounded and scientifically validated risk evaluation, but they are unfortunately hobbled by an initial framing around the categories currently outlined in the taxonomy. In particular, sub-measure 10.1 on “model-agnostic evidence” should constitute the basis for most of the systemic risk assessment: in most cases, the high-impact capabilities most likely to contribute to systemic risks are the product of high-level design decisions that can be discussed across model instances, with choices about what data to include and what context to deploy in typically having much stronger influence than incremental performance gains on specific metrics. Sub-measures 9.1 (Methodologies), 10.3 (Scientific rigour), 10.5 (Models as part of a system), 10.8 (Sharing tools and best practices), and 10.9 (Sharing results) all similarly speak to what needs to happen upstream of the risk taxonomy definition. Additionally, for these measures to be effective, risk elicitation and mitigation research needs to be directed and coordinated outside of the developer organizations – for example, by the proposed AI Office scientific panel – to ensure access to appropriate expertise and minimize conflicts of interest. Developers do have a strong role to play by doing the significant work necessary to provide legitimate parties with the required information. Directing this effort toward meeting sufficient evidence standards to support the framing and scoping of systemic risk assessments and mitigations – rather than toward building, in isolation and on shaky foundations, mitigations of uncertain value to the safety and integrity of systems and people – will ensure that the implementation of the AI Act is indeed future-proof and gives itself the best chance of meeting its goals.

Framing: better-balanced categories to support different risk mitigation approaches. Within the collaborative approach outlined above, a systemic risk taxonomy can best support multi-stakeholder research into effective systemic risk mitigation by providing a shared language to discuss different characteristics of systemic risks and a vocabulary to tie different types of risks to specific categories of intervention. Such a taxonomy could for example cover:

  • Systemic risks resulting from inappropriate AI deployment in critical settings
    • Examples: CrowdStrike-like global outages resulting from failures in AI-generated code, compounded effects of failures across multiple AI-supported infrastructures on specific groups
    • Research: support research on formal verification for AI-enabled systems focusing on new high-impact capabilities. Support research on the effect of compounded biases coming from different uses of a GPAI on discriminatory outcomes.
  • Systemic risks pertaining to information security questions raised by scale
    • Examples: endemic use of personal data making commercial models more vulnerable to military ISTAR exploitation, increased risk of Cambridge Analytica-like breaches and misinformation campaigns
    • Research: develop standards and techniques for training data curation that minimizes personal data uses, develop evaluations of systems’ ability to make inferences about personal data across modalities, assess how high-impact capabilities in chatbot deployment settings lead users to share more personal details in conversations shared with deployers, develop standards for traceability of AI systems across training data sources and models
  • Systemic risks resulting from drastically scaling up abuse
    • Example: enabling hackers to sift through terabytes of information to more efficiently identify and automatically exploit vulnerabilities and leaked secrets
    • Research: develop performance benchmarks that jointly measure a model’s performance on target in-scope use cases and adversarial uses to evaluate trade-offs (see the sketch after this list). Leverage these same systems to warn individuals whose identity or information has been compromised.
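
Below is a rough sketch of what the joint benchmark mentioned in the last bullet could look like; the scoring functions and task lists are placeholders of our own, not an existing evaluation suite.

```python
# Hypothetical sketch of a joint benchmark: report in-scope task performance and
# adversarial misuse success side by side instead of a single aggregate number,
# so capability/abuse trade-offs stay visible. All inputs are placeholders.
from statistics import mean
from typing import Callable

Scorer = Callable[[str], float]  # maps a model output to a score in [0, 1]

def joint_report(model: Callable[[str], str],
                 in_scope_tasks: list[tuple[str, Scorer]],
                 adversarial_tasks: list[tuple[str, Scorer]]) -> dict:
    # Average in-scope quality and adversarial "success" separately.
    in_scope = mean(score(model(prompt)) for prompt, score in in_scope_tasks)
    misuse = mean(score(model(prompt)) for prompt, score in adversarial_tasks)
    return {"in_scope_performance": in_scope, "adversarial_misuse_rate": misuse}
```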

Compared to the list provided in the first Code of Practice draft, the proposal above focuses more on outlining the mechanisms underlying potential systemic risks, based on evidence from related fields. This grounding in verified harm vectors can help ensure that the more likely hazards are appropriately prioritized while keeping the flexibility to address potential new GPAI-specific issues, without making the more speculative harms a primary concern. It also underscores the need to address risks across the full technological stack, from training data curation to deployment contexts, not just at the GPAI model level.

The next three drafts of the EU Code of Practice for GPAI developers have their work cut out for them: they need to foster meaningful consultation while increasing the likelihood of arriving at a practical artifact that encourages efficient, responsible development practices by signatories. The framing of systemic risks, whose definition remains more nebulous than most other areas of focus of the Code, represents a particularly challenging question. To address it in a manner that is more likely to mitigate the risks outlined in the Act without placing an undue burden on smaller developers and external stakeholders, the Code should build for sustainability by focusing on setting up scientifically grounded processes that enable collaborative research, even if it means stepping away from some of the priorities put forward to date by large developers.

Acknowledgements: this blog post is based on our submitted response to the CoP draft, co-authored with Bruna Trevelin.