
Part 1/3:

Navigating the Risks of Autonomous AI Systems

How we test and assess the risks of autonomous AI systems becomes crucial as these models grow increasingly capable of conducting AI research themselves. That threshold, models able to meaningfully engage in AI research, is an important milestone because it signifies a level of true autonomy.

Defining AI Safety Levels (ASL)

To address this challenge, the RSP (Responsible Scaling Policy) uses an "if-then" structure that defines a ladder of AI Safety Levels (ASL):

  1. ASL1: Systems that clearly pose no risk of autonomy or misuse, such as a chess-playing bot like Deep Blue.

  2. ASL2: Today's AI systems, which have been evaluated and found not capable enough to autonomously self-replicate or to provide meaningfully dangerous information, such as details on CBRN (chemical, biological, radiological, and nuclear) weapons, beyond what a basic web search would turn up.

[...]

Part 2/3:

  3. ASL3: Models capable of enhancing the capabilities of non-state actors, requiring special security precautions to prevent theft and misuse.

  4. ASL4: Models that could enhance the capabilities of already knowledgeable state actors or become the primary source of such risks.

  5. ASL5: Models that are truly capable of exceeding human abilities in any of these tasks.
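To make the ladder concrete, the sketch below models it as a simple mapping from evaluation findings to a safety level. It is purely illustrative: the level descriptions come from the list above, but the evaluation fields, the classification logic, and every name in the code are assumptions, not anything specified by the RSP itself.

```python
from dataclasses import dataclass
from enum import IntEnum


class ASL(IntEnum):
    """AI Safety Levels as described above (illustrative only)."""
    ASL1 = 1  # clearly no autonomy or misuse risk (e.g., a chess bot)
    ASL2 = 2  # today's systems: no meaningful uplift beyond a web search
    ASL3 = 3  # could uplift non-state actors; needs anti-theft security
    ASL4 = 4  # could uplift already knowledgeable state actors
    ASL5 = 5  # exceeds human abilities at these tasks


@dataclass
class EvalResult:
    """Hypothetical evaluation outcomes; field names are assumptions."""
    uplifts_non_state_actors: bool
    uplifts_state_actors: bool
    exceeds_human_ability: bool


def classify(result: EvalResult) -> ASL:
    """Map evaluation findings to the highest applicable safety level."""
    if result.exceeds_human_ability:
        return ASL.ASL5
    if result.uplifts_state_actors:
        return ASL.ASL4
    if result.uplifts_non_state_actors:
        return ASL.ASL3
    return ASL.ASL2  # baseline for current general-purpose models
```

A current frontier model would land at ASL2 under this toy classifier, and each step up the ladder is meant to bring correspondingly stricter security and deployment measures.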

The "If-Then" Commitment

The purpose of this "if-then" structure is to avoid imposing burdensome requirements on models that are not yet dangerous, which would antagonize people and undermine the ability to have a voice in the conversation. The commitment is to clamp down with strict safety and security measures once a model is shown to be dangerous, with a sufficient buffer built into the trigger threshold so that the danger is not missed.
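One way to picture the buffer is as an early trigger: strict measures kick in once measured capability comes within a margin of the level judged dangerous, rather than only after it is crossed. The function, scores, and numbers below are invented for illustration; the policy does not define capability in these terms.

```python
def precautions_required(measured_capability: float,
                         dangerous_level: float = 0.8,
                         safety_buffer: float = 0.15) -> bool:
    """Trigger strict measures before the dangerous level is actually reached.

    A model is treated as dangerous once its score comes within
    `safety_buffer` of `dangerous_level`, so a noisy or slightly late
    evaluation does not cause the danger to be missed. All values are
    illustrative.
    """
    return measured_capability >= dangerous_level - safety_buffer


# A hypothetical model scoring 0.7 already triggers the stricter measures,
# even though 0.8 is the "dangerous" line; one scoring 0.5 does not.
assert precautions_required(0.7)
assert not precautions_required(0.5)
```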

[...]

Part 3/3:

This framework is not perfect and has required frequent updates, as the team behind it grapples with the technical, organizational, and research-related challenges of getting these policies right. Nonetheless, the "if-then" commitment and triggers aim to minimize burdens and false alarms while ensuring an appropriate response when the dangers of autonomous AI systems become evident.