
Red Team Approach: The Potential Uses of Artificial Intelligence in Biological Attacks

The use of artificial intelligence has recently expanded across many fields, from medical research and the design of chemical reactions to forecasting, and AI techniques have also been applied in cyber and physical attacks. Dario Amodei, CEO of the AI company Anthropic, has even warned that AI could help create biological weapons in the near future.

In this context, Christopher A. Mouton, Caleb Lucas, and Ella Guest examined the potential misuse of AI, and of large language models (LLMs) in particular, in biological weapon attacks in a 2024 RAND Corporation report. The report recalls the Japanese group Aum Shinrikyo's attempt in the 1990s to attack the Tokyo subway with botulinum toxin, an attempt that failed because of a poor understanding of the bacterium that produces the toxin—precisely the kind of knowledge gap that AI might now help close quickly and easily. The report therefore offers policy insights for mitigating such risks while supporting the responsible development of AI.

The report focused primarily on the potential misuse of LLMs by malicious non-state actors in planning biological attacks. It sought to build empirical evidence through quantitative metrics, testing methods, and accountability tools. It did not disclose which specific models were tested, in order to provide enough information for academic and policy discussion while withholding details that could aid malicious actors. The goal is to contribute to understanding the potential threat of biological weapons and to support the development of strategies to counter it, fostering a safer world.

Red Team Approach:

In cybersecurity, “red teaming” refers to simulating a realistic attack with the knowledge of the organization’s management in order to examine possible scenarios and assess how well the relevant systems can withstand different types of attack. Applied to AI, it involves probing for potential weaknesses, from attempts to extract weapon designs or advanced offensive cyber tools to eliciting other unintended risky behaviors. These exercises help measure risks accurately, allocate resources efficiently, and focus attention on the most hazardous elements.

The red team approach requires balancing methodological rigor with creative adaptability in order to assess and address a range of potential threats effectively. The primary goal is not to predict a narrow set of futures accurately but to examine a broad range of possible scenarios. In this report, the red team adopted a multi-method approach that combines qualitative and quantitative elements and draws on social science methods for understanding complex human interactions. The report’s methodology therefore rests on a diverse and extensive set of threat scenarios.

Because the report focuses on assessing the real-world operational impact of LLMs, it aims to translate theoretical risks into actionable insights. It began by studying biological weapon threats through brief “vignettes”: short scenario sketches that specify four elements—the attacker, the location of interest, the target population, and the available resources—across a range of realistic risk scenarios, in order to determine the strategic objectives of malicious actors and assess hypothetical biological weapon attacks.

By selecting four brief vignettes, the report avoided fragile single-point predictions and covered a variety of potential future conditions that could inform AI governance. These vignettes do not encompass the full range of threats, but they establish a baseline for initial results and subsequent iterations. The teams used AI in two distinct modes. In the first, “cyborg,” participants continuously interweave AI responses with human expertise and iteratively refine them. In the second, “centaur,” participants delegate suitable tasks to the AI while concentrating on their own areas of expertise.


Test Scenarios:

A review of the LLM outputs generated during the research concluded that some of them were troubling, even though they did not, at present, provide material sufficient for carrying out a biological attack. In one scenario, the LLMs identified biological agents capable of causing dangerous epidemics, such as smallpox, anthrax, plague, and a strain of influenza virus, and discussed their potential to cause mass deaths and the expected mortality rates, which depend on factors such as the size of the affected population and the speed and effectiveness of the response to the attack.

In a related scenario, the LLMs discussed means of transport, timing, cost, and the obstacles to obtaining live samples for a biological attack, such as rodents and fleas infected with Yersinia pestis. Extracting this information required bypassing safeguards, that is, using text prompts that override the chatbot’s safety constraints, since the models initially refused to discuss these topics even though much of this information is generally available, including through government awareness campaigns.

In another scenario, the LLMs provided simple instructions on how to culture Yersinia pestis, the bacterium that causes plague. Notably, obtaining this information did not require bypassing safeguards; detailed protocols for these steps can be found online and in publicly available academic journals. The models merely broke the task down into basic steps and briefly explained what each requires, such as providing a suitable growth environment, monitoring the culture, and then harvesting it.

In a third scenario, the LLMs offered a detailed discussion of the pros and cons of different methods for delivering botulinum toxin, whether by aerosol or through food. The models described food-based delivery as straightforward but risky, particularly because the toxin might be detected when placed in certain types of food. Aerosol delivery, by contrast, was judged effective for affecting a large population quickly, but it requires specialized equipment and expertise.

In the fourth scenario, the LLMs suggested simple ways to evade drone restrictions in a major U.S. city. These included advice on operating a drone illegally, recommending that it be small and fast and flown at night to reduce the chance of detection. The models also suggested using radar-jamming equipment, which indicates that they confuse small commercial drones with their military counterparts, which often rely on radar systems to detect attacks.

Assessment of Results:

The RAND report finds that the plans developed in these scenarios fell short of providing a sufficiently detailed and accurate basis for a malicious actor to carry out an effective biological attack, partly because the red team methodology covers only the planning stage and not actual execution. The report also notes several factors unrelated to the capabilities or limitations of LLMs that may explain the shortcomings in the results. First, the red team participants lacked the expertise and knowledge needed to simulate actual attackers accurately. Second, inherent limitations of the study design hindered the realistic development of a viable plan. Finally, the fundamental complexity of designing a successful biological attack makes deficiencies in any such plan all but inevitable.

Overall, the results suggest that planning a biological weapon attack is likely beyond the current capabilities of LLMs. Given the complex nature of these tasks, which require specialized expertise, the use of such models does not appear to significantly increase the risk of biological weapon attacks. However, the capabilities of LLMs are not static and are expected to evolve over time. So even if the current results do not point to a significant and immediate threat from existing LLMs, they underscore the need for ongoing effort.

Future Risks:

The report found no difference in viability between biological weapon attack plans created with and without the assistance of LLMs. The findings also indicate that planning a biological weapon attack exceeded the capabilities of the LLMs available as of the end of summer 2023. Whether this result will hold against future developments in LLM technology remains an open question. In addition, the report did not examine models stripped of safety guardrails; while such models may be less capable, future versions might be both more capable and less constrained in engaging with biological weapon attack planning. Continued research is therefore needed to monitor these developments, according to the report.

The red team methodology is seen as a promising tool for future studies because it can capture the dynamics of measure and countermeasure, detecting early any emerging LLM capabilities that could make biological weapon attack plans more feasible. As understanding of the specific threats posed by LLMs improves, red team exercises could be focused on narrower tasks, sharpening future research. Moreover, expanding the experiment conducted in the report to include a larger and more diverse group of researchers is essential.

Such larger experiments would admittedly require more time and money, making it important to improve the efficiency of the study design. The process of reviewing results, for example, could move to asynchronous systems in the future, although significant discrepancies in results would still require in-depth discussion. Time demands on participants could be reduced by providing them with basic information in advance or by seeding them with initial attack-planning concepts.

In conclusion, the report suggests that, given more time, greater skill, additional resources, or stronger motivation, it is conceivable that a malicious non-state actor could turn to current or future LLMs to plan or conduct a biological weapon attack. The report does not dismiss these risks, but it found no significant advantage that such models offer over the internet alone in developing a biological weapon attack.

Christopher A. Mouton, Caleb Lucas, and Ella Guest, The Operational Risks of AI in Large-Scale Biological Attacks, RAND Corporation, 2024.

Mohamed SAKHRI

I’m Mohamed Sakhri, the founder of World Policy Hub. I hold a Bachelor’s degree in Political Science and International Relations and a Master’s in International Security Studies. My academic journey has given me a strong foundation in political theory, global affairs, and strategic studies, allowing me to analyze the complex challenges that confront nations and political institutions today.
