
Now available: AI that finds vulnerabilities and patches them autonomously at scale

DARPA and ARPA-H say the winners and finalists of the AIxCC have developed a level of cyber reasoning that's now ready to tip the scales in favor of defenders – and hopefully make ransomware a thing of the past.
By Andrea Fox, Senior Editor

The winners of the AI Cyber Challenge, or AIxCC, were announced this past week at DEF CON 33, the long-running hacker conference, in Las Vegas.

The challenge, offered jointly by the Defense Advanced Research Projects Agency and the Advanced Research Projects Agency for Health under the U.S. Department of Health and Human Services, shined a spotlight on several new cyber reasoning systems – some of which, including those of the top three winners, are already available for access and download.

Pairing genAI with cyber reasoning

Unsophisticated cyber actors are using large language models like ChatGPT to undermine systems, so DARPA funded a competition in partnership with organizations willing to make their foundation models available.

The goal was to test whether generative artificial intelligence, paired with cyber reasoning systems, could automatically both find and fix vulnerabilities in open source software.
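The loop such a system runs can be sketched in miniature. Everything below is hypothetical: the toy "fuzzer," "patch generator" and vulnerable function are stand-ins for the real components (a coverage-guided fuzzer, an LLM-backed patch synthesizer) and do not reflect any competitor's actual design.

```python
# Hypothetical sketch of a find-and-patch loop: fuzz until something
# crashes, ask a patch generator for a fix, then validate the fix.

def fuzz_for_crash(program, inputs):
    """Return the first input that makes the program raise, or None."""
    for candidate in inputs:
        try:
            program(candidate)
        except Exception:
            return candidate
    return None

def generate_patch(buggy_program):
    """Stand-in for an LLM call that proposes a fixed program."""
    def patched(data):
        if not data:  # guard the empty-input case the bug missed
            return 0
        return ord(data[0])
    return patched

def validate(patched, crashing_input, regression_inputs):
    """Accept a patch only if it fixes the crash AND keeps old behavior."""
    try:
        patched(crashing_input)
    except Exception:
        return False
    return all(patched(i) is not None for i in regression_inputs)

# A toy "vulnerable" program: crashes on empty input.
def vulnerable(data):
    return ord(data[0])

crash = fuzz_for_crash(vulnerable, ["a", "bc", ""])
patch = generate_patch(vulnerable)
accepted = validate(patch, crash, ["a", "bc"])
```

The essential point the competition tested is the last step: a patch counts only if it both removes the crash and preserves the program's existing behavior.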

There was a chance it might not work, but the agency chose to take on the risk of an open, public competition, Kathleen Fisher, director of DARPA's Information Innovation Office, explained in a video posted last year during the AIxCC semifinals.

The two-year challenge, now completed, aimed to develop a technology that can give cyber defenders a much-needed edge in identifying and patching vulnerabilities at speed and scale to better protect water utilities, energy suppliers and healthcare delivery organizations.

The use of AI to secure open source software advanced significantly from the semifinal competition held last year, according to DARPA: teams more than doubled both the number of the competition's synthetic vulnerabilities they identified and the number they patched.

Teams submitted patches in an average of 45 minutes, according to the announcement about the winners.

Protecting at-risk hospitals

The achievement – using genAI-powered cyber reasoning systems to find and auto-patch vulnerabilities – is significant because "software runs the world," Fisher said Friday morning after the winners were announced.

ARPA-H collaborated with DARPA and worked with the competitors to develop the systems because health infrastructure is a prime target, and attacks jeopardize not only patients and their data but also the hospitals themselves.

It's been a huge challenge.

"A third of our nation's hospitals are at risk of going bankrupt," Jennifer Roberts, director of the ARPA-H resilient systems office, said at the media briefing that followed the announcement of the competition's winners. "A cyberattack could be the thing between them and closing their doors."

On average, it has taken the health sector 491 days to patch software vulnerabilities, Roberts noted. "The off-the-shelf tools are not cutting it," she said.

ARPA-H has committed $20 million to move the winning ideas into practical application across medical devices, health IT and biotech. The cyber reasoning systems will move healthcare cybersecurity toward "a future where ransomware attacks against hospitals become a thing of the past," she said.

Find-and-patch with free CRS

All seven cyber reasoning systems in the final competition are being released as open source software under a license approved by the Open Source Initiative. Five teams' cyber reasoning systems are already available, while the others will be released in the coming weeks.

"We are immediately making these tools available for cyber defenders," DARPA Director Stephen Winchell said in a statement. "Finding vulnerabilities and patching codebases using current methods is slow, expensive and depends on a limited workforce – especially as adversaries use AI to amplify their exploits."

"We see evidence that the process of a cyber reasoning system finding a vulnerability may empower patch development in situations where other code synthesis techniques struggle," Andrew Carney, AIxCC program manager, added.

The final competition's scoring algorithm prioritized how quickly each competitor's platform could create patches for vulnerabilities, along with the quality of its report analyses, DARPA said in the announcement.

Team Atlanta scored highest in finding and proving vulnerabilities, generating patches and pairing vulnerabilities and patches.

The team, composed of experts from Georgia Tech, Samsung Research, the Korea Advanced Institute of Science & Technology and the Pohang University of Science and Technology, took the top prize and was awarded $4 million for its Atlantis CRS.

Coming in second place was New York-based Trail of Bits, which took home a $3 million purse. In addition to releasing the two versions of its CRS, Buttercup, that competed in AIxCC's semifinal and final rounds, the boutique firm also released a standalone version.

According to a statement from the company, this third version runs on a typical laptop, so it works within an AI budget appropriate for individual projects.

"We've finally shown that we can find real-world vulnerabilities, we can patch them, and we can do this at scale and we can do it at a price that is reasonable for virtually any organization to invest in," said Michael Brown, Trail of Bits' principal security engineer and head of AI/ML research, during the media briefing.

Trail of Bits previously participated in an ARPA-H project called DIGIHEALS that sought partners that could apply proven technologies developed for national security to health systems, clinical care facilities and personal health devices.

The firm developed a software patching capability that empowered developers to create patches for vulnerabilities in the program binaries executed by medical devices and related non-medical devices.

Buttercup has four components, including a multi-agentic patch generation system, which the company said uses seven distinct AI agents to fix vulnerabilities and avoid breaking the program’s other functionalities.
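The general idea of multiple agents proposing candidate fixes, with tests guarding against regressions, can be illustrated in miniature. The agent functions, names and toy bug below are invented for illustration and are not Buttercup's actual components.

```python
# Illustrative only: several "agents" each propose a patched version of a
# buggy source string; a test harness accepts the first candidate that
# fixes the bug without breaking existing behavior.

def agent_guard(src):
    # Proposes a bounds check for the out-of-range read.
    return src.replace("return buf[i]",
                       "return buf[i] if i < len(buf) else 0")

def agent_noop(src):
    # An agent whose proposal fails to fix anything.
    return src

def passes_tests(src):
    """Run the candidate: the crash input must be handled AND the
    original, working behavior must be preserved."""
    ns = {}
    exec(src, ns)
    read = ns["read_byte"]
    try:
        return read([1, 2], 5) == 0 and read([1, 2], 1) == 2
    except Exception:
        return False

buggy = "def read_byte(buf, i):\n    return buf[i]\n"
candidates = [agent(buggy) for agent in (agent_noop, agent_guard)]
accepted = next(c for c in candidates if passes_tests(c))
```

The regression check in `passes_tests` is the part the company's description emphasizes: a candidate patch is rejected if it changes what the program does on inputs that already worked.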

Theori took third place with $1.5 million. The company said in a blog post on its website that its RoboDuck CRS is unique among the AIxCC competitors because it offers a full pipeline to develop Proofs of Vulnerability without fuzzing, symbolic execution or related techniques.

The cyber reasoning systems from two of the remaining teams, All You Need IS A Fuzzing Brain and Shellphish, are also available; those from Lacrosse and 42-b3yond-6ug will soon be added to the cyber challenge's repository.

Of note, because the competition was based on real-world software, teams could discover vulnerabilities that were not intentionally introduced. The final round uncovered 18 such non-synthetic vulnerabilities, which DARPA said are being responsibly disclosed to the open source projects' maintainers.

Of these, six were in C codebases and 12 were in Java codebases. Finalist teams provided 11 patches for these vulnerabilities.

"Since the launch of AIxCC, community members have moved from AI skeptics to advocates and adopters," Carney said. "Quality patching is a crucial accomplishment that demonstrates the value of combining AI with other cyber defense techniques."

Andrea Fox is senior editor of Healthcare IT News.
Email: afox@himss.org
Healthcare IT News is a HIMSS Media publication.