Automating C To Rust Code Conversion
The US Defense Advanced Research Projects Agency (DARPA) has launched a program to modernize legacy software. The initiative, named TRACTOR (TRanslating All C TO Rust), aims to automate the conversion of legacy C code to the memory-safe Rust programming language using artificial intelligence (AI) techniques.
The driving force behind TRACTOR is to address one of the most pressing issues in software development: memory safety. Memory safety bugs, such as buffer overflows, are notorious for causing critical vulnerabilities in software systems.
By transitioning legacy code from C, a language with known memory safety issues, to Rust, which is designed to prevent such vulnerabilities, DARPA seeks to enhance the security of software applications significantly.
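To make the contrast concrete, here is a minimal sketch of how safe Rust handles the kind of copy that overflows a buffer in C. The function name `copy_into_buffer` and the 8-byte buffer size are assumptions chosen for illustration; they do not come from DARPA’s materials.

```rust
/// Copies `input` into a fixed 8-byte buffer, refusing oversized input.
/// In C, an unchecked `memcpy` or `strcpy` at this point could write past
/// the end of the buffer. (Hypothetical helper, for illustration only.)
fn copy_into_buffer(input: &[u8]) -> Option<[u8; 8]> {
    let mut buf = [0u8; 8];
    // `get_mut` returns `None` when the requested range does not fit,
    // instead of touching memory outside the buffer.
    let dest = buf.get_mut(..input.len())?;
    dest.copy_from_slice(input);
    Some(buf)
}

fn main() {
    // Fits: prints Some([104, 101, 108, 108, 111, 0, 0, 0])
    println!("{:?}", copy_into_buffer(b"hello"));
    // Too long: prints None rather than overflowing
    println!("{:?}", copy_into_buffer(b"far too long for the buffer"));
}
```

Plain indexing such as `buf[i]` is also bounds-checked in Rust and panics on an out-of-range index; the `Option`-returning form used here simply lets the caller handle the failure.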
DARPA’s New Initiative TRACTOR
DARPA’s official announcement, titled “Eliminating Memory Safety Vulnerabilities Once and For All,” states that the agency “initiates a new program to automate the translation of the world’s highly vulnerable legacy C code to the inherently safer Rust programming language.”
The initiative targets memory safety vulnerabilities, one of the most prevalent classes of software flaw. In C, these bugs arise in two main ways: the language lets programmers manipulate memory directly, and the language standard leaves the behavior of certain operations undefined.
The move towards Rust is supported by a consensus in the software engineering community that mere bug-finding tools are insufficient to tackle these issues. The Office of the National Cyber Director has emphasized the need for proactive measures to combat memory safety vulnerabilities, highlighting the urgency of this initiative.
The challenge, however, lies in the vast scale of rewriting legacy code. Since its inception in the 1970s, C has become deeply entrenched in various applications, from modern smartphones to complex defense systems. The Department of Defense, in particular, relies heavily on C, making the task of updating these systems even more critical.
TRACTOR Aims to Leverage Modern Technology
Recent advancements in machine learning, including large language models (LLMs), have created new opportunities for tackling this problem. TRACTOR aims to leverage these technologies to automate the conversion process, making it feasible to update extensive codebases efficiently.
Dr. Dan Wallach, DARPA’s program manager for TRACTOR, explains, “You can go to any of the LLM websites, start chatting with one of the AI chatbots, and all you need to say is ‘here’s some C code, please translate it to safe idiomatic Rust code,’ cut, paste, and something comes out, and it’s often very good, but not always.” He adds, “The research challenge is to dramatically improve the automated translation from C to Rust, particularly for program constructs with the most relevance.”
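As a rough illustration of the gap Wallach describes, the sketch below contrasts a literal translation of a small C function with a more idiomatic one. The C snippet and both Rust function names are hypothetical; they are not taken from TRACTOR or from any particular translation tool.

```rust
// Hypothetical C input, shown for comparison only:
//
//     int sum(const int *values, size_t len) {
//         int total = 0;
//         for (size_t i = 0; i < len; i++) total += values[i];
//         return total;
//     }

/// A literal translation that keeps the raw pointer and separate length.
/// It compiles, but the caller must still guarantee that `values` points
/// to at least `len` valid integers, exactly as in C.
unsafe fn sum_literal(values: *const i32, len: usize) -> i32 {
    let mut total = 0;
    for i in 0..len {
        // Raw pointer arithmetic and dereference: not checked by the compiler.
        total += unsafe { *values.add(i) };
    }
    total
}

/// An idiomatic translation: the slice carries its own length, so no
/// `unsafe` is needed and out-of-bounds reads cannot be expressed.
fn sum_idiomatic(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let data = [1, 2, 3, 4];
    let literal = unsafe { sum_literal(data.as_ptr(), data.len()) };
    let idiomatic = sum_idiomatic(&data);
    // Prints: literal = 10, idiomatic = 10
    println!("literal = {literal}, idiomatic = {idiomatic}");
}
```

Both versions return the same result on valid input. The difference is that only the second moves the pointer-validity obligation from the caller to the compiler, which is the sense of “safe idiomatic Rust” in the quote.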
The goal of TRACTOR is not just to automate code conversion but to achieve the high quality and style of Rust code that a skilled developer would produce manually. By doing so, the program aims to eradicate the class of memory safety vulnerabilities inherent in C programs. In addition to leveraging software analysis methods, including static and dynamic analysis, TRACTOR will incorporate LLM-powered solutions and host public competitions to showcase and test these innovations.
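One simple dynamic check that could apply to translated code is differential testing: run the original-style and rewritten versions on the same inputs and compare the results. The sketch below assumes two hypothetical functions and a small hand-rolled pseudo-random generator; it illustrates the general technique and does not describe TRACTOR’s actual tooling.

```rust
/// Hypothetical reference behaviour, written the way a literal C port might
/// look: returns -1 when `needle` is absent.
fn find_c_style(haystack: &[u8], needle: u8) -> i64 {
    let mut i = 0usize;
    while i < haystack.len() {
        if haystack[i] == needle {
            return i as i64;
        }
        i += 1;
    }
    -1
}

/// Hypothetical idiomatic candidate: absence is expressed as `None`.
fn find_idiomatic(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// Small linear congruential generator so the sketch needs no external crates.
fn next(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

/// Runs both versions on `rounds` generated inputs and counts disagreements.
fn count_mismatches(rounds: usize) -> usize {
    let mut state = 42u64;
    let mut mismatches = 0;
    for _ in 0..rounds {
        let len = (next(&mut state) % 16) as usize;
        let haystack: Vec<u8> = (0..len).map(|_| (next(&mut state) % 8) as u8).collect();
        let needle = (next(&mut state) % 8) as u8;
        let expected = find_c_style(&haystack, needle);
        // Map the idiomatic result back onto the C convention for comparison.
        let actual = find_idiomatic(&haystack, needle).map_or(-1, |i| i as i64);
        if expected != actual {
            mismatches += 1;
        }
    }
    mismatches
}

fn main() {
    println!("mismatches: {}", count_mismatches(1_000));
}
```

A check like this can only show agreement on the inputs it happens to try, which is why it complements static analysis without replacing it.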
“Rust forces the programmer to get things right,” Wallach remarks. “It can feel constraining to deal with all the rules it forces, but when you acclimate to them, the rules give you freedom. They’re like guardrails; once you realize they’re there to protect you, you’ll become free to focus on more important things.”
DARPA will hold a Proposers Day on August 26, 2024, which participants can attend in person or virtually to learn more about the initiative. Interested parties must register by August 19, 2024. More details and registration information are available on SAM.gov.