The limits of current AI need to be tested before we can rely on its output
18 Aug 2023
Dr. Craig Martell, Chief Digital and Artificial Intelligence Officer at the United States Department of Defense, called on the audience at DEF CON 31 in Las Vegas to go and hack large language models (LLMs). It’s not often you hear a government official ask for something like this. So why did he issue such a challenge?
LLMs as a trending topic
Throughout Black Hat 2023 and DEF CON 31, artificial intelligence (AI) and the use of LLMs have been a trending topic, which is not surprising given the hype since the release of ChatGPT just nine months ago. Dr. Martell, who is also a college professor, offered an interesting explanation and a thought-provoking perspective that certainly engaged the audience.
First, he presented the concept that this is all about predicting the next word: once a model has been built from a data set, the LLM’s job is to predict what the next word should be. For example, in LLMs used for translation, the prior words leave only a limited number of semantically similar options (maybe a maximum of five) for the next word, so the task becomes choosing the most likely one given the prior sentences. We are used to seeing predictions on the internet, so this is not new. For example, when you make a purchase on Amazon or watch a movie on Netflix, both systems will offer their prediction of the next product to consider or what to watch next.
If you put this into the context of writing computer code, the task becomes simpler, as code has to follow a strict format; the output is therefore likely to be more accurate than when trying to produce normal conversational language.
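To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is a toy and not how production LLMs work: real models use neural networks over tokens with far more context, whereas this example only counts which word follows which in a tiny, made-up corpus. The function names, the corpus and the default of five candidates (borrowed from the "maximum of five" remark above) are all assumptions for illustration.

```python
from collections import Counter

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = {}
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows.setdefault(current, Counter())[nxt] += 1
    return follows

def predict_next(follows, word, k=5):
    """Return up to k candidate next words, most frequent first."""
    counts = follows.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]

# Hypothetical training data, invented purely for this illustration.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigrams(corpus)

print(predict_next(model, "the"))      # a handful of candidates, "cat" ranked first
print(predict_next(model, "sat"))      # only one word ever follows "sat" here
print(predict_next(model, "unicorn"))  # unseen word: no candidates at all
```

Even in this toy, the model can only rank what it has seen. It has no notion of whether the most likely next word is also a true one, which is where the problem described next comes from.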
The biggest issue with LLMs is hallucinations. For those less familiar with this term in connection with AI and LLMs, a hallucination is when the model outputs something that is “false”.
Dr. Martell offered a good example concerning himself: he asked ChatGPT ‘who is Craig Martell’, and it returned an answer stating that Craig Martell was the character that Stephen Baldwin played in The Usual Suspects. This is not correct, as a few moments with a non-AI-powered search engine should convince you. But what happens when you can’t check the output, or are not of the mindset to do so? We then end up with an answer ‘from artificial intelligence’ that is accepted as correct regardless of the facts. Dr. Martell described those who don’t check the output as lazy. While this may seem a little strong, I think it does drive home the point that all output should be validated using another source or method.
The big question posed by the presentation is ‘How many hallucinations are acceptable, and in what circumstances?’ For a battlefield decision that may involve life-and-death situations, ‘zero hallucinations’ may be the right answer, whereas for a translation from English to German, 20% may be OK. The acceptable number really is the big question.
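One way to picture this question is as a per-context tolerance check. The Python sketch below is illustrative only: the two thresholds come from the examples in the talk (zero for a battlefield decision, 20% for English-to-German translation), while the dictionary, the function names and the sample review data are hypothetical. It also assumes that a human reviewer has already judged each output as false or not.

```python
# Tolerances taken from the two examples in the talk; the structure is hypothetical.
ACCEPTABLE_RATE = {
    "battlefield": 0.0,   # 'zero hallucinations'
    "translation": 0.20,  # 20% may be OK
}

def hallucination_rate(reviews):
    """Fraction of outputs that a human reviewer flagged as false.

    reviews: list of booleans, True where the output was judged false.
    """
    if not reviews:
        raise ValueError("need at least one reviewed output")
    return sum(reviews) / len(reviews)

def is_acceptable(reviews, context):
    """Compare the observed rate with the tolerance for the given context."""
    return hallucination_rate(reviews) <= ACCEPTABLE_RATE[context]

# Made-up sample: 1 of 10 reviewed outputs was flagged as a hallucination.
reviews = [False] * 9 + [True]

print(hallucination_rate(reviews))            # 0.1
print(is_acceptable(reviews, "translation"))  # True
print(is_acceptable(reviews, "battlefield"))  # False
```

The same model, with the same observed error rate, passes in one setting and fails in the other, which is the point of asking "in what circumstances?".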
Humans still required (for now)
It was suggested that, with LLMs in their current form, a human needs to be involved in validation, meaning that one model, or several, should not be used to validate the output of another.
Human validation uses more than logic: if you see a picture of a cat and a system tells you it’s a dog, you know this is wrong. When a baby is born, it can recognize faces and it understands hunger; these abilities go beyond the logic that is available in today’s AI world. The presentation highlighted that not all humans will understand that ‘AI’ output needs to be questioned; some will accept it as an authoritative answer, which can cause significant issues depending on the scenario in which it is accepted.
In summary, the presentation concluded with what many of us may have already deduced: the technology has been released publicly and is seen as an authority when, in reality, it’s in its infancy and still has much to learn. That’s why Dr. Martell challenged the audience to ‘go hack the hell out of those things, tell us how they break, tell us the dangers, I really need to know’. If you are interested in finding out how to provide feedback, the DoD has created a project that can be found at www.dds.mil/taskforcelima.