In the last year the tech world has seen some spectacular expectation resets on “revolutionary” technology. Social media has gone from a boon to democracy to a hellscape of mediocre and abusive content. Cryptocurrencies and NFTs swallowed more fortunes than they created, and the Metaverse is little more than a money pit for Mark Zuckerberg. But no technology’s perception shifted as quickly as that of large language model (LLM) machine-learning systems.
The technology, in development for most of the past decade, exploded into public attention in November 2022 when OpenAI announced the general availability of ChatGPT. By December, both positive and negative hype filled the media. By January 2023, legislative bodies around the world were discussing how to regulate it. In April, the CEO of OpenAI appeared before the US Congress practically begging for someone to stop his company from creating a monster.
Honestly, the technology does increase human productivity, but its ability to live up to expectations is severely tempered by three factors: its inability to do everything proponents claim, a growing general distrust of technology companies, and technological weaknesses that make it a security nightmare.
That last concern was being widely discussed even before “generative AI” became a household term. The National Security Commission on Artificial Intelligence (NSCAI) completed a three-year study in 2021, issuing a 756-page report that can be summarized in one of its own sentences: “The threat is not theoretical. Adversarial attacks are happening and already impacting commercial ML systems.”
Adversarial methodologies
Adversarial machine learning is a branch of AI research with roots in early-2000s attempts to evade ML-powered email spam filters. An adversarial attack on an AI algorithm can introduce design flaws or install a back door that bypasses a system’s security measures. In Not with a Bug, But with a Sticker (Wiley, 2023), Microsoft data science pioneers Ram Shankar and Dr. Hyrum Anderson write: “If a user can use your ML-powered system, they can replicate it. If they can replicate it, they can very much attack it.”
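To make the idea concrete, here is a minimal sketch of an evasion attack against a toy linear spam filter, in the spirit of those early attacks. The weights, feature values and perturbation budget are hypothetical stand-ins, not anyone’s production system:

```python
# Minimal sketch of an evasion attack on a toy linear "spam filter".
# All weights and feature values are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n_features = 256

# Pretend these are the learned weights of a spam classifier:
# score = sigmoid(w . x + b), score > 0.5 means "spam".
w = rng.normal(size=n_features)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spam_score(x):
    return sigmoid(w @ x + b)

# A message the filter confidently flags as spam.
x_spam = 0.05 * np.sign(w)
print("original score:", round(spam_score(x_spam), 3))   # close to 1.0

# Evasion: nudge every feature by a small step against the gradient of the
# score. For a linear model that gradient is simply the weight vector.
epsilon = 0.1
x_adv = x_spam - epsilon * np.sign(w)
print("perturbed score:", round(spam_score(x_adv), 3))   # close to 0.0
# A uniform per-feature nudge is enough to flip the verdict; the same
# gradient-following idea underlies attacks on much larger models.
```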
The semiconductor industry has been using AI in some form for decades and represents a more nuanced acceptance of the technology. Mike Borza, principal security technologist at Synopsys, said, “I think it’s not possible to say that you can stop (data corruption in AI) completely, but there’s a high probability it’ll be detected. Our data tend to be fairly well controlled and the AI is very private.”
According to Dan Yu, AI/ML product solutions manager at Siemens, ML models are limited by the scope of their training data. “You have to know what those limits are,” he explained. “We want to use AI to improve productivity, but not replace human involvement.”
Yu places machine-learning tools only at the front end of the design flow, and their output has to be verified the same way chips were verified before AI became popular. “It’s still a human sitting there to decide that the quality is good.”
No guarantees
In general use, however, just keeping tight control over data doesn’t guarantee security.
Frank Huerta, CEO of Curtail, a development tool provider, said screening for known vulnerabilities doesn’t protect against unknown ones. “There’s a whole other universe of problems out there that we don’t have a good way to screen for. The premise is if that data set is large enough, you’ve got everything. If you don’t, what do you do? With security, an adversary is trying to screw up your analysis. Hackers are using AI-based attacks to intentionally thwart what you’re doing with this kind of large data model approach.”
Shankar and Anderson presented multiple studies showing that people are more likely to follow the directions of an automated system, even one that has previously demonstrated incompetence, than to trust their own common sense, sometimes to the point of lethal consequences. “It is not AI’s failure to meet expectations,” they wrote. “It’s that we have very high expectations in the first place. The problem is that in many settings we overtrust it.”
Defending the indefensible
We turned to the Infosec.live group of cybersecurity professionals and asked them to list defenses for generative AI. They recommended using multiple AI algorithms and diverse sets of training data to reduce the risk of attack. An adversary may manipulate one algorithm, but the others can detect the manipulation before it results in a flawed design (see the sketch below). Likewise, if an attacker can infer the layout or architecture of a semiconductor from the output of one algorithm, they may not be able to do so from the output of another.
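One way to read that recommendation in code: run the same input through several independently trained models and escalate when they disagree. A minimal sketch, with hypothetical model objects standing in for real classifiers:

```python
# Minimal sketch of the "multiple algorithms" defense: cross-check several
# independently trained models and flag results they do not agree on.
# The model objects are hypothetical stand-ins; any object with a
# .predict(x) method returning a label would do.
from collections import Counter

def cross_check(models, x, min_agreement=0.75):
    """Return (verdict, trusted); trusted is False if too few models agree."""
    votes = [m.predict(x) for m in models]
    label, count = Counter(votes).most_common(1)[0]
    trusted = count / len(votes) >= min_agreement
    return label, trusted

# Usage (assuming three independently trained detectors):
# verdict, trusted = cross_check([model_a, model_b, model_c], design_features)
# if not trusted:
#     escalate_to_human_review(design_features)   # hypothetical handler
```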
Making AI algorithms explain their decision-making process in a human-understandable way is another defense the Infosec.live team recommended. If an AI algorithm can explain how it arrived at a particular output, it becomes more difficult for an attacker to infer information about the semiconductor design from that output.
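Explainability does not have to mean anything exotic; even exposing which input features drove a score gives a human reviewer something to check. A minimal sketch for a linear scoring model, with hypothetical feature names, weights, and values:

```python
# Minimal sketch of human-readable attribution for a linear scoring model.
# Feature names, weights, and inputs are hypothetical stand-ins.
import numpy as np

feature_names = ["cell_density", "routing_congestion", "power_estimate", "timing_slack"]
weights = np.array([0.8, -1.2, 0.5, 2.0])     # "learned" coefficients
x = np.array([0.6, 0.3, 0.9, -0.4])           # one design's feature values

contributions = weights * x                    # each feature's share of the score
for name, c in sorted(zip(feature_names, contributions), key=lambda t: -abs(t[1])):
    print(f"{name:>20}: {c:+.2f}")
print(f"{'total score':>20}: {contributions.sum():+.2f}")
```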
Integrating AI on bare metal
Security startup Axiado last week began sampling a novel AI-driven security processor designed to defend against ransomware, supply-chain, side-channel and other cyberattacks.
“Typically an AI/ML model is stored on top of a system (e.g., Linux). Axiado’s Secure AI™ models reside on bare metal that is inherently secure because lots of the vulnerabilities come from the higher level of the system,” said Axiado CEO Gopi Sirineni. “Our ML pipeline builds data lakes with various vulnerability and attack data sets. All Axiado’s model updates go through stringent security scrutiny methodology before they get accepted into data lakes and customer deployments.”
Axiado also secures the supply chain from fabrication house through chip assembly to customer production, using life-cycle management to enforce secure manufacturing and customer-provisioning policies. Its Trusted Control Unit (TCU) uses one-time-programmable memory to store life-cycle stages and is designed to resist side-channel attacks.
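Axiado has not published the internals of its pipeline, but the general pattern of vetting model updates before deployment can be illustrated with a standard signature check over the model artifact. This sketch uses the Python cryptography package with hypothetical file names, assuming an RSA signing key:

```python
# Generic sketch (not Axiado's actual pipeline) of one piece of update
# scrutiny: verify the publisher's signature over a model artifact before
# accepting it for deployment. File names and keys are hypothetical.
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.exceptions import InvalidSignature

def verify_model_update(model_path: str, sig_path: str, pubkey_path: str) -> bool:
    """Return True only if the RSA-PSS signature over the model file checks out."""
    with open(pubkey_path, "rb") as f:
        public_key = serialization.load_pem_public_key(f.read())
    with open(model_path, "rb") as f:
        model_bytes = f.read()
    with open(sig_path, "rb") as f:
        signature = f.read()
    try:
        public_key.verify(
            signature,
            model_bytes,
            padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                        salt_length=padding.PSS.MAX_LENGTH),
            hashes.SHA256(),
        )
        return True
    except InvalidSignature:
        return False

# if not verify_model_update("update.onnx", "update.sig", "vendor_pub.pem"):
#     raise RuntimeError("rejecting unsigned or tampered model update")
```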
Secure the data
Generative AI programs require access to a large amount of data to learn and generate new content. It’s important to secure this data so that it can’t be accessed by unauthorized users. This can be done through data encryption, access control, and monitoring.
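As a baseline, encrypting training data at rest is straightforward with off-the-shelf libraries. A minimal sketch using the Python cryptography package, with a hypothetical data file and deliberately simplified key handling:

```python
# Minimal sketch of encrypting a training-data file at rest, using the
# widely available `cryptography` package (pip install cryptography).
# Key handling is simplified; in practice the key would live in a
# key-management service or HSM, not alongside the data.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in production: fetch from a KMS
cipher = Fernet(key)

with open("training_data.csv", "rb") as f:        # hypothetical data file
    ciphertext = cipher.encrypt(f.read())

with open("training_data.csv.enc", "wb") as f:
    f.write(ciphertext)

# Later, an authorized pipeline holding the key can decrypt:
# plaintext = Fernet(key).decrypt(ciphertext)
```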
Securing all of it is difficult, however. Data at rest and data in transit can be encrypted, but until recently data in use has been vulnerable. Not only can adversaries install spyware to monitor and exfiltrate design data, but the popularity of AI as a way to clean up code has put proprietary code into public circulation. Vaultree claims to have resolved this problem with a tool that can encrypt data in use.
Vaultree CEO Ryan Lasmali said their tool can be used even with a public generative AI service, as long as it is with a registered business account. “Data shared publicly on public servers becomes open source, essentially,” he said. “But on a business account with the Vaultree tool that sensitive information is encrypted and processed in its encrypted form.”
ML-powered systems significantly increase design productivity, but that boost does not outweigh concerns about their inherent security weaknesses. They are not a replacement for skilled and educated workers, and their use should be approached warily before a design is considered complete.
Lou Covey is the Chief Editor for Cyber Protection Magazine. In 50 years as a journalist he has covered American politics, education, religious history, women’s fashion, music, marketing technology, renewable energy, semiconductors, and avionics. He is currently focused on cybersecurity and artificial intelligence. He published a book on renewable energy policy in 2020 and is writing a second one on technology aptitude. He hosts the Crucial Tech podcast.