Google and Meta researchers are warning that AI agents should be treated as ‘untrusted’ systems as companies race to deploy autonomous software capable of handling emails, payments, coding and enterprise workflows.
In a new paper titled ‘Agent Security is a Systems Problem,’ researchers argued that simply making large language models more robust will not be enough to secure next-generation AI agents. Instead, security protections must be built around the systems controlling them, much like safeguards used in operating systems and cloud infrastructure.
The report notes:
We take the position that agent security must be approached as a systems problem: the AI model powering the agent must be treated as an untrusted component, and security invariants must be enforced at the system level. Through this lens, efforts to increase model robustness (the dominant viewpoint in the community) are insufficient on their own.
Instead, we must complement existing efforts with techniques from the systems security domain. Based on our experience as cybersecurity researchers in operating systems, networks, formal methods, and adversarial machine learning, we articulate a set of core principles, grounded in decades of systems security research, that provide a foundation for designing agentic systems with predictable guarantees.
As evidence, we analyze eleven representative real-world attacks on agents and discuss how systems principles, if realized, could have prevented these attacks. We also identify the research challenges that stand in the way of implementing these principles in agents.
The report analyzed 11 real-world attacks on AI agents and concluded that many failures stem from giving models excessive permissions or direct access to sensitive systems without sufficient isolation or oversight.
Researchers warned that agents remain vulnerable to
even when underlying models improve.
The findings come as Silicon Valley intensifies efforts to commercialize ‘agentic AI’ – software that can independently execute tasks with minimal human supervision. Companies including Google, Meta, Microsoft, and Amazon Web Services (AWS) are investing heavily in AI agents for enterprise and consumer applications.
The researchers said the industry’s current approach mirrors early cybersecurity mistakes in computing, when systems trusted components that later proved exploitable. Their proposed framework would instead treat AI models as inherently unreliable and enforce security guarantees at the infrastructure layer.
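To make the idea concrete, here is a minimal sketch of what enforcing a guarantee outside the model might look like. It is not taken from the paper: the tool names, the payment limit, and the `ToolCall`, `authorize`, and `execute` helpers are all hypothetical, chosen only to illustrate a system-level check that holds regardless of what the model outputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """An action proposed by the model; treated as untrusted input."""
    tool: str
    args: dict


# Hypothetical policy, defined by the system operator and not by the model.
ALLOWED_TOOLS = {"read_email", "send_payment"}
MAX_PAYMENT = 100.0  # illustrative per-call limit


def authorize(call: ToolCall) -> bool:
    """Return True only if the proposed call satisfies the policy."""
    if call.tool not in ALLOWED_TOOLS:
        return False
    if call.tool == "send_payment":
        amount = call.args.get("amount")
        # Reject missing, non-numeric, or boolean amounts outright.
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            return False
        return 0 < amount <= MAX_PAYMENT
    return True


def execute(call: ToolCall, tools: dict):
    """Run a tool only after the policy check passes."""
    if not authorize(call):
        raise PermissionError(f"blocked tool call: {call.tool}")
    return tools[call.tool](**call.args)


# Example: stand-in tool implementations for demonstration only.
tools = {
    "read_email": lambda: "inbox contents",
    "send_payment": lambda amount: f"paid {amount}",
}

print(execute(ToolCall("send_payment", {"amount": 50}), tools))
try:
    execute(ToolCall("delete_database", {}), tools)
except PermissionError as err:
    print(err)
```

In this sketch the check sits between the model and the tools, so a manipulated model can propose a harmful action but cannot carry it out. A real deployment would need far richer policies and isolation than a single allowlist.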
The paper adds to growing concern across the AI industry about autonomous systems gaining access to corporate data, developer environments, and financial infrastructure. Recent incidents involving coding agents deleting production databases and AI systems executing unintended actions have intensified scrutiny of the technology’s deployment risks.
The authors called for:
before AI agents are widely trusted with critical operations.

