Why Cybersecurity Minds Are Perfect for the Challenges of AI Security

Nov 17, 2025
6 min read

This is a guest post by Eytan Schulman a Heron AI Security community member

Over the past few months I have been getting more familiar with the AI security space, especially after attending the AI Security Forum in Vegas. The conference covered a wide range of issues, from protecting model weights and building mechanisms for AI control to trusted execution environments. Yet one theme came up repeatedly: the urgent need for more cybersecurity professionals to get involved in the field.

In this post, I want to unpack that idea. Why did so many speakers emphasize this point, and why are cybersecurity professionals in such a strong position to contribute to AI safety?

At first, I was puzzled why the call for cybersecurity talent came up so often. After all, AI safety might sound like a field for researchers, policy makers, or machine learning engineers. But the more I listened, the clearer it became that many of the challenges ahead surrounding AI are not only adjacent to cybersecurity but represent cybersecurity adapted for novel threat models and attack vectors. Securing AI model weights involves the same principles as protecting sensitive data, but with additional considerations like model extraction attacks and training data reconstruction.

Assessing AI systems for vulnerabilities mirrors traditional security assessment methodologies, while extending them to cover novel attack classes like prompt injection, model poisoning, and adversarial examples. Moreover, red teaming an AI system leverages the same adversarial mindset that security professionals practice daily, though applied to fundamentally new categories of system behavior and failure modes.

An Expertise Gap at a Critical Moment

The fundamental issue is that AI safety and security requires domain expertise that doesn't yet exist at scale. The problems are urgent because AI systems are already being deployed in critical applications, but the talent pipeline is underdeveloped. Programmers with traditional AI research backgrounds often lack the adversarial mindset and practical security experience needed for real-world deployment challenges. Moreover, many AI labs face incentive structures that prioritize capability advancement and time-to-market over comprehensive security assessment, creating a gap between what security requires and what organizations resource.

Cybersecurity professionals are uniquely positioned because they already understand threat modeling, systematic vulnerability assessment, and the catastrophic cost of security failures. They know that attackers don't follow the rules, that systems fail in unexpected ways, and that robust defense requires thinking like an adversary from day one.

Traditional Cybersecurity with Expanded Threat Models

Many AI security challenges extend familiar cybersecurity problems into a domain where the stakes are fundamentally different. One such challenge is protecting model weights, which goes beyond intellectual property concerns: stolen weights enable adversaries to transform the model into their own tool.

One of the more concerning risks is adversarial fine tuning. With access to weights, attackers can adapt models to bypass safeguards and generate harmful outputs. Research such as the BadLlama study shows how easily this can be done. This isn't just about generating offensive text: a fine-tuned model could provide step-by-step guidance for synthesizing novel biological weapons, orchestrate sophisticated social engineering attacks at scale, or generate and implement exploits for zero-day vulnerabilities in critical infrastructure with future model capabilities. Unlike traditional malware or hacking tools that require significant expertise to develop, a sufficiently capable fine-tuned model could democratize access to these capabilities, enabling even relatively unskilled actors to launch attacks that would previously require teams of specialists. Weight protection is therefore about preventing dangerous misuse as much as preventing theft.

AI models are developed by engineers and researchers by leveraging large amounts of compute across vastly large infrastructure. Securing these systems remains crucial for protecting training pipelines, compute resources, and deployment environments, and many practices can draw from established cybersecurity frameworks. Yet these systems are highly optimized for AI training and often fall outside traditional models of defense, which makes securing them especially difficult. At the same time, their scale and importance resemble national infrastructure, where a single compromise could trigger systemic effects across multiple sectors.

What's different is the attack surface and the potential impact of compromise. A breach in AI training infrastructure doesn't just affect the company; it can have cascading real-world consequences. Compromised models deployed in healthcare could provide incorrect diagnoses or treatment recommendations affecting patient safety. Models used in financial systems could enable market manipulation or fraudulent transactions at unprecedented scale. AI systems controlling critical infrastructure like power grids or transportation networks could be manipulated to cause physical disruptions. Perhaps most concerning, a compromised foundation model could propagate vulnerabilities to thousands of downstream applications across sectors, creating systemic risks analogous to a supply chain attack on widely-used software libraries, but with AI agents increasingly making autonomous decisions rather than just processing data, the potential repercussions are almost unquantifiable.

Adversarial Methodologies Adapted for AI losing Thoughts

AI red teaming leverages the same systematic adversarial analysis that cybersecurity professionals practice daily: structured attack path enumeration applied to fundamentally new categories of system behavior. Instead of exploiting network vulnerabilities or application logic flaws, practitioners probe for prompt injection vulnerabilities, model poisoning opportunities, distributional shift weaknesses, and dangerous emergent capabilities.

The core skill set translates seamlessly. Hypothesis-driven testing, creative exploitation techniques, and understanding how attackers think differently than defenders are the same capabilities that make effective penetration testers. The difference lies in the target: instead of breaking into networks, you're breaking AI safeguards or discovering hidden capabilities that could be misused.

Prompt injection attacks use the same creative thinking behind SQL injection or cross site scripting exploits because they manipulate how input is parsed to force unintended behavior. The technical details differ but the methodology is the same. Crucially, an attacker does not need deep AI or machine learning expertise to find effective prompt injections.

Automated Defenses for Evolving AI Systems

Adversarial testing and structured evaluations both contribute to AI product security, though they serve different purposes. Red teaming focuses on discovering unexpected vulnerabilities through creative attacks, while evaluations provide automated and repeatable ways to measure systems against defined criteria. Cybersecurity already uses frameworks that combine discovery, testing, and monitoring, and similar approaches can be adapted for AI.

By integrating these practices into development pipelines, organizations can move toward automated product security for AI systems. This means combining adversarial discovery with systematic assessments and continuous monitoring, creating a feedback loop that helps teams detect risks early, test mitigations at scale, and maintain resilience as systems evolve.

Evaluations become particularly important as AI capabilities advance rapidly. Manual red teaming can't scale to assess every model variant or deployment scenario. As a result, automated evaluation frameworks, built by people who understand both the adversarial landscape and robust testing methodologies, become essential infrastructure.

Cyber Capability Evaluations

Beyond product security, evaluations also play a critical role in monitoring AI’s growing cyber capabilities. These are not about prompt injection or model safety features, but about measuring how well models can perform offensive or defensive cyber tasks. Such evaluations help flag when offensive capabilities begin to outpace defensive safeguards — a red flag for policymakers and practitioners alike.

Designing these evaluations requires deep expertise in cybersecurity. Only experts who understand the realities of network defense, attack surfaces, and exploitation techniques can create meaningful tests. Without this grounding, capability evaluations risk being either too narrow to detect emerging threats or too broad to generate actionable insights.

The Frontier Where Your Skills Matter Most

AI safety and security will not be achieved by researchers working in isolation. It requires practitioners who understand real-world attack scenarios, operational security challenges, and the economics of defense. The field needs people who have seen how security breaks down under pressure, who understand that perfect theoretical security rarely survives contact with production environments.

For cybersecurity professionals, AI represents a domain where your adversarial expertise isn't just useful but urgently needed to prevent systemic risks at unprecedented scale. The opportunity exists now to shape an emerging field where traditional security thinking must evolve to meet entirely new categories of threats.

The transition isn't just about applying old skills to new problems. It's about helping define what AI security means in practice, building the frameworks and methodologies that will secure systems we can barely imagine today. For anyone with a cybersecurity background, this is a frontier with enormous opportunity to have a meaningful and lasting impact on technology that will shape and define the future.