36% Of The Agent Skills You Just Installed Are Compromised

NVIDIA just shipped SkillSpector. A Snyk audit found 36.82% of 3,984 agent skills had security flaws. Here's what operators need to do this week.

June 22, 2026 · 6 min read

Last week NVIDIA quietly open-sourced a security scanner called SkillSpector, designed to do one job: tell you whether the AI agent skill you're about to install is going to wreck your business.

The reason that scanner exists is uncomfortable. A study of 42,447 agent skills from major marketplaces, published earlier this year, found that 26.1% contained at least one security vulnerability and 5.2% showed patterns strongly suggestive of malicious intent[1]. A separate Snyk audit of 3,984 skills reported even worse numbers: 36.82% had at least one security flaw, 13.4% had critical issues, and the researchers catalogued 1,467 distinct malicious payloads with 76 confirmed weaponized skills[2].
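To make the scale concrete, the quoted percentages can be turned into rough skill counts. This is a minimal sketch: the counts are derived here by rounding the percentages above and are not figures taken from the reports themselves.

```python
# Convert the reported percentages into approximate skill counts.
# The totals and percentages are the ones quoted in the article;
# the rounded counts are derived, not published figures.
STUDIES = {
    "marketplace study [1]": (42_447, {"vulnerable": 26.1, "likely malicious": 5.2}),
    "Snyk audit [2]": (3_984, {"any flaw": 36.82, "critical": 13.4}),
}


def approx_counts(total, percentages):
    """Return {label: rounded count} for each percentage of `total`."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}


for name, (total, pcts) in STUDIES.items():
    for label, count in approx_counts(total, pcts).items():
        print(f"{name}: {label} ~ {count:,} of {total:,}")
```

On these numbers, the larger study implies roughly eleven thousand vulnerable skills, which is the population any marketplace install is drawn from.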

If you're an operator running a $5M business and you've been letting your team install agent skills from a marketplace because someone in your Slack said it would save them four hours a week — you have a problem. You just didn't know it yet.

What an "agent skill" actually is

Skills are little bundles of instructions and code that extend what an AI agent can do. Anthropic's version, Claude Skills, launched in October 2025 and racked up 17,000+ GitHub stars almost immediately[3]. Other agent ecosystems have their own. They're how operators are bolting "AI" onto existing workflows without writing real software.

The pitch is great. Install a skill, your agent suddenly knows how to read PDFs, query Stripe, push to Shopify, draft email replies in your tone. No code. Drop it in a folder, the agent picks it up, done.

The problem is the same problem every package ecosystem has had since the 1990s. npm. PyPI. Chrome extensions. WordPress plugins. The skill is code. The "marketplace" doesn't vet it. The agent loads it with the same privileges it has, which by default is "whatever you can do on your machine."

What gets weaponized, and how

In December 2025, Cato Networks demonstrated this concretely. They built a Claude skill that looked like a productivity helper and used it to deploy MedusaLocker ransomware. The skill bundled hidden code that ran before the model even saw the instructions, so the model's safety training had no chance to refuse[3]. The skill was disclosed to Anthropic and the demo was contained, but the technique generalizes[4].

Datadog Security Labs followed in May 2026 with research on coding agent skills specifically. Their finding: skills with executable scripts run shell commands before the model decides whether to trust them, which means "model refuses to do harmful thing" is the wrong defense layer. A malicious author can move the dangerous part of the attack outside the model's decision point entirely[5].

The 42,447-skill study put a multiplier on this. Skills that bundled executable scripts were 2.12× more likely to be vulnerable than instruction-only ones[1]. The category most operators want — "do stuff for me, automatically" — is exactly the category most likely to be compromised.

Why operators are exposed and don't know it

Here's the part nobody on LinkedIn talks about: most owners of $1M–$20M businesses are not the ones installing these skills. Their VAs are. Their freelancers are. Their cousin's-friend-who's-good-with-AI is. And they're installing them on devices that have credentialed access to Stripe, QuickBooks, Gmail, the CRM, and the Shopify admin.

Three uncomfortable facts:

  • There is no equivalent of npm audit running by default. Until SkillSpector dropped this month, there wasn't even a community scanner. Skills install silently into hidden directories.
  • Permissions are inherited, not requested. Unlike a browser extension that has to ask before reading your email, a skill runs as whatever process loaded it. If your agent has Gmail credentials, your skill has Gmail credentials.
  • The marketplaces don't have an "untrust" button. Once a skill is in your ~/.skills/ directory, removing it doesn't undo what it did. Exfiltrated tokens stay exfiltrated.

Anthropic's whitelisted Files API, for example, lets approved processes read folders without a per-file prompt — meaning a malicious skill that gets initial folder access can quietly drain it[6]. The consent model assumes the skill is honest.

Why most takes are wrong

The dominant LinkedIn take this week is: "Just don't install untrusted skills." That's the security equivalent of "just don't click bad links." It tells you nothing useful and blames the user.

The real lessons are operational.

Treat agent skills like software dependencies, not like browser extensions. That means a review step. A list of approved skills. Pinned versions. The same hygiene you'd apply to npm packages going into production. Most operators don't do this because skills feel lightweight. They aren't.
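One way to picture "approved list plus pinned versions" is a content hash recorded at review time and checked before a skill is allowed to load. The sketch below assumes a skill is a directory of files; `skill_digest`, `is_approved`, and the `approved` mapping are hypothetical names, not part of any agent platform's API.

```python
import hashlib
from pathlib import Path


def skill_digest(skill_dir: Path) -> str:
    """SHA-256 over every file's relative path and bytes, in sorted order."""
    h = hashlib.sha256()
    for path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        # Include the path so renaming or moving a file changes the digest.
        h.update(path.relative_to(skill_dir).as_posix().encode())
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def is_approved(skill_dir: Path, approved: dict[str, str]) -> bool:
    """True only if the skill is on the list AND its contents match the pin."""
    return approved.get(skill_dir.name) == skill_digest(skill_dir)
```

The reviewer records `skill_digest(...)` once, after reading the skill. Any later change to the skill's files, including a silent update, makes `is_approved` return False until someone reviews it again.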

Restrict the blast radius before you trust the agent. Don't run your agent as a user with full admin credentials to every system. Give it scoped tokens. Read-only where possible. Separate machines for separate trust levels. This is boring infrastructure work that nobody is selling you a course on, which is exactly why it matters.
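A small illustration of scoping, assuming skill scripts are launched as child processes on a POSIX machine: start them with a scrubbed environment so they do not inherit every secret the parent process holds. `SAFE_ENV_VARS`, `scrubbed_env`, `run_skill_script`, and the Stripe variable names are hypothetical and chosen only for the example.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: the only variables a skill script gets to see.
SAFE_ENV_VARS = ("PATH", "HOME", "LANG")


def scrubbed_env(extra=None):
    """Copy only allowlisted variables, then add explicitly scoped ones."""
    env = {k: os.environ[k] for k in SAFE_ENV_VARS if k in os.environ}
    env.update(extra or {})
    return env


def run_skill_script(argv, extra_env=None):
    """Run a skill script without inheriting the parent's credentials."""
    return subprocess.run(
        argv,
        env=scrubbed_env(extra_env),
        capture_output=True,
        text=True,
        timeout=60,
    )


if __name__ == "__main__":
    os.environ["STRIPE_SECRET_KEY"] = "sk_live_example"  # pretend parent secret
    probe = "import os; print(os.environ.get('STRIPE_SECRET_KEY'))"
    # The child reports whether it can see the parent's secret.
    print(run_skill_script([sys.executable, "-c", probe]).stdout.strip())
```

This only covers environment variables. Credential files on disk and browser sessions are still readable by anything running as the same user, which is why separate accounts or machines per trust level remain the stronger control.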

Log what skills actually do, not just what they claim to do. Datadog's recommendation — telemetry that can answer "what executed, what it accessed, where it connected, which credentials are now compromised" — is the minimum bar[5]. Without it, you have no incident response. You just have hope.
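As a starting point for the "what executed" part of that question, a launcher can append one JSON line per script run. This is a sketch under the assumption that you control how skill scripts are started; `run_and_log` and the record fields are hypothetical, and it does not capture file access or network connections, which need OS-level auditing or an egress proxy.

```python
import hashlib
import json
import subprocess
import time
from pathlib import Path


def run_and_log(argv, log_path):
    """Run a command and append one JSON line describing what executed."""
    script = Path(argv[0])
    record = {
        "ts": time.time(),
        "argv": list(argv),
        # Hash the executable so later changes to the file are detectable.
        "sha256": (
            hashlib.sha256(script.read_bytes()).hexdigest()
            if script.is_file()
            else None
        ),
    }
    result = subprocess.run(argv, capture_output=True, text=True)
    record["exit_code"] = result.returncode
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return result
```

Shipping that log file off the machine matters as much as writing it, since a malicious script running as the same user can edit local files.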

Stop installing the long tail. The marketplaces have power-law distributions. A small number of skills are heavily used and somewhat scrutinized. The rest is a long tail of garbage and traps. If a skill has six stars and a publisher you've never heard of, the upside is "30 minutes saved." The downside is "your customer database on a Russian forum." Bad math.

What I'd actually do this week

If I were running a 20-person business that's let agents creep into the workflow, here's the order I'd attack this in.

  1. Audit what's installed. Get a literal list. Every machine, every agent platform, every skill. Most operators cannot produce this list. That's the first problem.
  2. Run a scanner. SkillSpector is free, open-source, and built specifically for this[7]. Snyk's commercial offering covers the same ground with more enterprise wiring[2]. Pick one and run it across what you found in step one.
  3. Cut admin tokens. Audit which agents have which credentials. Anything with write access to billing, customer data, or payroll gets a scoped read-only replacement until you can prove the agent needs the write path.
  4. Decide who installs. Today, in most small businesses, the answer is "anyone." That has to change. Pick one person who reviews and approves new skills. The bottleneck is the point.
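The first step, the literal list, can be sketched as a short script. It assumes each skill is a directory containing a `SKILL.md` manifest (the Claude Skills convention; other platforms may differ) and that you pass in the directories to search yourself. The suffix list and the `inventory` function are illustrative, not a standard.

```python
import os
from pathlib import Path

# Assumed set of extensions that indicate a bundled executable script.
SCRIPT_SUFFIXES = {".sh", ".py", ".js", ".ps1", ".rb"}


def inventory(roots):
    """Yield (skill_dir, [bundled script paths]) for each SKILL.md found."""
    for root in roots:
        for manifest in sorted(Path(root).expanduser().rglob("SKILL.md")):
            skill_dir = manifest.parent
            scripts = sorted(
                str(p.relative_to(skill_dir))
                for p in skill_dir.rglob("*")
                if p.is_file()
                and (p.suffix in SCRIPT_SUFFIXES or os.access(p, os.X_OK))
            )
            yield skill_dir, scripts


if __name__ == "__main__":
    import sys

    # Usage: python inventory.py <dir> [<dir> ...]
    for skill_dir, scripts in inventory(sys.argv[1:]):
        flag = "HAS SCRIPTS" if scripts else "instructions only"
        print(f"{skill_dir}\t{flag}\t{', '.join(scripts)}")
```

Skills flagged as bundling scripts are the ones to review first, since that is the category the 42,447-skill study found 2.12× more likely to be vulnerable[1].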

None of this stops the productivity story. Agents are still going to do the work. But the difference between "agents save me 20 hours a week" and "agents drained my Stripe account" is a few hours of unsexy plumbing — and that plumbing is what most operators haven't done.

The bottom line

The agent skills marketplace, today, is what npm looked like in 2018: too valuable to ignore, too unsupervised to trust blindly, and quietly compromised at a rate most people would find shocking if anyone bothered to tell them. NVIDIA shipping SkillSpector is the equivalent of npm's first serious audit tooling — necessary, late, and not a substitute for treating dependencies like dependencies.

If your agent has the keys to your business, the skills your agent installs have the keys to your business. Plan accordingly.


If you want help auditing what your agents are actually running — what's installed, who has what access, where the next ransomware demo is going to start — that's the audit call. 30 minutes, no pitch, you walk out with a list and an order to fix it.

Sources
  1. Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale
    arXiv (primary)

    Study of 42,447 skills: 26.1% vulnerable, 5.2% showing likely malicious intent; executable-script skills 2.12x more likely to be vulnerable.

  2. Snyk Finds Prompt Injection in 36%, 1467 Malicious Payloads in ToxicSkills Study
    Snyk (report)

    Snyk audit of 3,984 skills: 36.82% had flaws, 13.4% critical, 1,467 malicious payloads, 76 confirmed malicious skills.

  3. Weaponizing Claude Skills with MedusaLocker
    Cato Networks (analysis)

    Claude Skills launched October 2025 and racked up 17,000+ GitHub stars; can be weaponized via hidden pre-model code.

  4. "Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills in the Wild
    arXiv (primary)

    Systematic analysis of 98,380 skills found 157 confirmed malicious skills containing 632 vulnerabilities across 13 attack techniques, including the MedusaLocker-style supply-chain pattern.

  5. Malicious Coding Agent Skills and the Risk of Dynamic Context
    Datadog Security Labs (analysis)

    Skills with executable scripts run shell commands before the model's decision point; model defenses cannot be the only control.

  6. Claude Cowork File Exfiltration Vulnerability: What CISOs Need to Know
    MintMCP (analysis)

    Whitelisted Files API allows approved processes to read folders without per-file consent, enabling exfiltration once initial access is granted.

  7. NVIDIA/SkillSpector: Security scanner for AI agent skills
    GitHub / NVIDIA (docs)

    NVIDIA open-sourced SkillSpector, citing that 26.1% of agent skills contain vulnerabilities and 5.2% show likely malicious intent.
