Gloss Key Takeaways
  1. More than a quarter of CVEs (28.3%, per Mandiant's M-Trends 2026 report) are exploited within 24 hours of disclosure, making manual patch triage too slow to be reliable.
  2. A practical defense is an automated CVE-to-PR pipeline that turns vulnerability disclosures into reviewable pull requests quickly.
  3. Keep the pipeline mostly deterministic (feed ingestion, parsing, SBOM matching) and use an LLM only for the fix-planning step to avoid hallucinations and complexity.
  4. Use OSV.dev as the primary CVE source (with NVD as fallback for OS-level issues), poll frequently, and cache by CVE ID to prevent redundant processing.
  5. Only act on vulnerabilities with known fixed versions and prefer the smallest safe version bump to minimize review time and risk.

Patch Faster Than the Attackers, an Automated CVE-to-PR Pipeline

Mandiant's M-Trends 2026 report puts a hard number on something defenders have felt for years: 28.3% of CVEs are exploited within 24 hours of public disclosure. The window between "this vulnerability is now public" and "this vulnerability is actively being weaponized against you" has collapsed to roughly the time it takes a human to read the morning security digest. Manual patch triage is a losing strategy and most security teams already know it. The question is what to put in its place.

This is a job that fits agents almost too well. The inputs are structured feeds. The matching logic is mechanical. The output is a pull request a human reviews. You do not need a 50-step plan-and-reflect agent. You need a small, dependable pipeline that reads CVE data, checks it against your software bill of materials, and opens patches before lunch. Here is how to build one.

What the pipeline actually does

Five steps, each boring on its own, useful in sequence.

[CVE feed] -> [Parser] -> [SBOM matcher] -> [Fix planner] -> [PR opener]
                                                |
                                          [LLM with tools]

Pull new CVEs from a feed. Parse them into structured records. Match each CVE against your SBOM to find affected repos. For each match, ask an LLM to propose a fix, usually a version bump, sometimes a config change or a workaround. Open a PR with the fix, the CVE reference, and a clear summary. A human approves and merges.

The LLM only enters at the fix planner. Everything before it (feed, parser, SBOM matcher) is a script. People skip this distinction and end up with a Rube Goldberg agent that hallucinates CVSS scores. Keep the deterministic parts deterministic.

Step one, the CVE feed

NVD publishes a JSON feed. GitHub Security Advisories has an API. OSV.dev aggregates across ecosystems and is, in practice, the cleanest source for application dependencies. Pick OSV as your primary, fall back to NVD for OS-level CVEs, and poll every fifteen minutes. Cache by CVE ID so you do not reprocess the same vulnerability fifty times when a new advisory updates an old one. OSV records carry their own IDs (GHSA-prefixed ones, for example) and list the CVE under aliases, so resolve the alias before you check the cache.
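A minimal sketch of the cache, for illustration only. The `SeenCache` class and its in-memory dict are hypothetical; a real deployment would more likely use a database table. One choice here is an assumption rather than something stated above: the cache fingerprints the fixed versions, so a repeated advisory is skipped while an update that adds a new fixed version is let through once.

```python
from dataclasses import dataclass, field


@dataclass
class SeenCache:
    """Remembers which fix data has already been processed for each CVE ID."""
    _seen: dict[str, frozenset[str]] = field(default_factory=dict)

    def should_process(self, cve_id: str, fixed_versions: list[str]) -> bool:
        # Fingerprint only the data the pipeline acts on: the fixed versions.
        fingerprint = frozenset(fixed_versions)
        if self._seen.get(cve_id) == fingerprint:
            return False  # same CVE, nothing actionable changed
        self._seen[cve_id] = fingerprint
        return True


cache = SeenCache()
first = cache.should_process("CVE-2026-12345", ["4.2.2"])
repeat = cache.should_process("CVE-2026-12345", ["4.2.2"])
updated = cache.should_process("CVE-2026-12345", ["4.2.2", "5.0.1"])
```

Re-scoring or an edited summary leaves the fingerprint unchanged, so those updates do not trigger another pass.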

A new CVE record looks roughly like this once parsed:

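from dataclasses import dataclass
from datetime import datetime

@dataclass
class Package:
    name: str
    ecosystem: str
    version_range: str        # range format depends on the feed

@dataclass
class Vulnerability:
    id: str                   # "CVE-2026-12345"
    summary: str
    severity: str             # "CRITICAL", "HIGH", ...
    affected: list[Package]   # name, ecosystem, version range
    fixed_in: list[Package]   # versions that contain the fix
    references: list[str]
    published_at: datetime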

Reject records whose fixed_in list is empty. If there is no fix yet, the agent has nothing to do. Log it for the security team and move on.
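The parsing and rejection step might look like the sketch below, assuming the input is an OSV-format record (fields such as `aliases`, `affected[].ranges[].events` and `published` come from the OSV schema). `ParsedVuln`, `FixedPackage` and `parse_osv` are hypothetical names, the structure is simplified compared with the record above, and the sample advisory is invented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FixedPackage:
    name: str
    ecosystem: str
    version: str  # a single fixed version, e.g. "4.2.2"


@dataclass
class ParsedVuln:
    id: str
    summary: str
    fixed_in: list[FixedPackage]
    references: list[str]
    published_at: datetime


def parse_osv(record: dict) -> Optional[ParsedVuln]:
    """Return a ParsedVuln, or None when the record lists no fixed version."""
    fixed = []
    for entry in record.get("affected", []):
        pkg = entry.get("package", {})
        for rng in entry.get("ranges", []):
            if rng.get("type") == "GIT":
                continue  # GIT ranges hold commit hashes, not versions
            for event in rng.get("events", []):
                if "fixed" in event:
                    fixed.append(FixedPackage(
                        pkg.get("name", ""), pkg.get("ecosystem", ""), event["fixed"]))
    if not fixed:
        return None  # no fix yet: log for the security team and move on
    # Prefer the CVE alias as the ID when the record supplies one.
    cve_ids = [a for a in record.get("aliases", []) if a.startswith("CVE-")]
    return ParsedVuln(
        id=cve_ids[0] if cve_ids else record["id"],
        summary=record.get("summary", ""),
        fixed_in=fixed,
        references=[r["url"] for r in record.get("references", []) if "url" in r],
        published_at=datetime.fromisoformat(record["published"].replace("Z", "+00:00")),
    )


sample = {
    "id": "GHSA-xxxx-xxxx-xxxx",
    "aliases": ["CVE-2026-12345"],
    "summary": "Example advisory",
    "published": "2026-01-02T03:04:05Z",
    "affected": [{
        "package": {"name": "examplepkg", "ecosystem": "npm"},
        "ranges": [{"type": "SEMVER",
                    "events": [{"introduced": "0"}, {"fixed": "4.2.2"}]}],
    }],
    "references": [{"type": "ADVISORY", "url": "https://example.com/advisory"}],
}
vuln = parse_osv(sample)
unfixed = parse_osv({**sample, "affected": []})
```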

Step two, the SBOM matcher

Your SBOM is the canonical list of what is actually deployed. Generate it with Syft, a CycloneDX generator, or whatever your CI already produces, and store it per repository in a database keyed by ecosystem and package name. When a new CVE arrives, query for every repository that contains an affected package at a vulnerable version.

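def find_affected_repos(vuln: Vulnerability, db) -> list[Match]:
    """Return one Match per repo that ships an affected package at a vulnerable version."""
    matches = []
    for pkg in vuln.affected:
        # Narrow by ecosystem and name in SQL; the range check happens in Python.
        rows = db.query("""
            SELECT repo, current_version FROM sbom_packages
            WHERE ecosystem = %s AND name = %s
        """, pkg.ecosystem, pkg.name)
        for row in rows:
            # Keep only repos whose installed version falls inside the vulnerable range.
            if version_in_range(row["current_version"], pkg.version_range):
                matches.append(Match(
                    repo=row["repo"],
                    package=pkg,
                    current=row["current_version"],
                    target=pick_fix_version(vuln.fixed_in, pkg),
                ))
    return matches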
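The query above assumes a table shaped roughly like the one sketched here. This version uses the standard-library sqlite3 module so it runs anywhere; the column set is inferred from the query, and the index name, repo names and package names are invented for illustration. Note that sqlite3 uses `?` placeholders where the Postgres-style query above uses `%s`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sbom_packages (
        repo            TEXT NOT NULL,
        ecosystem       TEXT NOT NULL,
        name            TEXT NOT NULL,
        current_version TEXT NOT NULL,
        PRIMARY KEY (repo, ecosystem, name, current_version)
    )
""")
# The index matches the lookup pattern: ecosystem and package name first.
conn.execute("CREATE INDEX idx_pkg ON sbom_packages (ecosystem, name)")
conn.executemany(
    "INSERT INTO sbom_packages VALUES (?, ?, ?, ?)",
    [
        ("org/api", "npm", "examplepkg", "4.2.1"),
        ("org/web", "npm", "examplepkg", "4.2.2"),
        ("org/jobs", "PyPI", "otherpkg", "1.0.0"),
    ],
)


def repos_using(ecosystem: str, name: str) -> list[tuple[str, str]]:
    """Return (repo, current_version) for every repo that ships the package."""
    cur = conn.execute(
        "SELECT repo, current_version FROM sbom_packages "
        "WHERE ecosystem = ? AND name = ? ORDER BY repo",
        (ecosystem, name),
    )
    return cur.fetchall()


rows = repos_using("npm", "examplepkg")
```

Including the version in the primary key allows one repo to carry two versions of the same package, which happens with transitive dependencies.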

pick_fix_version returns the smallest version bump that lands inside fixed_in. Smallest, not latest. A patch from 4.2.1 to 4.2.2 is reviewable in five minutes. A bump from 4.2.1 to 6.0.0 is a Friday afternoon you will never get back.
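One way the two helpers could work, as a sketch under simplifying assumptions: versions are plain dotted numbers such as "4.2.1" (no pre-release tags or ecosystem-specific rules), and the signatures are simplified to take strings rather than the Package objects used above. A production version would use each ecosystem's own version-comparison library.

```python
from typing import Optional


def parse_version(v: str) -> tuple[int, ...]:
    """Parse a dotted numeric version such as "4.2.1" into (4, 2, 1)."""
    return tuple(int(part) for part in v.split("."))


def version_in_range(current: str, introduced: str, fixed: str) -> bool:
    """True when introduced <= current < fixed."""
    return parse_version(introduced) <= parse_version(current) < parse_version(fixed)


def pick_fix_version(fixed_versions: list[str], current: str) -> Optional[str]:
    """Return the lowest fixed version above `current`, or None if there is none."""
    candidates = [v for v in fixed_versions if parse_version(v) > parse_version(current)]
    return min(candidates, key=parse_version) if candidates else None


target = pick_fix_version(["6.0.0", "4.2.2", "5.0.1"], "4.2.1")
```

Here `target` is "4.2.2": the nearest fixed release, not the newest one.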

Step three, the fix planner

This is where the LLM finally enters. Use any frontier model with tool use. The agent gets a small set of tools, a tight prompt, and exactly one job per invocation: produce a patch for one repository for one vulnerability.

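Tools the agent needs, going by the workflow below: a file reader, a file editor for the manifest and lockfile, a sandboxed command runner for lockfile regeneration, and a test runner.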

The prompt is short. Hand the agent the Match record, the manifest contents, and a system prompt that says: bump the affected package to the target version, update the lockfile, do not change anything else, and explain why in two sentences. If the manifest is unfamiliar, the agent reads files first. If a lockfile regeneration is needed, the agent calls a sandboxed npm install --package-lock-only or its equivalent. If the test command exists and runs cleanly, even better, but do not block on it. The human reviewer is the safety net.

The reason this works is that the scope is microscopic. The agent is not deciding whether to patch. It is not picking the version. It is not architecting a refactor. It is editing one or two files to land a known fix. Frontier models do this nearly perfectly when the prompt does not let them improvise.
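To keep that scope enforced rather than merely requested, the pipeline can build the prompt deterministically and then check what the agent touched. The sketch below is illustrative: `FixRequest`, `build_user_message` and `out_of_scope` are hypothetical names, and the message layout is an assumption. Only the system prompt wording is taken from the description above.

```python
from dataclasses import dataclass


@dataclass
class FixRequest:
    repo: str
    package: str
    current: str
    target: str
    manifest_path: str
    lockfile_path: str


SYSTEM_PROMPT = (
    "Bump the affected package to the target version, update the lockfile, "
    "do not change anything else, and explain why in two sentences."
)


def build_user_message(req: FixRequest, manifest_text: str) -> str:
    """Assemble the per-invocation context: one repo, one vulnerability."""
    return (
        f"Repository: {req.repo}\n"
        f"Package: {req.package} {req.current} -> {req.target}\n"
        f"Manifest ({req.manifest_path}):\n{manifest_text}"
    )


def out_of_scope(changed_files: list[str], req: FixRequest) -> list[str]:
    """Return any files the agent touched beyond the manifest and lockfile."""
    allowed = {req.manifest_path, req.lockfile_path}
    return sorted(f for f in changed_files if f not in allowed)


req = FixRequest("org/api", "examplepkg", "4.2.1", "4.2.2",
                 "package.json", "package-lock.json")
message = build_user_message(req, '{"dependencies": {"examplepkg": "4.2.1"}}')
extra = out_of_scope(["package.json", "package-lock.json", "src/index.js"], req)
```

A non-empty `extra` list is a reason to discard the patch before it ever becomes a PR.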

Step four, opening the PR

Use the GitHub or GitLab API. The PR template should be machine-generated and ruthlessly consistent.

Title: [security] bump <package> to <target> for <CVE-ID>

Summary
- CVE: <id> (<severity>)
- Package: <name> <current> -> <target>
- Fix source: <reference>

What changed
- <files modified>

Notes from the agent
- <two-sentence explanation>

Add a label like security/auto-patch, request review from the security team, and assign no one else. If the repo has CODEOWNERS, those rules will fire automatically. Do not auto-merge. Do not skip required checks. The whole point is that humans stay in the loop on the merge decision while the toil disappears.
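Rendering the template is plain string formatting, so it can stay outside the agent entirely. A minimal sketch, assuming the field names shown in the template above; `render_pr` is a hypothetical helper and the sample values are invented.

```python
def render_pr(cve_id: str, severity: str, package: str, current: str,
              target: str, reference: str, files: list[str], note: str) -> dict:
    """Fill the fixed template; identical inputs always give identical text."""
    title = f"[security] bump {package} to {target} for {cve_id}"
    lines = [
        "Summary",
        f"- CVE: {cve_id} ({severity})",
        f"- Package: {package} {current} -> {target}",
        f"- Fix source: {reference}",
        "",
        "What changed",
        *[f"- {path}" for path in files],
        "",
        "Notes from the agent",
        f"- {note}",
    ]
    return {"title": title, "body": "\n".join(lines)}


pr = render_pr(
    "CVE-2026-12345", "HIGH", "examplepkg", "4.2.1", "4.2.2",
    "https://example.com/advisory", ["package.json", "package-lock.json"],
    "Bumps examplepkg to the first fixed release. No other files were changed.",
)
```

The resulting title and body map onto the `title` and `body` fields of GitHub's create-pull-request endpoint; the label is applied in a separate call.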

Guardrails that matter

Three failure modes will bite you if you skip the guardrails.

The flood. A single popular dependency disclosure can hit hundreds of repos in your fleet. Rate-limit PR creation per repo per day. Batch transitive bumps where possible. Better to ship ten clean PRs and hold thirty than ship forty PRs that overwhelm reviewers and get ignored.
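The per-repo daily limit can be a few lines of bookkeeping. This is a sketch only: `DailyPrBudget` is a hypothetical name, the counter lives in memory, and the limit of two is an arbitrary example value.

```python
from collections import defaultdict
from datetime import date


class DailyPrBudget:
    """Caps how many auto-patch PRs may be opened per repo per day."""

    def __init__(self, per_repo_limit: int):
        self.per_repo_limit = per_repo_limit
        self._opened: dict[tuple[str, date], int] = defaultdict(int)

    def try_acquire(self, repo: str, today: date) -> bool:
        """Reserve one PR slot; return False when the repo's budget is spent."""
        key = (repo, today)
        if self._opened[key] >= self.per_repo_limit:
            return False  # hold this PR for a later run
        self._opened[key] += 1
        return True


budget = DailyPrBudget(per_repo_limit=2)
day = date(2026, 1, 2)
results = [budget.try_acquire("org/api", day) for _ in range(3)]
next_day = budget.try_acquire("org/api", date(2026, 1, 3))
```

Held PRs are not dropped; they stay queued and get picked up once the next day's budget opens.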

The wrong fix. The agent will occasionally propose a version that satisfies fixed_in but breaks an unrelated peer dependency. Always run the existing CI suite on the PR. If CI fails, close the PR and have the agent file an issue instead. An open issue is a better outcome than a green PR that bricks production.

The infinite loop. Some CVEs reopen, get re-scored, or chain into supply-chain advisories that supersede them. Track which CVEs you have already addressed per repo and which PRs are open. Never propose the same fix twice.
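A small ledger is enough to enforce "never propose the same fix twice". In this sketch, `FixLedger` is a hypothetical name and the storage is in memory. Keying on repo, package and target version, rather than on the CVE ID, is an assumption made here so that a superseding advisory pointing at the same bump does not open a second PR.

```python
class FixLedger:
    """Tracks which (repo, package, target) fixes already have a PR or issue."""

    def __init__(self):
        self._entries: dict[tuple[str, str, str], dict] = {}

    def should_propose(self, repo: str, package: str, target: str) -> bool:
        """True only when this exact fix has never been proposed for the repo."""
        return (repo, package, target) not in self._entries

    def record(self, repo: str, package: str, target: str,
               cve_id: str, status: str) -> None:
        """Remember the fix, the CVE that triggered it, and its state."""
        self._entries[(repo, package, target)] = {"cve": cve_id, "status": status}


ledger = FixLedger()
before = ledger.should_propose("org/api", "examplepkg", "4.2.2")
ledger.record("org/api", "examplepkg", "4.2.2", "CVE-2026-12345", "open")
after = ledger.should_propose("org/api", "examplepkg", "4.2.2")
other_target = ledger.should_propose("org/api", "examplepkg", "5.0.1")
```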

Where the working repo lives

A reference implementation is on GitHub at marcokotrotsos/cve-to-pr (placeholder, swap with your own fork). It is roughly 600 lines of Python, uses OSV as the primary feed, ships with a Postgres schema for SBOM storage, and includes Anthropic and OpenAI tool-use adapters. The agent prompt is in agents/fix_planner.md and is the file you will tune the most. Run it as a cron job, point it at one repo to start, and only widen the blast radius once you have shipped twenty clean PRs.

The point is not the model. The point is the pipeline. CVEs arrive faster than humans can read them, but the work of turning a CVE into a one-line version bump is exactly the kind of work that should happen while you sleep. Build the small thing first. Let the agent do the boring part. Keep humans on the merge button. That is the shape of security work that actually scales in 2026.

Gloss What This Means For You

Set up a lightweight pipeline that polls OSV.dev regularly, parses new advisories, and matches them against per-repo SBOM data so you can identify exactly which repositories are affected. Configure it to ignore CVEs without a published fix, and when a fix exists, automatically open PRs that apply the smallest version bump that lands in a known-fixed range. Keep the early steps as simple scripts and reserve the LLM for proposing the specific change and PR summary, so humans can review and merge quickly.