Your tests passed. Your PR was approved.
Your change still broke production.
Tests confirm expected behavior. Code review confirms intent. Neither validates what your change actually does at runtime.
GauntletCI detects Behavioral Change Risk in pull request diffs, identifying logic shifts, missing validations, and hidden regressions before they merge.
Detect breaking changes, regressions, and behavioral drift that pass tests and code review.
Built for .NET and C# teams running diff-aware validation before commit and before merge.
The Problem: Diffs are Deceptive
A diff can look clean, compile successfully, and pass every unit test while still changing runtime behavior. The risk is not always in the code that looks wrong. It is often in the assumption that changed quietly.
The Risk: Behavioral Change Risk
Behavioral Change Risk appears when a small edit changes a contract, branch, exception path, validation rule, or side effect without an equally clear test or review signal.
The Solution: Diff-first Behavioral Change Risk validation
GauntletCI acts as an automated auditor for the change itself. It flags unintended side effects, broken assumptions, and unvalidated behavior shifts before they leave your machine or reach a pull request.
The Visibility Gap
You have tests. You have linters. You still have regressions.
Most development pipelines validate code quality, security, and known test cases. They do not validate the behavioral impact of the change itself.
Tests
Confirm expected behavior that someone remembered to test. They do not prove that new behavior is safe.
Linters and static analysis
Find style issues, code smells, and known risky patterns. They do not reason about the intent of a specific diff.
Code review
Helps humans understand whether the change looks reasonable. Reviewers still miss subtle runtime behavior shifts.
GauntletCI
Analyzes the diff itself and flags Behavioral Change Risk before the change reaches review, CI, or production.
"Tests tell you if the code works. Diffs tell you what changed. GauntletCI tells you if your intent is still intact."
GauntletCI closes the gap between "the build is green" and "this change is safe."
Where it fits
GauntletCI runs before everything else
It does not replace tests, review, or CI. It runs before them - closing the gap that exists before any of those tools see the change.
[Diagram: the pipeline without GauntletCI vs. with GauntletCI]
GauntletCI runs on the developer's machine in under a second. No full build required. Core analysis requires no network. By default, no code leaves the machine.
Definition
What is Behavioral Change Risk?
Behavioral Change Risk is the risk that a code change alters runtime behavior in a way that is not clearly intentional, reviewed, or validated.
It appears when a diff changes a contract, branch, guard clause, exception path, data shape, async flow, or side effect without matching validation.
GauntletCI treats those changes as review-critical because they are exactly the kinds of regressions that pass tests and look harmless in pull requests.
Behavior changed + validation missing = Behavioral Change Risk
See it in action
GauntletCI intercepts behavioral risks before your commit lands. Here's what it looks like when it catches a real breaking change.

Running against StackExchange.Redis PR #2995 - GIF recorded with ScreenToGif
Breakdown
What each finding tells you
Rule ID + severity
Each finding maps to a named rule. BLOCK halts the commit. WARN surfaces without blocking.
Caller impact
GauntletCI counts downstream callers and checks whether tests were updated to match the new signature.
Exception path
A new throw with no catch block and no test coverage - a silent crash waiting for the first request.
PII detection
Customer email logged to a structured sink. Flagged as a warning, not a block - your team decides.
Commit blocked
The pre-commit hook exits non-zero. Git does not create the commit until blocking findings are resolved.
How it works
Understand GauntletCI in 60 seconds
One command. Sub-second analysis. Findings before the commit exists.
Stage your changes
Write code as normal. When you're ready to commit, GauntletCI reads the staged diff directly - no compilation, no network, no setup.
Analysis runs in under a second
30+ behavioral change rules evaluate only what changed. The rest of the codebase is not touched, so pre-existing issues do not produce findings.
Up to 3 high-signal findings
Each finding includes a rule ID, severity, the exact line, and a plain-English explanation of why the change is risky. No noise. No style warnings.
Fix before the commit exists
The risky change never reaches tests, review, or CI. You resolve it locally - the pipeline sees a clean diff from the start.
$ gauntletci analyze --staged
Real detection -- not synthetic
The change that looks fine in review
It compiles. Tests pass. Every reviewer approves it. GauntletCI flags it before the commit lands.
[High] GCI0003: Guard clause removed at line 3. ArgumentNullException no longer thrown on null input. Callers relying on this contract will see NullReferenceException deeper in the call stack.
Why code review misses it
The change is a single line removal. The surrounding code still looks correct. The PR description says "cleaned up redundant null check" -- and upstream callers do validate input, so it seems safe. No test fails. The reviewer approves.
Why tests miss it
Tests exercise the happy path. The null input case was implicitly covered by the guard -- but there is no explicit test for it. Coverage reports show green. The regression ships.
Why GauntletCI catches it
GCI0003 fires on any diff where a null guard is removed from a public method. It does not care about tests, coverage, or upstream callers. It analyzes only what changed -- and that line removal is a behavioral contract break, not a cleanup.
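To make the finding concrete, here is a minimal C# sketch of the kind of change described above. The `OrderService` class and both method names are hypothetical and not taken from any real PR; the two methods stand for the same public method before and after the guard was removed.

```csharp
using System;

public static class OrderService
{
    // Before the change: null input fails fast with a clear contract violation.
    public static string NormalizeBefore(string customerId)
    {
        if (customerId is null) throw new ArgumentNullException(nameof(customerId));
        return customerId.Trim().ToUpperInvariant();
    }

    // After the "cleanup": the guard is gone, so null input now fails
    // inside the method body with a NullReferenceException instead.
    public static string NormalizeAfter(string customerId)
    {
        return customerId.Trim().ToUpperInvariant();
    }
}
```

Both versions behave identically for valid input, which is why happy-path tests stay green. Only the failure mode for null input differs.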
Designed for high signal
GauntletCI avoids noise by design. Focus on what matters: behavioral changes that could slip through code review.
Sub-second analysis
No full build required. Core analysis requires no network. Get instant feedback on your changes before the commit is created.
Diff-first risk analysis
Analyzes only what changed. No style or formatting checks, behavioral risk only, scoped to the exact lines you touched.
Local execution by default
Core analysis runs entirely on your machine by default. Auto-redaction prevents sensitive data exposure. Can run air-gapped.
Baseline delta mode
Snapshot existing findings and suppress them. Subsequent runs show only net-new risks, no legacy noise.
High-signal output
Up to 3 findings per run by design. No alert fatigue. Output developers actually read every time.
Docker deployment
Official runtime image for CI pipelines, self-hosted runners, and air-gapped environments. Pull and run.
MCP server (Pro tier)
Expose GauntletCI as a tool to Claude, Cursor, GitHub Copilot, and Windsurf. Ask your AI assistant about risk.
SARIF output
Emit findings as SARIF for GitHub Security tab, IDE diagnostics, and any SARIF-compatible pipeline tool.
GitHub Checks integration (Teams tier)
Post findings as GitHub Checks with inline annotations on the exact diff lines that triggered them.
Privacy-focused
All analysis runs locally. Telemetry is optional and anonymous, and it never includes code, diffs, or findings.
Features & Benefits
Every capability ships with a concrete outcome. Here's what GauntletCI does and what that means for your team.
Detection is deterministic
Detection is rule-based and produces the same output every time. No model, no API key, no network call required.
LLM enrichment is optional
Local AI explanations are opt-in via --with-llm. They add plain-English context only. They do not affect which findings are reported.
Deterministic Change-Risk Detection
30+ rules analyze the exact lines added or removed in a diff, not the whole file. Each rule targets a specific class of risk: removed logic without tests, breaking API changes, hardcoded secrets, unsafe casts, missing null guards, and more.
The benefit
A second opinion on every commit that focuses entirely on what changed and why it might fail in production, catching the things that look fine in review but break at runtime.
Sub-Second Feedback in the Developer Loop
Installs as a pre-commit hook with `gauntletci init`. Before every commit, it runs on the staged diff. No full build required, no network call.
The benefit
Developers catch their own risky changes before they leave the machine, when the fix costs nothing and the context is freshest. Fast enough that it doesn't change how people work.
High Signal, Low Noise
Each run surfaces up to 3 findings. Baseline delta mode snapshots existing findings and suppresses them; subsequent runs show only net-new risks.
The benefit
Teams actually read the output. Alert fatigue is why most static analysis tools get disabled. GauntletCI is designed to be looked at every time because it's almost always relevant.
CI Gate with GitHub Inline Comments
A drop-in GitHub Actions composite action runs GauntletCI on every PR, fails the check if findings are produced, and posts findings as inline review comments directly on the diff.
The benefit
Risky changes can't merge unless reviewed or suppressed. Findings appear on the exact lines that triggered them, no separate report to read, no manual triage.
100% Local Execution & Privacy
All analysis runs entirely on the machine where the command runs. By default, no diff, finding, or file path is transmitted. Evidence strings for PII and secrets are auto-redacted in output.
The benefit
Works in air-gapped environments, on proprietary codebases, and in organizations with strict data residency requirements, no policy exceptions needed. Local execution is the default and requires no configuration.
Local LLM Enrichment (Fully Offline)
Runs high-confidence findings through a locally hosted Phi-4 Mini model and appends a plain-English explanation. No API key, no network call. The model runs on your hardware.
The benefit
Junior developers get actionable context on why a finding matters and what to do about it, without asking a senior engineer. Generated locally so it can safely reference the actual diff.
Architecture Policy Enforcement
Reads a `forbidden_imports` list from `.gauntletci.json` and flags any added import that violates a configured dependency pair, e.g. a Domain project importing an Infrastructure namespace.
The benefit
Architectural boundaries that exist only in wikis and verbal agreements drift silently. GauntletCI enforces them in the diff, at commit time, before the violation is ever merged.
Ticket Context Attachment
Reads the Jira, Linear, or GitHub Issue ticket referenced in the branch name and attaches the ticket description to findings.
The benefit
Flags scope drift: a finding in a database layer on a UI-ticket branch is a strong signal the change wasn't intentional. Verifies not just 'is this code safe?' but 'is it doing what the ticket asked for?'
How GauntletCI compares
Most tools analyze the whole codebase on a schedule. GauntletCI analyzes what changed, locally, before you commit.
GauntletCI is designed to complement the tools your team already uses.
SonarQube, Semgrep, Snyk, CodeQL, and similar tools are valuable, but they primarily answer different questions: Is this code maintainable? Is it vulnerable? Does it match known patterns?
GauntletCI answers a narrower question at a more urgent moment: Did this diff introduce Behavioral Change Risk that should block or change the review?
Validated against real open-source PRs
GauntletCI rules were developed by running the engine against historical pull requests from major .NET OSS projects. These are the findings that would have been caught before merge.
The change
A PR refactored a query helper method. The new implementation introduced LINQ inside a loop over a large result set.
What GauntletCI flagged
O(n²) performance risk flagged before commit. The pattern would have been invisible in code review -- each piece looked fine in isolation.
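A small C# sketch of this pattern, under stated assumptions: the `Order` and `Customer` records and the `QueryHelper` methods are hypothetical illustrations, not the code from the PR. The first method rescans the customer list for every order; the second builds a lookup once.

```csharp
using System.Collections.Generic;
using System.Linq;

public record Customer(int Id, string Name);
public record Order(int Id, int CustomerId);

public static class QueryHelper
{
    // Risky shape: FirstOrDefault rescans `customers` for every order,
    // so the work grows with orders.Count * customers.Count.
    public static List<string> NamesQuadratic(List<Order> orders, List<Customer> customers)
    {
        var names = new List<string>();
        foreach (var order in orders)
        {
            var customer = customers.FirstOrDefault(c => c.Id == order.CustomerId);
            if (customer is not null) names.Add(customer.Name);
        }
        return names;
    }

    // One pass to build a dictionary, then a hash lookup per order.
    // Assumes customer IDs are unique (ToDictionary throws on duplicates).
    public static List<string> NamesLinear(List<Order> orders, List<Customer> customers)
    {
        var byId = customers.ToDictionary(c => c.Id);
        var names = new List<string>();
        foreach (var order in orders)
        {
            if (byId.TryGetValue(order.CustomerId, out var customer)) names.Add(customer.Name);
        }
        return names;
    }
}
```

Each line of the first method looks reasonable on its own, which is the point made above: the cost only shows up when the loop and the inner scan are read together.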
The change
A property getter was refactored to lazily initialize a backing field. The initialization had a side effect that mutated shared context.
What GauntletCI flagged
Pure context mutation in property getter caught pre-commit. No test covered the initialization path. The bug would have appeared only under concurrent access.
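The shape of that getter might look like the following C# sketch. `RequestContext` and `Settings` are hypothetical names invented for illustration; the sketch only shows a lazy getter whose first read mutates shared state.

```csharp
#nullable enable

public class RequestContext
{
    public int Version;
}

public class Settings
{
    private readonly RequestContext _context;
    private string? _cached;

    public Settings(RequestContext context) => _context = context;

    // Reading the property looks harmless, but the first read
    // changes the shared context as part of lazy initialization.
    public string Value
    {
        get
        {
            if (_cached is null)
            {
                _context.Version++; // side effect hidden inside a getter
                _cached = "v" + _context.Version;
            }
            return _cached;
        }
    }
}
```

There is no locking here, so two threads that read `Value` at the same time can both pass the null check and increment `Version` twice. A single-threaded test would not see that.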
The change
An enum member used in JSON serialization was removed during a cleanup pass. All tests passed because they used different enum values.
What GauntletCI flagged
Enum member removal detected as a serialization contract break. Existing stored or transmitted JSON would have failed to deserialize after deploy.
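As a rough illustration, the sketch below assumes the enum is serialized by name with `System.Text.Json` and `JsonStringEnumConverter`; the original PR may have used a different serializer. `OrderStatus`, `OrderDto`, and the removed `Archived` member are hypothetical.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical enum after the cleanup pass: an "Archived" member was removed.
public enum OrderStatus { Pending, Shipped }

public class OrderDto
{
    public OrderStatus Status { get; set; }
}

public static class EnumContractDemo
{
    // Returns true if the payload still deserializes against the current enum.
    public static bool CanRead(string json)
    {
        var options = new JsonSerializerOptions();
        options.Converters.Add(new JsonStringEnumConverter());
        try
        {
            JsonSerializer.Deserialize<OrderDto>(json, options);
            return true;
        }
        catch (JsonException)
        {
            // Stored JSON that names the removed member can no longer be read.
            return false;
        }
    }
}
```

Tests that only use `Pending` or `Shipped` keep passing, while a payload written before the deploy with `"Status":"Archived"` fails to deserialize.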
The change
A null-forgiving operator was added to suppress a compiler warning on a value that could legitimately be null at runtime.
What GauntletCI flagged
Null-forgiving operator misuse flagged. The suppressed warning was masking a real null path that would throw in production on certain query results.
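A minimal C# sketch of the pattern, with hypothetical `User` and `UserQueries` names: the `!` operator silences the compiler but adds no runtime check, so the null path is still there.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

public record User(string Name, string? Email);

public static class UserQueries
{
    // "!" suppresses the nullable warning only. If no user matches,
    // FirstOrDefault returns null and .Name throws NullReferenceException.
    public static string NameSuppressed(IEnumerable<User> users, string email)
    {
        return users.FirstOrDefault(u => u.Email == email)!.Name;
    }

    // Handling the null path explicitly: callers get null instead of a crash.
    public static string? NameChecked(IEnumerable<User> users, string email)
    {
        return users.FirstOrDefault(u => u.Email == email)?.Name;
    }
}
```

Both methods compile without warnings, so the difference only appears for query results that contain no match.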
The change
A retry handler was simplified during cleanup. The author replaced an awaited call with .Result to avoid propagating async through the call chain.
What GauntletCI flagged
Synchronous block on async method detected. Under load this pattern causes thread pool starvation, and in classic ASP.NET, which has a synchronization context, the same call can deadlock outright.
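For illustration, here is a small C# sketch of the two shapes. `RetryHandler` and `FetchAsync` are hypothetical stand-ins, with `Task.Delay` in place of real I/O.

```csharp
using System.Threading.Tasks;

public static class RetryHandler
{
    private static async Task<int> FetchAsync()
    {
        await Task.Delay(10); // stands in for a real network call
        return 42;
    }

    // Blocking shape: the calling thread is parked until the task completes,
    // so it cannot serve other work in the meantime.
    public static int FetchBlocking() => FetchAsync().Result;

    // Awaited shape: the thread is released while the I/O is in flight.
    public static async Task<int> FetchAwaitedAsync() => await FetchAsync();
}
```

Both versions return the same value in a unit test or console app, which is why the change passes review. The cost only appears when many requests block threads at once.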
The change
A public Validate() method was refactored to remove 'redundant' checks. A null guard on the incoming model was removed on the assumption that it was handled upstream.
What GauntletCI flagged
Null guard removal on a public API method detected. Callers passing null now receive a NullReferenceException deep in the validation pipeline instead of a clear ArgumentNullException.
30+ built-in detection rules
Comprehensive coverage across behavioral risk categories.
Behavior & Contracts
Behavior changes without tests, API and serialization changes
Security
SQL injection risks, hardcoded secrets, PII exposure
Data Integrity
Numeric truncation/overflow risks, state mutation issues
Async & Concurrency
Blocking async calls, disposable leaks
Observability
Missing logging, silent failures
Architecture
Structural changes that impact system design
Test Quality
Test coverage gaps, assertion quality
Why teams adopt GauntletCI
Not another tool to manage. A pre-commit gate that closes the gap between "the build is green" and "the change is safe."
Tests keep passing after risky refactors
A guard clause removed, an error path swallowed, a type changed quietly - these changes compile and pass tests. GauntletCI flags them before the commit exists.
Reviewers stop playing detective
Code review time shifts from hunting structural issues to verifying intent. The behavioral and contract risks are already handled before the PR opens.
Nothing leaves the machine by default
Core analysis does not send code to external servers. No account required. Runs entirely on your developer machine. Meets data residency and air-gap requirements. Optional integrations only transmit configured data.
Signal without noise
Rules are scoped to the diff, not the whole codebase. Existing issues in untouched files do not appear. Every finding is directly caused by the current change.
Who this is for
Built for engineers who have seen risky changes ship
GauntletCI is not a general-purpose tool. It is a focused gate for teams who already know the visibility gap exists.
Senior engineers
You have seen risky changes ship after passing tests and review. You want a tool that catches the subtle behavioral shifts that experience teaches you to fear - not a linter that flags line length.
Tech leads and engineering managers
Your team ships fast. Review bandwidth is limited. You want a pre-PR gate that handles structural and behavioral risk so reviewers can focus on intent and design - not hunting regressions in diffs.
CI/CD-mature teams
You already have tests, linters, and security scans. You know the gap: none of them validate what the change actually does at runtime. GauntletCI slots into the pre-commit step you do not have yet.
Teams maintaining large .NET codebases
The more surface area a codebase has, the easier it is to miss a behavioral change in review. GauntletCI is diff-scoped: it only reports what the current change introduced, regardless of codebase size.
Not a fit for:
- Teams looking for a code formatter or style enforcer
- Projects that do not use C# or .NET
- Teams that want AI to write or summarize their code
- Anyone looking for a replacement for tests or code review
Why this exists
A single line caused a production incident
The change was small. One line removed from a service that had been stable for two years. The null check was flagged in review as redundant. Tests passed. The reviewer approved. It shipped on a Friday.
By Monday, support tickets were coming in. The null check was not redundant - callers in three other services relied on the early exception to distinguish between missing input and a downstream failure. Without it, errors surfaced two hops away with no context. The bug took four hours to trace back to that one removed line.
What made it painful was not the incident itself. It was that the risk was visible in the diff the whole time. The check was removed. The callers were not updated. No test covered the new behavior. Every tool in the pipeline had seen the change and said nothing.
"Tests tell you if the code works. Code review tells you if the change looks right. Neither tells you if the change is safe."
GauntletCI was built to close that gap. Not to replace tests or review - both still matter - but to add a layer that answers the question neither one asks: did this change introduce behavior that is not validated anywhere?
The tool runs locally, analyzes only the diff, and surfaces up to three findings before the commit is created. The feedback loop is as tight as possible: you see the risk before it reaches anyone else.
That is the whole idea. Catch the thing that looks fine but is not.
Integrations
GauntletCI plugs into the tools your team already uses, without sending your code anywhere.
CI / CD
- GitHub Actions
Drop-in composite action for pull request analysis. Fails the check, posts inline annotations.
- GitHub Checks
Teams tier: --github-checks posts findings as native Checks on the PR head commit.
- Docker
Official runtime image for self-hosted runners and air-gapped pipelines.
Notifications
- Slack
--notify-slack <webhook> sends a findings summary on every run.
- Microsoft Teams
--notify-teams <webhook> sends an adaptive card to any Teams channel.
Ticket Context
- Jira
Reads the linked Jira ticket from the branch name and attaches it to findings.
- Linear
Reads Linear issue context from the branch name for scope-drift detection.
- GitHub Issues
Fetches the linked GitHub Issue body and attaches it to findings.
Incident Management
- PagerDuty
trace command correlates the deploy diff with live PagerDuty incidents.
- Opsgenie
trace command supports Opsgenie as an alternative incident source.
Coverage & Security
- Codecov
--with-coverage attaches coverage data to findings for context.
- GitHub Security
--sarif emits SARIF output consumed by the GitHub Security tab.
AI Assistants (MCP - Pro tier)
- Claude
gauntletci mcp serve exposes analyze and audit as callable tools.
- Cursor
Ask Cursor to run GauntletCI on the current diff from inside the IDE.
- GitHub Copilot
Copilot Chat can invoke GauntletCI for deterministic risk answers.
- Windsurf
Full MCP tool support: analyze, audit, and rule listing.
Less than 2 minutes from install to audit
Install the tool, run it on your current changes, and see up to 3 high-signal findings. No setup required.
$ dotnet tool install -g GauntletCI
$ gauntletci analyze --staged
$ gauntletci analyze --commit <sha>
$ gauntletci baseline create
Works with .NET 6.0 and later. Supports Windows, macOS, and Linux.
The Honest FAQ for Skeptical Engineers
- "I already have Roslyn, SonarQube, and linters. Why do I need another tool screaming at me?"
- Myth: GauntletCI is a linter.
Reality: Linters care about style. GauntletCI cares about broken assumptions. It only speaks up when the exact lines you changed introduce a behavioral risk your existing tests did not think to check for.
Result: High signal, near-zero noise.
- "If this flags every null check I add, I'm just going to --no-verify it."
- Myth: GauntletCI says "This is broken."
Reality: GauntletCI says "This is unverified." It is a prompt for the reviewer to ask: "Did you mean to change this behavior?" We tune for precision over volume. If a finding is not relevant, suppress it once in `.gauntletci-baseline.json` and it never appears again.
- "Is this just an LLM wrapper making up fake security risks?"
- Myth: AI finds the bugs.
Reality: The detection engine is 100% deterministic Roslyn analysis. It uses a fixed set of 30+ rules to identify changes. The AI (which runs 100% offline, locally) is only used to explain the finding in plain English so juniors do not have to Google the error code.
- "Our CI is already slow. Adding more checks is a non-starter."
- Myth: Analysis makes CI slower.
Reality: GauntletCI runs locally in under one second on just the diff. By catching the risky change before you push, you avoid the "push, wait 15 minutes, CI fails, fix" loop that actually kills velocity.
- "Our Jira tickets are garbage. If this tool relies on tickets, it's doomed."
- Myth: GauntletCI requires perfect tickets.
Reality: If the ticket is vague, GauntletCI flags a Requirement Risk. This forces a healthy conversation: "Hey PM, you said fix the login, but this change touches the payment ledger. Is that intentional?" It surfaces scope creep that usually gets rubber-stamped in review.
Common Questions
- What is Behavioral Change Risk?
- Behavioral Change Risk is the risk that a code change alters runtime behavior in a way that is not clearly intentional, reviewed, or validated. Examples include removed guard clauses, changed branching logic, new exception paths, serialization contract changes, unsafe async changes, and hidden side effects.
- How is Behavioral Change Risk different from a bug?
- A bug is a confirmed defect. Behavioral Change Risk is a pre-merge signal that a diff changed behavior in a way that deserves validation before it becomes a defect. GauntletCI flags the risk while the change is still cheap to fix.
- Why do tests and code review miss Behavioral Change Risk?
- Tests usually verify expected paths. Code review usually checks whether the change looks reasonable. Behavioral Change Risk often appears in the gap between those two signals: a small change that looks safe but alters an untested contract, branch, exception path, or runtime assumption.
- Is GauntletCI trying to replace SonarQube, Snyk, CodeQL, or tests?
- No. GauntletCI complements those tools. Static analysis, SAST, dependency scanning, and tests remain valuable. GauntletCI focuses on the diff itself and asks whether the specific change introduced unvalidated runtime risk.
- Can I add custom detection rules?
- Yes. GauntletCI is open source and built for extension. Implement `IRule` (or extend `RuleBase`), place the file in `src/GauntletCI.Core/Rules/Implementations/`, and write a test. The orchestrator discovers all rules via reflection at startup - no registration step needed. For team-specific rules without code, the `experimental.engineeringPolicy` config lets you define rules in plain markdown, evaluated locally by the LLM. Read the custom rules guide.
- What is diff-based code analysis?
- Diff-based analysis looks only at the lines added or removed in a commit or pull request, not the full codebase. Every finding GauntletCI produces is directly tied to a change you made, not a pre-existing issue in code you never touched.
- How is GauntletCI different from traditional static analysis?
- Traditional static analysis tools scan the whole codebase for known patterns. GauntletCI scans only what changed in the diff and asks whether the change breaks an assumption that your tests may not cover. The scope difference is the key: whole-repo vs. exactly what changed.
- Why do tests miss bugs?
- Tests verify what you expected to happen. They do not verify what you did not expect. A logic change that looks safe can alter a guard clause, shift a branch, or orphan a check that tests never exercised. GauntletCI flags those behavioral changes in the diff before they reach review.
- What is shift-left code analysis?
- Shift-left means moving feedback earlier in the development cycle, closer to when code is written. GauntletCI gives you change-risk feedback before you commit, which eliminates the "push, wait 15 minutes, CI fails, fix" loop entirely.
- How does GauntletCI work with GitHub Actions?
- Add the workflow to your repo. It runs on every pull request, diffs the branch against the base, posts findings as inline review comments on the exact diff lines that triggered them, and exits with code 1 if blocking findings are detected. See the CI/CD Integrations doc for the full YAML.
- Is GauntletCI a Roslyn analyzer?
- The detection engine is built on Roslyn, but GauntletCI is not a Roslyn analyzer in the traditional sense. It does not run during compilation or integrate with the MSBuild diagnostic pipeline. It runs as a separate CLI step against a diff, either pre-commit or in CI.
- Does GauntletCI support local-first AI?
- Yes. The `--with-llm` flag enriches high-confidence findings with a plain-English explanation using a built-in ONNX inference engine running Phi-4 Mini. Run `gauntletci model download` once to cache the model (~2 GB) locally. No API key, no network call at analysis time. The detection itself is always deterministic; the AI only adds context.
- Is GauntletCI a linter, SAST scanner, or test runner?
- No to all three. GauntletCI focuses only on change-risk in your diff -- it does not replace any of these tools and is designed to complement them.
- Not a linter: It does not enforce style, formatting, or naming conventions.
- Not a test runner: It does not execute your tests or measure code coverage.
- Not a SAST scanner: It does not scan your entire codebase for known vulnerability patterns.
- Can I use GauntletCI in an air-gapped environment?
- Yes. The core tool has no external runtime dependencies. The optional LLM feature uses a built-in ONNX engine with a model downloaded once via `gauntletci model download`. After that, no internet access is needed at analysis time. Nothing in GauntletCI phones home.
Local-First. Privacy-Always.
GauntletCI's core analysis runs on your own hardware. By default, no code is uploaded and no network is required. This makes it ideal for air-gapped, data-residency, and zero-trust networks. Optional integrations only transmit data to services you configure.
No Source Code Uploads
Core analysis runs entirely on your local hardware or within your private CI/CD runners. By default, source code and findings stay private.
No LLM Training
We never use your code to train models. LLM enrichment is optional, local by default, and powered by models you control.
Deterministic Core
Built-in detection is pure Roslyn-based analysis. Every finding is reproducible, auditable, and explainable. No probabilistic guesses.
Simple, predictable pricing
Start free. No account required. Pay only when your team needs enforcement, CI integration, or AI enrichment.
Run `gauntletci model download` once to cache the model locally. Setup guide included with license.
Core analysis runs entirely on your machine. No code is sent to any external service by default.
GauntletCI is a local-first Behavioral Change Risk engine for C# and .NET. It analyzes pull request diffs to catch breaking changes, behavioral regressions, and unverified logic shifts that pass tests and code review. Detection is deterministic, diff-scoped, and designed to run in under one second. Optional offline AI explanations run locally and never send your code to an external service.
