Local-first context cleaner for AI agents

Stop paying LLMs to read raw HTML, repeated logs, and secret-shaped noise.

ContextClean turns noisy files, web exports, terminal output, and project folders into compact, reviewable, token-budgeted context packs with reports that explain the savings.

$ ctxclean report build.log --max-tokens 8000

ContextClean Report
input_tokens:       1,284
output_tokens:        418
tokens_saved:         866
compression_ratio:  0.326

biggest_noise_sources:
- repetition: 312 tokens (repeated log lines)
- stack_trace: 93 tokens (duplicate stack frames)
- log_noise: 38 tokens (safe install/build noise)

recommended_command:
ctxclean build.log --max-tokens 8000 --format markdown
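
The figures in the sample report are consistent with compression_ratio being output tokens divided by input tokens (418 / 1,284 ≈ 0.326). A minimal sketch of that arithmetic, assuming this definition; the `summarize` helper is hypothetical and not part of ctxclean:

```python
def summarize(input_tokens, output_tokens):
    """Derive the savings figures from two token counts.

    Assumes compression_ratio = output_tokens / input_tokens, which matches
    the sample report above but is an inference, not a documented formula.
    """
    if input_tokens <= 0:
        raise ValueError("input_tokens must be positive")
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "tokens_saved": input_tokens - output_tokens,
        "compression_ratio": round(output_tokens / input_tokens, 3),
    }


report = summarize(1284, 418)
print(report["tokens_saved"], report["compression_ratio"])  # 866 0.326
```

Under this reading, a lower ratio means a smaller output relative to the input.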

Fixture-backed excerpts

From noisy HTML and logs to model-ready context

HTML Before

<script>window.analytics = true;</script>
<nav>Home Pricing Login Cookie preferences</nav>
<div id="cookie-banner">Accept all cookies</div>
<main>
  <h1>API Setup &amp; Troubleshooting</h1>
  <p>Read <a href="/docs/setup">setup guide</a>.</p>
  <table><tr><th>Mode</th><th>Keeps</th></tr></table>
</main>
<footer>Newsletter signup</footer>

HTML After

# API Setup & Troubleshooting

Read [setup guide](/docs/setup).

| Mode | Keeps |
| --- | --- |

Tokens saved: 211
Reduction: 65.7%
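
To make the transformation above concrete, here is a rough sketch of the idea using Python's standard `html.parser`. It is not ContextClean's implementation: the `MiniCleaner` class, the `SKIP_TAGS` set, and the "id contains cookie" rule are assumptions for illustration, and it handles only h1, p, and a elements (tables are left out for brevity).

```python
from html.parser import HTMLParser

SKIP_TAGS = {"script", "style", "nav", "footer"}  # assumed noise tags


class MiniCleaner(HTMLParser):
    def __init__(self):
        super().__init__()      # convert_charrefs=True decodes &amp; to &
        self.skip_tag = None    # tag name of the noisy element being dropped
        self.skip_depth = 0     # nesting depth of that same tag name
        self.blocks = []        # finished Markdown blocks
        self.buf = []           # text of the block being built
        self.href = None        # target of the link being built, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.skip_tag:
            if tag == self.skip_tag:
                self.skip_depth += 1
            return
        if tag in SKIP_TAGS or "cookie" in (attrs.get("id") or ""):
            self.skip_tag, self.skip_depth = tag, 1
        elif tag == "h1":
            self.buf = ["# "]
        elif tag == "p":
            self.buf = []
        elif tag == "a":
            self.href = attrs.get("href") or ""
            self.buf.append("[")

    def handle_endtag(self, tag):
        if self.skip_tag:
            if tag == self.skip_tag:
                self.skip_depth -= 1
                if self.skip_depth == 0:
                    self.skip_tag = None
            return
        if tag == "a" and self.href is not None:
            self.buf.append(f"]({self.href})")
            self.href = None
        elif tag in ("h1", "p"):
            self.blocks.append("".join(self.buf).strip())
            self.buf = []

    def handle_data(self, data):
        if not self.skip_tag:
            self.buf.append(data)


def clean(html):
    parser = MiniCleaner()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)


sample = """<script>window.analytics = true;</script>
<nav>Home Pricing Login Cookie preferences</nav>
<div id="cookie-banner">Accept all cookies</div>
<main>
  <h1>API Setup &amp; Troubleshooting</h1>
  <p>Read <a href="/docs/setup">setup guide</a>.</p>
</main>
<footer>Newsletter signup</footer>"""

print(clean(sample))
```

On this sample the sketch keeps the heading and the paragraph with its link, and drops the script, nav, cookie banner, and footer.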

Log Before

added 481 packages
found 0 vulnerabilities
2026-07-04T10:00:01Z warn Connection timeout to database
2026-07-04T10:00:02Z warn Connection timeout to database
2026-07-04T10:00:03Z warn Connection timeout to database
FAIL packages/api/user.test.ts
Unique failure:
TypeError: Cannot read properties of undefined
    at loadUser (/app/src/user.ts:42:13)
    at main (/app/src/main.ts:8:1)
    at loadUser (/app/src/user.ts:42:13)
    at main (/app/src/main.ts:8:1)
Final error summary: request failed after retries

Log After

[Repeated 3 times from 2026-07-04T10:00:01Z to 2026-07-04T10:00:03Z] Connection timeout to database
FAIL packages/api/user.test.ts
TypeError: Cannot read properties of undefined
    at loadUser (/app/src/user.ts:42:13)
    at main (/app/src/main.ts:8:1)
[Collapsed stack frames: 2 duplicate frames removed]
Final error summary: request failed after retries

Tokens saved: 35
Reduction: 27.3%
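
The repeated-line collapse shown above can be sketched as follows. This is an illustration rather than ContextClean's code: the `TS_LINE` pattern and `collapse_repeats` function are hypothetical, and the sketch assumes each repeated line starts with an ISO-8601 UTC timestamp followed by a level word.

```python
import re

# Assumed line shape: "<timestamp>Z <level> <message>"
TS_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+\w+\s+(.*)$")


def collapse_repeats(lines):
    """Merge consecutive timestamped lines that carry the same message."""
    runs = []  # each run: [message, first_ts, last_ts, count, raw_line]
    for line in lines:
        m = TS_LINE.match(line)
        if m and runs and runs[-1][0] == m.group(2):
            runs[-1][2] = m.group(1)   # extend the current run
            runs[-1][3] += 1
        elif m:
            runs.append([m.group(2), m.group(1), m.group(1), 1, line])
        else:
            runs.append([None, None, None, 1, line])  # pass through untouched

    out = []
    for message, first, last, count, raw in runs:
        if count > 1:
            out.append(f"[Repeated {count} times from {first} to {last}] {message}")
        else:
            out.append(raw)
    return out


log = [
    "2026-07-04T10:00:01Z warn Connection timeout to database",
    "2026-07-04T10:00:02Z warn Connection timeout to database",
    "2026-07-04T10:00:03Z warn Connection timeout to database",
    "FAIL packages/api/user.test.ts",
]
for line in collapse_repeats(log):
    print(line)
```

Only consecutive repeats are merged, so the first and last timestamps of each run stay visible and the order of events is preserved.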

Cleaner + Crusher + Reporter

Built for the context AI agents actually need

HTML Cleaner

Remove scripts, styles, navs, footers, comments, cookie banners, modals, ads, and tracking blocks.

Preserve Structure

Keep headings, links, paragraphs, tables, lists, inline code, and fenced code blocks readable.

Log Crusher

Collapse repeated lines and duplicate stack frames while preserving failed tests, timestamps, and final errors.
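
Duplicate-frame collapsing can be pictured with a short sketch. The `collapse_frames` function below is hypothetical and assumes JavaScript-style frames that begin with "at " after indentation; ContextClean's actual rules may differ.

```python
def collapse_frames(lines):
    """Drop stack frames already seen in the current trace and note how many."""
    out, seen, removed = [], set(), 0
    for line in lines:
        if line.lstrip().startswith("at "):
            if line in seen:
                removed += 1      # duplicate frame: drop it
                continue
            seen.add(line)
        else:
            # A non-frame line ends the trace: emit the summary, reset state.
            if removed:
                out.append(f"[Collapsed stack frames: {removed} duplicate frames removed]")
            seen, removed = set(), 0
        out.append(line)
    if removed:  # trace ran to the end of the input
        out.append(f"[Collapsed stack frames: {removed} duplicate frames removed]")
    return out


trace = [
    "TypeError: Cannot read properties of undefined",
    "    at loadUser (/app/src/user.ts:42:13)",
    "    at main (/app/src/main.ts:8:1)",
    "    at loadUser (/app/src/user.ts:42:13)",
    "    at main (/app/src/main.ts:8:1)",
    "Final error summary: request failed after retries",
]
for line in collapse_frames(trace):
    print(line)
```

The error message and the final summary line pass through unchanged; only repeated frames are removed.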

Budget Packer

Use exact token counting, --max-tokens, and model presets for GPT-, Claude-, and Gemini-sized contexts.
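
One way to picture budget packing is a greedy pass over sections in priority order. The sketch below is an assumption about the general technique, not ContextClean's algorithm: `pack` is a hypothetical helper, and `approx_tokens` is a crude whitespace counter standing in for exact tokenization.

```python
def approx_tokens(text):
    # Stand-in counter: whitespace-separated words. A real tokenizer
    # would give different, model-specific counts.
    return len(text.split())


def pack(sections, max_tokens, count=approx_tokens):
    """Keep sections, in the order given, while they fit the token budget."""
    kept, used = [], 0
    for section in sections:
        cost = count(section)
        if used + cost > max_tokens:
            continue  # too large for the remaining budget: skip it
        kept.append(section)
        used += cost
    return kept, used


kept, used = pack(["a b c", "d e f g", "h"], max_tokens=4)
print(kept, used)  # ['a b c', 'h'] 4
```

Skipping rather than stopping at the first oversized section lets smaller, lower-priority sections still use the leftover budget.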

Context Reports

Run ctxclean report to see input tokens, output tokens, tokens saved, compression ratio, the biggest noise sources, and the recommended command to run next.

Protect

Redact secret-like values by default, respect ignore files, skip generated paths, and require explicit opt-in before reading sensitive files.
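
Redaction of secret-like values can be approximated with pattern matching. The patterns below are illustrative assumptions, not ContextClean's rule set: one matches assignments whose key name contains a word such as secret, token, password, or api_key, and the other matches the AKIA-prefixed shape of AWS access key IDs.

```python
import re

# Hypothetical patterns for illustration only.
ASSIGNMENT = re.compile(
    r"\b([\w-]*(?:secret|token|password|api[_-]?key)[\w-]*)(\s*[:=]\s*)(\S+)",
    re.IGNORECASE,
)
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def redact(text):
    """Replace secret-looking values while keeping key names readable."""
    text = ASSIGNMENT.sub(r"\1\2[REDACTED]", text)
    return AWS_KEY_ID.sub("[REDACTED]", text)


print(redact("API_KEY=abc123 user=bob"))  # API_KEY=[REDACTED] user=bob
```

Keeping the key name and dropping only the value leaves the context reviewable: a reader can still see that a credential was present without seeing it.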

Run locally

No API keys. No telemetry. No cloud dependency.

cargo install --path crates/contextclean-cli

ctxclean fixtures/dirty_html_small.html --fit gpt-4.1 --format json
ctxclean report fixtures/ci_failure_log.txt --max-tokens 8000

Trust model

Context stays on your machine.