Local-first context cleaner for AI agents

Stop paying LLMs to read raw HTML, repeated logs, and secret-shaped noise.

ContextClean turns noisy files, web exports, terminal output, and project folders into compact, reviewable, token-budgeted context packs.

$ ctxclean dirty.html --mode standard --format json

input_tokens:       124
output_tokens:       37
tokens_saved:        87
reduction:        70.1%

removed:
- script/style blocks
- nav/footer boilerplate
- cookie and newsletter noise

preserved:
- main article
- TypeError at src/app.ts:42

Verified fixture result

From noisy scrape to model-ready context

Before

<script>window.analytics = true;</script>
<nav>Home Pricing Login Cookie preferences</nav>
<main>
  <h1>ContextClean keeps the main article</h1>
  <p>Unique failure: TypeError at src/app.ts:42.</p>
</main>
<footer>Newsletter signup</footer>

After

ContextClean keeps the main article

Unique failure: TypeError at src/app.ts:42.

Tokens saved: 87
Reduction: 70.1%

Phase 1 foundation

Built like a real developer tool from day one

Clean

Remove scripts, styles, navs, footers, comments, cookie banners, and repeated lines.
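As a rough illustration of the cleaning step, the Rust sketch below strips whole element blocks and drops repeated lines using only the standard library. The names `strip_blocks` and `drop_repeated_lines` are hypothetical, and the naive string scan is an assumption made for illustration, not ContextClean's actual implementation.

```rust
use std::collections::HashSet;

/// Remove every `<tag ...>...</tag>` block from `html`, ASCII case-insensitively.
/// Hypothetical helper: `tag` must be lowercase, and this is a naive scan rather
/// than an HTML parser (for example, "nav" would also match `<navbar>`).
fn strip_blocks(html: &str, tag: &str) -> String {
    // ASCII lowercasing keeps byte offsets aligned with the original string.
    let lower = html.to_ascii_lowercase();
    let open = format!("<{}", tag);
    let close = format!("</{}>", tag);
    let mut out = String::new();
    let mut pos = 0;
    while let Some(found) = lower[pos..].find(&open) {
        let start = pos + found;
        out.push_str(&html[pos..start]);
        match lower[start..].find(&close) {
            Some(end) => pos = start + end + close.len(),
            None => {
                // Unclosed block: drop everything after the opening tag.
                pos = html.len();
                break;
            }
        }
    }
    out.push_str(&html[pos..]);
    out
}

/// Keep the first occurrence of each non-blank line; blank lines always survive.
fn drop_repeated_lines(text: &str) -> String {
    let mut seen = HashSet::new();
    let mut kept = Vec::new();
    for line in text.lines() {
        let key = line.trim();
        if key.is_empty() || seen.insert(key) {
            kept.push(line);
        }
    }
    kept.join("\n")
}
```

Calling `strip_blocks` once per tag (`"script"`, `"style"`, `"nav"`, `"footer"`) and then `drop_repeated_lines` on the result approximates the behavior described above. A real cleaner would more likely work on a parsed document tree.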

Pack

Fit output into an estimated token budget with explicit truncation metadata.
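One way to picture budgeted packing is the sketch below, which keeps whole lines in order until the next line would exceed the budget and records what was cut. The `Packed` struct, its field names, and the four-characters-per-token estimate are all assumptions for illustration; the tool's real estimator and metadata schema may differ.

```rust
/// Hypothetical result type; field names are illustrative, not ContextClean's schema.
#[derive(Debug, PartialEq)]
struct Packed {
    text: String,
    estimated_tokens: usize,
    truncated: bool,
    dropped_lines: usize,
}

/// Rough estimate of about four characters per token (an assumption), rounded up.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Keep whole lines, in order, until the next line would exceed `budget`.
fn pack(text: &str, budget: usize) -> Packed {
    let lines: Vec<&str> = text.lines().collect();
    let mut kept: Vec<&str> = Vec::new();
    let mut used = 0;
    for &line in &lines {
        let cost = estimate_tokens(line);
        if used + cost > budget {
            break; // stop at the first line that does not fit
        }
        used += cost;
        kept.push(line);
    }
    Packed {
        text: kept.join("\n"),
        estimated_tokens: used,
        truncated: kept.len() < lines.len(),
        dropped_lines: lines.len() - kept.len(),
    }
}
```

Cutting on line boundaries keeps error messages and stack frames intact, and the `truncated` and `dropped_lines` fields make the loss visible to whoever reviews the pack.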

Report

Emit JSON with input and output token counts, tokens saved, removed sections, warnings, and source.
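The report arithmetic can be sketched as follows, using the fixture numbers shown earlier (124 input tokens, 37 output tokens). The `Report` struct, the `reduction_percent` key, and the hand-rolled JSON writer are assumptions for illustration. Truncating to one decimal place reproduces the 70.1% figure (87 / 124 is about 70.16%); whether the tool itself truncates or rounds is not stated here.

```rust
/// Hypothetical report shape; keys loosely mirror the sample output above.
struct Report<'a> {
    source: &'a str,
    input_tokens: u64,
    output_tokens: u64,
    removed: Vec<&'a str>,
    warnings: Vec<&'a str>,
}

/// Minimal JSON string escaping: quotes, backslashes, and control characters.
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

fn json_array(items: &[&str]) -> String {
    let parts: Vec<String> = items.iter().map(|s| json_string(s)).collect();
    format!("[{}]", parts.join(","))
}

impl Report<'_> {
    fn tokens_saved(&self) -> u64 {
        self.input_tokens.saturating_sub(self.output_tokens)
    }

    /// Reduction in tenths of a percent, truncated (701 means 70.1%).
    fn reduction_tenths(&self) -> u64 {
        if self.input_tokens == 0 {
            0
        } else {
            self.tokens_saved() * 1000 / self.input_tokens
        }
    }

    fn to_json(&self) -> String {
        let t = self.reduction_tenths();
        format!(
            "{{\"source\":{},\"input_tokens\":{},\"output_tokens\":{},\"tokens_saved\":{},\"reduction_percent\":{}.{},\"removed\":{},\"warnings\":{}}}",
            json_string(self.source),
            self.input_tokens,
            self.output_tokens,
            self.tokens_saved(),
            t / 10,
            t % 10,
            json_array(&self.removed),
            json_array(&self.warnings)
        )
    }
}
```

Integer arithmetic avoids floating-point formatting surprises in the report. In a real crate, a serialization library would normally replace the manual string building.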

Protect

Redact secret-like values by default and respect ignore files during directory scans.
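A minimal sketch of secret redaction, assuming a simple `name=value` or `name:value` heuristic: values are masked when the name contains a sensitive-looking word. The keyword list, the `[REDACTED]` marker, and the function names are hypothetical; ContextClean's actual detection rules are not described here, and ignore-file handling is out of scope for this example.

```rust
/// Hypothetical keyword list; a real detector would likely use richer patterns.
const SENSITIVE: [&str; 4] = ["key", "token", "secret", "password"];

fn is_sensitive(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    SENSITIVE.iter().any(|word| lower.contains(word))
}

/// Mask the value in `name=value` or `name:value` words whose name looks sensitive.
/// Limitation: a value separated from its name by a space is not caught, and
/// substring matching errs toward over-redaction (e.g. `monkey=3` is masked).
fn redact_line(line: &str) -> String {
    line.split(' ')
        .map(|word| match word.split_once(|c: char| c == '=' || c == ':') {
            Some((name, value)) if !value.is_empty() && is_sensitive(name) => {
                // The separator is a single ASCII byte right after the name.
                let sep = &word[name.len()..name.len() + 1];
                format!("{}{}[REDACTED]", name, sep)
            }
            _ => word.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

With this heuristic, `API_KEY=sk-12345 retries=3` becomes `API_KEY=[REDACTED] retries=3`, while a diagnostic such as `TypeError at src/app.ts:42` passes through unchanged because `src/app.ts` does not look like a secret name.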

Run locally

No API keys. No telemetry. No cloud dependency.

cargo install --path crates/contextclean-cli

ctxclean fixtures/dirty_html_small.html --format json

Trust model

Context stays on your machine.