ai-summary

Web search & summarization CLI for AI coding agents

Reduces token consumption by compressing web content through local LLMs before feeding it to Claude Code.

curl -fsSL https://ai-summary.agent-tools.org/install.sh | sh

How It Works

Pipeline:

Web Search → Fetch Pages → Readability Extract → LLM Summary → Compressed output (60–98% smaller)

Instead of sending 50K+ of raw page content to Claude, ai-summary returns a focused 1–4K summary.
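Conceptually, the pipeline is a chain of shrinking stages. A minimal Python sketch with hypothetical stand-in functions (these are not ai-summary's internals, and the truncation step merely stands in for a real LLM summary):

```python
import re

def readability_extract(html: str) -> str:
    """Stand-in for a Readability pass: strip tags, keep article text."""
    return re.sub(r"<[^>]+>", "", html).strip()

def llm_summarize(text: str, max_chars: int = 4000) -> str:
    """Stand-in for a local-LLM summary; here we simply truncate."""
    return text[:max_chars]

def summarize_page(html: str) -> str:
    return llm_summarize(readability_extract(html))

page = "<html><body><p>" + "Rust 1.x release notes. " * 2000 + "</p></body></html>"
summary = summarize_page(page)
print(len(page), "->", len(summary))   # the summary is a small fraction of the page
```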

Quick Start

ai-summary "what is the latest Rust version"
ai-summary fetch https://example.com/article -p "key points"
echo "large text" | ai-summary compress -m 4000
ai-summary github "error handling" -r tokio-rs/tokio -l rust
ai-summary github owner/repo src/main.rs -p "explain this"
ai-summary repo user/repo -p "explain the architecture"
ai-summary stats

Subcommands

Flags: --deep, --raw, --json, --browser, --cf, --api-url, --api-key, --model

Claude Code Integration

One-command setup

ai-summary init                # Install prompt + hook
ai-summary init --with-repomix # Also install repomix (for repo command)
ai-summary init --uninstall    # Remove

Installs prompt injection (so Claude and its subagents use ai-summary), a Bash hook (test-command output is compressed), and WebFetch/WebSearch hooks (one-time education per session).

Without hook: cargo test → 3000 tokens (raw)
With hook:    cargo test → 15 tokens (compressed)
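On the numbers above, that is a 99.5% token reduction (simple arithmetic using only the counts quoted here):

```python
raw, compressed = 3000, 15        # token counts from the example above
reduction = 1 - compressed / raw
print(f"{reduction:.1%}")         # prints 99.5%
```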

Tee mode

Failed commands save full output to /tmp/ai-summary-tee/ — AI can read the raw log if the summary isn't enough.
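A rough Python sketch of the tee idea (the directory comes from the text above; the filename and the one-line "summary" step are made up for illustration):

```python
from pathlib import Path

log_dir = Path("/tmp/ai-summary-tee")
log_dir.mkdir(exist_ok=True)

# Stand-in for a failed command's full output.
full_output = "error[E0308]: mismatched types\nnote: expected `u32`\n"
(log_dir / "demo.log").write_text(full_output)   # full raw log, kept for later
summary = full_output.splitlines()[0]            # stand-in for the compressed summary
print(summary)
```

The AI reads only `summary`; the raw log stays on disk if more detail is needed.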

Benchmarks (v2.6.0)

Real-world evaluation run on 2026-03-14. All tests use default config with free LLM backends.

| Scenario | Input | Output | Compression | Time |
|---|---|---|---|---|
| Repo (ai-summary, 15 .rs files) | 59.1K chars | 1.6K chars | 97% | 59s |
| Repo (repomix, src/**/*.ts) | 182.3K chars | 1.7K chars | 99% | 36s |
| Fetch (docs.rs/reqwest) | ~1K tokens | ~516 tokens | 51% | 10s |
| Fetch (react.dev/learn) | ~1K tokens | ~308 tokens | 69% | 10s |
| Fetch (Wikipedia/Rust) | ~1K tokens | ~378 tokens | 62% | 10s |
| Fetch (Hacker News) | ~938 tokens | ~262 tokens | 72% | 10s |

Cumulative Stats (166 queries)

| Metric | Value |
|---|---|
| Total tokens saved | 300,700 |
| Overall compression | 84% |
| Estimated Claude cost saved | $0.90 (at $3/M input tokens) |
| LLM cost | $0.18 (mostly free backends) |
| ROI | 5x return |
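The cost figures are internally consistent; a quick arithmetic check using only numbers from the table:

```python
tokens_saved = 300_700
price_per_m_input = 3.00      # $/M input tokens, from the table
llm_cost = 0.18               # $ spent on LLM backends

claude_saved = tokens_saved / 1_000_000 * price_per_m_input
roi = claude_saved / llm_cost
print(f"${claude_saved:.2f} saved, {roi:.0f}x return")   # prints $0.90 saved, 5x return
```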

By Mode

| Mode | Queries | Tokens Saved | Avg Compression |
|---|---|---|---|
| gemini-cli (search) | 58 | 171,600 | 85% |
| repo (repomix) | 2 | 59,500 | 99% |
| fetch (single URL) | 93 | 48,700 | 76% |
| compress (no LLM) | 4 | 10,100 | 83% |
| gemini API | 2 | 6,100 | 88% |
| stdin pipe | 5 | 1,700 | 71% |
| hook-bash (auto) | 2 | 867 | 30% |

Repo mode achieves 97–99% compression by combining repomix's Tree-sitter extraction with LLM summarization. Compress mode uses no LLM at all: pure text extraction at near-instant speed.
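Compress mode's exact heuristics are not specified here beyond "pure text extraction"; one plausible LLM-free sketch (deduplicating lines and capping length are assumptions for illustration, not ai-summary's documented algorithm):

```python
def compress(text: str, max_chars: int = 4000) -> str:
    """LLM-free compression sketch: drop blank and duplicate lines, then cap length."""
    seen, kept = set(), []
    for line in text.splitlines():
        line = line.strip()
        if line and line not in seen:
            seen.add(line)
            kept.append(line)
    return "\n".join(kept)[:max_chars]

log = "warning: unused variable\n" * 500 + "error: mismatched types\n"
print(compress(log))   # only the two unique lines survive
```

Deduplication alone explains why repetitive build logs compress so well without any model.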

Installation

Quick install (recommended)

curl -fsSL https://ai-summary.agent-tools.org/install.sh | sh

Downloads a prebuilt binary for your platform (macOS/Linux, x86_64/aarch64). Set AI_SUMMARY_INSTALL_DIR to customize the install path (default: ~/.local/bin).

From crates.io

cargo install ai-summary

From source

git clone https://github.com/agent-tools-org/ai-summary
cd ai-summary
cargo install --path .

Releases: GitHub Releases