Search Is Not Repo Understanding

AI agents can grep files. Real repos need more: behavior-level discovery, exact symbols, runnable next steps, callers, callees, freshness checks, and bounded evidence before an edit.

"It edited the obvious file."

Nearby text is not the same as the right implementation path. Satori gives the agent a way to inspect the code it is about to rely on.

"It missed the caller."

A change can look small until another function depends on it. Satori exposes callers and callees when call graph data is ready.

"It used stale context."

Active repos move while agents work. Satori checks freshness and returns explicit reindex guidance instead of silently trusting old data.

"I cannot tell what it read."

Satori returns file paths, symbols, spans, warnings, and fallback navigation so agent decisions are easier to inspect.

"It found code, then stalled."

Search results can include suggested follow-up actions: read the exact symbol, inspect the outline, or trace callers and callees.

One Flow: Investigate Before Editing

One compact demo is enough: ask for a change, inspect real repo context, then edit with fewer blind spots.

Without Satori

The agent searches a few nearby files, guesses which function matters, edits from partial context, and leaves you to discover what it missed.

With Satori

The agent asks a plain-English behavior question, follows returned next actions, reads the exact implementation span, and checks caller/callee context before proposing an edit.

1. Find relevant code

Plain-English search over indexed code returns grouped results, warnings, freshness state, and runnable next actions.

2. Open the exact symbol

The agent uses deterministic file outlines and symbol spans instead of copying the nearest text chunk into a prompt.

3. Follow callers and callees

When the sidecar is ready, call graph traversal shows surrounding code paths. When it is not ready, Satori returns a runnable navigation fallback.

4. Edit with fresher context

Satori does not promise bug-free changes. It reduces blind edits by giving agents specific repo evidence first.

Built for Devs Using Agents on Real Repos

For indie devs and vibe coders using Claude Code, Codex, MCP clients, or Cursor-style agent workflows where the repo is too large to paste into chat.

You want faster shipping

Keep the agent moving, but make investigation part of the workflow before it touches important files.

Your repo is not a demo

Mixed docs, tests, generated files, stale indexes, and renamed paths are normal. Satori treats those as first-class states.

You need visible evidence

Results point to real files, symbols, line spans, warning states, and exact next tool calls.

You want read-only repo context

Satori does not expose source-code write tools. It gives agents search, navigation, and read evidence; edits stay in your normal editor or host workflow.

You work inside packages

Search can start from a subdirectory inside an indexed repo, while Satori keeps the root identity and navigation paths consistent.

You want a practical trial path

Satori is MIT/open source. Zilliz and VoyageAI may offer free allowances suitable for a trial; provider limits can change.

How It Works

Index once, then let the agent investigate through a small MCP surface: search, outline, graph, read, sync, and lifecycle management.

01 Index repo

Satori chunks code, stores embeddings, builds a symbol registry and relationship sidecars, and tracks fingerprints so incompatible indexes do not look ready.

02 Ask agent

Your MCP-compatible client calls Satori tools instead of relying on ad hoc file search alone.

03 Choose next step

Search results tell the agent whether to read the exact symbol, inspect the outline, or trace caller/callee context next.

04 Edit with guardrails

Requires-reindex, backend timeout, missing registry or relationship sidecar, and noisy-result states return explicit guidance instead of silent degradation.

Reduce Blind Edits

Satori does not just help your agent find code. It helps your agent decide what to inspect next, while staying honest about what static evidence can and cannot prove.

Exact spans

Open the symbol or file range that matters instead of dumping broad context into the model.

Callers/callees

Depth-bounded call graph context helps the agent see nearby code relationships before changing a function.

Agent-ready next steps

Search results include suggested follow-up actions: open the exact symbol, inspect the outline, or trace callers and callees.

Changed-code context

Debug active branches faster with changed files, changed symbols, and direct callers surfaced from call graph metadata.

Graph confidence states

Relationship-backed navigation keeps confidence honest: ambiguous same-name targets are skipped, low-confidence cross-file calls stay constrained, and unsupported graph paths fail closed.

Generated-output checks

When build artifacts appear in context, Satori reminds agents to verify the generated file directly.

Stale-index warnings

Fingerprint gates and freshness decisions stop old indexes from pretending to be current.
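A fingerprint gate can be sketched in a few lines. This is a minimal illustration, not Satori's implementation: the config fields (model, dimension, schema) and the function names are assumptions chosen for the example.

```python
import hashlib
import json


def fingerprint(config):
    """Hash a config dict in a key-order-independent way (hypothetical fields)."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def index_state(stored_fingerprint, runtime_config):
    """Report 'ready' only when the stored index matches the running config."""
    if stored_fingerprint != fingerprint(runtime_config):
        return "requires_reindex"
    return "ready"
```

The point of the gate is that a mismatch is reported as its own state rather than served as if the index were current.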

Deterministic output

Stable sorting, warning codes, and fallback payloads make results easier to replay and debug.

Questions Your Agent Should Ask First

The six MCP tools stay small. The outcomes are what matter during a coding session.

Find relevant code

search_codebase accepts plain-English intent, exact identifiers, and exact terms, scoped to runtime, docs, or mixed. It returns grouped results with recommended actions, structured warnings, capabilities, and navigation hints.

Understand a file

file_outline maps symbols and resolves exact labels or symbolInstanceId values without guessing on ambiguity.

Follow callers

call_graph traverses relationship-backed callers and callees when compatible navigation sidecars are ready for the symbol.

Read exact evidence

read_file opens files, line ranges, or exact symbols with optional outline metadata.

What should I inspect next?

Search results can suggest the next move: open the exact symbol, inspect the outline, or trace callers and callees.

Why did graph navigation degrade?

Sidecar compatibility is explicit. Missing or incompatible relationship navigation returns runnable reindex or fallback guidance instead of guessed graph context.

Check freshness

list_codebases and search freshness states show what is ready, indexing, failed, or requires reindex.

Repair lifecycle state

manage_index handles create, sync, reindex, status, and clear with explicit destructive-action semantics. Mutations are blocked if another live Satori runtime has a different fingerprint, version, or config identity.

Install in One Path First

Start with one repo and one MCP client. Add providers later. Use the CLI installer first; open manual config only when you need to wire a client yourself.

CLI Installer

npx -y @zokizuan/satori-cli@0.4.5 install --client all
npx -y @zokizuan/satori-cli@0.4.5 install --client opencode
npx -y @zokizuan/satori-cli@0.4.5 install --client all --dry-run
npx -y @zokizuan/satori-cli@0.4.5 doctor

The installer writes managed client config and copies the first-party satori workflow skill. The doctor command checks provider and Milvus setup without starting an MCP client.

Repo Index Profiles

npx -y @zokizuan/satori-cli@0.4.5 install --client all --profile minimal

[index]
profile = "minimal"

The installer writes repo-local satori.toml in the current working directory. It is index policy only, not MCP client config and not provider config.

The default profile is safe-broad: it indexes source, docs/text, config, scripts, infra/query files, and known extensionless files. The minimal profile indexes source plus docs/text. The all-text profile adds unknown UTF-8 text files under the size cap.

Every profile still honors .gitignore, .satoriignore, and the hard denylist for secrets, lockfiles, generated output, dependency folders, bundles, logs, database dumps, and snapshots.

Managed Client Targets

Codex

Config: ~/.codex/config.toml
Skills: ~/.codex/skills

Claude

Config: ~/.claude.json
Skills: ~/.claude/skills

OpenCode

Config: ~/.config/opencode/opencode.json
Instructions: ~/.config/opencode/AGENTS.md

Supported clients launch Satori through the installer-owned Node launcher at ~/.satori/bin/satori-mcp.js. Resident MCP config should not use npx or timeout workarounds.

Runtime env names are visible in client config too: Codex gets env_vars plus an optional env template, Claude gets mcpServers.satori.env, and OpenCode gets mcp.satori.environment.

For People Who Want the Details

The landing page stays simple. The technical depth is still here: deterministic ranking, sync-on-read search, fingerprint gates, completion-proof checks, and timeout-safe lifecycle recovery.

Incremental sync

Sync is stat-first and hash-on-change: it checks file metadata before hashing, updates only changed files, reuses changed-file symbol output, preserves unchanged registry state, and recomputes relationships without re-splitting unchanged files. If changed-file indexing stops early, navigation state is cleared instead of publishing a mixed generation.
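The stat-first, hash-on-change idea can be sketched as follows. This is an illustration under assumptions, not Satori's code: the state layout (size, mtime, SHA-256 per path) and the function names are hypothetical.

```python
import hashlib
import os


def file_digest(path):
    """Content hash, computed only when the cheap stat check is inconclusive."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def detect_changes(paths, previous):
    """Return (changed_paths, new_state).

    previous maps path -> (size, mtime_ns, sha256) from the last sync.
    """
    changed, current = [], {}
    for path in paths:
        st = os.stat(path)
        old = previous.get(path)
        if old and old[0] == st.st_size and old[1] == st.st_mtime_ns:
            current[path] = old  # stat unchanged: skip hashing entirely
            continue
        digest = file_digest(path)
        current[path] = (st.st_size, st.st_mtime_ns, digest)
        if old is None or old[2] != digest:
            changed.append(path)  # new file, or content really changed
    return changed, current
```

A file whose timestamp changed but whose content hash is identical is not reported as changed, so it would not need to be re-split.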

Hybrid retrieval

Dense semantic search can be paired with BM25 keyword retrieval and reciprocal-rank fusion, so concepts and exact tokens both matter.
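Reciprocal-rank fusion itself is small enough to show. The sketch below uses the common formulation, where each list contributes 1 / (k + rank) per document. The constant k = 60 is a conventional default, and the function name and tie-break on document id are assumptions for the example, not Satori's settings.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each list contributes 1 / (k + rank) for every document it contains,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense = ["a", "b", "c"]  # hypothetical dense-retrieval order
bm25 = ["c", "a", "d"]   # hypothetical keyword-retrieval order
fused = reciprocal_rank_fusion([dense, bm25])
```

Documents that rank well in both lists rise to the top, which is why concepts and exact tokens can both influence the result.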

Symbol-owned results

Grouped search returns owner symbols first. Matching chunks stay as evidence so agents navigate by implementation units instead of raw fragments.

AST-aware chunks

Supported languages are split around real code structure, so search results are less likely to cut through function or class boundaries.

Scope filtering

scope=runtime keeps implementation discovery first, excludes docs/generated noise, and keeps tests searchable without letting them dominate unless the query has test intent.

Index profiles

Repo-local satori.toml chooses default, minimal, or all-text. It controls what enters the index, while search scope controls what results are queried.

Honest capability states

TypeScript, JavaScript, and Python are the production-ready call_graph languages. Go and Rust are symbol_only: file outlines can work, but graph traversal returns unsupported_language.

SQLite parity gate

JSON navigation sidecars stay canonical. Explicit SQLite serving is allowed only after proving parity with the JSON registry and relationship sidecars.

Relationship v0 sidecar

Completed indexes store conservative calls plus TS/JS relative imports and exports. call_graph uses these records directly, promotes low-confidence cross-file calls only when import/export-supported evidence points to the target, and skips ambiguous same-name targets.
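The skip-on-ambiguity rule combined with a depth bound can be sketched like this. The data shapes (a map of caller id to callee names, and a map of name to candidate symbol ids) are assumptions for illustration, and the sketch does not model the import/export-based promotion of cross-file calls.

```python
from collections import deque


def callees_within(start, calls, symbols_by_name, max_depth):
    """Breadth-first callee traversal, bounded by max_depth.

    calls: caller symbol id -> list of callee names (hypothetical shape)
    symbols_by_name: name -> list of candidate symbol ids
    Returns (reached, skipped): reached is [(symbol_id, depth)],
    skipped is [(caller_id, name)] for ambiguous same-name targets.
    """
    seen = {start}
    reached, skipped = [], []
    queue = deque([(start, 0)])
    while queue:
        symbol_id, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth bound reached: do not expand further
        for name in calls.get(symbol_id, []):
            targets = symbols_by_name.get(name, [])
            if not targets:
                continue  # unresolved name: nothing to follow
            if len(targets) > 1:
                skipped.append((symbol_id, name))  # ambiguous: never guess
                continue
            target = targets[0]
            if target not in seen:
                seen.add(target)
                reached.append((target, depth + 1))
                queue.append((target, depth + 1))
    return reached, skipped
```

Returning the skipped edges alongside the reached symbols keeps the gap visible to the caller instead of hiding it.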

Capability-aware defaults

Cloud and local providers use different search budgets and rerank policies, so Ollama-style local workflows stay responsive.

Cloud completion proof

Cloud collections repair local ready state only after marker, path, and fingerprint validation.

Timeout-safe lifecycle

Indeterminate Milvus/Zilliz delete or validation timeouts are returned as backend state, not repo failure.

Path-scoped live evidence

Searches that use an exact path: filter can supplement dirty tracked files with bounded live reads, so fresh regression lines are not hidden by stale vector chunks.

Deterministic tie-break chain:
score desc -> file asc -> start_line asc -> symbol label asc -> symbol id asc
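The chain above maps directly onto a composite sort key. A minimal sketch, assuming a hypothetical Hit record whose field names mirror the chain:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    # Hypothetical result record; field names mirror the tie-break chain.
    score: float
    file: str
    start_line: int
    symbol_label: str
    symbol_id: str


def rank(hits):
    """score desc -> file asc -> start_line asc -> symbol label asc -> symbol id asc"""
    return sorted(
        hits,
        key=lambda h: (-h.score, h.file, h.start_line, h.symbol_label, h.symbol_id),
    )


hits = [
    Hit(0.90, "b.ts", 10, "run", "s2"),
    Hit(0.90, "a.ts", 20, "run", "s1"),
    Hit(0.90, "a.ts", 5, "init", "s3"),
    Hit(0.95, "z.ts", 1, "main", "s4"),
]
ordered_ids = [h.symbol_id for h in rank(hits)]
```

Because every field takes part in the key, the same set of hits produces the same order regardless of the order in which they arrived.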

The architecture page keeps the deeper system view: open architecture.html.

Failure States Are Explicit

If context is not safe to trust, Satori says so and gives the next recovery step.

Requires Reindex

Any requires_reindex response includes hints.reindex with the exact path to repair before retrying the original call.

manage_index({ action: "reindex", path: <hints.reindex.args.path> })
// then retry the original tool call
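On the client side, that recovery step could be wrapped as below. This is a sketch under assumptions: call_tool is a hypothetical function that sends one MCP tool call and returns a dict, and the "status" field name is assumed. Only the hints.reindex.args.path shape comes from the text above.

```python
def call_with_reindex(call_tool, name, args):
    """Run a tool call; on requires_reindex, repair the index and retry once.

    call_tool(name, args) -> dict is a hypothetical MCP client wrapper.
    """
    result = call_tool(name, args)
    if result.get("status") != "requires_reindex":
        return result
    # Use the exact path the response asks for, not a guessed one.
    path = result["hints"]["reindex"]["args"]["path"]
    call_tool("manage_index", {"action": "reindex", "path": path})
    return call_tool(name, args)  # single retry of the original call
```

Retrying only once keeps the wrapper from looping if reindexing does not resolve the state.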

Runtime Owner Conflict

If manage_index returns reason="runtime_owner_conflict", restart all Satori MCP clients so only one runtime fingerprint/config identity is active, then retry the mutation. Satori MCP tools do not kill processes or ask interactive cleanup questions.

Noisy Results

If hints.noiseMitigation appears, apply the suggested .satoriignore patterns, wait one debounce window when provided, then rerun search.

Generated Output in Context

If generated output such as dist, build, or .output appears in search context, hints.verification.generatedArtifacts points to direct artifact reads. Source matches do not prove generated output is current.

Call Graph Not Ready

If call_graph returns not_ready, search results still include recommendedNextAction, capabilities, and navigationFallback with a runnable read span and optional file outline window.

Partial Index

If a full index reaches a limit, search may still return partial chunks, but Satori warns that results may be incomplete and that navigation sidecars were not published as complete.

Backend Timeout

If Zilliz/Milvus times out during collection validation or deletion, Satori returns retryable backend guidance and preserves local state unless remote absence is verified.

Collection Limit Reached

The Zilliz free tier caps the number of collections. When the cap is reached, manage_index create can return guidance that names a collection to drop; retry with zillizDropCollection.

Ready for Public Launch

Satori is packaged for a public Product Hunt-style launch: one-command setup, current npm versions, launch metadata, and a clear read-only promise for developers evaluating agent tooling.

Product Hunt tagline

Repo retrieval for MCP coding agents.

Launch promise

Help agents inspect real code before editing: semantic search, exact symbols, caller/callee context, bounded reads, and lifecycle guidance.

Public setup path

satori-cli install --client all owns supported client config so users do not copy runtime cache paths or add long startup timeout workarounds.

Launch checklist

See docs/LAUNCH_CHECKLIST.md for Product Hunt copy, preflight commands, website checks, and launch-day monitoring.

Satori

Repo-aware retrieval for MCP coding agents.