# Refine CLI

Refine is an open-source CLI that finds contradictions, staleness, and drift in your docs -- locally.
## Commands

| Command | Description |
|---|---|
| `ody-refine <dir>` | Scan a directory (ingest + detect + report) |
| `ingest <dir>` | Ingest markdown, PDF, or text files |
| `detect` | Re-run detectors on previously ingested data |
| `resolve` | Interactive TUI to triage and fix findings |
| `export` | Export findings as JSONL, tickets, or training data |
| `report` | Generate or regenerate the HTML health report |
| `ci` | CI mode -- exit non-zero if the health score is below a threshold |
| `scan <url>` | Crawl and scan public docs from a URL |
| `connect <source>` | Connect to Slack, Notion, Confluence, etc. |
| `diff` | Compare the current scan against a previous one |
| `badge` | Generate an SVG health badge for your README |
| `status` | Show knowledge graph stats |
| `config` | Display or edit configuration |
| `optimize` | Auto-tune detector thresholds based on your feedback |
## Detectors

Refine ships with 5 detectors plus consultant analysis and consensus voting:
- Contradiction detector -- finds conflicting statements across documents
- Staleness detector -- catches outdated claims
- Time bomb detector -- finds expired deadlines and promises ("by Q3 2024")
- Duplicate detector -- identifies redundant content
- Drift detector -- detects divergence between related documents
Consensus voting runs multiple passes for consistent, low-noise results.
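For illustration, a contradiction finding could be reported as a record like the one below. The field names here are hypothetical, not Refine's exact schema:

```json
{
  "type": "contradiction",
  "severity": "critical",
  "claim_a": { "file": "docs/api.md", "text": "The rate limit is 100 requests/min." },
  "claim_b": { "file": "docs/faq.md", "text": "The rate limit is 60 requests/min." },
  "confidence": 0.92
}
```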
## Resolve

The interactive TUI lets you triage findings one by one:

```shell
ody-refine resolve
```
For each finding you can accept, reject, suppress, or annotate. Every resolution creates a preference pair -- training data for Forge.
## Export

Export resolved findings in multiple formats:

```shell
# DPO-format JSONL for model fine-tuning (TRL-compatible)
ody-refine export --format trl

# General JSONL
ody-refine export --format jsonl
```
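TRL's DPO trainer expects JSONL records with `prompt`, `chosen`, and `rejected` fields. Refine's exact output may include more fields, but a record in that format looks like:

```json
{"prompt": "Which statement about the rate limit is correct?", "chosen": "The rate limit is 60 requests/min.", "rejected": "The rate limit is 100 requests/min."}
```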
## CI/CD

Fail builds when docs health drops below a threshold:

```yaml
# .github/workflows/docs-health.yml
- name: Check docs health
  run: npx ody-refine ci --min-health 80
```

Generate a badge for your README:

```shell
ody-refine badge -o docs/health-badge.svg
```
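Once the SVG is generated and committed, it can be embedded with standard markdown image syntax (the path below matches the `-o` flag above):

```markdown
![Docs health](docs/health-badge.svg)
```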
## GitHub Action

Use the Ody Refine GitHub Action to automatically scan docs on every PR and post a health report as a comment:

```yaml
# .github/workflows/docs-health.yml
name: Docs Health
on: [pull_request]
jobs:
  refine:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ufukkaraca/ody-platform/.github/actions/refine@main
        id: refine
        with:
          path: './docs'
          min-health: '80'
```
### Inputs

| Input | Default | Description |
|---|---|---|
| `path` | `./docs` | Directory to scan |
| `fail-on` | `critical` | Minimum severity that fails the check: `critical`, `warning`, `info`, `none` |
| `min-health` | `70` | Minimum health score (0-100) |
| `fail-on-regression` | `false` | Fail if the score decreased since the last run |
| `provider` | `none` | LLM provider for deeper detection: `ollama`, `anthropic`, `openai`, `none` |
| `model` | (auto) | Model name for the chosen provider |
| `format` | `markdown` | Output format: `json` or `markdown` |
### Outputs

| Output | Description |
|---|---|
| `score` | Health score (0-100) |
| `findings-count` | Total findings |
| `pass` | `true` or `false` |
| `report-path` | Path to the generated report file |
### Using outputs in subsequent steps

```yaml
- uses: ufukkaraca/ody-platform/.github/actions/refine@main
  id: refine
  with:
    path: './docs'
- run: echo "Health score is ${{ steps.refine.outputs.score }}"
```
### With LLM-powered detection

Set an API key in your repo secrets for deeper detection:

```yaml
- uses: ufukkaraca/ody-platform/.github/actions/refine@main
  with:
    path: './docs'
    provider: 'openai'
    model: 'gpt-4o-mini'
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```
On PRs, the action posts a health report comment and updates it on subsequent pushes.
## Configuration

Create `~/.config/ody-refine/config.toml`:

```toml
[embedding]
provider = "transformers" # or "ollama", "openai", "cohere"

[llm]
provider = "ollama" # or "openrouter", "mlx"
model = "qwen2.5:7b"
```
## LLM Providers

Refine auto-detects your LLM provider from environment variables in this order:
| Provider | Env Variable | Default Model |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | claude-haiku-4-5-20251001 |
| OpenAI | OPENAI_API_KEY | gpt-4o-mini |
| Groq | GROQ_API_KEY | llama-3.3-70b-versatile |
| Gemini | GEMINI_API_KEY | gemini-2.0-flash |
| xAI | XAI_API_KEY | grok-4.1-fast-non-reasoning |
| OpenRouter | OPENROUTER_API_KEY | google/gemini-2.0-flash-lite-001 |
| MLX | (none, Apple Silicon) | (auto) |
| Ollama | (none, local) | llama3 |
Override with `--provider` and `--model`:

```shell
npx ody-refine ./docs/ --provider groq --model llama-3.3-70b-versatile
```
## Embedding Providers

Embeddings power duplicate and drift detection. Refine uses a zero-config cascade — it tries each provider in order and uses the first one available:
| Priority | Provider | Model | Dimensions | Requires |
|---|---|---|---|---|
| 1 | Ollama | nomic-embed-text | 768 | Ollama running locally with an embedding model |
| 2 | TransformersJS | all-MiniLM-L6-v2 | 384 | @huggingface/transformers installed (works offline after first download) |
| 3 | Cohere | embed-english-v3.0 | 1024 | COHERE_API_KEY |
| 4 | OpenAI | text-embedding-3-small | 1536 | OPENAI_API_KEY + embedding.provider = "openai" in config |
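The cascade logic can be sketched as follows. This is an illustrative sketch, not Refine's actual source: walk the priority list and take the first provider whose availability check passes.

```typescript
type EmbeddingProvider = {
  name: string;
  // Availability check, e.g. "is Ollama responding?" or "is the package installed?"
  available: () => boolean;
};

function pickProvider(cascade: EmbeddingProvider[]): EmbeddingProvider {
  for (const provider of cascade) {
    if (provider.available()) return provider; // first available wins
  }
  throw new Error("no embedding provider available");
}

// Hypothetical situation: Ollama is not running, TransformersJS is installed.
const chosen = pickProvider([
  { name: "ollama", available: () => false },
  { name: "transformers", available: () => true },
  { name: "cohere", available: () => false },
]);
console.log(chosen.name); // prints "transformers"
```

The win of a cascade like this is zero-config defaults: a local provider is preferred when present, and a cloud provider is only reached when nothing local is available.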
You can also set the provider explicitly in config:

```toml
[embedding]
provider = "ollama" # or "transformers", "openai", "cohere"
model = "nomic-embed-text"
```
## Suppression

Suppress false positives with `.ody-refine-ignore`:

```
type:time_bomb
text:rate limit
```
## Use Cases

- Audit your wiki before onboarding -- new hires shouldn't guess which docs are true
- Run in CI to catch docs drift -- merge a feature, break a doc, get a failing check
- Clean your AI's context -- LLM memory files accumulate contradictions; Refine finds them
- Generate training data -- every resolved contradiction becomes a DPO pair for fine-tuning