Refine CLI

Refine is an open-source CLI that finds contradictions, staleness, and drift in your docs -- locally.

Commands

| Command | Description |
| --- | --- |
| `ody-refine <dir>` | Scan a directory (ingest + detect + report) |
| `ingest <dir>` | Ingest markdown, PDF, or text files |
| `detect` | Re-run detectors on previously ingested data |
| `resolve` | Interactive TUI to triage and fix findings |
| `export` | Export findings as JSONL, tickets, or training data |
| `report` | Generate or regenerate the HTML health report |
| `ci` | CI mode -- exit non-zero if the health score is below the threshold |
| `scan <url>` | Crawl and scan public docs from a URL |
| `connect <source>` | Connect to Slack, Notion, Confluence, etc. |
| `diff` | Compare the current scan against a previous one |
| `badge` | Generate an SVG health badge for your README |
| `status` | Show knowledge graph stats |
| `config` | Display or edit configuration |
| `optimize` | Auto-tune detector thresholds based on your feedback |

Detectors

Refine ships with five detectors, plus consultant analysis and consensus voting:

  • Contradiction detector -- finds conflicting statements across documents
  • Staleness detector -- catches outdated claims
  • Time bomb detector -- finds expired deadlines and promises ("by Q3 2024")
  • Duplicate detector -- identifies redundant content
  • Drift detector -- flags divergence between related documents

Consensus voting runs multiple passes for consistent, low-noise results.
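The idea behind consensus voting can be sketched as a majority vote over repeated detector passes -- a minimal illustration, not Refine's actual implementation; the pass data and vote threshold below are assumptions:

```python
from collections import Counter

def consensus(passes: list[list[str]], min_votes: int = 2) -> list[str]:
    """Keep only findings reported in at least min_votes passes."""
    votes = Counter(f for findings in passes for f in set(findings))
    return sorted(f for f, n in votes.items() if n >= min_votes)

# Three detector passes over the same corpus; one-off flaky findings drop out.
passes = [
    ["contradiction:a.md#3", "staleness:b.md#7"],
    ["contradiction:a.md#3", "duplicate:c.md#1"],
    ["contradiction:a.md#3", "staleness:b.md#7"],
]
print(consensus(passes))  # ['contradiction:a.md#3', 'staleness:b.md#7']
```

Findings that survive every pass are far less likely to be hallucinated noise, which is why repeated passes lower the false-positive rate.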

Resolve

The interactive TUI lets you triage findings one by one:

```shell
ody-refine resolve
```

For each finding you can accept, reject, suppress, or annotate. Every resolution creates a preference pair -- training data for Forge.
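A resolution can be thought of as a preference pair: the text you accept becomes "chosen" and the text you reject becomes "rejected". A minimal sketch of serializing one such record (the field names follow the common DPO convention; Refine's exact schema may differ):

```python
import json

def to_preference_pair(prompt: str, accepted: str, rejected: str) -> str:
    """Serialize one triage decision as a DPO-style preference pair."""
    return json.dumps({"prompt": prompt, "chosen": accepted, "rejected": rejected})

pair = to_preference_pair(
    "Which statement about the rate limit is current?",
    "The API allows 100 requests/minute.",
    "The API allows 60 requests/minute.",
)
print(pair)
```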

Export

Export resolved findings in multiple formats:

```shell
# DPO-format JSONL for model fine-tuning (TRL-compatible)
ody-refine export --format trl

# General JSONL
ody-refine export --format jsonl
```
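TRL-compatible DPO data is one JSON object per line with `prompt`, `chosen`, and `rejected` fields. A sketch of loading such a file downstream (assuming that schema):

```python
import io
import json

def load_dpo_jsonl(stream) -> list[dict]:
    """Parse DPO-format JSONL into a list of preference pairs."""
    pairs = []
    for line in stream:
        line = line.strip()
        if line:
            rec = json.loads(line)
            pairs.append({k: rec[k] for k in ("prompt", "chosen", "rejected")})
    return pairs

sample = io.StringIO('{"prompt": "p1", "chosen": "good", "rejected": "bad"}\n')
print(load_dpo_jsonl(sample))
```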

CI/CD

Fail builds when docs health drops below a threshold:

```yaml
# .github/workflows/docs-health.yml
- name: Check docs health
  run: npx ody-refine ci --min-health 80
```
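The `ci` gate boils down to comparing the health score against the threshold and exiting non-zero on failure. A minimal sketch of that logic (not Refine's source):

```python
def ci_gate(score: int, min_health: int = 70) -> int:
    """Return a process exit code: 0 if the score meets the threshold, 1 otherwise."""
    if score < min_health:
        print(f"FAIL: health score {score} is below threshold {min_health}")
        return 1
    print(f"PASS: health score {score} >= {min_health}")
    return 0

ci_gate(score=85, min_health=80)  # prints PASS and returns 0
```

A non-zero exit code is all a CI runner needs to mark the check as failed.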

Generate a badge for your README:

```shell
ody-refine badge -o docs/health-badge.svg
```
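A health badge is just a small SVG with the score baked in. A toy sketch of generating one (the colors, sizes, and layout here are assumptions; Refine's actual badge will differ):

```python
def health_badge_svg(score: int) -> str:
    """Render a minimal shields-style SVG badge for a 0-100 health score."""
    # Green for healthy, yellow for middling, red for unhealthy (thresholds assumed).
    color = "#4c1" if score >= 80 else "#dfb317" if score >= 50 else "#e05d44"
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="120" height="20">'
        '<rect width="70" height="20" fill="#555"/>'
        f'<rect x="70" width="50" height="20" fill="{color}"/>'
        '<text x="8" y="14" fill="#fff" font-family="Verdana" font-size="11">health</text>'
        f'<text x="80" y="14" fill="#fff" font-family="Verdana" font-size="11">{score}</text>'
        "</svg>"
    )

with open("health-badge.svg", "w") as f:
    f.write(health_badge_svg(87))
```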

GitHub Action

Use the Ody Refine GitHub Action to automatically scan docs on every PR and post a health report as a comment:

```yaml
# .github/workflows/docs-health.yml
name: Docs Health
on: [pull_request]

jobs:
  refine:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ufukkaraca/ody-platform/.github/actions/refine@main
        id: refine
        with:
          path: './docs'
          min-health: '80'
```

Inputs

| Input | Default | Description |
| --- | --- | --- |
| `path` | `./docs` | Directory to scan |
| `fail-on` | `critical` | Minimum severity that fails the check: `critical`, `warning`, `info`, `none` |
| `min-health` | `70` | Minimum health score (0-100) |
| `fail-on-regression` | `false` | Fail if the score decreased since the last run |
| `provider` | `none` | LLM provider for deeper detection: `ollama`, `anthropic`, `openai`, `none` |
| `model` | (auto) | Model name for the chosen provider |
| `format` | `markdown` | Output format: `json` or `markdown` |

Outputs

| Output | Description |
| --- | --- |
| `score` | Health score (0-100) |
| `findings-count` | Total findings |
| `pass` | `true` or `false` |
| `report-path` | Path to the generated report file |

Using outputs in subsequent steps

```yaml
- uses: ufukkaraca/ody-platform/.github/actions/refine@main
  id: refine
  with:
    path: './docs'
- run: echo "Health score is ${{ steps.refine.outputs.score }}"
```

With LLM-powered detection

Set an API key in your repo secrets for deeper detection:

```yaml
- uses: ufukkaraca/ody-platform/.github/actions/refine@main
  with:
    path: './docs'
    provider: 'openai'
    model: 'gpt-4o-mini'
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```

On PRs, the action posts a health report comment and updates it on subsequent pushes.

Configuration

Create `~/.config/ody-refine/config.toml`:

```toml
[embedding]
provider = "transformers" # or "ollama", "openai", "cohere"

[llm]
provider = "ollama" # or "openrouter", "mlx"
model = "qwen2.5:7b"
```

LLM Providers

Refine auto-detects your LLM provider from environment variables in this order:

| Provider | Env Variable | Default Model |
| --- | --- | --- |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-haiku-4-5-20251001` |
| OpenAI | `OPENAI_API_KEY` | `gpt-4o-mini` |
| Groq | `GROQ_API_KEY` | `llama-3.3-70b-versatile` |
| Gemini | `GEMINI_API_KEY` | `gemini-2.0-flash` |
| xAI | `XAI_API_KEY` | `grok-4.1-fast-non-reasoning` |
| OpenRouter | `OPENROUTER_API_KEY` | `google/gemini-2.0-flash-lite-001` |
| MLX | (none, Apple Silicon) | (auto) |
| Ollama | (none, local) | `llama3` |
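The detection order above can be sketched as a first-match scan over environment variables (a simplified illustration; the local MLX/Ollama fallbacks are collapsed into a single `None` result here):

```python
import os

# (env var, provider, default model) in Refine's documented priority order.
PROVIDERS = [
    ("ANTHROPIC_API_KEY", "anthropic", "claude-haiku-4-5-20251001"),
    ("OPENAI_API_KEY", "openai", "gpt-4o-mini"),
    ("GROQ_API_KEY", "groq", "llama-3.3-70b-versatile"),
    ("GEMINI_API_KEY", "gemini", "gemini-2.0-flash"),
    ("XAI_API_KEY", "xai", "grok-4.1-fast-non-reasoning"),
    ("OPENROUTER_API_KEY", "openrouter", "google/gemini-2.0-flash-lite-001"),
]

def detect_provider(env=os.environ):
    """Return (provider, default_model) for the first API key found, else None."""
    for var, provider, model in PROVIDERS:
        if env.get(var):
            return provider, model
    return None  # fall back to local MLX/Ollama
```

Because the scan stops at the first match, setting both `OPENAI_API_KEY` and `GROQ_API_KEY` would select OpenAI under this ordering.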

Override with `--provider` and `--model`:

```shell
npx ody-refine ./docs/ --provider groq --model llama-3.3-70b-versatile
```

Embedding Providers

Embeddings power duplicate and drift detection. Refine uses a zero-config cascade -- it tries each provider in order and uses the first one available:

| Priority | Provider | Model | Dimensions | Requires |
| --- | --- | --- | --- | --- |
| 1 | Ollama | `nomic-embed-text` | 768 | Ollama running locally with an embedding model |
| 2 | TransformersJS | `all-MiniLM-L6-v2` | 384 | `@huggingface/transformers` installed (works offline after first download) |
| 3 | Cohere | `embed-english-v3.0` | 1024 | `COHERE_API_KEY` |
| 4 | OpenAI | `text-embedding-3-small` | 1536 | `OPENAI_API_KEY` + `embedding.provider = "openai"` in config |

You can also set the provider explicitly in config:

```toml
[embedding]
provider = "ollama" # or "transformers", "openai", "cohere"
model = "nomic-embed-text"
```

Suppression

Suppress false positives with a `.ody-refine-ignore` file:

```text
type:time_bomb
text:rate limit
```
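Each ignore line pairs a field with a value, and a finding is suppressed if any rule matches. A sketch of that matching, assuming `type:` matches the finding type exactly and `text:` matches a substring -- the real semantics may differ:

```python
def parse_ignore(content: str) -> list[tuple[str, str]]:
    """Parse .ody-refine-ignore lines of the form 'field:value'."""
    rules = []
    for line in content.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            field, _, value = line.partition(":")
            rules.append((field, value))
    return rules

def is_suppressed(finding: dict, rules: list[tuple[str, str]]) -> bool:
    """Return True if any rule matches the finding."""
    for field, value in rules:
        if field == "type" and finding.get("type") == value:
            return True
        if field == "text" and value in finding.get("text", ""):
            return True
    return False

rules = parse_ignore("type:time_bomb\ntext:rate limit\n")
print(is_suppressed({"type": "staleness", "text": "our rate limit is 60"}, rules))  # True
```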

Use Cases

  • Audit your wiki before onboarding -- new hires shouldn't guess which docs are true
  • Run in CI to catch docs drift -- merge a feature, break a doc, get a failing check
  • Clean your AI's context -- LLM memory files accumulate contradictions; Refine finds them
  • Generate training data -- every resolved contradiction becomes a DPO pair for fine-tuning