# Refine CLI

Refine is an open-source CLI that finds contradictions, staleness, and drift in your docs -- locally.
## Commands

| Command | Description |
|---|---|
| `ody-refine <dir>` | Scan a directory (ingest + detect + report) |
| `ingest <dir>` | Ingest markdown, PDF, or text files |
| `detect` | Re-run detectors on previously ingested data |
| `resolve` | Interactive TUI to triage and fix findings |
| `export` | Export findings as JSONL, tickets, or training data |
| `report` | Generate or regenerate the HTML health report |
| `ci` | CI mode -- exit non-zero if the health score is below a threshold |
| `scan <url>` | Crawl and scan public docs from a URL |
| `connect <source>` | Connect to Slack, Notion, Confluence, etc. |
| `diff` | Compare the current scan against a previous one |
| `badge` | Generate an SVG health badge for your README |
| `status` | Show knowledge graph stats |
| `config` | Display or edit configuration |
| `optimize` | Auto-tune detector thresholds based on your feedback |
## Detectors

Refine ships with 5 detectors plus consultant analysis and consensus voting:
- Contradiction detector -- finds conflicting statements across documents
- Staleness detector -- catches outdated claims
- Time bomb detector -- finds expired deadlines and promises ("by Q3 2024")
- Duplicate detector -- identifies redundant content
- Drift detector -- detects divergence between related documents
Consensus voting runs multiple passes for consistent, low-noise results.
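For illustration, a contradiction finding could be reported as a record like the one below. The field names here are hypothetical, not Refine's exact schema:

```json
{
  "type": "contradiction",
  "severity": "critical",
  "claim_a": { "file": "docs/api.md", "text": "The rate limit is 100 requests/min." },
  "claim_b": { "file": "docs/faq.md", "text": "The rate limit is 60 requests/min." },
  "confidence": 0.92
}
```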
## Resolve

The interactive TUI lets you triage findings one by one:

```shell
ody-refine resolve
```
For each finding you can accept, reject, suppress, or annotate. Every resolution creates a preference pair -- training data for Forge.
## Export

Export resolved findings in multiple formats:

```shell
# DPO-format JSONL for model fine-tuning (TRL-compatible)
ody-refine export --format trl

# General JSONL
ody-refine export --format jsonl
```
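TRL's DPO trainer expects JSONL records with `prompt`, `chosen`, and `rejected` fields. Refine's exact output may include more fields, but a record in that format looks like:

```json
{"prompt": "Which statement about the rate limit is correct?", "chosen": "The rate limit is 60 requests/min.", "rejected": "The rate limit is 100 requests/min."}
```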
## CI/CD

Fail builds when docs health drops below a threshold:

```yaml
# .github/workflows/docs-health.yml
- name: Check docs health
  run: npx ody-refine ci --min-health 80
```

Generate a badge for your README:

```shell
ody-refine badge -o docs/health-badge.svg
```
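Once the SVG is generated and committed, it can be embedded with standard markdown image syntax (the path below matches the `-o` flag above):

```markdown
![Docs health](docs/health-badge.svg)
```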
## GitHub Action

Use the Ody Refine GitHub Action to automatically scan docs on every PR and post a health report as a comment:

```yaml
# .github/workflows/docs-health.yml
name: Docs Health
on: [pull_request]
jobs:
  refine:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ufukkaraca/ody-platform/.github/actions/refine@main
        id: refine
        with:
          path: './docs'
          min-health: '80'
```
### Inputs

| Input | Default | Description |
|---|---|---|
| `path` | `./docs` | Directory to scan |
| `fail-on` | `critical` | Minimum severity that fails the check: `critical`, `warning`, `info`, `none` |
| `min-health` | `70` | Minimum health score (0-100) |
| `fail-on-regression` | `false` | Fail if the score decreased since the last run |
| `provider` | `none` | LLM provider for deeper detection: `ollama`, `anthropic`, `openai`, `none` |
| `model` | (auto) | Model name for the chosen provider |
| `format` | `markdown` | Output format: `json` or `markdown` |
### Outputs

| Output | Description |
|---|---|
| `score` | Health score (0-100) |
| `findings-count` | Total findings |
| `pass` | `true` or `false` |
| `report-path` | Path to the generated report file |
### Using outputs in subsequent steps

```yaml
- uses: ufukkaraca/ody-platform/.github/actions/refine@main
  id: refine
  with:
    path: './docs'
- run: echo "Health score is ${{ steps.refine.outputs.score }}"
```
### With LLM-powered detection

Set an API key in your repo secrets for deeper detection:

```yaml
- uses: ufukkaraca/ody-platform/.github/actions/refine@main
  with:
    path: './docs'
    provider: 'openai'
    model: 'gpt-4o-mini'
  env:
    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```
On PRs, the action posts a health report comment and updates it on subsequent pushes.
## Configuration

Create `~/.config/ody-refine/config.toml`:

```toml
[embedding]
provider = "transformers" # or "ollama", "openai", "cohere"

[llm]
provider = "ollama" # or "openrouter", "mlx"
model = "qwen2.5:7b"
```
## LLM Providers

Refine auto-detects your LLM provider from environment variables in this order:
| Provider | Env Variable | Default Model |
|---|---|---|
| Anthropic | ANTHROPIC_API_KEY | claude-haiku-4-5-20251001 |
| OpenAI | OPENAI_API_KEY | gpt-4o-mini |
| Groq | GROQ_API_KEY | llama-3.3-70b-versatile |
| Gemini | GEMINI_API_KEY | gemini-2.0-flash |
| xAI | XAI_API_KEY | grok-4.1-fast-non-reasoning |
| OpenRouter | OPENROUTER_API_KEY | google/gemini-2.0-flash-lite-001 |
| MLX | (none, Apple Silicon) | (auto) |
| Ollama | (none, local) | llama3 |
Override with `--provider` and `--model`:

```shell
npx ody-refine ./docs/ --provider groq --model llama-3.3-70b-versatile
```
## Embedding Providers

Embeddings power duplicate and drift detection. Refine uses a zero-config cascade — it tries each provider in order and uses the first one available:
| Priority | Provider | Model | Dimensions | Requires |
|---|---|---|---|---|
| 1 | Ollama | nomic-embed-text | 768 | Ollama running locally with an embedding model |
| 2 | TransformersJS | all-MiniLM-L6-v2 | 384 | @huggingface/transformers installed (works offline after first download) |
| 3 | Cohere | embed-english-v3.0 | 1024 | COHERE_API_KEY |
| 4 | OpenAI | text-embedding-3-small | 1536 | OPENAI_API_KEY + embedding.provider = "openai" in config |
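The cascade logic can be sketched as follows. This is an illustrative sketch, not Refine's actual source: walk the priority list and take the first provider whose availability check passes.

```typescript
type EmbeddingProvider = {
  name: string;
  // Availability check, e.g. "is Ollama responding?" or "is the package installed?"
  available: () => boolean;
};

function pickProvider(cascade: EmbeddingProvider[]): EmbeddingProvider {
  for (const provider of cascade) {
    if (provider.available()) return provider; // first available wins
  }
  throw new Error("no embedding provider available");
}

// Hypothetical situation: Ollama is not running, TransformersJS is installed.
const chosen = pickProvider([
  { name: "ollama", available: () => false },
  { name: "transformers", available: () => true },
  { name: "cohere", available: () => false },
]);
console.log(chosen.name); // prints "transformers"
```

The win of a cascade like this is zero-config defaults: a local provider is preferred when present, and a cloud provider is only reached when nothing local is available.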
You can also set the provider explicitly in config:

```toml
[embedding]
provider = "ollama" # or "transformers", "openai", "cohere"
model = "nomic-embed-text"
```
## Suppression

Suppress false positives with `.ody-refine-ignore`:

```
type:time_bomb
text:rate limit
```
## Use Cases

- Audit your wiki before onboarding -- new hires shouldn't guess which docs are true
- Run in CI to catch docs drift -- merge a feature, break a doc, get a failing check
- Clean your AI's context -- LLM memory files accumulate contradictions; Refine finds them
- Generate training data -- every resolved contradiction becomes a DPO pair for fine-tuning