v0.1 · MIT licensed · Apple Silicon
Small Harness is a TUI harness for open-weight LLMs. One interface across Ollama, LM Studio, MLX, llama.cpp, and OpenRouter — with real filesystem and shell tools, sensible approval gates, and parsing built for small models.
    # profile: mac-mini-16gb · backend: ollama · model: qwen2.5-coder:7b
    small-harness › read src/main.rs and tell me what /compare does
    › tool · read_file src/main.rs approved
    › tool · grep "/compare" src/ approved
    /compare runs the same prompt against your local model and a chosen OpenRouter model side-by-side, then shows tokens/sec, latency, and a diff of the two responses. Useful for checking whether a 7B model is good enough before reaching for a frontier one.
    small-harness › /compare openai/gpt-4o-mini
OpenAI-compatible chat completions against Ollama, LM Studio, MLX, or llama.cpp running on your machine.
/compare any prompt against an OpenRouter model. See if 7B is enough before paying for 400B.
The mac-mini-16gb and mac-studio-32gb profiles ship with preset model and context defaults, so you can start without manual tuning.
Read, write, edit, grep, glob, list-dir, shell, apply-patch — each with a per-tool approval gate and diff preview.
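A per-tool approval gate can be pictured as a small policy lookup. The sketch below is illustrative only: `Approval`, `Gate`, and `check` are hypothetical names, not Small Harness's actual types, and the diff preview is left out. It assumes each tool maps to "always allow", "ask the user", or "never allow", and that unknown tools fall back to asking.

```rust
use std::collections::HashMap;

/// Hypothetical per-tool policy; the real project may model this differently.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Approval {
    Always,
    Ask,
    Never,
}

/// Hypothetical gate holding one policy per tool name.
struct Gate {
    policies: HashMap<&'static str, Approval>,
}

impl Gate {
    /// Decide whether `tool` may run. `user_says_yes` is only invoked when
    /// the policy is `Ask`; tools without a policy default to `Ask`.
    fn check(&self, tool: &str, user_says_yes: impl FnOnce() -> bool) -> bool {
        match self.policies.get(tool).copied().unwrap_or(Approval::Ask) {
            Approval::Always => true,
            Approval::Never => false,
            Approval::Ask => user_says_yes(),
        }
    }
}
```

In a real TUI the closure would render the diff preview and wait for a keypress; here it is just a callback so the decision logic stays testable.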
An inline JSON detector catches tool calls even when small models forget the prescribed format.
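One way such a detector could work is to scan the model's free-form output for balanced `{...}` spans and keep the ones that look like tool calls. This is a minimal sketch under stated assumptions, not the project's parser: the function names are hypothetical, and "looks like a tool call" is reduced to "contains a `"name"` key". It uses only the standard library and does not validate the JSON.

```rust
/// Scan free-form model output for balanced `{...}` spans that mention a
/// `"name"` key, which this sketch treats as candidate tool calls.
/// Hypothetical helper; not taken from Small Harness itself.
fn find_tool_call_candidates(text: &str) -> Vec<&str> {
    let bytes = text.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'{' {
            if let Some(end) = matching_brace(bytes, i) {
                let span = &text[i..=end];
                if span.contains("\"name\"") {
                    out.push(span);
                }
                // Skip past the whole balanced span, matched or not.
                i = end + 1;
                continue;
            }
        }
        i += 1;
    }
    out
}

/// Return the index of the `}` that closes the `{` at `start`, ignoring
/// braces that appear inside JSON string literals.
fn matching_brace(bytes: &[u8], start: usize) -> Option<usize> {
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (offset, &b) in bytes[start..].iter().enumerate() {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(start + offset);
                }
            }
            _ => {}
        }
    }
    None
}
```

Each returned span would still need to go through a real JSON parser before the tool is dispatched; the scan only finds where a call might be hiding in surrounding prose.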
Populates the prompt-eval cache before your first message so the first reply doesn't feel cold.
Auto-selects tool schemas to fit the context budget and shows you exactly where the tokens went.
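Fitting tool schemas into a context budget can be approximated with a greedy pass. The sketch below assumes tools arrive in priority order with a pre-computed token estimate; `ToolSchema`, `select_schemas`, and the token counts are hypothetical, and the actual selection strategy in Small Harness may differ.

```rust
/// A tool schema with an estimated token cost. Names and costs are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct ToolSchema {
    name: &'static str,
    tokens: usize,
}

/// Greedily keep tools, in the order given, while they fit in `budget` tokens.
/// A tool that does not fit is skipped, and later smaller tools may still fit.
/// Returns the selected tools and the total tokens they consume.
fn select_schemas(tools: &[ToolSchema], budget: usize) -> (Vec<ToolSchema>, usize) {
    let mut used = 0;
    let mut selected = Vec::new();
    for tool in tools {
        if used + tool.tokens <= budget {
            used += tool.tokens;
            selected.push(tool.clone());
        }
    }
    (selected, used)
}
```

Returning the running total alongside the selection is what makes a "where the tokens went" display possible: the caller can report the tokens used per tool and the remaining budget.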
Tokens stream as they arrive, with grouped tool-call display so the transcript stays readable.
Append-only JSONL logs. List, resume, or export any past conversation from the prompt.
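An append-only JSONL log writes one JSON object per line and never rewrites earlier lines. The sketch below shows the idea with the standard library only; the `role`/`content` record shape, the function names, and the hand-rolled escaping are assumptions for illustration, not the project's on-disk format.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Escape a string for use inside a JSON string literal (minimal subset).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Append one message as a single JSON line; existing lines are never rewritten.
fn append_message(path: &Path, role: &str, content: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(
        file,
        "{{\"role\":\"{}\",\"content\":\"{}\"}}",
        json_escape(role),
        json_escape(content)
    )
}

/// Count the messages in a session log (one per non-empty line).
fn count_messages(path: &Path) -> io::Result<usize> {
    let reader = BufReader::new(File::open(path)?);
    let mut n = 0;
    for line in reader.lines() {
        if !line?.trim().is_empty() {
            n += 1;
        }
    }
    Ok(n)
}
```

Because newlines inside message content are escaped, each record stays on one physical line, so resuming a session reduces to reading the file line by line.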
| Backend | Default URL | Best for |
|---|---|---|
| Ollama | localhost:11434/v1 | Easiest setup; mature tool-call templates. |
| LM Studio | localhost:1234/v1 | GUI model browser; explicit load and unload. |
| MLX | localhost:8080/v1 | Fastest inference on Apple Silicon. |
| llama.cpp | localhost:8080/v1 | Direct GGUF serving for full control. |
| OpenRouter | openrouter.ai/api/v1 | Cloud A/B comparison and frontier models. |
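
The defaults in the table can be captured in a small enum. This is a sketch, not the project's configuration code: the `Backend` type and its methods are hypothetical, the `http://` and `https://` schemes are assumptions (the table omits them), and `/chat/completions` is the standard path for OpenAI-compatible chat endpoints.

```rust
/// Hypothetical backend selector mirroring the table of defaults.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    Ollama,
    LmStudio,
    Mlx,
    LlamaCpp,
    OpenRouter,
}

impl Backend {
    /// Default base URL. Schemes are assumed: plain HTTP for local servers,
    /// HTTPS for the hosted OpenRouter API.
    fn base_url(self) -> &'static str {
        match self {
            Backend::Ollama => "http://localhost:11434/v1",
            Backend::LmStudio => "http://localhost:1234/v1",
            Backend::Mlx => "http://localhost:8080/v1",
            Backend::LlamaCpp => "http://localhost:8080/v1",
            Backend::OpenRouter => "https://openrouter.ai/api/v1",
        }
    }

    /// Full URL of the OpenAI-compatible chat completions endpoint.
    fn chat_completions_url(self) -> String {
        format!("{}/chat/completions", self.base_url())
    }

    /// Assumption: only the cloud backend needs an API key.
    fn needs_api_key(self) -> bool {
        matches!(self, Backend::OpenRouter)
    }
}
```

Note that MLX and llama.cpp share the same default port, so running both at once means overriding one of the URLs.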
rustup)
OPENROUTER_API_KEY if you want /compare

Slash commands