ye/pi-model-router

Extension for the pi coding agent that intelligently routes each turn to the right LLM tier (high/medium/low) based on task intent, session budget, context size, and custom rules.

TypeScript 100%

Ye Liu 4e0444a919 docs: update README with accurate feature descriptions and add MIT license		2026-04-13 11:44:27 -04:00
docs	format	2026-03-24 11:32:06 -04:00
extensions	Update to pi 0.65.0, remove deprecated session_switch/session_fork handlers	2026-04-03 16:19:54 -04:00
.gitignore	fix: finalize model capacity reporting and ensure correct initial router status	2026-03-19 15:58:24 -04:00
.prettierignore	chore: add prettier and refactor all functions to arrow functions	2026-03-19 02:16:06 -04:00
.prettierrc	format	2026-03-24 11:32:06 -04:00
AGENTS.md	docs: update README with accurate feature descriptions and add MIT license	2026-04-13 11:44:27 -04:00
LICENSE	docs: update README with accurate feature descriptions and add MIT license	2026-04-13 11:44:27 -04:00
model-router.example.json	format	2026-03-24 11:32:06 -04:00
package-lock.json	Update to pi 0.65.0, remove deprecated session_switch/session_fork handlers	2026-04-03 16:19:54 -04:00
package.json	fix: resolve typescript errors and update library compatibility	2026-03-24 11:56:53 -04:00
README.md	docs: update README with accurate feature descriptions and add MIT license	2026-04-13 11:44:27 -04:00
tsconfig.json	fix: resolve typescript errors and update library compatibility	2026-03-24 11:56:53 -04:00

README.md

pi-model-router

Intelligent per-turn model router extension for the pi coding agent. Automatically selects between high, medium, and low-tier LLMs based on task intent, session budget, context size, and custom rules — with automatic fallbacks and phase awareness.

What it does

- Logical Router Provider: Registers a router provider that exposes stable profiles (e.g., `router/auto`) as models.
- Per-Turn Routing: Chooses between the high, medium, and low tiers on every turn based on task intent and complexity.
- Task-Aware Heuristics: Detects planning vs. implementation vs. lightweight tasks using keyword analysis, word count, and conversation history.
- Advanced Controls: Built-in support for:
  - LLM Intent Classifier: Optionally use a fast model to categorize intent (overrides heuristics).
  - Custom Rules: Define keyword-based tier overrides for specific patterns (e.g., deploy → high).
  - Context Trigger: Automatically upgrade to the high tier when token usage exceeds a threshold.
  - Cost Budgeting: Set a session spend limit; the high tier downgrades to medium once it is exceeded.
  - Fallback Chains: Automatically retry with alternative models if the primary choice fails.
- Phase Memory: Applies a configurable bias (`phaseBias`) toward the current tier, so multi-turn planning or implementation work stays on the same tier.
- Thinking Control: Per-tier and per-profile control over reasoning/thinking levels.
- Persistent State: Pins, profiles, costs, and debug history are remembered across agent restarts and conversation branches.
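
To make the interaction between rules, the context trigger, and the budget concrete, here is a minimal sketch of a per-turn tier decision. It is not the extension's actual implementation: the type names, the keyword lists, the word-count cutoff, and the order in which the checks run are all assumptions made for illustration.

```typescript
// Hypothetical types; the extension's real internals may differ.
type Tier = "high" | "medium" | "low";

interface Rule {
  matches: string;
  tier: Tier;
}

interface TurnInput {
  prompt: string;
  contextTokens: number;
  sessionSpendUsd: number;
}

interface RouterSettings {
  rules: Rule[];
  largeContextThreshold?: number;
  maxSessionBudget?: number;
}

// Illustrative keyword lists, not the extension's actual vocabulary.
const PLANNING_WORDS = ["plan", "design", "architecture", "refactor"];
const LIGHT_WORDS = ["rename", "typo", "format", "list"];

// Crude substring heuristic: planning keywords -> high,
// short prompts with lightweight keywords -> low, everything else -> medium.
const heuristicTier = (prompt: string): Tier => {
  const text = prompt.toLowerCase();
  const wordCount = text.split(/\s+/).filter(Boolean).length;
  if (PLANNING_WORDS.some((w) => text.includes(w))) return "high";
  if (wordCount <= 12 && LIGHT_WORDS.some((w) => text.includes(w))) return "low";
  return "medium";
};

const chooseTier = (input: TurnInput, settings: RouterSettings): Tier => {
  const text = input.prompt.toLowerCase();

  // 1. A matching custom rule overrides the heuristic (assumed precedence).
  const rule = settings.rules.find((r) => text.includes(r.matches.toLowerCase()));
  let tier: Tier = rule ? rule.tier : heuristicTier(input.prompt);

  // 2. Context trigger: large contexts are upgraded to the high tier.
  if (
    settings.largeContextThreshold !== undefined &&
    input.contextTokens > settings.largeContextThreshold
  ) {
    tier = "high";
  }

  // 3. Cost budgeting: once the budget is exceeded, high downgrades to medium.
  if (
    settings.maxSessionBudget !== undefined &&
    input.sessionSpendUsd > settings.maxSessionBudget &&
    tier === "high"
  ) {
    tier = "medium";
  }

  return tier;
};
```

In this sketch the budget check runs last, so an over-budget session never lands on the high tier even when a rule or the context trigger asks for it. Whether the real router resolves that conflict the same way is not stated here.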

Installation

From this project directory:

pi install .

Or load directly for one run:

pi -e ./extensions/index.ts

Configuration

Copy the example config (`model-router.example.json`) to one of:

~/.pi/agent/model-router.json (Global)
.pi/model-router.json (Project-specific)
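
The two locations above could be resolved along these lines. This is only a sketch: the function names are hypothetical, and the precedence (project-specific file wins over the global one when both exist) is an assumption, not something this README states.

```typescript
// Hypothetical helpers; names and precedence are assumptions for illustration.
// Candidate paths in lookup order: project-specific first, then global.
const candidateConfigPaths = (homeDir: string, projectDir: string): string[] => [
  `${projectDir}/.pi/model-router.json`,
  `${homeDir}/.pi/agent/model-router.json`,
];

// Returns the first candidate that exists, or undefined when neither does.
// `exists` is injected so the lookup can be exercised without touching the disk.
const resolveConfigPath = (
  homeDir: string,
  projectDir: string,
  exists: (path: string) => boolean,
): string | undefined => candidateConfigPaths(homeDir, projectDir).find(exists);
```

In a real setup, `exists` would typically be Node's `fs.existsSync`.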

Basic Config Shape

{
  "defaultProfile": "auto",
  "classifierModel": "google/gemini-flash-latest",
  "maxSessionBudget": 1.0,
  "profiles": {
    "auto": {
      "high": { "model": "openai/gpt-5.4-pro", "thinking": "high" },
      "medium": { "model": "google/gemini-flash-latest", "thinking": "medium" },
      "low": { "model": "openai/gpt-5.4-nano", "thinking": "low" }
    }
  }
}

Configuration Fields

| Field | Description |
| --- | --- |
| `defaultProfile` | The profile to use when starting a new session. |
| `classifierModel` | (Optional) Model used to categorize intent. If omitted, fast heuristics are used. |
| `maxSessionBudget` | (Optional) USD budget for the session. Once exceeded, the `high` tier is downgraded to `medium`. |
| `largeContextThreshold` | (Optional) Token count above which the `high` tier is forced for large contexts. |
| `phaseBias` | Stickiness of the current phase, from `0.0` to `1.0`. Higher values are more stable. Default `0.5`. |
| `rules` | List of custom keyword rules (e.g. `{ "matches": "deploy", "tier": "high" }`). |
| `profiles` | Map of profile definitions, each containing `high`, `medium`, and `low` tiers. |
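
For readers who prefer types to tables, the fields above can be sketched as a TypeScript interface. The interface and function names are hypothetical, and the normalization behaviour (clamping `phaseBias` into range, rejecting an unknown `defaultProfile`) is an assumption about reasonable handling, not a description of what the extension does.

```typescript
// Hypothetical type names mirroring the documented fields.
type Tier = "high" | "medium" | "low";

interface TierConfig {
  model: string;
  thinking?: string; // e.g. "low", "medium", "high"
}

interface Profile {
  high: TierConfig;
  medium: TierConfig;
  low: TierConfig;
}

interface Rule {
  matches: string;
  tier: Tier;
}

interface ModelRouterConfig {
  defaultProfile: string;
  classifierModel?: string;
  maxSessionBudget?: number; // USD
  largeContextThreshold?: number; // tokens
  phaseBias?: number; // 0.0 to 1.0, documented default 0.5
  rules?: Rule[];
  profiles: Record<string, Profile>;
}

// Fills in defaults and performs basic sanity checks (assumed behaviour).
const normalizeConfig = (
  raw: ModelRouterConfig,
): ModelRouterConfig & { phaseBias: number; rules: Rule[] } => {
  if (!(raw.defaultProfile in raw.profiles)) {
    throw new Error(`Unknown defaultProfile: ${raw.defaultProfile}`);
  }
  // Apply the documented default, then clamp into the documented range.
  const phaseBias = Math.min(1, Math.max(0, raw.phaseBias ?? 0.5));
  return { ...raw, phaseBias, rules: raw.rules ?? [] };
};
```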

Commands

| Command | Description |
| --- | --- |
| `/router` | Show detailed status, current profile, spend, and settings. |
| `/router status` | Alias for `/router` (show current status). |
| `/router profile [name]` | Switch to a profile or list available ones (enables router if off). |
| `/router pin [prof] <t\|a>` | Pin a tier (high/medium/low/auto) for the current or specified profile. |
| `/router fix <tier>` | Correct the last decision and pin that tier for the current profile. |
| `/router thinking ...` | Override thinking levels (e.g. `/router thinking low xhigh`). |
| `/router disable` | Disable the router and switch back to the last non-router model. |
| `/router widget <on\|off>` | Toggle the persistent state widget (supports `toggle`). |
| `/router debug <on\|off>` | Toggle turn-by-turn routing notifications (supports `toggle`, `clear`, `show`). |
| `/router reload` | Hot-reload the configuration JSON. |
| `/router help` | Show usage help for all subcommands. |
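
The `pin` syntax is the least obvious of these commands, since the profile argument is optional. The sketch below shows one way such input could be parsed; `parsePin` and its types are hypothetical names, and the extension's real argument handling may differ.

```typescript
// Hypothetical parser for `/router pin [prof] <tier|auto>`.
type PinTarget = "high" | "medium" | "low" | "auto";

interface PinCommand {
  profile?: string; // omitted means "the current profile"
  target: PinTarget;
}

const PIN_TARGETS: readonly string[] = ["high", "medium", "low", "auto"];

// Type guard so a plain string can be narrowed to PinTarget.
const isPinTarget = (value: string): value is PinTarget => PIN_TARGETS.includes(value);

const parsePin = (input: string): PinCommand | undefined => {
  const parts = input.trim().split(/\s+/);
  if (parts[0] !== "/router" || parts[1] !== "pin") return undefined;

  const [first, second] = parts.slice(2);
  const argCount = parts.length - 2;

  // One argument: pin a tier on the current profile.
  if (argCount === 1 && first !== undefined && isPinTarget(first)) {
    return { target: first };
  }
  // Two arguments: the first is the profile name, the second the tier.
  if (argCount === 2 && first !== undefined && second !== undefined && isPinTarget(second)) {
    return { profile: first, target: second };
  }
  return undefined; // anything else is treated as invalid in this sketch
};
```

With this sketch, `/router pin high` pins the current profile, while `/router pin deep low` pins `low` on a profile named `deep`.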

Documentation

Architecture Guide: Deep dive into the routing logic and modular design.
Sample Configuration: Diverse profile examples (cheap, deep, balanced).