PebbleRouter

PebbleRouter is your organisation’s LLM gateway and routing control plane. It sits between PebbleChat (and any other client) and the underlying AI model providers, deciding which actual model handles each request based on the routing profile you configure.

When a user picks Auto in their PebbleChat model selector, PebbleRouter makes the choice. When a flow calls a “Smart” or “Fast” alias, PebbleRouter resolves that alias to a concrete model.

Find it at Admin → Organisation → Settings → PebbleRouter.

Why a router exists

You could give every PebbleAI user direct access to every model and let them pick. That works on day one. By month three:

Costs are unpredictable because users default to the most expensive model
Outages aren’t handled gracefully — when one provider goes down, every chat using that provider breaks
You can’t enforce data residency rules (some models leave your region; users won’t notice)
You can’t enforce rate limits or quotas at the org level
A/B testing a new model means asking every user to switch manually

PebbleRouter is the answer to all of those:

Cost optimisation — route cheap requests to cheap models, save expensive models for hard problems
Failover — if one model is rate-limited or down, route to a sibling automatically
Compliance — enforce that all traffic goes through providers in your region
Quotas and limits — enforced at the gateway, not the client
Single integration point — clients call one URL with one API key, the router handles the rest

PebbleRouter is the routing gateway. PebbleAI provides the management UI, multi-tenancy, attribution, and integration with PebbleChat on top of it.

The three tabs

PebbleRouter has three tabs at the top of the page:

Routing Profiles — the meat of the page; where you define routes
API Keys — generate keys clients use to call PebbleRouter
Settings — gateway-level options including allowed origins and verbosity

Above the tabs is an Enabled toggle that turns the entire router on or off, plus a Health URL display showing where the gateway is reachable.

Routing Profiles tab

A routing profile is a named bundle of settings that defines:

A friendly name (e.g. Auto, Org Global Default)
A description
A routing strategy that controls how the profile picks between models
A list of models the profile is allowed to use (with optional weights)
Per-profile API keys

When a client calls PebbleRouter, it specifies which profile to use, and PebbleRouter follows the profile’s rules to pick a concrete model.
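
As a rough mental model, the parts of a profile listed above can be written down as a small data structure. This is an illustrative sketch only: the class and field names (`RoutingProfile`, `ModelEntry`, `weight`) are hypothetical and do not reflect PebbleRouter's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class ModelEntry:
    """One row in a profile's model list (hypothetical shape)."""
    model_id: str
    weight: float = 1.0  # optional weight; 1.0 is an assumed default


@dataclass
class RoutingProfile:
    """Illustrative stand-in for a routing profile, not the real schema."""
    route: str          # name clients pass to select the profile
    description: str
    strategy: str       # e.g. "latency-based", "failover"
    models: list[ModelEntry] = field(default_factory=list)
    api_keys: list[str] = field(default_factory=list)  # per-profile keys


auto = RoutingProfile(
    route="Auto",
    description="Default profile used by PebbleChat",
    strategy="latency-based",
    models=[ModelEntry("claude-haiku"), ModelEntry("claude-sonnet", weight=2.0)],
)
print(auto.route, [m.model_id for m in auto.models])
```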

The profile selector

At the top of the Routing Profiles tab is a profile picker. Pick an existing profile to view/edit it, or click Create New Profile to make a new one. New profiles are blank — they don’t inherit models or settings from another profile.

Routing profile fields

For the selected profile, you’ll see:

Field	What it does
Route (e.g. `Auto`)	Internal name; this is what clients pass to PebbleRouter to use the profile
Description	Free-text human description of what the profile does
Strategy	The routing algorithm — see below
Models sub-tab	The list of models in this profile (one row each)
Routing sub-tab	Strategy-specific tuning (timeouts, retries, fallback chains)
API Keys sub-tab (per profile)	Keys scoped to this specific profile

Routing strategies

The available strategies typically include:

Org Global Default — uses your organisation’s default model order
Latency-based — picks the model with the lowest current latency
Cost-based — picks the cheapest model that meets the request’s capability requirements
Usage-based / least busy — load-balances across models based on current usage
Tag-based — routes based on tags in the request (e.g. “needs-vision”, “needs-long-context”)
Failover (priority) — tries models in a fixed priority order, falling back when one is unavailable
Round robin — distributes evenly across models
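
To make the differences between strategies concrete, here is a minimal sketch of how four of them might pick a model. It assumes the gateway tracks latency, cost, and in-flight requests per model; the statistics below are made-up placeholder numbers, and the function names are hypothetical rather than PebbleRouter internals.

```python
from itertools import cycle

# Made-up example statistics; real values would come from the gateway.
MODELS = {
    "claude-haiku": {"latency_ms": 300, "cost_per_1k": 0.25, "in_flight": 4},
    "claude-sonnet": {"latency_ms": 700, "cost_per_1k": 3.00, "in_flight": 1},
    "claude-opus": {"latency_ms": 1500, "cost_per_1k": 15.00, "in_flight": 2},
}


def pick_latency_based(models):
    """Latency-based: the model with the lowest current latency."""
    return min(models, key=lambda name: models[name]["latency_ms"])


def pick_cost_based(models, capable):
    """Cost-based: the cheapest model among those able to serve the request."""
    candidates = [name for name in models if name in capable]
    return min(candidates, key=lambda name: models[name]["cost_per_1k"])


def pick_least_busy(models):
    """Usage-based: the model with the fewest requests in flight."""
    return min(models, key=lambda name: models[name]["in_flight"])


def round_robin(models):
    """Round robin: an endless iterator over the models in listed order."""
    return cycle(list(models))
```

With these sample numbers, the latency-based pick is `claude-haiku`, the least-busy pick is `claude-sonnet`, and a cost-based pick restricted to `claude-sonnet` and `claude-opus` returns `claude-sonnet`.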

Step-by-step: creating a routing profile for cost optimisation

Click Create New Profile
Name it Cost Optimised
Description: Routes simple requests to Haiku, complex requests to Opus, with Sonnet as a middle ground
Strategy: pick Tag-based (or Usage-based depending on your setup)
Save
Switch to the Models sub-tab
Click Add Model
Add claude-haiku as the default for low-complexity requests
Add claude-sonnet for medium-complexity
Add claude-opus for high-complexity
Configure tags or weights as needed
Save
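
The profile built in the steps above could behave roughly like the sketch below, assuming Tag-based routing and that clients attach a complexity tag to each request. The tag names and the `resolve_by_tag` helper are hypothetical; the tags your deployment accepts may differ.

```python
# Hypothetical tag-to-model table for the "Cost Optimised" profile.
TAG_TO_MODEL = {
    "low-complexity": "claude-haiku",
    "medium-complexity": "claude-sonnet",
    "high-complexity": "claude-opus",
}
DEFAULT_MODEL = "claude-haiku"  # used when a request carries no known tag


def resolve_by_tag(tags):
    """Return the model for the most demanding tag present, else the default."""
    for tag in ("high-complexity", "medium-complexity", "low-complexity"):
        if tag in tags:
            return TAG_TO_MODEL[tag]
    return DEFAULT_MODEL


print(resolve_by_tag({"medium-complexity"}))
print(resolve_by_tag(set()))
```

Checking the most demanding tag first means a request tagged with several complexity levels is never under-served; that ordering is a design choice of this sketch, not documented gateway behaviour.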

Step-by-step: failover profile

Create a new profile named Failover
Strategy: Failover (priority)
Add models in priority order — primary first, fallback next
Configure timeouts and retry counts in the Routing sub-tab
Save

When clients use this profile, PebbleRouter tries the primary model first; if it fails or times out, it falls back to the next one in the list.
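
The fallback behaviour described above can be sketched as a loop over the priority list. This is a simplified illustration, assuming one retry per model and treating timeouts and connection errors as failures; `call_with_failover` and `fake_call` are hypothetical names, and the real gateway's retry rules come from the Routing sub-tab.

```python
def call_with_failover(models, call_model, retries=1):
    """Try each model in priority order; return (model, response) on first success."""
    failures = []
    for model in models:
        for attempt in range(1 + retries):
            try:
                return model, call_model(model)
            except (TimeoutError, ConnectionError) as exc:
                failures.append((model, attempt, repr(exc)))
    raise RuntimeError(f"all models failed: {failures}")


attempts = []


def fake_call(model):
    """Stand-in for a provider call: the primary always times out."""
    attempts.append(model)
    if model == "primary-model":
        raise TimeoutError("simulated timeout")
    return f"answer from {model}"


used, answer = call_with_failover(["primary-model", "fallback-model"], fake_call)
print(used, answer, attempts)
```

Here the primary is tried twice (the first attempt plus one retry) before the fallback answers.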

How profiles relate to PebbleChat’s `Auto`

When a user picks Auto in their PebbleChat model selector, PebbleChat sends the request to PebbleRouter using the organisation’s default profile. That profile is typically named Auto, and it is the one chat users experience whenever they leave the model choice to the router.

You can have many profiles — one for chat, one for batch processing, one for testing, one for cost optimisation — and route different parts of your platform through different profiles by changing which profile a client is configured to use.

API Keys tab

Generate API keys clients use to authenticate calls to PebbleRouter.

Three different API key concepts (don’t confuse them)

PebbleAI has three different things called “API Key” — make sure you know which is which:

Type	Where to find it	What it authenticates
PebbleRouter API Keys (this tab)	Admin → Organisation → Settings → PebbleRouter → API Keys	Calls to the routing gateway from external clients
PebbleFlows API Keys	User Settings → API Keys	Calls to individual flows you’ve published in PebbleFlows
Per-profile API Keys	Admin → Organisation → Settings → PebbleRouter → Routing Profiles → (a profile) → API Keys sub-tab	Calls to the gateway, scoped to a single routing profile

Step-by-step: generate a router API key

Click Generate Key
Give it a descriptive name — Production App, Mobile Backend, CI Pipeline
Optionally restrict it to a specific routing profile, or leave it global
Save
Copy the displayed key value — this is the only time you’ll see it in full
Store it in your application’s secure secrets system
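
Once the key lives in a secrets system, the application typically receives it through its environment rather than from source code. A minimal sketch of that pattern follows; the variable name `PEBBLEROUTER_API_KEY` and both helper functions are assumptions made for illustration, not names PebbleRouter defines.

```python
import os


def load_router_key(env=os.environ, name="PEBBLEROUTER_API_KEY"):
    """Read the router key from the environment and fail loudly if it is absent."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; inject it from your secrets manager")
    return key


def mask(key, visible=4):
    """Hide all but the last few characters so the key is safe to log."""
    return "*" * max(len(key) - visible, 0) + key[-visible:]


# Example with an explicit mapping instead of the real environment.
example_key = load_router_key({"PEBBLEROUTER_API_KEY": " example-key-1234 "})
print(mask(example_key))
```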

Quick start hint shown in the UI

The page includes a quick start link: “Quick Start: set base URL to <your-router-url> and create your global PebbleRouter key in Console > Cluster, Cluster Code, or Continue.”

This is where you copy the gateway base URL clients should call.

Using a router API key

curl -X POST https://<your-router-url>/v1/chat/completions \
  -H "Authorization: Bearer <your-router-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

The model field is the route name (Auto, Cost Optimised, etc.) — PebbleRouter resolves it to a concrete model based on the profile.
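
For clients written in Python, the same request can be built with the standard library. This sketch mirrors the curl call above and assumes nothing beyond it; `https://router.example.com` and `example-key` are placeholders, and `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, route, prompt):
    """Build (but do not send) a chat completion request for a given route."""
    body = json.dumps({
        "model": route,  # the route name, e.g. "Auto"
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("https://router.example.com", "example-key", "Auto", "Hello")
print(req.get_method(), req.full_url)
# To send it against a real gateway: urllib.request.urlopen(req, timeout=30)
```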

Settings tab

Gateway-wide configuration that applies to every routing profile.

What you set here

Verbose mode — Disabled / Limited / Full. Controls how much detail PebbleRouter logs about each request. Use Full while debugging routing issues, then dial back down — high verbosity has a small performance cost.
Cache TTL (s) — How long PebbleRouter should cache identical requests. Caching reduces cost and latency for repeated identical prompts.

Tips

Default to Limited verbosity in production, Full when debugging
Set cache TTL to 0 if you’re testing prompt changes — caching will hide your changes from view
Enable caching when serving public-facing applications with high request volume: repeated identical prompts are answered from the cache, which cuts both cost and latency
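
The Cache TTL setting can be pictured as a time-limited lookup keyed on the exact request. The sketch below is an assumption about the general technique, not PebbleRouter's implementation; the `TTLCache` class and its injectable clock are hypothetical and exist only to show why a TTL of 0 disables caching and why entries expire.

```python
import hashlib
import json
import time


class TTLCache:
    """Cache responses for identical requests for ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    @staticmethod
    def key_for(request):
        # Canonical JSON so that key order does not change the cache key.
        canonical = json.dumps(request, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, request):
        if self.ttl <= 0:  # TTL of 0 means caching is off
            return None
        entry = self._store.get(self.key_for(request))
        if entry is None:
            return None
        stored_at, response = entry
        if self.clock() - stored_at >= self.ttl:  # entry has expired
            return None
        return response

    def put(self, request, response):
        if self.ttl > 0:
            self._store[self.key_for(request)] = (self.clock(), response)
```

Because the key covers the whole request, any change to the prompt produces a cache miss; an unchanged prompt within the TTL returns the earlier response, which is why a TTL of 0 is safer while iterating on routing or model changes.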

Models composition

Within each routing profile, the Models table at the bottom of the page shows which models are part of the profile. Each row has:

Model — the model identifier (matching one in Organisation Models)
Provider — provider name
Status — Active / Inactive
Actions — remove from profile

Click Add Model to add a model that’s enabled in your Organisation Models catalogue.

A model must be in your Organisation Models catalogue before you can add it to a routing profile. This includes models with the “Pebble Router only” sharing scope, which is the scope to use for models that should appear in routing profiles but not in user model selectors.
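
The catalogue rule above amounts to a simple membership check. As an illustration only, with a hypothetical helper name and made-up catalogue contents:

```python
def split_by_catalogue(wanted_models, catalogue):
    """Return (addable, missing): which wanted models are in the catalogue."""
    enabled = set(catalogue)
    addable = [m for m in wanted_models if m in enabled]
    missing = [m for m in wanted_models if m not in enabled]
    return addable, missing


# Example catalogue contents are placeholders.
catalogue = ["claude-haiku", "claude-sonnet"]
addable, missing = split_by_catalogue(["claude-haiku", "claude-opus"], catalogue)
print(addable, missing)
```

Anything in `missing` would need to be enabled in Organisation Models before it can be added to the profile.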

Step-by-step: a complete first-time setup

For a fresh organisation:

Add credentials in Credentials — at least one for each provider (AWS, OpenAI, Anthropic)
Enable models in Organisation Models — pick the half-dozen models you want available
Come here and create the Auto routing profile
Add models to the profile from the enabled catalogue
Pick a strategy — Org Global Default or Latency-based is a good starting point
Save the profile
Generate a router API key (API Keys tab) for any external clients you have
Toggle the Enabled switch at the top of the page
Test from PebbleChat — pick Auto in the model selector and send a message; verify the response works
Watch the Activity Stream to see which model PebbleRouter picked

Troubleshooting

“Auto in PebbleChat shows ‘No Models Available’”

Check the Enabled toggle is on
Check the Auto profile has at least one model in its Models table
Check the model’s credential is valid in Credentials
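
The three checks above can be read as a short diagnostic routine. The sketch below only encodes that checklist; the function, its inputs, and the provider names are hypothetical, and in practice you would read these values from the admin UI.

```python
def diagnose_no_models(router_enabled, profile_models, valid_credentials):
    """List likely causes of 'No Models Available'.

    profile_models: dict mapping model id -> provider name
    valid_credentials: set of provider names with a working credential
    """
    problems = []
    if not router_enabled:
        problems.append("Enabled toggle is off")
    if not profile_models:
        problems.append("Auto profile has no models")
    for model, provider in profile_models.items():
        if provider not in valid_credentials:
            problems.append(f"{model}: no valid credential for {provider}")
    return problems


print(diagnose_no_models(True, {"claude-haiku": "Anthropic"}, {"OpenAI"}))
```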

“Routing always picks the same model even though I have multiple”

Strategy may be Failover (priority) with the first model always succeeding — try Latency-based or Round robin instead
Check the routing logs in PebbleObserve

“Latency is high”

Set verbose mode to Limited or Disabled
Enable response caching with a short TTL
Check which models are in the profile — slow models drag down latency-based strategies

“I want different routing for different teams”

Create separate profiles per team, give each team its own router API key tied to that profile
Or use tag-based routing within a single profile, with each team passing a different tag
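
The first option, one key per team tied to one profile, can be pictured as a lookup from key to permitted route. How the gateway actually enforces key scoping is not described here, so treat the sketch below as an assumption; the key values, profile names, and `authorise_route` helper are all hypothetical.

```python
# Hypothetical registry: key -> profile it is scoped to (None means global).
KEY_SCOPES = {
    "key-research-team": "Research",
    "key-support-team": "Support",
    "key-global": None,
}


def authorise_route(api_key, requested_route):
    """Return the route to use, or raise if the key may not use it."""
    if api_key not in KEY_SCOPES:
        raise PermissionError("unknown API key")
    scope = KEY_SCOPES[api_key]
    if scope is not None and scope != requested_route:
        raise PermissionError(f"key is scoped to {scope!r}, not {requested_route!r}")
    return requested_route


print(authorise_route("key-research-team", "Research"))
print(authorise_route("key-global", "Support"))
```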

Organisation Models — the source of models for routing profiles
Credentials — credentials for the underlying providers
PebbleObserve → Usage — see what PebbleRouter is actually doing
User Settings → API Keys — the other kind of API key (for individual flows, not the router)
