PebbleRouter

PebbleRouter is your organisation’s LLM gateway and routing control plane. It sits between PebbleChat (and any other client) and the underlying AI model providers, deciding which actual model handles each request based on the routing profile you configure.

When a user picks Auto in their PebbleChat model selector, PebbleRouter is the thing making the choice. When a flow calls a “Smart” or “Fast” alias, PebbleRouter is the thing resolving the alias.

Find it at Admin → Organisation → Settings → PebbleRouter.

Why a router exists

You could give every PebbleAI user direct access to every model and let them pick. That works on day one. By month three:

  • Costs are unpredictable because users default to the most expensive model
  • Outages aren’t handled gracefully — when one provider goes down, every chat using that provider breaks
  • You can’t enforce data residency rules (some models leave your region; users won’t notice)
  • You can’t enforce rate limits or quotas at the org level
  • A/B testing a new model means asking every user to switch manually

PebbleRouter is the answer to all of those:

  • Cost optimisation — route cheap requests to cheap models, save expensive models for hard problems
  • Failover — if one model is rate-limited or down, route to a sibling automatically
  • Compliance — enforce that all traffic goes through providers in your region
  • Quotas and limits — enforced at the gateway, not the client
  • Single integration point — clients call one URL with one API key, the router handles the rest

PebbleRouter is the routing gateway. PebbleAI provides the management UI, multi-tenancy, attribution, and integration with PebbleChat on top of it.

The three tabs

PebbleRouter has three tabs at the top of the page:

  1. Routing Profiles — the meat of the page; where you define routes
  2. API Keys — generate keys clients use to call PebbleRouter
  3. Settings — gateway-level options including allowed origins and verbosity

Above the tabs is an Enabled toggle that turns the entire router on or off, plus a Health URL display showing where the gateway is reachable.

Routing Profiles tab

A routing profile is a named bundle of settings that defines:

  • A friendly name (e.g. Auto, Org Global Default)
  • A description
  • A routing strategy that controls how the profile picks between models
  • A list of models the profile is allowed to use (with optional weights)
  • Per-profile API keys

When a client calls PebbleRouter, it specifies which profile to use, and PebbleRouter follows the profile’s rules to pick a concrete model.
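
As a rough mental model, a profile can be pictured as a small record holding the pieces listed above. The sketch below is illustrative only: the class and field names are assumptions made for this example, not PebbleRouter's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ModelEntry:
    """One row in a profile's model list (field names are illustrative)."""
    model: str
    provider: str
    weight: float = 1.0   # optional weight
    active: bool = True


@dataclass
class RoutingProfile:
    """Hypothetical in-memory shape of a routing profile."""
    route: str            # the name clients pass, e.g. "Auto"
    description: str
    strategy: str         # e.g. "failover" or "round-robin"
    models: list[ModelEntry] = field(default_factory=list)
    api_keys: list[str] = field(default_factory=list)

    def active_models(self) -> list[str]:
        """Return the identifiers of models currently eligible for routing."""
        return [m.model for m in self.models if m.active]


auto = RoutingProfile(
    route="Auto",
    description="Default profile used by PebbleChat",
    strategy="failover",
    models=[
        ModelEntry("claude-sonnet", "Anthropic"),
        ModelEntry("claude-haiku", "Anthropic", active=False),
    ],
)
print(auto.active_models())  # ['claude-sonnet']
```

The point of the sketch is that a client only ever names the route; everything else (strategy, model list, keys) lives on the profile.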

The profile selector

At the top of the Routing Profiles tab is a profile picker. Pick an existing profile to view/edit it, or click Create New Profile to make a new one. New profiles are blank — they don’t inherit models or settings from another profile.

Routing profile fields

For the selected profile, you’ll see:

Field | What it does
Route (e.g. Auto) | Internal name; this is what clients pass to PebbleRouter to use the profile
Description | Free-text human description of what the profile does
Strategy | The routing algorithm (see below)
Models sub-tab | The list of models in this profile (one row each)
Routing sub-tab | Strategy-specific tuning (timeouts, retries, fallback chains)
API Keys sub-tab (per profile) | Keys scoped to this specific profile

Routing strategies

The available strategies typically include:

  • Org Global Default — uses your organisation’s default model order
  • Latency-based — picks the model with the lowest current latency
  • Cost-based — picks the cheapest model that meets the request’s capability requirements
  • Usage-based / least busy — load-balances across models based on current usage
  • Tag-based — routes based on tags in the request (e.g. “needs-vision”, “needs-long-context”)
  • Failover (priority) — tries models in a fixed priority order, falling back when one is unavailable
  • Round robin — distributes evenly across models
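
To make the differences concrete, here is a minimal sketch of how four of these strategies could choose a model from the same pool. It is not PebbleRouter's implementation; the function names and the latency, cost, and load figures are invented for illustration.

```python
import itertools

# Hypothetical per-model stats; every number here is made up.
MODELS = {
    "claude-haiku":  {"latency_ms": 400,  "cost_per_1k": 0.25,  "in_flight": 7},
    "claude-sonnet": {"latency_ms": 900,  "cost_per_1k": 3.00,  "in_flight": 2},
    "claude-opus":   {"latency_ms": 1800, "cost_per_1k": 15.00, "in_flight": 4},
}


def pick_latency_based(models):
    """Latency-based: the model with the lowest current latency."""
    return min(models, key=lambda name: models[name]["latency_ms"])


def pick_cost_based(models, capable):
    """Cost-based: the cheapest model among those able to serve the request."""
    return min(capable, key=lambda name: models[name]["cost_per_1k"])


def pick_least_busy(models):
    """Usage-based: the model with the fewest requests in flight."""
    return min(models, key=lambda name: models[name]["in_flight"])


def round_robin(models):
    """Round robin: an endless iterator cycling through the models in order."""
    return itertools.cycle(models)


print(pick_latency_based(MODELS))                              # claude-haiku
print(pick_cost_based(MODELS, ["claude-sonnet", "claude-opus"]))  # claude-sonnet
print(pick_least_busy(MODELS))                                 # claude-sonnet
rr = round_robin(MODELS)
print([next(rr) for _ in range(4)])
# ['claude-haiku', 'claude-sonnet', 'claude-opus', 'claude-haiku']
```

Note how the same pool yields different answers depending on the strategy, which is why the choice of strategy matters more than the order in which models were added.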

Step-by-step: creating a routing profile for cost optimisation

  1. Click Create New Profile
  2. Name it Cost Optimised
  3. Description: Routes simple requests to Haiku, complex requests to Opus, with Sonnet as a middle ground
  4. Strategy: pick Tag-based (or Usage-based depending on your setup)
  5. Save
  6. Switch to the Models sub-tab
  7. Click Add Model
  8. Add claude-haiku as the default for low-complexity requests
  9. Add claude-sonnet for medium-complexity
  10. Add claude-opus for high-complexity
  11. Configure tags or weights as needed
  12. Save
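
Conceptually, the tag-based variant of this profile behaves like a lookup from request tag to model, with the cheapest model as the default. The sketch below assumes hypothetical tag names (low-complexity, medium-complexity, high-complexity); the tags your organisation uses, and how PebbleRouter matches them, may differ.

```python
# Hypothetical tag-to-model mapping for the Cost Optimised profile.
TAG_ROUTES = {
    "low-complexity": "claude-haiku",
    "medium-complexity": "claude-sonnet",
    "high-complexity": "claude-opus",
}
DEFAULT_MODEL = "claude-haiku"  # used when no recognised tag is present


def resolve_by_tag(tags):
    """Return the model for the first recognised tag, else the default."""
    for tag in tags:
        if tag in TAG_ROUTES:
            return TAG_ROUTES[tag]
    return DEFAULT_MODEL


print(resolve_by_tag(["high-complexity"]))               # claude-opus
print(resolve_by_tag(["unknown", "medium-complexity"]))  # claude-sonnet
print(resolve_by_tag([]))                                # claude-haiku
```

Falling back to the cheapest model keeps untagged traffic inexpensive; if untagged requests tend to be hard ones, a mid-tier default may be the safer choice.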

Step-by-step: failover profile

  1. Create a new profile named Failover
  2. Strategy: Failover (priority)
  3. Add models in priority order — primary first, fallback next
  4. Configure timeouts and retry counts in the Routing sub-tab
  5. Save

When clients use this profile, PebbleRouter tries the primary model first; if it fails or times out, it falls back to the next one in the list.
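
The behaviour described above can be sketched as a loop over the priority list with a per-model retry budget. This is an illustration under stated assumptions, not the gateway's code: `call_with_failover`, `ModelUnavailable`, and the model names are all made up for the example.

```python
class ModelUnavailable(Exception):
    """Raised by a (hypothetical) provider call that failed or timed out."""


def call_with_failover(models, send, retries_per_model=1):
    """Try each model in priority order and return (model, response).

    `send(model)` is a caller-supplied function that performs the request
    and raises ModelUnavailable on failure or timeout.
    """
    errors = []
    for model in models:
        # One initial attempt plus the configured number of retries.
        for attempt in range(1 + retries_per_model):
            try:
                return model, send(model)
            except ModelUnavailable as exc:
                errors.append(f"{model} attempt {attempt + 1}: {exc}")
    raise ModelUnavailable("all models failed: " + "; ".join(errors))


# Simulated provider: the primary is down, the fallback works.
def fake_send(model):
    if model == "primary-model":
        raise ModelUnavailable("timeout")
    return f"response from {model}"


print(call_with_failover(["primary-model", "fallback-model"], fake_send))
# ('fallback-model', 'response from fallback-model')
```

Retries multiply worst-case latency: with two models, one retry each, and a 30-second timeout, a client could wait up to two minutes before seeing an error, so keep timeouts and retry counts modest.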

How profiles relate to PebbleChat’s Auto

When a user picks Auto in their PebbleChat model selector, PebbleChat sends the request to PebbleRouter using the organisation’s default profile. That profile is typically named Auto, and it is the routing behaviour most users will experience.

You can have many profiles — one for chat, one for batch processing, one for testing, one for cost optimisation — and route different parts of your platform through different profiles by changing which profile a client is configured to use.

API Keys tab

Generate API keys clients use to authenticate calls to PebbleRouter.

Three different API key concepts (don’t confuse them)

PebbleAI has three different things called “API Key” — make sure you know which is which:

Type | Where to find it | What it authenticates
PebbleRouter API Keys (this tab) | Admin → PebbleRouter → API Keys | Calls to the routing gateway from external clients
PebbleFlows API Keys | User Settings → API Keys | Calls to individual flows you’ve published in PebbleFlows
Per-profile API Keys | Admin → PebbleRouter → Routing Profiles → (a profile) → API Keys sub-tab | Calls to the gateway, scoped to a single routing profile

Step-by-step: generate a router API key

  1. Click Generate Key
  2. Give it a descriptive name — Production App, Mobile Backend, CI Pipeline
  3. Optionally restrict it to a specific routing profile, or leave it global
  4. Save
  5. Copy the displayed key value — this is the only time you’ll see it in full
  6. Store it in your application’s secure secrets system

Quick start hint shown in the UI

The page includes a quick start link: “Quick Start: set base URL to <your-router-url> and create your global PebbleRouter key in Console > Cluster, Cluster Code, or Continue.”

This is where you copy the gateway base URL clients should call.

Using a router API key

curl -X POST https://<your-router-url>/v1/chat/completions \
  -H "Authorization: Bearer <your-router-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

The model field is the route name (Auto, Cost Optimised, etc.) — PebbleRouter resolves it to a concrete model based on the profile.
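
The same call can be made from application code. The sketch below mirrors the curl example using only Python's standard library; the environment variable names PEBBLEROUTER_URL and PEBBLEROUTER_API_KEY and the placeholder host are assumptions for this example, so substitute whatever your secrets system provides.

```python
import json
import os
import urllib.request


def build_chat_request(base_url, api_key, route, prompt):
    """Build (but do not send) a chat completion request for the gateway."""
    body = {"model": route, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Read the URL and key from the environment rather than hard-coding them.
# Both variable names are hypothetical.
request = build_chat_request(
    os.environ.get("PEBBLEROUTER_URL", "https://router.example.invalid"),
    os.environ.get("PEBBLEROUTER_API_KEY", "placeholder-key"),
    route="Auto",
    prompt="Hello",
)
print(request.get_method(), request.full_url)

# To actually send it:
#   with urllib.request.urlopen(request, timeout=30) as resp:
#       print(json.load(resp))
```

Switching profiles is then a one-argument change (for example `route="Cost Optimised"`), with no change to the URL or key unless the key is scoped to a single profile.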

Settings tab

Gateway-wide configuration that applies to every routing profile.

What you set here

  • Verbose mode — Disabled / Limited / Full. Controls how much detail PebbleRouter logs about each request. Use Full while debugging routing issues, then dial back down — high verbosity has a small performance cost.
  • Cache TTL (s) — How long PebbleRouter should cache identical requests. Caching reduces cost and latency for repeated identical prompts.
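
To illustrate what a cache TTL means in practice, here is a minimal sketch of a response cache keyed on the full request body. It is a generic technique, not PebbleRouter's implementation; the class name, the hashing scheme, and the treatment of a TTL of 0 as "caching off" are assumptions.

```python
import hashlib
import json
import time


class TTLCache:
    """Cache responses for identical requests for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the expiry logic can be tested
        self._store = {}

    @staticmethod
    def key_for(request_body):
        # Canonical JSON so that key order does not change the cache key.
        canonical = json.dumps(request_body, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, request_body):
        entry = self._store.get(self.key_for(request_body))
        if entry is None:
            return None
        expires_at, response = entry
        if self.clock() >= expires_at:
            return None         # entry is stale; treat as a miss
        return response

    def put(self, request_body, response):
        if self.ttl <= 0:
            return              # assumed: a TTL of 0 disables caching
        key = self.key_for(request_body)
        self._store[key] = (self.clock() + self.ttl, response)


cache = TTLCache(ttl_seconds=60)
req = {"model": "Auto", "messages": [{"role": "user", "content": "Hello"}]}
cache.put(req, "cached answer")
print(cache.get(req))  # cached answer
```

Because the key covers the whole request, changing a single character of the prompt produces a miss, while resending an unchanged prompt within the TTL returns the old response.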

Tips

  • Default to Limited verbosity in production, Full when debugging
  • Set cache TTL to 0 if you’re testing prompt changes — caching will hide your changes from view
  • Enable caching when serving public-facing applications with high request volume: repeated identical prompts are then answered from the cache, which cuts both cost and latency

Models composition

Within each routing profile, the Models table at the bottom of the page shows which models are part of the profile. Each row has:

  • Model — the model identifier (matching one in Organisation Models)
  • Provider — provider name
  • Status — Active / Inactive
  • Actions — remove from profile

Click Add Model to add a model that’s enabled in your Organisation Models catalogue.

A model must be in your Organisation Models catalogue before you can add it to a routing profile. This includes models with the Pebble Router only sharing scope, which is the scope to use for models that should be in routing profiles but not in user model selectors.

Step-by-step: a complete first-time setup

For a fresh organisation:

  1. Add credentials in Credentials — at least one for each provider (AWS, OpenAI, Anthropic)
  2. Enable models in Organisation Models — pick the half-dozen models you want available
  3. Come here and create the Auto routing profile
  4. Add models to the profile from the enabled catalogue
  5. Pick a strategy: Org Global Default or Latency-based is a good starting point
  6. Save the profile
  7. Generate a router API key (API Keys tab) for any external clients you have
  8. Toggle the Enabled switch at the top of the page
  9. Test from PebbleChat — pick Auto in the model selector and send a message; verify the response works
  10. Watch the Activity Stream to see which model PebbleRouter picked

Troubleshooting

“Auto in PebbleChat shows ‘No Models Available’”

  • Check the Enabled toggle is on
  • Check the Auto profile has at least one model in its Models table
  • Check the model’s credential is valid in Credentials

“Routing always picks the same model even though I have multiple”

  • Strategy may be Failover (priority) with the first model always succeeding — try Latency-based or Round robin instead
  • Check the routing logs in PebbleObserve

“Latency is high”

  • Set verbose mode to Limited or Disabled
  • Enable response caching with a short TTL
  • Check which models are in the profile — slow models drag down latency-based strategies

“I want different routing for different teams”

  • Create separate profiles per team, give each team its own router API key tied to that profile
  • Or use tag-based routing within a single profile, with each team passing a different tag
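
The first option (one key per team, each tied to a profile) amounts to a scope check at the gateway. The sketch below shows the idea with a made-up key registry; the key values, profile names, and the `is_allowed` function are hypothetical, and real keys should never be hard-coded like this.

```python
# Hypothetical key registry: key -> profile it is scoped to (None = global).
KEY_SCOPES = {
    "key-team-research": "Research",
    "key-team-support": "Support",
    "key-global": None,
}


def is_allowed(api_key, route):
    """Return True if the key exists and may call the requested route."""
    if api_key not in KEY_SCOPES:
        return False
    scope = KEY_SCOPES[api_key]
    # A global key may use any profile; a scoped key only its own.
    return scope is None or scope == route


print(is_allowed("key-team-research", "Research"))  # True
print(is_allowed("key-team-research", "Support"))   # False
print(is_allowed("key-global", "Support"))          # True
print(is_allowed("unknown-key", "Research"))        # False
```

Scoped keys also make attribution simpler: every request that arrives with a team's key is known to belong to that team's profile.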