Providers & connections
Pyre is bring-your-own-key (BYOK): it does nothing until you point it at a model provider. You add an endpoint and a key once, and every request goes straight from your device to that provider over HTTPS — there is no Pyre server in the middle.
How the connection works
Pyre talks to any OpenAI-compatible endpoint — OpenRouter, OpenAI, NanoGPT, a self-hosted/local server, a community proxy, anything that speaks the standard chat-completions protocol. The connection is direct: your messages travel exactly one hop, from your device to the provider you chose. Nobody else sees your traffic, and Pyre never proxies it.
You manage all of this under More → API Connections.
Provider kinds
When you add a provider you choose a kind. This is purely a UI grouping to keep your list tidy — all three kinds use the same OpenAI-compatible protocol under the hood, so the kind you pick does not change how requests are made.
| Kind | What it's for |
|---|---|
| External | First-party hosted services (e.g. OpenRouter, OpenAI, NanoGPT). |
| Proxy | A community SillyTavern-style URL + key. |
| Localhost | A self-hosted model running on the same machine or your LAN. |
Localhost / self-hosted servers
If you run your own model server (for example a local OpenAI-compatible server), choose the Localhost kind. A few things work differently here:
- The API key is usually optional. Most local servers ignore it — leave it blank if yours does.
- The server auto-loads the model you pick. Browse the server's model list, pick one, and many local servers load that model on demand when Pyre requests it. (Some servers ignore the requested name and use whatever model is already loaded.)
- Warm-up on launch. Pyre can fire a tiny preload request when the app starts so the model is ready by the time you chat, instead of paying the cold-load wait on your first message. This is a localhost-only option, controllable per provider.
- Longer timeouts. Local servers can take a while to hold a connection open while a cold model loads, and local inference is often slower than a hosted API, so Pyre uses more patient timeouts for localhost providers than for cloud ones.
Note
Pyre browses a server's models via its /models endpoint. That list can include non-chat models (such as embedding models) — pick a chat- or vision-capable one.
How to add a provider
- Open More → API Connections and add a new provider.
- Choose its kind (External / Proxy / Localhost).
- Paste the endpoint URL and your API key (the key is optional for most localhost servers).
- Browse models to fetch the provider's model list, then pick a model — or type the model name yourself.
Tip
Pyre uses smart URL handling, so you don't have to be precise about the /v1 suffix — it won't produce a doubled /v1/v1/ if you paste a base URL that already includes it.
Multiple providers and switching
You can keep several providers configured at once and switch which one is active for chat at any time. The Creator can also point at a different recommended model than the one you chat with — handy when a model that's great at roleplay isn't the best at building a card, or vice versa.
Per-provider extra params
Each provider has a free-form extra params field: JSON that Pyre merges into the request body. This is the escape hatch for provider-specific flags Pyre doesn't model directly — you don't have to wait for app support to send them.
The most common use is the per-family "disable reasoning" flag, which differs between model families:
| Model family | Typical use |
|---|---|
| Qwen | Turn its reasoning step off |
| OpenAI o-series | Turn its reasoning step off |
| Grok | Turn its reasoning step off |
| DeepSeek R1 | Turn its reasoning step off |
The exact key/value depends on the family and provider — consult your provider's docs for the field it expects.
Warning
Pyre-managed fields always win on conflict. If your extra-params JSON sets a field Pyre already controls (such as a sampling parameter it manages), Pyre's value takes precedence.
Smart provider fallback
When a generation fails or is refused and you have another provider configured, Pyre offers to switch to the next provider in your chain — so a down or refusing provider doesn't dead-end your scene. This always asks first; Pyre never switches silently.
- Conservative refusal detection. Pyre only treats a reply as a refusal when it is short, carries no roleplay markup (
*actions*/"dialogue"), and contains an English refusal phrase. In-character "as an AI" lines and emotional "I'm sorry" beats won't false-trigger it. - Self-learning history. Pyre tracks which providers tend to refuse, so it can suggest one that "tends to handle this better" — counted at most once per message to avoid inflation.
- Opt-out. A toggle in API Connections → Advanced disables the prompt entirely; behavior is then identical to having no fallback.
Context-window display
Pyre makes a best-effort attempt to auto-detect the active model's context length from the provider's /models endpoint (it scans the field variants used by different servers). When auto-detection can't determine it, you can set a manual override — a universal escape hatch. Knowing the context size helps you judge how much room you have for memory and presets.
Connection validation & security
Pyre guards the connection so your key and your screen stay clean:
- Header sanitization blocks CR/LF (carriage-return / line-feed) injection in request headers.
- Secret scrubbing redacts leaked
Bearer/sk-tokens from provider error bodies before any error message reaches the screen — so your key never lands on-screen or in a screenshot.
Tip
Free options exist if you want to try Pyre before paying for a model — some providers (for example OpenRouter) offer free tiers, and a local server costs nothing to run. These are examples, not endorsements; use whatever provider you trust.
See also
- AI Creator — which can point at a different recommended model than your chat provider.
- Presets & sampling — the per-chat tuning that layers on top of your provider.
- Data & keys — where your API keys live and how backups handle them.