OpenAI-compatible providers
Configure any LLM provider that provides OpenAI-compatible endpoints with agentgateway. This includes providers like xAI (Grok), Cohere, Ollama, Together AI, Groq, and many others.
xAI (Grok)
xAI provides OpenAI-compatible endpoints for their Grok models.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.x.ai
backendTLS: {}
backendAuth:
key: $XAI_API_KEY
backends:
- ai:
name: xai
hostOverride: api.x.ai:443
provider:
openAI:
model: grok-2-latestCohere
Cohere provides an OpenAI-compatible endpoint for their models.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.cohere.ai
path:
full: "/compatibility/v1/chat/completions"
backendTLS: {}
backendAuth:
key: $COHERE_API_KEY
backends:
- ai:
name: cohere
hostOverride: api.cohere.ai:443
provider:
openAI:
model: command-r-plusOllama (Local)
Ollama runs models locally and provides an OpenAI-compatible API.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: localhost:11434
backends:
- ai:
name: ollama
hostOverride: localhost:11434
provider:
openAI:
model: llama3.2Together AI
Together AI provides access to open-source models via OpenAI-compatible endpoints.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.together.xyz
backendTLS: {}
backendAuth:
key: $TOGETHER_API_KEY
backends:
- ai:
name: together
hostOverride: api.together.xyz:443
provider:
openAI:
model: meta-llama/Llama-3.2-90B-Vision-Instruct-TurboGroq
Groq provides fast inference via OpenAI-compatible endpoints.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.groq.com
path:
full: "/openai/v1/chat/completions"
backendTLS: {}
backendAuth:
key: $GROQ_API_KEY
backends:
- ai:
name: groq
hostOverride: api.groq.com:443
provider:
openAI:
model: llama-3.3-70b-versatileDeepSeek
DeepSeek provides access to their reasoning and chat models via OpenAI-compatible API.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.deepseek.com
backendTLS: {}
backendAuth:
key: $DEEPSEEK_API_KEY
backends:
- ai:
name: deepseek
hostOverride: api.deepseek.com:443
provider:
openAI:
model: deepseek-chatMistral
Mistral La Plateforme provides access to Mistral models via OpenAI-compatible endpoints.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.mistral.ai
backendTLS: {}
backendAuth:
key: $MISTRAL_API_KEY
backends:
- ai:
name: mistral
hostOverride: api.mistral.ai:443
provider:
openAI:
model: mistral-large-latestPerplexity
Perplexity provides OpenAI-compatible chat completion endpoints.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.perplexity.ai
backendTLS: {}
backendAuth:
key: $PERPLEXITY_API_KEY
backends:
- ai:
name: perplexity
hostOverride: api.perplexity.ai:443
provider:
openAI:
model: llama-3.1-sonar-large-128k-onlineFireworks AI
Fireworks AI offers fast inference for open-source models via OpenAI-compatible API.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: api.fireworks.ai
path:
full: "/inference/v1/chat/completions"
backendTLS: {}
backendAuth:
key: $FIREWORKS_API_KEY
backends:
- ai:
name: fireworks
hostOverride: api.fireworks.ai:443
provider:
openAI:
model: accounts/fireworks/models/llama-v3p1-70b-instructSelf-Hosted Solutions
vLLM
vLLM is a high-performance LLM serving engine for self-hosted deployments.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: vllm-server.example.com:8000 # Your vLLM server
backendTLS: {} # Include if vLLM uses HTTPS
backends:
- ai:
name: vllm
hostOverride: vllm-server.example.com:8000
provider:
openAI:
model: meta-llama/Llama-3.1-8B-Instruct # Model loaded in vLLMhttp://localhost:8000 and omit the backendTLS policy.Starting vLLM server:
vllm serve meta-llama/Llama-3.1-8B-Instruct \
--host 0.0.0.0 \
--port 8000LM Studio
LM Studio provides a desktop UI for running models locally with OpenAI-compatible API.
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: localhost:1234 # LM Studio default port
backends:
- ai:
name: lmstudio
hostOverride: localhost:1234
provider:
openAI:
model: llama-3.2-3b-instruct # Model loaded in LM StudioGeneric configuration
For any OpenAI-compatible provider, use this template:
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
listeners:
- routes:
- policies:
urlRewrite:
authority:
full: <provider-api-host>
path:
full: "<provider-chat-endpoint>" # Often /v1/chat/completions
backendTLS: {} # Include if provider uses HTTPS
backendAuth:
key: $PROVIDER_API_KEY
backends:
- ai:
name: <provider-name>
hostOverride: <provider-api-host>:443
provider:
openAI:
model: <model-name>Configuration reference
| Setting | Description |
|---|---|
urlRewrite |
Configure a policy to rewrite the URL of the upstream requests to match your LLM provider. |
authority |
Set the default hostname authority to forward incoming requests. |
path |
Rewrite the path to the appropriate LLM provider endpoint. This setting is optional if requests on the provider hostname are already sent on this path. |
backendTLS |
Optionally configure a policy to use TLS when connecting to the LLM provider. |
backendAuth |
You can optionally configure a policy to attach an API key that authenticates to the LLM provider on outgoing requests. If you do not include an API key, each request must authenticate per the LLM provider requirements. |
ai.name |
The name of the LLM provider for this AI backend. |
ai.hostOverride |
Override the hostname. If not set, the hostname defaults to OpenAI (api.openai.com). This setting is optional if the hostname is already set in the URL rewrite policy’s authority setting. |
ai.provider.openAI.model |
Optionally set the model to use for requests. If not set, the request must include the model to use. |