Lightweight AI Gateway
One API key, works everywhere. A universal, OpenAI-compatible AI gateway providing access to 40 models: 16 frontier and 24 open & specialized.
Getting Started
Connecting to Lightweight is simple. Because our API is 100% compatible with OpenAI's API format, any tool or library that supports OpenAI can connect to Lightweight by changing just two settings.
- Obtain your API key from the Lightweight Dashboard.
- Set your tool's API base URL to https://api.lightweight.one/v1.
- Use your Lightweight key in place of an OpenAI key.
For standard shell environments, simply export these variables:
export OPENAI_API_KEY="lw-your-key-here"
export OPENAI_BASE_URL="https://api.lightweight.one/v1"
API Reference
The Lightweight API strictly follows the OpenAI specification, meaning no new SDKs are required. You can interact with it using standard HTTP requests or any OpenAI client library.
- Base URL: https://api.lightweight.one/v1
- Authentication: Bearer token in the `Authorization` header, using your `lw-...` key.
Chat Completions
POST /v1/chat/completions
Generate text or code based on conversational context.
curl "https://api.lightweight.one/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer lw-your-key-here" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a quicksort in Python."}
    ],
    "stream": true
  }'
Streaming Support: Set "stream": true in your request to receive Server-Sent Events (SSE) exactly matching OpenAI's streaming format.
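As a sketch of how a client can consume that SSE stream: each event arrives as a `data: {...}` line and the stream ends with the `data: [DONE]` sentinel. The chunk payloads below are illustrative samples in OpenAI's chunk format, not captured output from the live API.

```python
import json

def iter_sse_content(lines):
    """Yield content deltas from OpenAI-style SSE lines.

    Each event is a line like 'data: {...}'; the stream ends
    with the sentinel 'data: [DONE]'.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Illustrative stream (shapes follow OpenAI's chunk format):
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": "!"}}]}',
    'data: [DONE]',
]
print("".join(iter_sse_content(sample)))  # Hello!
```

In practice the OpenAI SDKs handle this parsing for you; raw SSE handling like the above is only needed when making plain HTTP requests.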
List Models
GET /v1/models
Returns a list of all currently accessible models on your account.
curl "https://api.lightweight.one/v1/models" \
-H "Authorization: Bearer lw-your-key-here"
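If you prefer to work with the response in code, the endpoint returns the standard OpenAI list shape (`{"object": "list", "data": [...]}`); a small sketch, with an illustrative payload rather than real output:

```python
def model_ids(list_response):
    """Extract sorted model IDs from an OpenAI-format /v1/models response."""
    return sorted(item["id"] for item in list_response["data"])

# Illustrative response body (standard OpenAI list shape):
sample = {
    "object": "list",
    "data": [
        {"id": "gpt-5", "object": "model"},
        {"id": "gpt-4o", "object": "model"},
        {"id": "deepseek-r1", "object": "model"},
    ],
}
print(model_ids(sample))  # ['deepseek-r1', 'gpt-4o', 'gpt-5']
```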
Check Usage
GET /v1/usage
Returns your current token consumption and remaining quota.
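As a sketch of turning that payload into a "quota remaining" figure: the field names below (`used_tokens`, `quota_tokens`) are assumptions for illustration only; check the actual /v1/usage response for the real schema.

```python
def remaining_fraction(usage):
    """Fraction of quota left, given a usage payload.

    NOTE: the field names used_tokens / quota_tokens are assumed
    for illustration; consult the real /v1/usage schema.
    """
    return 1 - usage["used_tokens"] / usage["quota_tokens"]

sample = {"used_tokens": 250_000, "quota_tokens": 1_000_000}
print(remaining_fraction(sample))  # 0.75
```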
Error Codes
| Status | Description |
|---|---|
| 401 Unauthorized | Invalid, revoked, or missing API key. |
| 403 Forbidden | Account limit reached, or model access denied. |
| 429 Too Many Requests | Rate limit exceeded. Check the `Retry-After` header. |
| 502 Bad Gateway | Upstream provider error or timeout. |
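These codes map naturally onto client-side retry logic; a minimal sketch (the helper name and policy below are our own, not part of the API):

```python
def retry_policy(status, headers=None):
    """Suggest (should_retry, wait_seconds) for a gateway status code."""
    headers = headers or {}
    if status in (401, 403):
        return (False, 0)  # fix the key or quota; retrying won't help
    if status == 429:
        # Honor the Retry-After header when present
        return (True, int(headers.get("Retry-After", 1)))
    if status == 502:
        return (True, 2)   # transient upstream error: brief backoff
    return (False, 0)

print(retry_policy(429, {"Retry-After": "30"}))  # (True, 30)
print(retry_policy(401))                         # (False, 0)
```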
Models Catalog
We provide unified access to 40 state-of-the-art models. Use the exact string in the "Model ID" column as the `model` parameter in your API requests.
Frontier Models (16)
| Model ID | Context | Status |
|---|---|---|
| gpt-5 | 128K | GA |
| gpt-5-chat | 128K | GA |
| gpt-5-mini | 128K | GA |
| gpt-5-nano | 128K | GA |
| gpt-4.1 | 1M | GA |
| gpt-4.1-mini | 1M | GA |
| gpt-4.1-nano | 1M | GA |
| gpt-4o | 128K | GA |
| gpt-4o-mini | 128K | GA |
| o3 | 200K | GA |
| o3-mini | 200K | GA |
| o4-mini | 200K | GA |
| o1 | 200K | GA |
| o1-mini | 128K | GA |
| grok-3 | 131K | GA |
| grok-3-mini | 131K | GA |
Open & Specialized Models (24)
| Model ID | Context | Status |
|---|---|---|
| llama-4-maverick-17b-128e-instruct-fp8 | 256K | GA |
| llama-4-scout-17b-16e-instruct | 512K | GA |
| meta-llama-3.1-405b-instruct | 128K | GA |
| meta-llama-3.1-8b-instruct | 128K | GA |
| llama-3.3-70b-instruct | 128K | GA |
| llama-3.2-11b-vision-instruct | 128K | GA |
| llama-3.2-90b-vision-instruct | 128K | GA |
| phi-4 | 16K | GA |
| phi-4-mini-instruct | 128K | GA |
| phi-4-mini-reasoning | 128K | GA |
| phi-4-multimodal-instruct | 128K | GA |
| phi-4-reasoning | 16K | GA |
| mai-ds-r1 | 32K | GA |
| codestral-2501 | 256K | GA |
| ministral-3b | 128K | GA |
| mistral-medium-2505 | 128K | GA |
| mistral-small-2503 | 32K | GA |
| deepseek-r1 | 164K | GA |
| deepseek-r1-0528 | 164K | GA |
| deepseek-v3-0324 | 128K | GA |
| cohere-command-a | 128K | GA |
| cohere-command-r-08-2024 | 128K | GA |
| cohere-command-r-plus-08-2024 | 128K | GA |
Integration Guides
Because Lightweight implements the OpenAI API, you can connect practically any AI coding tool, editor, or framework. Find your tool below.
OpenCode
OpenCode supports generic OpenAI-compatible endpoints directly in its settings.
- Open OpenCode Settings.
- Find the AI Provider section and select OpenAI-compatible.
- Set Base URL to https://api.lightweight.one/v1.
- Set API Key to your `lw-...` key.
- Save and test a prompt.

Tip: You can manually type the model name (e.g., `gpt-4.1`) if it doesn't appear in a dropdown.
Cursor
Cursor allows overriding the default OpenAI Base URL for custom proxy routing.
- Open Cursor Settings.
- Navigate to Models > OpenAI.
- Toggle "Override OpenAI Base URL".
- Set the URL to https://api.lightweight.one/v1.
- Paste your `lw-...` key in the OpenAI API Key field.
- Ensure external infrastructure integration toggles are off.
Aider
Aider is an AI pair programming tool in your terminal. You can pass the API settings as CLI arguments or environment variables.
aider --openai-api-key "lw-your-key-here" --openai-api-base "https://api.lightweight.one/v1" --model "gpt-4o"
Alternatively, export OPENAI_API_KEY and
OPENAI_API_BASE in your shell profile.
Zed
Zed's built-in AI assistant can be configured via your
settings.json.
- Open Zed and edit your `settings.json`.
- Add or update the `language_models` block:
"language_models": {
  "openai": {
    "api_url": "https://api.lightweight.one/v1",
    "api_key": "lw-your-key-here"
  }
}
Continue
Continue is an open-source IDE extension.
- Open your `~/.continue/config.json` file.
- In the `models` array, configure an OpenAI provider:
{
  "models": [
    {
      "title": "Lightweight GPT-4.1",
      "provider": "openai",
      "model": "gpt-4.1",
      "apiKey": "lw-your-key-here",
      "apiBase": "https://api.lightweight.one/v1"
    }
  ]
}
Cline
Cline works with any OpenAI-compatible endpoint.
- Open Cline's settings panel.
- Under API Provider, select OpenAI Compatible.
- Set Base URL to https://api.lightweight.one/v1.
- Paste your `lw-...` key in the API Key field.
- Specify the Model ID (e.g., `gpt-5`).
Windsurf
Windsurf can be configured to use external API gateways.
- Open Windsurf settings.
- Navigate to the API or Models configuration section.
- Select Custom API Endpoint.
- Enter the Base URL: https://api.lightweight.one/v1.
- Input your Lightweight API key.
Ollama
If you use tools that expect an Ollama-style proxy but want to route them through Lightweight, set these environment variables before starting the tool.
export OLLAMA_OPENAI_BASE_URL="https://api.lightweight.one/v1"
export OLLAMA_OPENAI_API_KEY="lw-your-key-here"
OpenAI SDK (Python)
The official Python SDK requires no modifications other than instantiating the client with our base URL and key.
from openai import OpenAI
client = OpenAI(
    base_url="https://api.lightweight.one/v1",
    api_key="lw-your-key-here",
)

response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
OpenAI SDK (Node.js)
Similarly, the Node.js/TypeScript SDK natively supports custom base URLs.
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.lightweight.one/v1",
  apiKey: "lw-your-key-here",
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello!" }],
  });
  console.log(completion.choices[0].message.content);
}
main();
LangChain
Use the standard ChatOpenAI class in LangChain,
overriding the base URL parameter.
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    base_url="https://api.lightweight.one/v1",
    api_key="lw-your-key-here",
    model="gpt-4.1",
)
print(llm.invoke("Explain quantum entanglement.").content)
LiteLLM
When using LiteLLM to proxy requests locally, direct the OpenAI proxy target to Lightweight.
litellm --model openai/gpt-4.1 \
--api_base "https://api.lightweight.one/v1" \
--api_key "lw-your-key-here"
Any OpenAI-Compatible Tool
For any other application without an explicit UI setting, the underlying OpenAI SDKs generally respect these environment variables.
- Find where the tool loads its environment variables (e.g., a `.env` file, a shell script, or system environment settings).
- Set `OPENAI_API_KEY=lw-your-key`
- Set `OPENAI_BASE_URL=https://api.lightweight.one/v1`
- Restart the application.
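For example, a `.env` file for such a tool would contain (the key value is a placeholder):

```shell
OPENAI_API_KEY=lw-your-key-here
OPENAI_BASE_URL=https://api.lightweight.one/v1
```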
Rate Limits
To ensure fair usage across the beta, we enforce RPM (Requests Per
Minute) limits based on your account tier. When you exceed the
limit, you will receive a
429 Too Many Requests response.
| Account Tier | Requests Per Minute (RPM) |
|---|---|
| Beta Access | 60 RPM |
| Pro Tier | 250 RPM |
| Enterprise | Custom |
If you hit a rate limit, the response headers will include a
Retry-After value indicating how many seconds you
should wait before retrying.
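A sketch of honoring Retry-After in a retry loop; `send_request` here is a stand-in for whatever HTTP call your client makes, returning a `(status, headers, body)` tuple:

```python
import time

def with_retries(send_request, max_attempts=3):
    """Call send_request() until it succeeds or attempts run out,
    sleeping for the Retry-After interval after each 429."""
    for attempt in range(max_attempts):
        status, headers, body = send_request()
        if status != 429:
            return status, body
        wait = int(headers.get("Retry-After", 1))
        if attempt < max_attempts - 1:
            time.sleep(wait)
    return status, body

# Stub that is rate-limited once, then succeeds:
responses = iter([
    (429, {"Retry-After": "0"}, None),
    (200, {}, {"ok": True}),
])
print(with_retries(lambda: next(responses)))  # (200, {'ok': True})
```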
Frequently Asked Questions
What models are available?
We provide unified access to 40 models, including 16 frontier and 24 open & specialized models. See the complete catalog for exact model IDs.
Is this OpenAI-compatible?
Yes. Lightweight is designed as a drop-in replacement. Any tool, library, or codebase that can talk to OpenAI can talk to Lightweight simply by changing the Base URL and API Key.
How do I check my usage?
You can use the Dashboard UI, or query
the GET /v1/usage endpoint programmatically.
What happens when I hit my limit?
You will receive a 403 Forbidden HTTP status code with
a clear message indicating your quota has been exhausted.
Can I use streaming?
Yes. Add "stream": true to your
/chat/completions payload. Responses will be streamed
using Server-Sent Events (SSE) following the exact OpenAI format.
How do I get an invite?
Lightweight is currently in an invite-only beta. Reach out in our Discord or community channels to request access.