APAC model routing snapshot
For the live public model inventory, use /models. This page explains APAC pricing and routing behavior.
| Model | Provider | Input (USD / 1M tokens) | Output (USD / 1M tokens) | Context (tokens) | APAC regions | Latency (SG / TYO / SYD) | Residency |
|---|---|---|---|---|---|---|---|
| Amazon Titan Embed Text v2 | Amazon | $0.02 | — | 8,192 | Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region |
| BGE Base English v1.5 | BAAI | $0.02 | $0.02 | 512 | Singapore | 40ms / 58ms / 51ms | in-region |
| BGE Large English v1.5 | BAAI | $0.02 | $0.02 | 512 | Singapore | 40ms / 58ms / 51ms | in-region |
| BGE-M3 (Multilingual) | BAAI | $0.02 | $0.02 | 8,192 | Singapore | 40ms / 58ms / 51ms | in-region |
| Multilingual E5 Large | Intfloat | $0.02 | $0.02 | 512 | Singapore | 40ms / 58ms / 51ms | in-region |
| Amazon Nova Micro | Amazon | $0.04 | $0.14 | 128,000 | Singapore, Sydney, Tokyo | 40ms / 58ms / 51ms | in-region |
| Gemma 3 4B Instruct | Google | $0.04 | $0.08 | 131,072 | Singapore | 40ms / 58ms / 51ms | in-region |
| Voxtral Mini 3B | Mistral | $0.04 | $0.04 | 131,072 | Singapore | 40ms / 58ms / 51ms | in-region |
| Nemotron Nano 9B v2 | NVIDIA | $0.04 | $0.16 | 131,072 | Singapore | 40ms / 58ms / 51ms | in-region |
| Google Gemma 3 4B | Google | $0.05 | $0.10 | 131,072 | Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region |
| Nemotron 3 Nano 30B | NVIDIA | $0.05 | $0.20 | 262,144 | Singapore | 40ms / 58ms / 51ms | in-region |
| Qwen3 Embedding 8B | Qwen | $0.05 | $0.05 | 32,768 | Singapore | 18ms / 29ms / 24ms | in-region |
| Voxtral Mini 3B | Mistral | $0.05 | $0.05 | 131,072 | Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region |
| Amazon Nova Lite | Amazon | $0.06 | $0.24 | 300,000 | Singapore, Sydney, Tokyo | 40ms / 58ms / 51ms | in-region |
| NVIDIA Nemotron Nano 30B | NVIDIA | $0.07 | $0.29 | 131,072 | Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region |
| NVIDIA Nemotron Nano 9B | NVIDIA | $0.07 | $0.28 | 131,072 | Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region |
| OpenAI GPT-OSS 20B | OpenAI | $0.07 | $0.31 | 131,072 | Jakarta, Singapore, Malaysia, Thailand, Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region |
| Amazon Titan Embed Image v1 | Amazon | $0.08 | — | 128 | Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region |
| Gemma 3 27B Instruct | Google | $0.08 | $0.45 | 131,072 | Singapore | 40ms / 58ms / 51ms | in-region |
| Gemma 3 27B Pretrained | Google | $0.08 | $0.45 | 131,072 | Singapore | 40ms / 58ms / 51ms | in-region |
Showing 1-20 of 98 models
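To show how the per-1M-token prices in the table translate into the cost of a single workload, here is a minimal sketch. The `PRICES` dict and the `estimate_cost_usd` helper are hypothetical names used for illustration, not part of any Brightnode SDK. The prices are copied from the snapshot table, and treating a missing output price as zero is an assumption.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the snapshot table.
# None marks a model with no listed output price (shown as a dash in the
# table); billing it as zero output cost is an assumption.
PRICES = {
    "Amazon Nova Micro": (Decimal("0.04"), Decimal("0.14")),
    "Amazon Nova Lite": (Decimal("0.06"), Decimal("0.24")),
    "Amazon Titan Embed Text v2": (Decimal("0.02"), None),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost_usd(model, input_tokens, output_tokens=0):
    """Estimate the USD cost of a workload from per-1M-token prices."""
    input_price, output_price = PRICES[model]
    cost = input_price * input_tokens / TOKENS_PER_PRICE_UNIT
    if output_price is not None:
        cost += output_price * output_tokens / TOKENS_PER_PRICE_UNIT
    return cost


# 2M input tokens and 500k output tokens on Amazon Nova Micro:
# 2 * 0.04 + 0.5 * 0.14 = 0.15 USD
print(estimate_cost_usd("Amazon Nova Micro", 2_000_000, 500_000))
```

`Decimal` is used instead of floats so that small per-token amounts add up without rounding drift. Live prices may differ from this snapshot.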
Models & regions
Proprietary and Brightnode-hosted models are available in APAC today, with additional regions and model families rolling out.
Proprietary models: Sydney
- Amazon Nova Lite
- Amazon Nova Micro
- Amazon Nova Pro
- Amazon Titan Embed Image v1
- Amazon Titan Embed Text v2
plus 51 more models in this region
Best for: production chat, agents, long context

Proprietary models: Singapore
- Amazon Nova 2 Lite
- Amazon Nova Lite
- Amazon Nova Micro
- Amazon Nova Pro
- Claude 3 Haiku
plus 29 more models in this region
Best for: Southeast Asia latency, data residency

Brightnode-hosted: Singapore
- BGE Base English v1.5
- BGE Large English v1.5
- BGE-M3 (Multilingual)
- Gemma 3 12B Instruct
- Gemma 3 27B Instruct
plus 32 more models in this region
Best for: cost-effective inference, full control, same API

Specify the model in your request; we route it to the right region and provider automatically.
Get started in minutes
OpenAI-compatible API. Swap the base URL and use your existing code.
Sign up at the console, create an API key, and add credits. No long-term contract.
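Console → API keys
# Python (OpenAI SDK)
from openai import OpenAI

# Point the standard OpenAI client at the Brightnode endpoint.
client = OpenAI(
    base_url="https://api.brightnode.cloud/v1",
    api_key="YOUR_BRIGHTNODE_API_KEY",
)

# The model name selects the model; region and provider routing is automatic.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello from APAC."}],
)
print(response.choices[0].message.content)
The same pattern works for Node, curl, or any OpenAI-compatible client. Streaming and embeddings are supported.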
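For clients that do not use an SDK, the same call can be sketched at the HTTP level. The example below only builds the request, so it can be inspected without a network connection. The helper names `build_chat_request` and `send` are hypothetical. The `/chat/completions` path and the `Bearer` authorization header follow the OpenAI API convention that the OpenAI SDK itself uses against a custom base URL. That the service accepts raw requests in exactly this form is an assumption to check against the API reference.

```python
import json
import urllib.request

# Base URL taken from the quick-start; the path below follows the
# OpenAI-compatible convention (an assumption for this service).
BASE_URL = "https://api.brightnode.cloud/v1"


def build_chat_request(api_key, model, messages, stream=False):
    """Build, but do not send, a POST request for a chat completion."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send a non-streaming request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_chat_request(
    api_key="YOUR_BRIGHTNODE_API_KEY",
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello from APAC."}],
)
print(request.get_method(), request.full_url)
```

Calling `send(request)` with a valid key would perform the actual call. With `stream=True` the response arrives incrementally and would need to be read line by line rather than decoded as one JSON body.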
Full API reference, model list, and rate limits are in our docs. For agent frameworks, just change the base URL.
Documentation