Pricing

Inference-first pricing with clear paths for dedicated capacity and GPU Workspaces.

Transparent pricing for every deployment path

Inference first. Dedicated endpoints for enterprise traffic. GPU Workspaces when you need to build before serving.

Estimates use model pricing from this repository's catalog, with no synthetic cross-provider discount assumptions.

Model	Input tokens/request	Output tokens/request	Requests/month

Selected model usage: $5,025.00 (monthly estimate)

Dedicated endpoints

Coming soon

Reserved capacity pricing will be published at launch.

For proprietary-routed models, token rates are passed through at provider list price; Brightnode adds routing, latency, and observability layers.

Popular live models pulled from the repository model catalog.

Family	Model	Route	Context (tokens)	Input ($/1M tokens)	Output ($/1M tokens)	Regions	Status
Claude	Claude Sonnet 4	Proprietary models	200,000	$3.00	$15.00	Singapore, Sydney, Tokyo, Thailand, Malaysia, Jakarta, New Zealand, Seoul, Taiwan, Mumbai	Live
Claude	Claude Haiku 4.5	Proprietary models	200,000	$1.00	$5.00	Singapore, Jakarta, Malaysia, Thailand, Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand	Live
Llama	Llama 3.3 70B Instruct	Brightnode-hosted	131,072	$0.22	$0.50	Singapore	Live
Qwen	Qwen3 32B	Brightnode-hosted	131,072	$0.10	$1.20	Singapore	Live
DeepSeek	DeepSeek V3	Proprietary models	163,840	$0.60	$1.74	Jakarta, Singapore, Malaysia, Thailand, Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand	Live
Mistral	Mistral Nemo	Brightnode-hosted	131,072	$0.15	$0.15	Singapore	Live
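
A monthly estimate for one model can be sketched from the rates in the table above. This is an illustrative calculation, not Brightnode's calculator: it assumes the Input and Output columns are USD per 1M tokens, and the names `RATES` and `monthly_estimate` are hypothetical.

```python
from decimal import Decimal

# Rates copied from the table above as (input, output).
# Assumption: both are USD per 1M tokens.
RATES = {
    "Claude Sonnet 4": (Decimal("3.00"), Decimal("15.00")),
    "Claude Haiku 4.5": (Decimal("1.00"), Decimal("5.00")),
    "Llama 3.3 70B Instruct": (Decimal("0.22"), Decimal("0.50")),
    "Qwen3 32B": (Decimal("0.10"), Decimal("1.20")),
    "DeepSeek V3": (Decimal("0.60"), Decimal("1.74")),
    "Mistral Nemo": (Decimal("0.15"), Decimal("0.15")),
}

TOKENS_PER_RATE_UNIT = Decimal(1_000_000)


def monthly_estimate(model, input_tokens_per_request,
                     output_tokens_per_request, requests_per_month):
    """Return the estimated monthly token cost in USD, rounded to cents."""
    input_rate, output_rate = RATES[model]
    # Cost of a single request: tokens times rate, scaled from per-1M pricing.
    per_request = (
        input_tokens_per_request * input_rate
        + output_tokens_per_request * output_rate
    ) / TOKENS_PER_RATE_UNIT
    return (per_request * requests_per_month).quantize(Decimal("0.01"))


# 1,000 input and 500 output tokens per request, 100,000 requests per month.
print(monthly_estimate("Claude Sonnet 4", 1000, 500, 100_000))         # 1050.00
print(monthly_estimate("Llama 3.3 70B Instruct", 1000, 500, 100_000))  # 47.00
```

`Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding noise before the final rounding to cents.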

GPU Workspaces: a secondary product lane for fine-tuning and evaluation before deploying to inference.

GPU	VRAM	Price from	Region	Best for
T4	16GB	$0.50/hr	Singapore	ComfyUI, prototyping, lightweight model work
L4	24GB	$0.87/hr	Singapore	Embedding pipelines and medium-size inference tests
A100	80GB	$4.01/hr	Singapore	Fine-tuning, evaluation suites, 70B+ experimentation
H100	80GB	$14.29/hr	Singapore	Heavy training and high-throughput pre-production validation
B200	180GB	On request	Singapore	Frontier-scale workloads with reserved capacity

Enterprise-grade reserved capacity with regional deployment control.

A100 80GB · Singapore

Dedicated throughput for production chat and agents

Coming soon

H100 80GB · Singapore

High-throughput enterprise inference and heavy traffic

Coming soon

Pay per second. No hourly minimums. No commitments.

Network egress: free within APAC regions

Persistent storage: $0.044/GB/month
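
Per-second billing and storage charges can be approximated as follows. This is a rough sketch under stated assumptions: the "Price from" hourly rates are treated as the actual rates, per-second cost is prorated linearly from the hourly rate, and amounts are rounded to the nearest cent. The page does not specify how Brightnode rounds or invoices, and the function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# "Price from" hourly rates in USD, copied from the GPU table (Singapore).
# B200 is omitted because its price is listed as "On request".
GPU_HOURLY_USD = {
    "T4": Decimal("0.50"),
    "L4": Decimal("0.87"),
    "A100": Decimal("4.01"),
    "H100": Decimal("14.29"),
}
STORAGE_USD_PER_GB_MONTH = Decimal("0.044")
SECONDS_PER_HOUR = Decimal(3600)
CENT = Decimal("0.01")


def gpu_cost(gpu, seconds):
    """Cost of running one GPU for `seconds`, prorated from the hourly rate."""
    # Assumption: linear proration with no minimum charge.
    cost = GPU_HOURLY_USD[gpu] * seconds / SECONDS_PER_HOUR
    return cost.quantize(CENT, rounding=ROUND_HALF_UP)


def storage_cost(gigabytes, months=1):
    """Cost of persistent storage for `gigabytes` held for `months`."""
    cost = STORAGE_USD_PER_GB_MONTH * gigabytes * months
    return cost.quantize(CENT, rounding=ROUND_HALF_UP)


print(gpu_cost("A100", 45 * 60))  # 45 minutes on an A100: 3.01
print(gpu_cost("H100", 15 * 60))  # 15 minutes on an H100: 3.57
print(storage_cost(250))          # 250 GB for one month: 11.00
```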

Try It Risk-Free

No credit card required

$100 trial credit on signup

Deploy in 60 seconds

Pre-configured workloads ready to run

Delete anytime

One click to stop, pay only for what you use