Your idle GPUs.
Our inference workloads.
BrightNode brings the developers. You provide the compute. Revenue flows when GPUs are utilised, with zero commitment required to start.
Representative economics
Example at Tier 1: one 8x A100 80GB cluster at 60% utilisation can generate roughly USD $18,000–$30,000/month before power, space, and hardware costs. Actual realised rates vary by workload mix, region, and latency profile.
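The arithmetic behind an example like this can be sketched in a few lines; the per-GPU-hour rate is left as an input, since realised pricing varies by workload mix, region, and latency profile:

```python
# Back-of-envelope utilisation arithmetic for a partner cluster.
# The rate passed in is a placeholder, not quoted pricing.

def monthly_gpu_hours(gpus: int, utilisation: float, hours_in_month: float = 730) -> float:
    """Billable GPU-hours for a cluster at a given average utilisation."""
    return gpus * hours_in_month * utilisation

def monthly_revenue(gpus: int, utilisation: float, rate_per_gpu_hour: float) -> float:
    """Gross revenue before power, space, and hardware costs."""
    return monthly_gpu_hours(gpus, utilisation) * rate_per_gpu_hour

# An 8x A100 cluster at 60% utilisation serves roughly 3,504 GPU-hours/month;
# multiply by the realised rate to model gross revenue.
```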
5 priority markets
APAC regions
Zero
Commitment to start
Tier 1 option
Revenue share model
~8 weeks
Time to production
APAC inference demand is accelerating fast
Enterprise and developer adoption of LLMs, embedding models, and multimodal AI is creating sustained demand for GPU compute, particularly NVIDIA A100, H100, L4, and T4 accelerators.
Much of the GPU capacity deployed in the region sits underutilised outside peak training windows. BrightNode channels real-time inference workloads to partner infrastructure, monetising capacity that would otherwise generate zero return.
BrightNode brings the workloads
We manage all customer relationships, billing, and metering.
You provide the GPUs
Power, cooling, and hardware: that's all we need from you.
Revenue flows on utilisation
No upfront cost to partners. You earn when your GPUs serve inference.
Who is BrightNode
Operator-led, APAC focused
BrightNode is a Singapore-based AI infrastructure company founded by a repeat Southeast Asia technology operator who previously built and sold Aquient Pte Ltd. We run inference infrastructure on Google Cloud today and are expanding our APAC backbone through data centre partnerships.
Current traction
Live workloads, not a slide deck
BrightNode already serves production inference traffic from Singapore across chat, embeddings, transcription, and multimodal use cases. Our platform includes a 100+ model catalog and benchmarked live serving paths, giving partners immediate workload categories to absorb.
Start on-demand, scale with confidence
Designed for minimal risk on both sides. Most partnerships begin at Tier 1 and transition to Tier 2 within 3–6 months based on utilisation data.
Indicative rates shown for modeling. Final pricing reflects market conditions, geography, latency requirements, and workload profile.
Tier 1
On-Demand
Start immediately with zero commitment. BrightNode sends workloads when your GPUs are available.
Pricing
Indicative Tier 1: A100 80GB at USD $1.80–$2.60/GPU-hr; H100 80GB at USD $3.20–$4.90/GPU-hr
Commitment
None. Month-to-month, pause any time.
Tier 2
Reserved Capacity
BrightNode commits to a minimum monthly GPU-hour volume. You guarantee capacity availability.
Pricing
15–25% below Tier 1 effective rates (monthly commit)
Commitment
3–6 month terms, 500–2,000 GPU-hrs/month minimum
Tier 3
Dedicated Allocation
Specific GPU nodes allocated exclusively to BrightNode workloads: always-on, with the highest predictability.
Pricing
30–45% below Tier 1 effective rates (fixed dedicated capacity)
Commitment
6–12 month terms, fixed node count
Built for both sides to win
Monetise idle GPUs
Turn unused GPU capacity into a revenue stream with zero sales effort. BrightNode brings the customers and workloads.
Zero commitment to start
Begin with on-demand access at Tier 1. No minimum volumes, no contracts. Scale only when the data justifies it.
Predictable demand growth
AI inference is a sustained, growing workload, not a one-off burst. As our developer base compounds, so does your utilisation.
No operational overhead
BrightNode manages all software, model deployment, scaling, monitoring, and billing. You provide power, cooling, and hardware.
APAC-first positioning
Ride the wave of AI adoption in Southeast Asia, India, Japan, and Australia. BrightNode is building the inference backbone for the region.
Transparent metering
Full visibility into GPU utilisation, workload types, and revenue via a partner dashboard. Usage events are signed and retained in immutable audit logs for partner reconciliation.
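A signed usage event might look like the following sketch, assuming an HMAC-SHA256 scheme over a canonical JSON payload; the field names and shared-secret approach are illustrative, not BrightNode's actual metering format:

```python
# Sketch of signing and verifying a GPU usage event for partner
# reconciliation. Event fields and the HMAC scheme are assumptions
# for illustration only.
import hashlib
import hmac
import json

def _canonical(event: dict) -> bytes:
    """Serialise the event deterministically so both sides hash identical bytes."""
    return json.dumps(event, sort_keys=True, separators=(",", ":")).encode()

def sign_event(event: dict, secret: bytes) -> dict:
    """Return a copy of the event with an HMAC-SHA256 signature attached."""
    sig = hmac.new(secret, _canonical(event), hashlib.sha256).hexdigest()
    return {**event, "signature": sig}

def verify_event(signed: dict, secret: bytes) -> bool:
    """Recompute the signature over the event body and compare in constant time."""
    body = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(secret, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])
```

Because verification only needs the shared secret and the event body, a partner can independently re-check every entry in the audit log against their own dashboard figures.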
Minimal friction, maximum compatibility
Our platform is designed to integrate with partner infrastructure quickly. The core requirements below cover hardware, connectivity, and operations.
Supported GPU hardware
| GPU | Primary use case | Min VRAM | Priority |
|---|---|---|---|
| NVIDIA H100 | Large language models, high-throughput inference | 80 GB HBM3 | High |
| NVIDIA A100 (80GB) | Large language models (70B+), training runs | 80 GB HBM2e | High |
| NVIDIA L4 | Mid-size models (7–32B), embeddings, Whisper | 24 GB GDDR6 | Medium |
| NVIDIA T4 | Embeddings, small models, audio transcription | 16 GB GDDR6 | Medium |
Connectivity
- → Secure network link to BrightNode control plane (VPN, private interconnect, or direct peering)
- → Kubernetes-compatible orchestration (we can deploy our own K8s or integrate with yours)
- → Low-latency path to APAC end users (Singapore, Mumbai, Tokyo, Sydney preferred)
- → NVMe or high-speed SSD for model weight caching (2 GB – 150 GB per model)
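The weight-caching requirement above can be illustrated with a toy cache; the least-recently-used eviction policy and the model sizes in the test are assumptions, not the platform's actual caching strategy:

```python
# Toy model-weight cache for partner NVMe storage, assuming simple
# least-recently-used (LRU) eviction. Capacities and policy are
# illustrative only.
from collections import OrderedDict

class WeightCache:
    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.models: OrderedDict[str, float] = OrderedDict()  # name -> size in GB

    def used_gb(self) -> float:
        return sum(self.models.values())

    def fetch(self, name: str, size_gb: float) -> None:
        """Mark a model as used, evicting the coldest weights until it fits."""
        if name in self.models:
            self.models.move_to_end(name)  # refresh recency, nothing to download
            return
        while self.models and self.used_gb() + size_gb > self.capacity_gb:
            self.models.popitem(last=False)  # evict least-recently-used model
        self.models[name] = size_gb
```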
Operational requirements
- → 24/7 availability for Tier 2 and Tier 3 commitments (Tier 1 can be interruptible)
- → 4-hour hardware fault response SLA target for Tier 3
- → Power and cooling to spec (6–10 kW per node for A100/H100)
- → Physical security and access controls consistent with enterprise data centre standards
What runs on your hardware
BrightNode routes these workload categories to partner infrastructure based on GPU type and availability.
LLM Inference
Primary
Real-time chat completions and text generation using open-weight models (Llama, Qwen, Mistral, DeepSeek). Served via vLLM. Highest-volume category.
Embedding Generation
Vector embedding computation for RAG pipelines and semantic search. Lower GPU intensity, very high request volume.
Audio Transcription
Speech-to-text using Faster Whisper models. Moderate GPU requirement, bursty demand pattern.
Training & Fine-Tuning
Coming
Longer-running, higher-value workloads as the partnership matures. Benefits from Tier 3 dedicated capacity.
Priority regions for 2026 capacity
If you operate in one of these metros, you are in our active partner pipeline for current-year deployment.
Singapore
Expanding existing capacity
Tokyo
Highest priority new region
Sydney
Active demand coverage
Mumbai
Low-latency India route
Jakarta
Southeast Asia growth
From conversation to live traffic
Designed to be fast and low-friction. Most partners are live with production inference traffic within eight weeks.
Initial conversation
Week 1
Align on GPU types, capacity, and location. Agree on Tier 1 on-demand terms.
Technical assessment
Weeks 2–3
Validate network connectivity, GPU readiness, and storage. BrightNode engineering works with your ops team.
Pilot deployment
Week 4
BrightNode deploys a lightweight Kubernetes-based orchestration layer (vLLM serving, model weight caching, health monitoring) and routes initial test workloads to your GPUs. We manage the software lifecycle; you manage hardware and connectivity.
Production ramp
Weeks 5–8
Begin routing live inference traffic. Monitor utilisation, performance, and revenue together.
Commitment review
Month 3
Evaluate utilisation data and discuss Tier 2 transition if volumes warrant it.
Have GPU capacity?
Let's put it to work.
Whether you have a handful of idle A100s or an entire GPU cluster, BrightNode can route workloads to your infrastructure. Start with zero commitment at Tier 1.
Technical deep-dives available on request
