Overview
Adding Your First Node
Desktop (Windows/Mac/Linux)
- Download RemoteXPU from rxpu.nixosoft.com/download
- Install and launch; it auto-starts in the system tray
- Sign in with Google; your node appears in the XPU Nodes table automatically
Headless Server (Linux/macOS)
curl -fsSL https://rxpu.nixosoft.com/setup-node.sh | bash
Then enter your cluster API key. Get it from rxpu.nixosoft.com → Settings → API Keys.
Accessing a Remote Node's Dashboard
Once a node is connected to the VPN, click "Dashboard" on it in the table below. The dashboard is proxied through your local RXPU daemon, so the VPN must be connected.
CPU Servers (LiteLLM Routers)
Run curl -fsSL https://rxpu.nixosoft.com/setup-node.sh | bash on your CPU server.
| Server | IP:Port | CPU | RAM | LiteLLM | Health | XPU Nodes | Status | Last Seen |
|---|---|---|---|---|---|---|---|---|
GPU Model Breakdown
| GPU Model | Count | Total VRAM (GB) |
|---|---|---|
Online Nodes
Auto-refreshes every 10s.
XPU Nodes
| Node Name | Owner | GPU Model | VRAM | VPN IP | Mode | IP:Port | Status | Last Heartbeat | Remote Access | Actions |
|---|---|---|---|---|---|---|---|---|---|---|
Expanding a node row shows additional details: OS platform, driver version, CUDA version, registration date, mode, pool min VRAM, pool GPU filter, API key, node ID, and user ID.
Cluster Models
| Model | Size | Runs On | Compatible Devices | Speed / Quality | API Key | Enabled | Actions |
|---|---|---|---|---|---|---|---|
The Runs On column indicates the compute type: GPU, CPU, or NPU.
Node actions include Assign XPU Node to CPU Server and Set Pool Filters.
Add Cluster Model
Manually configure a model; it will auto-download to XPU nodes when Ollama starts. Existing entries can be changed with Edit Model.
Cross-Device Benchmark
What benchmarks tell you
Run the same model on different nodes to compare real-world inference speed. Measures time-to-first-token (TTFT) and tokens/second.
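To make the two metrics concrete, here is a minimal sketch of how TTFT and tokens/second fall out of a streamed response. The timestamps and token count below are made-up illustration values, not output from any real node:

```python
def benchmark_metrics(start, first_token_at, end, tokens_generated):
    """Compute time-to-first-token and decode throughput from timestamps (seconds)."""
    ttft = first_token_at - start
    # Throughput counts tokens produced after the first one arrives.
    tok_per_sec = (tokens_generated - 1) / (end - first_token_at)
    return ttft, tok_per_sec

# Example: first token after 0.4 s, 101 tokens total, finished at 10.4 s.
ttft, tps = benchmark_metrics(0.0, 0.4, 10.4, 101)
print(f"TTFT: {ttft:.2f}s, {tps:.1f} tok/s")  # TTFT: 0.40s, 10.0 tok/s
```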
Run Benchmark
Benchmark Results
| Node | Model | Compute | TTFT | Tok/sec | Total Time | Status | When | Actions |
|---|---|---|---|---|---|---|---|---|
Model Comparison
Remote Diagnostics
Nodes with Issues
No active alerts: all nodes are healthy!
Recent Diagnostic Logs
No diagnostic logs yet.
| Time | Node | Level | Message |
|---|---|---|---|
Send Message to Node
Node Map
Node Locations
| Node | Location | ISP | GPU | Status |
|---|---|---|---|---|
Add a Node
One script handles everything. Your token is embedded automatically; just copy and run.
Anchor Server (run first)
Public Linux server: WireGuard VPN hub plus compute node. All other nodes connect through it.
Compute Node
Any Linux/macOS server with GPU, CPU, or NPU. Connects to the Anchor and shares its resources.
curl -fsSL https://rxpu.nixosoft.com/setup-node.sh | bash
Desktop App
Windows, macOS, Linux - sign in with Google, everything is automatic.
Architecture
Registry (rxpu.nixosoft.com) - auth & accounting only
├──► Anchor Server - WireGuard hub + compute node; set up first; nodes behind NAT connect through here
├──► Node (GPU/CPU/NPU) - tunnels to Anchor, shares resources
└──► Desktop App - same, with GUI
Use Your Cluster
API Endpoint
API Keys
Manage keys and models in the Cluster Models section.
Code Snippets
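The router exposes an OpenAI-compatible API on port 4000, so any OpenAI client library works. Below is a minimal standard-library sketch; the host, API key, and model name are placeholders to replace with values from your CPU server table and API Keys page:

```python
import json
import urllib.request

def chat_request(host, api_key, model, prompt):
    """Build a chat-completion request for the OpenAI-compatible router on port 4000."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"http://{host}:4000/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Placeholder host, key, and model - substitute your own values.
req = chat_request("cpu-server.example.com", "sk-placeholder", "llama3", "Hello!")
# To actually send it (requires the VPN and a running router):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```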
How Load Balancing Works
Your cluster automatically distributes requests across all online GPU nodes.
- Multiple requests run in parallel, one per GPU
- LiteLLM routes to the least-busy node automatically
- If a node goes offline, requests route to the remaining nodes
- Add more GPU nodes (same Google account) to increase capacity
- Add more CPU servers to increase API throughput
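The parallel behavior can be exercised from the client side by issuing several requests at once. A sketch assuming the router runs at 127.0.0.1:4000 (replace with your CPU server's address; the model name is also a placeholder):

```python
import concurrent.futures
import json
import urllib.error
import urllib.request

ROUTER = "http://127.0.0.1:4000"  # placeholder: your CPU server's address

def send_prompt(prompt, model="llama3"):
    """POST one chat completion; return the reply text, or an error string."""
    body = json.dumps({"model": model,
                       "messages": [{"role": "user", "content": prompt}]}).encode()
    req = urllib.request.Request(f"{ROUTER}/v1/chat/completions", data=body,
                                 headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]
    except (urllib.error.URLError, OSError) as exc:
        return f"error: {exc}"

# Four in-flight requests; the router can place each on a different GPU node.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    replies = list(pool.map(send_prompt, ["Hello"] * 4))
```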
Add a CPU Server (LiteLLM Router)
A CPU server acts as the API gateway: it receives requests and routes them to your GPU nodes. You can have multiple CPU servers for redundancy.
Linux / macOS server:
curl -fsSL https://rxpu.nixosoft.com/setup-node.sh | bash
What it does:
- Installs LiteLLM router
- Connects to rxpu.nixosoft.com to discover your GPU nodes
- Starts an OpenAI-compatible API on port 4000
- Auto-updates its node list every restart
Multiple CPU servers:
Run the same command on any Linux server. Each becomes an independent API endpoint.
Useful for: different regions, redundancy, high availability.
Direct API test:
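One quick test is to ask the router which models it serves via the OpenAI-compatible /v1/models endpoint. A minimal sketch; the address is a placeholder, and depending on your LiteLLM configuration an API key may be required:

```python
import json
import urllib.error
import urllib.request

def list_models(base_url, api_key=None, timeout=5):
    """Return the router's model list, or None if it is unreachable."""
    req = urllib.request.Request(f"{base_url}/v1/models")
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.load(resp).get("data", [])
    except (urllib.error.URLError, OSError):
        return None

models = list_models("http://cpu-server.example.com:4000")  # placeholder host
print("router unreachable" if models is None else [m["id"] for m in models])
```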
GPU Node Utilization
This shows which GPUs are available to serve your requests. Refreshes every 10s.
Settings
The current configuration is loaded from environment variables. To modify it, update the .env file and restart the service.
How to Update
1. SSH into the server hosting the Docker container
2. Edit the .env file in the registry-service directory
3. Restart the container:
docker compose up -d shared-gpu-registry