The cluster, right now

Live numbers from the BharatCode serving VM: vLLM throughput, queue depth, model list, and GPU telemetry. Current counters come directly from the serving VM, while the graph tracks serving-slot capacity and queue depth over time.
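
The page does not say how these counters are collected. A minimal sketch, assuming they are scraped from the Prometheus-format text that vLLM serves at its `/metrics` endpoint: the gauge names `vllm:num_requests_running` and `vllm:num_requests_waiting` are vLLM's, while `SAMPLE`, `parse_metrics`, and `queue_snapshot` are hypothetical names used only for illustration, and the sample text is made up rather than captured from the serving VM.

```python
import re

# Illustrative Prometheus exposition text; NOT captured from the real VM.
SAMPLE = """\
# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="example-model"} 0.0
# TYPE vllm:num_requests_waiting gauge
vllm:num_requests_waiting{model_name="example-model"} 0.0
"""

# metric name, optional {labels}, then the sample value.
# Simplification: label values containing "}" are not handled.
LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)")


def parse_metrics(text):
    """Return {metric_name: value summed over all label sets}."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match is None:
            continue  # ignore lines this simple pattern cannot read
        name, _labels, value = match.groups()
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def queue_snapshot(text):
    """Pick out the two gauges behind 'running' and 'waiting' requests."""
    metrics = parse_metrics(text)
    return {
        "running": int(metrics.get("vllm:num_requests_running", 0)),
        "waiting": int(metrics.get("vllm:num_requests_waiting", 0)),
    }


print(queue_snapshot(SAMPLE))
```

In a live setup the text would come from an HTTP GET against the server's `/metrics` path instead of a string constant. Values are summed across label sets so that several served models collapse into one number per gauge.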

SERVING (LIVE)
1 GPU · bharatcode-a100
A100 VM: 1x NVIDIA A100 40GB
0% load, 31 °C
OPERATIONAL HISTORY (5-minute avg)
0% 5-minute averaged serving capacity
0% instantaneous GPU hardware load
[Chart: Serving capacity utilization, 0% to 100%, 28 May 04:25 to 29 May 04:25]
[Chart: Average queue depth, 28 May 04:25 to 29 May 04:25]
TOP TOKEN CONSUMERS
1. Shivani Raja: 13,60,59,651
2. pradeep reddy: 1,30,86,664
3. Ritvik Sharma: 89,18,583
4. Ishaan Kesarwani: 18,40,819
5. PANDHRINATH RAHUL: 17,08,775
6. Abhishek Keshri: 1,59,727
7. vaibhav tiwari: 1,894
8. Anonymous: 76
STEWARDSHIP SCORE
1. Shivani Raja: 1,120
2. pradeep reddy: 799
3. Ritvik Sharma: 771
4. Ishaan Kesarwani: 700
5. PANDHRINATH RAHUL: 685
6. Abhishek Keshri: 544
7. vaibhav tiwari: 336
8. Anonymous: 204
RIGHT NOW (29/5/2026, 4:25:16 am IST)
0 running requests
0 waiting requests
8751 ms avg latency
0 queue depth
0% capacity used
0% KV cache
0% GPU hardware load

Runtime counters

6,056 completed requests
43,54,13,097 total tokens
43,05,80,650 prompt tokens
48,32,447 generation tokens
1450 ms avg time to first token
9 ms avg output token interval
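
The counters above use Indian digit grouping (lakh and crore), which trips up parsers that expect groups of three. A short sketch of how they can be read and cross-checked: `parse_indian` is a hypothetical helper, and treating the output token interval as the gap between consecutive tokens of a single request is an assumption, not something the page states.

```python
def parse_indian(number_text):
    """Parse a number written with Indian grouping, e.g. '43,54,13,097'."""
    return int(number_text.replace(",", ""))


completed = parse_indian("6,056")
total = parse_indian("43,54,13,097")
prompt = parse_indian("43,05,80,650")
generation = parse_indian("48,32,447")

# Prompt and generation tokens should add up to the reported total.
assert prompt + generation == total

# Share of all tokens that were prompt (input) tokens.
prompt_share = prompt / total

# Average tokens handled per completed request.
avg_tokens_per_request = total / completed

# Assumption: a 9 ms interval between output tokens of one request
# implies this many generated tokens per second for that request.
tokens_per_second = 1000 / 9

print(f"prompt share: {prompt_share:.1%}")
print(f"avg tokens per request: {avg_tokens_per_request:,.0f}")
print(f"decode rate per request: {tokens_per_second:.1f} tokens/s")
```

The split is heavily prompt-dominated: roughly 98.9% of all tokens are prompt tokens, which fits long-context workloads where each request carries a large input and produces a comparatively short output.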

Serving models

bharatcode:qwen36-35b-awq-200k · 2,00,000 context
bharatcode:qwen36-35b-q6-256k-vision · 2,00,000 context
bharatcode:qwen36-35b-q8-256k · 2,00,000 context

GPU

36.4 GB / 40 GB VRAM, 31 °C, 50.5 W.