The cluster, right now

Live numbers from the BharatCode serving VM: vLLM throughput, queue depth, model list, and GPU telemetry. The counters are read directly from the VM, while the graphs track serving-slot capacity and queue depth over time.

SERVING LIVE · 1 GPU · bharatcode-a100
A100 VM: 1x NVIDIA A100 40GB
0% load, 31C
OPERATIONAL HISTORY (1-hour avg)
0% 1-hour averaged serving capacity
0% instantaneous GPU hardware load
[Chart: Serving capacity utilization, 0% to 100%, from 28 May, 01:30 to 30 May, 02:30]
[Chart: Average queue depth, 0 to 1, from 28 May, 01:30 to 30 May, 02:30]
TOP TOKEN CONSUMERS
1. Shivani Raja: 14,99,51,761
2. pradeep reddy: 1,30,86,664
3. Ritvik Sharma: 89,54,943
4. Ishaan Kesarwani: 34,44,950
5. PANDHRINATH RAHUL: 20,02,270
6. Abhishek Keshri: 1,59,727
7. vaibhav tiwari: 1,894
8. Anonymous: 76
STEWARDSHIP SCORE
1. Shivani Raja: 1,157
2. pradeep reddy: 799
3. Ritvik Sharma: 773
4. Ishaan Kesarwani: 753
5. PANDHRINATH RAHUL: 703
6. Abhishek Keshri: 544
7. vaibhav tiwari: 336
8. Anonymous: 204
RIGHT NOW (30/5/2026, 2:53:46 am IST)
0 running requests
0 waiting requests
8746 ms avg latency
0 queue depth
0% capacity used
0% KV cache
0% GPU hardware load

Runtime counters

7,096 completed requests
50,15,86,780 total tokens
49,58,23,273 prompt tokens
57,63,507 generation tokens
1353 ms avg time to first token
9 ms avg output token interval
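
The counters above use Indian digit grouping (lakh and crore), so they need a small parsing step before any arithmetic. The following is a minimal sketch, not the dashboard's own code: `parse_indian` and every variable name are hypothetical, and the per-request averages assume that the token counters and the completed-request counter cover the same window, which the page does not state.

```python
def parse_indian(text: str) -> int:
    """Parse an integer written with Indian digit grouping, e.g. '50,15,86,780'."""
    return int(text.replace(",", ""))


# Values copied from the runtime counters above.
completed_requests = parse_indian("7,096")
total_tokens = parse_indian("50,15,86,780")
prompt_tokens = parse_indian("49,58,23,273")
generation_tokens = parse_indian("57,63,507")

# Sanity check: prompt and generation tokens should add up to the total.
counters_consistent = prompt_tokens + generation_tokens == total_tokens

# Share of all tokens that were prompt (input) tokens.
prompt_share = prompt_tokens / total_tokens

# Per-request averages (assumes all counters cover the same set of requests).
avg_prompt_per_request = prompt_tokens / completed_requests
avg_generation_per_request = generation_tokens / completed_requests

# A 9 ms average interval between output tokens implies this
# per-request decode rate in tokens per second.
output_token_interval_ms = 9
decode_tokens_per_second = 1000 / output_token_interval_ms

print(f"counters consistent: {counters_consistent}")
print(f"prompt share: {prompt_share:.1%}")
print(f"avg prompt tokens/request: {avg_prompt_per_request:,.0f}")
print(f"avg generation tokens/request: {avg_generation_per_request:,.0f}")
print(f"decode rate: {decode_tokens_per_second:.1f} tokens/s")
```

On these figures the prompt and generation counters sum exactly to the total, and prompt tokens make up nearly all of the traffic, which is what one would expect from long-context coding requests with comparatively short completions.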

Serving models

bharatcode:qwen36-35b-awq-200k: 2,00,000 context
bharatcode:qwen36-35b-q6-256k-vision: 2,00,000 context
bharatcode:qwen36-35b-q8-256k: 2,00,000 context

GPU

36.4 GB / 40 GB VRAM, 31C, 51.4 W.
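
As a quick worked example, the VRAM reading can be turned into a utilization figure and a headroom figure. This is a minimal sketch with hypothetical variable names; it uses only the two memory values shown above.

```python
# Values copied from the GPU line above.
vram_used_gb = 36.4
vram_total_gb = 40.0

# Fraction of VRAM in use and the memory still free.
vram_utilization = vram_used_gb / vram_total_gb
vram_headroom_gb = vram_total_gb - vram_used_gb

print(f"VRAM utilization: {vram_utilization:.0%}")
print(f"VRAM headroom: {vram_headroom_gb:.1f} GB")
```

VRAM is mostly occupied even though GPU load is 0%. That would be consistent with model weights staying resident in memory while the server is idle, although the page itself does not say what the memory holds.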