Server Benchmarking for vLLM, Luminal and SGLang Inference


I recently got the opportunity to work on the inference server benchmarking tool at Luminal (YC S25). It uses Prometheus and coroutine-based algorithms to collect accurate server metrics for every LLM: high-fidelity latency and throughput metrics (p50–p99.9), CPU/GPU utilization, and per-model comparisons across backends.
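As a rough illustration of the coroutine-based approach, here is a minimal sketch (not the actual Luminal tool) that fires concurrent requests with `asyncio` and reports tail latencies. The `fake_request` stand-in, the function names, and the nearest-rank percentile method are all assumptions made for this example; a real run would call the inference server over HTTP instead of sleeping.

```python
import asyncio
import math
import time


async def fake_request(delay_s: float) -> None:
    """Hypothetical stand-in for one inference call; replace with a real client."""
    await asyncio.sleep(delay_s)


async def timed(delay_s: float) -> float:
    """Run one request and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    await fake_request(delay_s)
    return time.perf_counter() - start


async def run_benchmark(delays: list[float], concurrency: int) -> list[float]:
    """Issue every request with at most `concurrency` in flight at once."""
    sem = asyncio.Semaphore(concurrency)

    async def guarded(delay_s: float) -> float:
        # The timer starts after the semaphore is acquired, so the
        # measured latency excludes client-side queueing time.
        async with sem:
            return await timed(delay_s)

    return await asyncio.gather(*(guarded(d) for d in delays))


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies = asyncio.run(run_benchmark([0.01] * 50, concurrency=10))
    for p in (50, 90, 99, 99.9):
        print(f"p{p}: {percentile(latencies, p) * 1000:.1f} ms")
```

With only a few dozen samples, p99 and p99.9 collapse to the slowest request, so meaningful tail numbers need thousands of requests per run.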

Trackvest

An agentic AI investment research tool. Track all your alternative investments and learn from KPIs to invest smarter.

A full-stack project (React + TSX) built with open-source models and custom scrapers.

Working on V2!

$30k grant from the 1517 Fund

DockSleek

DockSleek is a Go-powered tool that analyzes and rewrites Dockerfiles to reduce image size, speed up builds, and improve caching. It suggests multi-stage builds, collapses redundant layers, and strips out unnecessary artifacts, delivering leaner, faster, and more secure containers.
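To make the idea of collapsing redundant layers concrete, here is a small sketch of one such rewrite: merging consecutive shell-form `RUN` instructions into a single layer. It is written in Python for brevity and is not DockSleek's implementation (the tool itself is written in Go); the function name `merge_run_layers` is hypothetical, and the sketch deliberately leaves exec-form, multi-line, and heredoc `RUN` instructions untouched.

```python
def merge_run_layers(dockerfile: str) -> str:
    """Merge runs of consecutive single-line, shell-form RUN instructions."""
    merged: list[str] = []
    pending: list[str] = []  # commands collected from adjacent RUN lines

    def flush() -> None:
        # Emit the collected commands as one RUN layer.
        if pending:
            merged.append("RUN " + " && ".join(pending))
            pending.clear()

    for line in dockerfile.splitlines():
        stripped = line.strip()
        command = stripped[4:].strip()
        is_simple_run = (
            stripped.upper().startswith("RUN ")
            and not stripped.endswith("\\")          # skip multi-line RUN
            and not command.startswith(("[", "<<"))  # skip exec form, heredocs
        )
        if is_simple_run:
            pending.append(command)
        else:
            flush()
            merged.append(line)
    flush()
    return "\n".join(merged)


if __name__ == "__main__":
    example = "FROM alpine\nRUN apk update\nRUN apk add curl\nCOPY . /app\nRUN make"
    print(merge_run_layers(example))
```

Merging layers trades cache granularity for fewer layers: a change to any command in the merged `RUN` invalidates the whole step, so a real tool would need to weigh that before rewriting.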


This is becoming important as I get deeper into DevOps.


Currently a WIP


ContextQ

Fit your LLM to any task with one line! Reduces model size by up to 80% and cuts inference time by at least 2x (AWQ). A context-aware model library for task-specific LLM quantization.


pip install contextq

Chord Wizard

ChordWizard is a real-time chord recognition system using OpenCV and spectral analysis. It processes audio input through Librosa to generate mel spectrograms, then uses CNNs to detect and identify guitar chords. Users can create their own progressions, and there is also a built-in finger position visualizer.

Ayan Jhunjhunwala

Focused on full-stack development, distributed systems, and applied LLM research

Computer Science @ UC Irvine

Focused on full-stack development, applied LLMs, data-driven systems, and DevOps!

Skills

Python

C++

.NET

TypeScript

React/Angular

Node.js

Full Stack

SQL

PyTorch

Golang

Hugging Face

Terraform

Docker

Prometheus

Git

Apache Spark + Airflow

AWS Stack

Azure

ML/AI

Experience

AI Engineering Intern (Summer) + SWE Intern (Fall)

LBA

June 2025 - Present

Machine Learning Researcher

UCI Health

March 2025 - Present

Accelerate Fellow

IBM

May 2024 - July 2024