AI Infra Engineer · LLM/Agentic Systems · Cost Optimization · AKS/Terraform · NVIDIA + OSS AI
I build efficient AI infrastructure: optimized GPU clusters, fast LLM serving (vLLM, Triton, SGLang), agentic workflows (LangGraph/CrewAI), and cost-aware pipelines.
My 14-day AI infra portfolio showcases:
- GPU cost savings (Spot + On-Demand mix, autoscaling, DCGM dashboards)
- LLM serving benchmarks: Triton vs vLLM vs TGI vs SGLang
- Quantization + speculative decoding
- Long-context efficiency (128k–1M tokens)
- RAG cost optimization
- Multi-agent orchestration cost tracing
- CI/CD for AI systems (GitHub Actions → AKS)
🌍 rohankataria.com
🔗 linkedin.com/in/imrohan
🤗 huggingface.co/thewise
📸 instagram.com/byrohankataria