I'm an AI/ML Engineer focused on building LLM-powered systems that work in the real world — not benchmarks, not demos. My work sits at the intersection of language models, data infrastructure, and production-grade deployment.
Right now I'm focused on RAG pipelines, LLM fine-tuning, and local model serving, building applications that put model capability to practical use. I care about systems that are fast, explainable, and solve problems worth solving.
The best AI system is the one running in production. Everything else is a prototype.
A local-first application that lets you query any database using natural language — no SQL required.
What makes it different:
- 🔍 Automatic DB detection & connection: point it at a connection string and it maps your schema on its own
- 🧠 Dual-mode inference: runs with a local LLM (Ollama / custom endpoint) or the Claude API, swappable at runtime
- 🔌 Custom endpoint support: bring your own model server; any OpenAI-compatible API works
- 🧩 Schema-aware prompting: dynamically injects table structure into context for accurate query generation
- ⚡ Lightweight, self-hostable: no cloud dependency required; runs fully offline when paired with a local model
Stack: Python · FastAPI · LangChain · Ollama · Claude API · SQLAlchemy
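To make the schema-aware prompting idea concrete, here is a minimal sketch. It is not the project's actual code: it uses the standard-library `sqlite3` module as a stand-in for the SQLAlchemy layer, and the helper names `describe_schema` and `build_prompt`, the prompt wording, and the sample tables are all hypothetical.

```python
import sqlite3


def describe_schema(conn):
    """Return one 'table(col TYPE, ...)' line per user table (hypothetical helper)."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    lines = []
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        col_text = ", ".join(f"{col[1]} {col[2]}" for col in cols)
        lines.append(f"{table}({col_text})")
    return "\n".join(lines)


def build_prompt(conn, question):
    """Inject the live table structure ahead of the user's question."""
    return (
        "You translate questions into SQL for this schema:\n"
        f"{describe_schema(conn)}\n\n"
        f"Question: {question}\nSQL:"
    )


# Illustrative in-memory database; table names are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
)
prompt = build_prompt(conn, "Which user spent the most?")
print(prompt)
```

The resulting prompt string can be sent to whichever backend is active, since it is plain text and does not depend on the model server.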
Building retrieval-augmented generation pipelines that go beyond naive chunk-and-retrieve:
- Hybrid search (dense + sparse), reranking, metadata filtering
- Chunking strategies tuned per document type
- Evaluation pipelines for faithfulness, relevance, and groundedness
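One common way to combine dense and sparse results is reciprocal rank fusion (RRF). The sketch below assumes that technique for illustration only; the pipelines described above may fuse results differently, and the document IDs and the `k=60` constant are placeholder choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs; each hit adds 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a dense retriever and a sparse (keyword) retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that appear in both lists rise to the top, which is the behaviour a reranker or metadata filter would then refine.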
Fine-tuning open-source models for domain-specific tasks:
- LoRA / QLoRA on task-specific datasets
- Instruction tuning with custom data pipelines
- Benchmarking fine-tuned vs. prompted base models
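As a rough illustration of why LoRA is cheap to train: for a weight matrix of shape `d_out x d_in`, a rank-`r` adapter adds two small matrices, `A` (`r x d_in`) and `B` (`d_out x r`). The sketch below only does that arithmetic; the 4096 x 4096 layer and rank 8 are example numbers, not settings taken from these projects.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# Example numbers only: one square projection layer and a small rank.
d_in, d_out, rank = 4096, 4096, 8

full_params = d_in * d_out
lora_params = lora_trainable_params(d_in, d_out, rank)
fraction = lora_params / full_params

print(f"full: {full_params:,}  lora: {lora_params:,}  fraction: {fraction:.4%}")
```

QLoRA keeps the same adapter shapes and additionally quantizes the frozen base weights, so the trainable count computed here is unchanged.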
Two-stage emotion classification system — coarse-to-fine label hierarchy for better generalization and interpretability. Deployed as a FastAPI inference endpoint.
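The coarse-to-fine routing can be sketched as below. This is a hedged illustration, not the deployed system: the keyword-based functions stand in for fine-tuned models, and the group and label names are invented for the example.

```python
from typing import Callable, Dict, Tuple

Classifier = Callable[[str], str]


def classify_two_stage(
    text: str, coarse: Classifier, fine_by_group: Dict[str, Classifier]
) -> Tuple[str, str]:
    """Stage 1 picks a coarse group; stage 2 picks a fine label within it."""
    group = coarse(text)
    fine_label = fine_by_group[group](text)
    return group, fine_label


# Toy stand-ins for trained classifiers (hypothetical labels and rules).
def toy_coarse(text: str) -> str:
    lowered = text.lower()
    return "positive" if ("love" in lowered or "great" in lowered) else "negative"


def toy_fine_positive(text: str) -> str:
    return "love" if "love" in text.lower() else "joy"


def toy_fine_negative(text: str) -> str:
    return "anger" if "hate" in text.lower() else "sadness"


fine_models = {"positive": toy_fine_positive, "negative": toy_fine_negative}

print(classify_two_stage("I love this", toy_coarse, fine_models))
print(classify_two_stage("I hate waiting", toy_coarse, fine_models))
```

Returning both the coarse group and the fine label keeps the intermediate decision visible, which is where the interpretability benefit comes from.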
| Project | Description |
|---|---|
| 130-Emotion NLP Classifier | Fine-tuned transformer on 130 fine-grained emotion labels with label balancing |
| Missing Person Detection | Face-embedding search engine with cosine similarity + vector search pipeline |
| Attendance on Autopilot | Automated face-recognition attendance system backed by FastAPI |
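The face-embedding search in the Missing Person Detection project relies on cosine similarity. A minimal brute-force sketch is shown below, assuming tiny 3-dimensional vectors and made-up names; a real pipeline would use embeddings from a face model and a vector index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query, gallery, k=1):
    """Return the k gallery entries most similar to the query embedding."""
    scored = [(cosine_similarity(query, emb), name) for name, emb in gallery.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:k]]


# Hypothetical gallery of stored embeddings.
gallery = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.0, 1.0, 0.0],
    "person_c": [0.7, 0.7, 0.0],
}

best = top_matches([0.9, 0.1, 0.0], gallery, k=2)
print(best)
```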
LLM & AI
LangChain · LlamaIndex · Ollama · Claude API · Hugging Face Transformers · PEFT / LoRA · vLLM
ML / Deep Learning
PyTorch · TensorFlow · scikit-learn · sentence-transformers · spaCy · OpenCV
Backend & Infra
FastAPI · Docker · PostgreSQL · SQLite · SQLAlchemy · REST APIs
Cloud & DevOps
AWS (EC2, S3) · Render · Railway · GitHub Actions
Languages
Python · SQL
Email: thevinayakajith@gmail.com
LinkedIn: vinayak-ajith-208993266