kernel_crush HarshTomar1234

About

AI/ML Engineer passionate about building end-to-end AI systems. I believe in understanding AI by implementing it from scratch, from Vision Transformers to multi-agent architectures.

Focus: Computer Vision • Research Implementations • GenAI & Agents • MLOps

Tech Stack

Programming Languages

ML/AI Frameworks

Computer Vision

Supervision YOLOv5-v8 Object Detection Image Segmentation DeepSORT ByteTrack Optical Flow

GenAI & LLM

LangGraph AG2 (AutoGen) Agno RAG DeepSeek Prompt Engineering OpenAI API

Data Science

Seaborn Scikit-learn Statistical Analysis Feature Engineering A/B Testing

Development Tools

Streamlit FastAPI Flask Gradio

Cloud & Deployment

EC2 S3 Lambda CI/CD MLflow DVC

Databases

Pinecone Chroma FAISS Vector Databases

Featured Projects

Tennis Vision ★ 23

Real-time tennis match analysis system with advanced computer vision

YOLOv8 custom-trained on 1000+ annotated frames
ByteTrack for robust multi-object tracking
CNN-based court homography detection
Shot classification: 85%+ accuracy
Player detection: 95% | Ball tracking: 88%
Mini-court tactical visualization
Real-time speed & distance analytics

Python YOLOv8 OpenCV PyTorch ByteTrack
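
The speed and distance analytics listed above can be illustrated with a minimal sketch. It assumes the tracked positions have already been mapped to a top-view court, so one pixel covers the same distance everywhere, and it uses the standard doubles court width (10.97 m) as the reference length. The function name, frame rate, and pixel width are hypothetical and not taken from the project.

```python
import math

DOUBLES_COURT_WIDTH_M = 10.97  # standard doubles court width in meters


def speed_kmh(p_start, p_end, frames_elapsed, fps, court_width_px,
              court_width_m=DOUBLES_COURT_WIDTH_M):
    """Estimate average speed between two top-view pixel positions."""
    # Scale factor derived from a known real-world length.
    meters_per_pixel = court_width_m / court_width_px
    distance_m = math.dist(p_start, p_end) * meters_per_pixel
    seconds = frames_elapsed / fps
    # m/s to km/h
    return distance_m / seconds * 3.6


if __name__ == "__main__":
    # Hypothetical values: court spans 1097 px, video runs at 24 fps.
    print(speed_kmh((0, 0), (300, 400), frames_elapsed=24, fps=24,
                    court_width_px=1097))
```

Positions taken directly from the broadcast camera view would first need the court homography applied, since perspective makes the pixel scale vary across the frame.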

Field Fusion

Computer vision analysis pipeline for football (soccer)

YOLOv8 + DeepSORT tracking pipeline
K-means team identification using HSV color
Optical flow camera compensation
Tactical heatmap generation
Player possession & movement analysis
Perspective transformation for top-view
Ball control & pass detection

Python YOLO DeepSORT OpenCV Supervision
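
As an illustration of team identification by jersey color, here is a minimal sketch of two-cluster k-means over HSV values. It assumes one representative RGB color per player crop has already been extracted; the function names and sample colors are hypothetical, and the project itself may rely on a library implementation instead of this hand-written loop.

```python
import colorsys

import numpy as np


def rgb_to_hsv_rows(rgb):
    """Convert an (N, 3) array of RGB values in [0, 255] to HSV in [0, 1]."""
    return np.array([colorsys.rgb_to_hsv(*(px / 255.0)) for px in rgb])


def kmeans_two_teams(features, n_iter=20):
    """Plain 2-cluster k-means; returns a 0/1 label per row."""
    # Deterministic init: the first sample, then the sample farthest from it.
    first = features[0]
    far = features[np.argmax(np.linalg.norm(features - first, axis=1))]
    centroids = np.stack([first, far])
    for _ in range(n_iter):
        # Distance of every sample to both centroids, shape (N, 2).
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = features[labels == k].mean(axis=0)
    return labels


if __name__ == "__main__":
    # Hypothetical jersey colors: yellow and blue kits, interleaved.
    jerseys = np.array([[230, 220, 40], [30, 40, 200], [240, 210, 50],
                        [25, 50, 210], [220, 225, 35]], dtype=float)
    print(kmeans_two_teams(rgb_to_hsv_rows(jerseys)))
```

Hue is circular, so red kits sit at both ends of the hue range; a real pipeline would need to handle that wrap-around, which this sketch does not.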

Histopathology Analysis

Deep learning medical imaging system for cancer detection

Binary classification: benign/malignant
Grad-CAM heatmaps for model explainability
OpenCV feature extraction pipeline
Auto-generated PDF diagnostic reports
Transfer learning with pretrained CNNs
Flask web interface for predictions

TensorFlow OpenCV Flask Grad-CAM Keras
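
The Grad-CAM heatmaps mentioned above come down to a small combination step, sketched here in NumPy. The sketch assumes the feature maps of a convolutional layer and the gradients of the class score with respect to them are already available as arrays; in practice they would be obtained from the deep learning framework. The function name is hypothetical.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Combine conv feature maps into a Grad-CAM heatmap.

    activations, gradients: arrays of shape (H, W, K), where gradients holds
    d(class score) / d(activations) for the same layer.
    """
    # Channel weights: global-average-pool the gradients over H and W.
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    heatmap = grad_cam(rng.random((7, 7, 32)), rng.standard_normal((7, 7, 32)))
    print(heatmap.shape)
```

The resulting low-resolution map is usually resized to the input image size and overlaid as a color map.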

MolecuQuest

AI-powered molecular research & drug discovery platform

NVIDIA MolMIM for molecule generation
RDKit 3D molecular visualization
PubChem API integration for data
CMA-ES optimization + QED scoring
Interactive molecular property explorer
Firebase backend for user sessions

TypeScript Next.js Firebase RDKit NVIDIA NIMs

QuantaAI

Conversational AI with real-time web search capabilities

LangGraph agentic architecture
Real-time web search integration
GPT-4 powered intelligent responses
Live search visualization UI
Multi-turn conversation memory
FastAPI backend + Next.js frontend

LangGraph Next.js FastAPI GPT-4 OpenAI API

DeepGuard

Production-ready Deepfake Detection MLOps Pipeline

EfficientNet-based image classifier
DVC pipeline for data versioning
MLflow experiment tracking on DagsHub
Automated CI/CD with GitHub Actions
Docker containerized deployment
Deployed on Hugging Face Spaces

TensorFlow DVC MLflow Docker CI/CD

Research Implementations

Building architectures from scratch to truly understand them.

Paper	Implementation	Key Details
Vision Transformer	ViT	16×16 patch embedding, 12-layer encoder, multi-head attention
LoRA & QLoRA	PyTorch-LoRA	Low-rank adaptation, 4-bit quantization, 83% VRAM reduction
Vision-Language Models	VLMverse	PaliGemma, SigLIP, cross-modal attention fusion
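
To make the 16×16 patch embedding from the Vision Transformer row concrete, here is a minimal NumPy sketch. The 224×224 input size, the embedding width of 64, and the function names are assumptions chosen for illustration and are not taken from the repository.

```python
import numpy as np


def patchify(image, patch_size=16):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (gh, gw, p, p, c)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def embed_patches(patches, projection):
    """Linear patch embedding: each flattened patch times a shared matrix."""
    return patches @ projection


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((224, 224, 3))          # assumed input size
    patches = patchify(image)                  # one row per 16x16 patch
    tokens = embed_patches(patches, rng.standard_normal((768, 64)))
    print(patches.shape, tokens.shape)
```

A full ViT would also prepend a class token and add position embeddings before the encoder layers; both are left out here.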

Multi-Agent Systems

Project	Description
AgentForge ★ 2	Multi-agent orchestration with CrewAI, LangGraph, PhiData
Travel Planner	4-agent system: Flight, Hotel, Itinerary, Budget
Google ADK	Production agent patterns with Google ADK

Learning Repos

Comprehensive implementations from fundamentals to SOTA:

ML & Deep Learning — Classical ML, CNNs, RNNs, Transformers, NLP
Computer Vision — RCNN, YOLO, U-Net, Mask R-CNN, tracking
PyTorch Deep Dive — Autograd, custom layers, DDP, AMP
RAG Systems — Vector DBs, chunking, retrieval, production patterns
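
As a small example of the chunking step in a RAG pipeline, the sketch below splits text into overlapping word windows. The function name and default sizes are hypothetical; real pipelines often chunk by tokens or sentences instead of whitespace-separated words.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_words(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some words twice.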

Let's Connect

Open to: AI/ML Engineering • Computer Vision • GenAI/LLMOps

"Building AI systems by understanding them from first principles"
