HarshTomar1234/README.md



LinkedIn • X • Portfolio


About

AI/ML Engineer passionate about building end-to-end AI systems. I believe in understanding AI by implementing from scratch — from Vision Transformers to multi-agent architectures.

Focus: Computer Vision • Research Implementations • GenAI & Agents • MLOps


Tech Stack

Programming Languages

ML/AI Frameworks

HuggingFace

Computer Vision

Roboflow • Supervision • YOLOv5-v8 • Object Detection • Image Segmentation • DeepSORT • ByteTrack • Optical Flow

GenAI & LLM

HuggingFace • LlamaIndex • CrewAI • OpenAI • Anthropic • LangGraph • AG2 (AutoGen) • Agno • RAG • DeepSeek • Prompt Engineering • OpenAI API

Data Science

NumPy • Pandas • Matplotlib • Seaborn • Scikit-learn • Statistical Analysis • Feature Engineering • A/B Testing

Development Tools

Jupyter • Streamlit • FastAPI • Flask • Gradio

Cloud & Deployment

EC2 • S3 • Lambda • CI/CD • MLflow • DVC

Databases

Pinecone • Chroma • FAISS • Vector Databases


Featured Projects

Real-time tennis match analysis system with advanced computer vision

  • YOLOv8 custom-trained on 1000+ annotated frames
  • ByteTrack for robust multi-object tracking
  • CNN-based court homography detection
  • Shot classification: 85%+ accuracy
  • Player detection: 95% | Ball tracking: 88%
  • Mini-court tactical visualization
  • Real-time speed & distance analytics

Python • YOLOv8 • OpenCV • PyTorch • ByteTrack

Demo
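
The speed and distance analytics listed for the tennis system can be illustrated with a small sketch. It is not the project's actual code. It assumes player or ball positions have already been mapped to a top-down (mini-court) view where the pixel scale is uniform, and that the doubles court width in pixels is known. The function names and the `court_width_px` parameter are hypothetical.

```python
import math

# Assumption: a standard doubles court is 10.97 m wide, and its width in
# pixels has been measured on the top-down view.
COURT_WIDTH_M = 10.97


def pixels_to_meters(pixel_dist, court_width_px, court_width_m=COURT_WIDTH_M):
    """Scale a pixel distance by the known real-world court width."""
    return pixel_dist * court_width_m / court_width_px


def speed_kmh(p_start, p_end, n_frames, fps, court_width_px):
    """Average speed between two (x, y) pixel positions n_frames apart."""
    pixel_dist = math.dist(p_start, p_end)   # straight-line pixel distance
    meters = pixels_to_meters(pixel_dist, court_width_px)
    seconds = n_frames / fps                 # elapsed time between the frames
    return meters / seconds * 3.6            # m/s -> km/h
```

For instance, a move from `(100, 200)` to `(400, 600)` over 12 frames at 24 fps, with a court that is 1097 px wide, covers 500 px, which is 5 m in 0.5 s, or about 36 km/h.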

Multi-sport CV analysis pipeline for football/soccer

  • YOLOv8 + DeepSORT tracking pipeline
  • K-means team identification using HSV color
  • Optical flow camera compensation
  • Tactical heatmap generation
  • Player possession & movement analysis
  • Perspective transformation for top-view
  • Ball control & pass detection

Python • YOLO • DeepSORT • OpenCV • Supervision

Demo
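
The K-means team identification step in the football pipeline can be sketched in plain Python. This is an illustration under assumptions, not the repository's implementation: each player is summarized by one average jersey color in RGB, there are exactly two teams, and hue is placed on a circle so that reds on either side of the hue wraparound stay close together. The helper names are hypothetical.

```python
import colorsys
import math


def hsv_feature(rgb):
    """Map an (R, G, B) color in 0-255 to a point where hue is circular."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    angle = 2 * math.pi * h
    return (s * math.cos(angle), s * math.sin(angle), v)


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def assign_teams(jersey_colors, iterations=10):
    """Two-cluster k-means over jersey colors; returns a 0/1 label per player."""
    feats = [hsv_feature(c) for c in jersey_colors]
    # Deterministic init: the first point, then the point farthest from it.
    c0 = feats[0]
    c1 = max(feats, key=lambda f: math.dist(f, c0))
    centers = [c0, c1]
    labels = [0] * len(feats)
    for _ in range(iterations):
        # Assignment step: nearest center wins.
        labels = [0 if math.dist(f, centers[0]) <= math.dist(f, centers[1]) else 1
                  for f in feats]
        # Update step: move each center to the mean of its members.
        for k in (0, 1):
            members = [f for f, lab in zip(feats, labels) if lab == k]
            if members:  # keep the old center if a cluster goes empty
                centers[k] = mean(members)
    return labels
```

In a real pipeline the input colors would likely come from a crop of each detected player's torso, and a referee or goalkeeper may need a third cluster or separate handling.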

Deep learning medical imaging system for cancer detection

  • Binary classification: benign/malignant
  • Grad-CAM heatmaps for model explainability
  • OpenCV feature extraction pipeline
  • Auto-generated PDF diagnostic reports
  • Transfer learning with pretrained CNNs
  • Flask web interface for predictions

TensorFlow • OpenCV • Flask • Grad-CAM • Keras

GitHub
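
The Grad-CAM heatmaps mentioned for the medical imaging system combine a convolutional layer's feature maps with the gradients of the class score. The sketch below shows only that combination step on nested lists. It assumes the activations and gradients have already been extracted from the model, which in practice would be done through the framework's automatic differentiation. The function name and argument layout are hypothetical.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a Grad-CAM heatmap.

    activations, gradients: lists of K feature maps, each an H x W nested list.
    gradients[k][i][j] is d(class score) / d(activations[k][i][j]).
    """
    n_maps = len(activations)
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of each gradient map.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    # Weighted sum of the activation maps, followed by ReLU.
    cam = [[max(0.0, sum(weights[k] * activations[k][i][j] for k in range(n_maps)))
            for j in range(width)] for i in range(height)]
    # Normalize to [0, 1] for display; an all-zero map is returned unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam
```

The resulting map has the spatial size of the chosen layer, so it is usually resized to the input image before being overlaid as a heatmap.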

AI-powered molecular research & drug discovery platform

  • NVIDIA MolMIM for molecule generation
  • RDKit 3D molecular visualization
  • PubChem API integration for data
  • CMA-ES optimization + QED scoring
  • Interactive molecular property explorer
  • Firebase backend for user sessions

TypeScript • Next.js • Firebase • RDKit • NVIDIA NIMs

GitHub

Conversational AI with real-time web search capabilities

  • LangGraph agentic architecture
  • Real-time web search integration
  • GPT-4 powered intelligent responses
  • Live search visualization UI
  • Multi-turn conversation memory
  • FastAPI backend + Next.js frontend

LangGraph • Next.js • FastAPI • GPT-4 • OpenAI API

GitHub

Production-ready Deepfake Detection MLOps Pipeline

  • EfficientNet-based image classifier
  • DVC pipeline for data versioning
  • MLflow experiment tracking on DagsHub
  • Automated CI/CD with GitHub Actions
  • Docker containerized deployment
  • Deployed on Hugging Face Spaces

TensorFlow • DVC • MLflow • Docker • CI/CD

Demo


Research Implementations

Building architectures from scratch to truly understand them.

| Paper | Implementation | Key Details |
|---|---|---|
| Vision Transformer | ViT | 16×16 patch embedding, 12-layer encoder, multi-head attention |
| LoRA & QLoRA | PyTorch-LoRA | Low-rank adaptation, 4-bit quantization, 83% VRAM reduction |
| Vision-Language Models | VLMverse | PaliGemma, SigLIP, cross-modal attention fusion |
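
To make the 16×16 patch embedding in the Vision Transformer entry concrete, here is a minimal sketch of the patch-splitting step in plain Python. It is not taken from the repository. The 224×224 input size used in the note below is an assumption (the source states only the patch size), the function names are hypothetical, and the image dimensions are assumed to be divisible by the patch size.

```python
def num_patches(image_size, patch_size):
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size:
        raise ValueError("image_size must be divisible by patch_size")
    return (image_size // patch_size) ** 2


def patchify(image, patch_size):
    """Split an H x W x C nested-list image into flattened patches, row-major."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch_size x patch_size x C block into a single vector.
            patch = [channel
                     for row in image[top:top + patch_size]
                     for pixel in row[left:left + patch_size]
                     for channel in pixel]
            patches.append(patch)
    return patches
```

Under the assumed 224×224 RGB input, `num_patches(224, 16)` gives 196 patches, each flattened to 16 × 16 × 3 = 768 values. In a ViT, each of these vectors then passes through a learned linear projection before position embeddings are added.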

Multi-Agent Systems

| Project | Description |
|---|---|
| AgentForge (★ 2) | Multi-agent orchestration with CrewAI, LangGraph, PhiData |
| Travel Planner | 4-agent system: Flight, Hotel, Itinerary, Budget |
| Google ADK | Production agent patterns with Google ADK |

Learning Repos

Comprehensive implementations from fundamentals to SOTA:


GitHub Stats



3D Contributions


Let's Connect

Open to: AI/ML Engineering • Computer Vision • GenAI/LLMOps


LinkedIn • X • Email • Portfolio


"Building AI systems by understanding them from first principles"


Pinned

  1. Tennis-Vision (Public)

    Tennis Detection and Visualization System An advanced computer vision system for tennis match analysis that tracks players and ball movement with high precision. The system uses YOLOv8 and custom-t…

    Jupyter Notebook • ★ 23 • 1 fork

  2. MoleCuQuest (Public)

    Connecting biologists, medical researchers & molecule nerds — because science is ❤️........

    TypeScript

  3. vision_transformer-ViT- (Public)

    Vision Transformer implementation in PyTorch

    Jupyter Notebook

  4. AgentForge (Public)

    A comprehensive guide and collection of examples for building AI agents using modern frameworks like CrewAI, Agno, and smolagents. Learn to forge powerful AI agents through hands-on examples and pr…

    Jupyter Notebook • ★ 2 • 1 fork

  5. VLMverse (Public)

    PyTorch implementations of cutting-edge vision-language models from scratch. Demystifying multimodal AI with clean, educational code and detailed architectural breakdowns. Turn research papers into…

    Python

  6. PyTorch-LoRA-QLoRA (Public)

    Pure PyTorch implementations of LoRA and QLoRA for memory-efficient fine-tuning of large language models and vision transformers.

    Python