Ashwin Mathur

AI Engineer · Agentic RAG & Reranking · LLM Fine-Tuning & RL · Domain-Specific AI

I work on LLM systems for domain-specific applications in Finance, Bio-Medical, and Legal AI, spanning retrieval, agents, and model training. I've contributed to Haystack, MTEB, HuggingFace, and scikit-learn, and co-authored MMTEB, published at ICLR 2025. I develop open-source AI at AVNLP.

Developing Open-Source AI @ AVNLP

LLM Training & RL Alignment

Repository	Description
BioThink	Self-reflective biomedical QA training with QLoRA + GRPO: six reward functions teach the model to generate structured self-reflection tokens, and outputs are evaluated on seven metrics via LLM-as-a-Judge.
RAG Model Training	Fine-tuning LLMs for Adaptive-RAG, Corrective RAG, RQ-RAG, Self-RAG, Agentic RAG, and ReZero via SFT and GRPO across finance, biomedical, and open-domain QA.
GRPO	Four GRPO implementations comparing format/correctness rewards, DeepSpeed vs. PyTorch training, frozen/server/periodic reference models, and vLLM vs. Transformers rollout generation.
LLM Finetuning	SFT, DPO, KTO, ORPO, PPO, and GRPO pipelines with QLoRA/LoRA/DoRA/P-Tuning/Prefix-Tuning adapter training across ARC, FactScore, TriviaQA, PopQA, Earnings Calls, and GSM8K.
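
As a rough illustration of the idea shared by the GRPO work listed above, the sketch below computes group-relative advantages: each sampled completion's reward is normalized against the other completions drawn for the same prompt. This is a minimal sketch, not code from any of these repositories; the function name `group_relative_advantages`, the `eps` value, and the use of the population standard deviation are assumptions (some implementations use the sample standard deviation or skip the scaling).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper for illustration: advantage_i = (r_i - mean) / (std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, see lead-in
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored by a binary correctness reward.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group average receive positive advantages and the rest receive negative ones, so no separate value model is needed to produce a baseline.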

Retrieval Augmented Generation and Agents

Repository	Description
RAG Pipelines	Domain-specific RAG pipelines combining LangGraph orchestration, BAML structured generation, Milvus Hybrid Search, 3-layer metadata enrichment, and instruction-following rerankers for Medical and Financial QA.
DSPy Optimizers	DSPy RAG optimization with Weaviate Hybrid Search, Query Rewriting, Sub-Query Decomposition using MIPROv2/COPRO/BootstrapFewShot optimizers on FreshQA, HotpotQA, TriviaQA, and PubMedQA.
VectorDB	Haystack and LangChain retrieval pipelines spanning Dense/Sparse/Hybrid search, Reranking, Parent-Child Retrieval, Query Enhancement, and Multi-Tenancy across Pinecone, Weaviate, Milvus, Qdrant, and Chroma.

Information Retrieval & Ranking

Repository	Description
LLM Rankers	LLM rankers using Pairwise, Setwise, and Listwise techniques with RankZephyr/RankLlama, Pydantic-validated structured generation, and efficient zero-shot sorting.
Pairwise Ranking Prompting	Zero-shot pairwise reranking with All-Pairs, Heapsort, and Sliding-K strategies, using bidirectional comparison for position-bias mitigation and Pydantic-validated outputs.
Reciprocal Rank Fusion and LLM Rankers	Hybrid retrieval combining Reciprocal Rank Fusion with Diversity, Lost-in-the-Middle, and Similarity rankers, evaluated on BEIR (NDCG, MAP, Recall, Precision).
LLM Blender	LLM ensembling framework using PairRanker for cross-attention candidate ranking and GenFuser for top-K output fusion, packaged as a Haystack component.
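
For readers unfamiliar with Reciprocal Rank Fusion, mentioned in the table above, here is a minimal sketch of the standard formula, where a document's fused score is the sum of `1 / (k + rank)` over every ranked list it appears in. It is not taken from the repository; the function name `reciprocal_rank_fusion`, the document ids, and the common default `k=60` are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Hypothetical helper for illustration: score(d) = sum of 1 / (k + rank of d).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["d1", "d2", "d3"]   # e.g. results from a dense retriever
sparse = ["d3", "d1", "d4"]  # e.g. results from a sparse (BM25-style) retriever
fused = reciprocal_rank_fusion([dense, sparse])
```

Because only ranks are used, the fusion needs no score calibration between the dense and sparse retrievers, which is why it is a common choice for hybrid retrieval.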

Open-Source Contributions

Haystack - Built the Haystack evaluation framework (eval, EvaluationResult, calculate_metrics) and four metrics (EM, F1, SAS, MRR); added HuggingFace TEI Embedders and a sentence-transformer Diversity Ranker.
MTEB - Added the complete LegalBench Benchmark (160+ legal classification and retrieval datasets) and four Japanese benchmarks (JMTEB Clustering, JSICK, JaGovFaqs, NLPJournal).
Haystack Core Integrations - Implemented INSTRUCTOR Embedders, Optimum Embedders (ONNX runtime), Llama.cpp Generator, Pinecone Document Store, and Cohere V3 Embed model support.
HuggingFace Transformers, Evaluate - Contributed BioGPTForSequenceClassification and Trainer-free ViT pre-training scripts to Transformers, and scikit-learn integration guides to Evaluate.
scikit-learn, imbalanced-learn - Contributed three core scikit-learn features: OOB fitted scores for Gradient Boosting, sparse-matrix support for silhouette_samples, and multiclass average_precision_score.
voyage-embedders-haystack - Full Haystack integration for Voyage AI: text/document embedders, reranker, multimodal embeddings, and contextualized chunk embeddings; published on PyPI.

Publications

MMTEB: Massive Multilingual Text Embedding Benchmark (ICLR 2025)

The largest multilingual text embedding benchmark: 500+ tasks across 250+ languages and 10 task categories. Contributed the complete LegalBench suite - 160+ legal-domain classification and retrieval datasets.
