Stars
Fine-tuning the Alpaca model for Sinhala.
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Get your documents ready for gen AI
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 80+ languages.
This repository contains exhaustive coverage of a hands-on approach to PyTorch, alongside powerful tools to accelerate model tuning and training.
Various installation guides for Large Language Models
[EMNLP 2025] ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
Bring data to life with SVG, Canvas and HTML.
End-to-end GraphRAG implementation
This is a repository of RALM surveys containing a summary of state-of-the-art RAG and other technologies
A package for visualising Chroma vector collections in 3D
A single library to (down)load all existing sign language handshape datasets.
A fast, batteries-included static-site generator that transforms Markdown content into fully functional websites
Lime: Explaining the predictions of any machine learning classifier
An evolving list of electronic media data sets used to model mental-health status.
Minimal, single page, smooth-scrolling theme for Hugo static site generator.
Turns Data and AI algorithms into production-ready web applications in no time.
A utility library to programmatically generate markdown files
Jekyll theme for minimalists. Live at https://rahulbothra.com/parchment
A Python script that enables a Slack user to extract date, time, user, and text information from a given channel and write it to a text file.