Stars
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
"RAG-Anything: All-in-One RAG Framework"
Fast and Accurate ML in 3 Lines of Code
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Solve Visual Understanding with Reinforced VLMs
⚡ TabPFN: Foundation Model for Tabular Data ⚡
Scalable and user-friendly neural 🧠 forecasting algorithms.
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence https://arxiv.org/abs/2509.03505
[AAAI-23 Oral] Official implementation of the paper "Are Transformers Effective for Time Series Forecasting?"
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation.
Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
Edit Banana: A framework for converting statistical formats into editable.
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.
PKU-DAIR / open-box
Forked from thomas-young-2013/open-box
Towards Generalized and Efficient Blackbox Optimization System/Package (KDD 2021 & JMLR 2024)
VisioFirm: Cross-Platform AI-assisted Annotation Tool for Computer Vision
Python module for creating GDSII stream files, usually CAD layouts.
Robust Molecular Structure Recognition with Image-to-Graph Generation
Deep Reinforcement Learning of Analog Circuit Designs
Python (3.5) tool to convert .asc files into CircuiTikZ graphics
A versatile generative model capable of designing topologies for a wide range of analog circuits.
Code release for Representation Subspace Distance for Domain Adaptation Regression (ICML 2021)