Stars
Claude Code CLI integration for Unreal Engine 5.7 - Get AI coding assistance with built-in UE5.7 documentation context directly in the editor.
Official implementation of the arXiv paper "CausalEmbed: Auto-Regressive Multi-Vector Generation in Latent Space for Visual Document Embedding".
[EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference"
[NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference
Implementations of several LLM KV cache sparsity methods
The absolute trainer to light up AI agents.
MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension
[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance improvements through hardware-aware optimizations. The impleme…
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
"RAG-Anything: All-in-One RAG Framework"
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
MQN-80 / mindnlp
Forked from candle-org/mindnlp
Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.
Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Multimodal Retrieval-augmented Generation Framework Built by Tongyi Lab, Alibaba Group.
Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning".
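This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into instruction-tuning format, and fine-tune LLMs on it, thereby strengthening LLMs' understanding of tabular data and ultimately building a large language model dedicated to table intelligence tasks.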