Stars
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data generation.
Official Implementation of Visual Abstraction: A Plug-and-Play Approach for Text-Visual Retrieval
Train transformer language models with reinforcement learning.
[NeurIPS 2025] 🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability
[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)
🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".
Benchmarking Chat Assistants on Long-Term Interactive Memory (ICLR 2025)
VisRet: Visualization Improves Knowledge-Intensive Text-to-Image Retrieval
Repository for the paper "Visual-RAG: Benchmarking Text-to-Image Retrieval Augmented Generation for Visual Knowledge Intensive Queries"
MTEB: Massive Text Embedding Benchmark
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
Synthetic data curation for post-training and structured data extraction
Lists of HTTP, SOCKS4, SOCKS5 proxies with geolocation info. Updated every hour.
Custom location files for Endless ATC · By startgrid
Official Code Repository of Paper, 'The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention'.
LLM Test-Time Compute Scaling: Papers and Resources 🔥
Flight manuals (FlightManual): FCOM, FCTM, SOP, QRH, NATOPS, ...