Stars
CL-bench: A Benchmark for Context Learning
Fast, Flexible and Portable Structured Generation
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
DeepResearch: A model-agnostic multi-agent framework that transforms complex research into systematic, collaborative sub-agent—planner, information collection and iterating content through intellig…
A simple yet powerful agent framework that delivers with open-source models
MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest models, MiroThinker-1.7 and MiroThinker-H1, achieve 74.0 and 88.2 on BrowseComp, respectively.
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using the RAG paradigm.
Train transformer language models with reinforcement learning.
An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills, subagents, and a message gateway, it handles different levels of…
Streamlit — A faster way to build and share data apps.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Youtu-Embedding is an industry-leading, general-purpose text representation model developed by Tencent Youtu Lab.
A lightweight LMM-based Document Parsing Model
[PyTorch] The repo contains the code for "FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets"
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
Benchmarking Recommendation Abilities for Large Language Models
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Kode Agent — Design for post-human workflows. One unit agent for every human & computer task.
RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation
Toolkit for linearizing PDFs for LLM datasets/training
Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your Agentic workflows.
[KDD 2025] Quadratic Neural Networks for Click-through Rate Prediction
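The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL 2025.
A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with an example of fine-tuning for triple information extraction.
Uses the pipelines feature of open-webui to call a ragflow agent from within open-webui, providing knowledge-base-backed intelligent dialogue with a polished interface.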
Pipelines: Versatile, UI-Agnostic OpenAI-Compatible Plugin Framework