Stars
🚀🚀 "Large Model": train a 64M-parameter GPT completely from scratch in just 2 hours! 🌏
🔥 LLM & Agent Interview Preparation Guide: a complete guide to standard interview questions
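[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AIGC algorithm engineers. Covers practical interview and written-test experience and core knowledge from across the AI industry, including AIGC, LLMs, AI agents, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, big data mining, embodied intelligence, the metaverse, and AGI.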
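Interview questions (with answers) for LLM algorithm positions: common questions and concept explanations. "LLM interview questions", "algorithm position interviews", "common interview questions", "LLM algorithm interviews", "LLM application fundamentals"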
Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
[ICLR 2026] ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving
The Source Code for OmniVideoBench @ICLR 2026
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
Universal Video Temporal Grounding with Generative Multi-modal Large Language Models
Unified Codebase for Advanced World Models.
Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
**Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos.
A GUI client for Windows, Linux, and macOS that supports Xray, sing-box, and others
Proxy setup adapted for AutoDL platform servers, using Clash as the proxy tool
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
Easy Data Preparation with latest LLMs-based Operators and Pipelines.
verl: Volcano Engine Reinforcement Learning for LLMs
Time-R1: Framework and resources for endowing LLMs with comprehensive temporal reasoning (understanding, prediction, creative generation) using a novel three-stage RL curriculum. Includes the Time-…
🔥Awesome Multimodal Large Language Models Paper List
Named Entity Recognition (NER) for biomedical research papers using BERT, BioBERT, BiLSTM, and CRF models. Implements deep learning and reinforcement learning to enhance medical text extraction accu…
Notebook for BERT medical named entity recognition