Lists (32)
3d
3d-reconstruction-gen
4D-spatial
agent-tools
autonomous-driving
basemodel
casuality
contrastive-learning
depth
depth related work.
detection
diffusion
to learn the diffusion related works
generation
image generation works
geometry
3d geometry libs & repos
graph
graph-related works.
human_pose
Know-distillation
knowledge
some knowledge to learn
laneDetection
LLM-tools
LLMs/VLMs
multi-modal
online-course resource
some online course materials for learning.
open-vocabulary
prompt-related
works involving `prompts`
RL
robotics
segmentation
papers and implementations about segmentation
self/semi-supervised learning
works in self-/semi-supervised learning
temporal
time-varied works
tools
tranditional_cv
world-model
world model papers / repos
Starred repositories
Full-stack China A-Share data toolkit for AI coding assistants: 7-layer architecture · 27 endpoints · 13 data sources · zero third-party dependencies
Production-grade engineering skills for AI coding agents.
TradingAgents: Multi-Agents LLM Financial Trading Framework
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
Must-read papers on prompt-based tuning for pre-trained language models.
Learn any journal's writing conventions from its published papers, then revise your manuscript to match — section by section.
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
WorldEngine: Towards the Era of Post-Training for Physical AI
A self-hosted ML coding practice platform. 68 problems from ReLU to flow matching — attention, training, RLHF, diffusion, and more. Instant feedback in the browser.
Code for SIRE: SE(3) Intrinsic Rigidity Embeddings
[ICLR'26] YoNoSplat: You Only Need One Model for Feedforward 3D Gaussian Splatting
[ICCV 2025 Highlight] Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation" & Causal Forcing++
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation
🤗 smolagents: a barebones library for agents that think in code.
Learning Less is More - 6D Camera Localization via 3D Surface Regression
Tools for converting Copilot chat conversations to markdown format
[AAAI 2026] AD-L-JEPA: Self-Supervised Representation Learning with Joint Embedding Predictive Architecture for Automotive LiDAR Object Detection
A novel lightweight monocular depth estimation method
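The blogger 毒奶's personal proxy-service ("airport") recommendation: from 15 yuan/month for 100 GB (up to 20% off), SS/v2Ray/Trojan protocol support, IEPL dedicated lines, stable low latency, and unlocking of ChatGPT, Netflix, and other streaming services.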
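🧑🚀 Summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).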
VIP cheatsheet for Stanford's CME 295 Transformers and Large Language Models
[TPAMI 2026] A Survey on 3D Gaussian Splatting Applications: Segmentation, Editing, and Generation
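"Build a Large Language Model (From Scratch)" is an e-book that explores the principles and implementation of large language models in depth, suited to learners who want a thorough understanding of the architecture, training process, and application development of large models such as GPT. To make this valuable textbook accessible to more Chinese readers, I decided to translate it into Chinese and share it as open source on GitHub.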