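A step-by-step guide to getting started with AI/LLMs, with tutorials and demo code that take you from API usage to local LLM deployment and fine-tuning. The code files come with online Kaggle or Colab versions, so you can follow along even without a GPU. The project also includes a small code playground 🎡 where you can experiment with some interesting AI scripts. It also contains a complete Chinese mirror of the assignments from Hung-yi Lee's (李宏毅) 2024 Introduction to Generative AI course.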

Python 4,027 427 Updated Apr 20, 2026

Recommend new arXiv papers of interest to you daily, based on your Zotero library.

Python 5,216 4,601 Updated Apr 14, 2026

Kimi-Vendor-Verifier

Python 62 8 Updated Feb 24, 2026

Moonshot's most powerful model

1,886 230 Updated Jan 31, 2026
Python 26 Updated Feb 27, 2026

[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression

Python 154 5 Updated Apr 7, 2026

[ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"

Python 62 3 Updated Jan 28, 2026

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 4,357 328 Updated Jan 14, 2026

NEO Series: Native Vision-Language Models from First Principles

Python 729 27 Updated Apr 26, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,898 371 Updated Apr 6, 2026

[CVPR 2026] Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"

Python 78 5 Updated Mar 20, 2026

[CVPR 2026] Official code for "Monet: Reasoning in Latent Visual Space Beyond Image and Language"

Python 178 2 Updated Mar 19, 2026

An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.

Python 1,840 211 Updated Apr 10, 2026

Training Large Language Model to Reason in a Continuous Latent Space

Python 1,591 175 Updated Apr 8, 2026

This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.

322 6 Updated Apr 2, 2026

📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.

385 6 Updated Nov 5, 2025

Implementation of "Interleaved Latent Visual Reasoning with Selective Perceptual Modeling".

Python 49 3 Updated Apr 8, 2026

[NeurIPS 2025] Official code for paper: Latent Chain-of-Thought for Visual Reasoning

Python 35 Updated Oct 16, 2025

Official codebase for the paper Latent Visual Reasoning

Python 150 9 Updated Oct 22, 2025

Open-source unified multimodal model

Python 5,877 522 Updated Oct 27, 2025

Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)

Python 713 27 Updated Sep 24, 2025

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 1,659 132 Updated Nov 21, 2025

[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding

Python 522 12 Updated Nov 14, 2025

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 3,434 224 Updated May 19, 2025

This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)

Python 306 28 Updated Dec 5, 2024

This repo contains the code for a 1D tokenizer and generator

Jupyter Notebook 1,145 67 Updated Mar 20, 2025

[CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models

Python 76 2 Updated Apr 20, 2026

A collection of token reduction (token pruning, merging, clustering, etc.) techniques for ML/AI

417 15 Updated Apr 29, 2026

[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding

Python 206 16 Updated Dec 19, 2025

Agentic Keyframe Search for Video Question Answering

Python 18 Updated Apr 7, 2025