
Starred repositories

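A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization (DPO). The model has 1B parameters and supports Chinese and English.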

Python 840 106 Updated Feb 18, 2025

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

Python 1,438 88 Updated Apr 30, 2026

AllenAI's post-training codebase

Python 3,727 540 Updated May 17, 2026

PyTorch building blocks for the OLMo ecosystem

Python 1,223 241 Updated May 17, 2026

[NeurIPS2024] - SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion

Python 103 4 Updated Oct 29, 2025

[CVPR2026] Detect Anything via Next Point Prediction

Jupyter Notebook 1,351 91 Updated Feb 22, 2026

Depth Anything 3

Python 5,295 571 Updated Mar 21, 2026

torchcomms: a modern PyTorch communications API

C++ 363 147 Updated May 17, 2026

Official code repo for our work "Native Visual Understanding: Resolving Resolution Dilemmas in Vision-Language Models"

Python 54 3 Updated Jun 17, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 19,193 1,764 Updated Jan 30, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 27,932 5,958 Updated May 18, 2026

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,963 616 Updated May 3, 2024

Train a 1B LLM on 1T tokens from scratch as an individual

Jupyter Notebook 802 79 Updated Apr 27, 2025

Building a large model from scratch: a complete walkthrough from pretraining to RLHF

Python 2,649 208 Updated May 16, 2026

Train a LLaVA model with better Chinese support, and open-source the training code and data.

Python 82 12 Updated Sep 6, 2024

Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Jupyter Notebook 593 66 Updated Jul 11, 2024

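A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, RLHF optimization, and more. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.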

Python 1,712 196 Updated Apr 20, 2024

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 10,411 839 Updated Mar 30, 2026

Nano vLLM

Python 13,471 2,101 Updated Apr 26, 2026

A single-file educational implementation for understanding vLLM's core concepts and running LLM inference.

Python 43 6 Updated Apr 7, 2026

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 3,369 417 Updated Jan 17, 2026

Splices the vision head of SmolVLM2 onto the Qwen3-0.6B model and fine-tunes the combination

Python 581 54 Updated Sep 8, 2025
Python 46 3 Updated Jan 14, 2026

[ICCV2025] PyTorch implementation of "Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models"

Python 123 5 Updated Jan 24, 2026

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 3,136 263 Updated May 15, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 80,280 16,883 Updated May 18, 2026

[CVPR 2025] Official implementation for "Empowering LLMs to Understand and Generate Complex Vector Graphics" https://arxiv.org/abs/2412.11102

Python 637 12 Updated May 22, 2025

From a nobody to a large language model (LLM) hero~ Follow along for what comes next!!!

Jupyter Notebook 2,185 147 Updated May 4, 2026

👀 "Large model": train a 65M-parameter vision multimodal VLM from scratch in just 2 hours!

Python 7,920 864 Updated May 12, 2026

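A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.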

22,567 2,128 Updated May 10, 2026