csgcmai
😜
Be the fire and wish for the wind

Starred repositories

CaptionQA: Is Your Caption as Useful as the Image Itself?

Python 25 1 Updated Dec 10, 2025

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 6,233 722 Updated Dec 11, 2025

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

TypeScript 20,023 1,909 Updated Dec 15, 2025

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 732 55 Updated Aug 6, 2025

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.

Python 21,971 2,099 Updated Nov 20, 2025

Official implementation of "Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs".

Python 94 10 Updated Nov 7, 2025

[NeurIPS 2025 Spotlight] A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone.

Python 39 Updated Oct 29, 2025

Building a MoE large model as an individual: a complete walkthrough from pre-training to DPO

Python 2,075 155 Updated Dec 16, 2025

MISP-Meeting Dataset & Code

Python 2 2 Updated Jul 23, 2025

Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, together with strong general video understanding capability.

Python 507 28 Updated Aug 14, 2025

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,482 216 Updated Dec 15, 2025

Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.

Python 140 29 Updated Dec 19, 2025

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 5,027 395 Updated Dec 19, 2025

ScalarLM - a unified training and inference stack

Python 93 10 Updated Nov 18, 2025

AllenAI's post-training codebase

Python 3,458 476 Updated Dec 20, 2025

PyTorch building blocks for the OLMo ecosystem

Python 599 108 Updated Dec 20, 2025

Repository containing code and data for the paper "ArgCMV: An Argument Summarization Benchmark for the LLM-era", accepted at EMNLP 2025 Main Conference.

Python 1 Updated Nov 7, 2025

Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model

Python 911 57 Updated Nov 26, 2025

Awesome LLM pre-training resources, including data, frameworks, and methods.

298 20 Updated Apr 29, 2025

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 599 75 Updated Sep 11, 2024

Latency and Memory Analysis of Transformer Models for Training and Inference

Python 467 55 Updated Apr 19, 2025

LLM theoretical performance analysis tools, supporting params, FLOPs, memory, and latency analysis.

Python 113 10 Updated Jul 11, 2025

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

1,850 78 Updated Dec 6, 2025

A high-performance inference engine for LLMs, optimized for diverse AI accelerators.

C++ 827 101 Updated Dec 19, 2025

Transformer-related optimization, including BERT and GPT

C++ 6,370 927 Updated Mar 27, 2024

[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Python 319 25 Updated Nov 26, 2025

[TMLR 2024] Efficient Large Language Models: A Survey

1,239 97 Updated Jun 23, 2025

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

374 31 Updated Nov 11, 2025