BodhiHu
🌴 bodhicitta
  • AMD, MooreThreads
  • Shanghai

106 stars written in Python

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,983 524 Updated Apr 11, 2025

Colored logcat script which only shows log entries for a specific application package.

Python 4,921 509 Updated May 10, 2024

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Python 4,737 1,291 Updated Nov 6, 2025

Deep Learning Tutorial notes and code. See the wiki for more info.

Python 4,142 2,128 Updated Sep 30, 2020
Python 3,892 255 Updated Mar 15, 2024

MS-Agent: Lightweight Framework for Empowering Agents with Autonomous Exploration in Complex Task Scenarios

Python 3,553 406 Updated Nov 7, 2025

Awesome React Native UI components updated weekly

Python 3,451 338 Updated Aug 28, 2023

Sparsity-aware deep learning inference runtime for CPUs

Python 3,160 191 Updated Jun 2, 2025

A Pythonic framework to simplify AI service building

Python 2,796 192 Updated Nov 6, 2025

Data manipulation and transformation for audio signal processing, powered by PyTorch

Python 2,765 738 Updated Nov 6, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,492 363 Updated Nov 7, 2025

OpenAI-style API for open large language models: use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3 etc.…

Python 2,463 282 Updated Sep 26, 2024

Convert Sketch files into React Native components

Python 2,320 161 Updated Sep 3, 2020

A lightweight framework for building LLM-based agents

Python 2,198 223 Updated Aug 6, 2025

Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Python 2,146 157 Updated Jun 2, 2025

Docker images for production and development setups of the Frappe framework and ERPNext

Python 2,082 2,089 Updated Nov 5, 2025

Use AnimeGANv3 to make your own animation works, including turning photos or videos into anime.

Python 1,961 258 Updated Aug 23, 2025

中文羊驼大模型三期项目 (Chinese Llama-3 LLMs) developed from Meta Llama 3

Python 1,947 167 Updated Sep 23, 2024

An AutoGPT agent that controls Chrome on your desktop

Python 1,749 213 Updated Oct 25, 2023

A quickstart and benchmark for PyTorch distributed training.

Python 1,663 296 Updated Jul 25, 2024

Chat language model that can use tools and interpret the results

Python 1,586 118 Updated Nov 4, 2025

A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment…

Python 1,518 192 Updated Nov 7, 2025

🩹Editing large language models within 10 seconds⚡

Python 1,350 101 Updated Aug 13, 2023

[NeurIPS'24 Spotlight, ICLR'25, ICML'25] Speeds up long-context LLM inference by computing attention with approximate and dynamic sparsity, which reduces inference latency by up to 10x for pre-filli…

Python 1,148 63 Updated Sep 30, 2025

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

Python 995 62 Updated Dec 6, 2024

LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.

Python 862 124 Updated Nov 7, 2025

An LLM-based Agent for the New Automation Paradigm - Agentic Process Automation

Python 852 94 Updated Dec 27, 2023

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 749 148 Updated Nov 7, 2025

Quick and reliable way to convert NGINX configurations into JSON and back.

Python 745 90 Updated May 20, 2024

Convert ONNX models to PyTorch.

Python 707 85 Updated Oct 14, 2025