13 stars written in Python

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 69,914 13,330 Updated Feb 10, 2026

The official Meta Llama 3 GitHub site

Python 29,235 3,515 Updated Jan 26, 2025

Universal LLM Deployment Engine with ML Compilation

Python 22,018 1,932 Updated Feb 9, 2026

On-device AI across mobile, embedded and edge for PyTorch

Python 4,248 830 Updated Feb 10, 2026

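This project shares course materials, notes, final exam papers, and other useful resources from the undergraduate and graduate programs of the School of Computer Science at Sun Yat-sen University. We hope it helps with your studies ❤️. If you like it, remember to give it a star 🌟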

Python 2,307 292 Updated Jan 29, 2026

Code for the ACL 2017 paper "Get To The Point: Summarization with Pointer-Generator Networks"

Python 2,196 806 Updated Jun 16, 2022

"MiniRAG: Making RAG Simpler with Small and Open-Sourced Language Models"

Python 1,708 228 Updated Oct 16, 2025

Code for the paper "Fine-tune BERT for Extractive Summarization"

Python 1,504 420 Updated Jan 11, 2022

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

Python 1,406 87 Updated Apr 21, 2025

Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.

Python 916 157 Updated Jan 28, 2026

Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK

Python 90 5 Updated Feb 5, 2026

A project for QNN quantization and CPU inference of YOLOv5 in the Qualcomm AI Engine Direct environment

Python 16 1 Updated Sep 10, 2024
Python 5 2 Updated Dec 12, 2024