Code for the ACL 2017 paper "Get To The Point: Summarization with Pointer-Generator Networks"

Python 2,195 809 Updated Jun 16, 2022

Code for paper Fine-tune BERT for Extractive Summarization

Python 1,506 422 Updated Jan 11, 2022

On-device AI across mobile, embedded and edge for PyTorch

Python 3,757 767 Updated Dec 20, 2025

"MiniRAG: Making RAG Simpler with Small and Open-Sourced Language Models"

Python 1,615 216 Updated Oct 16, 2025

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…

C++ 13,735 2,141 Updated Dec 19, 2025

Awesome papers on Language-Model-as-a-Service (LMaaS)

550 32 Updated May 14, 2024
C++ 39 6 Updated Dec 16, 2025

High-speed and easy-use LLM serving framework for local deployment

C++ 139 19 Updated Aug 7, 2025

Self-implemented NN operators for Qualcomm's Hexagon NPU

C 34 6 Updated Sep 30, 2025

Universal LLM Deployment Engine with ML Compilation

Python 21,769 1,891 Updated Dec 11, 2025

A primitive library for neural network

C++ 1,370 223 Updated Nov 24, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 65,813 12,083 Updated Dec 20, 2025

LLM inference in C/C++

C++ 91,603 14,159 Updated Dec 20, 2025

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.

Python 1,400 85 Updated Apr 21, 2025

Fast Multimodal LLM on Mobile Devices

C++ 1,282 156 Updated Dec 19, 2025

Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK

C++ 89 5 Updated Dec 2, 2025

Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.

Python 867 151 Updated Dec 16, 2025

The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.

Java 350 88 Updated Dec 17, 2025

Run Chinese MobileBert model on SNPE.

C++ 15 5 Updated May 19, 2023

A simple tutorial of SNPE.

C++ 182 46 Updated Mar 30, 2023

A project for QNN quantization and CPU inference of YOLOv5 in the Qualcomm AI Engine Direct environment

Python 16 Updated Sep 10, 2024

The official Meta Llama 3 GitHub site

Python 29,141 3,501 Updated Jan 26, 2025

High-speed Large Language Model Serving for Local Deployment

C++ 8,496 461 Updated Aug 2, 2025
Python 5 2 Updated Dec 12, 2024

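A collection of notices and announcements for 2022 CS postgraduate-recommendation summer camps. Everyone is welcome to share summer camp information; how about showing some support for the spirit of the internet?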

1,248 177 Updated Sep 28, 2022