Stars
Code for the ACL 2017 paper "Get To The Point: Summarization with Pointer-Generator Networks"
Code for the paper "Fine-tune BERT for Extractive Summarization"
On-device AI across mobile, embedded and edge for PyTorch
"MiniRAG: Making RAG Simpler with Small and Open-Sourced Language Models"
MNN is a blazing-fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android app: [MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
Awesome papers on Language-Model-as-a-Service (LMaaS)
High-speed and easy-to-use LLM serving framework for local deployment
Self-implemented NN operators for Qualcomm's Hexagon NPU
Universal LLM Deployment Engine with ML Compilation
A high-throughput and memory-efficient inference and serving engine for LLMs
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK
Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
Run a Chinese MobileBERT model on SNPE.
High-speed Large Language Model Serving for Local Deployment
A compilation of notices and announcements for 2022 CS graduate-recommendation (baoyan) summer camps. Everyone is welcome to share summer camp information and support the open spirit of the internet!