Stars
6 stars written in Python
Official inference framework for 1-bit LLMs
SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡
An Open Framework for Federated Learning.
Real-time human detection and tracking camera using YOLOv5 and Arduino