Stars
Implementation for "Model Compression with Exact Budget Constraints via Riemannian Manifolds"
Convert PDF to markdown + JSON quickly with high accuracy
Efficient non-uniform quantization with GPTQ for GGUF
A high-throughput and memory-efficient inference and serving engine for LLMs
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
FedCCL: Federated Clustered Continual Learning Framework for Privacy-focused Energy Forecasting
Tensors and dynamic neural networks in Python with strong GPU acceleration
A lightning-fast search engine API bringing AI-powered hybrid search to your sites and applications.
A library for efficient similarity search and clustering of dense vectors.
High performance self-hosted photo and video management solution.