Stars
Source code of the CVPR 2019 paper "Deep Exemplar-based Video Colorization".
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)
oneAPI Collective Communications Library (oneCCL)
Enables GPU utilization and inference with an Intel Arc GPU even when Resizable BAR (ReBAR) is unavailable because the chipset does not support it.
Glances an Eye on your system. A top/htop alternative for GNU/Linux, BSD, Mac OS and Windows operating systems.
Run generative AI models with a simple C++/Python API using the OpenVINO Runtime.
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
A Model Context Protocol (MCP) server for ATLAS, a Neo4j-powered task management system for LLM agents, implementing a three-tier architecture (Projects, Tasks, Knowledge) to manage complex workflows.
Optimizing inference proxy for LLMs
chnxq / ollama
Forked from ollama/ollama. Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Kilo is the all-in-one agentic engineering platform. Build, ship, and iterate faster with the most popular open source coding agent. #1 on OpenRouter. 750k+ Kilo Coders. 6.1 trillion tokens/month.
Delete all your messages in groups/supergroups using this Python script.
Hybrid Schema-Guided Reasoning (SGR): an agentic system design created by the neuraldeep community.
A scalable inference server for models optimized with OpenVINO™
Reference setup scripts for developer kits across various Intel platforms and GPUs.
Let llama3 perform web searches and retrieve information using SearXNG.
Boost your efficiency with Fish Speech Batch Inference. Easily process multiple texts and achieve consistently great results. 🗨️🐟
Supercharge Your LLM with the Fastest KV Cache Layer
Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, embedding, and rerank models over OpenAI-compatible endpoints.
An OpenAPI-like API server for voice generation (TTS) based on the fish-speech-1.5 model.
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
21 Lessons, Get Started Building with Generative AI
A repository of Adreno drivers compatible with Skyline, Strato, Vita3K, and Yuzu on Android.
Use an Intel Arc series GPU to run Ollama, Stable Diffusion, Whisper, and Open WebUI for image generation, speech recognition, and interaction with large language models (LLMs).