RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of …

Python 2,272 271 Updated Nov 5, 2025

Reliable model swapping for any local OpenAI-compatible server: llama.cpp, vLLM, etc.

Go 1,812 117 Updated Nov 4, 2025

Sparse Inferencing for transformer based LLMs

Python 201 12 Updated Aug 11, 2025

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 1,973 221 Updated Nov 5, 2025

Infinity is a high-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP, and ColPali

Python 2,532 174 Updated Oct 30, 2025

Label Studio is a multi-type data labeling and annotation tool with standardized output format

JavaScript 25,329 3,165 Updated Nov 5, 2025

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 2,920 222 Updated Nov 4, 2025

World's first AI meeting copilot → The Invisible Companion for Work + Life

JavaScript 2,745 210 Updated May 27, 2025

Instrument your FastAPI with Prometheus metrics.

Python 1,269 107 Updated Oct 1, 2025

Dynamic DNS Server with Web UI written in Go

Go 208 52 Updated Apr 16, 2025

SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.

Python 7,415 612 Updated Aug 17, 2025

letsencrypt/acme client implemented as a shell-script – just add water

Shell 6,144 726 Updated Oct 24, 2025

DSPy: The framework for programming—not prompting—language models

Python 29,808 2,375 Updated Nov 4, 2025

A framework for few-shot evaluation of language models.

Python 10,528 2,829 Updated Oct 29, 2025

Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and more.

Go 5,597 722 Updated Nov 5, 2025

Fast State-of-the-Art Static Embeddings

Python 1,890 106 Updated Oct 11, 2025

A curated list of awesome things related to FastAPI

10,636 778 Updated Oct 27, 2025

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python 32,283 2,187 Updated Nov 3, 2025

Production-ready platform for agentic workflow development.

TypeScript 118,105 18,255 Updated Nov 5, 2025

A compact LLM pretrained in 9 days by using high quality data

Python 332 26 Updated Apr 9, 2025

Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024

Python 2,471 226 Updated Nov 4, 2025

《程式英文》 ("Programming English"): improve code readability through English

C# 1,001 47 Updated May 20, 2021

Open Weight, tool-calling LLMs

Makefile 155 19 Updated Oct 24, 2024

A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.

Python 1,565 96 Updated May 28, 2025

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript 114,309 15,932 Updated Nov 5, 2025

The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text.

C++ 4,170 392 Updated Nov 4, 2025

Ollama Python library

Python 8,810 848 Updated Oct 7, 2025

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…

Python 2,774 303 Updated Jun 24, 2024

Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.

Python 4,341 388 Updated Oct 22, 2025

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Python 3,749 260 Updated May 17, 2025