rootfs
🎯
Focusing

Organizations

@ceph @rook @fast-ml @redhat-et @os-climate

a toolkit on knowledge distillation for large language models

Python 375 36 Updated Mar 10, 2026

Transform unstructured text into structured knowledge with LLMs. Graphs, hypergraphs, and spatio-temporal extractions — with one command.

Python 919 104 Updated May 18, 2026

GitNexus: The Zero-Server Code Intelligence Engine - GitNexus is a client-side knowledge graph creator that runs entirely in your browser. Drop in a GitHub repo or ZIP file, and get an interactive …

TypeScript 38,796 4,441 Updated May 18, 2026

CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies

Rust 49,522 3,016 Updated May 17, 2026

A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code by TÂCHES.

JavaScript 62,785 5,341 Updated May 18, 2026

omo; the best agent harness - previously oh-my-opencode

TypeScript 58,315 4,727 Updated May 18, 2026

LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar

Python 14,446 1,213 Updated May 14, 2026

TabICLv2: A state-of-the-art tabular foundation model

Python 874 111 Updated May 1, 2026

⚡ TabPFN: Foundation Model for Tabular Data ⚡

Python 7,082 695 Updated May 16, 2026

Implementation of the sap-rpt-1-oss deep learning model with inference pipeline as described in the paper "ConTextTab: A Semantics-Aware Tabular In-Context Learner".

Python 168 24 Updated Nov 27, 2025

TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models

Python 90 8 Updated May 11, 2026

Easy Data Preparation with the latest LLM-based Operators and Pipelines.

Python 3,722 378 Updated Apr 15, 2026

A high-performance and light-weight router for vLLM large scale deployment

Rust 230 81 Updated May 6, 2026

Modeling, training, eval, and inference code for OLMo

Python 6,509 757 Updated Nov 24, 2025
Python 110 10 Updated Jun 2, 2025

Bringing BERT into modernity via both architecture changes and scaling

Python 1,676 145 Updated Mar 1, 2026

A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval

JavaScript 14,268 1,440 Updated May 1, 2026

A framework for efficient model inference with omni-modality models

Python 4,791 938 Updated May 18, 2026

RouterArena: An open framework for evaluating LLM routers with standardized datasets, metrics, an automated framework, and a live leaderboard.

Python 77 18 Updated May 15, 2026

The Future of Data Engineering — A CLI SQL client for the modern data stack, enabling AI-native context engineering for data.

Python 1,246 196 Updated May 18, 2026

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Go 4,181 673 Updated May 18, 2026

LLM Semantic Router: Intelligent Mixture-of-Models (MoM) System with Privacy Preservation and Prompt Guard. The semantic router intelligently directs OpenAI compliant API requests to the most suita…

Python 21 12 Updated Aug 30, 2025

Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.

Swift 2,940 450 Updated Mar 29, 2026

CLIP-Finder enables semantic offline searches of images from gallery photos using natural language descriptions or the camera. Built on Apple's MobileCLIP-S0 architecture, it ensures optimal perfor…

Swift 91 11 Updated Jul 25, 2024

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,201 478 Updated May 16, 2026

Latency and Memory Analysis of Transformer Models for Training and Inference

Python 487 58 Updated Apr 19, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,900 1,046 Updated May 7, 2026

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,954 325 Updated Jan 14, 2026

Cloud Native Observability and Policy Engine for LLM Applications

Python 7 1 Updated Mar 19, 2026