Organizations

@openshift @ceph @coreos-inc @rook @fast-ml @redhat-et @os-climate @llm-d

Intelligent Mixture-of-Models Router for Efficient LLM Inference

Python 1,632 183 Updated Oct 9, 2025

LLM Semantic Router: Intelligent Mixture-of-Models (MoM) System with Privacy Preservation and Prompt Guard. The semantic router intelligently directs OpenAI compliant API requests to the most suita…

Python 18 10 Updated Aug 30, 2025

Run OpenAI's CLIP and Apple's MobileCLIP models on iOS to search photos.

Swift 2,884 443 Updated Jan 4, 2025

CLIP-Finder enables semantic offline searches of images from gallery photos using natural language descriptions or the camera. Built on Apple's MobileCLIP-S0 architecture, it ensures optimal perfor…

Swift 84 10 Updated Jul 25, 2024

llm-d enables high-performance distributed LLM inference on Kubernetes

Makefile 1,857 189 Updated Oct 9, 2025

Latency and Memory Analysis of Transformer Models for Training and Inference

Python 457 53 Updated Apr 19, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,363 945 Updated Sep 23, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,868 304 Updated Mar 10, 2025
Python 4,534 361 Updated Jun 12, 2025

Cloud Native Observability and Policy Engine for LLM Applications

Python 7 1 Updated Feb 10, 2025

GitHub Action to Create an AWS EC2 Self-hosted Runner

Shell 2 1 Updated Oct 3, 2025

Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...

2,132 172 Updated Apr 30, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 74,927 10,959 Updated Oct 9, 2025

Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This framework enables Claud…

Python 11,127 1,170 Updated Dec 12, 2024

Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget

Python 161 9 Updated Aug 11, 2025

Carbon Limiting Auto Tuning for Kubernetes

Go 37 8 Updated Nov 11, 2024

Grok open release

Python 50,527 8,375 Updated Aug 30, 2024

vLLM Router

Python 44 2 Updated Mar 11, 2024
Jupyter Notebook 8 1 Updated Apr 28, 2024

A reproduction of the Gemini demo using GPT-vision.

JavaScript 127 45 Updated Dec 20, 2023

Create an AWS EC2 GitHub Action Self-hosted Runner

Shell 1 Updated Dec 12, 2023

An MIT-licensed, deployable starter kit for building and customizing your own version of AI town - a virtual town where AI characters live, chat and socialize.

TypeScript 8,880 899 Updated Feb 13, 2025

code samples for the goodreads datasets

Jupyter Notebook 289 62 Updated Feb 4, 2025

FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

Jupyter Notebook 17,682 2,515 Updated Oct 3, 2025

Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, 讯飞星火, 文心一言 and more, discover the best answers

JavaScript 16,038 1,691 Updated Sep 25, 2025

A local-first knowledge management app

Vue 468 29 Updated Aug 11, 2025

Type less, code more: Cody is an AI code assistant that uses advanced search and codebase context to help you write and fix code.

TypeScript 3,795 473 Updated Aug 1, 2025

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.

JavaScript 49,788 5,201 Updated Oct 8, 2025

Document Chatbot — multiple files. Powered by GPT / Embedding.

TypeScript 3,350 486 Updated Dec 17, 2024