Red Hat
United States
(UTC -04:00)
in/huaminchen
@root_fs
Stars
Intelligent Mixture-of-Models Router for Efficient LLM Inference
LLM Semantic Router: Intelligent Mixture-of-Models (MoM) System with Privacy Preservation and Prompt Guard. The semantic router intelligently directs OpenAI-compliant API requests to the most suita…
Run OpenAI's CLIP and Apple's MobileCLIP model on iOS to search photos.
CLIP-Finder enables semantic offline searches of images from gallery photos using natural language descriptions or the camera. Built on Apple's MobileCLIP-S0 architecture, it ensures optimal perfor…
llm-d enables high-performance distributed LLM inference on Kubernetes
Latency and Memory Analysis of Transformer Models for Training and Inference
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.
Cloud Native Observability and Policy Engine for LLM Applications
GitHub Action to Create an AWS EC2 Self-hosted Runner
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This framework enables Claud…
Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget
Carbon Limiting Auto Tuning for Kubernetes
A reproduction of the Gemini demo using GPT-vision.
Create an AWS EC2 GitHub Action Self-hosted Runner
An MIT-licensed, deployable starter kit for building and customizing your own version of AI town - a virtual town where AI characters live, chat and socialize.
Code samples for the Goodreads datasets
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, 讯飞星火, 文心一言 and more, discover the best answers
Type less, code more: Cody is an AI code assistant that uses advanced search and codebase context to help you write and fix code.
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
Document Chatbot — multiple files. Powered by GPT / Embedding.