NIVIC
- HeFei
Stars
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Astron-xmod-shim — Lightweight, declarative middleware for reliably converging AI service workloads.
Cross-platform AI workflow DSL converter supporting iFlytek Spark, Dify, and Coze platforms with unified intermediate representation and bidirectional transformation capabilities.
A workload for deploying LLM inference services on Kubernetes
Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton
This is a simple implementation of an MCP server using iFlytek. It enables calling iFlytek workflows through MCP tools.
A lightweight data processing framework built on DuckDB and 3FS.
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.
My learning notes for ML SYS.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient Multi-head Latent Attention Kernels
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
SGLang is a high-performance serving framework for large language models and multimodal models.
🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.
A collection of community maintained NRI plugins
SciLifeLab Serve is a platform offering machine learning model serving, data science app hosting (Shiny, Gradio, Streamlit, Dash, etc.), and other tools to life science researchers affiliated with …
Examples of models deployable with Truss
BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them to BERT, intermediate results are pooled. The implementation …
A high-throughput and memory-efficient inference and serving engine for LLMs
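Llama Chinese community: a real-time roundup of the latest Llama learning materials, building the best open-source ecosystem for Chinese Llama large models. Fully open source and available for commercial use.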
Free ChatGPT & DeepSeek API key. Free access to the DeepSeek API and GPT-4 API, with support for top-ranked, commonly used large models such as gpt | deepseek | claude | gemini | grok.
Slim(toolkit): Don't change anything in your container image and minify it by up to 30x (and for compiled languages even more) making it secure too! (free and open source)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.