- Hong Kong
- flashinfer Public
  Forked from flashinfer-ai/flashinfer
  FlashInfer: Kernel Library for LLM Serving
  Cuda · Apache License 2.0 · Updated Jul 30, 2024
- vattention Public
  Forked from microsoft/vattention
  Dynamic Memory Management for Serving LLMs without PagedAttention
  C · MIT License · Updated Jul 29, 2024
- InfiniGen Public
  Forked from snu-comparch/InfiniGen
  InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
  Python · Apache License 2.0 · Updated Jul 10, 2024
- orion Public
  Forked from eth-easl/orion
  An interference-aware scheduler for fine-grained GPU sharing
  Python · MIT License · Updated Apr 15, 2024
- SpotServe Public
  Forked from Hsword/SpotServe
  SpotServe: Serving Generative Large Language Models on Preemptible Instances
  Apache License 2.0 · Updated Feb 22, 2024
- streaming-llm Public
  Forked from mit-han-lab/streaming-llm
  Efficient Streaming Language Models with Attention Sinks
  Python · MIT License · Updated Oct 5, 2023
- chroma Public
  Forked from chroma-core/chroma
  the AI-native open-source embedding database
  Python · Apache License 2.0 · Updated Aug 23, 2023
- cutlass Public
  Forked from NVIDIA/cutlass
  CUDA Templates for Linear Algebra Subroutines
  C++ · Other · Updated Aug 14, 2023
- faiss Public
  Forked from facebookresearch/faiss
  A library for efficient similarity search and clustering of dense vectors.
  C++ · MIT License · Updated Aug 11, 2023
- hnswlib Public
  Forked from nmslib/hnswlib
  Header-only C++/python library for fast approximate nearest neighbors
  C++ · Apache License 2.0 · Updated Aug 11, 2023
- milvus Public
  Forked from milvus-io/milvus
  A cloud-native vector database, storage for next generation AI applications
  Go · Apache License 2.0 · Updated Aug 5, 2023
- vllm Public
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python · Apache License 2.0 · Updated Jul 20, 2023
- perftest Public
  Forked from linux-rdma/perftest
  Infiniband Verbs Performance Tests
  C · Other · Updated Jul 17, 2023
- rdma-core Public
  Forked from linux-rdma/rdma-core
  RDMA core userspace libraries and daemons
  C · Other · Updated Jul 16, 2023
- gpgpu-sim_distribution Public
  Forked from gpgpu-sim/gpgpu-sim_distribution
  GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…
  C++ · Other · Updated Jul 6, 2023
- cuda-samples Public
  Forked from NVIDIA/cuda-samples
  Samples for CUDA Developers which demonstrates features in CUDA Toolkit
  C · Other · Updated Jun 30, 2023
- README_tools Public
  Forked from guodongxiaren/README
  An explanation of README file syntax, i.e. an introduction to GitHub Flavored Markdown syntax
  The Unlicense · Updated Mar 8, 2023
- kvm-hello-world Public
  Forked from dpw/kvm-hello-world
  A minimal kvm example
  C · MIT License · Updated Jul 30, 2022