Starred repositories
Vector search engine inside Milvus, integrating FAISS, HNSW, DiskANN.
A library of GPU kernels for sparse matrix operations.
Distributed LR and FM models on a Parameter Server, with FTRL and SGD optimization algorithms.
Knowhere is an open-source vector search engine, integrating FAISS, HNSW, etc.
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
Convolutional Neural Network with CUDA (MNIST 99.23%)
A fast, small C/C++ function call tracer for x86-64/Linux, supports clang & gcc, ftrace, threads, exceptions & shared libraries
torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters in a single C++ process.
High performance HTTP server built on C++20 coroutines and io_uring
Vector Search Engine based on BRPC + FAISS
A toy compiler written in C++17 that translates SysY (a C-like toy language) into ARM-v7a assembly.
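A mini search engine project covering word segmentation, inverted index construction, web page deduplication, similarity computation, text clustering, multi-process programming, network programming, daemon writing, makefile writing, project organization, and more.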
Distributed FM and LR based on Parameter Server with FTRL
dlsys-course / tinyflow
Forked from tqchen/tinyflow
Tutorial code on how to build your own Deep Learning System in 2k Lines
Standalone Flash Attention v2 kernel without libtorch dependency
A runtime for writing asynchronous applications with Modern C++, based on C++20 coroutines and liburing (io_uring)
Simple neural network implementation using CUDA technology. It is an educational implementation.
A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
A distributed Google File System (GFS), partially implemented in C++. (http://bit.ly/gfs-impl)
SONG: Approximate Nearest Neighbor Search on GPU. SONG is a graph-based approximate nearest neighbor search toolbox.