ByteDance
Shanghai, China (UTC +08:00)
- https://fangjiarui.github.io/
- https://www.zhihu.com/people/feifeibear
- LinkedIn: in/fangjiarui
Stars
An Open Source Machine Learning Framework for Everyone
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.
FlashMLA: Efficient Multi-head Latent Attention Kernels
A distributed, fast open-source graph database featuring horizontal scalability and high availability
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
CUDA Templates and Python DSLs for High-Performance Linear Algebra
High-speed Large Language Model Serving for Local Deployment
Lightweight, standalone C++ inference engine for Google's Gemma models.
Header-only C++/python library for fast approximate nearest neighbors
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Fast inference engine for Transformer models
LightSeq: A High Performance Library for Sequence Processing and Generation
Tendis is a high-performance distributed storage system fully compatible with the Redis protocol.
Go AI program which implements the AlphaGo Zero paper
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.