cuda_hgemv Public
Forked from Bruce-Lee-LY/cuda_hgemv
Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.
Cuda MIT License Updated Oct 9, 2023
FlexGen Public
Forked from FMInference/FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
Python Apache License 2.0 Updated Sep 27, 2023
cuda_hgemm Public
Forked from Bruce-Lee-LY/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
Cuda MIT License Updated Sep 24, 2023
TranAD Public
Forked from imperial-qore/TranAD
[VLDB'22] Anomaly Detection using Transformers, self-conditioning and adversarial training.
Python BSD 3-Clause "New" or "Revised" License Updated Sep 13, 2023
KuiperInfer Public
Forked from zjhellofss/KuiperInfer
Implement a high-performance deep learning inference library from scratch, step by step.
C++ MIT License Updated Aug 31, 2023
DRS Public
Forked from JolyonJian/DRS
A Deep Reinforcement Learning enhanced Kubernetes Scheduler for Microservice-based System
Go Updated Aug 25, 2023
rlink-rs Public
Forked from rlink-rs/rlink-rs
High-performance Stream Processing Framework. An alternative to Apache Flink.
Rust Apache License 2.0 Updated Aug 12, 2023
fastllm Public
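Forked from ztxz16/fastllm
A pure C++ cross-platform LLM acceleration library with Python bindings. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile phones.
C++ Updated Jul 28, 2023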
transformers Public
Forked from huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Python Apache License 2.0 Updated Jul 25, 2023
Learn-Vim Public
Forked from iggredible/Learn-Vim
Learning Vim and Vimscript doesn't have to be hard. This is the guide that you're looking for 📖
Other Updated Jul 6, 2023
kernl Public
Forked from ELS-RD/kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
Jupyter Notebook Apache License 2.0 Updated Jun 30, 2023
tinygrad Public
Forked from tinygrad/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
Python MIT License Updated Jun 23, 2023
awesome-self-supervised-learning-timeseries Public
Forked from qingsongedu/Awesome-SSL4TS
A professionally curated list of awesome resources (paper, code, data, etc.) on Self-Supervised Learning for Time Series (SSL4TS).
Updated Jun 22, 2023
concurrentqueue Public
Forked from cameron314/concurrentqueue
A fast multi-producer, multi-consumer lock-free concurrent queue for C++11
C++ Other Updated Jun 19, 2023
how-to-optim-algorithm-in-cuda Public
Forked from BBuf/how-to-optim-algorithm-in-cuda
how to optimize some algorithm in cuda.
Cuda Updated Jun 5, 2023
InferLLM Public
Forked from MegEngine/InferLLM
a lightweight LLM model inference framework
iree Public
Forked from iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
C++ Apache License 2.0 Updated May 28, 2023
LaWGPT Public
Forked from pengxiao-song/LaWGPT
🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese Legal knowledge. A large language model based on Chinese legal knowledge.
Python Updated May 20, 2023
tinyengine Public
Forked from mit-han-lab/tinyengine
[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 2…
C MIT License Updated Feb 9, 2023
ps-lite Public
Forked from dmlc/ps-lite
A lightweight parameter server interface
C++ Apache License 2.0 Updated Jan 11, 2023
CMU10-714 Public
Forked from PKUFlyingPig/CMU10-714
Learning material for CMU10-714: Deep Learning System
Jupyter Notebook Updated Jan 7, 2023
smart-pointers Public
Forked from HaykDanghyan/smart-pointers
Smart Pointers implementation (std::unique_ptr, std::shared_ptr)
C++ Updated Jan 1, 2023