CuiBo SuperCB

🏠

Working from home

Learning Machine Learning System

48 followers · 176 following

rednote-hilab
Beijing

cuda_hgemv Public
Forked from Bruce-Lee-LY/cuda_hgemv

Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.

Cuda MIT License Updated Oct 9, 2023
Bovim Public

Lua 1 Apache License 2.0 Updated Sep 30, 2023
FlexGen Public
Forked from FMInference/FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

Python Apache License 2.0 Updated Sep 27, 2023
cuda_hgemm Public
Forked from Bruce-Lee-LY/cuda_hgemm

Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.

Cuda MIT License Updated Sep 24, 2023
TranAD Public
Forked from imperial-qore/TranAD

[VLDB'22] Anomaly Detection using Transformers, self-conditioning and adversarial training.

Python BSD 3-Clause "New" or "Revised" License Updated Sep 13, 2023
KuiperInfer Public
Forked from zjhellofss/KuiperInfer

Implement a high-performance deep learning inference library from scratch, step by step.

C++ MIT License Updated Aug 31, 2023
MyYadm Public

Shell Updated Aug 25, 2023
DRS Public
Forked from JolyonJian/DRS

A Deep Reinforcement Learning enhanced Kubernetes Scheduler for Microservice-based System

Go Updated Aug 25, 2023
rlink-rs Public
Forked from rlink-rs/rlink-rs

High-performance Stream Processing Framework. An alternative to Apache Flink.

Rust Apache License 2.0 Updated Aug 12, 2023
Myconfig Public

Shell Updated Aug 7, 2023
fastllm Public
Forked from ztxz16/fastllm

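A pure C++, cross-platform LLM acceleration library that can be called from Python; ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.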

C++ Updated Jul 28, 2023
transformers Public
Forked from huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python Apache License 2.0 Updated Jul 25, 2023
Learn-Vim Public
Forked from iggredible/Learn-Vim

Learning Vim and Vimscript doesn't have to be hard. This is the guide that you're looking for 📖

Other Updated Jul 6, 2023
kernl Public
Forked from ELS-RD/kernl

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook Apache License 2.0 Updated Jun 30, 2023
tinygrad Public
Forked from tinygrad/tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python MIT License Updated Jun 23, 2023
awesome-self-supervised-learning-timeseries Public
Forked from qingsongedu/Awesome-SSL4TS

A professionally curated list of awesome resources (paper, code, data, etc.) on Self-Supervised Learning for Time Series (SSL4TS).

Updated Jun 22, 2023
concurrentqueue Public
Forked from cameron314/concurrentqueue

A fast multi-producer, multi-consumer lock-free concurrent queue for C++11

C++ Other Updated Jun 19, 2023
learn-llm Public

Python Updated Jun 6, 2023
how-to-optim-algorithm-in-cuda Public
Forked from BBuf/how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Cuda Updated Jun 5, 2023
InferLLM Public
Forked from MegEngine/InferLLM

a lightweight LLM model inference framework

C++ 2 Apache License 2.0 Updated Jun 2, 2023
iree Public
Forked from iree-org/iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ Apache License 2.0 Updated May 28, 2023
LaWGPT Public
Forked from pengxiao-song/LaWGPT

🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese legal knowledge. A large language model based on Chinese legal knowledge.

Python Updated May 20, 2023
Tinycompiler Public

A compiler made by CB

C++ 4 1 Updated Apr 26, 2023
Cost-Model-papers Public
Forked from LiuXiaoxuanPKU/Cost-Model-papers

Updated Feb 22, 2023
tinyengine Public
Forked from mit-han-lab/tinyengine

[NeurIPS 2020] MCUNet: Tiny Deep Learning on IoT Devices; [NeurIPS 2021] MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning; [NeurIPS 2022] MCUNetV3: On-Device Training Under 2…

C MIT License Updated Feb 9, 2023
ps-lite Public
Forked from dmlc/ps-lite

A lightweight parameter server interface

C++ Apache License 2.0 Updated Jan 11, 2023
CMU10-714 Public
Forked from PKUFlyingPig/CMU10-714

Learning material for CMU10-714: Deep Learning System

Jupyter Notebook Updated Jan 7, 2023
smart-pointers Public
Forked from HaykDanghyan/smart-pointers

Smart Pointers implementation (std::unique_ptr, std::shared_ptr)

C++ Updated Jan 1, 2023
MyJos Public

Solution to MIT 6.828

C 1 Updated Aug 14, 2022
Learn-Cuda Public

Cuda Updated Jul 23, 2022
