- Sun Yat-sen University
- Guang Zhou, China
- https://www.zhihu.com/people/liang-de-peng/posts
Stars
A tiny demo of interfacing CUDA via nanobind with a pytorch tensor
MooreThreads / vllm_musa
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
gen-cli / gen-cli
Forked from google-gemini/gemini-cli
Agents of C.L.I.
ROCm / vllm
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
ROCm / flash-attention
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
kailums / flash-attention-rocm
Forked from ROCm/flash-attention
Fast and memory-efficient exact attention ported to ROCm
ROCm / pytorch
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
ROCm / triton
Forked from triton-lang/triton
Development repository for the Triton language and compiler
doreamon-design / clash
Forked from fossabot/clash
A rule-based tunnel in Go.
trholding / llama2.c
Forked from karpathy/llama2.c
Llama 2 Everywhere (L2E)
TianyiPeng / alpaca-lora
Forked from tloen/alpaca-lora
Instruct-tune LLaMA on consumer hardware
tmc / go-llama2
Forked from karpathy/llama2.c
Llama 2 inference in one file of pure Go
gaxler / llama2.rs
Forked from karpathy/llama2.c
Inference Llama 2 in one file of pure Rust 🦀
Manuel030 / llama2.c-android
Forked from karpathy/llama2.c
Inference Llama 2 in one file of pure C
leloykun / llama2.cpp
Forked from karpathy/llama2.c
Inference Llama 2 in one file of pure C++
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
sbt / sbt-assembly
Forked from softprops/assembly-sbt
Deploy über-JARs. Restart processes. (port of codahale/assembly-sbt)
OpenPPL / CuAssembler
Forked from cloudcores/CuAssembler
An unofficial CUDA assembler, for all generations of SASS, hopefully :)
strint / cpp_related_tips
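Forked from huihut/interview
📚 A summary of basic knowledge for C/C++ technical interviews, covering languages, libraries, data structures, algorithms, systems, networking, and linking and loading of libraries, along with interview experience, recruiting, and referral information. This repository is a summary of the basic knowledge of recruiting job seekers and beginners in the direction of C/C++ technology, in…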
yk / apes-public
Forked from NVlabs/stylegan2-ada-pytorch
Public repo for the GANFT video
ilopezfr / gpt-2
Forked from openai/gpt-2
Code + Playground Colab for the paper "Language Models are Unsupervised Multitask Learners"
AlexeyAB / darknet
Forked from pjreddie/darknet
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
shscy / Graphviz4S
Forked from Ldpe2G/Graphviz4S
Simple Scala interface for Graphviz
pbaylies / stylegan-encoder
Forked from Puzer/stylegan-encoder
StyleGAN Encoder - converts real images to latent space
llSourcell / AI_for_Dating
Forked from jeffmli/TinderAutomation
NVIDIA / cocoapi
Forked from cocodataset/cocoapi
COCO API - Dataset @ http://cocodataset.org/
tqchen / mshadow
Forked from dmlc/mshadow
Matrix Shadow: Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning
zhreshold / mxnet
Forked from apache/mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
ethanhe42 / softer-NMS
Forked from facebookresearch/Detectron
Bounding Box Regression with Uncertainty for Accurate Object Detection (CVPR'19)