borongyuan

Organizations

@Factor-Robotics

Starred repositories

26 stars written in Cuda
Official implementation of the CVPR 2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Cuda 4,459 929 Updated Aug 30, 2024

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda 1,817 463 Updated Oct 9, 2023

MatConvNet: CNNs for MATLAB

Cuda 1,434 748 Updated Dec 21, 2021

A CUDA implementation of SIFT for NVIDIA GPUs (1.2 ms on a GTX 1060)

Cuda 933 298 Updated Oct 1, 2025

CUDA Kernel Benchmarking Library

Cuda 808 102 Updated Feb 3, 2026

Distributed multigrid linear solver library on GPU

Cuda 645 174 Updated Feb 4, 2026

Distribution-Aware Coordinate Representation for Human Pose Estimation

Cuda 565 83 Updated May 17, 2024

Official PyTorch code for the CVPR 2019 paper "Fast Human Pose Estimation" https://arxiv.org/abs/1811.05419

Cuda 398 67 Updated Sep 16, 2022

A monocular dense mapping system accompanying the IROS 2018 paper "Quadtree-accelerated Real-time Monocular Dense Mapping"

Cuda 367 87 Updated Oct 16, 2018

GPU-accelerated Levenberg-Marquardt curve fitting in CUDA

Cuda 336 103 Updated Feb 2, 2026

[SIGGRAPH 2021] ROSEFusion tackles fast-motion camera tracking using random optimization with depth information only.

Cuda 324 47 Updated Jan 7, 2024

This repo is copied from https://github.com/leoxiaobin/deep-high-resolution-net.pytorch

Cuda 311 82 Updated Oct 12, 2021

Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).

Cuda 276 23 Updated Jul 16, 2025

[DeepFashion2 Challenge] Fashion Landmark Estimation with HRNet

Cuda 147 25 Updated Aug 30, 2024

Introduction to CUDA programming

Cuda 129 31 Updated May 19, 2017

A tool for examining GPU scheduling behavior.

Cuda 91 22 Updated Aug 17, 2024

OpenPose and YOLOv3 with tiny-tensorrt

Cuda 86 25 Updated Apr 23, 2021

Code supporting the WAFR paper "A Performance Analysis of Differential Dynamic Programming on a GPU" and the ICRA workshop follow-on work deploying the algorithm on robot hardware.

Cuda 49 8 Updated Sep 12, 2023

Implementation of TSM2L and TSM2R: high-performance tall-and-skinny matrix-matrix multiplication algorithms for CUDA

Cuda 35 11 Updated Jul 28, 2020

Based on CUDA Cuts Code

Cuda 26 6 Updated Jun 5, 2018

Collection of CUDA benchmarks, with a focus on unified vs. explicit memory management.

Cuda 20 2 Updated Oct 15, 2019

A CUDA renderer for the Buddhabrot fractal

Cuda 13 Updated Sep 14, 2023

A CUDA-accelerated SIFT implementation.

Cuda 13 6 Updated Feb 1, 2015

Fitting with Levenberg-Marquardt algorithm in CUDA

Cuda 10 6 Updated Apr 18, 2013

A GPU-only implementation of DenseCut for a RealSense camera

Cuda 9 Updated Nov 26, 2020

An implementation of the PQP algorithm for MPC in CUDA C

Cuda 1 1 Updated Apr 22, 2020