- Activeeon
- Paris, France
- https://andrewssobral.pages.dev
- @andrewssobral
- in/andrewssobral
Starred repositories
This package contains the original 2012 AlexNet code.
How to optimize some algorithms in CUDA (a shared-memory reduction in this spirit is sketched after this list).
Sample codes for my CUDA programming book
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Automatically exported from code.google.com/p/cuda-convnet2
Causal depthwise conv1d in CUDA, with a PyTorch interface (a minimal kernel sketch of this operation also follows the list).
A GPU implementation of Convolutional Neural Nets in C++
Unsupervised Learning of Video Representations using LSTMs
llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.
Alex Krizhevsky's original code from Google Code
CUDA Matrix Factorization Library with Alternating Least Squares (ALS)
gevtushenko / llm.c
Forked from karpathy/llm.c: LLM training in simple, raw C/CUDA
This project optimizes multi-GPU parallelism for machine-learning training, accelerating multi-GPU communication with fused gradient buffers, NCCL AllReduce, and CUDA C kernel-level optimizations including me… (the fused-buffer AllReduce pattern is sketched at the end of the list).
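The CUDA optimization references above (the algorithm-optimization notes and the programming-book samples) revolve around patterns like the block-level shared-memory reduction below. This is a minimal, hedged illustration of that standard pattern, not code from either repository; the sizes and names are my own.

```cuda
// A block-level sum reduction using shared memory: a classic first CUDA
// optimization exercise. Illustrative sketch; sizes are arbitrary.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void block_sum(const float* in, float* out, int n) {
    extern __shared__ float sdata[];
    unsigned tid = threadIdx.x;
    unsigned i = blockIdx.x * blockDim.x + tid;
    sdata[tid] = (i < n) ? in[i] : 0.0f;   // load with bounds check
    __syncthreads();
    // Tree reduction in shared memory: halve the active threads each step.
    for (unsigned s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];  // one partial sum per block
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = (n + threads - 1) / threads;
    float *din, *dout;
    cudaMalloc(&din, n * sizeof(float));
    cudaMalloc(&dout, blocks * sizeof(float));
    float* h = new float[n];
    for (int i = 0; i < n; ++i) h[i] = 1.0f;   // all ones: expected sum is n
    cudaMemcpy(din, h, n * sizeof(float), cudaMemcpyHostToDevice);
    block_sum<<<blocks, threads, threads * sizeof(float)>>>(din, dout, n);
    float* hp = new float[blocks];
    cudaMemcpy(hp, dout, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    double total = 0;
    for (int i = 0; i < blocks; ++i) total += hp[i];  // finish on the host
    printf("sum = %.0f (expected %d)\n", total, n);
    cudaFree(din); cudaFree(dout); delete[] h; delete[] hp;
    return 0;
}
```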
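A causal depthwise conv1d, as in the causal-conv1d entry above, convolves each channel independently and looks only at current and past timesteps. The kernel below is a minimal sketch of that operation, assuming a [batch, channels, time] float layout and implicit zero left-padding; it is not the repository's optimized implementation or its API.

```cuda
// Minimal causal depthwise conv1d sketch. One thread per (batch, channel,
// time) output element; shapes and launch config are illustrative.
#include <cuda_runtime.h>
#include <cstdio>

// x: [B, C, T], w: [C, K], y: [B, C, T]; y[t] depends on x[t-K+1 .. t] only.
__global__ void causal_depthwise_conv1d(const float* x, const float* w,
                                        float* y, int B, int C, int T, int K) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= B * C * T) return;
    int t = idx % T;
    int c = (idx / T) % C;
    // idx = ((b*C)+c)*T + t, so x[idx] is x[b][c][t].
    float acc = 0.0f;
    for (int k = 0; k < K; ++k) {
        int src = t - (K - 1) + k;                 // past positions only
        if (src >= 0) acc += w[c * K + k] * x[idx + (src - t)];
    }
    y[idx] = acc;
}

int main() {
    const int B = 1, C = 2, T = 8, K = 4, N = B * C * T;
    float hx[N], hw[C * K], hy[N];
    for (int i = 0; i < N; ++i) hx[i] = 1.0f;       // constant input
    for (int i = 0; i < C * K; ++i) hw[i] = 1.0f;   // box filter
    float *dx, *dw, *dy;
    cudaMalloc(&dx, N * sizeof(float));
    cudaMalloc(&dw, C * K * sizeof(float));
    cudaMalloc(&dy, N * sizeof(float));
    cudaMemcpy(dx, hx, N * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dw, hw, C * K * sizeof(float), cudaMemcpyHostToDevice);
    causal_depthwise_conv1d<<<(N + 255) / 256, 256>>>(dx, dw, dy, B, C, T, K);
    cudaMemcpy(hy, dy, N * sizeof(float), cudaMemcpyDeviceToHost);
    // The ramp-up at the start shows the causal (left-only) padding:
    for (int t = 0; t < T; ++t) printf("%.0f ", hy[t]);  // 1 2 3 4 4 4 4 4
    printf("\n");
    cudaFree(dx); cudaFree(dw); cudaFree(dy);
    return 0;
}
```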
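The last entry names a common multi-GPU pattern: packing per-tensor gradients into one contiguous buffer and issuing a single NCCL AllReduce instead of many small ones. The sketch below shows only that communication step, under assumed conditions (two visible GPUs, a made-up fused buffer size, single-process communicators); real training code would first write each gradient tensor into its slice of the buffer.

```cuda
// Hedged sketch of fused-buffer gradient synchronization with NCCL.
// Assumes at least two visible GPUs; buffer sizes are illustrative.
#include <nccl.h>
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const int nDev = 2;                    // assumption: >= 2 GPUs visible
    const size_t fused = 1024 + 4096;      // two "gradients" packed together
    int devs[nDev] = {0, 1};
    ncclComm_t comms[nDev];
    ncclCommInitAll(comms, nDev, devs);    // single-process multi-GPU comms

    float* buf[nDev];
    cudaStream_t streams[nDev];
    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(i);
        cudaMalloc(&buf[i], fused * sizeof(float));
        cudaStreamCreate(&streams[i]);
        // In training, both gradient tensors would be copied back to back
        // into buf[i] here, before the single reduction below.
    }

    // One AllReduce over the fused buffer instead of one call per tensor.
    ncclGroupStart();
    for (int i = 0; i < nDev; ++i)
        ncclAllReduce(buf[i], buf[i], fused, ncclFloat, ncclSum,
                      comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(i);
        cudaStreamSynchronize(streams[i]);
        cudaFree(buf[i]);
        ncclCommDestroy(comms[i]);
    }
    printf("fused AllReduce complete\n");
    return 0;
}
```

Fusing amortizes per-call launch and protocol overhead, which is why frameworks bucket small gradients before reducing them.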