marcromeyn

Starred repositories

24 stars written in C++

A library for efficient similarity search and clustering of dense vectors.

C++ 37,821 4,098 Updated Nov 7, 2025

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 27,583 8,817 Updated Nov 7, 2025

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …

C++ 17,817 3,954 Updated Nov 6, 2025

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 14,012 1,211 Updated Oct 29, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,738 1,519 Updated Nov 7, 2025

Fast inference engine for Transformer models

C++ 4,121 418 Updated Nov 4, 2025

Learning embeddings for classification, retrieval and ranking.

C++ 3,958 526 Updated Dec 4, 2022

A composable and fully extensible C++ execution engine library for data management systems.

C++ 3,939 1,392 Updated Nov 7, 2025

Portfolio Optimization and Quantitative Strategic Asset Allocation in Python

C++ 3,636 602 Updated Aug 5, 2025

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

C++ 3,545 463 Updated Oct 22, 2025

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI …

C++ 3,094 517 Updated Aug 28, 2023

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,938 148 Updated Nov 5, 2025

SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines.

C++ 1,775 73 Updated Jun 16, 2025

Collective communications library with various primitives for multi-machine training.

C++ 1,364 337 Updated Oct 21, 2025

ThunderGBM: Fast GBDTs and Random Forests on GPUs

C++ 708 87 Updated Mar 19, 2025

NVIDIA Inference Xfer Library (NIXL)

C++ 703 179 Updated Nov 7, 2025

RAPIDS Memory Manager

C++ 655 231 Updated Nov 7, 2025

A multi-model machine learning feature embedding database

C++ 638 31 Updated Dec 30, 2019

SMORe: Modularize Graph Embedding for Recommendation

C++ 380 84 Updated Oct 7, 2022

NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmer…

C++ 373 37 Updated Oct 16, 2025

Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also …

C++ 352 75 Updated Nov 6, 2025

Minimal code needed to port C++ functions/classes that use OpenCV's "Mat" type directly in their argument lists (without explicit conversions) to Python.

C++ 265 87 Updated Jan 31, 2022