marcromeyn

Marc Romeyn marcromeyn

ML engineer @NVIDIA. Working on https://github.com/NVIDIA/NeMo, an open-source library for training LLMs with trillions of parameters. Ex-@spotify.

48 followers · 6 following

NVIDIA
Netherlands
@marcromeyn

Starred repositories

24 stars written in C++

facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.

C++ 37,821 4,098 Updated Nov 7, 2025

dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 27,583 8,817 Updated Nov 7, 2025

microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …

C++ 17,817 3,954 Updated Nov 6, 2025

spotify / annoy

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 14,012 1,211 Updated Oct 29, 2025

NVIDIA / cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,738 1,519 Updated Nov 7, 2025

OpenNMT / CTranslate2

Fast inference engine for Transformer models

C++ 4,121 418 Updated Nov 4, 2025

facebookresearch / StarSpace

Learning embeddings for classification, retrieval and ranking.

C++ 3,958 526 Updated Dec 4, 2022

facebookincubator / velox

A composable and fully extensible C++ execution engine library for data management systems.

C++ 3,939 1,392 Updated Nov 7, 2025

dcajasn / Riskfolio-Lib

Portfolio Optimization and Quantitative Strategic Asset Allocation in Python

C++ 3,636 602 Updated Aug 5, 2025

nmslib / nmslib

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

C++ 3,545 463 Updated Oct 22, 2025

aksnzhy / xlearn

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI …

C++ 3,094 517 Updated Aug 28, 2023

mirage-project / mirage

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,938 148 Updated Nov 5, 2025

kevmo314 / scuda

SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines.

C++ 1,775 73 Updated Jun 16, 2025

pytorch / gloo

Collective communications library with various primitives for multi-machine training.

C++ 1,364 337 Updated Oct 21, 2025

ycjuan / kaggle-2014-criteo

C++ 1,261 619 Updated Aug 16, 2024

Xtra-Computing / thundergbm

ThunderGBM: Fast GBDTs and Random Forests on GPUs

C++ 708 87 Updated Mar 19, 2025

ai-dynamo / nixl

NVIDIA Inference Xfer Library (NIXL)

C++ 703 179 Updated Nov 7, 2025

rapidsai / rmm

RAPIDS Memory Manager

C++ 655 231 Updated Nov 7, 2025

perone / euclidesdb

A multi-model machine learning feature embedding database

C++ 638 31 Updated Dec 30, 2019

cnclabs / smore

SMORe: Modularize Graph Embedding for Recommendation

C++ 380 84 Updated Oct 7, 2022

NVIDIA / nvshmem

NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmer…

C++ 373 37 Updated Oct 16, 2025

facebookincubator / dynolog

Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also …

C++ 352 75 Updated Nov 6, 2025

Algomorph / pyboostcvconverter

Minimalist code necessary for porting C++ functions/classes that use OpenCV's "Mat" type in function argument lists directly (w/o explicit conversions) to Python.

Lists (32)

Build

Computer Vision

Cookbooks

custom-trainer

Dagster

Data-infra

Docs

Finance

Information Retrieval

Jax

Large Language Models

Large scale ML

LLM Eval

LLM Rapids

LLM + Tabular

MCP

meshx

ML

ML Executor

ML-Infra

NeMo

NeMo Agent

PKM

Python

Pytorch

Recsys

RL

Rust

Scripts

Shell

Tensorflow

Vscode
