- Munich, Germany
Stars
Minimal, fast + educational reimplementation of the TabICLv2 architecture
Lists of company-wise questions. Every CSV file in the companies directory corresponds to a list of questions on LeetCode for a specific company, based on the LeetCode company tags. Updated as of 20…
A Python library for processing and filtering TabLib
Official Implementation of "TabDLM: Free-Form Tabular Data Generation via Joint Numerical–Language Diffusion"
Research on Tabular Deep Learning: Papers & Packages
[ICLR 2024 spotlight] Making Pre-trained Language Models Great on Tabular Prediction
TabICLv2: A state-of-the-art tabular foundation model
A comprehensive toolkit and benchmark for tabular data learning, featuring 35+ deep methods, more than 10 classical methods, and 300 diverse tabular datasets.
Replication code for Continuous Diffusion for Mixed-Type Tabular Data [ICLR 2025]
PluRel: Synthetic Data unlocks Scaling Laws for Relational Foundation Models
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
ReDeLEx is a Python framework for developing and evaluating RDL models on relational databases via RelBench and CTU datasets.
Official implementation of the ARROW-Diff graph generation method.
This is the official implementation of the paper “Griffin: Towards a Graph-Centric Relational Database Foundation Model.”
ICLR 2026: Implementation of our unlearning method "Partial Model Collapse" introduced in: "Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs".
Code accompanying the paper "Generalized Interpolating Discrete Diffusion"
EDGE: Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling
Official JAX Implementation of MD4 Masked Diffusion Models
Reference implementation of the paper "Efficient and Scalable Graph Generation through Iterative Local Expansion"
Metrics to evaluate quality and efficacy of synthetic datasets.
Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.
Machine Learning and Computer Vision Engineer - Technical Interview Questions
⚡ TabPFN: Foundation Model for Tabular Data ⚡
Official Implementations of "Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space"
A package for benchmarking synthetic relational data generation methods
[TMLR] GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?