Stars
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
Get your documents ready for gen AI
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
[NeurIPS 2022] ASPiRe: Adaptive Skill Priors for Reinforcement Learning
Automatic extraction of relevant features from time series.
A playbook for systematically maximizing the performance of deep learning models.
SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples
A curated list of research papers in Sentence Representation Learning and an STS leaderboard of sentence embeddings.
Betty: an automatic differentiation library for generalized meta-learning and multilevel optimization
Scalable and user-friendly neural 🧠 forecasting algorithms.
Easy to use Python library of customized functions for cleaning and analyzing data.
🏕️ Reproducible development environment for humans and agents
An active learning library for PyTorch based on Lightning-Fabric.
Python sample codes and textbook for robotics algorithms.
CAIRI Supervised, Semi- and Self-Supervised Visual Representation Learning Toolbox and Benchmark
Uncertainty Toolbox: a Python toolbox for predictive uncertainty quantification, calibration, metrics, and visualization
AI on the way. An RDBMS approach to deep learning. Declarative, explainable, scalable, optimizable, easy to deploy, all that good stuff.
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)
[ECCV 2022] Official Python implementation of BiB: Active Learning Strategies for Weakly-Supervised Object Detection.
pyrelational is a Python active learning library for rapidly implementing active learning pipelines from data management, model development (and Bayesian approximation), to creating novel active le…
A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821