- Hopsworks
- Stockholm
- @jim_dowling
- in/jim-dowling-206a98
Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
DSPy: The framework for programming—not prompting—language models
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
An orchestration platform for the development, production, and observation of data assets.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
A framework for few-shot evaluation of language models.
Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Modin: Scale your Pandas workflows by changing a single line of code
🧙 Build, run, and manage data pipelines for integrating and transforming data.
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
📚 Parameterize, execute, and analyze notebooks
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Voilà turns Jupyter notebooks into standalone web applications
⚡ TabPFN: Foundation Model for Tabular Data ⚡
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
😎 A curated list of awesome MLOps tools
A collection of scripts to flash Tuya IoT devices to alternative firmwares
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
Sequence modeling benchmarks and temporal convolutional networks
A fast inference library for running LLMs locally on modern consumer-class GPUs
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)