DataChain.ai
- San Francisco
- https://dvc.org
- @shcheklein
Stars
Curate, Annotate, and Manage Your Data in LightlyStudio.
Set up PostgreSQL on Linux, macOS, and Windows runner machines.
Configuronic: A simple yet powerful Configuration as Code library
A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀)
Build complete API integrations with YAML and SQL. Rapid development without vendor lock-in and per-row costs.
A concise grammar of interactive graphics, built on Vega.
An open source SDK for logging, storing, querying, and visualizing multimodal and multi-rate data
An extremely fast Python type checker and language server, written in Rust.
An implementation of the Pregel framework, with graph algorithms built on top of it, using Ibis project DataFrames.
A CLI/TUI that simplifies launching VSCode projects, with a focus on dev containers
🔬PGWATCH: PostgreSQL metrics monitor/dashboard
Simplified version control, environment management, and single-button reproducible pipelines for research projects.
Lightweight toolkit for bounding boxes, providing conversion between bounding box types and simple computations.
An example of how to use DataChain and DVC to version data, make a project reproducible, and track experiments and models
Pixeltable — Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.
👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.
Python tool for converting files and office documents to Markdown.
HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5 binary data format.
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
LLM, CV, multimodal at scale
Model Context Protocol Servers
Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
A package to work with SEC data. Incorporates datamule endpoints.