Stars
Configuration-based installation of OpenShift and Cloud Pak for Data/Integration/Watson AIOps/Business Automation on various private and public cloud infrastructure providers. Deployment attempts t…
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, Slurm, 20+ clouds, on-prem).
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable,…
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …
Low-code framework for building custom LLMs, neural networks, and other AI models
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
An orchestration platform for the development, production, and observation of data assets.
Tools for detecting wildlife in aerial images using active learning
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
🐢 bayesAB: Fast Bayesian Methods for A/B Testing
9 tools for Goodreads.com, for finding people based on the books they’ve read, finding books popular among the people you follow, following new book reviews, etc
State-of-the-Art Embeddings, Retrieval, and Reranking
An Elasticsearch ingest processor to do named entity extraction using Apache OpenNLP
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Benchmarks of approximate nearest neighbor libraries in Python
Machine learning software for extracting information from scholarly documents
Open-source, low-code AutoML platform for Python. PyCaret 4.0: sklearn-native engine + React control plane.