Stars
[NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference
Named Tensors for Legible Deep Learning in JAX
Development repository for PlantCaduceus (PlantCAD) evaluation and experimentation pipelines
pathlib API extended to use fsspec backends
PlantCAD: cross-species modeling of plant genomes
Open-source framework for the research and development of foundation models.
Lightweight fast function pipeline (DAG) creation in pure Python for scientific (HPC) workflows 🕸️🧪
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
A generative world for general-purpose robotics & embodied AI learning.
Website for hosting the Open Foundation Models Cheat Sheet.
Code to run Sei and obtain Sei and sequence class predictions
MyGene.info: A BioThings API for gene annotations
Using language models and ontology topology to perform semantic mapping of traits between biomedical datasets
Chat with your database or your data lake (SQL, CSV, Parquet). PandasAI makes data analysis conversational using LLMs and RAG.
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
👻 Experimental library for scraping websites using OpenAI's GPT API.
NetworkX-based Python library for representing ontologies
NXOntology data: making ontologies accessible as simple JSON files
Work with your web service, database, and streaming schemas in a single format.
GFlowNet library specialized for graph & molecular data
An Association Test for Aggregated Sets of SNP-Level Summary Statistics
Grounding of biomedical named entities with contextual disambiguation
A fast, parallelized, memory-efficient, and cache-optimized Python implementation of node2vec