A unified Python toolkit for managing, processing, and analyzing vision datasets, from raw data to task-specific analytics.
Streamlines the entire data pipeline for computer vision and medical imaging:
- Load and save 2D/3D data across common formats with a consistent API.
- Organize datasets using flexible file managers and YAML-based configs.
- Handle task-aware data (images, semantic masks, multilabel masks) with clear shape and axis conventions.
- Analyze dataset statistics and inspect data interactively with CLI tools and Napari integration.
pip install vidata
# or latest dev version
git clone https://github.com/MIC-DKFZ/vidata.git
cd vidata
pip install -e ./
# (optional - for visual inspection)
pip install napari-data-inspection
- Unified I/O: Load and save 2D images, 3D volumes, n-dimensional arrays, and configs with a consistent `load_xxx`/`save_xxx` API.
- Supported Formats: PNG, TIFF, NIfTI, NRRD, NumPy, Blosc2, JSON, YAML, Pickle, Text
- Loaders & Writers: Task-aware data handling for images, semantic segmentation, and multilabel masks (single-file or stacked).
- File Management: Collect files with patterns, filters, and split definitions.
- Task-Aware Handling: Built-in support for semantic and multilabel segmentation masks with shape and axis conventions.
- Flexible Dataset Configs: Define datasets in YAML with layers, modalities, labels, and optional splits/folds.
- Analysis & Inspection Tools: CLI for dataset statistics and a Napari plugin for visual inspection.
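As a quick taste of how these pieces fit together, here is a minimal sketch combining the file manager and a task-aware loader described below (the paths are placeholders):

```python
from vidata.file_manager import FileManager
from vidata.loaders import ImageLoader

# Collect all PNG files under a folder and load the first one.
fm = FileManager(path="data/images", file_type=".png")
loader = ImageLoader(ftype=".png", channels=3)

image, meta = loader.load(str(fm[0]))
print(len(fm), "files; first image shape:", image.shape)
```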
| Module | Purpose |
|---|---|
| `io` | Provides low-level reading and writing support for multiple file types. |
| `loader` | Wraps `io` with task-specific logic to load data in the correct format. |
| `writers` | Wraps `io` with task-specific logic to write data in the correct format. |
| `task_manager` | Encapsulates logic for handling semantic and multi-label segmentation data. |
| `file_manager` | Handles file selection and organization using paths, patterns, and splits. |
| `config_manager` | Parses dataset configuration files and instantiates loaders, writers, file managers, and task managers. |
| `data_analyzer` | Computes and visualizes dataset statistics. |
| `data_inspection` | Inspect your data interactively (requires `napari_data_inspection` to be installed). |
TL;DR
- Use `load_xxx`/`save_xxx` for all supported formats.
- Images/arrays: PNG/JPG/TIFF (`imageio`, `tifffile`), NIfTI/NRRD/MHA (`sitk`, `nibabel`), Blosc2, NumPy (`.npy`, `.npz`).
- Configs/metadata: JSON, YAML, Pickle, TXT.
- Extendable with custom load and save functions.
- Functions always follow the same pattern:
data, meta = load_xxx("file.ext")
save_xxx(data, "out.ext", meta)
Expand for Full Details
| Module | Extension(s) | Backend(s) | Notes |
|---|---|---|---|
| `image_io` | `.png`, `.jpg`, `.jpeg`, `.bmp` | `imageio` | Standard 2D image formats |
| `tif_io` | `.tif`, `.tiff` | `tifffile` | Multipage TIFF, high bit-depths supported |
| `sitk_io` | `.nii.gz`, `.nii`, `.mha`, `.nrrd` | `sitk` | Medical image formats (3D volumes) |
| `nib_io` | `.nii.gz`, `.nii` | `nibabel` | Alternative medical imaging backend |
| `blosc2_io` | `.b2nd` | `blosc2` | Compressed N-dimensional arrays |
| `blosc2_io` | `.b2nd` | `blosc2pkl` | Compressed N-dimensional arrays with metadata in a separate pkl file |
| `numpy_io` | `.npy` | `numpy` | Single NumPy array |
| `numpy_io` | `.npz` | `numpy` | Dictionary of arrays |
from vidata.io import (
load_image, save_image,
load_tif, save_tif,
load_sitk, save_sitk,
load_nib, save_nib,
load_blosc2, save_blosc2,
load_blosc2pkl, save_blosc2pkl,
load_npy, save_npy,
load_npz, save_npz,
)
# Standard image formats (PNG, JPG, BMP, …)
img, meta = load_image("example.png")
save_image(img, "out.png", meta)
# TIFF (supports multipage / high bit depth)
img, meta = load_tif("example.tif")
save_tif(img, "out.tif", meta)
# Medical imaging (NIfTI, MHA, NRRD) with SimpleITK
vol, meta = load_sitk("example.nii.gz")
save_sitk(vol, "out_sitk.nii.gz", meta)
# Medical imaging with Nibabel
vol, meta = load_nib("example.nii.gz")
save_nib(vol, "out_nib.nii.gz", meta)
# Blosc2 compressed array
arr, meta = load_blosc2("example.b2nd")
save_blosc2(arr, "out.b2nd", meta)
# Blosc2 compressed array but with metadata in a separate pickle file (with the same name)
arr, meta = load_blosc2pkl("example.b2nd")
save_blosc2pkl(arr, "out.b2nd", meta)
# NumPy arrays
arr, _ = load_npy("example.npy")
save_npy(arr, "out.npy")
arrs, _ = load_npz("example.npz")
save_npz(arrs, "out.npz")
| Module | Extension(s) | Notes |
|---|---|---|
| `json_io` | `.json` | JSON metadata/configs |
| `yaml_io` | `.yaml`, `.yml` | YAML metadata/configs |
| `pickle_io` | `.pkl` | Python pickles |
| `txt_io` | `.txt` | Plain text files |
from vidata.io import (
load_json, save_json,
load_yaml, save_yaml,
load_pickle, save_pickle,
load_txt, save_txt,
)
# JSON
obj = load_json("config.json")
save_json(obj, "out.json")
# YAML
obj = load_yaml("config.yaml")
save_yaml(obj, "out.yaml")
# Pickle (Python objects)
obj = load_pickle("config.pkl")
save_pickle(obj, "out.pkl")
# Plain text
obj = load_txt("config.txt")
save_txt(obj, "out.txt")
- Register a reader / writer with a decorator.
- Reader must return (numpy_array, metadata_dict).
- Writer must return list[str] of the file(s) it wrote.
- Registration happens at import time, so make sure the module is imported (e.g., from your package's `__init__.py`).
- See the template below for an example.
# custom_io_template.py: fill in the TODOs and import this module somewhere at startup.
import numpy as np

# TODO: import your backend library (e.g., imageio, tifffile, nibabel, SimpleITK, ...)
# import imageio.v3 as iio

from vidata.registry import register_loader, register_writer

# --------------------------- READER ------------------------------------------
# Replace the file extensions and backend name to match your custom function
@register_loader("image", ".png", ".jpg", ".jpeg", ".bmp", backend="imageio")  # To register image loading
@register_loader("mask", ".png", ".bmp", backend="imageio")  # To register label loading
def load_custom(file: str) -> tuple[np.ndarray, dict]:
    """
    Load a file and return (data, metadata).
    metadata can be empty or include keys like: spacing, origin, direction, shear, dtype, etc.
    """
    # data = iio.imread(file)  # example for imageio
    data = ...  # TODO: replace
    meta = {}   # TODO: replace
    return data, meta
    # return meta  # for metadata files like json or yaml

# --------------------------- WRITER ------------------------------------------
# Replace the file extensions and backend name to match your custom function
@register_writer("image", ".png", ".jpg", ".jpeg", ".bmp", backend="imageio")  # To register image writer
@register_writer("mask", ".png", ".bmp", backend="imageio")  # To register label writer
def save_custom(data: np.ndarray, file: str) -> list[str]:
    """
    Save array to `file`. Return all created paths (include sidecars if any).
    """
    # TODO: write using your backend
    # iio.imwrite(file, data)
    return [file]
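For reference, here is one way the template could look once filled in, using `imageio` as the backend. This is a sketch only: the module name `custom_png_io.py` and backend name `my_imageio` are arbitrary, and it assumes `imageio` is installed.

```python
# custom_png_io.py: the template above, filled in with imageio (illustrative sketch).
import imageio.v3 as iio
import numpy as np

from vidata.registry import register_loader, register_writer


@register_loader("image", ".png", ".bmp", backend="my_imageio")
def load_png(file: str) -> tuple[np.ndarray, dict]:
    # Read the file into a NumPy array; plain 2D formats carry no extra metadata here.
    data = iio.imread(file)
    return data, {}


@register_writer("image", ".png", ".bmp", backend="my_imageio")
def save_png(data: np.ndarray, file: str) -> list[str]:
    # Write the array and report the single file that was created.
    iio.imwrite(file, data)
    return [file]
```

Importing the module once at startup (e.g., `from . import custom_png_io` in your package's `__init__.py`) runs the decorators and makes the backend selectable by name.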
TL;DR
- Use `ImageLoader/Writer`, `SemSegLoader/Writer`, `MultilabelLoader/Writer` for single-file data.
- Use `ImageStackLoader/Writer`, `MultilabelStackedLoader/Writer` when channels/classes are split across files (`*_0000`, `*_0001`, …).
- Some formats support multiple backends (e.g., `.nii.gz` → `sitk`, `nibabel`).
from vidata.loaders import ImageLoader
from vidata.writers import ImageWriter
# Minimal example: load an image and save it again
loader = ImageLoader(ftype=".png", channels=3)
writer = ImageWriter(ftype=".png")
image, meta = loader.load("example.png")
writer.save(image, "out.png", meta)
Expand for Full Details
Use these loaders when each image or label is stored in a single file.
from vidata.loaders import ImageLoader, SemSegLoader, MultilabelLoader
from vidata.writers import ImageWriter, SemSegWriter, MultilabelWriter
# Image Loader/Writer
loader = ImageLoader(ftype=".png")
writer = ImageWriter(ftype=".png")
image, meta = loader.load("path/to/file.png")
writer.save(image,"path/to/output.png",meta)
# Semantic Segmentation Loader/Writer
loader = SemSegLoader(ftype=".png")
writer = SemSegWriter(ftype=".png")
mask, meta = loader.load("path/to/file.png")
writer.save(mask,"path/to/output.png",meta)
# Multilabel Segmentation Loader/Writer
loader = MultilabelLoader(ftype=".png")
writer = MultilabelWriter(ftype=".png")
mask, meta = loader.load("path/to/file.png")
writer.save(mask,"path/to/output.png",meta)
Use these loaders when each channel (for images) or class (for labels) is stored in a separate file. Files must follow a numeric suffix convention: `file_0000.png`, `file_0001.png`, etc. Stacked variants exist only for image and multilabel loaders/writers.
from vidata.loaders import ImageStackLoader, MultilabelStackedLoader
from vidata.writers import ImageStackWriter, MultilabelStackedWriter
# Image Loader
# expects multiple files with numeric suffixes: file_0000.png, file_0001.png, file_0002.png
loader = ImageStackLoader(ftype=".png", channels=3) # channels=3 --> 3 files are expected
writer = ImageStackWriter(ftype=".png", channels=3) # channels=3 --> 3 files are expected
image, meta = loader.load("path/to/file") # Pass only the base path without suffix and file_type
writer.save(image,"path/to/output.png",meta) # each channel will be saved in a separate file
# Multilabel Segmentation Loader
# expects file names: file_0000.png,... , file_0006.png
loader = MultilabelStackedLoader(ftype=".png", classes=7) # classes=7 --> 7 files are expected
writer = MultilabelStackedWriter(ftype=".png", classes=7) # classes=7 --> 7 files are expected
mask, meta = loader.load("path/to/file") # Pass only the base path without suffix and file_type
writer.save(mask,"path/to/output.png",meta) # each class will be saved in a separate file
Some file types support multiple backends. You can select one explicitly (the same applies to all loaders and writers):
from vidata.loaders import ImageLoader
# Explicit backend (SimpleITK)
loader = ImageLoader(ftype=".nii.gz", channels=1, backend="sitk")
image, meta = loader.load("path/to/file.nii.gz")
# Explicit backend (Nibabel)
loader = ImageLoader(ftype=".nii.gz", channels=1, backend="nibabel")
image, meta = loader.load("path/to/file.nii.gz")
# If not specified, the first backend from the table below is chosen (SimpleITK in this case)
loader = ImageLoader(ftype=".nii.gz", channels=1)
image, meta = loader.load("path/to/file.nii.gz")
| Data Type | Extension(s) | Available Backends |
|---|---|---|
| Image/Mask | `.png`, `.jpg`, `.jpeg`, `.bmp` | `imageio` |
| Image/Mask | `.nii.gz`, `.mha`, `.nrrd` | `sitk`, `nibabel` |
| Image/Mask | `.tif`, `.tiff` | `tifffile` |
| Image/Mask | `.b2nd` | `blosc2`, `blosc2pkl` |
| Image/Mask | `.npy` | `numpy` |
| Image/Mask | `.npz` | `numpy` |
TL;DR
- Images → `(H, W, C)` (2D) or `(D, H, W, C)` (3D).
- Semantic masks → `(H, W)` (2D) or `(D, H, W)` (3D).
- Multilabel masks → `(N, H, W)` (2D) or `(N, D, H, W)` (3D).
- Axes: Z=Depth, Y=Height, X=Width, C=Channels.
Expand for Full Details
Legend:
- `D`, `H`, `W`: Spatial dimensions
- `C`: Number of channels
- `N`: Number of classes
- `B`: Batch size
| Data Type | Expected Shapes 2D | Expected Shapes 3D |
|---|---|---|
| Images | `(H, W)` or `(H, W, C)` | `(D, H, W)` or `(D, H, W, C)` |
| Semantic Segmentation Masks | `(H, W)` | `(D, H, W)` |
| Multilabel Segmentation Masks | `(N, H, W)` | `(N, D, H, W)` |
| Axis | Spatial Dimension | Anatomical Plane | Direction | NumPy Axis | SimpleITK Axis | PyTorch Axis |
|---|---|---|---|---|---|---|
| Z (if 3D) | Depth (D) | Axial | Bottom ↔ Top | 0 | 2 | 1 (after channel) |
| Y | Height (H) | Coronal | Back ↔ Front | 1 (or 0 in 2D) | 1 | 2 |
| X | Width (W) | Sagittal | Left ↔ Right | 2 (or 1 in 2D) | 0 | 3 |
| C (optional) | Channel (C) | - | - | 3 (if present) | Not used | 0 |
| Framework | 2D Shape | 3D Shape |
|---|---|---|
| NumPy | `(B, H, W, C)` = `(B, Y, X, C)` | `(B, D, H, W, C)` = `(B, Z, Y, X, C)` |
| PyTorch | `(B, C, H, W)` = `(B, C, Y, X)` | `(B, C, D, H, W)` = `(B, C, Z, Y, X)` |
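To make the conventions concrete, here is a minimal NumPy sketch (the array sizes are made up for illustration):

```python
import numpy as np

# A 3D image in the channel-last convention: (D, H, W, C)
vol = np.zeros((32, 128, 128, 2), dtype=np.float32)

# A matching multilabel mask is channel-first: (N, D, H, W)
mask = np.zeros((5, 32, 128, 128), dtype=np.uint8)

# Move the channel axis to the front to obtain PyTorch's (C, D, H, W) layout
vol_channel_first = np.moveaxis(vol, -1, 0)
print(vol_channel_first.shape)  # (2, 32, 128, 128)
print(mask.shape)               # (5, 32, 128, 128)
```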
TL;DR
- Use `SemanticSegmentationManager` for single-label masks (each voxel has one class ID).
- Use `MultiLabelSegmentationManager` for multi-label masks (channel-first one-hot).
- Provides utilities to generate dummy masks, list class IDs, count pixels, and locate classes.
from vidata.task_manager import SemanticSegmentationManager
tm = SemanticSegmentationManager()
mask = tm.random((128, 128), num_classes=3)
print(tm.class_ids(mask)) # -> e.g. [0, 1, 2]
Expand for Full Details
The task managers provide a unified interface for semantic segmentation and multilabel segmentation labels, with utilities to generate dummy data, inspect class distributions, and query spatial properties.
from vidata.task_manager import (
SemanticSegmentationManager,
MultiLabelSegmentationManager,
)
# Semantic segmentation (2D example)
size, num_classes = (128, 128), 5
ssm = SemanticSegmentationManager()
mask = ssm.random(size, num_classes) # shape: (H, W) or (D, H, W), dtype=int, range: [0, num_classes)
empty = ssm.empty(size, num_classes) # zeros by default
ids = ssm.class_ids(mask) # sorted unique class IDs (e.g., [0,1,2,3,4])
count_bg = ssm.class_count(mask, 1) # number of pixels/voxels with class 1
coords = ssm.class_location(mask, 3) # indices (tuple of arrays) where class==3
spatial = ssm.spatial_dims(mask.shape) # (H, W) or (D, H, W)
has_bg = ssm.has_background() # True (class 0 is background)
# Multilabel segmentation (3D example)
size, num_classes = (32, 128, 128), 7
mlm = MultiLabelSegmentationManager()
ml_mask = mlm.random(size, num_classes) # shape: (N, H, W) or (N, D, H, W), channel-first one-hot
ml_empty = mlm.empty(size, num_classes) # all zeros
ml_ids = mlm.class_ids(ml_mask) # sorted unique class IDs (e.g., [0,1,2,3,4])
ml_count = mlm.class_count(ml_mask, 2) # number of pixels/voxels where channel 2 is active
ml_coords = mlm.class_location(ml_mask, 6) # indices (tuple of arrays) where channel 6 is active
ml_spatial = mlm.spatial_dims(ml_mask.shape) # (H, W) or (D, H, W)
mlm.has_background() # False (class 0 is not background)
TL;DR
- Use `FileManager` to collect single-file samples (`*.png`, `*.nii.gz`, …).
- Use `FileManagerStacked` when samples are split across multiple files (`*_0000`, `*_0001`, …).
- Supports `pattern`, `include_names`, and `exclude_names` filters.
- Returns a list-like object (`len()`, indexing).
from vidata.file_manager import FileManager
fm = FileManager(path="data/images", file_type=".png")
print(len(fm), "files found")
print(fm[0]) # -> Path('data/images/example.png')
Expand for Full Details
The `file_manager` module collects files from a root directory with an optional pattern and include/exclude filters, and offers a variant for stacked files (e.g., `*_0000.*`, `*_0001.*`).
from vidata.file_manager import FileManager
# Single-file data (e.g., images or masks stored one-per-file)
fm = FileManager(
    path="data/images",
    file_type=".png",
    pattern="*_image",                  # optional; glob-like
    include_names=["case_", "sample_"], # optional; keep only if any token is in the file name
    exclude_names=["corrupt"],          # optional; drop if any token is in the file name
)
print(len(fm), "files") # > 30 files | how many files are found
print(fm[0]) # > data/images/case_1_image.png
Use FileManagerStacked when each sample is stored across multiple files with a numeric suffix (e.g., channels or classes): file_0000.ext, file_0001.ext, …
from vidata.file_manager import FileManagerStacked
# Stacked data (e.g., channels or classes stored across multiple files per sample)
fm = FileManagerStacked(
    path="data/stacked_images",
    file_type=".png",
    pattern="*_image",                  # optional; glob-like
    include_names=["case_", "sample_"], # optional; keep only if any token is in the file name
    exclude_names=["corrupt"],          # optional; drop if any token is in the file name
)
print(len(fm), "files") # > 30 files | how many files are found (number of base path)
print(fm[0]) # > data/stacked_images/case_1_image | gives the base path without suffix and dtype
TL;DR
- A config is a YAML template for a dataset: it always starts with a `name`.
- Add one or more layers (e.g., `image`, `semseg`, `multilabel`) with fields like `path`, `file_type`, and `channels`/`classes`.
- Optional splits: define `train`/`val`/`test` overrides or point to a `splits_file`.
- See `examples/template.yaml` and/or run `vidata_template` for a template.
Expand for Full Details
A config is a YAML file describing the dataset name, layers, and optional split definitions.
name: DatasetName

# Define an arbitrary number of layers
layers:
  - name: SomeImages          # unique layer name
    type: image               # 'image' for image data; for label data one of: 'semseg' | 'multilabel'
    path: some/path/to/data   # directory
    file_type: .png           # required extension incl. dot (e.g., .png | .nii.gz | .b2nd)
    pattern:                  # optional regex to filter files relative to 'path' without file_type
    backend:                  # optional backend to load/write the data (e.g., sitk | nib) for .nii.gz, etc.
    channels: 3               # number of channels, required for image
    file_stack: False         # True if each channel is a separate file: *_0000, *_0001, ...

  - name: SomeLabels
    type: semseg              # semseg --> Semantic Segmentation | multilabel --> Multilabel Segmentation
    path: some/path/to/data
    file_type: .png|.nii.gz|.b2nd|...
    pattern:
    backend:
    classes: 19               # number of classes, required for semseg/multilabel
    file_stack: False         # for multilabel: true if each class is a separate file: *_0000, *_0001, ...
    ignore_bg: null           # optional bool for labels; if true, class 0 is ignored in metrics and analysis
    ignore_index: 255         # optional int for labels; label value to ignore in loss/eval

# Splitting (optional)
splits:
  splits_file: some/path/splits_final.json  # optional: use a file to define splits, content can be:
                                            # 1) object: {"train": [...], "val": [...], "test": [...]}
                                            # 2) list of folds: [{"train": [...], "val": [...]}, {...}, ...]
  # optional: per-layer overrides for paths/patterns by split
  train:
    SomeImages:               # empty -> use defaults
    SomeLabels:               # empty -> use defaults
  val:
    SomeImages:
      pattern: some_pattern   # optional: example override for the pattern parameter for this layer and split
      path: some/path/to/data # optional: example override for the path parameter for this layer and split
    SomeLabels:
      pattern: some_pattern
      path: some/path/to/data
  test:                       # empty --> split is not defined
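For orientation, here is a minimal config sketch derived from the template above (layer names, paths, and numbers are placeholders; the optional fields are simply omitted):

```yaml
name: ToyDataset

layers:
  - name: Images
    type: image
    path: data/images
    file_type: .png
    channels: 3
    file_stack: False

  - name: Labels
    type: semseg
    path: data/labels
    file_type: .png
    classes: 19
    file_stack: False
```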
TL;DR
- Central entry point: validates configs and builds a `FileManager`, `Loader`, and `TaskManager` for each dataset layer.
- The config is a YAML file with `layers` (images/labels) and optional `splits`.
- Access layers by name, get file managers for splits/folds, and construct loaders/task managers automatically.
from vidata.config_manager import ConfigManager
from vidata.io import load_yaml
cfg = load_yaml("dataset.yaml")
cm = ConfigManager(cfg)
print(cm.layer_names()) # -> ['SomeImages', 'SomeLabels']
image_layer = cm.layer("SomeImages")
fm = image_layer.file_manager() # file discovery for this split/fold
loader = image_layer.data_loader() # ready-to-use loader
arr, meta = loader.load(str(fm[0]))
Expand for Full Details
The `config_manager` module ties everything together: it validates dataset configs, builds the appropriate FileManager, Loader, and TaskManager for each dataset layer, and applies split definitions (train/val/test or folds).
from vidata.config_manager import ConfigManager
from vidata.io import load_yaml
# Load a dataset config
cfg = load_yaml("dataset.yaml")
cm = ConfigManager(cfg)
# Access layers
print(cm.layer_names()) # ['SomeImages', 'SomeLabels']
image_layer = cm.layer("SomeImages")
# Get a file manager for train split
fm = image_layer.file_manager(split="train", fold=0)
print(len(fm), "files")
# Build a loader
loader = image_layer.data_loader()
arr, meta = loader.load(str(fm[0]))
# Get the task manager for labels
label_layer = cm.layer("SomeLabels")
label_fm = label_layer.file_manager(split="train", fold=0)
mask, _ = label_layer.data_loader().load(str(label_fm[0]))
tm = label_layer.task_manager()
print("Classes in labels:", tm.class_ids(mask))
The `vidata_analyze` CLI computes dataset statistics and writes them to the specified output directory. Results include:
- Image statistics: sizes, resolutions, intensity distributions
- Label statistics: class counts, frequencies, co-occurrence
- Split summaries: optional per-split analysis
vidata_analyze -c path/to/datasets/*.yaml -o <outputdir>
# Analyze a specific split/fold
vidata_analyze -c path/to/datasets/*.yaml -o <outputdir> -s <split> -f <fold>
The `data_inspection` CLI provides an interactive viewer (via Napari) to browse images, labels, and splits defined in your dataset config.
pip install napari-data-inspection[all]
Then run the following:
data_inspection -c path/to/datasets/*.yaml
# Inspect a specific split/fold
data_inspection -c path/to/datasets/*.yaml -s <split> -f <fold>
This repository is developed and maintained by the Applied Computer Vision Lab (ACVL) of Helmholtz Imaging and the Division of Medical Image Computing at DKFZ.
This repository was generated with copier using the napari-plugin-template.