MAMMA: Markerless Accurate Multi-person Motion Acquisition

Hanz Cuevas Velasquez^1*, Anastasios Yiannakidis^1*, Soyong Shin², Giorgio Becherini¹, Markus Höschle¹, Joachim Tesch¹, Taylor Obersat¹, Tsvetelina Alexiadis¹, Eni Halilaj², Michael J. Black¹

¹Max Planck Institute for Intelligent Systems, Tübingen ²Carnegie Mellon University

^*Equal contribution

[CVPR 2026 Oral] | arXiv | Project Page | Datasets

News

[2026-06] 🎉 MAMMA being presented at CVPR 2026
[2026-06] Code released (inference + training)

Install

git clone https://github.com/cuevhv/mamma.git
cd mamma

Full env + CUDA + weights setup: docs/INSTALL.md.

micromamba activate mamma          # or: conda activate mamma
python -m inference doctor         # verify env vars + weight paths

The pipeline is zero-config when weights live under data/.

Quick demo

Bundled 4-cam example, ~56 MB:

bash data/download_example.sh                                       # fetches videos to data/mamma_example/

python -m inference run \
  --cfg      configs/examples/presets/quick.yaml \
  --footage  data/mamma_example \
  --seq_name pushing_and_lifting_from_ground \
  --calib    configs/examples/calib/iphones_outdoors.yaml \
  --out-tag  demo -v

Outputs land under output/ma_*/demo/mamma_example/….

Prefer a browser UI? Run bash gui/scripts/dev.sh, open http://localhost:3000, and click Run demo. It's the same pipeline but friendlier UX!

Pipeline

ma_cap → ma_masks → ma_2d → ma_3d → ma_vis

Step	What it does
`ma_cap`	Loads multi-view capture
`ma_masks`	Per-person segmentation (SAM + YOLO)
`ma_2d`	2D landmark detection (MammaNet)
`ma_3d`	Multi-view SMPL-X optimization
`ma_vis`	Per-camera overlays + interactive scene

Entry point: python -m inference run (source: inference/cli/run.py).

Argument	What it is
`--cfg` / `--preset`	Pipeline-configuration YAML — declares which steps run and their hyperparameters. Capture-independent. (what a preset is + how to modify one)
`--footage`	Dataset root containing sequence subdirs (use with `--seq_name` + `--calib`). (layout reference)
`--seq_name`	One sequence subdirectory name under `--footage` to process (one run = one sequence).
`--calib`	Calibration file (`.yaml` / `.xcp` / OpenCV `.json`); applies to every sequence under `--footage`. (format reference)
`--capture`	Advanced: capture JSON pointing at footage, calibration, sequences, and camera names — used to iterate over many sequences in one invocation. (schema reference)
`--out-tag`	Output sub-directory tag under `output/ma_*/<tag>/` (default: `local`).
`-v`	Verbose runner logs.

Run the pipeline

Three things are needed:

A calibration file (how to make one)
A folder with your sequence (how to set it up)
A preset — use a shipped one: configs/examples/presets/quick.yaml (~5 min smoke) or configs/examples/presets/full.yaml (full-frame). See docs/CONFIGS.md to modify or author your own.

Then:

python -m inference run \
  --cfg      <path/to/preset>.yaml \
  --footage  <path/to/footage> \
  --seq_name <seq_name> \
  --calib    <path/to/calib>.yaml \
  --out-tag  run01 -v

Alternative — iterate over many sequences in one invocation. A capture JSON enumerates sequences, cameras, and the calibration in one file; the runner walks them automatically:

python -m inference run \
  --cfg     <path/to/preset>.yaml \
  --capture <path/to/capture>.json \
  --out-tag run01 -v

GUI

Browser UI for submitting and inspecting runs. It uses the same mamma python env.

gui/scripts/dev.sh        # dev: Flask :8000 + Vite :3000 (auto-reload)
gui/scripts/prod.sh       # prod: single Flask process on :8000

Setup and deployment: gui/README.md.

MAMMA datasets

The paper's released captures, evaluation data, and synthetic training data live on the MAMMA project page and require a free account.

Register at https://mamma.is.tue.mpg.de/ and confirm your email.
Either use the GUI's Pipeline assets panel (sign in once, click to download), or run the per-dataset shell scripts under data/:
```
bash data/download_mamma_dance.sh --bachata --meta --pred --videos_crf24
```

Five dataset families ship: dance, multi-person, iPhone, eval, and synthetic. Per-dataset sizes, video encodings, and the full script flag surface live in docs/DATASETS.md.

Just running on your own footage? You don't need any of this — see Run the pipeline above.

Layout

.
├── inference/       runner, step builders, doctor CLI
├── capture/         ma_cap step
├── segmentation/    ma_masks step
├── landmarks/       ma_2d step
├── optimization/    ma_3d step
├── visualization/   ma_vis step
├── configs/         presets + capture manifests
├── data/            body models + weights + datasets (gitignored)
├── output/          run outputs (gitignored)
├── gui/             browser UI (Flask + React)
└── scripts/         smoke tests + utilities

TODO

Release the evaluation scripts (2D landmark + benchmark evaluation) and the processed evaluation datasets.

Citation

@inproceedings{cuevas2026mamma,
  title     = {{MAMMA}: {Markerless Accurate Multi-person Motion Acquisition}},
  author    = {Cuevas Velasquez, Hanz and Yiannakidis, Anastasios and Shin, Soyong and Becherini, Giorgio and H{\"o}schle, Markus and Tesch, Joachim and Obersat, Taylor and Alexiadis, Tsvetelina and Halilaj, Eni and Black, Michael J.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}

Acknowledgments

MAMMA builds on a number of open-source models, datasets, and tools. We thank their authors for releasing their work openly.

SAM 2 and SAM 3: Meta FAIR; segmentation backbones.
YOLOv12: Ultralytics; person detection.
SMPL-X: MPI-IS; expressive body model.
ViTPose: pretrained human-pose backbone.
HRNet: alternative pose-estimation backbone.
CameraHMR: MPI-IS; landmark architecture baseline.
BEDLAM: MPI-IS; synthetic dataset of humans in motion.
Rerun: interactive 3D scene viewer.
Detectron2: Meta FAIR; vision research library.
PyTorch Lightning, Hydra, and WebDataset: training infrastructure.

License

For non-commercial scientific research purposes LICENSE.

Contact

Questions, bug reports, or other inquiries: mamma@tue.mpg.de.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MAMMA: Markerless Accurate Multi-person Motion Acquisition

News

Install

Quick demo

Pipeline

Run the pipeline

GUI

MAMMA datasets

Layout

TODO

Citation

Acknowledgments

License

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
capture		capture
configs		configs
data		data
docs		docs
gui		gui
inference		inference
landmarks		landmarks
optimization		optimization
requirements		requirements
scripts		scripts
segmentation		segmentation
visualization		visualization
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

MAMMA: Markerless Accurate Multi-person Motion Acquisition

News

Install

Quick demo

Pipeline

Run the pipeline

GUI

MAMMA datasets

Layout

TODO

Citation

Acknowledgments

License

Contact

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages