HydraMPP (Hydra Massive Parallel Processing) is a high-performance Python library for distributed parallel processing. It scales the same code seamlessly from:

- A single laptop
- A multi-core workstation
- A multi-node HPC cluster

Built for simplicity, portability, and near-zero dependencies, HydraMPP gives you a familiar @remote / .remote() programming model without dragging in a heavyweight runtime: psutil and the Python standard library are all it needs.

| Feature | HydraMPP |
|---|---|
| Featherweight (psutil + stdlib only) | ✅ |
| Ray-style API (@remote / .remote) | ✅ |
| Laptop → cluster, same code | ✅ |
| Native SLURM auto-wiring | ✅ |
| Built-in live status monitor | ✅ |
| PyPI and conda-forge | ✅ |
| Small, readable, auditable codebase | ✅ |
| Pure Python, no native build step | ✅ |


    flowchart LR
        A["Your Python Script"] -->|"remote tasks"| B["HydraMPP Driver"]
        B --> C["Shared Job Queue"]
        C --> D["Scheduler (main_loop)"]
        D -->|"local"| E["Worker Processes"]
        D -->|"TCP + pickle"| F["Client Node 1"]
        D -->|"TCP + pickle"| G["Client Node N"]
        H["hydra-status.py"] -. "UDP" .-> D

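
The flow in the diagram (jobs enter a shared queue, then a scheduler hands each one to a local worker or a client node) can be illustrated with a toy dispatcher. This is a minimal sketch, not HydraMPP's actual main_loop: the `dispatch` function, the first-fit policy, and the node names are all assumptions made for illustration.

```python
from collections import deque


def dispatch(jobs, free_cpus):
    """Assign queued jobs to nodes, first-fit by free CPU count.

    jobs: iterable of (job_id, num_cpus) in submission order.
    free_cpus: dict mapping node name -> CPUs currently free (mutated in place).
    Returns (assignments, still_queued).
    """
    queue = deque(jobs)
    assignments = {}
    still_queued = []
    while queue:
        job_id, need = queue.popleft()
        # Pick the first node with enough free CPUs (hypothetical policy).
        node = next((n for n, free in free_cpus.items() if free >= need), None)
        if node is None:
            still_queued.append((job_id, need))  # stays queued until CPUs free up
            continue
        free_cpus[node] -= need
        assignments[job_id] = node
    return assignments, still_queued


# Hypothetical cluster: 2 free CPUs locally, 4 on one client node.
nodes = {"local": 2, "client-1": 4}
placed, waiting = dispatch([(0, 1), (1, 4), (2, 2), (3, 1)], nodes)
print(placed)
print(waiting)
```

Job 2 asks for two CPUs when no node has two free, so it stays queued while the smaller job 3 is still placed. A real scheduler would also return CPUs to the pool when a job finishes.
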
| Component | Technology |
|---|---|
| Language | Python ≥ 3.6 |
| Parallelism | multiprocessing |
| Resource detection | psutil |
| Networking | TCP sockets (struct framing) |
| Serialization | pickle |
| Cluster integration | SLURM (scontrol / srun) |
| Status monitor | UDP datagram |
| CLI helper | argparse |
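
To make the "TCP sockets (struct framing)" and "pickle" rows concrete, here is one common way to combine them: each message is a pickled object preceded by a fixed-size length header. This is a sketch under stated assumptions, not HydraMPP's wire format: the 4-byte big-endian header and the helper names `send_msg`, `recv_exact`, and `recv_msg` are hypothetical.

```python
import pickle
import socket
import struct

HEADER = struct.Struct("!I")  # assumed layout: 4-byte big-endian payload length


def send_msg(sock, obj):
    """Pickle obj and send it with a length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Read one length-prefixed message and unpickle it."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return pickle.loads(recv_exact(sock, length))


# Demonstrate on a connected local socket pair instead of a real cluster.
a, b = socket.socketpair()
send_msg(a, {"func": "slow_square", "args": (7,)})
echoed = recv_msg(b)
a.close()
b.close()
print(echoed)
```

The length prefix is what lets the receiver find message boundaries on a byte stream. Note that unpickling data from an untrusted peer can execute arbitrary code, so a scheme like this is only appropriate on a trusted network.
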

    pip install hydraMPP

No admin rights? Install into your home folder:

    pip install --user hydraMPP

With conda:

    conda install -c conda-forge hydraMPP

From source:

    git clone https://github.com/raw-lab/HydraMPP
    cd HydraMPP
    pip install .

Quick start:

    import time

    import hydraMPP


    # 1. Tag any function you want to run in parallel
    @hydraMPP.remote
    def slow_square(x, seconds=1):
        time.sleep(seconds)
        return x * x


    def main():
        # 2. Initialize (local mode, auto-detect CPUs)
        hydraMPP.init()

        # 3. Fire off jobs; submission is non-blocking and returns a job id
        jobs = [slow_square.remote(i) for i in range(20)]

        # 4. Request more resources for a heavier task
        jobs.append(slow_square.options(num_cpus=4).remote(99))

        # 5. Drain results as they finish. Calling wait() at the top of the
        #    loop ensures the final batch of ready jobs is printed too.
        pending = jobs
        while pending:
            ready, pending = hydraMPP.wait(pending)
            for job in ready:
                result = hydraMPP.get(job)
                print(f"{result[1]} -> {result[2]}")  # func name -> return value

        hydraMPP.shutdown()


    if __name__ == "__main__":
        main()

HydraMPP runs in three modes, selected by the address argument to init():


| Mode | How to start | Role |
|---|---|---|
| local | hydraMPP.init() | Single machine, all CPUs local |
| host | hydraMPP.init(address="host", port=24515) | Coordinator that accepts clients |
| client | hydraMPP.init(address="10.0.0.5", port=24515) | Worker node that joins a host |

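
The selection rule in the table can be written down as a small helper. This is only an illustration of the rule as the table states it: `mode_for` is a hypothetical function, not part of hydraMPP, and treating `None` or `"local"` as local mode is an assumption about the default.

```python
def mode_for(address=None):
    """Return the mode name implied by an init()-style address argument."""
    if address in (None, "local"):  # assumption: no address means local mode
        return "local"
    if address == "host":
        return "host"
    return "client"  # anything else is taken to be the host's address


for addr in (None, "host", "10.0.0.5"):
    print(repr(addr), "->", mode_for(addr))
```
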

    # Coordinator (host): waits for clients to connect
    hydraMPP.init(address="host", port=24515, timeout=10)

    # Worker (client): connects to the host and offers its CPUs
    hydraMPP.init(address="10.0.0.5", port=24515, num_cpus=36)

| Function | Description |
|---|---|
| hydraMPP.init(address, num_cpus, ...) | Start HydraMPP in local / host / client mode |
| @hydraMPP.remote | Tag a function for parallel execution |
| func.remote(*args, **kwargs) | Queue a tagged function; returns a job id |
| func.options(num_cpus=N).remote(...) | Set per-call resource options |
| hydraMPP.wait(queue, timeout, max) | Split a job list into (ready, pending) |
| hydraMPP.get(id) | Retrieve a finished job's result record |
| hydraMPP.put(name, obj) | Place an object into the queue as a finished result |
| hydraMPP.nodes() | List the connected nodes |
| hydraMPP.shutdown() | Tear down workers and the manager |

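
The submit-then-drain pattern implied by remote(), wait(), and get() can be tried without a HydraMPP installation by using the standard library as a stand-in. The sketch below uses concurrent.futures, whose `wait()` also splits a collection into finished and unfinished items. It is an analogy only: hydraMPP.wait() may differ in blocking behavior and arguments.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def slow_square(x, seconds=0.01):
    time.sleep(seconds)
    return x * x


results = []
with ThreadPoolExecutor(max_workers=4) as pool:
    # Submission returns immediately with a handle, like a job id.
    pending = {pool.submit(slow_square, i) for i in range(8)}
    while pending:
        # wait() returns (done, not_done), playing the role of (ready, pending).
        ready, pending = wait(pending, return_when=FIRST_COMPLETED)
        for future in ready:
            results.append(future.result())

print(sorted(results))
```

Calling the wait function at the top of the loop means every batch of finished jobs is consumed, including the last one.
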
hydraMPP.get(id) returns a record describing the job. Fields are accessible by index:

| Index | Field | Meaning |
|---|---|---|
| 0 | finished | bool: whether the job has completed |
| 1 | func_name | name of the executed function |
| 2 | ret | the function's return value |
| 3 | num_cpus | number of CPUs the job used |
| 4 | runtime | wall-clock run time, in seconds |
| 5 | hostname | node the job ran on |

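
Indexing by position is easy to get wrong, so it can help to give the fields names. The sketch below wraps a record in a namedtuple. `JobRecord` is a hypothetical helper, the sample tuple is made up, and the code assumes the record is a sequence of exactly the six fields listed above.

```python
from collections import namedtuple

# Hypothetical wrapper; field order follows the index table above.
JobRecord = namedtuple(
    "JobRecord",
    ["finished", "func_name", "ret", "num_cpus", "runtime", "hostname"],
)

# Made-up sample standing in for the value returned by hydraMPP.get(job_id).
raw = (True, "slow_square", 81, 1, 1.002, "node01")

record = JobRecord(*raw)
print(f"{record.func_name} -> {record.ret} ({record.runtime:.2f}s on {record.hostname})")
```

Because a namedtuple is still a tuple, `record[2]` and `record.ret` refer to the same value, so existing index-based code keeps working.
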
A helper script is included to inspect a running HydraMPP cluster over UDP.

    usage: hydra-status.py [-h] [address] [port]

    positional arguments:
      address     Address of the HydraMPP server [127.0.0.1]
      port        Port to connect to [24515]

    options:
      -h, --help  show this help message and exit

It prints connected clients, available CPUs, and queued jobs, then exits. For continuous monitoring, wrap it with watch:

    watch -n1 hydra-status.py localhost

HydraMPP can auto-configure host and client nodes inside a SLURM allocation. Just add --hydraMPP-slurm $SLURM_JOB_NODELIST to your program's invocation and HydraMPP wires up the cluster for you.

Tip: Call hydraMPP.init() after all functions have been tagged with @hydraMPP.remote. Use --hydraMPP-cpus to set the number of CPUs per node; if it is 0 or omitted, HydraMPP guesses.

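
For context, $SLURM_JOB_NODELIST holds a compressed node list such as node[01-03], which has to be expanded into individual hostnames before nodes can be contacted. On a cluster the authoritative tool for this is `scontrol show hostnames`. The pure-Python sketch below handles only simple patterns and is not HydraMPP's code; `expand_nodelist` is a hypothetical helper.

```python
import re


def expand_nodelist(nodelist):
    """Expand a simple SLURM-style node list such as 'node[01-03,07],gpu1'."""
    hosts = []
    # Split on commas that are not inside square brackets.
    for part in re.split(r",(?![^\[]*\])", nodelist):
        match = re.fullmatch(r"([^\[\]]+)\[([^\]]+)\]", part)
        if not match:
            hosts.append(part)  # plain hostname, nothing to expand
            continue
        prefix, ranges = match.groups()
        for item in ranges.split(","):
            if "-" in item:
                start, end = item.split("-")
                width = len(start)  # preserve zero padding, e.g. 01..03
                hosts.extend(
                    f"{prefix}{i:0{width}d}" for i in range(int(start), int(end) + 1)
                )
            else:
                hosts.append(prefix + item)
    return hosts


print(expand_nodelist("node[01-03,07],gpu1"))
```

Real node lists can be more complex (for example, several bracket groups in one name), which is why delegating to scontrol is the safer choice in production.
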

    #!/bin/bash
    #SBATCH --job-name=My_Slurm_Job
    #SBATCH --nodes=3
    #SBATCH --tasks-per-node=1
    #SBATCH --cpus-per-task=36
    #SBATCH --mem=100G
    #SBATCH --time=1-0
    #SBATCH -o slurm-%x-%j.out

    echo "====================================================="
    echo "Start Time  : $(date)"
    echo "Submit Dir  : $SLURM_SUBMIT_DIR"
    echo "Job ID/Name : $SLURM_JOBID / $SLURM_JOB_NAME"
    echo "Node List   : $SLURM_JOB_NODELIST"
    echo "Num Tasks   : $SLURM_NTASKS total [$SLURM_NNODES nodes @ $SLURM_CPUS_ON_NODE CPUs/node]"
    echo "====================================================="

    path/to/program.py --custom-args \
        --hydraMPP-slurm-auto --hydraMPP-cpus $SLURM_CPUS_ON_NODE

Creative Commons Attribution-NonCommercial (CC BY-NC 4.0)
See the LICENSE file for details.

If you use HydraMPP in published work, please cite:

Figueroa III JL, White III RA. 2026. HydraMPP: A lightweight library for distributed massive parallel processing in Python - threading at scale. bioRxiv.

We welcome:

- Scheduling and dispatch improvements
- Networking and fault-tolerance enhancements
- Authentication / transport security
- Status-monitor features
- Tests and benchmarks

Pull requests and issues are encouraged.

- Issues: HydraMPP Issues
- Contact: If you have any questions or feedback, please feel free to get in touch by email.