Extended cluster usage fills up the root partition #7

Reporting on behalf of Alex.

It seems that extended usage of the cluster fills up the root partition:

> Hail version: 0.2.18-08ec699f0fd4
> Error summary: DiskErrorException: No space available in any of the local directories.

One option is to build a new base image on a larger base instance, but once deployed, the larger root partition would shrink the `tmpfs` that Spark uses for its computation.

There are some temporary fixes that could buy us time:
1. Investigate why the Ansible `cleanup-base` role does not (or did not) clean up the download directory `/opt/sanger.ac.uk/hgi/download`.
2. Replace Anaconda with Miniconda in the Ansible `anaconda-base` role. Anaconda has a 5 GB `pkgs` directory with more than 400 packages, most of which will never be used.
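
Before choosing between these fixes, it may help to measure how much space each candidate directory actually occupies on a node. The following is a minimal sketch, not part of the existing Ansible roles: the helper names `dir_size_bytes` and `report` are hypothetical, and the location of the Anaconda `pkgs` directory is not given in this issue, so it has to be added by whoever runs the script.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.lstat(full).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return total


def report(paths, mount="/"):
    """Return free-space figures for mount and the size of each existing path."""
    usage = shutil.disk_usage(mount)
    sizes = {p: dir_size_bytes(p) for p in paths if os.path.isdir(p)}
    return {"total": usage.total, "free": usage.free, "sizes": sizes}


if __name__ == "__main__":
    # Download directory named in this issue. Add the Anaconda `pkgs`
    # path for your image here (its location is an unknown in this sketch).
    candidates = ["/opt/sanger.ac.uk/hgi/download"]
    info = report(candidates)
    gib = 1024 ** 3
    print(f"root: {info['free'] / gib:.1f} GiB free of {info['total'] / gib:.1f} GiB")
    for path, size in sorted(info["sizes"].items(), key=lambda kv: -kv[1]):
        print(f"{size / gib:8.2f} GiB  {path}")
```

Paths that do not exist on the node are silently left out of the report, so the same list can be used on images built before and after a cleanup change.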

Long-term solutions would be:
1. Ship the logs away (Ansible `common` role). However, this might not address the Spark `work` directory.
2. Encourage users to adopt a "time-limited" usage model in which they tear down the cluster whenever they are not using it. This would clean up everything.
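
If shipping logs turns out not to cover the Spark `work` directory, an age-based check could serve as a stopgap on long-lived clusters. The sketch below is only an illustration under stated assumptions: `stale_entries` is a hypothetical helper, the age threshold is arbitrary, and the path of the `work` directory depends on the deployment and is not given in this issue. It only lists candidates and deletes nothing.

```python
import os
import time


def stale_entries(work_dir, max_age_days, now=None):
    """List immediate children of work_dir not modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    with os.scandir(work_dir) as entries:
        for entry in entries:
            # Compare the entry's own modification time against the cutoff.
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                stale.append(entry.path)
    return sorted(stale)
```

Note that a directory's modification time only changes when its direct entries change, so an application directory that is still being written to in a nested subdirectory can look stale. Any deletion based on this list should be checked against running applications first.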
