This is a collection of helper functions and utilities to make working on the
DNAnexus-based Our Future Health TRE more convenient. It provides wrapper
functions around the `dx` utility and should therefore be useful not only for
the specific TRE it was designed for, but also for general work on DNAnexus
from within an interactive R session.
This project is in no way affiliated with DNAnexus. In fact, the author does not particularly enjoy working on their platform, hence this package.
Main use-cases are:
- Interact with the `dx` toolkit from within your R session, both locally and in an OFH TRE Jupyter session, using convenience functions wrapping common calls to `dx`
- Submit R commands to a worker job without the need to interact with the web interface of DNAnexus
- Where relevant, the functions work exclusively with dependencies available on the OFH TRE JupyterLab workstation
Function documentation is also available at
https://comp-med.github.io/r-ofhelper/.
- Launch a JupyterLab session from your R command line: `dx_launch_workstation()`
- Submit R jobs with custom scripts: `dx_submit_r_job()`
- Submit Swiss Army Knife jobs: `dx_run_swiss_army_knife()`
- DNAnexus operations (functions starting with `dx_*`):
  - `dx_run_cmd()` can be used to execute arbitrary `dx` commands slightly more cleanly than by using `system2()` or similar
  - `dx_upload` can be used to upload result files from a worker to the project space, with the option to overwrite files with the same name
- Decoding workflows for raw OFH data: `decode_single_select()`, `decode_multi_select()`, `decode_raw_ofh_file()`
- Logging: `simple_logger()`, because no logger is available on OFH
- Instance-type selection: `find_tre_instance_type()`, so you don't have to check the rate card manually
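To illustrate what a wrapper around `system2()` can add, here is a minimal sketch in base R. The helper name `run_cli()` is hypothetical and is not the package's `dx_run_cmd()`, whose actual signature and behavior may differ. `echo` stands in for `dx` so the snippet runs on a Unix-like system without DNAnexus credentials.

```r
# Hypothetical helper (NOT the package's dx_run_cmd()): a thin system2() wrapper
# that captures output and turns a non-zero exit status into an R error.
run_cli <- function(bin, args = character()) {
  # Capture stdout and stderr together, one element per output line.
  out <- suppressWarnings(system2(bin, args, stdout = TRUE, stderr = TRUE))

  # system2() only attaches a "status" attribute when the exit code is non-zero.
  status <- attr(out, "status")
  if (!is.null(status) && status != 0) {
    stop(sprintf(
      "`%s` exited with status %d:\n%s",
      bin, as.integer(status), paste(out, collapse = "\n")
    ))
  }

  # Drop attributes and return the plain output lines.
  as.character(out)
}

# Usage with a harmless stand-in command; with dx installed, `bin` would be
# the path to the dx executable instead.
run_cli("echo", c("hello", "world"))
```

Compared with a bare `system2()` call, a wrapper of this kind gives one place to handle failures, so calling code does not have to inspect the `"status"` attribute itself.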
The package relies on the external Python executable `dx`, so using this
package on Windows will probably only work from within WSL. The package should
work without issues on Unix-based operating systems.
The external dependency this package provides wrappers for must be installed
separately. Package managers like uv or conda can be used for this purpose.
The location of the required binary can then be queried and passed to the
initialization function.
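From within R, one way to look up the executable is base R's `Sys.which()`. This is only a sketch: it assumes `dx` is on the `PATH` of the running R session (for example because the environment it was installed into is active), and the variable name `dx_path` is illustrative. How the path is handed to the initialization function is not shown here; see the function documentation for that.

```r
# Look up the `dx` executable on the PATH of the current R session (base R only).
# Sys.which() returns an empty string when the program cannot be found.
dx_path <- unname(Sys.which("dx"))

if (nzchar(dx_path)) {
  message("Found dx at: ", dx_path)
} else {
  message("dx was not found on PATH; activate the environment it was installed into first.")
}
```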
# Using conda
conda create -n dxpy python=3.10
conda activate dxpy
pip install dxpy
# Using uv
uv venv
uv pip install dxpy

After activating the environment, check the path of the executable (requires
the whereis utility).

whereis dx

Locally, you can install the package using:
install.packages("remotes")
remotes::install_github("comp-med/r-ofhelper")

On the OFH TRE, you can upload the package using the Airlock system, since no external packages can be installed and all development must take place in a vanilla Jupyter Notebook environment.
For this, a convenience function is provided (after installing the package
locally) to create a single input string that can then be written to a file
and submitted to the Airlock system. A single file is more convenient for the
auditing process.
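The general idea of shipping functions as one text file can be sketched with base R alone. The example is an illustration under stated assumptions, not the implementation of `create_ofhelper_string()`: the toy functions `add_one()` and `shout()` and the helper `as_source()` are hypothetical, and the real function may assemble its string differently.

```r
# Toy stand-ins for package functions (hypothetical, for illustration only).
helpers <- list(
  add_one = function(x) x + 1,
  shout   = function(s) toupper(s)
)

# Turn one function into an assignment statement, e.g. "add_one <- function (x) ...".
as_source <- function(name, fn) {
  paste0(name, " <- ", paste(deparse(fn), collapse = "\n"))
}

# Join all definitions into a single string, separated by blank lines.
bundle <- paste(
  mapply(as_source, names(helpers), helpers),
  collapse = "\n\n"
)

# Write the string to a plain text file, as one would before an upload.
bundle_file <- tempfile(fileext = ".txt")
writeLines(bundle, bundle_file)

# On the receiving side, evaluate the file into a fresh environment.
restored <- new.env()
source(bundle_file, local = restored)

restored$add_one(1)
restored$shout("ok")
```

Keeping everything in one plain text file means a reviewer sees ordinary R source rather than a built package archive.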
Locally, run the following command to create a string containing all functions. Write that to a file and upload it to the TRE.
# `ofhelper_string` contains all functions of this package. Submit those to the airlock.
ofhelper_string <- create_ofhelper_string()
writeLines(ofhelper_string, "upload_this_via_the_airlock.txt")

Please check the Getting Started Vignette.
Dependencies of functions meant to be used on the OFH TRE are restricted to the packages available there. Some functions are designed to be run in your local environment and might include additional dependencies.
- data.table
- fs
- rlang
- glue
- withr
Due to the straightforward nature of parts of the package, some minor utility functions and some documentation were generated using a locally deployed LLM.
- Look for TODO tags in the functions!
- Tests are mostly mock-tests right now and very much lacking
- Integration tests with `dx` are not present yet