The primary goal of the LibRA project is to directly expose algorithms used in Radio Astronomy (RA) for image reconstruction from interferometric telescopes. The primary target users are research groups (R&D groups at observatories, at university departments) and individual researchers (faculty, students, post-docs) who may benefit from a software system with production-quality implementation of the core algorithms which are also easy to use, deploy locally and modify as necessary. Therefore, a strong requirement driving this work is to keep the interface simple, the software stack shallow and the software dependency graph small.
This repository contains only the algorithmically-necessary code, and a build system to compile it into a library of algorithms. Such a library can be directly used as a third-party library by others in the RA community. Interfaces are provided to access the algorithms from C++ and Python, or as an end-user via standalone applications to conveniently configure and execute the algorithms from a Linux shell. The low-level algorithms exposed by these interfaces are factorized to be used as components in a higher-level generalized Algorithm Architecture (S. Bhatnagar, U. Rau, M. Hsieh, J. Kern, and R. Xue, AJ 170 246, 2025).
This page is meant for a condensed description of the architecture. This page is still under construction.
Interferometric radio telescopes are indirect imaging devices which collect data in the Fourier domain. Transforming the raw data from such devices to images requires the application of sophisticated algorithms to reconstruct the image. The fundamental scientific principles behind such telescopes share commonalities with other domains that rely on indirect imaging, such as Magnetic Resonance Imaging (MRI) and Ultrasound imaging. To make RA algorithms available for application in such fields and enable cross-discipline R&D, the API to the library is based on C++ STL for portability and wider use that does not require an RA-specific software stack and dependencies.
Deep Imaging at a National Scale: Using the set of LibRA apps as a path-finder for ngVLA-scale processing, we deployed the computationally intensive components of the Algorithm Architecture (Bhatnagar et al., AJ 170:246, 2025) on a scale about 10x larger (O(100) GPUs) than what has been attempted so far at NRAO (O(10) GPUs) to investigate the operational and computational challenges of distributed computing at this scale. For this, in collaboration with the Center for High Throughput Computing (CHTC, Univ. of Wisconsin-Madison, WI), we used a nation-wide network of computers in the Open Science Pool (OSPool), the U.S. National Science Foundation's Pathways for Advancing Throughput computing (PATh), the San Diego Supercomputer Center (SDSC) at the Univ. of California San Diego, and the National Research Platform (NRP). This produced the deepest image of the Hubble Ultra-Deep Field (HUDF) ever made with the VLA, and in the RA community world-wide, achieving a noise floor of 1 microJy/beam in the radio band.
- scientific code of algorithms for data calibration and image reconstruction
- a suite of standalone applications (apps) to configure and trigger the algorithms from the command line, and
- a build system to build the library of algorithms, the apps, and all the dependencies other than the System Requirements.
A containerized means of building the LibRA project is available here. This is mirrored here. Please create a ticket on GitHub to interact with us.
The main branch of this project is also mirrored here.
The src directory contains the implementation of the basic
calibration and imaging algorithms. The code has been derived from
the CASA project but contains only the algorithmically-significant
part of the much larger CASA code base. The code here can be
compiled into a standalone reusable software library. This
significantly simplifies the software stack and the resulting software
dependency graph (compared to the CASA software stack and
dependencies). A suite of
standalone applications is
also available, which can be built as relocatable Linux executables
(this may also be possible for MacOS, but we haven't tested it).
The resulting software stack is shown below. The figure on the left/top shows our current software stack, where the RA Algorithms layer is built on the RA-specific data-access and CASACore layers. Work is in progress to decouple the RA Algorithms layer from RA-specific layers, with the algorithms API based only on the C++ Standard Template Library (STL). With a translation layer, RA-specific libraries (CASACore, RA Data Access/Iterators) may be replaced for use of RA Algorithms in other domains.
libparafeed in the figures is a standalone library for embedded user interface used for command-line configuration of LibRA apps.
libhpg is a standalone library that deploys the compute-intensive calculations for imaging (such as the re-sampling of irregular data to and from a regular grid, a.k.a. "gridding" and "degridding" in RA jargon) on a GPU or a CPU core. This library is built on the Kokkos framework for a performance-portable implementation.
Standalone applications (apps) give access to algorithms via command-line options in the NAME=Val0[,Val1,...] format, or via an embedded interactive user interface. More detailed documentation for the user interfaces can be found via the following links:
- Full User Manual
- Commands in interactive mode
- Non-interactive mode (the help keyword setting)
- Customization
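As a rough illustration of the NAME=Val0[,Val1,...] format, the following shell sketch assembles such arguments from individual values. The helper kv and the keyword names imsize and cell are hypothetical placeholders chosen for illustration, not documented LibRA parameters; the help keyword of each app is the place to find the real ones.

```shell
#!/bin/sh
# kv NAME VAL0 [VAL1 ...]  ->  prints NAME=VAL0,VAL1,...
# The body runs in a subshell, so the IFS change does not leak out.
kv() (
    name=$1; shift
    IFS=,                 # "$*" joins the remaining arguments with commas
    printf '%s=%s\n' "$name" "$*"
)

# Hypothetical keyword names, for illustration only.
kv imsize 1024 1024
kv cell 0.5
```

Each printed string can then be passed as one argument on an app's command line.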
- roadrunner: An application to transform the data in a Measurement Set (MS) to an image. This can be deployed on a single CPU core, or on a GPU. This is a.k.a. the major cycle in RA.
- dale: An application to apply normalization to the weight, psf, residual and model images created with roadrunner and hummbee, and compute the primary beam.
- chip: An application to accumulate multiple images onto an output image (a.k.a. the "gather" operation in CS-speak).
- hummbee: An application to derive a model of the signal in the raw image (e.g., made using roadrunner). This is a.k.a. the minor cycle in RA.
- coyote: An application to build the CF Cache used as input to the roadrunner application.
- acme: An application to print/verify image statistics.
- tableinfo: An application to print a summary of the data (MS) and images (information from image headers).
- mssplit: An application to split data (in the MS format) along various axes of the data domain. The resulting data can be written as a deep-copy, or as a reference to the input database.
- subms: Functionally the same as mssplit but additionally re-normalizes the sub-tables in the resulting database.
- libra_htclean.sh: A script that implements the Algorithm Architecture and uses the apps as algorithmic components for imaging. This implements the iterative image reconstruction technique widely used in RA for derivative and model update calculations (the major cycle and minor cycle respectively). The execution graph can be deployed as a DAG on a CPU, a GPU, or on a cluster of CPUs/GPUs using the framework in template_PATh. This has been used to deploy the parallel imaging execution graph on a local cluster, and on the PATh and OSG clusters. A variant that uses LibRA apps as components has also been used for a prototype deployment on AWS.
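The alternation between the major and minor cycles can be sketched as a simple loop. The functions below are stand-ins that only print the name of the step they represent; they do not invoke the real apps. The function names, the placement of the normalization step, and the fixed iteration count are assumptions made for illustration, not a description of what libra_htclean.sh actually does.

```shell
#!/bin/sh
# Stand-in steps: each only reports which app would conceptually run.
major_cycle() { echo "cycle $1: major cycle (roadrunner), derivative calculation"; }
normalize()   { echo "cycle $1: normalization (dale)"; }
minor_cycle() { echo "cycle $1: minor cycle (hummbee), model update"; }

# run_cycles N: alternate the steps for N iterations.
# A fixed N is an assumed stopping rule; a real run would test convergence.
run_cycles() {
    i=1
    while [ "$i" -le "$1" ]; do
        major_cycle "$i"
        normalize "$i"
        minor_cycle "$i"
        i=$((i + 1))
    done
}

run_cycles 2
```

Keeping each step behind its own function mirrors the idea of using the apps as interchangeable components of the execution graph.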
The following come by default with RHEL8 or similar systems:
- GCC 8.x or later
- cmake 3.x or later
- git 2.0 or later, gcc-gfortran, gtest-devel, ccache
The ccache dependency can be dropped by passing -DUSE_CCACHE=OFF at cmake configure time. The cmake-based build is the supported route.
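Before configuring, it can help to confirm that the basic tools listed above are on the PATH. The sketch below is one assumed way of checking this; the helper names have and check_tools are made up for this example. It only tests for the presence of the executables and does not verify version numbers or the gtest-devel headers.

```shell
#!/bin/sh
# have CMD: succeed if CMD is found on the PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# check_tools CMD...: print each missing command; return non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! have "$tool"; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executables corresponding to the list above
# (ccache is left out since it is optional with -DUSE_CCACHE=OFF).
check_tools gcc cmake git gfortran || echo "install the missing tools before running cmake"
```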
The following packages need to be installed. A typical install command is:
- dnf -y install {readline,ncurses,blas,lapack,cfitsio,fftw,wcslib,gsl}-devel
If LIBRA_USE_LIBSAKURA=ON, also install the Eigen3 library:
- dnf -y install eigen3-devel
- An installation of the appropriate version of CUDA is also required for GPU support in the roadrunner app. This dependence is limited to the Kokkos and HPG libraries below. We used the following commands to install CUDA libraries for cross compilation. Your mileage may vary. Note that for only building the software, an actual GPU on the build-host is not necessary.
export distro=rhel8
export arch=x86_64
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-$distro.repo
sudo dnf clean expire-cache
sudo dnf module install nvidia-driver:latest-dkms
A clone of this repository provides the src directory with the scientific code (implementation of the RA algorithms), the apps/src directory with the source code for the standalone applications, and the top-level CMakeLists.txt file to compile the code, including build-time dependencies.
See the CUDA GPUs -- Compute Capability page for details of NVIDIA GPUs to determine the value for Kokkos_CUDA_ARCH_NAME.
git clone https://github.com/ARDG-NRAO/LibRA.git
cd LibRA
mkdir build
cd build
# A list of Kokkos CUDA ARCH_NAME can be found at Kokkos web page https://kokkos.github.io/kokkos-core-wiki/keywords.html#keywords-arch
# See also CUDA GPUs -- Compute Capability at https://developer.nvidia.com/cuda-gpus.
# Default behaviour is to determine CUDA ARCH automatically.
#
# Default behaviour is BUILD_TESTING=OFF. The legacy flag Apps_BUILD_TESTS
# is still honored as an alias.
cmake -DKokkos_CUDA_ARCH_NAME=<ARCH_NAME> -DBUILD_TESTING=OFF .. # set BUILD_TESTING=ON to build the test suite
# The build is set to run "make -j NCORES" internally, so run plain "make" below to avoid parallelizing make twice.
make
The binary standalone
applications will be installed
in libra/install/bin by default. Override with
-DCMAKE_INSTALL_PREFIX=<prefix> at cmake configure time.
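The configure-time options mentioned in this section can be combined in a single cmake invocation. The sketch below only prints such a command rather than running it; the wrapper function configure_cmd and the example install path are assumptions made for illustration, while the -D options themselves are the ones named above.

```shell
#!/bin/sh
# configure_cmd ARCH PREFIX: print a cmake command line combining the options
# described above. Nothing is executed here.
configure_cmd() {
    printf 'cmake -DKokkos_CUDA_ARCH_NAME=%s -DCMAKE_INSTALL_PREFIX=%s -DUSE_CCACHE=OFF -DBUILD_TESTING=OFF ..\n' "$1" "$2"
}

# <ARCH_NAME> is left as a placeholder; the prefix is an example path.
configure_cmd '<ARCH_NAME>' "$HOME/libra-install"
```

The printed line is meant to be run from inside the build directory, followed by a plain make.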
Configure with testing enabled, then build as above:
cmake -DKokkos_CUDA_ARCH_NAME=<ARCH_NAME> -DBUILD_TESTING=ON ..
make
Run the C++ test suite via CTest from the top of the build tree:
ctest --test-dir build/Libra -R '^test_' --output-on-failure
The -R '^test_' filter restricts execution to the gtest-based suite and skips legacy tests.
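To see what the -R '^test_' filter selects, the same anchored regular expression can be applied to a list of names with grep. The test names below are made up for illustration; the actual names are whatever the build registers with CTest (ctest -N lists them without running anything).

```shell
#!/bin/sh
# select_tests: keep only names that start with "test_", mimicking -R '^test_'.
select_tests() { grep -E '^test_'; }

# Hypothetical test names, one per line; only the first and last are kept.
printf '%s\n' test_alpha legacy_test_beta test_gamma | select_tests
```

Because the expression is anchored with ^, a name that merely contains test_ somewhere in the middle is not selected.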
- Container recipes (Docker and Singularity) for LibRA live in
scripts/container_recipes/.
- An app for (self-) calibration
- Make a top-level cmake file.
- A simple framework to run coyote on multiple cores/nodes for the mode=fillcf setting. A slurm-based framework is in place. A GNU Parallel based one may also be useful.
- Implement a mode in the coyote app to list the specific CFs from the CFC which would be required for the given MS and settings. Such a list can be used by other components of the algorithm architecture to make a potentially smaller sub-CFC, especially when the given MS is a partition of a larger database being imaged.