This repository contains links and sample R code to help you parse NIH project data.
The raw data in .CSV format is rather large, so it is stored separately on Google Drive: https://drive.google.com/drive/folders/1iE3hYTTO7IXaBadpOJT9wL1VmBLJ3Wpc?usp=sharing
The original data can be found on the NIH website at the following URL: https://reporter.nih.gov/exporter/projects
The data dictionary, defining each column in the CSV, is available here: https://report.nih.gov/exporter-data-dictionary
So that you can read it locally, I've also copied the data dictionary into this repo as the text file named NIH_RePORTER_Project_Data_Dictionary.
If you want to try out the R code, you'll need to download the individual .CSV files of RePORTER data, either from the Google Drive folder above or from the NIH website. Each .CSV file represents one year of data. I recommend making a folder structure like this:
Top level folder: NIH_Data
Folder within that top level folder: data
In that data folder, place your downloaded .CSV files. The script combines these files into a single data frame, which you can then filter and visualize as charts.
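The combining step can be sketched roughly as follows. This is a minimal illustration in base R, not necessarily how the script in this repo does it. It assumes your working directory is the top-level NIH_Data folder, and the function name combine_reporter_csvs is made up for this example. Because the yearly files may not all have the same columns, the sketch keeps the union of all columns and fills the gaps with NA.

```r
# Hypothetical helper: read every .CSV in a folder and stack them into one data frame.
combine_reporter_csvs <- function(data_dir = "data") {
  csv_files <- list.files(data_dir, pattern = "\\.csv$",
                          full.names = TRUE, ignore.case = TRUE)
  if (length(csv_files) == 0) {
    stop("No .CSV files found in ", data_dir)
  }

  # Read every column as character so types cannot clash between years.
  yearly <- lapply(csv_files, read.csv,
                   colClasses = "character", check.names = FALSE)

  # Add any column a given year is missing, then put columns in a common order.
  all_cols <- unique(unlist(lapply(yearly, names)))
  yearly <- lapply(yearly, function(df) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- NA_character_
    }
    df[all_cols]
  })

  do.call(rbind, yearly)
}

# Assumption: the working directory is NIH_Data, so the files are in ./data.
if (dir.exists("data")) {
  projects <- combine_reporter_csvs("data")
}
```

Reading everything as character is a deliberate simplification; numeric columns can be converted later, once the files are stacked.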
The sample code will help you do some basic data cleanup, such as combining the yearly RePORTER .CSV files into a single data frame and splitting date columns into separate year, month, and day columns to make them easier to work with.
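Here is a rough sketch of the date-splitting idea in base R. The helper name split_date_column is made up for this example, and both the column name PROJECT_START and the MM/DD/YYYY date format are assumptions; check the data dictionary and your own files before relying on them.

```r
# Hypothetical helper: add <column>_YEAR, <column>_MONTH and <column>_DAY
# columns next to an existing date column stored as text.
split_date_column <- function(df, column, date_format = "%m/%d/%Y") {
  # Values that do not match date_format (including empty strings) become NA.
  parsed <- as.Date(df[[column]], format = date_format)
  df[[paste0(column, "_YEAR")]]  <- as.integer(format(parsed, "%Y"))
  df[[paste0(column, "_MONTH")]] <- as.integer(format(parsed, "%m"))
  df[[paste0(column, "_DAY")]]   <- as.integer(format(parsed, "%d"))
  df
}

# Assumption: a combined data frame named `projects` already exists
# and has a PROJECT_START column.
if (exists("projects")) {
  projects <- split_date_column(projects, "PROJECT_START")
}
```

The original date column is left in place, so nothing is lost if the assumed format turns out to be wrong for some rows.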
I recommend running this code line by line in RStudio, which is what I did as I wrote it and explored this dataset.
I have included a few example plots in this repository, showing how the charts generated by this code look for a specific state.
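To give a feel for the filter-then-chart workflow, here is one possible sketch in base R. It is only an illustration and may not match the charts in this repo. The helper name total_cost_by_year is made up, and the column names ORG_STATE, FY, and TOTAL_COST are assumptions that you should confirm against the data dictionary.

```r
# Hypothetical helper: total funding per fiscal year for one state.
total_cost_by_year <- function(projects, state) {
  in_state <- projects[!is.na(projects$ORG_STATE) & projects$ORG_STATE == state, ]
  # Costs may have been read as text, so convert before summing.
  in_state$TOTAL_COST <- as.numeric(in_state$TOTAL_COST)
  # Rows with a missing TOTAL_COST are dropped by aggregate()'s default na.action.
  aggregate(TOTAL_COST ~ FY, data = in_state, FUN = sum)
}

# Assumption: a combined data frame named `projects` already exists.
# "MD" is just an example state code.
if (exists("projects")) {
  by_year <- total_cost_by_year(projects, "MD")
  barplot(by_year$TOTAL_COST, names.arg = by_year$FY,
          xlab = "Fiscal year", ylab = "Total cost",
          main = "Total cost by fiscal year")
}
```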