Welcome to the RAP (Reproducible Analytical Pipeline) demonstration repository! This repository is designed for beginners to practice RAP principles, experiment with code, and learn best practices for reproducible, automated, and transparent analytical pipelines in Python.
This repository is still in development.
- Fork the repository:
  - Go to the GitHub page for this repository.
  - Click the "Fork" button in the top right to create your own copy.
  - Clone your forked repository:

        git clone https://github.com/<your-username>/python_rap_demo.git
        cd python_rap_demo

- Set up your environment:
  - Create and activate a virtual environment. The activation command shown is for Windows; on macOS or Linux use `source .venv/bin/activate` instead:

        python -m venv .venv
        .venv\Scripts\activate

  - Install dependencies:

        pip install -r requirements.txt

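If you are unsure whether the virtual environment is active before installing dependencies, a quick check from Python can help. The snippet below is a minimal sketch and not part of this repository; the function name `in_virtual_environment` is hypothetical. It relies on the fact that, for environments created with `python -m venv`, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; activate .venv first.")
```

Running this after activation should report the path of your `.venv` folder; if it does not, re-run the activation command for your operating system.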
The repository contains the following folders:
- src/: Main pipeline code and modules
- data/: Example health data for analysis
- config/: Configuration files (YAML)
- tests/: Unit tests for pipeline modules
- exercises/: Practice exercises (see below)
- docs/: Documentation
To run the main RAP pipeline, open a terminal in your project root and enter:

    python src/main.py

This will:
- Load configuration from user_config.yaml
- Read input data from health_data.csv
- Clean and process the data
- Write outputs and generate a Markdown report in the outputs folder

You should see a message confirming the report was generated.
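The steps above (load configuration, read data, clean, report) can be sketched in a few small functions. This is an illustrative outline only and does not reproduce the code in src/: the function names, the `CONFIG` dictionary standing in for the YAML file, and the `region` and `value` columns are all assumptions made for the example, and the real health_data.csv may look quite different. It uses only the standard library so it runs without extra dependencies.

```python
import csv
import io
from statistics import mean

# Hypothetical stand-in for settings a YAML config file might provide.
CONFIG = {"value_column": "value", "group_column": "region"}

# Hypothetical sample data; the real health_data.csv columns may differ.
SAMPLE_CSV = """region,value
North,10
North,
South,30
South,50
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows, value_column):
    """Drop rows with a missing value and convert the rest to float."""
    cleaned = []
    for row in rows:
        raw = (row.get(value_column) or "").strip()
        if not raw:
            continue  # skip rows where the value is missing
        cleaned.append({**row, value_column: float(raw)})
    return cleaned


def summarise(rows, group_column, value_column):
    """Return the mean value per group, ordered by group name."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_column], []).append(row[value_column])
    return {name: mean(values) for name, values in sorted(groups.items())}


def render_report(summary):
    """Build a small Markdown table from the summary dictionary."""
    lines = ["# Summary report", "", "| Group | Mean |", "| --- | --- |"]
    lines += [f"| {name} | {value:.1f} |" for name, value in summary.items()]
    return "\n".join(lines)


def run_pipeline(text, config):
    """Chain the steps: read, clean, summarise, render."""
    rows = clean_rows(read_rows(text), config["value_column"])
    summary = summarise(rows, config["group_column"], config["value_column"])
    return render_report(summary)


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_CSV, CONFIG))
```

Keeping each step in its own function, with the settings passed in rather than hard-coded, is what makes a pipeline like this easy to test and rerun.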
Explore the existing code and add your own to the src/ folder.
All exercises for RAP learning are in the exercises/ folder. These are not part of the main pipeline, but are for practice and experimentation.
- Each exercise has its own subfolder and README with instructions.
- Work through exercises to learn how to:
- Add new modules
- Use config files
- Write unit tests
- Set up and customize pre-commit hooks
- Apply RAP principles in real code
Do not edit files in src/ unless instructed by an exercise.
Information about different files and folders can be found throughout the pipeline:
- Files: Each file explains, within the file itself, what it is and how it is used in a RAP. The exception is .secrets.baseline, which is documented in the docs folder.
- Folders: Each contains a README explaining what the folder is for and the typical files it contains.
- Scripts: Fully documented with docstrings and comments.
Test your functions by adding tests to the tests/ folder.
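As a rough idea of what such a test file might look like, here is a minimal sketch. The function `calculate_rate` and the file name `tests/test_rates.py` are hypothetical and are not taken from this repository; in a real test you would import the function from a module in src/ instead of defining it in the test file. pytest collects any function whose name starts with `test_` and treats a failed `assert` as a test failure.

```python
# Hypothetical contents of tests/test_rates.py


def calculate_rate(count, population, per=1000):
    """Return the number of events per `per` people (example function)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * per / population


def test_calculate_rate_returns_rate_per_thousand():
    # 5 events in a population of 1000 is 5 per 1000.
    assert calculate_rate(5, 1000) == 5.0


def test_calculate_rate_rejects_zero_population():
    # A plain try/except keeps this sketch free of imports;
    # with pytest installed, pytest.raises(ValueError) is the usual form.
    try:
        calculate_rate(5, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for zero population")
```

Each test checks one behaviour and has a descriptive name, so a failure points directly at what went wrong.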
Run tests with:

    pytest tests

This repo is for learning and experimentation. If you want to contribute improvements, please read CONTRIBUTING.md.
AI has been used in the production of this content.
Happy RAP coding!