This Python-based tool lets users calculate a range of data privacy metrics on tabular data from a graphical user interface.
- K-anonymity [1]
- ℓ-diversity [2]
- Sample Unique Detection Algorithm (SUDA) [3]
- Personal Information Factor (PIF) [4]
- Noise addition
- Field generalisation
- Rounded Approximation
Input can be in either CSV or TSV format. For meta information, an option to load a JSON file is available.
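As a rough illustration of what the first two metrics measure, the sketch below computes k-anonymity and distinct ℓ-diversity on a tiny in-memory CSV using only the standard library. It is not the MetaprivBIDS implementation: the function names, column names, and sample rows are all made up for the example. For TSV input, `csv.DictReader` would take `delimiter="\t"`.

```python
import csv
import io
from collections import Counter, defaultdict


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[c] for c in quasi_identifiers) for row in rows)
    return min(groups.values())


def distinct_l_diversity(rows, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any such group."""
    values = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in quasi_identifiers)
        values[key].add(row[sensitive])
    return min(len(v) for v in values.values())


# Hypothetical sample data, already generalised into age bands.
SAMPLE = """age,sex,diagnosis
30-39,F,flu
30-39,F,asthma
40-49,M,flu
40-49,M,flu
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
k = k_anonymity(rows, ["age", "sex"])
l = distinct_l_diversity(rows, ["age", "sex"], "diagnosis")
print("k-anonymity:", k)
print("distinct l-diversity:", l)
```

Here every quasi-identifier combination occurs twice, so k is 2, but the second group contains only one diagnosis, so the table is only 1-diverse.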
- Python 3.7+ (tested with Python 3.13)
- Conda environment (recommended)
- Install R: Download and install R from CRAN
- Install Rtools: Download and install Rtools from CRAN Rtools
- Rtools provides the necessary build tools (make, gcc) required for compiling R packages
- Make sure to add Rtools to your system PATH during installation
- Install required R packages: Open R or RStudio and run:
install.packages("sdcMicro")
- Set R_HOME environment variable (if needed):
set R_HOME=C:\Program Files\R\R-4.5.1
Why this is needed: The metaprivBIDS package depends on rpy2 which requires R and build tools to compile properly on Windows. Additionally, the sdcMicro R package is required for privacy analysis functionality. Without Rtools, you'll get "make: command not found" errors.
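Before installing, it can help to confirm what the current environment actually sees. The following is a minimal sketch using only the standard library; the function name `check_r_prerequisites` is hypothetical and not part of MetaprivBIDS. It reports whether `R_HOME` is set, whether `R` and `make` are on the PATH, and whether `rpy2` is importable, without importing `rpy2` itself.

```python
import importlib.util
import os
import shutil


def check_r_prerequisites():
    """Collect a few facts about the local R toolchain (hypothetical helper)."""
    return {
        "R_HOME": os.environ.get("R_HOME"),          # None if the variable is unset
        "R": shutil.which("R"),                      # path to the R executable, or None
        "make": shutil.which("make"),                # provided by Rtools on Windows
        "rpy2_installed": importlib.util.find_spec("rpy2") is not None,
    }


report = check_r_prerequisites()
for name, value in report.items():
    print(f"{name}: {value}")
```

A `None` next to `make` or `R` suggests the PATH step above was skipped or the shell needs restarting.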
First, activate your conda environment:
conda activate your-env-name  # or source your-venv/bin/activate
Navigate to the MetaprivBIDS directory and run the interactive installer:
cd metaprivBIDS
python install.py
The installer will:
- ✅ Fix pkg_resources deprecation warning (automatically pins setuptools<81 if needed)
- ✅ Install Qt dependencies (attempts to resolve GUI compatibility issues)
- ❓ Ask about pygraphviz (optional package for advanced graph visualization)
On some systems, the Qt GUI may hang due to missing X11/XCB libraries or platform compatibility issues. The installer provides several launch options:
Use the intelligent launcher that tests Qt compatibility and provides fallbacks:
python run_metaprivBIDS_safe.py
This launcher will:
- Test CLI functionality first
- Ask if you want to try the GUI
- Test Qt compatibility with timeouts
- Provide fallback to CLI mode if GUI fails
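The general pattern behind "test with a timeout, then fall back" can be sketched as follows. This is an illustration of the idea only, not the code in `run_metaprivBIDS_safe.py`; the helper name and the stand-in commands are assumptions. A real check would launch the GUI entry point instead of the `sleep` placeholder, optionally with a modified environment.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds, env=None):
    """Return True if the command exits cleanly in time, False if it fails or hangs."""
    try:
        completed = subprocess.run(
            command, timeout=timeout_seconds, capture_output=True, env=env
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return False
    return completed.returncode == 0


# Stand-ins: one command that finishes at once, one that simulates a hung GUI.
quick = [sys.executable, "-c", "print('ok')"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]

gui_ok = run_with_timeout(slow, timeout_seconds=1)
mode = "gui" if gui_ok else "cli"
print("Selected mode:", mode)
```

Passing something like `env=dict(os.environ, QT_QPA_PLATFORM="offscreen")` would let the same helper probe an alternative Qt platform.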
If you only need the core functionality without GUI:
python test_cli.py
For direct GUI access (use with caution):
python run_metaprivBIDS.py
If this hangs, use Ctrl+C to interrupt and try the safe launcher.
If the GUI hangs, try these Qt platform alternatives:
# Minimal platform (no visual output but functional)
QT_QPA_PLATFORM=minimal python run_metaprivBIDS.py
# Offscreen platform (for headless servers)
QT_QPA_PLATFORM=offscreen python run_metaprivBIDS.py
After following the installation guide, the metrics within the MetaprivBIDS tool can be called through an import statement without making use of the GUI.
from metaprivBIDS.corelogic.metapriv_corelogic import metaprivBIDS_core_logic
metapriv = metaprivBIDS_core_logic()
# Load the data
data_info = metapriv.load_data('Use_Case_Data/adult_mini.csv')
# Inspect {column, unique value count, column type}
data = data_info["data"]
print("Column Types:", '\n')
print(data_info["column_types"], '\n')
# Select Quasi-Identifiers
selected_columns = ["age", "education", "marital-status", "occupation", "relationship", "sex", "salary-class"]
results_k_global = metapriv.find_lowest_unique_columns(data, selected_columns)
print('Find Influential Columns:', '\n')
print(results_k_global)
# Compute Personal Information Factor
pif_value, cig_df = metapriv.compute_cig(data, selected_columns)
print("PIF Value:", pif_value)
print("CIG DataFrame:")
print(cig_df)
# Run SUDA2 computation
results_suda = metapriv.compute_suda2(data, selected_columns, sample_fraction=0.3, missing_value=-999)
# Access results
data_with_scores = results_suda["data_with_scores"]
attribute_contributions = results_suda["attribute_contributions"]
attribute_level_contributions = results_suda["attribute_level_contributions"]
To run tests, navigate to the tests folder and activate your environment:
cd tests
conda activate your-env-name # or source your-venv/bin/activate
python test_metaprivBIDS_core_logic.py
Note: Install pytest if needed: pip install pytest
UserWarning: pkg_resources is deprecated as an API
Solution: The installer automatically fixes this by pinning setuptools<81. If you see this warning, run:
pip install "setuptools<81"
qt.qpa.plugin: Could not load the Qt platform plugin "xcb"
This plugin does not support propagateSizeHints()
Solutions:
- Use the safe launcher:
python run_metaprivBIDS_safe.py
- Use CLI-only mode:
python test_cli.py
- Try platform fallbacks:
QT_QPA_PLATFORM=minimal python run_metaprivBIDS.py
QT_QPA_PLATFORM=offscreen python run_metaprivBIDS.py
If the GUI hangs, press Ctrl+C to interrupt and use:
- Safe launcher:
python run_metaprivBIDS_safe.py
- CLI mode:
python test_cli.py
If you get module import errors, ensure:
- Your conda environment is activated
- The package is installed:
python install.py
- You're in the correct directory
For running tests:
pip install pytest
cd tests
python test_metaprivBIDS_core_logic.py
If you cannot install system packages (for example, because you lack sudo access), the CLI mode will work without additional system dependencies.
Error: make: command not found or R was not built as a library
Cause: Missing R installation or Rtools build tools on Windows.
Solutions:
- Install R: Download from CRAN
- Install Rtools: Download from CRAN Rtools
- Add Rtools to PATH: During Rtools installation, check "Add to PATH"
- Set R_HOME environment variable:
set R_HOME=C:\Program Files\R\R-4.5.1
- Alternative: Use pre-compiled rpy2:
pip uninstall rpy2
pip install --only-binary=:all: rpy2
- Alternative: Use conda for rpy2:
conda install -c conda-forge rpy2
Note: If rpy2 is causing issues and isn't critical for your use case, you may be able to skip R-related functionality.
Error: The R package "sdcMicro" is not installed
Cause: Required R package for privacy analysis functionality is missing.
Solutions:
- Install via R console:
install.packages("sdcMicro")
- Install via RStudio: Open RStudio and run the same command
- Install from R command line:
R -e "install.packages('sdcMicro')"
- If installation fails, try installing dependencies first:
install.packages(c("VIM", "robustbase", "cluster"))
install.packages("sdcMicro")
Note: The sdcMicro package is essential for k-anonymity, l-diversity, and other privacy metrics. The application may not function properly without it.
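To check from Python whether R can find sdcMicro, without going through rpy2, one option is to ask `Rscript` directly. The sketch below is an assumption-laden example rather than part of MetaprivBIDS: the helper name is hypothetical, and it relies on `Rscript` being on the PATH. It uses R's `requireNamespace`, which returns `TRUE` or `FALSE` without raising an error.

```python
import shutil
import subprocess


def sdcmicro_installed():
    """Return True/False as reported by R, or None if Rscript is not on PATH."""
    rscript = shutil.which("Rscript")
    if rscript is None:
        return None
    result = subprocess.run(
        [rscript, "-e", 'cat(requireNamespace("sdcMicro", quietly = TRUE))'],
        capture_output=True,
        text=True,
        timeout=120,
    )
    # R prints TRUE or FALSE; anything else is treated as "not installed".
    return result.stdout.strip().endswith("TRUE")


status = sdcmicro_installed()
if status is None:
    print("Rscript not found; install R or add it to PATH")
elif status:
    print("sdcMicro is available")
else:
    print("sdcMicro is missing; run install.packages('sdcMicro') in R")
```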
For additional support:
- Use CLI mode for core functionality:
python test_cli.py
- Check the debug script:
python debug_test.py
- The core logic is fully functional without GUI dependencies
Footnotes
1. Sweeney, L. (2002). k-Anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(05), 557-570.
2. Machanavajjhala, A., Kifer, D., Gehrke, J., & Venkitasubramaniam, M. (2007). ℓ-Diversity: Privacy Beyond k-Anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD), 1(1), 3-es.
3. Elliott, M. J., & Skinner, C. J. (2000). Identifying population uniques using limited information. Proceedings of the Annual Meeting of the American Statistical Association.
4. Information Governance ANZ. (2019). Privacy Impact Assessment eReport.