An AI-powered blood test analysis tool that extracts biomarkers from lab reports (PDF/images), matches them against a knowledge base, and provides normalized results with reference range assessments.
- Multi-format Support - Analyze PDFs and images (PNG, JPG, etc.)
- AI-Powered Extraction - Uses vision models to extract biomarker data from scanned reports
- Smart Matching - Exact alias matching + AI disambiguation for fuzzy matches
- Auto-Research - Automatically researches unknown biomarkers via web search
- Unit Conversion - Converts units to canonical formats using safe expression evaluation
- Demographic Ranges - Adjusts reference ranges based on age and sex
- Granular Status - Distinguishes low/high, optimal, moderate, and elevated
- Growing Knowledge Base - Learns new biomarkers and saves them for future analyses
flowchart TD
    A[Input: PDF/Image] --> B[Convert to Images]
    B --> C[AI Vision Extraction]
    C --> D{Exact Alias Match?}
    D -->|Yes| G[Match Found]
    D -->|No| E[Get Fuzzy Candidates]
    E --> F{AI Disambiguation}
    F -->|Match| G
    F -->|Research| H[Web Research Agent]
    F -->|Unknown| I[Skip/Mark Unknown]
    H -->|Success| J[Save to DB]
    J --> G
    H -->|Fail| I
    G --> K[Unit Conversion]
    K --> L[Range Assessment]
    L --> M[Final Report]
    I --> M
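The first two matching steps in the flow above (exact alias match, then fuzzy candidates) can be sketched roughly as follows. This is an illustration and not the project's actual code: the `BIOMARKERS` dictionary, the function names, and the use of `difflib` for fuzzy scoring are all assumptions, and the AI disambiguation step that would choose among the candidates is left out.

```python
from difflib import get_close_matches

# Hypothetical in-memory view of the knowledge base: biomarker id -> aliases.
BIOMARKERS = {
    "total_cholesterol": ["COLESTEROL TOTAL", "cholesterol, total", "TC"],
    "hdl_cholesterol": ["COLESTEROL HDL", "HDL"],
    "ldl_cholesterol": ["LDL", "LDL cholesterol"],
}


def build_alias_index(biomarkers):
    """Map each normalized (trimmed, lowercased) alias to its biomarker id."""
    index = {}
    for biomarker_id, aliases in biomarkers.items():
        for alias in aliases:
            index[alias.strip().lower()] = biomarker_id
    return index


def match_label(label, index, max_candidates=3, cutoff=0.6):
    """Return ("exact", id) on an alias hit, else ("candidates", [ids])."""
    key = label.strip().lower()
    if key in index:
        return "exact", index[key]
    close = get_close_matches(key, list(index), n=max_candidates, cutoff=cutoff)
    # Deduplicate ids while keeping the best-match-first ordering.
    candidates = list(dict.fromkeys(index[alias] for alias in close))
    return "candidates", candidates


index = build_alias_index(BIOMARKERS)
print(match_label("colesterol total", index))
print(match_label("Colesterol HDL-C", index))
```

An empty candidate list would correspond to the "Research" or "Unknown" branches of the flowchart.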
- Python 3.13+
- uv (recommended) or pip
- Gemini API key
- Poppler (for PDF processing)
# Install poppler for PDF support
brew install poppler
# Clone the repository
git clone https://github.com/yourusername/open-blood-analysis.git
cd open-blood-analysis
# Install with uv (recommended)
uv sync
# Or with pip
pip install -e .
Create a .env file in the project root:
GEMINI_API_KEY=your-gemini-api-key
AI_MODEL=gemini-3.1-flash-lite-preview
# Optional task-specific overrides:
# AI_OCR_MODEL=gemini-3.1-flash-lite-preview
# AI_RESEARCH_MODEL=gemini-3.1-flash-lite-preview
# AI_THINKING_MODEL=gemini-3.1-flash-lite-preview
# BIOMARKERS_PATH=biomarkers.json
Or export variables directly:
export GEMINI_API_KEY="your-gemini-api-key"
# Analyze a PDF report
uv run blood-analysis report.pdf
# Analyze an image
uv run blood-analysis scan.png
# With debug output
uv run blood-analysis report.pdf --debug
# Adjust reference ranges for sex and age
uv run blood-analysis report.pdf --sex female --age 35
# Save as JSON
uv run blood-analysis report.pdf --output results.json
# Save as CSV
uv run blood-analysis report.pdf --output results.csv
# Use an alternate biomarkers database file for this run
uv run blood-analysis report.pdf --biomarkers-path data/biomarkers.v2.json
# Only match against existing database
uv run blood-analysis report.pdf --no-research
# Refresh an existing entry (or add if not found)
uv run blood-analysis --reresearch-biomarker apolipoprotein_b
# Provide explicit unit context for the research prompt
uv run blood-analysis --reresearch-biomarker thyroid_stimulating_hormone --reresearch-unit "uIU/mL"
# Preview researched JSON without writing the configured biomarkers DB file
uv run blood-analysis --reresearch-biomarker rdw --dry-run-reresearch
                                        Analysis Results
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Biomarker        ┃ Value ┃ Unit  ┃ Reference ┃ Optimal ┃ Peak ┃ Status   ┃ ID                ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ COLESTEROL TOTAL │ 145.0 │ mg/dL │ 0 - 199   │ 130-160 │ -    │ elevated │ total_cholesterol │
│ COLESTEROL HDL   │ 40.0  │ mg/dL │ 40 - 80   │ 55-70   │ -    │ moderate │ hdl_cholesterol   │
│ LDL              │ 101.0 │ mg/dL │ 0 - 100   │ 55-85   │ -    │ high     │ ldl_cholesterol   │
└──────────────────┴───────┴───────┴───────────┴─────────┴──────┴──────────┴───────────────────┘
The tool maintains a biomarkers.json file that grows as you analyze more reports. Each entry includes:
{
"id": "total_cholesterol",
"aliases": ["COLESTEROL TOTAL", "cholesterol, total", "TC"],
"canonical_unit": "mmol/L",
"description": "Total cholesterol in blood",
"value_type": "quantitative",
"enum_values": null,
"min_normal": null,
"max_normal": 5.18,
"min_optimal": 3.6,
"max_optimal": 4.8,
"peak_value": null,
"molar_mass_g_per_mol": 386.65,
"conversions": {},
"reference_rules": [
{"condition": "age > 60", "max_normal": 6.2, "priority": 1}
],
"source": "research-agent-gemini"
}
The conversion engine handles generic concentration scaling automatically:
- Mass scaling: mg/dL <-> g/L, ng/mL <-> ug/L, etc.
- Molar scaling: mmol/L <-> umol/L, etc.
- Mass↔molar conversion when molar_mass_g_per_mol is provided.
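As a rough illustration of the three generic cases above, the sketch below converts between a handful of units. The unit tables and the `convert` function are assumptions made for this example; the tool's own conversion engine may be organized differently and support more units.

```python
# Multipliers to a base unit per family (g/L for mass, mol/L for molar).
# These tables are illustrative and deliberately incomplete.
MASS_TO_G_PER_L = {
    "g/L": 1.0, "g/dL": 10.0, "mg/dL": 0.01,
    "mg/L": 1e-3, "ug/L": 1e-6, "ng/mL": 1e-6,
}
MOLAR_TO_MOL_PER_L = {"mol/L": 1.0, "mmol/L": 1e-3, "umol/L": 1e-6, "nmol/L": 1e-9}


def convert(value, from_unit, to_unit, molar_mass_g_per_mol=None):
    """Convert a concentration between mass and/or molar units."""
    if from_unit == to_unit:
        return value
    for unit in (from_unit, to_unit):
        if unit not in MASS_TO_G_PER_L and unit not in MOLAR_TO_MOL_PER_L:
            raise ValueError(f"Unsupported unit: {unit}")

    from_is_mass = from_unit in MASS_TO_G_PER_L
    to_is_mass = to_unit in MASS_TO_G_PER_L

    if from_is_mass and to_is_mass:  # mass scaling
        return value * MASS_TO_G_PER_L[from_unit] / MASS_TO_G_PER_L[to_unit]
    if not from_is_mass and not to_is_mass:  # molar scaling
        return value * MOLAR_TO_MOL_PER_L[from_unit] / MOLAR_TO_MOL_PER_L[to_unit]

    # Crossing between mass and molar units needs the molar mass.
    if molar_mass_g_per_mol is None:
        raise ValueError("molar_mass_g_per_mol is required for mass<->molar conversion")
    if from_is_mass:
        mol_per_l = value * MASS_TO_G_PER_L[from_unit] / molar_mass_g_per_mol
        return mol_per_l / MOLAR_TO_MOL_PER_L[to_unit]
    g_per_l = value * MOLAR_TO_MOL_PER_L[from_unit] * molar_mass_g_per_mol
    return g_per_l / MASS_TO_G_PER_L[to_unit]


# Total cholesterol, using the molar mass from the example entry.
print(round(convert(200, "mg/dL", "mmol/L", molar_mass_g_per_mol=386.65), 2))
```

Going through a single base unit per family keeps the table small: each unit needs one factor rather than one factor per unit pair.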
Per-biomarker conversions entries are only for special transforms that the generic scaling does not cover.
Special formulas use safe expression evaluation with simpleeval:
- x represents the input value
- Example: "mg/dL": "x / 38.67" converts mg/dL to mmol/L
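To show what "safe expression evaluation" means here without requiring a third-party install, the sketch below uses a small standard-library stand-in rather than simpleeval itself. It accepts only numbers, the variable `x`, and basic arithmetic. The `evaluate_formula` name is hypothetical. With simpleeval, the equivalent call would be along the lines of `simple_eval(formula, names={"x": value})`.

```python
import ast
import operator

# Whitelisted operators; anything else (calls, attributes, etc.) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_formula(formula, x):
    """Evaluate an arithmetic formula in which `x` is the input value."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.Name) and node.id == "x":
            return x
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(formula, mode="eval"))


# The example formula from above: mg/dL to mmol/L.
print(round(evaluate_formula("x / 38.67", 200.0), 2))
```

Unlike `eval`, a formula such as `__import__('os')` fails with a `ValueError` because function calls are not on the whitelist.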
Reference ranges can be customized by demographics:
- Conditions: sex == male, sex == female, age > 50, age < 18
- Higher priority rules override lower ones
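One way such rules could be resolved is sketched below. It assumes conditions always have the form `field operator value`, that a larger `priority` number means higher priority, and that a rule only overrides the bounds it specifies. None of this is confirmed by the project, and `condition_matches` and `resolve_range` are hypothetical names.

```python
import operator

_COMPARATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}


def condition_matches(condition, demographics):
    """Check a condition such as "age > 60" or "sex == female"."""
    field, op, raw_value = condition.split()
    if op not in _COMPARATORS:
        raise ValueError(f"Unsupported operator: {op}")
    actual = demographics.get(field)
    if actual is None:
        return False  # Unknown demographic: the rule does not apply.
    expected = float(raw_value) if field == "age" else raw_value
    return _COMPARATORS[op](actual, expected)


def resolve_range(entry, demographics):
    """Start from the base range, then apply matching rules by priority."""
    resolved = {
        "min_normal": entry.get("min_normal"),
        "max_normal": entry.get("max_normal"),
    }
    # Ascending order, so higher-priority rules are applied last and win.
    rules = sorted(entry.get("reference_rules", []), key=lambda r: r.get("priority", 0))
    for rule in rules:
        if condition_matches(rule["condition"], demographics):
            for key in ("min_normal", "max_normal"):
                if key in rule:
                    resolved[key] = rule[key]
    return resolved


entry = {
    "min_normal": None,
    "max_normal": 5.18,
    "reference_rules": [{"condition": "age > 60", "max_normal": 6.2, "priority": 1}],
}
print(resolve_range(entry, {"sex": "female", "age": 65}))
print(resolve_range(entry, {"sex": "female", "age": 35}))
```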
openblood/
├── main.py          # CLI entry point
├── config.py        # Configuration management
├── loader.py        # PDF/image ingestion
├── llm.py           # Vision model extraction
├── database.py      # Biomarker DB operations
├── agent.py         # AI disambiguation & research
├── logic.py         # Unit conversion & analysis
└── types.py         # Pydantic models
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - see LICENSE for details.