A complete end-to-end pipeline for extracting structured data from chart and graph images, based on the PlotQA paper ("PlotQA: Reasoning over Scientific Plots" by Nitesh Methani et al., 2020).
# Install dependencies
pip install -r requirements.txt
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder detectron2==0.6+18f6958pt2.8.0cu128
# Download model weights (see Installation section)
# Then run:
python calculate_visual_values.py --image your_chart.png --model models/ved/model_final.pkl --confidence 0.05 --use-caffe2 --debug

This pipeline takes chart images (bar charts, line graphs, scatter plots) as input and outputs structured JSON data containing:
- Chart type and title
- Axis labels and tick values
- Data series with extracted values
- Legend information
- Visual element values (heights, lengths, positions)
Key Features:
- Caffe2 Compatibility: Full compatibility with original PlotQA Caffe2 models through specialized detectors
- Dual Architecture Support: Both Caffe2-compatible and exact Caffe2 architecture replication
- Enhanced Output: Provides both CSV (original format) and JSON (structured) outputs
- Visual Value Calculation: Calculates numerical values for visual elements (bars, lines, dots)
- Debug Mode: Comprehensive debugging and visualization tools
The pipeline consists of four main stages:
- VED (Visual Element Detection): Uses specialized Caffe2-compatible detectors for original PlotQA models
- OCR (Optical Character Recognition): Extracts text from detected regions using Tesseract
- SIE (Structural Information Extraction): Builds structured data from detections and OCR results
- Visual Value Calculation: Computes numerical values for visual elements using coordinate-based scaling
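The value calculation in the final stage is essentially a linear interpolation between OCR'd y-axis ticks and their pixel positions. Below is a minimal sketch of that idea; the function name and inputs are illustrative, not the pipeline's actual `find_visual_values` API.

```python
# Illustrative only: maps a bar's top pixel coordinate to a data value by linear
# interpolation between two OCR'd y-axis ticks. The real pipeline also handles
# horizontal bars, multiple ticks, lines and dots.
def pixel_to_value(pixel_y, tick_a, tick_b):
    """tick_a/tick_b are (pixel_y, numeric_value) pairs read from yticklabels."""
    (py_a, val_a), (py_b, val_b) = tick_a, tick_b
    # Image y-coordinates grow downward, so the slope is typically negative.
    scale = (val_b - val_a) / (py_b - py_a)
    return val_a + (pixel_y - py_a) * scale

# Example: tick "0" at y=400px and tick "20" at y=200px; a bar whose top edge
# sits at y=248px corresponds to roughly 15.2 on the data axis.
value = pixel_to_value(248, (400, 0.0), (200, 20.0))
print(round(value, 1))  # 15.2
```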
Refer to the Installation section for full setup instructions.
# Basic chart processing
python process_chart.py --image your_chart.png --model models/ved/model_final.pkl --use-caffe2
# Visual values calculation with debug
python calculate_visual_values.py --image your_chart.png --model models/ved/model_final.pkl --confidence 0.05 --use-caffe2 --debug
# With KMP fix for Windows
$env:KMP_DUPLICATE_LIB_OK="TRUE"; python calculate_visual_values.py --image your_chart.png --model models/ved/model_final.pkl --confidence 0.05 --use-caffe2 --debug

# Test with PlotQA dataset (if downloaded)
python process_chart.py --image data/plotqa/VAL/png/18458.png --model models/ved/model_final.pkl --use-caffe2
# Test visual values calculation
python calculate_visual_values.py --image data/plotqa/VAL/png/18458.png --model models/ved/model_final.pkl --confidence 0.05 --use-caffe2 --debug

Common Arguments:
- `--image`: Path to input chart image (required)
- `--model`: Path to trained VED model weights (required)
- `--output`: Output directory (default: "results")
- `--confidence`: Detection confidence threshold (default: 0.05)
- `--debug`: Enable debug mode for detailed analysis
- `--verbose`: Enable verbose logging
Detector Selection:
The pipeline supports three detector types with the following selection logic:
- If `use_caffe2=True` AND `use_exact_caffe2=True` → ExactCaffe2Detector (this is used by default)
- If `use_caffe2=True` AND `use_exact_caffe2=False` → Caffe2CompatibleDetector
- If `use_caffe2=False` → PlotQADetector (Detectron2)

Default Behavior (when no flags are specified):
- `calculate_visual_values.py`: Uses ExactCaffe2Detector by default (`--use-caffe2` and `--use-exact-caffe2` are both `True` by default)
- `process_chart.py`: Uses ExactCaffe2Detector by default (`--use-caffe2` and `--use-exact-caffe2` are both `True` by default)
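A condensed sketch of this selection logic, using the detector classes and constructor arguments shown in the API usage examples later in this README; the import location of PlotQADetector is an assumption.

```python
# Sketch only: maps the use_caffe2 / use_exact_caffe2 flags to a detector class.
from caffe2_compatible_detector import Caffe2CompatibleDetector
from exact_caffe2_detector import ExactCaffe2Detector

def select_detector(model_path, confidence=0.05, use_caffe2=True, use_exact_caffe2=True):
    if use_caffe2 and use_exact_caffe2:
        # Default path: exact Caffe2 architecture replication
        return ExactCaffe2Detector(model_path, confidence_threshold=confidence)
    if use_caffe2:
        return Caffe2CompatibleDetector(model_path, confidence_threshold=confidence)
    # No Caffe2 flags (or --use-detectron2): Detectron2-based detector;
    # the module providing PlotQADetector is assumed here.
    from generate_detections import PlotQADetector
    return PlotQADetector(model_path, confidence_threshold=confidence)
```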
Detector Options:
- `--use-caffe2`: Use Caffe2-compatible detector (recommended for original models)
- `--use-exact-caffe2`: Use exact Caffe2 architecture replication (most compatible)
- `--use-detectron2`: Use Detectron2 detector (modern PyTorch-based, overrides Caffe2 options)
Standard Output:
- `{image_name}.csv` - Original PlotQA format
- `{image_name}.json` - Structured JSON format
- `{image_name}_metadata.json` - Processing metadata
Visual Values Output (calculate_visual_values.py):
- `{image_name}_visual_values.json` - Visual element values with coordinates and scaling
Debug Mode Output:
- `temp/detection_summary.json` - Detailed detection information
- `temp/detections_visualized.png` - Image with bounding boxes
- `temp/ocr_debug.json` - OCR results with extended bounding boxes
- `temp/ocr_with_extended_bboxes.png` - OCR visualization
- `temp/debug_crops/` - Individual cropped regions
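For downstream use, the JSON outputs can be consumed directly. The snippet below is a small example that assumes the default `results` output directory, an illustrative image stem, and the JSON structure documented in the Output Format section later in this README.

```python
import json
from pathlib import Path

results_dir = Path("results")   # default --output directory
stem = "your_chart"             # {image_name} without extension (illustrative)

# Structured chart data ({image_name}.json)
chart = json.loads((results_dir / f"{stem}.json").read_text())
print(chart.get("chart_type"), "-", chart.get("title"))
for series in chart.get("data_series", []):
    print(series["name"], [point["y"] for point in series["data"]])

# Visual element values ({image_name}_visual_values.json), if produced
vv_path = results_dir / f"{stem}_visual_values.json"
if vv_path.exists():
    vv = json.loads(vv_path.read_text())
    bars = [e for e in vv["visual_elements_with_values"] if e["pred_class"] == "bar"]
    print("bar values:", [b.get("y_value") for b in bars])
```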
The pipeline supports multiple detector architectures for maximum compatibility:
Caffe2CompatibleDetector:
- Purpose: Provides compatibility with original PlotQA Caffe2 models
- Features:
  - Loads original `.pkl` model files
  - Maintains the original detection format
  - Optimized for PlotQA dataset models
- Usage: `--use-caffe2` flag
ExactCaffe2Detector:
- Purpose: Exact replication of the original Caffe2 architecture
- Features:
  - Pixel-perfect compatibility with original models
  - Handles edge cases in the original implementation
  - Best for production use with original PlotQA models
- Usage: `--use-exact-caffe2` flag (recommended)
Detectron2 detector (PlotQADetector):
- Purpose: Modern PyTorch-based detection
- Features:
  - Faster inference
  - Better GPU utilization
  - Modern PyTorch ecosystem integration
- Usage: `--use-detectron2` flag, or omit the Caffe2 flags
The original PlotQA models were trained using Caffe2, and while Detectron2 provides modern PyTorch-based detection, there can be subtle differences in:
- Model loading and initialization
- Preprocessing pipelines
- Post-processing steps
- Numerical precision
Our Caffe2-compatible detectors ensure pixel-perfect compatibility with original PlotQA models.
| Feature | Caffe2-Compatible | Exact Caffe2 | Detectron2 |
|---|---|---|---|
| Model Loading | Original .pkl files | Original .pkl files | Converted .pth files |
| Preprocessing | Original pipeline | Exact original pipeline | Modern pipeline |
| Post-processing | Original NMS | Exact original NMS | Modern NMS |
| Compatibility | High | Perfect | Good |
| Performance | Good | Good | Best |
| Maintenance | Medium | High | Low |
class Caffe2CompatibleDetector:
def __init__(self, model_path, confidence_threshold=0.1):
# Load original Caffe2 model
self.model = self._load_caffe2_model(model_path)
# Configure preprocessing to match original
self.preprocessor = self._setup_preprocessing()
# Configure post-processing
self.postprocessor = self._setup_postprocessing()
def detect_single_image(self, image_path):
# Resize to 650x650 (original PlotQA size)
resized_image = self._resize_image(image_path, (650, 650))
# Run inference
detections = self._run_inference(resized_image)
# Apply confidence filtering
filtered_detections = self._filter_detections(detections)
        return filtered_detections, resized_image, original_dimensions

class ExactCaffe2Detector:
def __init__(self, model_path, confidence_threshold=0.1):
# Load with exact Caffe2 architecture
self.model = self._load_exact_caffe2_model(model_path)
# Replicate exact preprocessing steps
self.preprocessor = self._replicate_caffe2_preprocessing()
# Replicate exact post-processing
self.postprocessor = self._replicate_caffe2_postprocessing()
def detect_single_image(self, image_path):
# Exact same preprocessing as original Caffe2
processed_image = self._exact_caffe2_preprocess(image_path)
# Run with exact same inference pipeline
detections = self._exact_caffe2_inference(processed_image)
# Apply exact same post-processing
final_detections = self._exact_caffe2_postprocess(detections)
        return final_detections, processed_image, original_dimensions

The pipeline supports both original Caffe2 model formats:
- `.pkl` files: Original Caffe2 pickle files (recommended)
- `.pth` files: PyTorch state dict files (for Detectron2)
For best compatibility with original PlotQA models, use .pkl files with Caffe2-compatible detectors.
Common Arguments:
- `--image`: Path to input chart image (required)
- `--model`: Path to trained VED model weights (required)
- `--output`: Output directory (default: "results")
- `--confidence`: Detection confidence threshold (default: 0.05)
- `--debug`: Enable debug mode for detailed analysis
- `--verbose`: Enable verbose logging
Detector Selection:
- `--use-caffe2`: Use Caffe2-compatible detector (recommended for original models)
- `--use-exact-caffe2`: Use exact Caffe2 architecture replication (most compatible)
- `--use-detectron2`: Use Detectron2 detector (modern PyTorch-based)
When using --debug flag, the pipeline generates additional files in a temp/ directory:
- `detection_summary.json` - Detailed detection information with coordinates
- `detections_visualized.png` - Image with bounding boxes overlaid
- `ocr_debug.json` - OCR results with extended bounding boxes
- `ocr_with_extended_bboxes.png` - Visualization of OCR processing
- `debug_crops/` - Individual cropped regions for each detected element
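For a quick look at what the detector found, the debug summary can be inspected directly. A small sketch, assuming `temp/detection_summary.json` contains a list of detections; its exact schema is not documented here, so the field name below follows the visual-values JSON shown later in this README.

```python
import json
from collections import Counter

with open("temp/detection_summary.json") as f:
    summary = json.load(f)

if isinstance(summary, list):
    # Count detections per predicted class, assuming entries carry a "pred_class" field.
    print(Counter(d.get("pred_class", "?") for d in summary))
else:
    # Otherwise just show the top-level structure.
    print("top-level keys:", list(summary))
```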
For advanced users who need more control, individual components can be used separately:
# 1. Generate detections only
python generate_detections.py \
--model_path ./models/ved/model_final.pkl \
--image_path chart.png \
--output detections.txt
# 2. OCR and structural extraction
python ocr_and_sie.py \
--image_dir ./images/ \
--detections_dir ./detections/ \
--output_dir ./extracted/

The pipeline outputs structured JSON with the following format:
{
"chart_type": "bar",
"title": "Sales by Quarter",
"x_axis": {
"label": "Quarter",
"type": "categorical"
},
"y_axis": {
"label": "Sales ($M)",
"type": "numeric"
},
"data_series": [
{
"name": "Sales",
"type": "bar",
"data": [
{"x": "Q1", "y": 15.2},
{"x": "Q2", "y": 23.1},
{"x": "Q3", "y": 18.7},
{"x": "Q4", "y": 28.3}
]
}
],
"metadata": {
"source_image": "chart.png",
"extraction_method": "PlotQA Pipeline",
"total_series": 1
}
}

When using calculate_visual_values.py, additional visual value data is included:
{
"chart_orientation": {
"isHbar": false,
"isSinglePlot": true
},
"visual_elements_with_values": [
{
"pred_class": "bar",
"bbox": [100, 200, 150, 400],
"confidence": 0.95,
"x_value": "Q1",
"y_value": 15.2,
"ocr_text": null
},
{
"pred_class": "xticklabel",
"bbox": [120, 450, 140, 470],
"confidence": 0.88,
"ocr_text": "Q1"
}
],
"calculation_method": "find_visual_values_algorithm"
}

The pipeline also generates CSV files in the original PlotQA format for compatibility:
title,xlabel,ylabel,Sales
Sales by Quarter,Quarter,Sales ($M),15.2
Sales by Quarter,Quarter,Sales ($M),23.1
Sales by Quarter,Quarter,Sales ($M),18.7
Sales by Quarter,Quarter,Sales ($M),28.3
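Since the CSV repeats the title and axis labels on every row, a little post-processing is needed to recover the series. A minimal sketch using pandas (not a pipeline dependency) on the example above; the file path follows the `{image_name}.csv` convention and is illustrative.

```python
import pandas as pd

df = pd.read_csv("results/chart.csv")  # illustrative path

# Every row carries the same title/xlabel/ylabel; the remaining column holds the series values.
title, xlabel, ylabel = df.loc[0, ["title", "xlabel", "ylabel"]]
series_values = df["Sales"].tolist()   # column name matches the series in the example above

print(title, xlabel, ylabel, series_values)
# Sales by Quarter Quarter Sales ($M) [15.2, 23.1, 18.7, 28.3]
```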
Supported chart types:
- Bar Charts: Vertical and horizontal bars
- Line Graphs: Single and multi-series lines
- Scatter Plots: Point-based data visualization
- Mixed Charts: Combinations of the above
The VED model detects the following chart elements (matching original PlotQA categories):
| Category | Description |
|---|---|
| `bar` | Bar chart bars |
| `dot_line` | Dotted line elements |
| `legend_label` | Legend item labels |
| `line` | Line chart lines |
| `preview` | Legend preview boxes |
| `title` | Chart title |
| `xlabel` | X-axis label |
| `xticklabel` | X-axis tick labels |
| `ylabel` | Y-axis label |
| `yticklabel` | Y-axis tick labels |
Detection (VED) configuration:
- Confidence Threshold: Minimum detection confidence (default: 0.5)
- NMS Threshold: Non-maximum suppression threshold
- Input Size: Image resizing for detection (default: 800x1333)
OCR configuration:
- Tesseract Config: Character whitelist and recognition mode
- Text Padding: Padding around text bounding boxes (default: 5px)
- Language: OCR language model (default: English)
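As an illustration of how these OCR settings typically map onto Tesseract, the snippet below calls pytesseract directly; the function name, padding handling, and config strings are assumptions, and the pipeline's actual configuration may differ.

```python
import pytesseract
from PIL import Image

def ocr_region(image_path, bbox, padding=5):
    """Crop a detected text region with a few pixels of padding, then OCR it."""
    x1, y1, x2, y2 = bbox
    img = Image.open(image_path)
    crop = img.crop((max(x1 - padding, 0), max(y1 - padding, 0), x2 + padding, y2 + padding))
    config = "--psm 7"  # treat the crop as a single text line
    # For numeric tick labels, a character whitelist can reduce confusions:
    # config = "--psm 7 -c tessedit_char_whitelist=0123456789.-"
    return pytesseract.image_to_string(crop, lang="eng", config=config).strip()
```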
Performance tips:
- GPU Usage: Use a CUDA-enabled GPU for faster training and inference
- Batch Size: Adjust based on GPU memory (4-8 for most GPUs)
- Image Quality: Higher resolution images generally produce better results
- Preprocessing: Ensure charts are clearly visible with good contrast
"No module named 'detectron2'"
- Install Detectron2 with CUDA support:
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder detectron2==0.6+18f6958pt2.8.0cu128
"Tesseract not found"
- Install Tesseract OCR and add to PATH
- Ubuntu: `sudo apt-get install tesseract-ocr`
- Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki
"CUDA out of memory"
- Reduce batch size in training
- Use smaller input image sizes
- Enable gradient checkpointing
Poor extraction accuracy
- Check detection confidence threshold
- Verify image quality and resolution
- Retrain model with more similar data
Enable verbose logging and detailed analysis:
# Basic debug mode
python process_chart.py --image chart.png --model models/ved/model_final.pkl --use-caffe2 --debug
# Visual values with debug
python calculate_visual_values.py --image chart.png --model models/ved/model_final.pkl --use-caffe2 --debug --verbose

charto/ # Main project directory
├── calculate_visual_values.py # Main Visual values calculation script
├── process_chart.py # Chart processing script
├── caffe2_compatible_detector.py # Caffe2-compatible detector
├── exact_caffe2_detector.py # Exact Caffe2 architecture replication
├── generate_detections.py # Detectron2-based detection
├── ocr_and_sie.py # OCR and structural extraction
├── utils.py # Utility functions
├── bbox_conversion.py # Bounding box conversion utilities
├── upscale_boxes.py # Box upscaling functionality
├── requirements.txt # Complete dependencies list (latest versions)
├── README.md # This file
├── calculate_visual_values_readme.md # Detailed description of the working of the Main Script
├── models/ # Trained model weights
│ └── ved/
│ ├── model_final.pkl # Caffe2 model file
│ ├── model_iter19999.pkl # Training checkpoint
│ ├── net.pbtxt # Network architecture
│ └── param_init_net.pbtxt # Parameter initialization
├── data/ # PlotQA dataset
│ └── plotqa/
│ ├── TRAIN/ # Training images
│ ├── VAL/ # Validation images
│ ├── TEST/ # Test images
│ └── annotations/ # COCO-style annotations
└── sample_results/ # Example results
Main pipeline class for chart processing:
from process_chart import PlotQAProcessor
# Initialize with Caffe2-compatible detector
processor = PlotQAProcessor(
model_path="./models/ved/model_final.pkl",
confidence_threshold=0.05,
use_caffe2=True,
use_exact_caffe2=True,
debug=False
)
# Process a single image
results = processor.process_image("chart.png", "output_dir")

Extended processor with visual value calculations:
from calculate_visual_values import VisualValuesCalculator
# Initialize calculator
calculator = VisualValuesCalculator(
model_path="./models/ved/model_final.pkl",
confidence_threshold=0.05,
use_caffe2=True,
use_exact_caffe2=True,
debug=True # Enable for full visual value calculation
)
# Process with visual values
results = calculator.process_with_values("chart.png", "output_dir")

Direct access to Caffe2-compatible detection:
from caffe2_compatible_detector import Caffe2CompatibleDetector
detector = Caffe2CompatibleDetector(
model_path="./models/ved/model_final.pkl",
confidence_threshold=0.05
)
detections, resized_image, original_dimensions = detector.detect_single_image("chart.png")

Exact Caffe2 architecture replication:
from exact_caffe2_detector import ExactCaffe2Detector
detector = ExactCaffe2Detector(
model_path="./models/ved/model_final.pkl",
confidence_threshold=0.05
)
detections, resized_image, original_dimensions = detector.detect_single_image("chart.png")

Contributing:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
If you use this pipeline in your research, please cite the original PlotQA paper:
@inproceedings{methani2020plotqa,
title={PlotQA: Reasoning over Scientific Plots},
author={Methani, Nitesh and Ganguly, Pritha and Khapra, Mitesh M and Kumar, Pratyush},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={1527--1536},
year={2020}
}

This implementation maintains compatibility with the original PlotQA pipeline while modernizing the underlying technology:
What is preserved:
- Dataset Structure: Uses the exact same directory structure and annotation files (train_50k_annotations.json, etc.)
- Detection Format: Outputs detections in the original format `CLASS_LABEL CLASS_CONFIDENCE XMIN YMIN XMAX YMAX` (see the parsing sketch after this list)
- Category Labels: Uses the same visual element categories (axis, tick, tick_label, etc.)
- CSV Output: Maintains original semi-structured CSV format for compatibility
- Pipeline Stages: Follows same VED → OCR → SIE workflow
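A small helper for reading detections written in that format (for example, the detections.txt produced by generate_detections.py); a sketch, assuming one detection per line as shown above.

```python
# Parse lines of the form: CLASS_LABEL CLASS_CONFIDENCE XMIN YMIN XMAX YMAX
def read_detections(path):
    detections = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 6:
                continue  # skip blank or malformed lines
            label = parts[0]
            confidence = float(parts[1])
            xmin, ymin, xmax, ymax = map(float, parts[2:])
            detections.append({"label": label, "confidence": confidence,
                               "bbox": [xmin, ymin, xmax, ymax]})
    return detections
```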
What is modernized:
- Detectron2: Uses PyTorch-based Detectron2 instead of Caffe2-based Detectron
- Python 3: Updated from Python 2 to Python 3.8+
- Enhanced OCR: Improved OCR processing with better error handling
- JSON Output: Added structured JSON output alongside original CSV
- Better Documentation: Comprehensive usage examples and API documentation
If you have an existing PlotQA setup, you can:
- Use the same dataset directory structure
- Use the same `.pkl` model files with Caffe2-compatible detectors
- Replace detection generation with `process_chart.py` or `calculate_visual_values.py`
- Get both original CSV and new JSON outputs
- Access visual element values with coordinate-based calculations
- "No module named 'detectron2'"
  - Install Detectron2:
    `pip install --extra-index-url https://miropsota.github.io/torch_packages_builder detectron2==0.6+18f6958pt2.8.0cu128`
- "Model file not found" or "Could not load model"
  - Ensure `model_final.pkl` is in the `models/ved/` directory
  - Check the file size: `ls -lh models/ved/model_final.pkl` (should be ~100MB+)
  - Re-download from: https://drive.google.com/drive/folders/1P00jD-WFg_RBissIPmuWEWct3xoM3mgU?usp=sharing
- "Tesseract not found"
  - Install Tesseract OCR and add it to PATH
  - Windows: Download from https://github.com/UB-Mannheim/tesseract/wiki
- "CUDA out of memory"
  - Use the CPU version: install PyTorch without CUDA
  - Reduce batch size in processing
- KMP_DUPLICATE_LIB_OK error (Windows)
  - Set the environment variable before running: `$env:KMP_DUPLICATE_LIB_OK="TRUE"; python process_chart.py --image chart.png --model models/ved/model_final.pkl --use-caffe2`
- "No visual elements detected"
  - Check image quality and resolution
  - Try different confidence thresholds: `--confidence 0.1` or `--confidence 0.05`
  - Ensure the image is a chart/graph (not a photo or other image type)
Test the pipeline with sample images:
# Test basic processing
python process_chart.py --image test_chart.png --model models/ved/model_final.pkl --use-caffe2
# Test visual values calculation
python calculate_visual_values.py --image test_chart.png --model models/ved/model_final.pkl --use-caffe2 --debug
# Test different detector architectures
python process_chart.py --image test_chart.png --model models/ved/model_final.pkl --use-exact-caffe2
python process_chart.py --image test_chart.png --model models/ved/model_final.pkl --use-detectron2
# If you hit the KMP duplicate-library error on Windows
$env:KMP_DUPLICATE_LIB_OK="TRUE"; python calculate_visual_values.py --image test_chart.png --model models/ved/model_final.pkl --confidence 0.05 --use-caffe2 --debug --verbose
$env:KMP_DUPLICATE_LIB_OK="TRUE"; python process_chart.py --image test_chart.png --model models/ved/model_final.pkl --confidence 0.05 --use-caffe2 --debug --verbose
# Install dependencies
pip install -r requirements.txt
pip install --extra-index-url https://miropsota.github.io/torch_packages_builder detectron2==0.6+18f6958pt2.8.0cu128
# Basic processing
python process_chart.py --image chart.png --model models/ved/model_final.pkl --use-caffe2
# Visual values with debug
python calculate_visual_values.py --image chart.png --model models/ved/model_final.pkl --use-caffe2 --debug
# Windows KMP fix
$env:KMP_DUPLICATE_LIB_OK="TRUE"; python process_chart.py --image chart.png --model models/ved/model_final.pkl --use-caffe2

- Model Weights: https://drive.google.com/drive/folders/1P00jD-WFg_RBissIPmuWEWct3xoM3mgU?usp=sharing
- Dataset: https://drive.google.com/drive/folders/15bWhzXxAN4WsXn4p37t_GYABb1F52nQw?usp=sharing
- Tesseract (Windows): https://github.com/UB-Mannheim/tesseract/wiki
- Facebook AI Research for Detectron2
- The PlotQA dataset authors (Methani et al., 2020)
- Original PlotQA pipeline contributors
- Tesseract OCR team
- PyTorch community