PtxDec: LLM-based PTX to CUDA Decompilation

Official implementation of ASE 2025 Submission #1925:
Enhancing LLM to Decompile Optimized PTX to Readable CUDA for Tensor Programs

📌 Overview

This repository contains code for our LLM-based PTX-to-CUDA decompilation framework featuring:

  • Compiler-based data augmentation for generating aligned PTX-CUDA pairs (a toy pair is shown after this list)
  • Rolled-PTX representation to handle optimized loop structures
  • LLM fine-tuning pipeline for decompilation
  • Evaluation scripts for decompilation accuracy and quality
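
For intuition, the snippet below shows a minimal, hand-written example of the kind of aligned PTX-CUDA pair the augmentation pipelines target. It is illustrative only: it is not drawn from the released dataset, the PTX is simplified to its arithmetic core, and the `pair` record schema is hypothetical.

```python
# Illustrative only: a toy aligned PTX-CUDA pair (hand-written and simplified;
# not an actual record from the 400K-pair dataset).
cuda_src = """\
__global__ void add(const float* a, const float* b, float* c) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    c[i] = a[i] + b[i];
}
"""

ptx_src = """\
// Core of the PTX nvcc emits for the kernel above (addressing setup omitted):
ld.global.f32  %f1, [%rd5];
ld.global.f32  %f2, [%rd6];
add.f32        %f3, %f1, %f2;
st.global.f32  [%rd7], %f3;
"""

pair = {"ptx": ptx_src, "cuda": cuda_src}  # one training record (hypothetical schema)
```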

🗂 Repository Structure

```
.
├── data_generator/             # Sec 3.2: Data augmentation pipelines
│   ├── Scheduling-Diverse/     # Scheduling diversity pipeline
│   └── Subgraph-Diverse/       # Subgraph diversity pipeline
│
├── dataset_workspace/          # Dataset processing & evaluation
│   ├── simplify_cuda.py        # Sec 3.3: CUDA kernel refactoring
│   ├── loop_reroll_ptx.py      # Sec 3.4: PTX loop rerolling
│   └── ...
│
└── model_train_infer/          # LLM training & inference
    ├── train.ipynb             # Model fine-tuning
    ├── infer.py                # Decompilation inference
    └── ...
```

🔍 Key Components

Data Generation (Sec 3.2)

  • Scheduling-Diverse Pipeline:
    Entry: data_generator/Scheduling-Diverse/tenset/scripts/measure_programs_cuda.py
  • Subgraph-Diverse Pipeline:
    Entry: data_generator/Subgraph-Diverse/welder/nnfusion/artifacts/my_welder_cudaptx.py

Data Processing

  • Quality improvement (Sec 3.3): dataset_workspace/simplify_cuda.py
  • Rolled-PTX generation (Sec 3.4): dataset_workspace/loop_reroll_ptx.py
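
The rerolling step can be pictured with a small sketch. The code below is not the logic of loop_reroll_ptx.py; it is a toy illustration, assuming register numbers have already been normalized across unrolled iterations (real unrolled PTX uses distinct registers and addresses per iteration):

```python
# Toy sketch of loop rerolling (NOT the repository's implementation): detect a
# block of PTX instructions repeated back-to-back, as produced by compiler loop
# unrolling, and collapse it into a single annotated "rolled" block.

def reroll(lines, max_block=8):
    """Collapse consecutive repeats of an instruction block into one
    annotated block, e.g. 4 copies of a 2-line body -> '// rolled x4' + body."""
    i, out = 0, []
    while i < len(lines):
        collapsed = False
        for size in range(1, max_block + 1):
            block = lines[i:i + size]
            if len(block) < size:
                break
            # Count how many times `block` repeats starting at position i.
            reps = 1
            while lines[i + reps * size:i + (reps + 1) * size] == block:
                reps += 1
            if reps > 1:
                out.append(f"// rolled x{reps}")  # placeholder loop header
                out.extend(block)
                i += reps * size
                collapsed = True
                break
        if not collapsed:
            out.append(lines[i])
            i += 1
    return out

unrolled = ["ld.global.f32 %f, [%rd];",
            "add.f32 %acc, %acc, %f;"] * 4
print("\n".join(reroll(unrolled)))
# // rolled x4
# ld.global.f32 %f, [%rd];
# add.f32 %acc, %acc, %f;
```

Real unrolled PTX additionally varies induction-variable offsets across iterations, which is what makes rerolling optimized PTX nontrivial compared with this toy.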

Model Training

  • LLM fine-tuning: model_train_infer/train.ipynb
  • Inference: model_train_infer/infer.py
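
A minimal sketch of what inference could look like, assuming the fine-tuned weights load as a standard Hugging Face causal LM. The checkpoint path and prompt template below are placeholders, not the ones infer.py actually uses:

```python
# Hypothetical inference sketch; model path and prompt format are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/finetuned-ptxdec-model"  # placeholder checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

ptx = open("kernel.ptx").read()
prompt = f"### PTX:\n{ptx}\n### CUDA:\n"  # hypothetical prompt template

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
# Decode only the newly generated tokens (the decompiled CUDA).
cuda_src = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:],
                            skip_special_tokens=True)
print(cuda_src)
```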

Evaluation

  • Decompilation accuracy and quality evaluation:
    Entry: dataset_workspace/my_eval_decompile_multi_*.py
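
As a rough picture of the evaluation side, the sketch below checks one common decompilation-accuracy criterion, recompilability. This is an assumption about the metric, not a description of what my_eval_decompile_multi_*.py computes; functional equivalence would additionally require executing both kernels and comparing outputs.

```python
# Sketch of a recompilability check for decompiled CUDA (assumed metric, not
# the repository's evaluation logic). Requires nvcc on PATH.
import os
import subprocess
import tempfile

def compiles(cuda_src: str) -> bool:
    """Return True if nvcc can compile the decompiled kernel source."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "kernel.cu")
        obj = os.path.join(tmp, "kernel.o")
        with open(src, "w") as f:
            f.write(cuda_src)
        result = subprocess.run(["nvcc", "-c", src, "-o", obj],
                                capture_output=True)
        return result.returncode == 0
```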

📦 Data & Model Weights

The full dataset (400K PTX-CUDA pairs) and pretrained model weights are being prepared for public release. Due to their size:

  • Full dataset (~69 GB) will be available on Hugging Face Datasets
  • Model weights (~14.5 GB) will be available on Hugging Face Hub

A sample dataset subset is included in dataset_workspace/dataset_example/ for initial exploration.
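
To start exploring the sample, something like the following works if the examples are stored as JSON records. The file pattern and the "ptx"/"cuda" field names are guesses, not the documented schema; adjust them to the actual layout of dataset_example/.

```python
# Purely hypothetical exploration snippet: file pattern and field names are
# assumptions, not the sample dataset's real schema.
import glob
import json

for path in glob.glob("dataset_workspace/dataset_example/*.json"):
    with open(path) as f:
        sample = json.load(f)
    print(sample.get("ptx", "")[:200])   # assumed field name
    print(sample.get("cuda", "")[:200])  # assumed field name
```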
