A virtual machine implementation for executing NVIDIA PTX (Parallel Thread Execution) intermediate code. This project provides a complete VM architecture with advanced execution features, memory optimizations, and comprehensive tooling.
Authors: Han-Zhenzhong, TongyiLingma, GitHub Copilot
The project is hosted on Gitee at https://gitee.com/hanzhenzhong/ptx-vm
- Full SIMT (Single Instruction Multiple Threads) execution model
- Warp scheduling with dynamic thread mask management
- Predicated execution support for conditional operations
- Comprehensive divergence handling with multiple reconvergence algorithms
- Performance counters for detailed execution statistics
- Hierarchical memory model with separate spaces
- Data cache simulation with configurable parameters
- Shared memory bank conflict detection
- Memory access pattern analysis and optimization
- TLB and page fault handling for virtual memory
- Dynamic register allocation framework
- Instruction scheduling optimizations
- Memory coalescing optimizations
- Cache configuration flexibility
- Host API design for easy integration
- CLI interface for manual execution and debugging
- CUDA binary loading infrastructure
- Enhanced debugging interface with watchpoints
- Comprehensive unit test suite
- Integration tests for system-level behavior
- Performance benchmarks
- Example programs for demonstration
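- 📚 Complete documentation index - Browse all documentation by category
- 🚀 User documentation - Usage guides, API documentation, example code
- 🔧 Developer documentation - Developer guides, implementation summaries, performance optimization
- 📖 Specification documents - PTX fundamentals, SIMT execution model, technical specifications
Main documents:
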
- CMake 3.14+
- C++20 compatible compiler:
  - GCC 10+
  - Clang 12+
  - MSVC (Visual Studio 2019 16.10+)
- Google Test (for unit testing)
# Clone the repository
git clone https://gitee.com/hanzhenzhong/ptx-vm.git
cd ptx-vm
# Create build directory
mkdir build && cd build
# Configure with CMake
cmake ..
# Build the project
make

- BUILD_TESTS=ON/OFF - Enable/disable unit tests
- BUILD_EXAMPLES=ON/OFF - Enable/disable example programs
- BUILD_DOCUMENTATION=ON/OFF - Enable/disable documentation building
The PTX VM provides three ways to use the virtual machine:
Run PTX programs directly from the command line:
# Basic execution (with default INFO log level)
./ptx_vm examples/simple_math_example.ptx
# With debug logging to see detailed execution
./ptx_vm --log-level debug examples/control_flow_example.ptx
# Run example programs
cd build
./execution_result_demo
./parameter_passing_example
Command-line options:
- -h, --help - Display help message
- -l, --log-level LEVEL - Set log level: debug, info (default), warning, error
For debugging, learning, and experimentation:
# Start interactive mode
./ptx_vm
# Interactive commands
> load examples/control_flow_example.ptx # Load PTX program
> alloc 1024 # Allocate memory
> launch myKernel 0x10000 # Launch kernel with parameters
> memory 0x10000 256 # View memory contents
> dump # Show execution statistics
> loglevel debug # Change log level
> exit # Exit the VM
Integrate PTX VM into your application:
#include <vector>
#include "host_api.hpp"

int main() {
    // Initialize VM
    HostAPI hostAPI;
    hostAPI.initialize();

    // Allocate device memory for `size` integers
    int size = 1024;
    CUdeviceptr devicePtr;
    hostAPI.cuMemAlloc(&devicePtr, size * sizeof(int));

    // Prepare and copy data
    std::vector<int> data(size, 42);
    hostAPI.cuMemcpyHtoD(devicePtr, data.data(), size * sizeof(int));

    // Load PTX program
    hostAPI.loadProgram("my_kernel.ptx");

    // Launch kernel with parameters: 1x1x1 grid, 32x1x1 threads per block.
    // NOTE: `kernel` is the handle of the loaded kernel; obtaining it is
    // not shown in this example.
    void* params[] = { &devicePtr, &size };
    hostAPI.cuLaunchKernel(kernel, 1,1,1, 32,1,1, 0, 0, params, nullptr);

    // Copy results back
    std::vector<int> results(size);
    hostAPI.cuMemcpyDtoH(results.data(), devicePtr, size * sizeof(int));

    // Cleanup
    hostAPI.cuMemFree(devicePtr);
    return 0;
}
Control the verbosity of VM output:
# Command-line mode
./ptx_vm --log-level debug program.ptx # Detailed debug info
./ptx_vm --log-level info program.ptx # General info (default)
./ptx_vm --log-level warning program.ptx # Warnings and errors
./ptx_vm --log-level error program.ptx # Errors only
# Interactive mode
> loglevel debug # Enable all logs
> loglevel info # Default level
> loglevel warning # Warnings and errors only
> loglevel error # Errors only
> loglevel # Display current level
Log levels:
- debug - Shows detailed execution info, register values, memory operations
- info - Shows program loading, kernel launches, general info (default)
- warning - Shows warnings and errors only
- error - Shows errors only
For more details, see:
- 📖 Complete User Guide - Detailed usage instructions
- 📖 Chinese User Guide (中文用户指南) - User guide in Chinese
- 📖 Quick Reference - Command quick reference
- 📖 API Documentation - API reference
- 📖 Logging System - Logging system details
Load a PTX or CUDA binary file into the VM.
> load <filename>

Execute the loaded program.
> run

Execute one instruction at a time.
> step [number_of_instructions]

Set a breakpoint at a specific address.
> break <address>

Set a watchpoint at a specific memory address.
> watch <address>

Display register information.
> register [all|predicate|pc]

Display memory contents.
> memory <address> [size]

Allocate memory in the VM.
> alloc <size>

Copy memory within the VM.
> memcpy <dest> <src> <size>

Write a single byte value to a specific memory address.
> write <address> <value>

Fill memory with multiple byte values starting at a specific address.
> fill <address> <count> <value1> [value2] ...

Load data from a file into VM memory at a specific address.
> loadfile <address> <file> <size>

Launch a kernel with parameters.
> launch <kernel_name> [param1] [param2] ...

Start profiling session.
> profile <output_file.csv>

Output execution statistics.
> dump

List loaded program disassembly.
> list

Display visualization of execution state.
> visualize <type>
Where <type> can be:
- warp - Warp execution visualization
- memory - Memory access visualization
- performance - Performance counter display

Exit the virtual machine.
> quit

Comprehensive documentation is available via DOCS_INDEX.md and organized into three directories:
- docs_user/ - End-user and API usage documentation
- docs_dev/ - Contributor/developer documentation and technical reports
- docs_spec/ - PTX/SIMT fundamentals and specification notes
Recommended starting points:
The documentation covers architecture, code structure, contribution guidelines, and technical details of implementation.
See RELEASE_NOTES.md for information about this release, including:
- Key features
- Installation instructions
- Usage examples
- Known issues
- Future enhancements
See CONTRIBUTORS.md for a list of contributors and information about how to contribute to the project.
This project is licensed under the MIT License. See the LICENSE file for details.
- Zhenzhong Han - Lead Developer and Architect
  - Email: zhenzhong.han@qq.com
  - Role: Chief architect and main developer of the PTX Virtual Machine