Skip to content

rkdals0131/LION_ros2

Repository files navigation

LION ROS2 Integration Package

Overview

This repository contains a ROS2 integration package for the LION (LiDAR-Only Instance segmentation with Objects in Neural voxel space) 3D object detection model. The package provides real-time 3D object detection capabilities for autonomous vehicles using Ouster LiDAR sensors.

Architecture

System Design

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Ouster LiDAR   │────▶│   ROS2 Bridge    │────▶│   LION Model    │
│   (OS1-32)      │     │  (Python Node)   │     │  (PyTorch/CUDA) │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        │                        │                         │
        │                        │                         │
        ▼                        ▼                         ▼
  /ouster/points          Detection Node            3D Detections
  (PointCloud2)           - Data conversion         - Bounding boxes
                         - Preprocessing           - Classifications
                         - Model inference         - Confidence scores

Package Structure

pcdet_ws/
├── src/lion_ros2/          # ROS2 package
│   ├── lion_ros2/          # Python modules
│   │   ├── detection_node.py    # Main ROS2 node
│   │   ├── model_loader.py      # LION model loading/inference
│   │   ├── data_feeder.py       # Point cloud preprocessing
│   │   └── utils.py             # Utility functions
│   ├── config/             # Configuration files
│   │   ├── ros_params.yaml      # ROS2 parameters
│   │   └── lion_retnet.yaml     # Model configuration
│   └── launch/             # Launch files
│       └── lion_detection.launch.py
├── lib/LION/               # LION model implementation
└── Data/                   # Model checkpoints

Key Components

  1. Detection Node (detection_node.py)

    • Subscribes to /ouster/points topic
    • Manages the detection pipeline
    • Publishes detection results and visualization markers
  2. Model Loader (model_loader.py)

    • Loads LION model from checkpoint
    • Handles GPU memory allocation
    • Performs inference with CUDA acceleration
  3. Data Feeder (data_feeder.py)

    • Converts point clouds to model input format
    • Performs voxelization
    • Manages batch preparation
  4. Utils (utils.py)

    • Optimized PointCloud2 to NumPy conversion
    • Message formatting utilities

Performance Optimization Journey

Initial Performance Issues

When first deployed, the system was running at only 3.2-3.8 Hz, far below the target of 10+ Hz. Through systematic profiling, we identified the bottlenecks:

Initial timing breakdown (ms):
- PointCloud2 → NumPy: 118-256ms (48-68% of total time!)
- Model inference: 113-119ms
- Preprocessing: 2.1-2.6ms
- Publishing: 2.2-4.1ms

The struct.unpack Bottleneck

The original pointcloud2_to_numpy implementation used Python loops with struct.unpack:

# Old implementation (simplified)
for i in range(0, len(msg.data), msg.point_step):
    for field_name in field_names:
        value = struct.unpack(fmt, msg.data[i+offset:i+offset+size])[0]
        point_data.append(value)

For a typical Ouster scan with 20,000 points × 4 fields = 80,000 struct.unpack calls per frame!

NumPy Vectorization Solution

Instead of using Numba JIT or writing C++ extensions, we leveraged NumPy's powerful vectorization capabilities:

# New implementation
# 1. Convert entire buffer to numpy array once
cloud_data = np.frombuffer(msg.data, dtype=np.uint8)

# 2. Use structured arrays for efficient field access
dt = np.dtype([
    ('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
    ('pad1', 'V4'), ('intensity', '<f4'),
    ('pad2', f'V{msg.point_step - 20}')
])
structured_cloud = cloud_data.view(dt)

# 3. Extract all points at once using column operations
points = np.column_stack([
    structured_cloud['x'],
    structured_cloud['y'],
    structured_cloud['z'],
    structured_cloud['intensity']
])

Results

The optimization achieved a 100x speedup in point cloud conversion:

Metric Before After Improvement
Conversion Time 118-256ms 1.3-1.5ms ~100x faster
Total FPS 3.2-3.8 Hz 7.7-8.2 Hz 2.2x faster
Total Latency 250-380ms 117-131ms 2x faster

Current Performance Profile

Timing breakdown (ms):
- Convert: 1.3-1.5ms (solved! ✓)
- Validate: 0.0ms
- Preprocess: 1.7-2.1ms
  - Voxelization: 0.7-0.9ms
  - Other: ~1ms
- Inference: 112-124ms (now the main bottleneck)
- Postprocess: 0.1-0.4ms
- Publish: 1.4-4.6ms

Usage

Prerequisites

  • ROS2 Humble
  • CUDA 11.8+
  • Python 3.10
  • Mamba environment: lion_310_ros2

Building

cd /home/user1/ROS2_Workspace/pcdet_ws
mamba activate lion_310_ros2
source /opt/ros/humble/setup.bash
colcon build --packages-select lion_ros2 --symlink-install
source install/setup.bash

Running

# Launch LION detection node
ros2 launch lion_ros2 lion_detection.launch.py

# With custom parameters
ros2 launch lion_ros2 lion_detection.launch.py input_topic:=/your/topic

Monitoring

# Check detection rate
ros2 topic hz /lion/detections

# View detections
ros2 topic echo /lion/detections

# Visualize in RViz2
rviz2 -d config/lion_viz.rviz

Configuration

Key parameters in config/ros_params.yaml:

  • model_config: Path to model configuration
  • checkpoint_path: Path to trained model weights
  • input_topic: Input point cloud topic (default: /ouster/points)
  • score_threshold: Detection confidence threshold (default: 0.3)
  • max_points_per_frame: Safety limit for points (default: 150000)

Future Optimizations

With the PointCloud2 conversion bottleneck solved, the focus shifts to model inference optimization. See prd.md for the detailed optimization roadmap.

Contributing

This package is part of an autonomous vehicle research project. For questions or contributions, please contact the development team.

License

[License information to be added]

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published