This repository contains a ROS2 integration package for the LION (LiDAR-Only Instance segmentation with Objects in Neural voxel space) 3D object detection model. The package provides real-time 3D object detection capabilities for autonomous vehicles using Ouster LiDAR sensors.
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Ouster LiDAR │────▶│ ROS2 Bridge │────▶│ LION Model │
│ (OS1-32) │ │ (Python Node) │ │ (PyTorch/CUDA) │
└─────────────────┘ └──────────────────┘ └─────────────────┘
│ │ │
│ │ │
▼ ▼ ▼
/ouster/points Detection Node 3D Detections
(PointCloud2) - Data conversion - Bounding boxes
- Preprocessing - Classifications
- Model inference - Confidence scores
pcdet_ws/
├── src/lion_ros2/ # ROS2 package
│ ├── lion_ros2/ # Python modules
│ │ ├── detection_node.py # Main ROS2 node
│ │ ├── model_loader.py # LION model loading/inference
│ │ ├── data_feeder.py # Point cloud preprocessing
│ │ └── utils.py # Utility functions
│ ├── config/ # Configuration files
│ │ ├── ros_params.yaml # ROS2 parameters
│ │ └── lion_retnet.yaml # Model configuration
│ └── launch/ # Launch files
│ └── lion_detection.launch.py
├── lib/LION/ # LION model implementation
└── Data/ # Model checkpoints
-
Detection Node (
detection_node.py)- Subscribes to
/ouster/pointstopic - Manages the detection pipeline
- Publishes detection results and visualization markers
- Subscribes to
-
Model Loader (
model_loader.py)- Loads LION model from checkpoint
- Handles GPU memory allocation
- Performs inference with CUDA acceleration
-
Data Feeder (
data_feeder.py)- Converts point clouds to model input format
- Performs voxelization
- Manages batch preparation
-
Utils (
utils.py)- Optimized PointCloud2 to NumPy conversion
- Message formatting utilities
When first deployed, the system was running at only 3.2-3.8 Hz, far below the target of 10+ Hz. Through systematic profiling, we identified the bottlenecks:
Initial timing breakdown (ms):
- PointCloud2 → NumPy: 118-256ms (48-68% of total time!)
- Model inference: 113-119ms
- Preprocessing: 2.1-2.6ms
- Publishing: 2.2-4.1ms
The original pointcloud2_to_numpy implementation used Python loops with struct.unpack:
# Old implementation (simplified)
for i in range(0, len(msg.data), msg.point_step):
for field_name in field_names:
value = struct.unpack(fmt, msg.data[i+offset:i+offset+size])[0]
point_data.append(value)For a typical Ouster scan with 20,000 points × 4 fields = 80,000 struct.unpack calls per frame!
Instead of using Numba JIT or writing C++ extensions, we leveraged NumPy's powerful vectorization capabilities:
# New implementation
# 1. Convert entire buffer to numpy array once
cloud_data = np.frombuffer(msg.data, dtype=np.uint8)
# 2. Use structured arrays for efficient field access
dt = np.dtype([
('x', '<f4'), ('y', '<f4'), ('z', '<f4'),
('pad1', 'V4'), ('intensity', '<f4'),
('pad2', f'V{msg.point_step - 20}')
])
structured_cloud = cloud_data.view(dt)
# 3. Extract all points at once using column operations
points = np.column_stack([
structured_cloud['x'],
structured_cloud['y'],
structured_cloud['z'],
structured_cloud['intensity']
])The optimization achieved a 100x speedup in point cloud conversion:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Conversion Time | 118-256ms | 1.3-1.5ms | ~100x faster |
| Total FPS | 3.2-3.8 Hz | 7.7-8.2 Hz | 2.2x faster |
| Total Latency | 250-380ms | 117-131ms | 2x faster |
Timing breakdown (ms):
- Convert: 1.3-1.5ms (solved! ✓)
- Validate: 0.0ms
- Preprocess: 1.7-2.1ms
- Voxelization: 0.7-0.9ms
- Other: ~1ms
- Inference: 112-124ms (now the main bottleneck)
- Postprocess: 0.1-0.4ms
- Publish: 1.4-4.6ms
- ROS2 Humble
- CUDA 11.8+
- Python 3.10
- Mamba environment:
lion_310_ros2
cd /home/user1/ROS2_Workspace/pcdet_ws
mamba activate lion_310_ros2
source /opt/ros/humble/setup.bash
colcon build --packages-select lion_ros2 --symlink-install
source install/setup.bash# Launch LION detection node
ros2 launch lion_ros2 lion_detection.launch.py
# With custom parameters
ros2 launch lion_ros2 lion_detection.launch.py input_topic:=/your/topic# Check detection rate
ros2 topic hz /lion/detections
# View detections
ros2 topic echo /lion/detections
# Visualize in RViz2
rviz2 -d config/lion_viz.rvizKey parameters in config/ros_params.yaml:
model_config: Path to model configurationcheckpoint_path: Path to trained model weightsinput_topic: Input point cloud topic (default:/ouster/points)score_threshold: Detection confidence threshold (default: 0.3)max_points_per_frame: Safety limit for points (default: 150000)
With the PointCloud2 conversion bottleneck solved, the focus shifts to model inference optimization. See prd.md for the detailed optimization roadmap.
This package is part of an autonomous vehicle research project. For questions or contributions, please contact the development team.
[License information to be added]