NVIDIA Jetson Nano, Xavier NX, Orin Deployment tutorial #9627
Comments
@AyushExel awesome! Should this be renamed to something like NVIDIA Jetson Nano deployment tutorial?
@glenn-jocher yeah, "NVIDIA Jetson Nano deployment tutorial" sounds good.
@AyushExel awesome, added to wiki. I think I'll add to README also. Are those times in the last table right BTW?
@glenn-jocher yes. It's also mentioned here - https://wiki.seeedstudio.com/YOLOv5-Object-Detection-Jetson/
Thanks for the great documentation @AyushExel
Hi @AyushExel, regarding step 4: I'm using a Seeed reComputer J1010 (Jetson Nano) with JetPack 4.6.2, and I've tried a couple of times with a fresh flash of the Jetson each time. I noticed that YOLOv5 requires Python 3.7, whereas JetPack 4.6.2 includes Python 3.6.9, so I used YOLOv5 v6.0 (and v6.2 initially). EDIT: also tried JP4.6.1 (same result). Thanks in advance, Andrew
@lakshanthad do you know what's causing this?
Thanks @AyushExel. I've found the crash report (which I can send to you or @lakshanthad). I also noticed the SeeedStudio article here: similar/the same? The first 10 lines are:
Hello @barney2074, can I know exactly after which command you encounter this crash? And please attach the report here if possible.
Yes. Most content on this GitHub is based on that wiki. That wiki mainly explains the entire process from labeling to deploying.
Hi @lakshanthad, this occurs after the command. I have also tried the Seeed wiki - I'll put the outcome in a separate post to avoid confusing the issue. Thanks, Andrew
Hi @lakshanthad, at step 19 of the Seeed wiki (serialising the model) I get the following error:
@barney2074 But after that, a problem occurs when building DeepStream.
Thanks @dinobei, one step forward... Andrew
@lakshanthad do you know what's happening in this error? It seems like it's originating from the DeepStream-Yolo module. Is there a way to run this without that?
Hello, It is recommended to choose it inside NVIDIA SDK Manager when installing JetPack, because this ensures that there will be no compatibility or missing dependency issues.
Well, if you just want to deploy, you can use the pre-trained PyTorch model to perform the inference. In this case, follow the above guide up to and including the Install PyTorch and Torchvision section. After that, execute python detect.py --source <video_source>. But the goal of this document is to use TensorRT to increase performance on the Jetson platform, and to use TensorRT with a video stream, the DeepStream SDK is used. So there are two ways of deployment on Jetson, as shown below.
The first method is the fastest to deploy. However, the second method ensures better model performance on the Jetson hardware compared with the first.
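For the first route, a minimal invocation might look like this (a sketch; the weights file and video source below are placeholders):

```bash
# Method 1: plain PyTorch inference with detect.py (no TensorRT/DeepStream)
cd yolov5
python3 detect.py --weights yolov5s.pt --source video.mp4  # file, webcam index, or RTSP URL
```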
I think this document can be divided into two.
Any suggestions? I can work to reorganize it as above and update this guide.
@lakshanthad
I made a huge mistake. I didn't install DeepStream SDK. I thought DeepStream-Yolo and DeepStream SDK were the same.
Hi @lakshanthad, I'm aiming to get my custom YOLOv5 model running on the Jetson, although I tried yolov5s.pt as a test to eliminate the possibility that the problem is just my custom model. Just to clarify my understanding: the TensorRT .engine needs to be generated on the same processor architecture as used for inference, i.e. you can't generate it on an x86/RTX machine and run inference on an ARM (Jetson) one? Andrew
Yes. Please try again and share your results.
Yes, you are right. However, in the guide on the Seeed wiki that you mentioned earlier, where only TensorRT is used without the DeepStream SDK, you need to do this serialize and deserialize work manually. Coming back to the issues you are still facing: are any of the issues you mentioned before solved, or do they still exist? Could we debug like this? First try without TensorRT.
Please let me know whether this works first.
There is no big difference. The way to use only TensorRT is this. However, there is no example present to view detection on real-time video; the repo only supports image inference at the moment. DeepStream SDK comes with real-time video detection support. However, if you are comfortable with OpenCV, it is possible to grab video frames as images using OpenCV and run inference while using only the TensorRT GitHub mentioned before.
Hi @Alberto1404, I had this issue also. For some reason, when you recompile and pass OPENCV=1 on the command line, it doesn't actually build with OpenCV. A quick fix is to go into the Makefile and set OPENCV=1 there, then recompile without the OpenCV flag.
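A hedged sketch of that workaround (the exact Makefile line and the CUDA version vary by DeepStream-Yolo release and JetPack; check your clone before running):

```bash
cd DeepStream-Yolo
# Hard-code OPENCV=1 in the plugin Makefile instead of passing it on the command line
sed -i 's/OPENCV?=0/OPENCV=1/' nvdsinfer_custom_impl_Yolo/Makefile  # adjust to the actual line
# Rebuild without the OPENCV flag; set CUDA_VER to match your JetPack (e.g. 10.2 on JP4.6)
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo
```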
@arminbiglari Thank you so much for your reply. -----UPDATE-----
Just run inference step 9: `deepstream-app -c deepstream_app_config.txt`. The calib.table will be created during calibration.
Try:
How should we deploy an instance segmentation model with DeepStream?
Hello @AyushExel, thanks for sharing good information. Is there any way to use the above without using DeepStream? Thank you.
Thanks for the tutorial. Used it to configure and run YOLOv5s on an Orin AGX Dev Kit.

| Model Name | Precision | Inference Size | FPS |
| --- | --- | --- | --- |
@junxi-haoyi hello! Thank you for reaching out. The "Illegal instruction" error typically means that the executable code being run is not compatible with the CPU architecture of the device. To resolve this issue, you can try setting the OPENBLAS_CORETYPE environment variable. Here's an example of how you can do this:

```bash
sudo OPENBLAS_CORETYPE=ARMV8 python3 setup.py install
```

This command sets the OPENBLAS_CORETYPE environment variable to ARMV8 for the install step. I hope this helps! If you have any further questions or encounter any more issues, feel free to ask.
There is no file named gen_wts_v5.py in the utils folder of the DeepStream-Yolo repo. What should I do?
@vanshdhar999 you can create the gen_wts_v5.py file yourself. If you have specific requirements or need assistance with the content of the file, feel free to ask.
What is the required code to be written? @glenn-jocher
@vanshdhar999 here's a basic example of what the content of the gen_wts_v5.py file could look like (a sketch using the torch2trt project, not a drop-in script; run it from the yolov5 repo root so the checkpoint can be deserialized):

```python
import torch
from torch2trt import torch2trt  # https://github.com/NVIDIA-AI-IOT/torch2trt

# Load the trained YOLOv5 checkpoint (.pt files store the model under the 'model' key)
model = torch.load('yolov5s.pt', map_location='cpu')['model'].float().eval()

# Example input matching the model's expected shape (batch, channels, height, width)
im = torch.zeros(1, 3, 640, 640)  # replace 640 with your model's input size

# Convert and save the model to TorchScript format (tracing, as YOLOv5's export.py does)
ts = torch.jit.trace(model, im, strict=False)
ts.save('yolov5s.torchscript')

# Convert the model to a TensorRT-optimized module with torch2trt (requires CUDA)
model_trt = torch2trt(model.cuda(), [im.cuda()], max_batch_size=1)

# Save the TensorRT module's weights; reload them later via torch2trt.TRTModule
torch.save(model_trt.state_dict(), 'yolov5s_trt.pth')
```

Please replace the placeholders with your actual YOLOv5 model, input shape, and other specific details, and adjust the script according to your exact requirements and your trained model. If you need further assistance or have more specific requirements, feel free to let me know.
@glenn-jocher Can we implement DeepStream-based trackers like NvDCF or NvSORT by adding the tracker plugin in the config file? I tried, but it doesn't seem to work. I want to use YOLOv5 along with the NvSORT tracker.
Hello! Integrating NvSORT or other DeepStream trackers with YOLOv5 in a DeepStream framework does involve careful configuration. Ensure that the [tracker] section is set up correctly in your DeepStream config, for example:

```
[tracker]
enable=1
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvdcf.so
ll-config-file=config_tracker_NvDCF_perf.yml
gpu-id=0
enable-batch-process=1
```

Make sure paths and parameters match your setup. Additionally, ensure that the model outputs and tracker configs are compatible (location, detection confidence formats, etc.). If the issue persists, check for any error messages in the logs which might offer more specific insight into what might be going wrong. 😊
@glenn-jocher When I try to run
Hello! It looks like the error is happening because the anchor_grid attribute doesn't exist on the model at that point. A quick fix could be checking whether the attribute exists before trying to delete it. You could modify your script like this:

```python
if hasattr(model.model[-1], 'anchor_grid'):
    delattr(model.model[-1], 'anchor_grid')
```

This change ensures that the script only tries to delete the anchor_grid attribute when it is actually present.
Hi @glenn-jocher, I was able to convert the model to ONNX using
Hello! The discrepancy in detection performance between the original Ultralytics YOLOv5 inference and the ONNX/DeepStream version might be due to several factors:
To further diagnose, consider comparing the intermediate outputs of both models or checking if the issue persists with different models/samples. A code review might also help ensure settings like input normalization are consistent. 🧐
Thanks for the reply. I had implemented https://github.com/marcoslucianops/DeepStream-Yolo. All the plugins for post-processing are as per the model's requirements. Is there a way to compensate for the loss from the conversion from .pt to ONNX? To run on DeepStream, it supports ONNX files in this case for TensorRT inference.
@vmukund36 hello! Thanks for the update. To address the performance loss after converting your model from .pt to ONNX, you could try the following:
If discrepancies persist, consider further fine-tuning or retraining the model directly in the ONNX format or adjusting the DeepStream pipeline to better accommodate the model's characteristics. 😊
Hi @glenn-jocher, I tried with onnx-simplifier as well, but the results are the same if not worse. I also tried converting to ONNX using the Ultralytics API, and upon inference it gave a similar result. I am not really sure how to avoid compromising the model's performance while also running inference on DeepStream (which supports ONNX only). Do you recommend trying any other alternative approach to avoid this issue?
Hello! It sounds like you're facing a challenging issue with the ONNX format in DeepStream.
If you've explored all usual routes, these alternative steps might help in maintaining performance consistency. Keep in mind that slight discrepancies between training frameworks and inference engines are common, and achieving an exact match might require a bit of trial and error. 😊
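One concrete knob worth experimenting with is the export settings themselves; YOLOv5's export.py exposes opset and simplification options (the values here are examples, not a known fix):

```bash
# Re-export with ONNX simplification and an explicit opset, then rebuild the TensorRT engine
python3 export.py --weights yolov5s.pt --include onnx --opset 12 --simplify
```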
Hello! Thank you for reaching out and providing the screenshot. To assist you better, could you please share a minimum reproducible code example? This will help us understand the context and reproduce the issue on our end. You can find guidance on creating one here: Minimum Reproducible Example. Additionally, please ensure that you are using the latest versions of the relevant packages. Once we have the necessary details, we can investigate further and provide a more accurate solution. Thank you for your cooperation! 😊
Deploy on NVIDIA Jetson using TensorRT and DeepStream SDK
This guide explains how to deploy a trained model onto the NVIDIA Jetson platform and perform inference using TensorRT and DeepStream SDK. Here we use TensorRT to maximize the inference performance on the Jetson platform. UPDATED 18 November 2022.
Hardware Verification
We have tested and verified this guide on the following Jetson devices:
Before You Start
Make sure you have properly installed the JetPack SDK with all the SDK Components and the DeepStream SDK on the Jetson device, as this provides CUDA, TensorRT and the DeepStream SDK, all of which are needed for this guide.
JetPack SDK provides a full development environment for hardware-accelerated AI-at-the-edge development. All Jetson modules and developer kits are supported by JetPack SDK.
There are two major installation methods:
You can find a very detailed installation guide on the NVIDIA official website. You can also find guides corresponding to the above-mentioned reComputer J1010 and reComputer J2021.
Install Necessary Packages
```bash
cd yolov5
vi requirements.txt
```
Note: torch and torchvision are excluded for now because they will be installed later.
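One way to do that exclusion non-interactively, instead of editing in vi (a sketch assuming torch and torchvision appear at the start of their lines in requirements.txt):

```bash
sed -i 's/^torch/# torch/' requirements.txt  # comments out both torch and torchvision
pip3 install -r requirements.txt             # install the remaining packages
```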
Install PyTorch and Torchvision
We cannot install PyTorch and Torchvision from pip because they are not compatible to run on the Jetson platform, which is based on the ARM aarch64 architecture. Therefore we need to manually install a pre-built PyTorch pip wheel and compile/install Torchvision from source.
Visit this page to access all the PyTorch and Torchvision links.
Here are some of the versions supported by JetPack 4.6 and above.
PyTorch v1.10.0
Supported by JetPack 4.4 (L4T R32.4.3) / JetPack 4.4.1 (L4T R32.4.4) / JetPack 4.5 (L4T R32.5.0) / JetPack 4.5.1 (L4T R32.5.1) / JetPack 4.6 (L4T R32.6.1) with Python 3.6
file_name: torch-1.10.0-cp36-cp36m-linux_aarch64.whl
URL: https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl
PyTorch v1.12.0
Supported by JetPack 5.0 (L4T R34.1.0) / JetPack 5.0.1 (L4T R34.1.1) / JetPack 5.0.2 (L4T R35.1.0) with Python 3.8
file_name: torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl
URL: https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl
For example, here we are running JP4.6.1, and therefore we choose PyTorch v1.10.0.
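Installing the chosen wheel then follows the usual pip workflow; a typical sequence for the JetPack 4.6.x wheel above might be (the apt prerequisites are an assumption; check the NVIDIA forum post linked from that page):

```bash
sudo apt-get install -y libopenblas-base libopenmpi-dev
wget https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl -O torch-1.10.0-cp36-cp36m-linux_aarch64.whl
pip3 install torch-1.10.0-cp36-cp36m-linux_aarch64.whl
```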
```bash
sudo apt install -y libjpeg-dev zlib1g-dev
git clone --branch v0.11.1 https://github.com/pytorch/vision torchvision
cd torchvision
sudo python3 setup.py install
```
Here is a list of the corresponding torchvision versions that you need to install according to the PyTorch version:
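After both installs, a quick sanity check (whether CUDA reports as available depends on your JetPack setup):

```bash
python3 -c "import torch, torchvision; print(torch.__version__, torchvision.__version__, torch.cuda.is_available())"
```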
DeepStream Configuration for YOLOv5
```bash
cd yolov5
wget https://github.com/ultralytics/yolov5/releases/download/v6.1/yolov5s.pt
```
Note: To change the inference size (default: 640)
```
-s SIZE
--size SIZE
-s HEIGHT WIDTH
--size HEIGHT WIDTH
```

Example for 1280: `-s 1280` or `-s 1280 1280`
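Putting it together, generating the model files might look like the following (the gen_wts_yoloV5.py script name and its location match older DeepStream-Yolo releases; newer releases use an ONNX-based flow instead, so verify against the repo you cloned):

```bash
# Copy the generator script from DeepStream-Yolo into the yolov5 folder, then run it
cp ../DeepStream-Yolo/utils/gen_wts_yoloV5.py .
python3 gen_wts_yoloV5.py -w yolov5s.pt -s 1280  # omit -s to keep the default 640
```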
Run the Inference
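The inference is launched with deepstream-app pointed at the prepared config file (the same command referenced earlier in this thread):

```bash
deepstream-app -c deepstream_app_config.txt
```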
The above result is running on Jetson Xavier NX with FP32 and YOLOv5s 640x640. We can see that the FPS is around 30.
INT8 Calibration
If you want to use INT8 precision for inference, you need to follow the steps below
Step 3. For the COCO dataset, download val2017, extract it, and move it to the DeepStream-Yolo folder
Step 4. Make a new directory for calibration images
Note: NVIDIA recommends at least 500 images to get good accuracy. In this example, 1000 images are chosen to get better accuracy (more images = more accuracy). Higher INT8_CALIB_BATCH_SIZE values will result in more accuracy and faster calibration speed; set it according to your GPU memory. You can set the number of images with head -1000; for example, for 2000 images, use head -2000. This process can take a long time. A sketch of these steps follows below.
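A hedged sketch of the calibration-image setup described above (the directory name and the INT8_CALIB_* variables follow the DeepStream-Yolo INT8 instructions; verify them against that repo):

```bash
mkdir calibration
# Randomly pick 1000 images from COCO val2017 (use head -2000 for 2000 images, etc.)
for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do
    cp "${jpg}" calibration/
done
# List the chosen images and point the calibrator at them
realpath calibration/*.jpg > calibration.txt
export INT8_CALIB_IMG_PATH=calibration.txt
export INT8_CALIB_BATCH_SIZE=1
```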
From
```
...
model-engine-file=model_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
...
network-mode=0
...
```
To
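The corresponding INT8 settings, inferred from the FP32 block above (the engine filename is an assumption; match it to the engine file DeepStream actually generates):

```
...
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
...
network-mode=1
...
```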
The above result is running on Jetson Xavier NX with INT8 and YOLOv5s 640x640. We can see that the FPS is around 60.
Benchmark results
The following table summarizes how different models perform on Jetson Xavier NX.
Additional
This tutorial is written by our friends at Seeed, @lakshanthad and Elaine.