Run Multiple Models on the Same GPU with Amazon SageMaker Multi-Model Endpoints Powered by NVIDIA Triton Inference Server. A Java client is also provided.
Java · Updated Nov 14, 2022
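As a rough illustration of what the Java client does, the sketch below (not taken from the repository) invokes a SageMaker multi-model endpoint using the AWS SDK for Java v2, selecting one of the models hosted on the shared GPU via the TargetModel parameter. The endpoint name, model artifact name, region, and request payload shape are assumptions for illustration only.

```java
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.sagemakerruntime.SageMakerRuntimeClient;
import software.amazon.awssdk.services.sagemakerruntime.model.InvokeEndpointRequest;
import software.amazon.awssdk.services.sagemakerruntime.model.InvokeEndpointResponse;

public class MultiModelEndpointClient {
    public static void main(String[] args) {
        try (SageMakerRuntimeClient runtime = SageMakerRuntimeClient.builder()
                .region(Region.US_EAST_1)                  // assumed region
                .build()) {

            // Triton on SageMaker typically accepts KServe v2-style inference
            // requests; this JSON payload is illustrative only.
            String payload = "{\"inputs\":[{\"name\":\"INPUT__0\",\"shape\":[1,3],"
                    + "\"datatype\":\"FP32\",\"data\":[1.0,2.0,3.0]}]}";

            InvokeEndpointRequest request = InvokeEndpointRequest.builder()
                    .endpointName("triton-mme-endpoint")   // hypothetical endpoint name
                    .targetModel("model-a.tar.gz")         // selects one model on the shared GPU
                    .contentType("application/json")
                    .body(SdkBytes.fromUtf8String(payload))
                    .build();

            InvokeEndpointResponse response = runtime.invokeEndpoint(request);
            System.out.println(response.body().asUtf8String());
        }
    }
}
```

With a multi-model endpoint, switching models is just a matter of changing the TargetModel value; the endpoint itself stays the same.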