l40s Datasheet

Uploaded by

mario.garcia
Datasheet

NVIDIA L40S
Unparalleled AI and graphics performance
for the data center.

Generative AI is fueling transformative change, unlocking a new frontier of
opportunities for enterprises across every industry. To transform with AI, enterprises
need more compute resources, greater scale, and a broad set of capabilities to meet
the demands of an ever-increasing set of diverse and complex workloads.

The NVIDIA L40S GPU is the most powerful universal GPU for the data center,
delivering end-to-end acceleration for the next generation of AI-enabled
applications, from gen AI, LLM inference, small-model training and fine-tuning to
3D graphics, rendering, and video applications.

Accelerate Next-Generation Workloads
> Generative AI
> LLM inference
> LLM fine-tuning and small-model training
> NVIDIA Omniverse™ Enterprise
> Rendering and 3D graphics
> Streaming and video content

Generative AI Image Generation
Stable Diffusion (images per minute)
SD (512x512)       82
SD (1024x1024)     17
SDXL (1024x1024)   11
Measured performance; NVIDIA L40S. Stable Diffusion v2.1, TRT 8.6.1, BS:1, FP16 |
Stable Diffusion XL 1.0, TRT 8.6.1, BS:1, FP16.

Large Language Model (LLM) Inference
1st Token Latency (ms)
Llama 2-7B         77
Llama 2-13B        143
Llama 2-70B        669
Measured performance; NVIDIA L40S. Llama 2-7B/13B/70B, ISL=2048, OSL=128,
BS=1: FP8.

Powered by the NVIDIA Ada Lovelace Architecture


Fourth-Generation Tensor Cores
Hardware support for structural sparsity and optimized TF32 format provides
out-of-the-box performance gains for faster AI and data science model training.
Accelerate AI-enhanced graphics capabilities with DLSS to upscale resolution
with better performance in select applications.

NVIDIA L40S | Datasheet | 1


Third-Generation RT Cores
Enhanced throughput and concurrent ray-tracing and shading capabilities improve
ray-tracing performance, accelerating renders for product design and architecture,
engineering, and construction workflows. See lifelike designs in action with
hardware-accelerated motion blur and stunning real-time animations.

Transformer Engine
Transformer Engine dramatically accelerates AI performance and improves memory
utilization for both training and inference. Harnessing the power of the Ada
Lovelace fourth-generation Tensor Cores, Transformer Engine intelligently scans
the layers of transformer architecture neural networks and automatically
recasts between FP8 and FP16 precisions to deliver faster AI performance and
accelerate training and inference.
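
The idea of switching between FP8 and FP16 can be illustrated with a toy
range check. The Python sketch below is not NVIDIA's Transformer Engine
algorithm, which the datasheet does not describe. It assumes the E4M3 flavor of
FP8 (largest finite value 448, smallest positive value 2^-9) and uses a
hypothetical rule: scale a tensor so its largest magnitude lands on the FP8
maximum, then fall back to FP16 if the smallest nonzero magnitude would drop
below what FP8 can represent. The names fp8_scale and pick_precision are
illustrative only.

```python
# Toy illustration only; NOT the actual Transformer Engine logic.
# Format limits below are properties of FP8 E4M3, not datasheet values.
FP8_E4M3_MAX = 448.0                 # largest finite E4M3 value
FP8_E4M3_MIN_SUBNORMAL = 2.0 ** -9   # smallest positive E4M3 value


def fp8_scale(amax):
    """Per-tensor scale that maps the largest magnitude onto the FP8 maximum."""
    return FP8_E4M3_MAX / amax if amax > 0.0 else 1.0


def pick_precision(values):
    """Hypothetical rule: use FP8 only if the scaled tensor fits its range."""
    magnitudes = [abs(v) for v in values if v != 0.0]
    if not magnitudes:
        return "fp8"  # an all-zero tensor is trivially representable
    scale = fp8_scale(max(magnitudes))
    smallest_scaled = min(magnitudes) * scale
    if smallest_scaled >= FP8_E4M3_MIN_SUBNORMAL:
        return "fp8"
    return "fp16"  # range too wide for FP8; keep the wider format


if __name__ == "__main__":
    print(pick_precision([0.5, -2.0, 1.5]))   # narrow range
    print(pick_precision([1e-7, 1000.0]))     # very wide range
```

A tensor with a narrow spread of magnitudes passes the check and stays in FP8,
while one spanning many orders of magnitude is kept in FP16. Production systems
track statistics over many iterations rather than a single tensor.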

Data Center Ready


The L40S GPU is optimized for 24/7 enterprise data center operations and designed,
built, tested, and supported by NVIDIA to ensure maximum performance, durability,
and uptime. The L40S GPU meets the latest data center standards, is Network
Equipment-Building System (NEBS) Level 3 ready, and features secure boot with root
of trust technology, providing an additional layer of security for data centers.

Technical Specifications

GPU Architecture                                     NVIDIA Ada Lovelace Architecture
GPU Memory                                           48GB GDDR6 with ECC
Memory Bandwidth                                     864GB/s
Interconnect Interface                               PCIe Gen4 x16: 64GB/s bidirectional
NVIDIA Ada Lovelace Architecture-Based CUDA® Cores   18,176
NVIDIA Third-Generation RT Cores                     142
NVIDIA Fourth-Generation Tensor Cores                568
RT Core Performance TFLOPS                           209
FP32 TFLOPS                                          91.6
TF32 Tensor Core TFLOPS                              183 | 366*
BFLOAT16 Tensor Core TFLOPS                          362.05 | 733*
FP16 Tensor Core TFLOPS                              362.05 | 733*
FP8 Tensor Core TFLOPS                               733 | 1,466*
Peak INT8 Tensor TOPS                                733 | 1,466*
Peak INT4 Tensor TOPS                                733 | 1,466*
Form Factor                                          4.4" (H) x 10.5" (L), dual slot
Display Ports                                        4x DisplayPort 1.4a
Max Power Consumption                                350W
Power Connector                                      16-pin
Thermal                                              Passive
Virtual GPU (vGPU) Software Support                  Yes
vGPU Profiles Supported                              See the virtual GPU licensing guide
NVENC | NVDEC                                        3x | 3x (includes AV1 encode and decode)
Secure Boot With Root of Trust                       Yes
NEBS Ready                                           Level 3
MIG Support                                          No
NVIDIA® NVLink® Support                              No

* With sparsity
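
Two figures in the table can be sanity-checked with simple arithmetic: the
sparsity uplift on the Tensor Core rows and the PCIe Gen4 x16 bandwidth. The
Python sketch below is illustrative only. The throughput pairs are copied from
the table; the 16 GT/s per-lane rate and 128b/130b line encoding are general
PCIe 4.0 characteristics rather than datasheet values, and the function names
are hypothetical.

```python
# (dense, with sparsity) Tensor Core throughput from the table, TFLOPS or TOPS.
TENSOR_THROUGHPUT = {
    "TF32": (183.0, 366.0),
    "BFLOAT16": (362.05, 733.0),
    "FP16": (362.05, 733.0),
    "FP8": (733.0, 1466.0),
    "INT8": (733.0, 1466.0),
    "INT4": (733.0, 1466.0),
}


def sparsity_speedups(table):
    """Ratio of the sparse figure to the dense figure for each format."""
    return {name: sparse / dense for name, (dense, sparse) in table.items()}


def pcie_bidirectional_gb_s(gt_per_s=16.0, lanes=16, encoding=128.0 / 130.0):
    """Bidirectional PCIe bandwidth in GB/s.

    Defaults assume PCIe 4.0: 16 GT/s per lane with 128b/130b encoding.
    Pass encoding=1.0 to ignore line-encoding overhead.
    """
    per_direction = gt_per_s * lanes * encoding / 8.0  # bits -> bytes
    return 2.0 * per_direction


if __name__ == "__main__":
    for name, ratio in sparsity_speedups(TENSOR_THROUGHPUT).items():
        print(f"{name}: {ratio:.2f}x with sparsity")
    print(f"PCIe Gen4 x16, with encoding overhead: {pcie_bidirectional_gb_s():.1f} GB/s")
    print(f"PCIe Gen4 x16, raw signaling rate: {pcie_bidirectional_gb_s(encoding=1.0):.1f} GB/s")
```

Every sparse figure comes out at roughly twice its dense counterpart, and the
table's 64GB/s matches the raw signaling rate before encoding overhead.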

Ready to Get Started?


To learn more about the NVIDIA L40S, visit
www.nvidia.com/l40s
© 2024 NVIDIA Corporation and affiliates. All rights reserved. NVIDIA, the NVIDIA logo, CUDA, HGX, NVLink, and
Omniverse are trademarks and/or registered trademarks of NVIDIA Corporation and affiliates in the U.S. and other
countries. Other company and product names may be trademarks of the respective owners with which they are
associated. 3110647. FEB24
