vLLM and PyTorch worked together to fix a long-standing aarch64 install headache: as of PyTorch 2.11.0, pip install torch on GB200 / GB300 / GH200 just works.

What changed: PyTorch 2.11.0 now publishes CUDA-enabled aarch64 wheels to the default PyPI index. No more custom --index-url flags. No more transitive dependencies silently swapping your GPU build for the CPU wheel. New users on Grace Hopper and Grace Blackwell systems can follow the standard install instructions and have vLLM work the first time.

In our latest blog, Kaichao You (co-founder Inferact, Lead Maintainer vLLM) shares the full story:
🐛 A 2024 hackathon bug bringing up vLLM on GH200
🔧 vLLM's in-tree workarounds (use_existing_torch.py and [tool.uv] build-isolation passthrough)
🤝 From GitHub issue to PyTorch Foundation TAC discussion
🚀 The fix landing in PyTorch 2.11.0, driven by NVIDIA and PyTorch core.

A great example of cross-project collaboration under the PyTorch Foundation umbrella, and a reminder that boring infrastructure wins compound.

Read the full story: https://lnkd.in/gGc8mRm8
✍ Alban Desmaison (Meta), Nikita Shulga (Meta), Andrey Talman (Meta), Piotr Bialecki (NVIDIA)
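To confirm that an install picked up the GPU build rather than the CPU wheel, a quick check along these lines may help. This is a minimal sketch, not something taken from the blog: the helper names `describe_torch_build` and `check_installed_torch` are made up for illustration, and the only PyTorch attributes it relies on are `torch.__version__` and `torch.version.cuda` (which is `None` when no CUDA support is compiled in).

```python
import platform
from typing import Optional


def describe_torch_build(version: str, cuda_version: Optional[str], machine: str) -> str:
    """Classify a torch install from three values (hypothetical helper).

    version      -- torch.__version__, e.g. "2.11.0" or "2.11.0+cpu"
    cuda_version -- torch.version.cuda (None when CUDA is not compiled in)
    machine      -- platform.machine(), e.g. "aarch64" or "x86_64"
    """
    if cuda_version is None:
        return f"torch {version} on {machine}: no CUDA support compiled in"
    return f"torch {version} on {machine}: CUDA build (CUDA {cuda_version})"


def check_installed_torch() -> str:
    """Inspect whatever torch is installed in the current environment."""
    try:
        import torch  # imported lazily so the helper above works without torch
    except ImportError:
        return "torch is not installed"
    return describe_torch_build(str(torch.__version__), torch.version.cuda, platform.machine())


if __name__ == "__main__":
    print(check_installed_torch())
```

Running the script after installing vLLM and its dependencies is one way to notice whether a transitive dependency replaced the GPU build along the way.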
PyTorch
Research Services
San Francisco, California 319,831 followers
An open source machine learning framework that accelerates the path from research prototyping to production deployment.
About us
An open source machine learning framework that accelerates the path from research prototyping to production deployment. PyTorch is an open source project at the Linux Foundation.
- Website
- http://www.pytorch.org
- Industry
- Research Services
- Company size
- 501-1,000 employees
- Headquarters
- San Francisco, California
- Type
- Public Company
- Specialties
- Artificial Intelligence, Deep Learning, Machine Learning, and AI
Locations
- Primary
- 548 Market St, San Francisco, California, US
Updates
-
Don't miss the DeepSpeed.ai virtual office hours on May 26 at 12:00 PM America/New_York. Get recent key updates, including AutoSP (sequence parallel), AutoEP (expert parallel), and AutoTP (tensor parallel), and bring your questions for Masahiro Tanaka, a member of the DeepSpeed TSC.
Have questions about DeepSpeed or ideas for what we should prioritize next? Join our next DeepSpeed Office Hours on Tuesday, May 26 at 12:00 PM America/New_York. We'll cover general questions, Q2 roadmap progress, and requests for Q3. Everyone is welcome! Zoom: https://lnkd.in/gHkQA2KA
-
ExecuTorch now has an MLX delegate that runs PyTorch models on Apple Silicon GPUs. It supports LLMs, speech-to-text, and MoE models with quantization via TorchAO. Export with torch.export, run on Metal. Read our latest blog: https://lnkd.in/gNGubvXa
-
PyTorch reposted this
I’ll be at #MLSys this week, May 18–22 🚀 The PyTorch Foundation will have a booth with experts from across foundation projects, including PyTorch, vLLM, and Ray. Come by to meet the teams, ask questions, and learn more about the open AI infrastructure ecosystem. 🔥 I’ll also be speaking Monday morning on agentic self-improvement with OpenRoll, an agentic trace and rollout spec. 🤖 Hope to see you there — come by the booth and say hello! 👋 #PyTorch #vLLM #Ray #OpenRoll #OpenSourceAI #LinuxFoundation PyTorch The Linux Foundation Agentic AI Foundation
-
PyTorch releases include thousands of changes, and our Release Live Q&As give you direct access to the maintainers and contributors behind them. In this clip from our PyTorch 2.11 webinar, Nikita Shulga explains why quantization support is moving from PyTorch core to TorchAO and what that means for developers. Join Andrey Talman, Alban Desmaison, and Joseph Spisak on May 20 at 10:00 AM PT for a technical overview of PyTorch 2.12 and a live Q&A moderated by Chris Gottbrath. PyTorch 2.12 includes major updates across compilation, distributed systems, export, graph capture, and accelerator support. 🔗 Register now: https://lnkd.in/gRJ7dSTE
-
Big congrats to the ExecuTorch team. Their paper just won the Best Industry Paper Award at MLSys 2026. 🏆 ExecuTorch is PyTorch's framework for running models on-device, in production today across phones, wearables, desktops, and embedded systems. Backends include Apple, Arm, Qualcomm, MediaTek, Samsung, NXP, Cadence, Intel, NVIDIA, and others. Catch the team Tuesday morning in the Best Paper Session (8:45 AM PT) and at the poster session Thursday evening. Paper: https://lnkd.in/g5UmCu4D Discord: https://lnkd.in/g2e4xEcw X: https://x.com/Executorch Project: https://lnkd.in/grE63qJ8 Schedule: https://lnkd.in/gAkzQfPY ✍ Mergen Nachin, Digant Desai, Stephen Jia, Chen Lai, Mengwei Liu, Jacob Szwejbka, Raziel Alvarez, RJ Ascani, Dave Bort, Manuel Candales, Andrew Caples, Yanan Cao, Zhengxu Chen, Soumith Chintala, Gregory Comer, Tanvir Islam, Songhao Jia, Tarun Karuturi, Jack Khuu, Abhinay Kukkadapu, Tugsbayasgalan Manlaibaatar, Andrew Or, Kimish Patel, Siddartha Pothapragada, Lucy Qiu, Supriya Rao, Orion Reblitz-Richardson, Max Ren, Scott Roy, Anthony Shoumikhin, Scott Wolchok, Guang Y., Angela Y., Mengtao Yuan, Hansong Zhang, Jack Zhang, Zhenrui(Jerry) Zhang, Shunting Zhang, Cagatay Bilgin #MLSys2026
-
How do you want to share your expertise at PyTorch Conference North America? 🗣️ We have multiple ways for you to contribute to the program in San Jose (Oct 20-21): 🔸 Session Presentations (25 min): Focused technical talks or case studies. 🔸 Lightning Talks (10 min): Fast-paced, high-impact insights. 🔸 Birds of a Feather (25 min): Informal, participant-driven discussions. 🗓️ Get your proposal in by June 7: https://bit.ly/4bIgqbs #PyTorchCon #PyTorch #PyTorchFoundation #FutureOfAI #AI #GenAI #MachineLearning #ML #DeepLearning #OpenSource #OpenSourceSoftware #OpenSourceDevelopment #OpenSourceCommunity #OSS #LinuxFoundation #events #linux #CallForProposals #CallForPapers #CFP #CallForSpeakers
-
In his keynote at PyTorch Conference Europe 2026, Edward Yang (Meta) describes a common frustration with tensor parallelism: getting distributed gradients right often requires inserting non-obvious operations, and missing one can lead to silently incorrect results. PyTorch is addressing this with SPMD types, a new type-checking system that verifies distributed programs and identifies where required collectives are missing. Watch Edward's full keynote for updates on pre-compilation support for torch.compile, SPMD types, and other recent work in distributed training: https://lnkd.in/gm8eA-JW #PyTorchCon
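The idea of catching a missing collective through types can be illustrated with a toy checker. This is only a sketch under loud assumptions: it is not the PyTorch SPMD types API, every name in it (`Placement`, `TypedTensor`, `MissingCollectiveError`, and the three op functions) is hypothetical, it models only the forward pass, and it checks placements while the expression is being built rather than statically.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    REPLICATE = "replicate"  # every rank holds the full, identical value
    SHARD = "shard"          # every rank holds one slice of the value
    PARTIAL = "partial"      # every rank holds a partial sum; a reduction is pending


@dataclass(frozen=True)
class TypedTensor:
    """A tensor stand-in that carries only a name and a placement type."""
    name: str
    placement: Placement


class MissingCollectiveError(TypeError):
    """Raised when an op consumes a value that still needs a collective."""


def row_parallel_matmul(x: TypedTensor, w: TypedTensor) -> TypedTensor:
    # Both operands are sharded along the contraction dimension, so each
    # rank computes only a partial sum of the true product.
    if x.placement is not Placement.SHARD or w.placement is not Placement.SHARD:
        raise TypeError("row_parallel_matmul expects SHARD inputs")
    return TypedTensor(f"({x.name} @ {w.name})", Placement.PARTIAL)


def all_reduce(t: TypedTensor) -> TypedTensor:
    # Summing the partial results across ranks yields a replicated value.
    if t.placement is not Placement.PARTIAL:
        raise TypeError("all_reduce expects a PARTIAL input")
    return TypedTensor(f"all_reduce({t.name})", Placement.REPLICATE)


def relu(t: TypedTensor) -> TypedTensor:
    # relu(a + b) != relu(a) + relu(b), so applying it to partial sums
    # would produce wrong numbers without any runtime error.
    if t.placement is Placement.PARTIAL:
        raise MissingCollectiveError(f"relu({t.name}): insert all_reduce first")
    return TypedTensor(f"relu({t.name})", t.placement)


if __name__ == "__main__":
    y = row_parallel_matmul(TypedTensor("x", Placement.SHARD), TypedTensor("w", Placement.SHARD))
    print(relu(all_reduce(y)))  # accepted: the reduction happens before the nonlinearity
    try:
        relu(y)  # rejected: the all_reduce was forgotten
    except MissingCollectiveError as err:
        print("type error:", err)
```

In this toy, forgetting the `all_reduce` becomes an explicit error instead of a silently wrong number, which is the kind of failure the keynote describes.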
-
PyTorch reposted this
vLLM tops the Artificial Analysis leaderboard 🎉
vLLM tops Artificial Analysis on DeepSeek V3.2 and ranks among the top deployments of MiniMax-M2.5 and Qwen 3.5 397B. The leading deployments of these models are now open source.
How each result was built:
🔹 DeepSeek V3.2: Aggressive op fusion across the attention path collapsed ~33 per-layer kernels down toward ~10.
🔹 MiniMax-M2.5: Custom EAGLE3 draft trained against the target's own token distribution via TorchSpec, plus a custom QK-norm fusion for MiniMax's TP-aware attention.
🔹 Qwen 3.5 397B: Targeted fusions plus a QK-norm fix for Qwen's linear-attention path.
Every optimization is in vLLM main or on its way upstream. Huge thank you to Inferact, DigitalOcean, NVIDIA, Red Hat, and the vLLM community 🙏
Full breakdown 👇 https://lnkd.in/gtRgxSFS
-
Driving optimized nuclear reactor design with AI Physics and Digital Twins
The development of socially acceptable nuclear reactors requires that they are safe, clean, efficient, economical, and sustainable. However, validating new designs presents significant challenges including expense, time constraints, and inherent complexities of physical experiments and simulations. NVIDIA CUDA-X libraries, the PhysicsNeMo AI Physics framework, and the Omniverse libraries help developers in the nuclear industry address these challenges by delivering GPU-accelerated, AI-augmented simulation solutions for real-time digital twins.
Read the full post: https://lnkd.in/gT8W4Wse