A major milestone for the HDF5 ecosystem: 2.1.0 is live on conda-forge!
We are excited to announce that the first HDF5 2.x packages are now available on conda-forge. This marks a significant shift in how we build and deliver the library, moving to full Semantic Versioning to make dependency management smoother for everyone.
What’s new in the 2.x series?
• Cloud-Optimized Performance: We’ve introduced new defaults for cloud-native HDF5 and a new ROS3 (ReadOnly S3) driver backed by the Amazon C S3 library for more robust remote data access.
• Modern Hardware Support: Native support for bfloat16 and complex numbers, keeping pace with modern AI and scientific computing workloads.
• Massive Speedups: Significant performance improvements across the board, including major optimizations for Virtual Datasets (VDS).
Whether you are managing massive HPC datasets or building cloud-based analytics, HDF5 2.1.0 is built to be faster, more predictable, and ready for modern hardware.
Get started today: conda install -c conda-forge hdf5=2.1.0
Check out the files here: https://lnkd.in/gyJgYeKC
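To illustrate why Semantic Versioning can make dependency management more predictable, here is a minimal sketch of the general SemVer compatibility convention: a release is treated as a drop-in upgrade when the major number matches and the version is not older than the one required. The names `Version`, `parse_version`, and `is_compatible` are hypothetical helpers written for this example; this is not how conda or the HDF5 build system actually resolves versions.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A MAJOR.MINOR.PATCH version, compared field by field like a tuple."""
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    # Assumes a plain "X.Y.Z" string with no pre-release or build suffix.
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def is_compatible(installed: str, required: str) -> bool:
    """Return True if `installed` is a SemVer-compatible match for `required`.

    Under the SemVer convention, only a change in the major number
    signals a breaking change, so the majors must match and the
    installed version must be at least the required one.
    """
    have, need = parse_version(installed), parse_version(required)
    return have.major == need.major and have >= need


if __name__ == "__main__":
    print(is_compatible("2.1.0", "2.0.0"))  # same major, newer minor
    print(is_compatible("3.0.0", "2.1.0"))  # different major
```

Under this convention, a package built against 2.0.0 could accept 2.1.0 without changes, while a future 3.x release would signal that a rebuild or code change may be needed.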
About us
The HDF Group provides a unique suite of technologies and supporting services that make possible the management of large and complex data collections. Its mission is to advance and support HDF® (Hierarchical Data Format) technologies and ensure long-term access to HDF data.
- Website
- http://www.hdfgroup.org/
- Industry
- Non-profit Organizations
- Company size
- 11-50 employees
- Headquarters
- Champaign, Illinois
- Type
- Nonprofit
- Founded
- 2006
Locations
-
Primary
410 E University Ave
Suite 210
Champaign, Illinois 61820, US
Updates
-
New NISAR sample data products are live, showcasing how HDF5 enables seamless, high-performance access to massive Earth science datasets, and they're cloud-optimized from the start. Need help getting your data cloud-optimized? Reach out to us at The HDF Group. https://lnkd.in/eQU_zwfQ
-
Safety, security, and privacy aren't interchangeable—and in the HDF5 ecosystem, those distinctions are critical. Our latest blog post explores the "Shared Vocabulary" we’re building to identify and mitigate data risks. This work is a core part of the HDF5 SHINES project, proudly supported by the National Science Foundation (NSF) Safe-OSE program. Read more: https://lnkd.in/gsHFqv3g
-
We are excited to announce that The HDF Group has received a U.S. National Science Foundation (NSF) Safe-OSE award to dramatically enhance the safety, security, and privacy of the #HDF5 ecosystem. This grant is a critical investment in protecting the integrity of scientific, HPC, and national security data worldwide, allowing us to strengthen our development infrastructure and introduce new security features for HDF5 users everywhere. Read the full details on this vital project and its impact on the future of open-source data: https://lnkd.in/gCi3784Z #NSF #DataSecurity #HPC #OpenSource
-
The HDF Group's John Readey just returned from presenting at the Second Visualizing Offline and Live Data with AI (VOLDA) Workshop - 2025! John's talk, "Data Streaming using HDF5 and HSDS," demonstrated how to leverage HSDS to handle real-time data collection from multiple sensor streams, a critical topic for scalable data infrastructure. We're proud to have his expertise driving innovation here! You can check out the abstract and slide deck for John's contribution here: https://lnkd.in/gTUQ6HXr Take a look at the impressive full agenda for the VOLDA Workshop here: https://lnkd.in/gYKtf-jV #VOLDA2025 #DataScience #AI #HDF5 #HSDS #DataStreaming
-
IT'S HERE! The official release of HDF5 2.0.0 is live! 🚀
This is a massive leap forward, modernizing the HDF5 library for today's high-performance, large-scale data workflows. We are incredibly proud to deliver a release focused on reliability, ease of use, and speed.
The Biggest Wins in HDF5 2.0.0:
- Performance Power-Up: We’ve seen up to 2500% faster read/write operations for Virtual Datasets (VDS). If you use VDS, this is a game-changer.
- Modernized & Streamlined: We’ve adopted semantic versioning, moved to a reliable CMake-only build system, and ensured C11 standard compliance.
- New Capabilities: Enjoy native support for bfloat16 and complex numbers, plus full UTF-8 filename support on Windows.
HDF5 2.0.0 is built to power the next generation of data science and engineering. Dive into the release newsletter to see how this update will transform your pipelines!
Check out the full announcement and download HDF5 2.0.0 now: https://lnkd.in/giXE3V9c
#HDF5 #DataScience #BigData #ScientificComputing #DataEngineering #SoftwareRelease
-
Coming in the next HDF5 release: Native support for complex number datatypes
HDF5 2.0.0 introduces a standardized method for handling complex numbers, improving efficiency and data reliability for users in fields such as medical imaging, signal processing, quantum physics, and fluid dynamics.
Here's how this impacts development:
• Simplified Data Handling: Users can now directly read and write complex data, which simplifies code and reduces development effort.
• Better Data Interoperability: Users no longer need to rely on custom HDF5 datatypes to store complex numbers.
• Out-of-the-Box Tool Support: Command-line tools like 'h5dump' and 'h5ls' will natively recognize and interpret complex data types.
Look for the HDF5 2.0.0 release soon!
-
The HDF Group reposted this
💡 HDF5 isn’t just storage; it’s a data container on steroids. Raw IEX tick data in PCAP form is massive: ~6TB for the TOPS dataset alone, ~3× the storage footprint compared to HDF5, and that’s before applying TICKPACK, the multi-threaded, pipelined custom compressor for tick data I just started working on. Wondering how IEX-DOWNLOAD and IEX2H5 fit into a real workflow? I put together a full guide (requirements, rigs, workflow, benchmarks, code examples): https://lnkd.in/gYdS3bYq #HDF5 #TickData #QuantFinance #HighPerformanceComputing #OpenSource #DataInfrastructure
-
Coming in HDF5 2.0.0: an update to how non-UTF-8 filenames are handled on Windows! For this upcoming release, we added a fallback path that allows users to work with both UTF-8 and non-UTF-8 filenames. You’ll also find a new environment variable called HDF5_PREFER_WINDOWS_CODE_PAGE that allows users to always treat a filename as non-UTF-8, regardless of how the filename is actually encoded. With this change, users won’t need to work around international characters in filenames and will find a much cleaner fix than our previous efforts in 1.14.4. Let us know if you have any questions about this change. Look for the new HDF5 2.0.0 release coming out soon!
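The fallback idea described above can be pictured with a small conceptual sketch: try to interpret a filename's bytes as UTF-8 first, fall back to a code page if that fails, and skip the UTF-8 attempt entirely when the preference is set. This is only an illustration and not the library's implementation, which works at the level of opening files on Windows. The helper names, the use of cp1252 as a stand-in code page, and the accepted values of the environment variable are all assumptions made for this example.

```python
import os

# Assumption: cp1252 stands in for "the active Windows code page" so the
# sketch behaves the same on any platform.
CODE_PAGE = "cp1252"


def prefer_code_page(env=os.environ) -> bool:
    """Read the preference from HDF5_PREFER_WINDOWS_CODE_PAGE.

    Assumption: the values treated as "on" here ("1", "true") are a guess
    for illustration; consult the HDF5 release notes for the real ones.
    """
    value = env.get("HDF5_PREFER_WINDOWS_CODE_PAGE", "")
    return value.strip().lower() in {"1", "true"}


def interpret_filename(raw: bytes, force_code_page: bool = False) -> str:
    """Decode filename bytes, trying UTF-8 first unless the code page is forced."""
    encodings = [CODE_PAGE] if force_code_page else ["utf-8", CODE_PAGE]
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # fall through to the next candidate encoding
    raise ValueError("filename bytes are not valid in any candidate encoding")


if __name__ == "__main__":
    utf8_name = "données.h5".encode("utf-8")
    legacy_name = "données.h5".encode(CODE_PAGE)
    print(interpret_filename(utf8_name))    # decoded as UTF-8
    print(interpret_filename(legacy_name))  # UTF-8 fails, code page succeeds
    print(interpret_filename(legacy_name, force_code_page=prefer_code_page()))
```

In this sketch, the same name is recovered whether it was stored as UTF-8 or in the legacy code page, which is the kind of outcome the fallback path is meant to provide.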
-
Cloud optimized HDF5 is transforming data access! NASA’s ICESat-2 ATL03 data (the largest files in their archive) can now be accessed efficiently, directly from their S3 storage, via cloud optimized HDF5 files, yielding faster processing and reduced costs. You can expect more improvements in cloud-optimized HDF5 in our upcoming HDF5 2.0.0 release. Read more about the NASA development here: https://lnkd.in/gRuZRadv