Releases: GoogleCloudPlatform/gcsfuse

Gcsfuse v3.4.3

30 Oct 06:25
6be19f4

Updated the Go version from 1.24.5 to 1.24.6 to fix CVEs.

Gcsfuse v3.4.1

21 Oct 16:43

This release is built on top of v3.4.0 and upgrades the Go SDK dependency from v1.56.2 to v1.56.3.

Dependency Upgrades / CVE fixes
Go SDK patch release upgrade 1.56.2 -> 1.56.3

Gcsfuse v3.4.0

26 Sep 10:19
776e969

Introducing profile-based optimizations for AI/ML workloads: We are adding a --profile flag that applies the best practices recommended in our performance tuning guidance for AI/ML. The supported values (as of now) are aiml-training, aiml-checkpointing and aiml-serving. Setting the profile flag simplifies performance optimization for AI/ML workloads by consolidating over 10 flags into one (see the example mount command after the notes below). Details of the profiles can be found here.

Please note:

  • You can set --profile only during mount. To update it, you need to remount.
  • Memory Considerations: As part of applying a profile, the metadata cache capacity and TimeToLive (TTL) are set to unlimited (entries never expire and are never evicted). If your VM doesn’t have enough memory, this can result in Out of Memory (OOM) errors, so weigh the available memory against what the workload needs. This applies to any machine, but machines with limited memory (<1 TiB) are more prone to OOM errors.
  • Precedence order:
    • The profile-based optimization supersedes optimizations applied for high-end machine types introduced in v3.0.0.
    • The profile-based optimization is superseded by a user-specified value in the mount command or in the configuration file, if set.
  • The profile flag is not yet supported in GCSFuse CSI volumes in GKE pods.
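
A minimal sketch of applying a profile at mount time (the bucket and mount point names are hypothetical):

        # Mount with the AI/ML training profile; any flag set explicitly here
        # still overrides the profile's defaults.
        gcsfuse --profile=aiml-training my-training-bucket /mnt/gcs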

Rapid Storage

  • Appends to objects originally written via GCSFuse now leverage Rapid Storage's real-time append support, and the appended content is visible in real time. Note that this applies only to file handles opened with the O_APPEND mode (see the sketch after this list).
  • Read Improvements
    • Faster Random Reads (~10%)
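
As an illustrative sketch of the O_APPEND note above (the path is hypothetical), shell append redirection opens the target file with O_APPEND, so such writes take the real-time append path:

        # '>>' opens the file with O_APPEND; appended bytes become visible in real time.
        echo "step=100 loss=0.42" >> /mnt/gcs/logs/train.log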

Dependency Upgrades / CVE fixes

  • Go SDK patch release upgrade 1.56.1 -> 1.56.2
  • gRPC patch release upgrade 1.75.0 -> 1.75.1

Full Changelog: v3.3.0...v3.4.0

Gcsfuse v3.3.0

08 Sep 08:10
c8e6915

Buffered Read: Accelerate Large Sequential Reads

In this release, we are introducing the Buffered Read feature, designed to accelerate applications that perform large, sequential reads on files stored in Google Cloud Storage. This is helpful for reading model weights during inference or for large-scale media processing applications that read data sequentially.

This feature improves throughput (2-5x) by asynchronously prefetching parts of a GCS object in parallel into an in-memory buffer and serving subsequent reads from this buffer instead of making network calls. This asynchronous, parallel buffering approach saturates network bandwidth without requiring additional application-side parallelism.

  • Feature Enablement: This feature is disabled by default and can be enabled with the --enable-buffered-read flag or the read:enable-buffered-read config option (see the example mount command after this list). Buffered reads are ignored if the file cache is enabled. Over time, we will work towards enabling this by default.
  • Use Cases: Single-threaded applications reading large (>100 MB) files sequentially, e.g. reading the model weights during inference (prediction) of an AI/ML model.
  • Memory usage:
    • Buffered readers will use CPU memory for storing the prefetched blocks.
    • Memory usage is capped at 320 MB per file handle and 640 MB (40 x 16 MB blocks) globally. The global memory limit is configurable via --read-global-max-blocks (default: 40).
    • This memory is automatically released when a file handle is closed or a random read access pattern is detected.
  • CPU Usage: The CPU overhead is typically proportional to the performance gain achieved.
  • Known Limitations:
    • Workloads that combine sequential and random reads (including some model-serving techniques, such as pipeline parallelism, that perform random reads at the start before switching to large sequential reads) may not benefit and can automatically fall back to default reads. We plan to improve buffered reads for such scenarios in future releases; please reach out to us if you need better performance in these cases.
    • Consider available system memory when enabling buffered reads, as usage can reach 640 MB by default. Reduce --read-global-max-blocks (default: 40) to lower this limit and avoid Out of Memory (OOM) issues.
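
A minimal sketch of enabling buffered reads at mount time (the bucket and mount point names are hypothetical); the lower block count is just one way to cap prefetch memory on smaller machines:

        # Enable buffered reads and cap global prefetch memory at 20 x 16 MB = 320 MB.
        gcsfuse --enable-buffered-read --read-global-max-blocks=20 my-model-bucket /mnt/gcs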

Bug Fixes

  • Resolved sporadic mount failures and enhanced stability by improving the retry mechanism for stalled API calls to the backend. (#3561, #3684)
  • Fixed GCSFuse not returning errors (e.g. a resource busy error) when unmounting a bucket mounted with gcsfuse; this issue was introduced in 3.2.0 (#3768).

Improvements

  • Streaming Writes now support retrying stalled write operations.
  • Improved stability for writes (#3710)
  • Logging/monitoring improvements
    1. Efficient metrics collection: Metrics collection is more efficient now, with CPU usage reduced from 15% to <2% for small single-threaded reads and memory allocation overhead reduced from 34% to 0%.
    2. Mount logs display block size used in streaming writes in MiB instead of bytes for improved readability.
    3. Added an error log for unsupported values of the log-format flag (#3751)
    4. Made logging more efficient by downgrading some unimportant logs from higher logging levels to trace. (#3746, #3749)

Dependency Upgrades / CVE fixes

  • Dependency upgrades (#3740)

Full Changelog: v3.2.0...v3.3.0

Gcsfuse v2.11.4

20 Aug 12:18

This release is built on top of 2.11.3 and contains a fix for the symlink rename issue (a bug that caused I/O errors while renaming symlinks - #3648).

Gcsfuse v3.2.0

11 Aug 12:27
7de3ac9

Rapid Storage

  • Support for parallel random reads on the same file handle for Rapid Storage, which improves random read performance.

Atomic Move Object:

  • Renames now use the improved atomic rename object operation on non-Hierarchical Namespace buckets (flat namespace buckets).
  • Previously, this was only supported in Hierarchical Namespace Buckets.

Bug Fixes

  • Streaming Writes: Fixed a bug in semaphore release handling that led to incorrect utilization of the configured write:global-max-blocks limit.

  • Rapid Storage: Fixed a gRPC error by upgrading the gRPC dependency to 1.74 (#3567)

  • Symlink Fix - A bug which caused I/O errors while renaming symlinks has been fixed (#3648).

Improvements

  • Random reader refactoring - The new, modular design of the read path improves maintainability, increases test coverage to 95% and accelerates future development.

  • Dependency upgrades (#3567)

Full Changelog: v3.1.0...v3.2.0

Gcsfuse v2.11.3

09 Jul 08:21

Vulnerability fix on top of v2.11.2 (golang.org/x/net upgraded from v0.37.0 to v0.41.0).

Gcsfuse v3.1.0

08 Jul 09:33

Renaming Objects with MoveObject API:

Starting with this release, we're transitioning from the "copy-and-delete" method to the more robust MoveObject API. This change brings atomic rename capability, meaning that all renaming operations are now completed as a single, indivisible process (see the sketch below).
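
As a sketch (the paths are hypothetical), an ordinary rename inside the mount now maps to a single MoveObject call instead of a copy followed by a delete:

        # One atomic MoveObject call instead of copy-and-delete.
        mv /mnt/gcs/checkpoints/step-1000.tmp /mnt/gcs/checkpoints/step-1000.ckpt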

Enhanced Read Reliability and Reduced Read Tail Latency

Starting in v3.1.0, GCSFuse improves read reliability and helps reduce read tail latencies by automatically detecting and retrying stalled GET requests. This feature uses a dynamic timeout based on historical successful or canceled request latency to identify and retry the slowest 1% of reads, which are the primary contributors to tail latency. If a request times out, it is retried using an exponential backoff algorithm.

Automatic Inactive Read Streams Closure:

This improves GCS server-side resource (TCP connections, memory) utilization by automatically closing inactive GCSFuse read streams after a reasonable timeout (between 10s and 20s). This is especially important for large-scale AI/ML training workloads. Inactive read streams often result from workloads that keep file handles open and idle for some time after reading only a portion of the data. If the workload performs a read after the stream has been closed, the read still succeeds, but with the additional delay of re-establishing the stream.

Bug fixes & Improvements:

  • Fixed an issue where GCSFuse would unnecessarily consume CPU cycles due to improper jacobsa/fuse logger initialization. Loggers now initialize as configured (e.g., OFF, INFO). - PR

Dependency Upgrades / CVE fixes

  • Fixed a vulnerability in go-viper/mapstructure - PR

Gcsfuse v3.0.1

02 Jul 16:19

Gcsfuse v3.0.0

12 Jun 09:38

Automatic Defaults for High Specification Machines

Cloud Storage FUSE automatically optimizes default configuration settings when running on specific high-performance Google Cloud machine types to maximize performance for demanding, high-throughput workloads. Values that are manually set at the time of mount will override these defaults.

Configurations are automated for the following high-performance machine types:
a2-megagpu-16g, a2-ultragpu-8g, a3-edgegpu-8g, a3-highgpu-8g, a3-megagpu-8g, a3-ultragpu-8g, a4-highgpu-8g-lowmem, ct5l-hightpu-8t, ct5lp-hightpu-8t, ct5p-hightpu-4t, ct5p-hightpu-4t-tpu, ct6e-standard-4t, ct6e-standard-4t-tpu, ct6e-standard-8t, ct6e-standard-8t-tpu.

When a supported machine type is detected, Cloud Storage FUSE automatically applies the following configuration values:

        implicit-dirs: true
        metadata-cache.negative-ttl-secs: 0
        metadata-cache.ttl-secs: -1
        metadata-cache.stat-cache-max-size-mb: 1024
        metadata-cache.type-cache-max-size-mb: 128
        file-system.rename-dir-limit: 200000
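
Because manually supplied values win over these automatic defaults, a single explicit flag is enough to override one of them. A minimal sketch (the flag name mirrors the metadata-cache.ttl-secs key above; the bucket and mount point names are hypothetical):

        # Keep the other automatic defaults but cap the metadata cache TTL at 10 minutes.
        gcsfuse --metadata-cache-ttl-secs=600 my-bucket /mnt/gcs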

Streaming Writes is the Default Write Algorithm

Streaming Writes was introduced in v2.9.1. Starting with v3.0.0, the Streaming Writes algorithm becomes the default path for new file writes, superseding the previous algorithm that temporarily stages the entire write in a local directory. The previous algorithm remains the fallback when Streaming Writes cannot be used, such as for file edit scenarios.

In addition to Streaming Writes becoming the default write path, the following improvements are also included:

Concurrent Reads During Writes: Streaming Writes now supports reading files even while writes are in progress. Upon a read request, the object will be finalized, and reads will be served directly from GCS. Subsequent write operations to the same file will revert to legacy staged writes.

Write Stalls and Chunk Uploads: Streaming writes do not currently implement chunk-level timeouts or retries. Write operations may stall, and chunk uploads that encounter errors will eventually fail after the default 32-second deadline.

Memory Control for Streaming Writes: A new flag, --write-global-max-blocks (or the write:global-max-blocks config), has been added to control the memory usage of streaming writes. By default, each file actively being written via streaming writes is allocated one 32 MiB block. This flag lets you control the total number of blocks used by streaming writes across the entire mount to limit memory if needed. Once the limit is reached, any new file write falls back to the previous algorithm.
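
As a minimal sketch (the bucket and mount point names are hypothetical), capping streaming-write buffer memory at roughly 32 blocks x 32 MiB = 1 GiB across the mount:

        # Limit streaming writes to ~1 GiB of buffer memory across the whole mount;
        # writes beyond this limit fall back to the previous staged-write path.
        gcsfuse --write-global-max-blocks=32 my-bucket /mnt/gcs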

Fix for Inconsistent File Deletions: Previously, an edge case involving out-of-order writes during streaming and subsequent file deletion from the mount could prevent files from being removed from GCS. This has now been fixed.

Truncation Support: Truncating files to a size smaller than their current size is now supported during streaming writes.

If needed, users can switch back to the previous method of fully staged writes by passing:
Command-line flag: --enable-streaming-writes=false
Configuration file: write:enable-streaming-writes:false

Bug fixes:

  • Rapid Storage: Fixed an edge case of high read latencies for random read scenarios on Rapid Storage. [PR/3327]
  • GCSFuse now honors the GCE_METADATA_HOST environment variable for custom metadata hosts (see the sketch after this list) [PR/3253]
  • Fixed broken support for requester-pays buckets [PR/3256]
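
A hedged sketch of the custom metadata host support (the host shown is hypothetical):

        # Point the metadata client at a custom metadata server before mounting.
        GCE_METADATA_HOST=metadata.internal.example gcsfuse my-bucket /mnt/gcs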

Full Changelog: v2.12.0...v3.0.0