Align playback of xHE‑AAC tracks with Auxio’s ReplayGain logic #1340

@flosnvjx

Describe your proposal

For audio tracks encoded in xHE-AAC, the MediaCodec decoder used by media3 applies loudness normalization based on the track's embedded MPEG-D DRC metadata. The normalization targets a fixed loudness level (-16 LKFS) and is applied automatically.
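For reference, here is a minimal sketch of how a target loudness maps to the integer that KEY_AAC_DRC_TARGET_REFERENCE_LEVEL expects. It assumes the encoding described in the Android MediaFormat documentation as I understand it: the value is -4 times the target level in LKFS, valid from 40 to 127, with -1 switching normalization off. The class and method names are hypothetical and not part of Auxio or media3.

```java
/** Hypothetical helper: integer encoding of the AAC DRC target reference level. */
final class AacTargetLevel {
    /** Value that switches off loudness normalization. */
    static final int DISABLED = -1;

    /** Encodes a target loudness in LKFS (e.g. -16.0) as the key's integer value. */
    static int encode(double targetLkfs) {
        int value = (int) Math.round(-4.0 * targetLkfs);
        if (value < 40 || value > 127) {
            throw new IllegalArgumentException("target out of range: " + targetLkfs);
        }
        return value;
    }

    /** Decodes an integer value in the 40..127 range back to LKFS. */
    static double decode(int value) {
        if (value < 40 || value > 127) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        return -value / 4.0;
    }
}
```

Under this encoding, the -16 LKFS default corresponds to the value 64.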

Scenario: When such a track is also tagged with ReplayGain metadata (e.g., written by CLI utilities like rsgain), the playback volume is adjusted twice: first by the decoder's loudness normalization, then by Auxio's ReplayGain processing. The result is unexpectedly low volume. Decode-time loudness leveling also makes the ReplayGain toggle behave inconsistently for these tracks, whether or not they carry ReplayGain tags.
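To make the double adjustment concrete, here is a small arithmetic sketch. All numbers are illustrative assumptions, not measurements: a track with a program loudness of -9 LUFS, a decoder target of -16 LKFS, and a ReplayGain tag of -9 dB (as a tagger would compute against a -18 LUFS reference if it measured the audio without decoder normalization). The class name is hypothetical.

```java
/** Hypothetical illustration of stacking decoder normalization and ReplayGain. */
final class DoubleAdjustment {
    /** Gain in dB the decoder applies to move a track from its program loudness to the target. */
    static double decoderGainDb(double programLufs, double decoderTargetLufs) {
        return decoderTargetLufs - programLufs;
    }

    /** Loudness after both the decoder gain and the ReplayGain tag gain are applied. */
    static double resultingLufs(double programLufs, double decoderTargetLufs, double replayGainDb) {
        return programLufs + decoderGainDb(programLufs, decoderTargetLufs) + replayGainDb;
    }
}
```

With those assumed numbers, `DoubleAdjustment.resultingLufs(-9.0, -16.0, -9.0)` gives -25 LUFS: the decoder contributes -7 dB and the tag another -9 dB, so the track ends up well below either target.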

I propose the following changes to harmonize the ReplayGain UX with xHE-AAC tracks:

  1. Disable decode‑time loudness normalization by default: set KEY_AAC_DRC_TARGET_REFERENCE_LEVEL to ‑1, which disables loudness normalization.

  2. Conditionally enable decoder loudness normalization:

    • If ReplayGain is enabled and the track is tagged with ReplayGain gain information → use the ReplayGain tag, i.e., apply its gain to the track and keep decoder loudness normalization disabled. Note: I prefer the ReplayGain tag here because the freely available exhale command-line encoder does not support writing album gain as MPEG-D loudness metadata in encoded xHE-AAC tracks.
    • If ReplayGain is enabled but the track lacks ReplayGain gain tags → instruct the decoder to normalize that track to a loudness level compatible with ReplayGain.

This ensures that, when ReplayGain is enabled, each xHE-AAC track is adjusted exactly once: normalized by the decoder when the track has no ReplayGain tag, or gained by ReplayGain when the tag is present. This avoids the double volume adjustment in the current Auxio implementation.
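The decision logic proposed above could be sketched roughly as follows. This is only an outline under stated assumptions: the class and method names are hypothetical, and the "ReplayGain-compatible" level is assumed to be -18 LKFS (the ReplayGain 2.0 reference), encoded as 72 on the assumption that the key takes -4 times the level in LKFS. How the resulting value reaches the decoder (for example via `MediaFormat.setInteger` with `MediaFormat.KEY_AAC_DRC_TARGET_REFERENCE_LEVEL`) is left out, since that wiring depends on Auxio's media3 setup.

```java
/** Hypothetical policy: which loudness adjustment applies to an xHE-AAC track. */
final class XheAacLoudnessPolicy {
    /** Switches off decoder loudness normalization. */
    static final int NORMALIZATION_OFF = -1;

    /** Assumption: -18 LKFS, encoded as -4 * (-18) = 72. */
    static final int REPLAYGAIN_COMPATIBLE_LEVEL = 72;

    /** Value to use for the decoder's target reference level on this track. */
    static int targetReferenceLevel(boolean replayGainEnabled, boolean hasReplayGainTags) {
        // Only fall back to decoder normalization when ReplayGain is on
        // but the track carries no ReplayGain tags.
        if (replayGainEnabled && !hasReplayGainTags) {
            return REPLAYGAIN_COMPATIBLE_LEVEL;
        }
        return NORMALIZATION_OFF;
    }

    /** Whether the ReplayGain tag gain should be applied to this track. */
    static boolean applyReplayGainTag(boolean replayGainEnabled, boolean hasReplayGainTags) {
        return replayGainEnabled && hasReplayGainTags;
    }
}
```

In every combination, at most one of the two adjustments is active, which is the point of the proposal: with ReplayGain disabled, neither applies and the track plays at its original loudness.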

What problem does this proposal solve?

  • Prevents the double adjustment (first by decoder loudness normalization, then by ReplayGain) that currently makes xHE-AAC tracks tagged with ReplayGain gain data play back at an unexpectedly low volume.
  • Makes the ReplayGain toggle UX consistent between xHE-AAC and other tracks: when ReplayGain is enabled, the user hears the ReplayGain-adjusted level; when it is disabled, the user hears the original loudness level instead of a persistent -16 LKFS loudness level across all xHE-AAC tracks.

What alternatives have you considered?

  • Do nothing: accept the double-adjustment behavior and the -16 LKFS loudness normalization. Users whose music collections contain xHE-AAC files would have to at least remove the ReplayGain tags to get an acceptable playback volume.
  • Only disable AAC loudness normalization at the decoder: treat the loudness metadata of AAC tracks as if it did not exist. If loudness leveling is desired, the user must tag the track with ReplayGain data. This should require minimal codebase changes.

Additional context
