doi: 10.1109/WASPAA66052.2025.11230996

Stephen D. Voran and Jaden Pieper

Abstract:

The restoration of degraded audio signals is often performed on complex-valued frequency-domain (FD) representations. This requires manipulation of either magnitudes and phases or real and imaginary parts. In general, these manipulations do not produce consistent representations. The consequence is that the magnitudes and phases (or real and imaginary parts) of the restored time-domain signal (which are always consistent) do not match the generally inconsistent values imposed during FD restoration. In colloquial terms, “What we get is not what we asked for.” The enforcement of consistency is always heard in the resulting audio and is known in principle, but it can be better understood. We present two-dimensional FD SNR frameworks (e.g., magnitude/phase or real/imaginary) that visually reveal how consistency enforcement changes the applied FD restorations to arrive at the achieved FD restorations. We also show how extended Griffin-Lim algorithms can reduce and direct, but not eliminate, the changes produced by consistency enforcement. We apply objective estimators to connect this work to estimated speech quality and intelligibility. This work can inform machine learning training and architecture choices that must balance restoration efforts across two dimensions (e.g., magnitude and phase) to arrive at the best possible speech quality.
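The consistency effect described in the abstract can be illustrated with a small numerical sketch. It is not the authors' framework or code. It assumes a short-time Fourier transform as the FD representation, an arbitrary frame length and hop, random noise as stand-in signals, and the classic (not extended) Griffin-Lim iteration. All names (`stft`, `istft`, `inconsistency`, `applied`) are hypothetical. The "applied" representation pairs the clean magnitudes with the noisy phases, and the "achieved" representation is what results after converting to the time domain and back.

```python
import numpy as np

N, HOP = 64, 16                   # frame length and hop (75% overlap); illustrative choices
WIN = np.hanning(N + 1)[:-1]      # periodic Hann window

def stft(x):
    """Windowed frames -> one-sided spectra, shape (frames, N // 2 + 1)."""
    n_frames = 1 + (len(x) - N) // HOP
    frames = np.stack([x[m * HOP:m * HOP + N] * WIN for m in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(X):
    """Least-squares inverse: windowed overlap-add divided by the summed squared window."""
    frames = np.fft.irfft(X, n=N, axis=1)
    length = N + HOP * (X.shape[0] - 1)
    x, norm = np.zeros(length), np.zeros(length)
    for m, frame in enumerate(frames):
        x[m * HOP:m * HOP + N] += frame * WIN
        norm[m * HOP:m * HOP + N] += WIN ** 2
    return x / np.maximum(norm, 1e-12)

def inconsistency(X):
    """Relative distance between the applied representation and the achieved one."""
    achieved = stft(istft(X))     # consistency is enforced by the round trip
    return np.linalg.norm(achieved - X) / np.linalg.norm(X)

rng = np.random.default_rng(0)
clean = rng.standard_normal(N + HOP * 60)          # stand-in for a clean signal
noisy = clean + rng.standard_normal(clean.size)    # stand-in for a degraded signal

X_clean, X_noisy = stft(clean), stft(noisy)

# "What we asked for": clean magnitudes combined with noisy phases.
applied = np.abs(X_clean) * np.exp(1j * np.angle(X_noisy))

err_consistent = inconsistency(X_clean)   # the STFT of a real signal is consistent
err_applied = inconsistency(applied)      # a mixed magnitude/phase pair generally is not

# Classic Griffin-Lim: keep the target magnitudes, re-estimate the phases.
Y = applied
for _ in range(20):
    Y = np.abs(X_clean) * np.exp(1j * np.angle(stft(istft(Y))))
err_gl = inconsistency(Y)

print(f"consistent input:    {err_consistent:.2e}")
print(f"applied restoration: {err_applied:.2e}")
print(f"after Griffin-Lim:   {err_gl:.2e}")
```

In this sketch the first value is at the level of floating-point round-off, while the second is clearly nonzero: the round trip has changed what was applied. The third value shows how phase re-estimation can shrink that gap under these assumptions. The scalar error is only a rough indicator and does not reproduce the two-dimensional SNR frameworks described in the abstract.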

Keywords: frequency domain analysis; machine learning; signal processing algorithms; acoustics; signal to noise ratio; time-domain analysis; machine learning algorithms; training; conferences

For technical information concerning this report, contact:

Jaden Pieper
Institute for Telecommunication Sciences
(202) 236-7516
jpieper@ntia.gov

Performing Agency

U.S. Department of Commerce
National Telecommunications and Information Administration
Institute for Telecommunication Sciences
325 Broadway
Boulder, CO 80305
https://ror.org/00mj5bc69

Funding Agency

U.S. Department of Commerce
National Telecommunications and Information Administration
Herbert C. Hoover Building
14th and Constitution Ave., N.W.
Washington, D.C. 20230
https://ror.org/032241511

Disclaimer:

Certain commercial equipment, components, and software may be identified in this report to specify adequately the technical aspects of the reported results. In no case does such identification imply recommendation or endorsement by the National Telecommunications and Information Administration, nor does it imply that the equipment or software identified is necessarily the best available for the particular application or uses.

For questions or information on this or any other NTIA scientific publication, contact the ITS Publications Office at ITSinfo@ntia.gov or 303-497-3572.
