Accuracy of SRC in Audio

The objective audio measurements below illustrate how accurately various sample rate conversion (SRC) algorithms preserve the waveform, magnitudes and phases of the initial signal. The resamplers were tested with both technical signals and real music. Signal degradation is measured with the Difference level parameter, DF (in decibels), which shows how significantly the initial waveform is altered as a result of sample rate conversion. A lower DF level (blue/violet area) means less degradation of the signal.

Measurements for each resampler are presented as slides, each showing DF levels for seven technical signals (rectangular diffrograms) and about ten minutes of music (histogram of DF values). Waveform, magnitude and phase DF levels (DFwf, DFmg, DFph) are presented on separate DF-slides, which can be switched by clicking the top-right corner of a slide. See How to read DF-slides for details.

For comparison, DF-slides of two other kinds of signal processing are included: the high anchor - bit reduction from 32-bit to 24-bit without dithering (@44.1k) and the low anchor - bit reduction from 32-bit to 16-bit with triangular dithering [-1,+1] (@44.1k).
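As an illustration of the low anchor, the sketch below reduces a floating-point signal to 16 bits with triangular (TPDF) dither spanning [-1,+1] LSB. It is only a minimal sketch: the function name `reduce_bits_tpdf`, the assumption that samples are floats in [-1, 1), and the way the dither is generated (sum of two uniform random values) are illustrative choices, not a description of the tool used for the measurements.

```python
import numpy as np

def reduce_bits_tpdf(x, bits=16, rng=None):
    """Quantize a float signal in [-1, 1) to `bits` bits with TPDF dither.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    lsb = 2.0 ** -(bits - 1)  # quantization step for a [-1, 1) signal
    # The sum of two uniform values in [-0.5, 0.5) LSB has a triangular
    # distribution spanning [-1, +1) LSB.
    dither = (rng.uniform(-0.5, 0.5, x.shape) +
              rng.uniform(-0.5, 0.5, x.shape)) * lsb
    quantized = np.round((x + dither) / lsb) * lsb
    # Keep the result inside the representable range of the target format.
    return np.clip(quantized, -1.0, 1.0 - lsb)

# Example: a 1 kHz sine at half of full scale, sampled at 44.1 kHz.
t = np.arange(44100) / 44100.0
sine = 0.5 * np.sin(2 * np.pi * 1000 * t)
sine_16bit = reduce_bits_tpdf(sine, bits=16, rng=np.random.default_rng(0))
```

With this scheme the error of each sample stays within 1.5 LSB: up to 1 LSB from the dither plus up to 0.5 LSB from rounding.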

The most sensitive indicator of degradation is the phase spectrum of a real music signal. The phase spectrum degrades first because it carries the greater part of the information about any complex signal such as music. The histogram of DFph values (measured with the music mix Diversity) is therefore the most indicative.

While the median of the DF distribution characterizes the overall level of degradation, it does not account for the shape of the distribution. The 95th percentile is a more robust indicator with a clear meaning: most artifacts of a music signal (95%) are below (better than) this level. By default the DF-slides of resamplers are sorted by the 95th percentile of the DFph histogram, with the more accurate resamplers on top. Additionally, they can be sorted by the median or the 95th percentile of the waveform or magnitude histograms.

 

How to read DF-slides

DF measurements are presented in three DF-slides: the waveform slide (Wf), the magnitude slide (Mg) and the phase slide (Ph). Each shows the corresponding degradation measurements for the same set of technical and music signals. Below is the Wf slide showing DF measurements of the waveform degradation caused by bit reduction, which is usually applied as the final step of CD mastering.

DF-slide with DF measurements of bit reduction 32 -> 16 with triangular dithering

The histogram on a DF-slide is the result of DF measurements with the real-life audio mix Diversity, which consists of 58 short excerpts (10 s each, with fade in/out) from compositions of various genres. A DF value is measured for every 50 ms of the output signal. Thus 9.5 min of music result in 11363 DF values, which are presented in the form of a histogram. Each pixel of the histogram corresponds to a 50 ms time frame of the signal. In every histogram bin the DF values are sorted by the energy of the corresponding signal frames: lower-energy frames are at the bottom and have darker colors.
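The layout of such a histogram can be sketched as follows: DF values are grouped into bins, and inside each bin the frames are ordered by energy so that the lowest-energy frame ends up at the bottom of the column. The function name, the 1 dB bin width and the sample values are assumptions made for this illustration; the actual bin width and color mapping of DF-slides are not specified here.

```python
import numpy as np

def energy_sorted_histogram(df_db, frame_energy, bin_width_db=1.0):
    """Group frames into DF bins; order each bin by frame energy (lowest first).

    Returns a dict mapping bin index -> array of frame indices, where the
    first index in each array is the frame drawn at the bottom of the column.
    Hypothetical helper; the bin width is an assumed value.
    """
    df_db = np.asarray(df_db, dtype=float)
    frame_energy = np.asarray(frame_energy, dtype=float)
    bin_index = np.floor(df_db / bin_width_db).astype(int)
    columns = {}
    for b in np.unique(bin_index):
        members = np.flatnonzero(bin_index == b)
        order = np.argsort(frame_energy[members], kind="stable")
        columns[int(b)] = members[order]
    return columns

# Four made-up frames: DF value (dB) and energy of each 50 ms frame.
df_values = [-50.2, -50.7, -49.5, -50.9]
energies = [0.3, 0.1, 0.5, 0.2]
columns = energy_sorted_histogram(df_values, energies)
```

The height of each column is the number of frames in that bin, so every frame occupies exactly one cell of the picture.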

The histogram is the most informative indicator of signal degradation, and the phase histogram is the most sensitive one because phase is affected by degradation first. The median of DF values indicates the overall level of music signal degradation, but it does not account for the shape of the DF distribution. The 95th percentile is a more robust indicator with a simple and clear meaning: most signal artifacts (95%) are below (better than) this DF level.
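Both summary statistics are easy to compute from a list of per-frame DF values. The sketch below uses made-up DF values (not measurement data) to show two distributions that share the same median while their 95th percentiles differ; the function name `summarize_df` is hypothetical.

```python
import numpy as np

def summarize_df(df_values):
    """Return (median, 95th percentile) of per-frame DF values in dB."""
    df = np.asarray(df_values, dtype=float)
    return float(np.median(df)), float(np.percentile(df, 95))

# Made-up example: every frame at -60 dB ...
tight = np.full(100, -60.0)
# ... versus 60 frames at -60 dB and a tail of 40 frames at -30 dB.
tailed = np.concatenate([np.full(60, -60.0), np.full(40, -30.0)])

print(summarize_df(tight))
print(summarize_df(tailed))
```

In this example the median cannot tell the two cases apart, while the 95th percentile moves with the tail of worse (higher) DF values.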

The rectangles on a DF-slide are diffrograms showing the degradation of technical signals over time (horizontal dimension) and frequency (vertical dimension). All technical signals are 24 s long and result in 60 DF values each [60 = 24 s / 400 ms]. The median of these values - a robust estimator of signal degradation - is indicated under the corresponding diffrogram. All diffrograms are pixel-accurate: each pixel represents a precisely identifiable portion of the signal, and the color of the pixel encodes the level of degradation and the energy of that signal portion. Zoom in (nearest-neighbor interpolation is recommended)! Every image on a DF-slide visualizes thousands of measurements.
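The frame counts quoted above follow from simple arithmetic, which the sketch below reproduces by cutting a signal into non-overlapping frames. The 44.1 kHz sample rate and the helper name `split_frames` are assumptions for the example; the output rate of a given resampler may differ.

```python
import numpy as np

def split_frames(signal, sample_rate, frame_ms):
    """Cut a 1-D signal into non-overlapping frames of `frame_ms` milliseconds.

    Samples that do not fill a complete final frame are dropped.
    Hypothetical helper for illustration.
    """
    signal = np.asarray(signal)
    frame_len = sample_rate * frame_ms // 1000
    n_frames = len(signal) // frame_len
    return signal[: n_frames * frame_len].reshape(n_frames, frame_len)

# A 24 s technical signal at an assumed 44.1 kHz, in 400 ms frames.
technical = np.zeros(24 * 44100)
frames = split_frames(technical, 44100, 400)
print(frames.shape[0])  # number of frames, one DF value per frame

# Music histogram: 11363 frames of 50 ms each, expressed in minutes.
print(11363 * 50 / 1000 / 60)
```

The first count is 24 s / 400 ms = 60 frames; the second works out to 568.15 s, or roughly 9.5 min of music.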

The diffrograms show the level of degradation with the following technical signals:

Sine 1 kHz. For a sine input signal, DFwf = (THD+Noise) - 3 dB. This relation is a bridge between traditional audio measurements and the new DF-metric.

Sine 12.5 kHz. A sine signal of higher frequency. The DF of a properly designed audio circuit with this signal is close to its DF with Sine 1 kHz.

DFD. A mix of two sine waves - 12460 Hz and 12540 Hz. This is the standard signal for measuring intermodulation distortion (IMD). When IMD is low, the DF with this signal is close to the DF with Sine 12.5 kHz.

Square 1 kHz. A square wave for measuring slew rate and phase inconsistency within 20Hz-20kHz.

PSN IEC 60268-1. Pink noise, filtered and dynamically compressed. A standard signal that simulates a real audio program. Its DF is usually close to the median of the histogram for the music signal. From a listener's perspective this is the most meaningful technical signal in the DF-metric.

White Noise. This is the toughest test for any audio circuit/processing. It reveals all possible types of degradation but in unclear proportion [additional research needed].

PSN IEC 60268-1, 1 bit, -101.1 dBFS. The Program Simulation Noise additionally scaled down to 1 bit. It simulates a real audio program (16 bit) at the lowest possible level. It is the equivalent of the SNR parameter in traditional audio measurements.
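The relation given for Sine 1 kHz, DFwf = (THD+Noise) - 3 dB, can be checked numerically under one assumption that is not stated in this text: that the difference level is computed as 10·log10(1 - r), where r is the correlation coefficient between the input and output waveforms. With a small residual of relative RMS level e, r is approximately 1 - e²/2, so 10·log10(1 - r) is approximately 20·log10(e) - 3 dB. The sketch below is a hedged illustration with hypothetical function names and a made-up distortion, not the measurement code behind the slides.

```python
import numpy as np

def difference_level_db(reference, output):
    """Assumed definition for illustration: DF = 10*log10(1 - r)."""
    x = reference - np.mean(reference)
    y = output - np.mean(output)
    r = np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y))
    return 10.0 * np.log10(1.0 - r)

def thd_plus_noise_db(reference, output):
    """Power of everything except the fundamental, relative to the fundamental."""
    residual = output - reference
    return 10.0 * np.log10(np.mean(residual ** 2) / np.mean(reference ** 2))

# 1 s of a 1 kHz sine at an assumed 48 kHz sample rate.
fs = 48_000
t = np.arange(fs) / fs
sine = np.sin(2 * np.pi * 1000 * t)

# Made-up degradation: a weak third harmonic plus a little white noise.
rng = np.random.default_rng(0)
degraded = (sine
            + 1e-3 * np.sin(2 * np.pi * 3000 * t)
            + 1e-4 * rng.standard_normal(fs))

df = difference_level_db(sine, degraded)
thdn = thd_plus_noise_db(sine, degraded)
print(df - thdn)  # close to -3 dB when the residual is small
```

The approximation holds only while the residual is small compared with the signal and uncorrelated with it; a gain or delay mismatch between input and output would have to be removed first.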

The Creative Commons license [BY-ND] allows copying and redistributing unmodified DF-slides for any purpose, even commercially.

 

Creative Commons License 2001-2025 SoundExpert