
Audio quality of SBC XQ Bluetooth audio codec


SBC XQ is not just a new Bluetooth (BT) audio codec, it is a lifehack.

The standard BT audio codec, SBC, is mandatory in all BT stereo audio devices [1][2]. It can work at arbitrarily high bitrates, but the BT documents recommend 328 kbit/s (44.1/16) for the high quality mode. According to SE ratings, this mode provides only acceptable audio quality. To increase the quality of audio transmission over BT:

  • CSR plc, a multinational semiconductor company, acquired APT Licensing Ltd. and introduced the aptX audio codec, then aptX HD; Qualcomm then bought all of these and additionally introduced the aptX Low Latency and aptX Adaptive codecs
  • Sony introduced its proprietary Hi-Res LDAC
  • Samsung introduced three new BT audio codecs: HD/Scalable/UHQ-BT
  • Huawei introduced HWA LHDC audio codec
  • Apple uses AAC in its products
  • a guy under the nick ValdikSS wrote the patch for Android and another guy - Pali Rohár - for Linux; these patches unlock higher bitrates of SBC encoder.

It turned out that almost all modern BT headphones, speakers, receivers ... support SBC bitrates up to 730 kbit/s out of the box. The patch (SBC XQ) makes it possible to encode BT audio on Android smartphones at the following bitrates:

  • BT EDR 2 - 452.0 kbit/s for 44.1/16, 492.0 kbit/s for 48/16
  • BT EDR 3 - 551.2 kbit/s for 44.1/16, 600.0 kbit/s for 48/16

The choice of bitrates looks like a reasonable compromise between audio quality, compatibility with current BT devices and stability of BT connection [3][4].

The SBC XQ codec (at bitrates for 44.1/16) was added to our live listening tests:

SBC XQ CBR@452.0 (Bluetooth) - Subband codec for Bluetooth A2DP profile, CBR, 452.0 kbit/s FBR
CODER: SBC Encoder LIB Version 1.5 (Philips)
- usage: sbc_encoder -v -p -r453000 -oout.sbc ref.wav
- SBC XQ default setting for BT EDR2
- 44100 Hz, Dual Channel
- bitpool: 38, bands: 8, blk_len: 16, allocation method: Loudness
DECODER: SBC Decoder LIB Version 1.5 (Philips)

SBC XQ CBR@551.2 (Bluetooth) - Subband codec for Bluetooth A2DP profile, CBR, 551.2 kbit/s FBR
CODER: SBC Encoder LIB Version 1.5 (Philips)
- usage: sbc_encoder -v -p -r552000 -oout.sbc ref.wav
- SBC XQ default setting for BT EDR3
- 44100 Hz, Dual Channel
- bitpool: 47, bands: 8, blk_len: 16, allocation method: Loudness
DECODER: SBC Decoder LIB Version 1.5 (Philips)
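
The two SBC XQ bitrates can be cross-checked against the frame length formula from Section 12.9 of the A2DP specification [1]. The Python sketch below is only an illustration: the function name is hypothetical, and applying the same bitpool values (38 and 47) at 48 kHz is an assumption, although it does reproduce the 492.0 and 600.0 kbit/s figures listed for 48/16.

```python
import math

def sbc_dual_channel_bitrate(bitpool, fs, subbands=8, blocks=16, channels=2):
    """Bitrate in bit/s for DUAL_CHANNEL SBC, following the A2DP frame length formula."""
    header = 4                                           # SBC frame header, bytes
    scale_factors = (4 * subbands * channels) // 8       # 4 bits per subband per channel
    audio = math.ceil(blocks * channels * bitpool / 8)   # quantized subband samples, bytes
    frame_length = header + scale_factors + audio
    # One frame covers subbands * blocks samples per channel.
    return 8 * frame_length * fs / (subbands * blocks)

# Bitpools taken from the encoder settings listed above (38 for EDR 2, 47 for EDR 3).
for label, bitpool in (("BT EDR 2", 38), ("BT EDR 3", 47)):
    for fs in (44100, 48000):
        kbps = sbc_dual_channel_bitrate(bitpool, fs) / 1000
        print(f"{label}: bitpool={bitpool}, fs={fs} Hz -> {kbps:.1f} kbit/s")
```

With 8 subbands and 16 blocks, bitpool 38 gives a 164-byte frame and bitpool 47 a 200-byte frame, which is where the 452.0 and 551.2 kbit/s values at 44.1 kHz come from.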

Early ratings (inaccurate so far) are available on the page Encoders 320+ kbit/s. For more accurate results, please participate in SE listening tests; there is a short instruction on that page below the ratings.

 

Df-measurements of SBC XQ with music signal

If the text below seems a bit confusing, please refer to the previous article - Audio quality of Bluetooth aptX and aptX HD

Once again, assessment of perceived sound quality by means of any objective measurements should be made with great care. The df-metric has a criterion that defines whether df-measurements can be used for this purpose: the similarity of the sound signatures (artifact signatures) of the devices under test. If their artifact signatures are close enough, then their df-measurements correlate well with the scores of perceived sound quality.

For a broader perspective we will compare the artifact signatures of SBC XQ@452.0 and SBC XQ@551.2 with SBC@201 (Low Quality), SBC@229 (Middle Quality), SBC@328 (High Quality), aptX@352, aptX HD@529, ADPCM IMA@354, AAC@256. As in previous articles we will use the Pink Floyd album “The Dark Side of The Moon” as the test signal.

Figure 1. The dendrogram shows how different BT audio codecs relate to each other according to their artifact signatures. The shorter the link between two codecs, the more similar their artifact signatures. The Spearman distance 0.1 is critical for relation of Df measurements to subjective scores.

 

We can see that SBC and aptX codecs have similar artifact signatures (distance<0.1), so their df-measurements are indicative of their perceived sound quality.

Figure 2. Histograms of df-sequences for the codecs under test. Medians and 25/75 percentiles are indicated. Median Df is an estimator of average waveform degradation. Shape of histogram relates to character/type of waveform degradation - artifact signature.

 

Looking at the Df medians we can safely conclude that the audio quality of SBC XQ is comparable to aptX HD. For BT EDR3 devices SBC XQ slightly surpasses aptX HD. It will be impossible to tell them apart in a blind listening test. The SBC codec uses a primitive psychoacoustic model for encoding and aptX does not use one at all, so their perceived audio quality is determined mostly by bitrate. Different settings of SBC, including SBC XQ, can be compared to aptX and aptX HD aurally with the help of the Bluetooth A2DP SBC/aptX online encoder [5].

All current BT stereo devices could use this higher quality encoding. It suffices to modify the BT stack of the sending device. Receiving BT devices that support only the mandatory SBC codec will benefit most from this trick.

At the moment the required patch is included in the LineageOS, Resurrection Remix and crDroid forks of Android. The patch for Linux PulseAudio from Pali Rohár adds, besides SBC XQ, support for the aptX, aptX HD and FastStream codecs [6]. This extra quality comes for free. It's hard to imagine any objection to including this option in all BT stacks and the main Android branch. You can request this feature in your operating system:

Future versions of SBC XQ will include more bitrate choices. So, manufacturers of BT headphones, speakers and receivers can further improve audio quality of their BT products simply by removing all limits to SBC decoding parameters, thus providing full support for SBC XQ.

References

[1] Bluetooth SIG, Specification of the Bluetooth System, Profiles, Advanced Audio Distribution Profile, v1.3.2, 2019-01-21, https://www.bluetooth.org/docman/handlers/downloaddoc.ashx?doc_id=457083

[2] Christian Hoene, Mansoor Hyder, “Considering Bluetooth’s Subband Codec (SBC) for Wideband Speech and Audio on the Internet”, Technical Report WSI-2009-3, 2009-10, https://pdfs.semanticscholar.org/1f19/561d03bc88b67728375566c95bbf77e730d5.pdf

[3] ValdikSS, “Bluetooth stack modifications to improve audio quality on headphones without AAC, aptX, or LDAC codecs”, 2019-06-18, https://habr.com/en/post/456476/

[4] ValdikSS, “Improve Bluetooth audio quality on headphones without aptX or LDAC”, since 2018-08-22, https://forum.xda-developers.com/android/software-hacking/improve-bluetooth-audio-quality-t3832615

[5] Bluetooth A2DP SBC/aptX online encoder, https://btcodecs.valdikss.org.ru/sbc-encoder/

[6] Pali Rohár, New API for Bluetooth A2DP codecs, since 2019-01-12, https://patchwork.freedesktop.org/series/55117/

 

Update


September 2019

Another patch for the PulseAudio BT stack is now available from JP Guillemin - https://github.com/JPGuillemin/pulseaudio/tree/SBC-XQ. This patch activates SBC XQ mode in Linux systems in the most simple and effective way - it allows transmitting audio using the standard SBC codec at the highest possible quality supported by the receiving BT device.

 

Comments
Ken Laberteaux
Gabriel Bouvigne wrote: “If they can sustain about twice stereo/JS bitrate in dual channel, in theory they should also be able to sustain this higher bitrate in stereo/JS, which would be a vast quality improvement....”

Yes it would, and I think I have another (possibly better) way to do it. Once you see it, it seems obvious. Change the number of subbands from 8 to 4.

From [Section 12.9 of https://www.bluetooth.org/docman/handlers/downloaddoc.ashx?doc_id=457083] (what Serge posted before)

bit_rate = 8 * frame_length* fs / nrof_subbands / nrof_blocks,

where frame_length (in Bytes) for:

MONO and DUAL_CHANNEL:
frame_length = 4 + (4 * nrof_subbands * nrof_channels ) / 8 +
round_up[ nrof_blocks * nrof_channels * bitpool / 8 ].

And STEREO and JOINT_STEREO:
frame_length = 4 + (4 * nrof_subbands * nrof_channels ) / 8
+ round_up[(join * nrof_subbands + nrof_blocks * bitpool ) / 8].

round_up:=round x to the next integer in the direction of plus infinity.

Let’s consider the composition of these SBC frames. Just to give some perspective, we can plug in these common values: nrof_blocks=16, bitpool=37, nrof_subbands=8, nrof_channels=2 into the above frame_length equations. We can observe that the initial 4 in the equation is the number of bytes in the header of the SBC frame. The next term (4 * nrof_subbands * nrof_channels ) / 8 (which equals 8 using the values above for 2 channels, 4 for 1 channel) is the size of the scale factors in the SBC frame. The final term, the one rounded up, which represents the quantized, bandpass audio data, is 74 (for mono), or 148 (dual_channel), or 74 (for stereo), or 75 (for joint_stereo). Clearly this last term has the largest impact on the size of the frame_length.

One frame is generated for every nrof_subbands * nrof_blocks audio samples for mono signals, or for every nrof_subbands * nrof_blocks pairs of samples for stereo signals. If the original stream is sampled at 44.1 kHz, then each frame (128 samples with the values above) represents just under 3 msec of audio data.

This choice of codec parameter values would produce a joint-stereo bitrate of 8*(4+8+75)*44100/(8*16), which is approximately 240 kbits/sec. This is just slightly above the 229 kbits/sec of the middle-quality joint-stereo as proposed in the SBC specification (where a bitpool of 35 is chosen instead of 37).
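
The frame composition described above can be reproduced with a short Python sketch. It is only an illustration of the quoted frame_length and bit_rate formulas; the function names and the mode strings are hypothetical and not part of any SBC library.

```python
import math

def sbc_frame_parts(mode, bitpool, subbands=8, blocks=16):
    """Return (header, scale_factors, audio) sizes in bytes for one SBC frame."""
    channels = 1 if mode == "mono" else 2
    header = 4
    scale_factors = (4 * subbands * channels) // 8
    if mode in ("mono", "dual_channel"):
        audio = math.ceil(blocks * channels * bitpool / 8)
    else:  # "stereo" or "joint_stereo"
        join = 1 if mode == "joint_stereo" else 0
        audio = math.ceil((join * subbands + blocks * bitpool) / 8)
    return header, scale_factors, audio

def sbc_bitrate(frame_length, fs, subbands=8, blocks=16):
    """Bitrate in bit/s: one frame is sent per subbands * blocks samples."""
    return 8 * frame_length * fs / (subbands * blocks)

fs = 44100
for mode in ("mono", "dual_channel", "stereo", "joint_stereo"):
    parts = sbc_frame_parts(mode, bitpool=37)
    kbps = sbc_bitrate(sum(parts), fs) / 1000
    print(f"{mode}: header/scale/audio = {parts}, {kbps:.1f} kbit/s")

# Duration of audio covered by one frame, in milliseconds.
print(f"frame duration: {1000 * 8 * 16 / fs:.2f} ms")
```

For joint stereo this gives an 87-byte frame at bitpool 37 and an 83-byte frame at bitpool 35, matching the roughly 240 and 229 kbit/s figures in the text.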

So, we know from listening tests, many feel that medium quality SBC is “not very transparent”, i.e. sucks. So we want to improve upon this by increasing the bit rate, while sticking with the SBC codec.

Yes, you could double the bitrate of SBC by forcing a change from joint stereo to dual channel, which is what ValdikSS has proposed. This has the effect of doubling the bitrate by sending SBC frames that are almost twice as large at the same timing as before. But, as I have shared with him by private email, that requires the extra bits to be used in separate channels. However, for most music, L and R channels are highly correlated, which is why Joint Stereo, as compared to Stereo, is often able to maintain a similar amount of transparency with a lower bit rate. So unless your channels are very non-correlated, enforcing the extra bits to be used in separate channels is not optimum (as observed by Gabriel).

You could also simply double the bitpool, and this (I am guessing) was ValdikSS’ original idea (just remove the arbitrary cap on bitpool at SRC and increase bitrate). However, he found that multiple SNKs were either buggy or had other issues at higher bit rates, even if they advertised higher bit rates. This is what ValdikSS was discussing above.

So, here is my proposal: Use 4 subbands instead of 8. Subband=4 is a mandatory requirement for all SBC SNKs. (Whether it is advertised, but not working well due to poor testing, is to be seen (just like ValdikSS found with higher bitpools)). This has the effect of sending an SBC frame of roughly the same frame_length as before, but twice as frequently, i.e. the SBC frame would characterize 64 samples (either mono or stereo pairs) of audio, as opposed to 128 samples when subband=8.

As with ValdikSS’ proposal, my proposal will roughly double the bitrate, but extra bits are not required to be used in separate channels. SBC frames will capture a smaller slice of audio time, so this should intuitively help with time-resolution of transients. While each bandpass-filtered, subband time sequence (of nrof_blocks length) will contain frequencies from a bandwidth of twice the frequency as when subbands=8, that should not matter, as the concatenation of the analysis and synthesis filter should be very close to perfect restoration, only limited by the quantization errors in each subband. And note that the bitpool is now shared by half as many subbands as when subbands=8, thereby reducing those quantization errors.
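
To put numbers on "roughly double", the joint-stereo formula can be evaluated for 8 and 4 subbands at the same bitpool. This is a minimal Python sketch under the values used in this comment (bitpool=37, nrof_blocks=16, 44.1 kHz); the function name is hypothetical.

```python
import math

def joint_stereo_frame(bitpool, subbands, blocks=16, fs=44100):
    """Return (frame_length in bytes, samples per frame, bitrate in bit/s) for JOINT_STEREO."""
    header = 4
    scale_factors = (4 * subbands * 2) // 8               # two channels
    audio = math.ceil((subbands + blocks * bitpool) / 8)  # join = 1
    frame_length = header + scale_factors + audio
    samples_per_frame = subbands * blocks
    bitrate = 8 * frame_length * fs / samples_per_frame
    return frame_length, samples_per_frame, bitrate

for subbands in (8, 4):
    length, samples, bitrate = joint_stereo_frame(37, subbands)
    print(f"subbands={subbands}: {length} bytes per {samples} samples, "
          f"{bitrate / 1000:.1f} kbit/s")
```

Under these assumptions the frame shrinks only from 87 to 83 bytes while covering 64 instead of 128 samples, so the bitrate rises by a factor of about 1.9.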

A large number of subbands can be useful if you are trying to identify and remove psychoacoustically masked sounds, as occurs in many other subband codecs, e.g. mp3, but no such efforts are explicitly attempted in SBC.

So, I propose we try to configure SBC with joint_stereo and subbands=4, while keeping the blocksize=16 and bitpool=37. I choose 37, as it is one of the lowest maximum bitpool values found in ValdikSS’ codec compatibility chart at

https://btcodecs.valdikss.org.ru/codec-compatibility/

(note: bitpool=35 might be safer; would need testing)

FWIW, ValdikSS made some subband=4 SBC samples for me at bitpool=37. Certainly not blind testing, but it sounded…. really good. But we need a lot better tests before claiming anything.

Please give me feedback on this idea. Obviously a larger choice for bitpool (above 37 or 35) would likely be even more transparent, assuming that it does not swamp the bluetooth channel. If it seems sound, perhaps we could get Serge to add this config to the SE testing?

P.s. It seems that another idea worth exploring would be to keep subbands=8 but using nrof_blocks=8 (instead of 16). I would need to think it through, to better understand how the extra bits would be used.
Posted on 1/9/20 4:46 AM in reply to Gabriel Bouvigne.
Gabriel Bouvigne
That seems like a good idea.

(Reminder: we have a bitrate calculator: https://btcodecs.valdikss.org.ru/sbc-bitrate-calculator/ )

In theory, having only 4 subbands would really impair an encoder using psychoacoustic computations to handle the subbands. In reality, considering that the default/example SBC encoder is only considering the subband as allocation buckets/packets, without real psychoacoustic computations, I don't think it should hinder anything.

Having this increased bitrate in a joint stereo configuration is indeed very likely to be of better quality compared to the dual channel configuration. (rule of thumb: consider that transmitting two channels in joint stereo is requiring about 1.6x the bitrate of a mono channel, while dual channel requires 2x)

Regarding transient handling, as SBC is a subband codec (i.e. it quantizes samples in the time domain), it is quite immune to smearing compared to a transform codec (which quantizes samples in the frequency domain): butchering samples in the frequency domain spreads the error when converting back to the time domain, while butchering samples in the time domain does not spread it. Transients survive large quantization errors when the quantization is done in the time domain.
Posted on 1/9/20 6:20 AM in reply to Ken Laberteaux.
Ken Laberteaux
Yes, I know of the sbc bitrate calculator. I used it to check my work. I included it in my post to make clear what you said, that the number of subbands would only impact over which time period (of the original audio) the bitpool would be allocated.

Your comments about transients handling in the time-domain all make sense.
Posted on 1/16/20 5:05 PM in reply to Gabriel Bouvigne.
Ken Laberteaux
Hi All, I have a much more detailed article exploring SBC, SBC XQ, and the idea I mentioned earlier. You can read and comment on the article here:

https://docs.google.com/document/d/1YPTwFhvo99suyrwkT4Z97so9_xwoOGkpSqgveLpYvYc/edit?usp=sharing
Posted on 2/7/20 4:04 AM.
Audio Transparency Initiative