
Audio Transparency Initiative

This article needs proofreading and style improvement. If you would like to make it better, please contact me; I will credit you here for your contribution. Alternatively, you can pay for a monthly Grammarly subscription ($30) for me, so that I can edit it myself in collaboration with AI. Thanks in advance.


The Audio Transparency Initiative aims to shift listeners' attention from gadgets and technology to music and its authors. To this end, it proposes clear guidelines for designing and manufacturing an audio path that is transparent to human hearing. The accompanying study of audio quality in the portable player market is aimed at (1) defining the required level of transparency and (2) showing how far different portable players are from this level. Exceptional audio quality at low prices can be achieved through the united efforts of music lovers.


Music is the universal language of mankind.
Henry Longfellow

Music is everybody's possession.
It's only publishers who think that people own it.
John Lennon


Today it is not yet possible to say with certainty how and why our species, Homo sapiens, began to sing (hunting, rituals/dances, lullabies, proto-language ...). Our closest relatives in the animal world, chimpanzees, show no such musical behavior at all (their larynx is poorly developed), whereas in birds, for example, it is very common. At the same time, every human community known to us has a musical culture in one form or another, and some researchers believe that the Neanderthals were even more musical than we are. If we add that the oldest musical instrument found so far (a bone flute with holes) is about 37,000 years old, we can confidently say that our musical culture has its roots in the very early stages of our emergence as a species, and that music itself is one of the oldest forms of our social interaction.

With the development of mankind and the emergence (and extinction) of new centers of civilization, our musical practices also developed. Despite local successes (ancient Egypt, India, Greece, ...), our musical culture remained quite simple and natural up to the Renaissance, and most people could easily be both listeners and performers. The rise of Western civilization at the end of the Late Middle Ages gave a powerful impetus to the development of science, technology and culture as a whole. Music, the most important and deepest component of any culture, naturally blossomed rapidly as well. There appeared advanced musical notation, professional composers and performers, new musical genres and instruments, a music economy and entrepreneurs, sound recording and the distribution of recorded sound to listeners. There also appeared a deeper understanding of the structure of music, of its rules (only for all of them to be broken later, without exception) and of the peculiarities of human perception of music and sound (psychoacoustics). Without claiming that this list is complete, I will dwell in more detail on two of its points, two aspects of the evolution of musical life.


Musical instruments. Since about the 15th century, the variety and complexity of musical instruments have increased many times over. The separation of musicians from listeners began then and reached its peak at the end of the 20th century. However, with the advent of electronic musical instruments, sampling techniques and personal computers, this separation started to slow down and even reverse. It became possible to create music literally at home, without a serious musical education. The rapid development of AI technology in recent years has led to the emergence of so-called generative music [Mubert, Jukebox, ...], which in its technical and, most importantly, artistic characteristics is already not so different from human music (we already hear it without knowing its origin). The mass availability of such technologies will inevitably lead to explosive growth in the number of people making music and to the final blurring of the boundary between musicians and listeners. Each person will have the opportunity to compose new music by training a neural network on suitable musical material, including her own music (playing instruments or singing). We will not discuss the quality of such music here; we just note that its quantity will be huge and that there will surely be some outstanding examples among it.


Music economy. Somewhere at the turn of the 18th century, music started to move out of churches and the mansions of noble patrons and began to live a life of its own. The first, initially discreet, music intermediaries/managers appeared, helping musicians to organize performances on the one hand, and helping listeners to learn about these performances and attend them on the other. The commercialization of musical life had begun. This process also reached its peak at the end of the 20th century, when most of the music business found itself in the hands of four record companies (EMI, Sony Music Entertainment, Universal Music Group, and Warner Music Group) that set the rules of the music market for both musicians and listeners. Over the course of two centuries, the humble helpers turned into the owners of the world's music business, controlling almost every interaction between musicians and listeners and taking the bulk of all profits. To be fair, despite this, to put it mildly, unhealthy commercial situation, the recording industry of the 20th century produced a huge number of real musical masterpieces that will be in demand for several generations of listeners. And again, technological progress in the late 20th century changed the rules of the game. The Internet, compact audio file formats (mp3, FLAC, ...) and a decentralized file sharing protocol (torrent) formed an open technological environment for direct communication between musicians and listeners. For the last two decades we have been actively searching for new forms of such interaction: personal websites/blogs of musicians, services for the promotion/sale of their own music (MySpace, Soundcloud, TuneCore, ...), free distribution of music by its authors with "pay what you want" payment, direct financial support of musicians by listeners, and listeners' participation in their releases.
Despite the serious complications of this process, there is no doubt that it will succeed: musicians and listeners need each other, they have all the network technologies required for self-organization, and the differences between them, as mentioned above, are becoming increasingly blurred and conditional. Conversely, closed/proprietary mass delivery channels for music, on any medium, have no future. Meanwhile, new types of intermediary services will be in demand, as navigating the new boundless ocean of music becomes more and more complicated for listeners, and musicians require increasingly sophisticated technical support. It is worth adding that creating great works and hits will remain very hard, perhaps even harder.

Thus, having made a dizzying turn together with Western civilization, musicians and listeners are uniting again. Whether to listen to music or to write it will soon be a question of mood. However, another barrier remains between musicians and listeners.


Playback obsession. With the advent of sound recording in the late 19th century, and as it improved and grew more complex, musicians came to require the assistance of sound engineers. The quality of the final product, the master recording, now depended heavily on their knowledge, experience and musical taste. Listeners, in turn, needed technically sophisticated devices for sound reproduction. Although recording technology and equipment are much more complex than their playback counterparts, and the skill of the engineers has a far greater impact on the final recording, the quality of the listening experience has been associated mostly with playback devices from the very beginning. In Hi-Fi/Hi-End magazines, forums and websites, characteristics of recorded sound such as detail, depth, fidelity and transparency are invariably discussed only in the context of the quality of the playback audio path. It is difficult to say why such an unnatural discourse emerged. Perhaps it can be explained by the fact that the first playback devices were indeed of poor quality, and their improvement was seen as the main route to a better listening experience. Perhaps the economic factor also played a role: the nascent market for playback equipment looked promising, so the issue of sound quality was debated mostly within the marketing of these new consumer devices. Whatever the reasons, this crafty approach remains dominant to this day and, it should be noted, is fully in the interests of manufacturers of consumer audio equipment.


Flawed metric. This confusing tradition was greatly helped by the fact that the listener has the last word in determining the sound quality of an audio device, while there is no reliable method of measuring it with an instrument: the traditional set of objective parameters in the audio industry (THD, noise, frequency response, etc.) does not correlate well enough with subjectively perceived sound quality. This was well known to audio engineers from the very beginning. With the advent of digital audio technology, this weakness of traditional audio metrics (TAM) became even more evident. Today only an experienced technician can benefit from the results of traditional measurements; the average user cannot. While a manufacturer of audio equipment necessarily has such a specialist on staff and, no doubt, knows the actual quality of its products very well, the listener has to rely only on her impressions from listening, which are strongly influenced by many extraneous factors: the appearance of the device, its price and brand, the seller's advice, etc. In short, the manufacturer knows everything about the product, while the consumer knows little more than nothing. Such information asymmetry in the market creates an incentive for a manufacturer to pass off low-quality goods as higher-quality ones [Information Asymmetry, G. Akerlof, ...]. This is exactly what happened in the consumer audio market and, to a lesser extent, in the professional market. Taking advantage of the asymmetry, manufacturers have gradually come to focus on marketing rather than on research and development, raising prices and increasing profits. Thousand-dollar audio cables and fabulously expensive High-End products are the most egregious manifestations of this phenomenon.

The exaggerated importance of modern playback equipment is the last barrier between musicians and listeners. This barrier can be safely removed with the help of a new approach to designing an audio path and assessing its quality. An integral part of this approach is a new audio metric, the df-metric, which was developed to overcome the shortcomings of TAM [Audio metric that makes a difference].


Basic idea of df-metric. In its most simplified form, the basic idea of the new audio metric can be explained as follows. Since sound is entirely determined by the shape of the sound wave, the task of preserving sound quality is the task of preserving the waveform of a signal along the audio path. The level of distortion/degradation of the waveform is measured with a special parameter, the Difference Level (Df), expressed in dB [Difference Level. An objective audio parameter]. The main beauty of this parameter is that it can measure the degradation of any signal, including real music and speech. When real music or speech is used, the parameter correlates well with the results of listening tests.
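The waveform-preservation idea can be illustrated with a minimal Python sketch. It is not the Df algorithm from the cited article: the function name `residual_level_db` is hypothetical, and the sketch assumes that the reference and output signals are already time-aligned, equal in length and sampled at the same rate. It matches the gain of the reference to the output by least squares and reports the energy of what remains, in dB relative to the gain-matched reference.

```python
import math


def residual_level_db(reference, output):
    """Residual energy after gain matching, in dB relative to the reference.

    Simplified stand-in for a waveform difference measure (an assumption of
    this sketch, not the Df definition from the cited article). Both inputs
    are equal-length, time-aligned sequences of samples.
    """
    if len(reference) != len(output) or len(reference) == 0:
        raise ValueError("signals must be non-empty and of equal length")
    ref_energy = sum(x * x for x in reference)
    if ref_energy == 0.0:
        raise ValueError("reference signal is silent")
    # Least-squares gain: a pure volume change is not counted as degradation.
    gain = sum(x * y for x, y in zip(reference, output)) / ref_energy
    if gain == 0.0:
        raise ValueError("output is uncorrelated with the reference")
    residual_energy = sum((y - gain * x) ** 2 for x, y in zip(reference, output))
    if residual_energy == 0.0:
        return -math.inf  # waveform preserved exactly
    return 10.0 * math.log10(residual_energy / (gain * gain * ref_energy))


# Hypothetical example: a tone passed through a "device" that halves the
# volume and adds a third harmonic 60 dB below the tone.
n = 4800
tone = [math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
harmonic = [math.sin(2 * math.pi * 30 * i / n) for i in range(n)]
device_output = [0.5 * (t + 0.001 * h) for t, h in zip(tone, harmonic)]

level = residual_level_db(tone, device_output)
print(f"residual level: {level:.1f} dB")
```

In this example the volume change is ignored and only the added harmonic counts, so the reported level is close to -60 dB. The same function accepts any pair of aligned sample sequences, including fragments of real music, which is the property the text emphasizes.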


Why TAM works poorly. Experiments in which real audio devices were tested with signals of different waveforms made it possible to identify the main cause of the low efficiency of TAM. The figure below shows the results of testing eleven players with signals of various waveforms: music, noise, rectangular, triangular and sinusoidal [source: Audio Quality of High-End Portable Players]. The devices in the figure are sorted by the Df level of the real music signal.


Figure: Test of portable players with signals of different waveforms: real music signal [Test set of music material "Variety"], noise signal simulating a real music signal (BS EN 50332-1), and rectangular, triangular and sinusoidal signals. The correlation of the Df levels of each signal with the Df levels of the music signal is indicated.


We can see that the noise signal correlates best with the music signal (r=0.99), while the correlation of the sinusoidal signal is the lowest (r=0.69). It should be noted that for a sinusoidal signal the value of the Df parameter is equivalent to THD+Noise of TAM (the two values differ by 3 dB). Clearly, signals of different waveforms do not degrade in a consistent manner: each time we measure the quality of a group of devices with a new signal, we get a new ranking of their quality. Thus TAM's initial assumption, that smaller distortion/degradation of one signal (most often sinusoidal) implies smaller distortion/degradation of other signals, turns out to be incorrect. A device that reproduces a sinusoidal (or any other) signal more accurately than other devices may reproduce a signal of a different waveform less accurately than those same devices.
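To make the ranking argument concrete, here is a small Python sketch. The dB values are invented for four hypothetical devices and are not measurements from the study; the helper names `pearson` and `ranking` are likewise only illustrative. The sketch computes the Pearson correlation between levels obtained with two test signals and shows how the device ranking can change from one signal to the other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with 2+ items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


def ranking(levels):
    """Device names ordered from the lowest (best) level to the highest."""
    return sorted(levels, key=levels.get)


# Invented levels in dB for four hypothetical devices (not measured data).
music_db = {"A": -62.0, "B": -55.0, "C": -48.0, "D": -41.0}
sine_db = {"A": -80.0, "B": -95.0, "C": -70.0, "D": -88.0}

devices = sorted(music_db)
r = pearson([music_db[d] for d in devices], [sine_db[d] for d in devices])
print(f"r = {r:.2f}")
print("ranking by music:", ranking(music_db))
print("ranking by sine: ", ranking(sine_db))
```

With these invented numbers the device that looks best on the sinusoidal signal is not the one that looks best on music, which is the kind of disagreement the figure reports for real players.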

If measurement results depend on the type of test signal used, the question is which test signal should be preferred for measurement. I think the answer is obvious: the real music/speech signal (or its noise equivalent), since this is the signal that sound equipment is designed to work with.

Recent attempts to expand the list of test signals used within TAM can increase its efficiency only slightly, because of the low correlation between different test signals. Today TAM resembles Ptolemy's model with the Earth at the center, into which an ever increasing number of new epicycles had to be introduced in order to predict the positions of the planets more precisely. Eventually it was replaced by the heliocentric model, which proved to be both simpler and more accurate. An audio metric will benefit in the same way if it stops using technical signals to predict the sound quality of audio devices. There should be a real sound signal at the center.


Features and benefits of df-metric. In the df-metric, technical signals are used only for testing specific aspects of device performance and are of interest mainly to device developers. The sound quality of a device is measured with a large array of varied sound material (music/speech) using statistical methods.
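As a rough illustration of what a statistical summary over varied material might look like, the Python sketch below condenses per-track levels for one device into a few numbers. The article does not say which statistics are used, so the choice of median, best and worst values is an assumption, the function name `summarize_levels` is hypothetical, and the dB values are invented.

```python
import statistics


def summarize_levels(levels_db):
    """Summarize per-track levels (dB); lower values mean less degradation."""
    if not levels_db:
        raise ValueError("at least one measurement is required")
    return {
        "tracks": len(levels_db),
        "median_db": statistics.median(levels_db),
        "best_db": min(levels_db),
        "worst_db": max(levels_db),
    }


# Invented per-track levels for one hypothetical device (not measured data).
per_track_db = [-58.2, -61.0, -55.4, -63.7, -59.9]
summary = summarize_levels(per_track_db)
print(summary)
```

A median is used here because it is less sensitive than a mean to a single unusually difficult track; other summaries, such as a histogram of levels, would serve the same purpose.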

Benefits of the new audio metric:

  • a fairly simple and inexpensive way to measure Difference Level
  • df measurements give a comprehensive picture of a device's audio performance
  • higher correlation of measurements with subjective sound quality thanks to the use of real-world sound signals
  • easier interpretation of measurement results, understandable to the layperson
  • Df parameter can be used for both analog and digital signals (regardless of sample rates for the latter)
  • df-metric helps to determine the lower threshold of sound signal degradation, at which this degradation becomes imperceptible for any listener (or majority of listeners). In other words, it helps to determine the required transparency level of an audio path.

The transparency level can be determined both theoretically and based on measurements of real audio devices on the market. If necessary, it is possible to define several additional classes of semi-transparency.

One disadvantage of the Df parameter is its indifference to the nature/type of signal degradation. As we know, not all distortions are equally noticeable to our hearing. However, all audio engineering experience to date shows that weighting different types of degradation of real sound signals according to psychoacoustic principles is hopelessly difficult. Their noticeability depends significantly both on the nature of the music/speech signal (genre, saturation, tempo, ...) and on the particular listener's perception: her listening experience, musical preferences, characteristics of her hearing (age) and even the musical culture in which she grew up (perception of consonance and dissonance). With such a set of complicating factors, developing a universally agreed objective audio metric that takes them all into account is simply impossible. Fortunately, there is no longer any need for one.


Engineering approach to sound quality (the new paradigm). The current level of technology makes it possible to design and produce audio equipment with an arbitrarily low level of distortion/degradation of the music signal. The measurements of real audio devices available to date (few so far) show that the audio transparency level for different types of devices lies in the range from -70 dB to -90 dB, and some devices on the market already provide such accuracy. In other words, today it is easier to provide the required transparency of an audio path by engineering methods than to sort out which distortions are acceptable because of their low audibility and which should be excluded. The paradigm of inevitable distortion can be safely abandoned for good. It is just a breeding ground for marketing abuses in the audio industry. This is especially true of manufacturers of High-End equipment, who are essentially sellers of distortions, making their customers believe that their distortions are the most musical and fully reveal the richness of recorded sound. The new paradigm of transparent audio (a lossless digital/analog audio path) is more honest and makes it possible to shift the focus of listeners' attention from manufacturers of audio equipment to the creators of recordings [The Honest Audio]. After all, it is they, in the studio, who determine the warmth, depth and detail that magazines and websites, which exist thanks to the advertising budgets of audio equipment manufacturers, so love to talk about. Discussing all those sound characteristics with the manufacturer makes no more sense than discussing the letters you receive with the postman. Any sound playback equipment is just a communication channel for an audio signal, and the sound quality of this channel is fully determined by a single parameter: the accuracy of signal reproduction, its transparency.
Only if, for some reason, the required transparency of the channel cannot be ensured does it make sense to allow certain distortions, using our knowledge of psychoacoustics. Such exceptional cases are becoming fewer and fewer.

Besides strengthening the role of authors/musicians/engineers, the proposed paradigm has a number of additional features and advantages:

  • easy customer control of the quality of audio products thanks to a reliable and affordable measurement method
  • recovery of the consumer audio market (and not only it) by displacing unnecessarily expensive devices of dubious quality and replacing them with equipment that has the lowest price/quality ratio. Outstanding sound quality at a minimum price is not a utopia, but a real goal that can be achieved through the efforts of consumers alone
  • commoditization of many hardware audio solutions and improvement of their sound quality thanks to clear guidelines for their design and manufacturing. Everything that sounds can sound perfectly. It's not only a question of aesthetics, it is a question of our hearing well-being too
  • for audiophiles who practice attentive listening, there will be an easy way to achieve the best sound. As the experience from listening to a recorded sound cannot be any better than what has been achieved in studio, preparing a listening environment according to studio standards becomes a simple and logical guide. The appropriate recommendations in professional audio are well elaborated. And by the way, the very division of audio equipment by quality (not by functionality) into professional and consumer looks more and more far-fetched. Moreover, as we have already found out, the line separating creators of recordings from their listeners gradually becomes less clear
  • for those who believe that a recording can sound better than in studio, a new market of software plug-ins for creative listening is opening up. The transparent audio path can easily be supplemented with sound processing tools: EQs, correctors of room acoustics and frequency response of speakers/headphones (Harman Target Curve everywhere?), tube/vinyl sound emulators, enhancers of depth, width, warmth, details ..., vocal removers and other audio-photoshop filters. The list is limited by imagination of developers only.

The changes will also affect the so-called listening rituals. Warming up the tube amps, cleaning the vinyl and lowering the cartridge onto the groove will give way to reading/viewing additional information about the author/artist (and, after listening, perhaps leaving a comment on their personal page or even providing a little financial support), searching for technical details of how the record was made or, why not, listening to a couple of warm-up tracks of the same style/genre by another author/performer. A connection ritual is an important procedure for any attentive listener, and it can take many forms, not only maintaining and starting up the sound system, which may soon not even be visible to the listener (in-wall speakers, control from a phone/tablet).


A step towards better audio. Many manufacturers today can produce sound equipment of the required quality/transparency level. The new audio metric gives developers a clear recommendation as to which technical parameter must be achieved for this purpose. However, for manufacturers to start doing this, they must be sure that (1) the recommended value of the Df parameter actually provides the required transparency level and (2) there is sufficient market demand for such devices/solutions. Listeners must also be sure of both, so that they can send manufacturers a clear signal of their preferences by making informed purchasing decisions.

Both tasks can be solved by testing the audio equipment available on the market according to the new audio metric. First, such a study will show whether the metric really works, that is, whether the best examples of audio equipment do have higher transparency levels. Second, it will help to determine the required level of transparency more precisely, based on the current (real) market situation. For the study results to be meaningful and convincing, the number of devices examined should be large enough (100 - 200). Presenting the measurement results in the form of df-slides [How to read df-slides] and an interactive 3D model [draft example: 200+ AD/DA converters] suits this purpose perfectly. For a number of reasons this study is easiest to perform in the Digital Portable Audio segment of the audio market:

  • this is a popular way of listening to music in good quality both at home and on the go; almost half of the listeners use top/specialized smartphones for this purpose. Therefore, the results of a study in this segment will be of interest to many people
  • the size of this segment is rather small so almost all of it could be tested, clearly showing who is who in portable audio
  • portable players usually have a relatively low level of transparency (there are exceptions), which allows inexpensive equipment to be used for measurements (TI PCM4222 Evaluation Module, $200). Therefore, many audio enthusiasts will be able to participate in the testing, and it will take less time to complete
  • the measurement method was developed/perfected while testing portable devices [hi-end, portable, beta], so there is already a good understanding of both the possible difficulties of the measurement process and the expected results.

The successful experience gained in this market segment can be extended to other segments later on.


Pump.a.DAP Campaign

The digital portable audio mass testing campaign "Pump.a.DAP" will be carried out:

  • with open discussion of its progress and results [it can be already followed on SE Twitter and on the special Facebook page; other platforms can be used as needed]
  • according to clearly defined/described measurement procedure [soon...] and using open source software [Diffrogram]
  • using open accounting, as the campaign will require some funding. The particular organization of testing and its time frame (expected 4+ months) will depend heavily on the raised sum. However, the results for each DAP will be published as they become available.


Who is invited to participate in the campaign:

  • Audio enthusiasts who already have the TI PCM4222 Evaluation Module (or are ready to purchase it) and who are able to perform simple measurement procedures. As usual, the authors of the measurements will be credited on the df-slides of the players they tested.
  • Portable audio stores, planning to offer their customers really high quality devices with the best price/quality ratio; the campaign will need access to a large number of models that would be very expensive to purchase otherwise.
  • Audio forums and discussion platforms, ready to provide special place for discussion of testing progress and working issues.
  • Scientific and research organizations; this study has a good potential of becoming the basis for the development of new standards in audio.
  • All music lovers and DAP users who are tired of the marketing noise around portable audio, who would like to see a real picture of quality in this segment of the market and who are willing to support this initiative/campaign by sharing its news or with small donations [How to support]. You will most likely be the only source of funding for this campaign, which is carried out for the benefit of ordinary listeners and rather to the detriment of audio manufacturers, who are quite satisfied with the current rules of the game in the market. Absolutely all donations will be mentioned on the page with the results of this mass testing. If you decide to donate to the campaign, don't forget to specify the model of the player whose measurements you would like to see. These models will be tested first.


A step towards a better future. The 21st century has already become a turning point in the musical life of the planet. New ways of delivering music to listeners, decentralized music sharing, experiments with new music economy, remastering of old recordings, democratization of music creation, the increasing use of AI methods for creation, processing and recognition of music and speech, VR and binaural audio ... At the same time, manufacturers of consumer audio equipment for the most part continue to live in the 20th century and still prefer to earn their money through advanced hybrid marketing.

With the advent of the Internet, a profound transformation has begun in human society itself. The main driving force behind this change is broad social self-organization gaining momentum. Music culture was one of the first to engage in this process (P2P Music was ahead of P2P Money). It turned out that many complex technical and economic challenges associated with our music life can be solved by ourselves, using the potential of a global network that can easily connect both listeners to listeners and listeners to musicians. Progress in many other areas that follow will depend on the developments in music industry.

Will we be able to find and consolidate new practices of direct interaction between musicians and listeners (including new economic models)? Or will we rely on third-party services with their internal corporate rules?

Will we be attentive to and curious about new authors and their music? Or will we continue to focus on gadgets and the technical issues of sound reproduction, consuming music content from providers primarily in the background?

Will we, listeners and music lovers, be able to influence the audio market according to our own interests? Or will we keep gazing rapturously at the manufacturers who still manage/manipulate this market (the interests of audio technology users are mostly shaped by them)?

The answers to these questions will partly determine the future that we choose/create:

The future that is based largely on cooperation and attention to each other, transparent decision-making and respect for nature and resources
The future with intensifying competition and invention of ever more civilized ways of deceiving each other, with infinitely stimulated consumption and decisions made within closed interest groups.

This Audio Transparency Initiative and the corresponding testing campaign, designed to shift listeners' attention from technology to music and its authors, are my small step towards a better future.


A few final words in-person:

Serge Smirnoff
You can support the initiative on the page with first results of portable players measurements - http://soundexpert.org/vault/dpa.html#screen-of-fame
Posted on 3/6/21 8:15 PM.