7 Keystones of Accurate HiFi Blind Test [Article]

Convert audio files for mobile phone, tablet, dap, car, dac, smart tv, music server, computer

bestseller

→ ORDER SACD ISO to DSF

SACD ISO to DSD64 only, no audio processing.

Details...

bestseller

→ ORDER [Full Version]

Save budget for full abilities.

Details...

bestseller

→ ORDER [DSF/DFF to FLAC]

DSF/DFF (up to DSD128) to FLAC, WAV, mp3, etc (up to 24 bit/192 kHz).

Details...

bestseller

→ CONFIGURE & ORDER

Save budget for small functionality.

Details...

¹ If you buy first core module (2023/12.x version) of your "AuI ConverteR Modula-R" module kit from 24 August 2023 to 24 October 2023, you will get free update to version 2024 (13.x) after its release for all modules that are bought from 24 August 2023 to 24 October 2023.
 If you buy "AuI ConverteR PROduce-RD" (2023/12.x version) from 24 August 2023 to 24 October 2023, you will get free update to version 2024 (13.x) after its release.

"Audiophile myths debunked" is one of the popular topics in forums and articles.
As rule, these "myths" is based on listening impressions, unexplained technically or small measurable difference is inaudible theoretically/probably.
To check the "myths" accurately, it's recommended to use a method, that ensures repeatability with the same results in similar conditions.

HiFi audio blind test is a way of "objective" measurement of "subjective" perception.

Read about the most important things, turning a HiFi test into safe evidence.

February 6, 2024 updated

Author: Yuri Korzunov,
Audiophile Inventory's developer with 25+ year experience in digital signal processing,
author of the articles that make audio easy for beginners

7 Keystones of HiFi Blind Test [Article]

How to estimate sound quality
Double-blind audio test
ABX test for audio
What's proper blind Hi-Fi test
Methodology
Test protocol
Listening place
Testing equipment issues
Measurement precision issue
Big number of measurements
Careful condition control
Conclusions
Frequently Asked Questions
Popular HiFi test kinds
References
Read more

How to estimate sound quality

HiFi blind test is a comparison when a listener (participant) try recognize unknown audible sample.

Sample here is either a recording, or equipment, or software, or apparatus/software mode.

This trial is an attempt to measure non-measurable music feature - perceived sound quality.

Objective here is measurability and repeatability in equal conditions. It's a scientific approach.

Subjective here is human feelings.

Still, we cannot access foreign feelings directly. We cannot listen to music like another person. We cannot be sure, that we feel the same each time even.

We can ask him/her only. But they can't exactly describe their feeling. Especially, it refers to the subtlest details, provided by modern musical equipment, software, HiFi test records.

But we want to know: is it really audible or not?

The main aim of the blind trial is to eliminate the subjective part of perception: price, bias, habit, etc. I.e. probable imaginary perception should not impact the experiment.

Double-blind test is used for the "subjectivity" reduction.

Double-blind test is a trial where neither listener nor the conductor knows what is sample.

Blind test audio

ABX test for audio

ABX test may be used in ear comparison.

ABX test is a trial where samples A and B may be compared with sample X (either A or B). Listener should recognize what is X (A or B).

ABX test audio

ABX audio test software

Foobar2000 with ABX test software plugin may be used to ABX test of audio files. However, the author doesn't know what happens with audio stuff when ABX-test-foobar's plugin prepares files before comparison (progress when the plugin start).

For ABX test Mac OS software you can try to find ABXer utility. For iPhone, ABX Tester application exists. Also, cross platform Lacinato ABX is available. The author doesn't learned these pieces of software, so he has no opinion.

If somebody claims that sound is different, it can't measure its audibility. Because we listen in the brains via ears.

Theoretically, measurement tools are more sensitive than human ears. But, without trials, we don't know where is audibility edge of difference.

Audibility edge is:

maximal value of a feature or
difference of values between listening samples,

that cannot be distinguished by human.

Professional double blind test of music equipment is not home entertainment. It is hard long expensive work.

It doesn't necessarily audiophile blind test should be done at laboratory. At home nobody can prohibit us to do it.

But safe trial results are necessary as claim evidences.

Proper blind test should consists these elements:

Methodology
Protocol
Big number of measurements
Listening place
Testing equipment issues
Measurement precision issue
Careful condition control

Test should ensure repeatability in the same conditions.

Methodology

Trial begins with methodology design. Methodology define trial's:

aim,
precision,
implementation,
other things, noted below.

Methodology should be designed first

Test protocol

Test protocol is paper, where recorded detailed data about conducted trial:

equipment,
measurement tools,
participants (and its listening skill, occupation, education, other),
trial conditions,
other.

Hi-Fi test protocol

Hi-Fi test protocol

Testing and measurement equipment may be registered with unique identifier of item (serial number, as example).

The main protocol aim is ability to check experiment conditions in case of doubt or to better understand result reasons.

Listening place

To save time several participants may be accommodated in one room. In the room are placed several seat rows.

In the trial result interpretation, necessary take into account:

each seat impact to sound,
seat with and without listener impact differently,
speaker sound is different for each seat.

Speaker have individual radiation pattern.
Frequency response depend on listener place relatively speakers and listening room's walls. Because acoustical rays are interfere, bounce from a surfaces and interfere too.

Testing equipment issues

Testing equipment (apparatus that will tested in the trial) should be checked to workability by way, that may be described in the methodology of the trial.

It is desirable, if testing equipment have unique identifier (serial number). It help to repeat experiment or found difference result reasons in case of doubt.
Because different equipment instances may have various figures and sound differently.

Very important thing!

LOUDNESS OF COMPARED SAMPLES

Human perception edge is close to 1 ... 2 dB of loudness. So 0.1 ... 0.2 dB level normalization of samples is recommended.

Human have limited time of echoic memory (auditory event retaining) [2].
So time sample listening should not be too long. Immediate (real-time) easy switch between samples must be ensured.

Updated: When we compare headphones, tested sample can't be isolated from participant technically. For solving this issue, synthesis of headphone features and comparison via single headphone unit were suggested [3].

Measurement precision issue

In statistical calculations, measurement error values may be different. We don't know it exactly, but accept in first approach, that it have normal distribution [1]. So...

To provide measurement precision X, it is recommended measure value Y with precision X/3 or higher.

As example, we want to provide ear trial measurement precision 1%.

1/1% = 1/0.01 = 100 tryings.

So trying number 3 times higher 300 = 100 * 3 is recommended.

Defined measurement precision should be provided

Big number of measurements

The ear test have a many variables due human factor, above all.

Somewhat, it may be compensated by big number of tryings, participants, equipment items.

Big participant and trying number is recommended

In the "Measurement precision issue" part was considered example: providing 1% precision.

There are recommended 300 tryings.

However, it don't guarantee 100 % of true.

Participant skills can cause biasing of results, as example.

To compensate it with precision 1%, we should invite 300 = (1/1%)*3 = (1/0.01)*3 participants.

Thus total number of measurements is 90000 = 300 trying * 300 participants

We should account each feature, that can cause mistake probably. To a first approximation, we should invite participants with different skills and groups of participants with same skill should have same number of members.

In the trial conclusions we should group results by participant skills.

General rule:

number measurements is enough,
if adding of several participants and/or tryings and/or equipment items and/or other
impact to results into allowable error margin.

Careful condition control

Conditions define experiment. Changing conditions can bias results significantly.

Careful control of conditions provide the test exactness

Careful control of coditions provide the test exactness

Example:

Somebody want to compare DACs. He asks peoples who have the devices, check it on some public HiFi test CD.

These tests, as rule, performed at home without exact condition control. We have experiment at different speakers, listening places, the ambient noise, etc. Even experiment, described in details, may have subtle, at first glance, issues, that impact upon result.

As rule, before experiment we can't know exactly, that details are important in the trial. Only result figures can show, that details are unimportant in our experiment: impact to result into allowable error margin.
As example, we cannot expect listening skill affected trial results. But for some tests, it might be important.

Conclusions

Blind test, double blind test, ABX test are intended to "objective" measurement of "subjective" (immeasurable) perception.
Properly performed and recorded hi-fi blind test may be a serious objective evidence of subjective perception.
The trial may be performed at home for personal purposes, but can't be used as technical evidence.
The trial should be designed, performed in a manner that maximizes independence from number of participants and tryings and/or other. As reference value may be allowable changes into error margin.
The independence is achieved by big number of participants and tryings, equipment items, etc.
Even correctly designed and performed HiFi test don't guarantee absolute truth. Because new knowledge may open new details.

Share the article in the social networks, forums

Frequently Asked Questions

How do you test a blind audio?

To blind test its partisipants should not know what is sounds now. It may require additional equipment.

Read details...

What is blind listening?

Blind listening is means that listener can't see acoustic source.

Read details...

Popular HiFi test kinds

lossless vs mp3 test
16 bit vs 24 bit
PCM vs DSD
FLAC vs WAV
acoustic cables

References

Video: Some New Evidence that High School and College Students May Prefer Accurate Sound Reproduction

Audio Basis - articles about audio

7 Keystones of Accurate HiFi Blind Test [Article]

How to estimate sound quality

Double-blind audio test

ABX test for audio

What's proper blind Hi-Fi test

Methodology

Test protocol

Listening place

Testing equipment issues

Measurement precision issue

Big number of measurements

Careful condition control

Conclusions

Frequently Asked Questions

How do you test a blind audio?

What is blind listening?

Popular HiFi test kinds

References

Read more