The competition between DSD (DSF and DFF files, SACD) and PCM (FLAC, WAV, AIFF, mp3, and others) produces many debates. But the DSD and PCM formats share common features and principles. Read a sound quality comparison, myths, and a technical explanation by Yuri Korzunov, an audio software developer with 25+ years of practical experience in signal processing.
When we think about a new DAC, we ask: Is DSD worth it? What is better: DSD or PCM? What about DSD vs vinyl? Or DSD 2.8 MHz versus 192 kHz / 24-bit PCM?
Direct Stream Digital is implemented in DSF and DFF files and on SACD.
Pulse Code Modulation is implemented in multiple formats: WAV, AIFF, FLAC, ALAC, mp3, CD-audio, etc. These are the so-called PCM files.
Vinyl, as an analog mechanical source, causes more distortion than digital DSD. But some audio distortions may be perceived as subjective sound advantages. So this matter is complex enough.
Both PCM and DSD are digital formats. By summing up some widespread myths:
Below we will consider technical explanations and debunk these myths.
First, let us define what "sound quality" means in the discussion below.
For a digital audio format, quality is the degree of identity between the original and the restored waveforms.
There is no mystery here. The degree of identity is readily detected via time-spectral analysis in its different forms.
It may be measured as the simple difference between the original and restored waveforms (sample difference).
However, the spectral method is more informative and more sensitive to distortions.
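Both ways of measuring the identity of two waveforms can be sketched in a few lines. This is a minimal illustration rather than a measurement tool: the test tone, the small added distortion, and the function names are assumptions made for the example, and the plain DFT is used only because it needs no libraries.

```python
import cmath
import math


def sample_difference_rms(original, restored):
    """RMS of the sample-by-sample difference between two waveforms."""
    diffs = [a - b for a, b in zip(original, restored)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def dft_magnitudes(signal):
    """Magnitudes of a plain DFT for bins 0 .. N/2 (slow, for illustration)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags


# Hypothetical test data: a sine in bin 4 and a "restored" copy with a small
# distortion tone added in bin 12.
N = 64
original = [math.sin(2 * math.pi * 4 * i / N) for i in range(N)]
restored = [x + 0.01 * math.sin(2 * math.pi * 12 * i / N)
            for i, x in enumerate(original)]

# Sample difference: one number that says how large the error is.
rms_error = sample_difference_rms(original, restored)

# Spectral view of the same error: it also says where the error sits.
error_spectrum = dft_magnitudes([r - o for o, r in zip(original, restored)])
worst_bin = max(range(len(error_spectrum)), key=lambda k: error_spectrum[k])
```

The sample difference gives only the size of the error, while the spectrum of the difference also shows at which frequency the distortion appears.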
The difference between PCM (pulse code modulation) and DSD (sigma-delta modulation) is not as great as it seems at first glance.
Both kinds of modulation carry the (musical) signal in the lowest part of the spectrum.
In DSD, the lack of bits is compensated by a different distribution and behavior of the quantization noise.
For PCM, quantization noise is evenly distributed across the range {0 … [sample rate]/2}.
For DSD, noise energy is pushed to the inaudible high part of the spectrum. This pushing is called noise shaping.
A significant part of the noise energy lies outside the audible range, so a reserve of total bandwidth is needed. I.e., a sample rate higher than for PCM is required.
PCM quantization noise correlates with the useful signal: no signal means no noise.
DSD noise doesn't depend on the signal. There is noise even during silence. The DSD DAC filters this noise out.
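The behavior described above can be tried on a toy model. The sketch below is a first-order 1-bit sigma-delta modulator, which is an assumption made for brevity: real DSD encoders use higher-order modulators. The moving average is only a crude stand-in for the low-pass filter of a DAC.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: inputs in [-1, 1] -> stream of +1/-1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback  # accumulate the difference input - output
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits


def moving_average(values, width):
    """Crude low-pass filter standing in for the DAC's analog filter."""
    out = []
    acc = 0.0
    for i, v in enumerate(values):
        acc += v
        if i >= width:
            acc -= values[i - width]
        if i >= width - 1:
            out.append(acc / width)
    return out


# Digital silence still produces a busy high-frequency bit pattern ...
silence_bits = sigma_delta_1bit([0.0] * 8)
# ... which the low-pass filter turns back into silence.
silence_filtered = moving_average(silence_bits, 2)

# A constant input of 0.5 is carried by the average density of +1 bits.
dc_bits = sigma_delta_1bit([0.5] * 1000)
dc_estimate = sum(dc_bits) / len(dc_bits)
```

The output stream is never "quiet", yet its low-frequency content follows the input. All the extra activity sits at high frequencies, where the filter removes it.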
A signal above the maximum level causes overload.
The minimum signal level may be less than or equal to the quantization noise level (a.k.a. noise floor). A signal "under" the noise is poorly recognized.
Simplified dynamic range
Simplified dynamic range is the difference between the maximum level and the noise floor.
Warning: It is not a technically correct definition. But, within this article, we will use it for easier understanding.
Dynamic range is the difference between the maximum and minimum allowable levels of a signal transmitted through a unit.
We can discuss dynamic range as a common basis for both PCM and DSD. Dynamic range is a single concept for different bit depths.
The maximum level in digital audio is taken as 0 dB. It does not depend on bit depth. But the noise floor depends on the bit resolution.
As an example, we have a certain noise level for some PCM bit depth.
After the bit resolution is reduced, the noise level is higher and the dynamic range is reduced.
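The dependence of the noise floor on bit depth can be checked numerically. The sketch below quantizes a near-full-scale sine without dither and reports the ratio of signal power to error power. The test tone, its frequency, and the function names are assumptions chosen for the example. The result should come out close to the well-known rule of thumb of about 6 dB per bit.

```python
import math


def quantize(x, bits):
    """Round x in [-1, 1) to a uniform grid with 2**bits levels."""
    step = 2.0 / (2 ** bits)
    return step * round(x / step)


def simplified_dynamic_range_db(bits, n=4096):
    """Ratio of sine power to quantization error power, in dB."""
    # Hypothetical test tone: the frequency is picked so that the samples
    # do not repeat in a short pattern.
    signal = [0.999 * math.sin(2 * math.pi * 0.1234567 * i) for i in range(n)]
    error = [quantize(s, bits) - s for s in signal]
    signal_power = sum(s * s for s in signal) / n
    noise_power = sum(e * e for e in error) / n
    return 10 * math.log10(signal_power / noise_power)


# Fewer bits -> higher noise floor -> smaller dynamic range.
ranges = {bits: simplified_dynamic_range_db(bits) for bits in (8, 12, 16)}
```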
Bit depth truncating (from PCM to DSD)
However, the noise level may be kept low within the limited useful frequency band. The noise energy that grows due to bit depth truncation may be pushed out of the band via noise shaping.
Noise shaping, PCM to DSD transformation
The result is very similar to DSD. And, yes! It is a real multibit DSD!
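Bit depth truncation with noise shaping can be sketched as follows. This is a first-order error-feedback quantizer, which is an assumption made for the illustration: production converters use higher orders. The moving average plays the role of a crude low-pass filter that looks only at the low (useful) band, and the slow sine stands in for an oversampled musical signal.

```python
import math


def requantize(samples, bits, noise_shaping):
    """Reduce bit depth; optionally feed the previous error back (first order)."""
    step = 2.0 / (2 ** bits)
    out = []
    prev_error = 0.0
    for x in samples:
        target = x - prev_error if noise_shaping else x
        y = step * round(target / step)
        prev_error = y - target  # error made on this sample
        out.append(y)
    return out


def low_band_error_power(original, processed, width):
    """Power of the error after a moving average (a crude low-pass filter)."""
    err = [p - o for o, p in zip(original, processed)]
    smoothed = [sum(err[i:i + width]) / width
                for i in range(len(err) - width + 1)]
    return sum(e * e for e in smoothed) / len(smoothed)


# Hypothetical test signal: a slow sine, i.e. heavily oversampled audio.
N = 4096
signal = [0.47 * math.sin(2 * math.pi * i / 1024) for i in range(N)]

plain = requantize(signal, bits=4, noise_shaping=False)
shaped = requantize(signal, bits=4, noise_shaping=True)

plain_error = low_band_error_power(signal, plain, width=64)
shaped_error = low_band_error_power(signal, shaped, width=64)
```

With feedback, the output error becomes the difference of two successive quantization errors. That difference is small at low frequencies and large at high ones, so the error left in the low band drops sharply although the bit depth is the same.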
Now let us go from DSD to PCM. Here we also have a given noise level.
Noise shaping may be characterized by its steepness. The total band and the useful audio band together define the steepness.
Noise shaping steepness
Steeper noise shaping may be used to expand the useful band. So DSD64 (2.8 MHz) vs DSD128 (5.6 MHz) is a matter of steepness.
Higher steepness can make the sigma-delta modulator (noise shaper) less stable. So lower steepness may be good.
Adding a bit to the sample resolution decreases the noise level. So the noise shaper's steepness may be reduced too, because less noise energy needs to be pushed out of the useful band.
Adding bits lowers the steepness: DSD to PCM transform
Finally, the noise floor becomes flat. So the DSD signal is transformed to PCM.
DSD has a significantly higher sample rate than PCM. This is needed because a band reserve is required to receive the excess noise energy. It makes the useful frequency band cleaner.
However, even simple band expansion lowers the noise level, because the total quantization noise energy is constant and it is distributed across the full band.
Graphically, noise energy is the area under the noise spectrum. It's like a rectangle. If its width (band) is expanded, its height (noise level) diminishes.
Sample rate and quantization noise level
Doubling the band decreases the noise level (noise power density) by about 3 dB.
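The rectangle picture can be turned into arithmetic. The sketch assumes what the text states: the total quantization noise power is constant and spread evenly over the band. Counted in power terms, the height of the rectangle then falls by 10·log10(k) dB when the band grows k times. That is about 3 dB per doubling, and 6 dB corresponds to a fourfold expansion. The function name is made up for the example.

```python
import math


def noise_density_change_db(band_ratio):
    """Change of flat noise power density when the band grows by band_ratio.

    Assumes the total quantization noise power stays constant and is spread
    evenly over the whole band.
    """
    return -10 * math.log10(band_ratio)


change_2x = noise_density_change_db(2)    # about -3 dB
change_4x = noise_density_change_db(4)    # about -6 dB
change_64x = noise_density_change_db(64)  # about -18 dB
```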
So we can say that DSD decreases the noise level in two ways simultaneously: by noise shaping and by band (sample rate) expansion.
The Nyquist theorem is the same for both DSD and PCM. In both cases, the upper half of the spectrum is a mirror image of the lower one. I.e., the useful spectrum occupies the lower half of the spectrum. The upper half should be filtered out to restore the analog signal.
For DSD, the high part of the lower half of the spectrum should be filtered out too, because it contains the high-frequency noise of sigma-delta modulation.
DXD is high-resolution PCM that is extracted from DSD while keeping the "legacy" ultrasound noise. DXD-compatible hardware must filter the noise. Otherwise, the ultrasound noise can cause audible products due to non-linear distortions in the playback audio system.
DXD is designed for DSD editing. However, some processing has non-linear distortions, and the ultrasound noise can cause audible products.
Each audio application begins with an analog-to-digital converter (ADC).
There are many types of ADC.
Before digitizing the analog signal, the ADC must suppress everything above [sample rate]/2.
Otherwise, that content will be shifted/mirrored into the range below half of the sample rate.
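The mirroring is easy to compute. The helper below is an illustrative sketch with a made-up name. It folds any input frequency into the range 0 … [sample rate]/2, which is where the tone would appear if it reached the sampler unfiltered.

```python
def alias_frequency(frequency_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling without filtering."""
    folded = frequency_hz % sample_rate_hz
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded  # mirrored around half the sample rate
    return folded


# A 30 kHz tone sampled at 44.1 kHz lands inside the audible band.
aliased = alias_frequency(30_000, 44_100)          # 14100
# The same tone sampled at the DSD64 rate (2 822 400 Hz) stays where it was.
not_aliased = alias_frequency(30_000, 2_822_400)   # 30000
```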
Practically, it is recommended to suppress everything above the transmitted audio band 0 … 20 kHz (maybe slightly more). It's necessary to avoid transmitting excess energy that consumes dynamic range resources.
For PCM and DSD ADCs, suppression is provided by an analog low-pass filter only. The filter has a sloped suppression characteristic over frequency (suppression of about 20 … 48 dB per octave).
An octave is a 2-times difference between frequencies.
The higher the recording sample rate, the more suppression and the higher the quality (less distortion) of the captured sound.
A DSD ADC has a significantly higher sample rate than a PCM ADC. It provides better suppression in the forbidden frequency range.
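How much suppression a given slope buys can be estimated by counting octaves between the edge of the audio band and half of the sample rate. The sketch rests on simplifying assumptions: the filter is flat up to 20 kHz and then falls along a straight line, and the slope of 24 dB per octave is just one value from the 20 … 48 dB range. Real filters behave less ideally.

```python
import math


def suppression_at_nyquist_db(sample_rate_hz, slope_db_per_octave=24.0,
                              band_edge_hz=20_000.0):
    """Attenuation reached at half the sample rate by an idealized filter.

    Assumes the filter is flat up to band_edge_hz and then falls along a
    straight line of slope_db_per_octave.
    """
    octaves = math.log2((sample_rate_hz / 2) / band_edge_hz)
    return max(0.0, slope_db_per_octave * octaves)


pcm_44k = suppression_at_nyquist_db(44_100)     # about 3.4 dB
dsd64 = suppression_at_nyquist_db(2_822_400)    # about 147 dB
```

At 44.1 kHz there is only a fraction of an octave between 20 kHz and half of the sample rate, while at the DSD64 rate there are more than six octaves for the same analog filter to work with.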
All excess content may then be filtered in digital form.
Using a resistor matrix in a DAC demands very high precision of components and voltages.
A simpler solution is to use a fast-growing sawtooth voltage to measure the input analog value.
This is the principle of DSD. I.e., DSD is a simpler/cheaper format for sound capturing than PCM.
Using a DSD digital-to-analog converter (DAC) allows the circuit and the DAC adjustment to be simplified as much as possible.
To a first approximation, a DSD DAC is a simple low-pass filter (that passes only low frequencies, the musical content).
A sample rate higher than for PCM simplifies the analog filter. So a steep transition to the suppression area, as for PCM, is not needed.
So many precise components are not needed either.
Almost all modern DACs use internal PCM to DSD conversion "on the fly" for digital-to-analog conversion.
If DSD is used as the end-user format,
are needed only.
I.e., the same result is achieved with less effort than with "native" PCM.
The digitizing and restoring to analog of a square wave has been compared for PCM and DSD more than once.
A steeper front and less ringing at the leading/trailing edges of a square impulse are considered DSD advantages.
Let's consider the "ideality" of digitizing/restoring the square wave.
A square wave has an infinite spectrum. I.e., for ideal digitizing/restoring, an infinite sample rate is needed.
The DSD sample rate is significantly higher than the PCM sample rate. This is the reason for the steeper edges of the square impulse.
Lower ringing for DSD is the result of a less steep filter than the one used for PCM, which needs a steeper filter because of its lower sample rate.
On the other hand, using wider bands (above 20…24 kHz) for DSD passes more noise energy, which grows fast above 24 kHz.
I.e., the price of a better square shape is a higher noise level.
Less ringing due to the lower steepness of the DSD DAC filter comes with worse filtration. Thus it causes a higher noise level.
With an increased PCM sample rate, steeper edges of the square impulse may be achieved too.
I.e., there is no difference between DSD and PCM in restoring a square wave. Only the sample rate values and the filter steepness matter.
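That the edge steepness depends on the passed band, and not on the format, can be shown with the Fourier series of a square wave. The sketch below builds the wave only from the harmonics that fit into a given bandwidth and measures the slope of the rising edge. The 1 kHz fundamental, the two bandwidths, and the function names are arbitrary choices for the illustration.

```python
import math


def band_limited_square(t, f0_hz, bandwidth_hz):
    """Square wave of frequency f0_hz built only from harmonics <= bandwidth_hz."""
    value = 0.0
    k = 1
    while k * f0_hz <= bandwidth_hz:
        value += (4 / math.pi) * math.sin(2 * math.pi * k * f0_hz * t) / k
        k += 2  # a square wave contains odd harmonics only
    return value


def edge_slope(f0_hz, bandwidth_hz):
    """Slope of the rising edge at t = 0, in full-scale units per second."""
    dt = 1e-9
    rise = (band_limited_square(dt, f0_hz, bandwidth_hz)
            - band_limited_square(-dt, f0_hz, bandwidth_hz))
    return rise / (2 * dt)


slope_20k = edge_slope(1_000, 20_000)  # harmonics 1, 3, ..., 19
slope_96k = edge_slope(1_000, 96_000)  # harmonics 1, 3, ..., 95
```

Each added harmonic contributes the same amount to the slope at the edge, so the edge gets steeper in proportion to the number of harmonics that the band lets through, whichever format carries them.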
Now let me ask: why do we need to restore an ideal square wave for audio applications?
Only one solid, practically proven theory exists: humans hear up to 20 kHz.
Sometimes we can see references to the article.
However, the article considers the brain's analysis of the audio environment via a principle dissimilar to Fourier analysis.
It allows discriminating short time quanta of audio content.
However, it says nothing new about the mechanical capabilities of human ears, which hear up to 20 kHz.
Therefore, why do we need to ideally re-create the shape of a square impulse?
In audio applications we listen with our ears; we don't watch with our eyes.
So it is necessary to provide maximal fidelity in the 0 … 20 kHz range.
It leads to a visibly (by eyes!) lower steepness of the fronts.
Anyway, inside our head we get the same lower front steepness, even when the square wave is ideally (theoretically) played back on speakers.
It is a feature of our ears.
So why is it needed to restore a square shape better than our ears can receive?
After restoring, the upper range can produce audible products due to non-linear distortions.
And the range will be analyzed via the "principle dissimilar to Fourier" :)
Almost everybody knows that it is impossible to edit DSD "natively".
Here "native" means editing without intermediate conversion to PCM. Such conversion is considered "very-very bad"!
It's necessary to consider two things:
PCM is a format with multi-bit samples. Adding bits to DSD changes nothing.
You have probably heard the "scary" phrase "destructive decimation".
Decimation is the simple removal of excess samples to decrease the sample rate.
Before decimation, a filter that removes all frequencies above half of the output sample rate is applied, to avoid distortions in the audible range.
This filtering can cause ringing artifacts.
However, high-quality filtration has significantly fewer artifacts than mixing and effect processing in music production.
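Decimation with its protective filter can be sketched as below. This is a simplified illustration and not the way any particular converter works. The windowed-sinc filter, the Hamming window, the number of taps, and the 0.9 safety factor are all assumptions chosen to keep the example short.

```python
import math


def lowpass_taps(cutoff, num_taps=127):
    """Windowed-sinc FIR low-pass; cutoff is a fraction of the input sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # unity gain at 0 Hz


def decimate(samples, factor):
    """Low-pass below half of the output rate, then keep every factor-th sample."""
    # 0.9 leaves some room for the filter's transition band.
    taps = lowpass_taps(0.5 / factor * 0.9)
    filtered = [
        sum(t * samples[i - j] for j, t in enumerate(taps) if i - j >= 0)
        for i in range(len(samples))
    ]
    return filtered[::factor]


# Hypothetical test tones, with frequencies given as fractions of the input rate.
low_tone = [math.sin(2 * math.pi * 0.01 * i) for i in range(800)]
high_tone = [math.sin(2 * math.pi * 0.30 * i) for i in range(800)]

kept = decimate(low_tone, 4)      # passes almost unchanged
removed = decimate(high_tone, 4)  # would alias without the filter
```

The low tone survives the conversion, while the tone above half of the output sample rate is suppressed before the excess samples are dropped, so it cannot be mirrored into the audible range.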
PCM has no noise in the upper part of the spectrum. This allows non-linear processing (for example, overdrive/distortion effects) to be applied successfully to musical content.
As an alternative, multi-bit DSD is suggested. However, "multibitness" only solves mixing and the simplest volume changes by multiplication.
An elementary level change by 1 dB causes trouble.
Multi-bit DSD has noise in the upper part of the spectrum too. Thus, to apply non-linear processing, converting DSD to "PCM" is desirable.
Also, remember that a computer can apply multibit math processing only.
The end user of the editing system should not worry about intermediate conversion(s). It is an engineering issue how to find "hidden" possibilities and what "tricks" need to be applied.
It is necessary to consider the editing system as a ready solution with given features at its input and output.
The main issue is that DSD and PCM are technically impossible to compare as digital formats (including by blind test).
It is possible to compare only real systems that use either PCM or DSD or both.
The systems must be considered as "black boxes" with analog input and output signals. These signals can be compared via spectral methods or listening tests.
The final result depends on the recording, components, settings, and technical solutions used.
DSD over PCM (DoP) is a special digital audio protocol that provides compatibility between the DAC interface and another device, for instance, a computer.
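The idea of DoP can be sketched as packing groups of 16 DSD bits into 24-bit PCM words. The details below are stated as assumptions that should be checked against the DoP specification: the upper 8 bits of each word carry a marker that alternates between 0x05 and 0xFA, and the first DSD bit in time goes to the most significant payload bit. The function name is made up for the example. With 16 DSD bits per word, a DSD64 stream of 2 822 400 bits per second needs a PCM frame rate of 176 400 Hz.

```python
DOP_MARKERS = (0x05, 0xFA)  # assumed marker bytes, alternating on successive words


def pack_dop(dsd_bits):
    """Pack a list of DSD bits (0/1) of one channel into 24-bit DoP words."""
    if len(dsd_bits) % 16:
        raise ValueError("number of DSD bits must be a multiple of 16")
    words = []
    for frame, start in enumerate(range(0, len(dsd_bits), 16)):
        payload = 0
        for bit in dsd_bits[start:start + 16]:
            # The first bit ends up in the most significant payload position.
            payload = (payload << 1) | bit
        words.append((DOP_MARKERS[frame % 2] << 16) | payload)
    return words


words = pack_dop([1, 0] * 16)    # 32 DSD bits -> two 24-bit words
pcm_rate_hz = 2_822_400 // 16    # 176400
```

A device that understands DoP can recognize the alternating markers and unpack the DSD bits, while to the rest of the interface the stream looks like ordinary 24-bit PCM.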
In the general case, DSD and PCM each have their own issues. However, a PCM DAC contains a sigma-delta modulator (DSD conversion) inside to solve some digital-to-analog conversion issues. So, theoretically, a DSD DAC is simpler than a PCM one.
Actually, audio device design and recording quality are what matter.
DSD has some theoretical advantages compared to PCM. Whether the DSD advantages are implemented well enough to surpass a given PCM case is unknown.
SACD is an optical disc. It contains sound recordings in DSD format.
Lossless sound files are the highest quality audio files.
FLAC is a PCM audio format. DSD may be packed into PCM as DoP.
DSD isn't better or worse than FLAC. The audio system implementation makes it better or worse.
MQA is one of the implementations of the PCM format. We should compare DSD vs MQA as DSD vs PCM.
Yes. You can convert PCM to DSD...
Yes. FLAC may be converted to DSD. Read guide...
Read guide how to convert mp3 to DSD here...
Read how to convert DSF to FLAC in this guide...
Roon software can convert DSD to PCM when playing DSD music.
To play back DSD audio on a PCM DAC, it's necessary to convert DSD to PCM. Read details...
Spotify doesn't have DSD streaming codec.
Look for DSD streaming...
Tidal does not stream DSD. Read more about DSD streaming...
In simple words
Let's imagine that DSD music is a house built of square bricks.
And PCM music is the same house built of rectangular bricks.
Both houses are plastered on the outside.
You can't distinguish one house from the other.
However, the houses are built by different teams, and they differ slightly.
Details
The practical superiority in quality of each system is a matter of engineering art, not of the format used.
Use the strong sides of each format in the music ecosystem.
Author: Yuri Korzunov,
Audiophile Inventory's developer
December 21, 2022 updated | since September 9, 2015