
PCM Audio [Sound Quality, Myths, Definitive Guide 2019]


PCM audio (pulse-code modulation) is the coding of an analog signal in digital form (a representation in numbers). Most digital audio formats (WAV, FLAC, mp3, AIFF, ALAC, AAC and others) are based on this modulation. This article explains in simple words how PCM works, what defines its sound quality, where the myths about it come from, and how it compares with alternative audio formats. Size-compressed PCM is sometimes called a bitstream. The explanation is written by audio software developer Yuri Korzunov.


How PCM works

PCM analog-digital converter, quantization

An analog-digital converter (ADC) is a device that periodically measures the voltage of an analog signal and sends the measured values as numbers (in digital form) to its PCM digital audio output.
PCM encoding is the conversion of an analog signal to digital form.

Analog and digital form of signal



The period between measurements is the same.

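A sample is the digital value of one measurement (amplitude).

Quantization is the measurement of the voltage level of an analog signal in discrete steps: each measured value is stored as the nearest available level.

The maximum amplitude in digital form has the value 0 dBFS (decibels relative to full scale; the scale holds 2^N levels, where N is the number of bits).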
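The measuring and quantization steps described above can be sketched in a few lines. This is only an illustration: the function names, the test tone and the use of signed integer words are assumptions made for the example, not part of any particular ADC.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, n_samples):
    """Measure a full-scale sine at equal time steps and store integer codes."""
    full_scale = 2 ** (bit_depth - 1) - 1   # largest positive code (assumed signed words)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz              # the period between measurements is constant
        voltage = math.sin(2 * math.pi * freq_hz * t)  # analog value in -1.0 ... 1.0
        codes.append(round(voltage * full_scale))       # nearest quantization level
    return codes

def peak_dbfs(codes, bit_depth):
    """Peak level of the codes in decibels relative to full scale."""
    full_scale = 2 ** (bit_depth - 1) - 1
    return 20 * math.log10(max(abs(c) for c in codes) / full_scale)

# One period of a 1 kHz tone sampled at 48 kHz with 16-bit words.
codes = sample_and_quantize(freq_hz=1000, sample_rate_hz=48000, bit_depth=16, n_samples=48)
print(max(codes), peak_dbfs(codes, 16))
```

A full-scale sine reaches the largest code, so its peak level comes out at 0 dBFS.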

Level quantization



Samples may be stored and transmitted without any change to the information. This is the main advantage of digital signals over analog ones.

However, binary data may still be damaged during storage and/or transmission.



Sample rate

Sample rate (sampling rate) is the number of samples per second (measured in Hz, hertz).

The sampling rate is constant within a pulse-code modulation coded audio stream.

As a rule, an analog signal is coded as real numbers (in the mathematical sense). These are the usual numbers that we use every day.


Sample rate


Nyquist theorem

The Nyquist theorem defines the theoretical minimum sample rate.

Note the word "theoretical". Real implementations have to account for other factors too. The section on myths below discusses why higher sample rates are used.

In simple words (this is not the exact mathematical definition), the Nyquist–Shannon sampling theorem may be stated as:

A coded digital audio signal has a total band = [sample rate]/2

The details of the theorem are considered below, where the sound quality of 44.1 kHz / 16 bit is discussed.

A more exact wording of the theorem in sound terms:
An endless analog sine signal may be coded (to digital form) and restored if the sampling rate is more than 2 times the signal's frequency.

The keyword here is "endless". Real musical signal components are finite.

More samples per finite signal duration keep more information about the source signal for restoring it from digital to analog form.
The more samples per duration, the closer the result is to the infinite case.


Nyquist theorem (about minimal sample rate)



Alternatively, the input samples may be processed via the Hilbert transform. It converts real numbers to complex ones.

A complex number contains a real and an imaginary part.

In this case, the coded digital audio signal has a total band = [sampling rate].

However, this brings little benefit in audio, because the digital data size stays the same.


PCM audio ADC (analog-digital converter) scheme



An analog-digital converter captures the full frequency band at its input. Everything above [sample rate]/2 is folded back into the band {0 ... [sample rate]/2}. This adds noise to the coded digital signal.
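The folding can be illustrated numerically. The sketch below computes where a single tone lands after sampling without an input filter; the function name `folded_frequency` is made up for this example.

```python
def folded_frequency(freq_hz, sample_rate_hz):
    """Return where a tone at freq_hz lands inside 0 ... sample_rate/2 after sampling."""
    half_rate = sample_rate_hz / 2
    f = freq_hz % sample_rate_hz   # the sampled spectrum repeats every sample_rate
    # The upper half of each repetition is mirrored back into the lower half.
    return sample_rate_hz - f if f > half_rate else f

for tone in (10000, 30000, 50000):
    print(tone, "Hz is coded as", folded_frequency(tone, 44100), "Hz")
```

With a 44100 Hz sample rate, an unfiltered 30000 Hz tone appears at 14100 Hz, inside the audible band.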

Analog-digital conversion without input filter: folded spectrum



Thus the analog signal band above [sample rate]/2 should be cut off before conversion.

But an analog filter isn't steep enough. The issue is solved via a higher sampling rate: the higher the sample rate, the lower the filter gain at the [sample rate]/2 point (the less signal remains at the filter output).

To increase steepness, a digital filter is used. But it adds distortions of its own.

PCM analog to digital conversion: steep vs non-steep filter



The sampling rate may also be increased inside a DAC (oversampling) so that the analog filter works better. Oversampling works in a pair with a digital filter.

There is a myth that resampling by a non-integer ratio causes more distortion than resampling by an integer ratio. But for 48000 and 44100 Hz, resampling is applied the same way.



Bit depth

Bit depth is the number of bits in the code (word) that stores one value of the analog signal.

Bit depth defines the number of digital levels that can be stored.


Bit depth


The maximum value of the word corresponds to the maximum positive value of the analog signal at the ADC input. For a signed integer word, its code is:

$2^{N-1} - 1$

where N is the number of bits in the word.

The minimum value of the word corresponds to the maximum negative value of the analog signal at the ADC input. Its code is:

$-2^{N-1}$

The total number of measured levels is:

$L = 2^{N}$
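As a quick check of the word limits, the sketch below lists the smallest code, the largest code and the number of levels for common bit depths. It assumes signed two's complement integer words, which is the usual case for 16- and 24-bit PCM; the helper name `code_range` is hypothetical.

```python
def code_range(bit_depth):
    """Return (min_code, max_code, levels) for a signed integer word of bit_depth bits."""
    max_code = 2 ** (bit_depth - 1) - 1    # maximum positive value
    min_code = -(2 ** (bit_depth - 1))     # maximum negative value
    levels = 2 ** bit_depth                # total number of measured levels L
    return min_code, max_code, levels

for bits in (8, 16, 24):
    print(bits, code_range(bits))
```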


Pulse code modulation: bit depth



Bit depth truncation is reducing the bit depth by removing one or more of the lowest bits.

Rounding is reducing the bit depth by removing one or more bits and adjusting the remaining number according to the removed bit(s).

Rounding may also be applied when a floating-point value is converted to an integer one.

Rounding is mathematically more exact than truncation.
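The difference between the two operations can be seen on a single sample. The sketch below reduces a 24-bit value to 16 bits both ways; the function names and the clipping of the rounded result to the largest output code are choices made for this example.

```python
def truncate_bits(sample, removed_bits):
    """Drop the lowest bits without looking at their value."""
    return sample >> removed_bits

def round_bits(sample, removed_bits, out_bits):
    """Drop the lowest bits, first adding half of an output step so the result is rounded."""
    half_step = 1 << (removed_bits - 1)
    rounded = (sample + half_step) >> removed_bits
    max_code = 2 ** (out_bits - 1) - 1
    return min(rounded, max_code)          # rounding up must not exceed the largest code

sample_24 = 1000                           # 1000 / 256 = 3.90625 in 16-bit steps
print(truncate_bits(sample_24, 8))         # 3
print(round_bits(sample_24, 8, 16))        # 4
```

Here truncation leaves an error of 232 input units, while rounding leaves an error of 24.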


Quantization noise

The codes of analog values stored in the words have limited precision. The limit is defined by the total number of measured levels L. So the stored codes (samples) are not exactly equal to the real analog voltage.

Quantization error is the difference between a sample (digital value) and the real voltage of the analog signal.

The error differs from sample to sample and is smaller than one quantization step.

The error is observed as quantization noise in the spectrum of the digital signal.

The quantization noise is distributed evenly across frequency in the digital signal spectrum.

The energy of the quantization noise is constant over the total band. Thus, widening the total band of the analog signal after the DAC (increasing the sampling rate) decreases the noise level in the audible range [0 ... 20 kHz], because the audible range has a fixed width and holds a smaller share of the noise energy.

Quantization noise depends on the band of the analog signal

Of course, a real DAC has electrical noise at its output. As a rule, its level is about -117 ... -120 dB.

Quantization noise change for bit depth and sample rate

Parameter change | Quantization noise change | Domain
[sample rate] × 2 | about -3 dB in the audible band | analog domain
[bit depth] + 1 bit | about -6 dB | digital and analog domains
[Fourier transform length] × 2 | about -3 dB per spectrum part | digital domain

In the digital domain, the displayed quantization noise level decreases by about 3 dB when the Fourier transform length is doubled, because the same noise energy is shared among twice as many parts of the spectrum.


Quantization noise level formula for bit depth M:

$N_Q = \frac{1}{2^{M}\,\sqrt{12}}$ [11],

where $\sqrt{12}$ is the square root of 12.
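The formula is easy to evaluate for common bit depths. The sketch below computes N_Q and expresses it in decibels; converting with 20·log10 is an assumption of this example (it treats N_Q as an amplitude ratio relative to full scale).

```python
import math

def quantization_noise(bit_depth):
    """N_Q = 1 / (2^M * sqrt(12)) for bit depth M."""
    return 1 / (2 ** bit_depth * math.sqrt(12))

def to_db(ratio):
    """Express an amplitude ratio in decibels."""
    return 20 * math.log10(ratio)

for bits in (16, 24):
    print(bits, "bit:", round(to_db(quantization_noise(bits)), 2), "dB")
```

For 16 bit the result is about -107 dB, and each additional bit lowers it by about 6 dB.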


In the digital domain, N_Q is the same regardless of sample rate. But the Fourier transform divides the digital band into parts (small sub-bands). The number of parts is [the transform length]/2 for real samples and [the transform length] for complex samples. I.e. the greater the length, the more parts there are and the less energy falls into each part.

The Fourier transform converts an oscillogram (time domain) to a spectrum (frequency domain).

In digital audio, the discrete Fourier transform is meant in most cases. "Discrete" means that the spectrum is divided into taps (frequency bins).

The Fourier transform length is the number of taps.

FFT (fast Fourier transform) is a fast way of computing the discrete Fourier transform. Its length is usually 2^K, where K is an integer.

Noise level and Fourier transform taps dependency


If there are 2 times more taps, the noise energy is redistributed, and each tap holds 2 times less energy.

Noise energy is the square of the part of the noise that falls into a tap.

If we draw the taps with the same width as before the redistribution (the tap width in part A of the picture), the noise energy shown per tap will be 2 times lower, because the total noise energy is constant. This is what happens on a computer display, where a tap has the same pixel width on the screen.


Read more below about bit depth, quantization noise and dynamic range for 16-bit implementations.



PCM digital-analog converter

A digital-to-analog converter (DAC) transforms samples to analog values.

PCM DAC demodulation consists of:

  • converting digital codes (samples) to voltage levels,
  • filtering out aliases via the output filter.

PCM DAC (digital-analog converter)



At first glance, a PCM DAC produces "stairs" at its output. But that is not so, because the "stairs" are smoothed by the analog filter at the digital-analog converter output.

This allows us to say that the analog signal in the band 0 ... [sampling rate]/2 is fully restored.

But that's not exactly true, because the analog filter isn't an ideal "brick wall".

The spectrum of the non-filtered "stairs" contains:

  • the useful musical signal at the bottom of the frequency axis;
  • aliases, which are copies of the musical signal. They are repeated along the frequency axis.

Half of the aliases are flipped horizontally. In an ideal audio system without non-linear distortions these aliases would be inaudible. However:


  1. in real musical systems, non-linear distortions can cause intermodulation, which generates audible products from inaudible components;
  2. aliases consume part of the dynamic range and reduce the dynamic range available to the useful musical signal.

Both of these issues are solved by filtering the "stairs".





PCM file formats (pcm audio codecs)

A codec is:

  • an encoder (encodes PCM to an audio format);
  • a decoder (decodes an audio format to PCM).

The table notes only the file abilities that the author knows of. If you have additional information to correct the description, contact us.

PCM file types

Type | Lossless | Lossy | Hi-Res capabilities | Multichannel | Metadata | Additional information
WAV | yes | yes | yes | yes | text metadata are supported (LIST chunk); non-standard artwork implementation (ID3) | Sony WAV64 and WAV RF64 formats allow big files (more than 2 GB)
FLAC | yes | FLAC file as a container | maximum resolution 32 bit 384 kHz | yes | supported | files can be larger than 2 GB; may be used as a container for other formats, including MQA
AIFF | yes | yes | yes | yes | text metadata are supported; non-standard artwork implementation (ID3) |
ALAC [4] | yes | | yes, maximum resolution 32 bit 384 kHz | up to 7.1 | supported |
CAF | yes | yes | yes | yes | may be recorded into the Free chunk; compatibility issues are probable | big files (more than 2 GB) are possible
mp3 [5] | | yes | no (32, 44.1, 48 kHz only) | stereo only | supported | no size limitations, consists of frames
MQA [3] | | yes | yes | | supported as in FLAC (when a FLAC container is used) | may be provided in a FLAC container; may be played back without decoding
AAC [6] | | yes | yes, up to 96 kHz | up to 5 channels | supported | designed as an mp3 replacement to improve perceived sound quality [1]
DTS [8] | yes | yes | up to 24 bit / 48 kHz (core) | 5.1 | depends on container |
Dolby TrueHD [12] | yes | | up to 24 bit / 192 kHz | up to 8 channels (24 bit / 96 kHz) | | Dolby technology
ac3 [9] | | yes | no (up to 48 kHz) | supported | depends on container |
WMA [10] | yes | yes | yes, up to 24 bit / 96 kHz | supported in WMA9 PRO | supported |

Comments to the table:

"bit" means the bit-depth value (example: 24 bit),

"kHz" means the sampling rate value (example: 96 kHz).

Sometimes files with the same extension may contain different formats. Examples: *.m4a files can contain either ALAC or AAC; FLAC files can contain FLAC, MQA or DoP.

Reading software (a player, converter, editor or other) parses the file. As a rule, a file consists of data blocks. These blocks have identifiers, and the reading software recognizes the block types. Sometimes the software checks data integrity. If the data are incorrect, the software may refuse to open the file (this depends on the implementation).
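As an illustration of block parsing, the sketch below walks through the chunks of a WAV (RIFF) file held in memory and lists their identifiers and sizes. It is a minimal example, not a full reader: the function name `list_riff_chunks` is made up, and the test file is generated with Python's standard `wave` module.

```python
import io
import struct
import wave

def list_riff_chunks(data):
    """Return [(chunk_id, size), ...] for RIFF/WAVE bytes, or raise ValueError."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")   # a reader may refuse the file here
    chunks = []
    pos = 12                                       # skip "RIFF", total size, "WAVE"
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        pos += 8 + size + (size & 1)               # chunks are padded to an even length
    return chunks

# Build a tiny mono 16-bit WAV in memory.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(struct.pack("<4h", 0, 1000, 0, -1000))

chunks = list_riff_chunks(buf.getvalue())
print(chunks)
```

The identifiers found this way ("fmt ", "data", and in other files "LIST" and others) tell the software which block holds the stream parameters, the samples and the metadata.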

Size-compressed file types are used to save hard disk space. This is especially relevant for portable devices: digital audio players (DAP), mobile phones, etc.

"Big" home audio systems have no disk space issues in many cases.

Portable devices are able to play back multichannel files. But, as a rule, they are listened to on stereo headphones, so multichannel records spend disk space on extra channels. The extra space issue may be solved by downmixing audio files to stereo.

Downmixing quality depends on the implementation.

Read below about PCM sound quality issues





Jitter

Jitter is instability of the periods between samples. It causes non-linear distortions/noise.

Jitter appears in both ADC and DAC. It is impossible to get rid of jitter completely in real music systems, because of electromagnetic interference, instability of clock generators and power line interference.

However, jitter may be minimized via special technical solutions.
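The effect of unstable sample periods can be estimated with a small simulation. The sketch below samples a sine at slightly wrong moments and measures the resulting error; the Gaussian timing model, the function name and the numbers are assumptions chosen only for illustration.

```python
import math
import random

def sampling_error_rms(freq_hz, sample_rate_hz, jitter_rms_s, n_samples, rng):
    """RMS difference between samples taken at ideal and at jittered moments."""
    total = 0.0
    for n in range(n_samples):
        ideal_t = n / sample_rate_hz
        actual_t = ideal_t + rng.gauss(0.0, jitter_rms_s)   # unstable sample moment
        ideal = math.sin(2 * math.pi * freq_hz * ideal_t)
        actual = math.sin(2 * math.pi * freq_hz * actual_t)
        total += (actual - ideal) ** 2
    return math.sqrt(total / n_samples)

for jitter_ns in (0.1, 1.0):
    err = sampling_error_rms(10000, 48000, jitter_ns * 1e-9, 2000, random.Random(1))
    print(jitter_ns, "ns jitter -> error", round(20 * math.log10(err), 1), "dB")
```

In this model the error grows in proportion to both the jitter and the signal frequency, so it depends on the music being played.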

Jitter causes non-linear distortions







Dither

Quantization error causes non-linear distortions. It correlates with the musical signal. Correlated distortions are considered especially harmful to perceived sound quality.

To decorrelate the distortions from the signal, dither is applied.

Dither audio (spectrum)



Dither is extremely low-level noise that is added to the musical signal before the ADC, or before bit depth truncation prior to the DAC.
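A minimal sketch of dither before bit depth reduction follows. It uses triangular (TPDF) noise of about one output step, which is one common choice and an assumption of this example; the text above does not prescribe a particular dither type. The function name is hypothetical and clipping at full scale is left out for brevity.

```python
import math
import random

def requantize_with_dither(sample, removed_bits, rng):
    """Reduce bit depth by removed_bits, adding triangular dither before rounding."""
    step = 1 << removed_bits                 # one output step, in input units
    # The sum of two uniform values gives triangular noise within +/- one output step.
    dither = rng.uniform(-step / 2, step / 2) + rng.uniform(-step / 2, step / 2)
    return math.floor((sample + dither) / step + 0.5)

rng = random.Random(0)
sample_24 = 1000                             # equals 3.90625 output steps
outputs = [requantize_with_dither(sample_24, 8, rng) for _ in range(20000)]
print(sorted(set(outputs)), sum(outputs) / len(outputs))
```

Each output is a whole number of steps, but the average stays close to 3.90625: the low-level information survives as noise instead of a signal-correlated error.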

Dither audio (scheme)



To reduce noise in the audible band, noise shaping may be applied. It looks like "pushing" the noise energy to the upper part of the frequency range. But shaping demands a reserve of band to "push" the noise into.



PCM size compression


Size compression of audio content is a way to save hard disk space or increase the throughput of a communication line. Compression is performed by encoder and decoder software.

There are 2 types of compression: lossless and lossy.

If audio data content isn't compressed, it is always lossless.

PCM lossless and lossy




Lossless compression

Lossless compression is size compression where the input and output binary audio data content are identical.

Lossless PCM formats: WAV, FLAC, AIFF, ALAC, APE, ...

Lossless formats have the same sound quality. There is an opinion that they may sound different, and some objective hypotheses exist too. But the author knows of no research that confirms it.
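The defining property of lossless compression, identical data after decoding, can be demonstrated with any general-purpose lossless compressor. The sketch below uses Python's `zlib` purely as a stand-in; it is not an audio codec such as FLAC, and the sample values are invented.

```python
import struct
import zlib

# A short repeating 16-bit PCM fragment (made-up sample values).
samples = [0, 1200, 2400, 1200, 0, -1200, -2400, -1200] * 100
pcm = struct.pack("<%dh" % len(samples), *samples)

packed = zlib.compress(pcm)          # "encoder"
restored = zlib.decompress(packed)   # "decoder"

print(len(pcm), "bytes ->", len(packed), "bytes")
print("identical after decoding:", restored == pcm)
```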

DSD may also be packed into PCM as the DoP audio format.



Lossy compression

Lossy compression is size compression where the input and output binary audio data content aren't identical.

Lossy PCM formats: mp3, AAC, DTS, MQA [3], ogg, ...

Different lossy formats have different sound losses.

Lossless and lossy formats can be compared technically.

Different lossy formats aim for minimal losses by psychoacoustic criteria, and their compression methods are based on various hypotheses.

As an example, the AAC format was developed to improve on mp3 sound quality according to newer knowledge about how the brain processes sonic information [1].



PCM vs Bitstream


"Bitstream" is an unofficial name for size-compressed lossy/lossless coded streams (PCM, Dolby, DTS, etc.), i.e. streams that have a stream volume (bit/sec) as the measure of compression. As an example, PCM vs Dolby Digital is one case of PCM vs bitstream. From this point of view, mp3 and FLAC are "bitstream" too.

As a rule, a higher stream volume for a single codec gives better sound quality. On the other hand, a higher bitrate may leave room for fewer channels in the fixed bandwidth of a digital interface. As an example, stereo instead of multichannel.

AV users ask whether to use PCM or bitstream to transmit data from a player to the audio-video receiver of a home theater.

If your player and AV receiver are capable of PCM (including multichannel, if it is needed), then use PCM. Otherwise, use bitstream codecs.

PCM vs Bitstream


As a rule, bitstream is recommended for the S/PDIF interface. HDMI (in its latest versions) can transmit multichannel PCM (LPCM).



PCM vs Dolby


Dolby is size-compressed PCM. It is used to transmit an audio signal through digital audio interfaces with lower speed. As an example, multichannel audio through S/PDIF.

If the compression is lossless, it does not matter whether Dolby or original PCM is used. Lossy compression causes some quality losses.
Generally, it is impossible to say in advance whether the losses will be audible or not.





PCM vs DSD

It is impossible to compare PCM and DSD from a purely technical point of view, because different hardware is used for each.

Sound quality doesn't depend on the format itself. It depends on the implementation of the format.

DSD demands re-modulation after editing. PCM doesn't.

A DSD DAC is simpler than a PCM digital-analog converter.





WAV vs FLAC

The audio data decoded from WAV and FLAC are binary identical. Both provide the same sound quality.





LPCM

LPCM (Linear Pulse Code Modulated Audio) is PCM with regular intervals between the quantization levels of the analog voltage. It is the common kind of PCM in audio. [2]

LPCM (Linear Pulse Code Modulated Audio)



Sound quality


Sound quality means the level of distortion. However, distortions may be distributed differently by frequency and phase, and they must be estimated in the light of psychoacoustics.

PCM sound quality


What sample rate is enough

Aliases (distortion) appear during analog-to-digital and digital-to-analog conversion. The sample rate defines where the aliases lie on the frequency axis: the boundary is half of the sampling rate. All audio content above that boundary should be removed to avoid distortion of the useful musical signal.

An analog filter does the removal. However, an analog filter isn't steep. So the higher [sample rate]/2 is, the deeper the suppression.

A higher sampling rate helps to implement an ADC and DAC with less distortion.


What bit depth is enough

Bit depth defines the minimum noise level in a record. The bit depth should provide a noise level below the noise floor of the electrical circuits of the ADC and DAC.
If the recorded musical material will be digitally processed (gain increase, equalization, level normalization, etc.), the noise floor of the processed material should stay below the DAC noise level.

In audio software, processing may be implemented in 32- or 64-bit floating-point formats. These formats have higher precision (lower quantization noise) and better overload tolerance than integer ones.

As far as the author knows, DACs can't receive data in floating-point formats. Playback software rounds these formats to integer before sending them to the DAC.

In principle, a DAC with a sigma-delta modulator could receive floating-point formats, but the author knows of no such real implementations.

So the need for a noise floor reserve has to be considered when choosing the bit depth value.



What about 16 bit 44100 Hz? Myth reasons

The author considers two statements to be myths:

  • 16 bit 44.1 kHz is enough because we can't hear ultrasound,
  • High resolution (above 16 bit 44100 Hz) gives more sound details.

Below the author tries to explain why he thinks so.


16 bit  44100 Hz vs High resolution


The Nyquist theorem is well known. We also know, and can easily check in practice, the 20 000 Hz limit of hearing.

This gives a base to the myth that 44100 Hz is the maximum reasonable sample rate: it is 2 times 20 kHz plus some range for the transition band of the ADC and DAC filters. And there is an opinion that higher sampling rates are aimed at playback of ultrasound, which we can't hear.

The Nyquist theorem does indeed say that an analog sine may be coded to digital PCM and restored back to analog without losses.

But this is an ideal concept that requires infinite recording and playback time and an ideal brick-wall filter.

We have no infinite time.

We have no actual brick-wall filter. We have filters with some transition band.

A narrow transition band is difficult for an analog filter. The steeper a digital filter, the more intense its ringing distortions. There may also be technical resource limitations on building a steep enough filter.

Inside a DAC, upsampling with a digital filter is used to make the filtering work properly. But the hardware may have limited calculation resources for implementing a sophisticated filter.

So high sample rates are used not for ultrasound playback, but for the ADC and DAC filters.


We know that humans hear sound in a range of 0 ... 120 dB at most. There is an opinion that the dynamic range of 16 bit is 16 * 6 dB = 96 dB.

But -96 dB is the noise floor (actually it is about -100 ... -110 dB).

So a minimal signal at -96 dB has "zero sound quality", i.e. the noise covers the audio signal.

To keep sound quality, the signal must be above the noise. We can take a noise level about 40 dB below the signal as acceptable.

So the actual dynamic range in which sound quality is kept is 96 - 40 = 56 dB.
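The same arithmetic can be repeated for other bit depths. The sketch below uses the rounded figure of 6 dB per bit and the 40 dB margin from the text; the function name and parameter names are made up for this example.

```python
def usable_dynamic_range_db(bit_depth, required_margin_db, db_per_bit=6.0):
    """Dynamic range left after reserving a margin between the signal and the noise floor."""
    return bit_depth * db_per_bit - required_margin_db

for bits in (16, 24):
    print(bits, "bit:", usable_dynamic_range_db(bits, 40), "dB")
```

With the same 40 dB margin, 24 bit leaves 104 dB, which shows how a higher bit depth gives more room for low-level signals.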


If digital signal processing is applied, the total gain may be increased and the noise will be boosted.


The low-level quality issue can be solved via a higher bit depth.


Audio data integrity

Digital audio data may be corrupted in transmission or storage. This can be checked via checksum comparison.

A checksum is a number calculated from a binary audio data array. Different data almost always give different checksums.


Checksum A is calculated for the correct music data array.

Before playback, checksum B of the actual data array is calculated.

If checksums A and B are different, we can conclude that the actual data are damaged.
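The comparison of checksums A and B can be sketched as follows. SHA-256 from Python's `hashlib` is used here only as an example of a checksum function; real audio formats may use other algorithms (MD5, CRC and others), and the data array is invented.

```python
import hashlib

def checksum(data):
    """Checksum of a binary data array, as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

original = bytes(range(256)) * 4          # stands in for correct audio data
checksum_a = checksum(original)           # calculated when the data are known to be good

damaged = bytearray(original)
damaged[10] ^= 0x01                       # flip a single bit, as a storage error might

for name, data in (("intact copy", original), ("damaged copy", bytes(damaged))):
    checksum_b = checksum(data)           # calculated before playback
    print(name, "is OK" if checksum_b == checksum_a else "is damaged")
```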

As a rule, checksums are used for compressed data. But they may be applied to any audio file content.



PCM software


This software processes and/or plays back PCM audio files and streams.


PCM audio players

  • Audirvana
  • foobar2000
  • JRiver
  • VLC

Audiophile players are capable of bit-perfect playback of audio files: the audio file content is sent to the DAC without alteration.



PCM audio converters

The main demand on a converter is minimal distortion in the audible band.

A CD ripper is a kind of audio converter that is capable of copying CD audio data to a file.


PCM editors and DAWs

  • Audacity
  • Sound Forge
  • Wavelab





Conclusions

  1. PCM is a way to code an analog signal in digital form.
  2. Its bit depth depends on the desired noise level. The bit depth is defined by each application.
  3. Sample rate is defined primarily by the analog filter. Digital filtering is used to work better with the analog filter.
  4. PCM editing and processing are easier than for DSD.
  5. PCM may be compressed losslessly or lossily.
  6. 44.1 kHz 16 bit may be enough in some conditions. But that is rather a matter of implementation.


Author: Yuri Korzunov,
Audiophile Inventory's developer




References

  1. The MP3 Is Officially Dead, According To Its Creators
  2. Linear Pulse Code Modulated Audio (LPCM)
  3. Is MQA DOA?
  4. ALAC specification
  5. mp3 specification
  6. AAC specification
  7. CAF specification
  8. DTS specification
  9. AC3 specification
  10. WMA specification
  11. About quantization noise
  12. Dolby TrueHD specification



Copyright © Yuri Korzunov
["Audiophile Inventory" (since 2011), "AudiVentory", "Audiophile Inventory by".
Also "Audiophile Inventory by" (formerly "Audiophile Inventory") is a protected trade designation],
2010-2019. All Rights Reserved.