
PCM Audio: Quick Explanation and Myth Reasons


PCM (pulse-code modulation) audio is the coding of an analog signal in digital form (as plain numbers). Most audio files (WAV, FLAC, mp3, AIFF, ALAC, AAC and others) carry sound that is stored as PCM or decoded to PCM. Size-compressed PCM is sometimes called a bitstream. Read below why.

Pulse-code modulation has been discussed many times, but the author hopes that you will learn something new here. Read below, in simple words, how PCM works, what defines its sound quality, how it compares with alternative formats, where the myths come from, and more.

How PCM works

 

PCM analog-digital converter

An analog-digital converter (ADC) is a device that periodically measures the voltage of an analog signal and stores the measured values as numbers in digital form.

Analog and digital form of signal


 

The period between measurements is constant.

A sample is the digital value of one measurement.

Quantization is the measurement of the analog signal voltage in discrete level steps.
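The three definitions above can be tried in a few lines. The sketch below is only an illustration, not a model of a real ADC: the function name, the sine test signal and the voltage range of -1 ... 1 are assumptions, and bit depth is explained later in this article.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, count):
    """Measure a sine wave at equal time intervals and store integer codes."""
    max_code = 2 ** (bit_depth - 1) - 1          # largest positive code
    samples = []
    for n in range(count):
        t = n / sample_rate_hz                   # equal period between measurements
        voltage = math.sin(2 * math.pi * freq_hz * t)   # "analog" value in -1..1
        samples.append(round(voltage * max_code))       # quantization to a level
    return samples

# 1 kHz sine, 8 kHz sample rate, 16-bit words, first 8 samples
codes = sample_and_quantize(1000, 8000, 16, 8)
print(codes)
```

Each number in the printed list is one sample.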

Level quantization


 

Samples may be stored and transmitted without altering the information. This is the main advantage of digital signals compared with analog ones.

However, binary data may still be damaged during storage and/or transmission.

 

 

Sample rate

Sample rate is the number of samples per second (measured in Hz, hertz).

The sampling rate is constant within a PCM-coded audio stream.

As a rule, the analog signal is coded as real numbers (in the mathematical sense), i.e. the ordinary numbers that we use all the time.

 

Sample rate

 

Nyquist theorem

The Nyquist theorem defines the theoretical minimum sample rate.

Note the word "theoretical". Real implementations have to account for other factors too. Read below about the myths, where we discuss why higher sample rates are used.

In simple words (this is not the exact mathematical definition), the Nyquist–Shannon sampling theorem may be stated as:

A coded digital audio signal has a total band of [sample rate]/2.


We consider the theorem in more detail below, where the sound quality of 44.1 kHz / 16 bit is discussed.
 

A more exact wording of the theorem in sound terms:
An endless analog sine signal may be coded (to digital form) and restored if the sampling rate is more than 2 times the signal's frequency.

The keyword here is "endless". Real musical signal components are finite.

More samples per finite signal duration keep more information about the source signal for restoring it from digital to analog form.
The more samples per duration, the closer we are to infinity.
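The simple wording of the theorem is plain arithmetic. A minimal sketch follows; the function names are chosen here only for illustration.

```python
def nyquist_band_hz(sample_rate_hz):
    """Total band that a PCM stream can carry: [sample rate]/2."""
    return sample_rate_hz / 2

def min_sample_rate_hz(max_signal_freq_hz):
    """Theoretical lower bound for the sample rate: 2 x highest frequency."""
    return 2 * max_signal_freq_hz

for rate in (44100, 48000, 96000, 192000):
    print(rate, "Hz ->", nyquist_band_hz(rate), "Hz band")

print("20 kHz needs more than", min_sample_rate_hz(20000), "Hz")
```

These are theoretical bounds only. They do not account for the finite signal duration or for the filters discussed below.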

 

Nyquist theorem (about minimal sample rate)


 

Alternatively, the input samples may be processed via the Hilbert transform. It converts real numbers to complex ones.

A complex number contains real and imaginary parts.

In this case the coded digital audio signal has a total band of [sampling rate].

However, this makes little sense in audio, because the digital data size for a given band stays the same: each complex sample holds two numbers.

 

PCM audio ADC (analog-digital converter) scheme


 

An analog-digital converter captures the full frequency band at its input. But everything above [sample rate]/2 is folded into the band {0 ... [sample rate]/2}. This adds noise to the coded digital signal.
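The folding described above can be calculated for a single tone. The sketch below assumes an ideal sampler without an input filter; the function name is hypothetical.

```python
def folded_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling without an input filter."""
    nyquist = sample_rate_hz / 2
    f = tone_hz % sample_rate_hz          # the spectrum repeats every sample rate
    if f > nyquist:
        f = sample_rate_hz - f            # the upper half is mirrored downwards
    return f

# A 30 kHz tone sampled at 44.1 kHz lands inside the audible band
print(folded_frequency(30000, 44100))
```

A tone below [sample rate]/2 is returned unchanged. A tone above it lands somewhere inside the band {0 ... [sample rate]/2}.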

Analog-digital conversion without input filter: folded spectrum


 

Thus the analog signal band above [sample rate]/2 should be cut off.

But an analog filter isn't steep enough. The issue is solved via a higher sampling rate: the higher the sample rate, the lower the filter gain at the [sample rate]/2 point (less signal at the filter output).

To increase the steepness, a digital filter is used. But it adds distortions of its own.
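The effect of the sample rate on analog filter suppression can be estimated with a textbook filter model. The sketch below assumes a Butterworth low-pass filter with a 20 kHz cutoff and order 4. Both values are picked only for illustration and do not describe any real converter.

```python
import math

def butterworth_gain_db(freq_hz, cutoff_hz, order):
    """Gain of an ideal Butterworth low-pass filter at the given frequency."""
    magnitude = 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** (2 * order))
    return 20.0 * math.log10(magnitude)

# Filter gain at the [sample rate]/2 point for several sample rates
for rate in (44100, 96000, 192000):
    gain = butterworth_gain_db(rate / 2, 20000, 4)
    print(rate, "Hz:", round(gain, 1), "dB at", rate / 2, "Hz")
```

The same filter gives deeper suppression at [sample rate]/2 when the sample rate is higher.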

PCM analog to digital conversion: steep vs non-steep filter


 

Also, in a DAC the sampling rate may be increased (oversampling) to work better with the analog filter. Oversampling works in pair with a digital filter.

There is a myth that non-multiple resampling causes more distortion than multiple resampling. But for 48000 and 44100 Hz, resampling is applied in the same way.
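A non-multiple conversion such as 44100 to 48000 Hz is still a ratio of two integers. The sketch below only computes that ratio. The common rational-resampling scheme (insert zeros, filter, drop samples) is mentioned in the comments as an assumption about a typical implementation, not as a description of a specific converter.

```python
from fractions import Fraction

def resampling_ratio(source_rate_hz, target_rate_hz):
    """Exact output/input rate ratio reduced to the smallest integers."""
    return Fraction(target_rate_hz, source_rate_hz)

ratio = resampling_ratio(44100, 48000)
# A typical scheme upsamples by the numerator, applies a digital filter,
# then keeps every denominator-th sample.
print("upsample by", ratio.numerator, "then decimate by", ratio.denominator)

multiple = resampling_ratio(44100, 88200)   # a "multiple" case for comparison
print("upsample by", multiple.numerator, "then decimate by", multiple.denominator)
```

Both cases pass through the same steps of upsampling and digital filtering; only the integers differ.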


 

Bit depth

Bit depth is the number of bits in the code word that stores the analog signal value.

Bit depth defines the number of digital levels that can be stored.

 

Bit depth

 

The maximal value of the word is the maximal positive value of the analog signal at the ADC input. Its code is:

$V_{max} = 2^{N-1} - 1$,

where N is the number of bits in the word.

The minimal value of the word is the maximal negative value of the analog signal at the ADC input. Its code is:

$V_{min} = -2^{N-1}$.

The total number of measured levels is:

$L = 2^{N}$.
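The three formulas above are easy to check for common bit depths. In this small sketch the function name is hypothetical.

```python
def code_range(bit_depth):
    """Return (Vmin, Vmax, L) for a signed integer word of N bits."""
    v_max = 2 ** (bit_depth - 1) - 1
    v_min = -(2 ** (bit_depth - 1))
    levels = 2 ** bit_depth
    return v_min, v_max, levels

for n in (8, 16, 24):
    print(n, "bit:", code_range(n))
```

For 16 bit the codes run from -32768 to 32767, which gives 65536 levels.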

Pulse code modulation: bit depth

Bit depth

 

Bit depth truncation reduces the bit depth by removing one or more of the lowest bits.

Rounding reduces the bit depth by removing one or more bits and adjusting the remaining number according to the removed bit(s).

Rounding may be applied when a floating-point format is converted to an integer one.

Rounding is mathematically more exact than truncation.
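The difference between truncation and rounding shows up when a 24-bit sample is reduced to 16 bits. The sketch below uses plain integer arithmetic; the function names and the clamp at the top of the range are illustrative assumptions.

```python
def truncate_bits(sample, drop):
    """Remove the lowest `drop` bits (the removed bits are simply discarded)."""
    return sample >> drop

def round_bits(sample, drop, out_bits):
    """Remove the lowest `drop` bits, adjusting the result by the removed bits."""
    rounded = (sample + (1 << (drop - 1))) >> drop   # add half a step, then shift
    max_code = (1 << (out_bits - 1)) - 1
    return min(rounded, max_code)                    # avoid overflow at full scale

sample_24 = 0x0001FF            # 511: just below two 16-bit steps (2 x 256)
print(truncate_bits(sample_24, 8), round_bits(sample_24, 8, 16))
```

Truncation returns 1 and rounding returns 2 here. The true value is 511/256, i.e. about 1.996, so the rounded code is closer.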

 

Quantization noise

The codes of analog values stored in the words have limited precision. The limit is defined by the total number of measured levels L. So the stored codes (samples) are not exactly equal to the real analog voltage.

Quantization error is the difference between a sample (digital value) and the real voltage of the analog signal.

The error differs from sample to sample and is smaller than one quantization step.

The error is observed as quantization noise in the digital signal spectrum.

The quantization noise is evenly distributed across frequency in the digital signal spectrum.

The energy of the quantization noise is constant over the total band. Thus, widening the total band of the analog signal after the DAC (increasing the sampling rate) decreases the noise level in the audible range [0 ... 20 kHz], because the audible range has a fixed width.

Quantization noise depends on the band of the analog signal

Of course, a real DAC has electrical noise at its output. As a rule, its level is about -117 ... -120 dB.

Quantization noise change for bit depth and sample rate

Parameter change | Quantization noise change | Domain
[sample rate] x 2 | -3 dB (in the audible band) | in analog domain
[bit depth] + 1 bit | -6 dB | in digital and analog domain
[Fourier transform length] x 2 | -3 dB (per tap) | in digital domain

In the digital domain, the quantization noise level per tap decreases by about 3 dB when the Fourier transform length is doubled, because the same noise energy is split between twice as many taps.

 

Quantization noise level formula for bit depth M:

$N_Q = \frac{1}{2^{M} \times \sqrt{12}}$ [11].
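The formula can be converted to decibels to compare bit depths. The sketch below evaluates the expression exactly as written above; the conversion via 20·log10 and the function name are assumptions made for illustration.

```python
import math

def quantization_noise_db(bit_depth):
    """Quantization noise level NQ = 1 / (2^M * sqrt(12)), expressed in dB."""
    nq = 1.0 / (2 ** bit_depth * math.sqrt(12))
    return 20.0 * math.log10(nq)

for m in (16, 20, 24):
    print(m, "bit:", round(quantization_noise_db(m), 1), "dB")
```

For 16 bit the result is about -107 dB, and every extra bit lowers it by about 6 dB.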

 

In the digital domain, the quantization noise level is the same regardless of the sample rate. But the Fourier transform divides the digital band into parts according to the transform length: the greater the length, the more parts, and the less energy in each part.

The Fourier transform converts an oscillogram (time domain) to a spectrum (frequency domain).

In digital audio we mean the discrete Fourier transform in most cases. "Discrete" means that the spectrum is divided into taps.

The Fourier transform length is the number of taps.

FFT (fast Fourier transform) is a fast way to calculate the discrete Fourier transform. Its length is usually $2^{K}$, where K is an integer.

Noise level and Fourier transform taps dependency


If there are 2 times more taps, the noise energy is redistributed, and each tap has 2 times less energy.

Noise energy is the area of the part of the noise that falls into a tap.

If we draw the taps with the same width as before the redistribution (the tap width in part A of the picture), the noise level looks lower, because the total area of the noise is constant. This happens on a computer display, when a tap keeps the same pixel width on the screen.

 

Read more below about bit depth, quantization noise and dynamic range for 16-bit implementations.

 

 

PCM digital-analog converter

A digital-to-analog converter transforms samples into analog values.

PCM DAC demodulation:

  • converts digital codes (samples) to voltage levels,
  • filters out aliases via the output filter.

PCM DAC (digital-analog converter)


 

At first glance, a PCM DAC produces "stairs" at its output. But this is not so, because the "stairs" are smoothed by the analog filter at the output of the digital-analog converter.

This allows us to say that the analog signal in the band 0 ... [sampling rate]/2 is fully restored.

But that's not exactly true, because the analog filter isn't an ideal "brick wall".

The non-filtered "stair" spectrum contains:

  • the useful musical signal at the bottom of the frequency axis;
  • aliases, i.e. copies of the musical signal, repeated along the frequency axis.

Half of the aliases are flipped horizontally. In an ideal audio system without non-linear distortions, these aliases would be inaudible.

But:

  1. in real musical systems, non-linear distortions can cause intermodulation, which generates audible products from inaudible components;
  2. aliases consume part of the dynamic range and leave less of it for the useful musical signal.

Both of these issues are solved via "stair" filtering.
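The positions of the aliases of a single tone can be listed explicitly. The sketch below assumes an ideal "stair" output without any filter; the function name is hypothetical.

```python
def image_frequencies(tone_hz, sample_rate_hz, count):
    """First `count` alias frequencies of a tone at an unfiltered DAC output."""
    images = []
    k = 1
    while len(images) < count:
        images.append(k * sample_rate_hz - tone_hz)   # flipped copy
        images.append(k * sample_rate_hz + tone_hz)   # non-flipped copy
        k += 1
    return images[:count]

# Aliases of a 1 kHz tone played back at 44.1 kHz
print(image_frequencies(1000, 44100, 4))
```

All listed frequencies lie above the audible range. They need to be filtered because of the two issues named in the list above.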

 


 

 

File formats

The table lists only the file capabilities that the author knows of. If you have additional information to correct the descriptions, contact us.

PCM file types

Type | Lossless | Lossy | Hi-Res capabilities | Multichannel | Metadata | Additional information
WAV | yes | yes | yes | yes | text metadata are supported (LIST chunk); non-standard artwork implementation (ID3) | Sony WAV64 and WAV RF64 formats support big files (more than 2 GB)
FLAC | yes | FLAC file as container | maximal resolution 32 bit / 384 kHz | yes | supported | supports sizes above 2 GB; may be used as a container for other formats, including MQA
AIFF | yes | yes | yes | yes | text metadata are supported; non-standard artwork implementation (ID3) | -
ALAC [4] | yes | - | yes, maximal resolution 32 bit / 384 kHz | up to 7.1 | supported | -
CAF | yes | yes | yes | yes | may be recorded into the Free chunk; compatibility issues are probable | supports big files (more than 2 GB)
mp3 [5] | - | yes | no (32, 44.1, 48 kHz only) | stereo only | supported | no size limitations; consists of frames
MQA [3] | - | yes | yes | - | supported as in FLAC (when a FLAC container is used) | may be provided in a FLAC container; may be played back without decoding
AAC [6] | - | yes | yes, up to 96 kHz | up to 5 channels | supported | designed as an mp3 replacement to improve perceived sound quality [1]
DTS [8] | yes | yes | up to 24 bit / 48 kHz (core) | 5.1 | depends on container | -
Dolby TrueHD [12] | yes | - | up to 24 bit / 192 kHz | up to 8 channels (24 bit / 96 kHz) | - | -
ac3 [9] | - | yes | no (up to 48 kHz) | supported | depends on container | -
WMA [10] | yes | yes | yes, up to 24 bit / 96 kHz | supported in WMA9 PRO | supported | -

Comments on the table:

"bit" means the bit depth value (example: 24 bit),

"kHz" means the sampling rate value (example: 96 kHz).

Sometimes files with the same extension may contain different formats. Examples: *.m4a files can contain either ALAC or AAC; FLAC files can contain FLAC, MQA or DoP.

Reading software (a player, converter, editor or other) parses the file. As a rule, a file consists of data blocks. These blocks have identifiers, and the reading software recognizes the block types. Sometimes the software checks data integrity. If the data are incorrect, the software may refuse to open the file (depending on the implementation).

Size-compressed file types are used to save hard disk space. This is especially relevant for portable devices: digital audio players (DAP), mobile phones, etc.

"Big" home audio systems have no disk space issues in many cases.

Portable devices are able to play back multichannel files. But, as a rule, these are listened to on stereo headphones, so multichannel records spend disk space on extra channels. The extra space issue may be solved by downmixing the audio files to stereo.

Downmixing quality depends on the implementation.

Read below about PCM sound quality issues

 

 

Jitter

 

Jitter is instability of the sample periods. It causes non-linear distortions/noise.

Jitter appears in ADCs and DACs. It is impossible to get rid of jitter in real music systems, because of electromagnetic interference, instability of clock generators and power line interference.

However, jitter may be minimized by special engineering solutions.

Jitter causes non-linear distortions

 


 

 

Dither

 

Quantization error causes non-linear distortions that correlate with the musical signal. Correlated distortions are considered especially harmful to perceived sound quality.

To decorrelate the distortions from the signal, dither is applied.

Dither audio (spectrum)


 

Dither is extremely low-level noise that is added to the musical signal before the ADC, or before bit depth truncation prior to the DAC.
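Here is a minimal sketch of dither applied before bit depth reduction from 24 to 16 bit. It assumes triangular (TPDF) dither of ±1 step of the 16-bit word, which is one common choice and not the only one. The function name is hypothetical.

```python
import random

def requantize_with_dither(sample_24bit, rng):
    """Reduce a 24-bit sample to 16 bit, adding TPDF dither before rounding."""
    lsb = 1 << 8                      # one 16-bit step expressed in 24-bit units
    # The sum of two uniform noises has a triangular distribution (+/- 1 step)
    dither = (rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)) * lsb
    value = round((sample_24bit + dither) / lsb)
    return max(-32768, min(32767, value))

rng = random.Random(0)                # fixed seed so that runs are repeatable
# A constant level of half a 16-bit step: without dither it rounds to one code
outputs = [requantize_with_dither(128, rng) for _ in range(10000)]
print(sum(outputs) / len(outputs))    # the average stays close to 0.5 step
```

The error of each single sample becomes random noise instead of a pattern tied to the signal, and the average keeps the level below one step.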

Dither audio (scheme)


 

To reduce noise in the audible band, noise shaping may be applied. It looks like "pushing" the noise energy to the upper part of the frequency range. But the shaping requires reserve band for the "pushing".


 

 

PCM size compression

 

Size compression of audio content is a way to save space on a hard disk or to increase throughput in a communication line.

There are 2 types of compression: lossless and lossy.

If audio data isn't compressed, it is always lossless.

PCM lossless and lossy


 

Lossless

Lossless compression is size compression in which the input and output binary audio data are identical.
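The definition of lossless compression can be checked directly: pack, unpack, compare. The sketch below uses the general-purpose zlib compressor only as a stand-in. Real audio formats such as FLAC use their own methods, and the sample values are made up.

```python
import struct
import zlib

samples = [0, 1200, -1200, 32767, -32768, 5]             # made-up 16-bit samples
pcm = struct.pack("<%dh" % len(samples), *samples)       # little-endian PCM bytes

packed = zlib.compress(pcm)          # size compression (stand-in for a codec)
restored = zlib.decompress(packed)   # decoding back to PCM

is_lossless = restored == pcm        # input and output must be identical
print(is_lossless)
```

For a lossy codec the same comparison of decoded and source data would fail.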

Lossless PCM formats: WAV, FLAC, AIFF, ALAC, APE, ...

Lossless formats have the same sound quality. There is an opinion that they may sound different. Some objective hypotheses exist too. But the author knows of no research on this.

Also, DSD may be packed into PCM as the DoP audio format.

 

Lossy

Lossy compression is size compression in which the input and output binary audio data aren't identical.

Lossy PCM formats: mp3, AAC, DTS, MQA[3], ogg, ...

Different lossy formats lose different parts of the sound.

We can compare lossless and lossy formats technically.

Different lossy formats try to minimize losses according to psychoacoustic criteria, and these compression methods are based on various hypotheses.

As an example, the AAC format was developed to improve on mp3 sound quality using newer knowledge about how the brain processes sonic information [1].

 

 

PCM vs Bitstream



"Bitstream" is an unofficial name for size-compressed lossy/lossless coded streams (PCM, Dolby, DTS, etc.), i.e. streams that have a bitrate (bit/s) as the measure of compression. As an example, PCM vs Dolby Digital is one case of PCM vs Bitstream. From this point of view, mp3 and FLAC are "bitstream" too.

As a rule, a higher bitrate for a given codec gives better sound quality. But, on the other hand, a higher bitrate may lead to fewer channels within the fixed bandwidth of a digital interface. As an example, stereo instead of multichannel.
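For non-compressed PCM the bitrate follows directly from sample rate, bit depth and channel number. The sketch below shows the trade-off between bitrate and channel count. The interface bandwidth in the example is a made-up value, not a parameter of a real interface.

```python
def pcm_bitrate(sample_rate_hz, bit_depth, channels):
    """Bitrate of non-compressed PCM in bit/s."""
    return sample_rate_hz * bit_depth * channels

def max_channels(link_bitrate, sample_rate_hz, bit_depth):
    """How many PCM channels fit into a fixed interface bandwidth."""
    return link_bitrate // (sample_rate_hz * bit_depth)

print(pcm_bitrate(44100, 16, 2))               # CD-quality stereo

link = pcm_bitrate(48000, 24, 6)               # hypothetical link: 6 ch, 24 bit / 48 kHz
print(max_channels(link, 96000, 24))           # channels left at 24 bit / 96 kHz
```

Doubling the sample rate on the same link halves the number of channels that fit.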

AV users ask which to use, PCM or bitstream, to transmit data from a player to the audio-video receiver of a home theater.

If your player and AV receiver both support PCM (including multichannel, if it is needed), then use PCM. Otherwise, use bitstream codecs.

PCM vs Bitstream


As a rule, bitstream is recommended for the S/PDIF interface. HDMI (latest versions) can transmit multichannel PCM (LPCM).

 

 

PCM vs DSD

 

It is impossible to compare PCM and DSD from a technical point of view, because they use different hardware.

Sound quality doesn't depend on the format. It depends on the implementation of the format.

DSD requires re-modulation after editing. PCM doesn't.

A DSD DAC is simpler than a PCM digital-analog converter.

 

 

WAV vs FLAC

 

The audio content of WAV and FLAC files is binary identical after decoding. Both provide the same sound quality.


 

 

LPCM

 

LPCM (Linear Pulse Code Modulated Audio) is PCM with equal intervals between the quantization levels of the analog voltage. It is the common form of PCM in audio. [2]

LPCM (Linear Pulse Code Modulated Audio)


 

Sound quality

 

Sound quality means the distortion level. However, distortions may be distributed differently over frequency and phase, and they must be estimated in the light of psychoacoustics.

PCM sound quality

 

What sample rate is enough

Aliases (distortion) appear during analog-to-digital and digital-to-analog conversion. The sample rate defines the alias period on the frequency axis: the copies follow each other every half of the sampling rate. All audio content above [sample rate]/2 should be removed to avoid distortion of the useful musical signal.

The analog filter does the removing. However, an analog filter isn't steep. So, the higher the [sample rate]/2, the deeper the suppression.

A higher sampling rate helps to implement an ADC and DAC with less distortion.

 

What bit depth is enough

Bit depth defines the minimal noise level in a record. The bit depth should provide a noise level below the noise floor of the electrical circuits of the ADC and DAC.
If the recorded musical material will be digitally processed (gain increase, equalization, level normalization, other), the noise floor of the processed material should stay below the DAC noise level.

In audio software, processing may be implemented in 32- or 64-bit floating-point formats. These formats have high precision (low quantization noise) and better overload tolerance than integer ones.

As far as the author knows, a DAC can't receive data in floating-point formats. These formats are rounded to integer in the playback software before being sent to the DAC.
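Rounding of a floating-point sample to a 16-bit integer might look like this in playback software. The scale factor 32767 and the clipping rule are assumptions; real software may use other conventions and may add dither.

```python
def float_to_int16(sample):
    """Convert a float sample in the range -1.0 ... 1.0 to a 16-bit integer code."""
    clipped = max(-1.0, min(1.0, sample))     # overload above full scale is clipped
    return round(clipped * 32767)             # rounding to the nearest integer code

for value in (0.0, 0.5, 1.0, -1.0, 2.5):
    print(value, "->", float_to_int16(value))
```

A float value above 1.0 is valid during processing but is clipped at this last step, so gain has to be set before the conversion.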

In principle, DACs with a sigma-delta modulator could accept floating-point formats. But the author knows of no such real implementations.

So, the need for a noise floor reserve has to be considered when the bit depth is chosen.

 

 

What about 16 bit 44100 Hz? Myth reasons

The author considers two statements to be myths:

  • 16 bit 44.1 kHz is enough because we can't hear ultrasound,
  • High resolution (above 16 bit 44100 Hz) gives more sound details.

Below the author tries to explain why he thinks so.

 

16 bit  44100 Hz vs High resolution

 

The Nyquist theorem is well known. We also know, and can easily check in practice, the 20 000 Hz limit of hearing.

This gives rise to the myth that 44100 Hz is the highest reasonable sample rate, because it is 2 times 20 kHz plus a margin for the transition band of the ADC and DAC filters. There is also an opinion that higher sampling rates are aimed at playback of ultrasound, which we can't hear.

The Nyquist theorem does say that an analog sine may be coded to digital PCM and restored back to analog without losses.

But this is an ideal concept that requires infinite recording and playback time and an ideal brick-wall filter.

We have no infinite time.

We have no real brick-wall filter either. We have a filter with some transition band.

A narrow transition band is difficult for an analog filter. The steeper a digital filter, the more intense its ringing distortions. There may also be technical resource limitations on building a steep enough filter.

Inside a DAC, upsampling with a digital filter is used to make the filter work properly. But the hardware may have limited computing resources for implementing a sophisticated filter.

So high sample rates are used not for ultrasound playback, but for the ADC and DAC filters.

 

We know that humans hear sound in a range of 0 ... 120 dB at most. There is an opinion that the dynamic range of 16 bit is 16 * 6 dB = 96 dB.

But -96 dB is the noise floor (actually it is about -100 ... -110 dB).

So a minimal signal at -96 dB has "zero sound quality", i.e. the noise covers the audio signal.

To keep sound quality, the signal must be above the noise. We can take a noise level of about -40 dB relative to the signal as allowable.

So the actual dynamic range in which sound quality is kept is 96 - 40 = 56 dB.
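The arithmetic above can be repeated for other bit depths. The sketch below keeps the article's round figures of 6 dB per bit and a 40 dB allowable noise margin. Both are parameters, so other assumptions can be tried.

```python
def usable_dynamic_range_db(bit_depth, required_margin_db=40.0, db_per_bit=6.0):
    """Dynamic range left when the signal must stay a margin above the noise."""
    noise_floor_range_db = bit_depth * db_per_bit     # e.g. 16 * 6 = 96 dB
    return noise_floor_range_db - required_margin_db

for bits in (16, 24):
    print(bits, "bit:", usable_dynamic_range_db(bits), "dB")
```

With the same margin, 24 bit leaves 104 dB instead of 56 dB.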

 

If digital signal processing is applied, the total gain may be increased and the noise will be boosted.



We can solve the low-level quality issue via a higher bit depth.

 

Audio data integrity

Digital audio data may be corrupted in transmission or in storage. This can be checked via checksum comparison.

A checksum is a number calculated from the binary audio data array. Different data almost always give different checksums.



Checksum A is calculated for the correct music data array.

Before playback, checksum B of the actual data array is calculated.

If checksums A and B are different, we can conclude that the actual data is damaged.
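The comparison of checksum A and checksum B may be sketched as follows. SHA-256 is used here only as an example of a checksum function. Audio formats and tools may use other ones (CRC, MD5), and the byte arrays are made up.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Checksum of a binary audio data array (SHA-256 as an example)."""
    return hashlib.sha256(data).hexdigest()

def is_intact(actual_data: bytes, expected_checksum: str) -> bool:
    """Compare checksum B of the actual data with the stored checksum A."""
    return checksum(actual_data) == expected_checksum

original = bytes(range(16))                 # stand-in for correct audio data
checksum_a = checksum(original)

damaged = bytes([1]) + original[1:]         # one byte altered in storage
print(is_intact(original, checksum_a), is_intact(damaged, checksum_a))
```

A matching checksum does not repair anything. It only tells whether the data differ from the reference.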

As a rule, checksums are used for compressed data. But they may be applied to any audio file content.

 

 

PCM software

 

This software processes and/or plays back PCM audio files and streams.

 

PCM players

  • Audirvana
  • foobar2000
  • JRiver
  • VLC

Audiophile players are capable of bit-perfect playback of audio files: the audio file content is sent to the DAC without alteration.

 

 

PCM converters

The main requirement for a converter is minimal distortion in the audible band.

A CD ripper is a kind of audio converter that can copy CD audio data to a file.

 

PCM editors and DAWs

  • Audacity
  • Sound Forge
  • Wavelab

 

 

Conclusions

 

  1. PCM is a way to code an analog signal in digital form.
  2. Its bit depth depends on the desired noise level. The bit depth is defined by each application.
  3. The sample rate is defined primarily by the analog filter. Digital filtering is used to work better with the analog filter.
  4. PCM editing and processing is easier than with DSD.
  5. PCM may be compressed losslessly or lossily.
  6. 44.1 kHz 16 bit may be enough in some conditions. But this is rather a matter of implementation.

 

Author: Yuri Korzunov,
Audiophile Inventory's developer

 

PCM audio

 

References

 

  1. The MP3 Is Officially Dead, According To Its Creators
  2. Linear Pulse Code Modulated Audio (LPCM)
  3. Is MQA DOA?
  4. ALAC specification
  5. mp3 specification
  6. AAC specification
  7. CAF specification
  8. DTS specification
  9. AC3 specification
  10. WMA specification
  11. About quantization noise
  12. Dolby TrueHD specification

 



Copyright © Yuri Korzunov
["Audiophile Inventory" (since 2011), "AudiVentory", "Audiophile Inventory by AudiVentory.com".
Also "Audiophile Inventory by AudiVentory.com" (formerly "Audiophile Inventory") is a protected commercial designation],
2010-2018. All Rights Reserved.
