Dynamic range compression in audio—audio compression—reduces the dynamic range of an audio signal. The dynamic range is the difference between the quietest and loudest sounds. Audio compression is used in music production, vocal recordings, and broadcasting and helps to deliver a more consistent and enjoyable listening experience. When overdone, however, audio compression can result in an unnatural or distorted sound. In this article, we explore audio compression—what it is, how it’s applied, and its benefits and limitations.
What is dynamic range compression?
Dynamic range compression is a process that reduces the dynamic range of a signal.
The dynamic range is the difference between the smallest and largest values (e.g., amplitude levels) in a signal. It represents the maximum amount of variation in the signal’s values. Dynamic range compression, therefore, reduces the signal’s maximum variation.
Dynamic range applies to signals in a variety of settings, including:
- Audio (see below)
- Electronics, e.g., the maximum variation in power, current, or voltage
- Photography, e.g., the maximum variation in luminance
- Video, e.g., the maximum variation in brightness
In each of these settings, dynamic range compression means reducing the maximum signal variation.
What is dynamic range compression in audio?
The dynamic range in audio is the difference between the quietest and loudest sounds in an audio signal. Dynamic range compression, or audio compression, reduces the dynamic range of an audio signal.
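As a rough illustration of the definition above, dynamic range can be expressed in decibels as the ratio between the loudest and quietest levels. The sketch below assumes a hypothetical list of per-passage peak amplitudes (with 1.0 as full scale); real meters measure level over short time windows rather than from a hand-picked list.

```python
import math

def dynamic_range_db(peak_levels):
    """Difference, in dB, between the loudest and quietest levels.

    `peak_levels` are linear amplitudes (1.0 = full scale), one per
    passage; silent passages (0.0) are ignored to avoid log(0).
    """
    audible = [abs(p) for p in peak_levels if p != 0]
    return 20 * math.log10(max(audible) / min(audible))

# Hypothetical passage levels: a whisper, normal speech, a shout.
levels = [0.01, 0.1, 1.0]
print(f"{dynamic_range_db(levels):.1f} dB")  # 40.0 dB
```

A compressor that brought the shout down to 0.5 would shrink this figure to about 34 dB.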
A wider dynamic range allows more depth and contrast in the sound, whereas a narrower (i.e., more compressed) dynamic range leaves less difference between the quiet and loud passages. This results in a ‘flatter’ sound.
In a musical setting, dynamic range compression reduces the contrast, nuance, and detail of sound, with fewer discernible quiet and loud sections. It improves the consistency of volume, however, producing a ‘punchier’, more uniform sound.
Why use audio compression?
Audio compression is useful in many situations. It can, for instance, prevent distortion from loud signals, make quiet sounds more audible, or create a more balanced and consistent listening experience.
Following are some common areas where audio compression is used:
In music production, compression is used to control the levels of individual tracks and the overall mix. It helps ensure that all elements of a song—from the quietest hi-hat to the loudest snare drum—are clearly heard and contribute to the overall sound.
A vocalist’s performance, for instance, can vary greatly in volume. Without compression, some words might be too loud and other words too soft, making the lyrics hard to understand. Compression smooths out these volume differences, making the vocals more consistent and balanced in the mix.
Compression is also applied to drums to add punch and sustain, creating a more powerful and energetic sound. By adjusting the attack and release times (see below), a producer can shape the sound of drums, making them ‘tighter’ or ‘looser’ depending on the desired effect.
In voice recordings, such as podcasts, audiobooks, or voiceovers, compression helps to maintain a consistent volume level.
Speakers naturally vary their volume as they speak, and without compression, a listener may need to frequently adjust their volume control.
In a podcast, one speaker may have a louder voice than another, or a single speaker may become more or less animated in their volume levels. Compression balances these volume differences, creating a smoother, more enjoyable listening experience.
Compression also helps to reduce unwanted noise. By setting the threshold (see below) above the level of the background noise (or the noise floor), the compressor prevents noise from being amplified when voice or other desired sounds are amplified.
In broadcasting, whether radio or television, the goal is to deliver a consistent audio experience—compression plays a vital role in this.
A news broadcast usually contains multiple elements—the anchor’s voice, interviews, background music, sound effects, etc. These can have vastly different volume levels, so compression ensures that they are balanced and easy to comprehend.
Compression is also used to prevent audio from exceeding the maximum allowed level in broadcasting, which could cause distortion or damage to broadcasting equipment.
Compression has many other uses, including:
- In film and television production it’s used to balance dialogue, music, and sound effects, keeping dialogue at audible levels and not overwhelmed by sound effects or music
- In live sound reinforcement, e.g., concerts or public speaking events, it’s used to prevent sudden loud sounds from causing audience discomfort and to prevent feedback by reducing the level of microphones not in current use
- In hearing aids, it’s used to amplify quiet sounds without making loud sounds uncomfortably loud
Dynamic range compression parameters
In an audio production workflow, dynamic range compression is achieved by using a device called a ‘compressor’. Following are the parameters that you would need to adjust in a typical compressor.
| Parameter | What it controls | Typical settings |
|---|---|---|
| Threshold | Level above (or below) which compression starts | |
| Ratio | How much a signal is compressed once it’s beyond the threshold | Very high: 10–Inf:1 |
| Attack | How long it takes for a signal to be compressed | Very fast: 0.1–5 ms; Fast: 1–15 ms; Medium: 3–30 ms; Slow: 10–100 ms |
| Release | How long it takes for a compressed signal to revert to the original signal | Very fast: 0.1–5 ms; Fast: 5–50 ms; Medium: 30–100 ms; Slow: 50–100+ ms |
| Knee | How a compressor transitions between the compressed and uncompressed states | |
| Makeup gain | Compensation for the volume reduction caused by compression | Match the gain reduction caused by compression |
The threshold is the level at which the compressor starts to work, i.e., it determines what part of your audio will be affected by the compressor.
It’s usually measured in decibels (dB), and any signal that exceeds the threshold will be compressed. If you set the threshold at -10 dBFS (decibels relative to full scale), for instance, then any part of the signal above -10 dBFS will be reduced in volume.
If you set a low threshold, more of the audio will be compressed, making it sound more consistent in volume but also more ‘processed’. If you set a high threshold, only the loudest parts will be compressed, preserving more of the audio’s natural dynamics.
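To make the threshold idea concrete, here is a minimal sketch that converts linear amplitudes to dBFS and checks which ones sit above a -10 dBFS threshold. The function name `to_dbfs` and the sample amplitudes are illustrative assumptions, not part of any particular compressor.

```python
import math

def to_dbfs(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dBFS."""
    return 20 * math.log10(abs(amplitude))  # assumes amplitude != 0

THRESHOLD_DBFS = -10.0  # the example threshold from the text

# Hypothetical peak amplitudes of four passages.
peaks = [1.0, 0.5, 0.25, 0.1]
above = [to_dbfs(p) > THRESHOLD_DBFS for p in peaks]
for p, is_above in zip(peaks, above):
    print(f"{p:4.2f} -> {to_dbfs(p):6.1f} dBFS, compressed: {is_above}")
```

Only the first two passages (0 dBFS and about -6 dBFS) exceed the threshold. Lowering the threshold to, say, -15 dBFS would bring the third passage into the compressed region as well.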
The ratio controls the degree of compression, or gain reduction, i.e., it determines how much a signal is reduced once it exceeds the threshold. As its name suggests, this parameter is expressed as a ratio, e.g., 2:1, 4:1, or 8:1.
A ratio of 2:1 means that for every 2 dB over the threshold, the output will only increase by 1 dB. A ratio of 4:1 means that for every 4 dB over the threshold, the output will only increase by 1 dB, and so on.
A low ratio results in gentle compression, which is useful for subtly controlling volume dynamics. A high ratio results in heavy compression, taming loud peaks and creating a more processed sound.
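The ratio arithmetic above can be sketched as a small function. This assumes a simple ‘hard knee’ downward compressor in its steady state (ignoring attack and release), and the function name is hypothetical.

```python
def compressed_level(level_db, threshold_db, ratio):
    """Static output level of a hard-knee downward compressor."""
    if level_db <= threshold_db:
        return level_db  # below the threshold: left untouched
    # Above the threshold: the overshoot is divided by the ratio.
    return threshold_db + (level_db - threshold_db) / ratio

# An input 8 dB over a -10 dB threshold, at three ratios.
for ratio in (2, 4, 8):
    print(f"{ratio}:1 -> {compressed_level(-2.0, -10.0, ratio):.1f} dB")
# 2:1 -> -6.0 dB
# 4:1 -> -8.0 dB
# 8:1 -> -9.0 dB
```

The same 8 dB overshoot is reduced to 4 dB, 2 dB, and 1 dB respectively, which is why higher ratios sound more heavily processed.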
The attack refers to how long it takes for a signal to be compressed once the threshold has been exceeded.
It’s usually measured in milliseconds (ms), or sometimes as dB per second, with a short attack time meaning the compressor will react quickly and a long attack time meaning the compressor will react more slowly.
A short attack (e.g., less than 1 ms) can make the audio sound tighter and more controlled but it can also reduce the impact of percussive sounds. A long attack (e.g., 10–100 ms) can preserve the natural dynamics and transients of audio, but it might not control the volume as effectively.
The release works in the opposite way to attack and refers to how long it takes for a compressed signal to go back to its original state after the signal falls below the threshold.
Like the attack, the release is usually measured in milliseconds (sometimes as dB per second). A short release time means the compressor stops reducing the volume quickly, while a long release time means it stops reducing the volume slowly.
A short release (e.g., 2–5 ms) can result in a smoother and more sustained sound but it may make the audio ‘pump’ or ‘breathe’ as the compressor rapidly turns on and off. A long release (e.g., 40–60 ms) may result in the audio sounding overly compressed and lacking in dynamics.
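One common way to model attack and release is to smooth the requested gain reduction with a one-pole filter that uses a different time constant depending on whether the reduction is growing or shrinking. The sketch below assumes that model; the function name, the 48 kHz sample rate, and the convention that each time is a ‘63% of the way there’ time constant are all assumptions, and real compressors define these times in differing ways.

```python
import math

def smooth_gain_reduction(targets_db, attack_ms, release_ms, sample_rate=48000):
    """Smooth a per-sample gain-reduction target (positive dB values)."""
    attack = math.exp(-1.0 / (attack_ms * 0.001 * sample_rate))
    release = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    current, smoothed = 0.0, []
    for target in targets_db:
        # More reduction requested -> attack phase; otherwise release phase.
        coeff = attack if target > current else release
        current = coeff * current + (1.0 - coeff) * target
        smoothed.append(current)
    return smoothed

# 6 dB of reduction requested for 10 ms, then none for 10 ms.
step = [6.0] * 480 + [0.0] * 480
fast = smooth_gain_reduction(step, attack_ms=1.0, release_ms=5.0)
slow = smooth_gain_reduction(step, attack_ms=10.0, release_ms=50.0)
print(f"after 1 ms: fast={fast[47]:.2f} dB, slow={slow[47]:.2f} dB")
```

With the fast settings most of the reduction is in place within a millisecond, so a drum transient would be caught. With the slow settings the transient passes largely untouched before the compressor clamps down.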
The knee controls how the compressor transitions between the uncompressed and compressed states.
A ‘hard’ knee results in immediate compression once the threshold is exceeded, while a ‘soft’ knee results in a more gradual transition to compression.
A hard knee can make the compression more noticeable, which might be desirable for creative effects or for controlling loud peaks. A soft knee can make the compression more transparent, which might be preferred for gentle dynamic control or for preserving the natural sound of audio.
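The difference between the two knee types can be sketched with a static level curve. The quadratic blend used here is one widely used soft-knee formulation, not necessarily what any given compressor implements, and the `knee_db` width parameter is an assumption for illustration.

```python
def knee_level(level_db, threshold_db, ratio, knee_db):
    """Static output level with a knee `knee_db` wide, centred on the threshold.

    knee_db = 0 gives a hard knee.
    """
    over = level_db - threshold_db
    if knee_db > 0 and abs(over) <= knee_db / 2:
        # Inside the knee: blend gradually from no compression to full ratio.
        return level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    if over > 0:
        return threshold_db + over / ratio  # fully compressed region
    return level_db                         # uncompressed region

# Levels around a -10 dB threshold at 4:1, hard knee vs 6 dB soft knee.
for level in (-14.0, -12.0, -10.0, -8.0, -6.0):
    hard = knee_level(level, -10.0, 4.0, 0.0)
    soft = knee_level(level, -10.0, 4.0, 6.0)
    print(f"in {level:6.1f} dB  hard {hard:7.3f} dB  soft {soft:7.3f} dB")
```

With the soft knee, a small amount of gain reduction already appears just below the threshold, and the full ratio is only reached once the signal is half the knee width above it.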
The makeup gain, or output gain, compensates for the volume reduction caused by compression. After being compressed, a signal’s overall level is quieter than it was originally, so makeup gain is used to bring the level back up.
Generally, the amount of makeup gain you apply should match the amount of gain reduction caused by the compressor. If the compressor reduces the peak level of your audio by 5 dB, for instance, you can apply 5 dB of makeup gain to restore the peak level. However, the exact amount of makeup gain will depend on the specific sound you’re aiming for.
Note that, while makeup gain increases the volume of the compressed signal, it doesn’t restore the original dynamics. The loud and quiet parts of the audio will still be closer together in volume than they were before compression.
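A short worked example may help here. The sketch below assumes a hard-knee compressor with a -10 dB threshold and a 4:1 ratio, plus two hypothetical passage levels, and shows that makeup gain restores the peak level while the gap between loud and quiet stays narrower.

```python
def compressed_level(level_db, threshold_db=-10.0, ratio=4.0):
    """Static output level of a hard-knee downward compressor."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

quiet_in, loud_in = -30.0, -2.0           # hypothetical passage levels (dB)
quiet_out = compressed_level(quiet_in)    # -30.0: below the threshold
loud_out = compressed_level(loud_in)      # -8.0: 6 dB of gain reduction

makeup_db = loud_in - loud_out            # match the peak gain reduction
quiet_final = quiet_out + makeup_db
loud_final = loud_out + makeup_db

print(f"makeup gain: {makeup_db:.1f} dB")
print(f"range before: {loud_in - quiet_in:.1f} dB")
print(f"range after:  {loud_final - quiet_final:.1f} dB")
```

The peak is back at -2 dB, but the quiet passage now sits at -24 dB instead of -30 dB, so the range has shrunk from 28 dB to 22 dB.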
Types of dynamic range compression in audio
Audio compression can be applied in various ways, as follows:
Downward vs upward compression
Downward and upward compression work in opposite directions.
Downward compression (more common) reduces the volume of sounds that exceed a certain threshold. If you’re compressing a vocal track, for example, downward compression reduces the volume of the loudest vocal parts to make them more consistent with the quieter ones.
Upward compression, on the other hand, increases the volume of sounds that fall below a certain threshold. Again, if compressing a vocal track, upward compression increases the volume of the quietest vocal parts to make them more consistent with the louder ones.
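The two directions can be compared with a pair of static level curves. This is a minimal sketch with hypothetical function names, a -20 dB threshold, and a 2:1 ratio chosen purely for illustration.

```python
def downward(level_db, threshold_db, ratio):
    """Pull levels above the threshold down towards it."""
    over = level_db - threshold_db
    return threshold_db + over / ratio if over > 0 else level_db

def upward(level_db, threshold_db, ratio):
    """Push levels below the threshold up towards it."""
    under = threshold_db - level_db
    return threshold_db - under / ratio if under > 0 else level_db

for level in (-40.0, -20.0, -10.0):
    print(f"in {level:6.1f} dB  "
          f"downward {downward(level, -20.0, 2.0):6.1f} dB  "
          f"upward {upward(level, -20.0, 2.0):6.1f} dB")
```

Downward compression leaves the -40 dB passage alone and lowers the -10 dB passage to -15 dB. Upward compression does the reverse, raising the -40 dB passage to -30 dB and leaving the -10 dB passage alone. Both narrow the dynamic range.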
Multiband compression acts on an audio signal in multiple frequency bands and applies different compression settings to each band. This lets you compress different parts of the frequency range separately.
You may want to apply heavy compression to bass frequencies, for instance, to tighten up the low end, while applying lighter compression to high frequencies to preserve the brightness and clarity of your mix. This can give you a more balanced and controlled sound, depending on your preferences.
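The idea can be sketched in two bands. The example below makes several simplifying assumptions: a crude one-pole crossover at 200 Hz, a single static gain per band based on that band’s peak (no attack or release), and hypothetical function names and settings. Real multiband compressors use steeper crossover filters and time-varying gain.

```python
import math

def split_bands(samples, crossover_hz=200.0, sample_rate=48000):
    """Split into low and high bands that sum back to the input."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate)
    low, high, state = [], [], 0.0
    for x in samples:
        state += alpha * (x - state)   # one-pole low-pass
        low.append(state)
        high.append(x - state)         # remainder is the high band
    return low, high

def band_gain(band, threshold_db, ratio):
    """Linear gain a hard-knee compressor would apply at the band's peak."""
    peak = max(abs(s) for s in band)
    if peak == 0:
        return 1.0
    level_db = 20 * math.log10(peak)
    if level_db <= threshold_db:
        return 1.0
    out_db = threshold_db + (level_db - threshold_db) / ratio
    return 10 ** ((out_db - level_db) / 20)

def multiband_compress(samples, low_settings=(-20.0, 6.0), high_settings=(-20.0, 2.0)):
    """Heavier (6:1) compression on the lows, lighter (2:1) on the highs."""
    low, high = split_bands(samples)
    g_low = band_gain(low, *low_settings)
    g_high = band_gain(high, *high_settings)
    return [g_low * l + g_high * h for l, h in zip(low, high)]

# Test signal: a 50 Hz tone plus a 5 kHz tone, 0.1 s at 48 kHz.
sr = 48000
signal = [0.4 * math.sin(2 * math.pi * 50 * n / sr)
          + 0.4 * math.sin(2 * math.pi * 5000 * n / sr) for n in range(4800)]
output = multiband_compress(signal)
print(f"input peak {max(map(abs, signal)):.3f}, output peak {max(map(abs, output)):.3f}")
```

Because each band has its own threshold and ratio, the low end can be held firmly in place without the same settings dulling the high frequencies.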
Limiting is compression applied with a very high ratio, often exceeding 10:1, so that an audio signal is prevented from exceeding a certain level.
You can use a limiter on the master track of a mix to ensure that the overall volume doesn’t exceed 0 dBFS, which could cause digital clipping and distortion.
With a limiter, you can also ‘brick wall’ your audio, i.e., use a very high ratio and a very fast attack time, to make sure that the signal doesn’t go above the threshold.
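A very simple brick-wall limiter can be sketched as a gain that drops instantly whenever a sample would exceed the ceiling and then recovers gradually. The function name, the 0.5 ceiling, and the release coefficient are assumptions for illustration. Production limiters typically add look-ahead and oversampling, which this sketch omits.

```python
def brickwall_limit(samples, ceiling=0.5, release_coeff=0.999):
    """Limit linear samples so that none exceeds `ceiling` in magnitude."""
    gain, limited = 1.0, []
    for s in samples:
        # Largest gain that keeps this sample at or below the ceiling.
        needed = min(1.0, ceiling / abs(s)) if s != 0 else 1.0
        if needed < gain:
            gain = needed  # instant attack: clamp immediately
        else:
            # Slow release: drift back towards the allowed gain.
            gain = release_coeff * gain + (1.0 - release_coeff) * needed
        limited.append(s * gain)
    return limited

samples = [0.1, 0.9, -1.0, 0.3, 0.2]
limited = brickwall_limit(samples)
print([round(x, 3) for x in limited])
```

The two samples above the ceiling come out at exactly the ceiling, and the samples that follow are still turned down slightly while the gain recovers.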
Parallel compression, sometimes referred to as New York compression or Motown compression, involves blending a compressed version of a signal with the original, uncompressed signal. This adds density and sustain to your audio while preserving the natural dynamics of the original signal.
You can use parallel compression on a drum track, for instance, to add punch and energy—the compressed signal brings out the sustain of the drums while the original signal preserves its transients.
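The blend can be sketched on the amplitude envelope of a single drum hit. Everything specific here is an assumption for illustration: the envelope values, the -20 dB threshold, the 8:1 ratio, and the 12 dB of makeup on the compressed copy. Attack and release are ignored.

```python
import math

def compressor_gain(amplitude, threshold_db=-20.0, ratio=8.0):
    """Linear gain a hard-knee compressor applies at a given envelope level."""
    if amplitude <= 0:
        return 1.0
    over = 20 * math.log10(amplitude) - threshold_db
    if over <= 0:
        return 1.0
    return 10 ** ((over / ratio - over) / 20)

def parallel_mix(envelope, wet_makeup_db=12.0):
    """Sum the dry envelope with a heavily compressed, boosted copy."""
    makeup = 10 ** (wet_makeup_db / 20)
    return [dry + dry * compressor_gain(dry) * makeup for dry in envelope]

# Hypothetical drum hit: a sharp transient decaying into a quiet tail.
drum_hit = [1.0, 0.5, 0.2, 0.1, 0.05]
mixed = parallel_mix(drum_hit)

dry_range = 20 * math.log10(drum_hit[0] / drum_hit[-1])
mix_range = 20 * math.log10(mixed[0] / mixed[-1])
print(f"dry transient-to-tail: {dry_range:.1f} dB")
print(f"mix transient-to-tail: {mix_range:.1f} dB")
```

The transient is still the loudest point of the blend, but the tail has been lifted relative to it, which is the added sustain the text describes.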
Sidechain compression uses an external audio signal to ‘modulate’ compression in the original signal, e.g., compression applied to an audio track (the main track) is controlled by the level of another audio track (the sidechain track).
A common use of sidechain compression is in electronic dance music (EDM). Here, the bass is often compressed in response to the kick drum. This causes the bass to ‘pump’ in time with the kick, creating a rhythmic effect and ensuring that the kick cuts through the mix.
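The key point, that the gain is computed from the sidechain but applied to the main track, can be sketched as follows. The levels, threshold, and ratio are hypothetical, and timing (attack and release) is ignored.

```python
def sidechain_gain_db(sidechain_level_db, threshold_db=-20.0, ratio=4.0):
    """Gain change (dB) derived from the sidechain level, not the main signal."""
    over = sidechain_level_db - threshold_db
    return -(over - over / ratio) if over > 0 else 0.0

BASS_LEVEL_DB = -12.0  # steady bass note (main track)

# Hypothetical kick envelope over four steps: hit, silence, hit, silence.
kick_levels_db = [-8.0, -60.0, -8.0, -60.0]
bass_out = [BASS_LEVEL_DB + sidechain_gain_db(k) for k in kick_levels_db]
print(bass_out)  # [-21.0, -12.0, -21.0, -12.0]
```

The bass drops by 9 dB whenever the kick hits and returns to its original level in between, producing the rhythmic ‘pumping’ described above.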
Serial compression uses two or more compressors in series.
Each compressor applies a small amount of gain reduction, resulting in a more natural and transparent sound than if a single compressor were used.
You could use one compressor with a fast attack and release to tame transient peaks, for example, followed by another compressor with a slow attack and release. This would smooth out the overall dynamics and let you shape the sound in more nuanced ways.
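In terms of static levels, serial compression is simply one compressor’s output feeding the next. The sketch below assumes two gentle hard-knee stages with hypothetical settings. It ignores the attack and release differences mentioned above and only shows how the gain reduction is shared.

```python
def compressed_level(level_db, threshold_db, ratio):
    """Static output level of a hard-knee downward compressor."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def serial_compress(level_db):
    # Stage 1: high threshold, catches only the loudest peaks.
    stage1 = compressed_level(level_db, threshold_db=-6.0, ratio=2.0)
    # Stage 2: lower threshold, gently evens out the overall level.
    return compressed_level(stage1, threshold_db=-18.0, ratio=1.5)

for level in (0.0, -12.0, -24.0):
    print(f"in {level:6.1f} dB -> out {serial_compress(level):6.1f} dB")
# in    0.0 dB -> out   -8.0 dB
# in  -12.0 dB -> out  -14.0 dB
# in  -24.0 dB -> out  -24.0 dB
```

The loudest input receives 8 dB of reduction in total, but neither stage applies more than 5 dB on its own.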
De-essing reduces the volume of sibilant sounds, such as ‘s’ and ‘sh’ sounds in vocal recordings. Sibilant sounds can be harsh and distracting if they’re too loud.
A de-esser works by compressing only higher frequency ranges, e.g., above 5 kHz, where sibilance occurs. By taming the sibilant sounds, de-essing helps make vocals sound smoother and more polished.
How over-compression can affect sound quality
While audio compression has many uses, too much compression can be a problem.
The loudness wars
A case in point is the ‘loudness wars’, a trend that occurred in pop music from the 1980s to the 2000s.
The loudness wars was a phenomenon in which music recordings became increasingly loud due to excessive use of dynamic range compression. A well-known example is the 2008 album “Death Magnetic” by Metallica, characterized by its extremely loud and distorted sound.
According to Ian Shepherd, a veteran mastering engineer, Metallica’s Death Magnetic is just too loud: it’s been compressed by about as much as it’s possible to compress audio. This made it unpleasant and a “pointless massacre of our music”, says Shepherd.
“Louder is better, but too loud is worse.” (Ian Shepherd)
The rationale for the loudness wars
If Shepherd’s views are anything to go by, the loudness wars negatively impacted pop music by taking dynamic range compression too far.
So, what motivated the loudness wars?
The main rationale was the belief that louder music stands out and is more appealing to listeners. This especially applies when music is played alongside other tracks, e.g., on the radio, in a playlist, or in environments with a lot of background noise.
There’s also a psychoacoustic perception at play, where listeners tend to perceive louder music as sounding better, at least in short-term listening tests. This may also have been a motive for louder recordings.
The problem with over-compression
While louder music has its place, the loudness wars illustrated a number of issues with over-compression, including:
- An overly reduced dynamic range, resulting in music that sounds less natural and (possibly) more ‘tiring’ to listen to (i.e., causing ‘musical fatigue’), as there’s less contrast between loud and quiet sounds
- More distortion and clipping, especially at volumes near the maximum levels that digital audio can handle
- Not acknowledging that listeners can easily adjust the volume of their own music—if a track with a wide dynamic range appears quiet, it’s easy to turn up the overall volume
Lessons learned from the loudness wars
One positive outcome of the loudness wars is a renewed appreciation for dynamics in music. Many audio engineers, music producers, and listeners now recognize the value of a wide dynamic range and the negative effects of over-compression.
A movement known as Dynamic Range Day was started in 2010 to raise awareness about over-compression. Streaming platforms like Spotify and Apple Music now use loudness normalization, which adjusts the volume of all tracks to a similar level. This removes the need for excessive compression and encourages a return to more dynamic mastering practices.
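The effect of loudness normalization can be sketched with a simple level measure. Note the assumptions: streaming services measure loudness with perceptual, LUFS-style metering, whereas this sketch uses plain RMS as a stand-in, and the -14 dB target and the two toy ‘tracks’ are illustrative values only.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (a stand-in for loudness)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def normalization_gain_db(samples, target_db=-14.0):
    """Gain needed to bring a track's measured level to the target."""
    return target_db - rms_db(samples)

def normalize(samples, target_db=-14.0):
    gain = 10 ** (normalization_gain_db(samples, target_db) / 20)
    return [s * gain for s in samples]

# Toy tracks: a heavily compressed, loud master and a quieter one.
loud_master = [0.9, -0.9] * 100
quiet_master = [0.2, -0.2] * 100

for name, track in (("loud", loud_master), ("quiet", quiet_master)):
    print(f"{name}: measured {rms_db(track):6.2f} dB, "
          f"gain applied {normalization_gain_db(track):6.2f} dB")
```

The loud master is turned down by about 13 dB while the quieter one is barely touched, so after normalization the heavily compressed track gains no loudness advantage.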
The loudness wars is a reminder that, while dynamic range compression is a powerful tool in audio production, it shouldn’t be over-used. The goal should be to enhance the music and make it sound the best it can, rather than simply making it as loud as possible.
Dynamic range compression is a critical tool in audio production, used to balance the loud and quiet parts of an audio signal.
It has a wide range of applications, from music production and voice recordings to broadcasting and live sound reinforcement. As illustrated by the ‘loudness wars’ of the 1980s to 2000s, however, overuse of compression can lead to a reduction in audio quality and listener enjoyment.
The key parameters of compression—threshold, ratio, attack, release, knee, and makeup gain—allow for nuanced control over sound. Different types of compression, including downward and upward, multiband, limiting, parallel, sidechain, serial, and de-essing, offer various ways to shape audio with compression.
The lessons learned from the loudness wars have led to a greater appreciation for dynamic range and a more mindful approach to using compression. Today, the focus is shifting back towards preserving the natural dynamics of music and creating a more balanced and enjoyable listening experience.
What is a compressor?
A compressor is a device used in audio production to control the dynamic range of sound. It reduces the difference between the loudest and softest parts of an audio signal by attenuating the signal level when it exceeds a certain threshold.
What is threshold control in a compressor?
Threshold control determines the level at which the compressor starts to reduce a signal’s level. When the signal exceeds the threshold, the compressor begins to apply compression.
What is parallel compression?
Parallel compression is a process where a compressed version of a signal is blended with the original uncompressed signal. This allows better control over the dynamic range of the sound, as the compressed signal adds sustain and body to the uncompressed signal.
What is multiband compression?
Multiband compression applies compression on different frequency bands of an audio signal independently. It allows precise control over the dynamic range of individual frequency ranges, which can be useful when dealing with complex audio material.
How does compression affect the audio dynamic range?
Compression reduces the dynamic range of an audio signal by attenuating the louder parts of the signal (or amplifying the quieter parts). This results in a more consistent and controlled sound and can make audio sound punchier and more balanced.
What is the input signal in compression?
The input signal is the audio signal that’s being processed by the compressor. It is the signal that will be affected by the compression settings, and its level and dynamics determine how the compression is applied.
How can compression be used in audio signal processing?
Compression can be used to control the dynamic range of an audio signal. It can be used to smooth out variations in level, add sustain to a sound, or even create special effects. Compression plugins are commonly used in digital audio workstations for this purpose.
What is a high compression ratio?
The compression ratio sets how much gain reduction a compressor applies once the signal exceeds the threshold. A ratio of 4:1 means that for every 4 dB that the input signal exceeds the threshold, the output level rises by only 1 dB. Higher compression ratios result in more aggressive compression.
How does compression affect the audio output level?
Downward compression lowers the loudest parts of an audio signal, so the overall output level drops. Makeup gain can then be used to bring the output back up to the desired level, which also raises the quieter parts of the signal.
Can an audio compressor be used for data compression?
No, a compressor in the context of audio is used to control the dynamic range of an audio signal, not for data compression. Data compression is a different concept used to reduce the size of digital files, while audio compression is used to alter (i.e., reduce) the dynamic range of an audio signal.