In video compression, entropy coding is the final lossless stage that squeezes out the last bits of redundancy from the data before it hits the bitstream. H.264/AVC, one of the most successful video codecs ever designed, offers two distinct entropy coding methods: Context-Adaptive Variable-Length Coding (CAVLC) and Context-Adaptive Binary Arithmetic Coding (CABAC). These two approaches represent a classic trade-off between compression efficiency and computational complexity.
Understanding the difference between CAVLC and CABAC is essential for anyone working with H.264 encoders or decoders, choosing profiles, or optimizing bitrate. This post breaks down how each method works, their strengths and weaknesses, and where they fit in real-world applications.
What Is Entropy Coding and Why Does It Matter?
After prediction, transform, and quantization, the encoder is left with a stream of syntax elements: macroblock types, motion vectors, quantized coefficients, etc. These elements are highly redundant and predictable based on context (what came before them in the stream).
Entropy coding exploits this predictability to assign shorter codewords to frequent symbols and longer ones to rare symbols, reducing the overall bitrate without any loss of information. Both CAVLC and CABAC are context-adaptive, meaning they dynamically adjust their probability models based on previously coded data, making them far more efficient than static Huffman or Golomb coding.
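As a quick sanity check on the numbers involved: information theory says a symbol with probability p ideally costs about -log2(p) bits. A tiny C snippet makes the point (this is a generic illustration of the Shannon bound, nothing H.264-specific):

```c
#include <math.h>
#include <stdio.h>

/* Ideal (Shannon) code length in bits for a symbol of probability p. */
static double ideal_bits(double p) { return -log2(p); }

int main(void)
{
    printf("p = 0.90 -> %.2f bits\n", ideal_bits(0.90)); /* ~0.15 bits */
    printf("p = 0.50 -> %.2f bits\n", ideal_bits(0.50)); /* exactly 1  */
    printf("p = 0.01 -> %.2f bits\n", ideal_bits(0.01)); /* ~6.64 bits */
    return 0;
}
```

Note the first line: a variable-length code like CAVLC must still spend at least one whole bit on even the most predictable symbol, while arithmetic coding can average well below one bit. That gap is a large part of CABAC's advantage.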
The choice between the two directly affects compression performance: CABAC typically saves 10–15% bitrate compared to CAVLC at the same quality, but at the cost of significantly higher decoding complexity.
CAVLC: Context-Adaptive Variable-Length Coding
CAVLC is the simpler of the two methods and is the only entropy coder permitted in the Baseline and Extended profiles of H.264 (commonly used for videoconferencing and mobile applications).
How CAVLC Works
CAVLC primarily encodes the quantized transform coefficients (the residual data) using a set of pre-defined variable-length code tables. The key innovation is that it switches tables based on context—specifically, the number of non-zero coefficients in neighboring blocks.
The encoding process for a block of coefficients follows these steps (a short code sketch after the list shows how these values are derived):
Coeff_token: A single codeword that encodes both the total number of non-zero coefficients (TotalCoeff) and the number of trailing ±1 coefficients (TrailingOnes). Different VLC tables are chosen depending on the number of non-zero coefficients in previously encoded nearby blocks.
Sign of TrailingOnes: One bit per trailing ±1.
Levels: The magnitudes (with signs) of the remaining non-zero coefficients, encoded with VLC tables that adapt as coding proceeds: once a large magnitude appears, the encoder switches to a table better suited to large values.
Total_zeros: The total number of zero coefficients before the last non-zero one, again using context-adaptive tables.
Run_before: For each non-zero coefficient (starting from the highest frequency), the number of consecutive zeros immediately preceding it in scan order.
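Here is a minimal sketch of how those values fall out of a zig-zag-scanned 4×4 block. The struct and function names are mine, and the actual VLC table lookups (including the neighbor-based table choice for coeff_token) are omitted:

```c
#include <stdio.h>
#include <stdlib.h>

/* Statistics CAVLC derives from one 4x4 block before any table lookup.
 * Input: 16 quantized coefficients already in zig-zag scan order
 * (index 0 = lowest frequency). */
typedef struct {
    int total_coeff;    /* non-zero coefficients                        */
    int trailing_ones;  /* trailing +/-1s among them, capped at 3       */
    int total_zeros;    /* zeros located before the last non-zero coeff */
    int run_before[16]; /* zero run preceding each non-zero coefficient,
                           listed from highest frequency downwards      */
} cavlc_stats;

static cavlc_stats analyze_block(const int zz[16])
{
    cavlc_stats s = {0};
    int last = -1;                          /* index of last non-zero */

    for (int i = 0; i < 16; i++)
        if (zz[i] != 0) { s.total_coeff++; last = i; }
    if (last < 0) return s;                 /* all-zero block */

    /* Trailing ones: walk the non-zero coefficients backwards,
     * skipping zeros, until a magnitude other than 1 appears. */
    for (int i = last; i >= 0 && s.trailing_ones < 3; i--) {
        if (zz[i] == 0) continue;
        if (abs(zz[i]) != 1) break;
        s.trailing_ones++;
    }

    s.total_zeros = (last + 1) - s.total_coeff;

    /* run_before: zeros immediately preceding each non-zero
     * coefficient, highest frequency first. */
    int n = 0;
    for (int i = last; i >= 0; ) {
        int j = i - 1, run = 0;
        while (j >= 0 && zz[j] == 0) { run++; j--; }
        s.run_before[n++] = run;
        i = j;                              /* next non-zero (or -1) */
    }
    return s;
}

int main(void)
{
    /* Example block: 5 non-zero coefficients, 3 trailing ones. */
    const int zz[16] = { 0, 3, 0, 1, -1, -1, 0, 1 };
    cavlc_stats s = analyze_block(zz);
    printf("TotalCoeff=%d TrailingOnes=%d total_zeros=%d\n",
           s.total_coeff, s.trailing_ones, s.total_zeros);
    return 0;
}
```

For this sample block the output is TotalCoeff=5, TrailingOnes=3, total_zeros=3, and the run_before values (1, 0, 0, 1, 1) sum to total_zeros, which is what lets the decoder know when to stop reading zero runs.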
Other syntax elements (motion vectors, macroblock types, etc.) use either exponential Golomb codes or fixed-length codes.
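Exp-Golomb coding is simple enough to show whole. Here is a minimal sketch of the unsigned variant, ue(v), emitting the codeword as a string for readability (a real encoder would write the bits straight into the bitstream):

```c
#include <stdio.h>

/* Unsigned Exp-Golomb code ue(v): M leading zeros, then the value
 * (v + 1) written in binary over M + 1 bits, where M is one less
 * than the bit length of (v + 1). */
static void exp_golomb_ue(unsigned v, char *out)
{
    unsigned x = v + 1;
    int bits = 0;
    for (unsigned t = x; t > 0; t >>= 1) bits++;    /* bit length of x */

    int n = 0;
    for (int i = 0; i < bits - 1; i++) out[n++] = '0';  /* zero prefix */
    for (int i = bits - 1; i >= 0; i--)                 /* x in binary */
        out[n++] = ((x >> i) & 1) ? '1' : '0';
    out[n] = '\0';
}

int main(void)
{
    char buf[64];
    for (unsigned v = 0; v <= 6; v++) {
        exp_golomb_ue(v, buf);
        printf("ue(%u) = %s\n", v, buf);  /* 1, 010, 011, 00100, ... */
    }
    return 0;
}
```

The structure is the appeal: small values (which dominate after good prediction) get short codes, no stored tables are needed, and any codeword can be decoded by counting leading zeros.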
Because CAVLC relies on lookup tables and simple adaptations, it is fast to implement in both hardware and software and requires relatively little memory.
CABAC: Context-Adaptive Binary Arithmetic Coding
CABAC is the more advanced method and is used in the Main and High profiles (common in broadcast, Blu-ray, and streaming).
How CABAC Works
CABAC takes a fundamentally different approach: arithmetic coding, which can represent probabilities with fractional bits, getting closer to the theoretical entropy limit.
The process has three conceptual stages (a toy encoder after the list illustrates the last two):
Binarization: Each syntax element is first converted into a sequence of binary symbols (bins). H.264 defines several binarization schemes (unary, truncated unary, fixed-length, etc.) and even concatenated schemes for elements like motion vector differences.
Context Modeling: Each bin is associated with a probability model (context) chosen based on previously coded information. H.264 defines roughly 400 context models for different syntax element types and positions, each tracking the probability of the bin being 0 or 1.
Binary Arithmetic Coding: The actual compression engine. It maintains an interval and recursively subdivides it according to the current bin’s probability. The final output is a bit string just long enough to identify the final interval, which is how individual bins can cost a fraction of a bit on average.
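The adaptive stages are easier to see in code than in prose. The toy encoder below keeps one adaptive context (simple bin counts) and tracks only the shrinking interval width; real CABAC uses a small integer state machine per context, table-driven updates, and renormalization, none of which is shown. All names are illustrative:

```c
#include <stdio.h>
#include <math.h>

/* One adaptive context: counts of 0-bins and 1-bins seen so far. */
typedef struct { int n0, n1; } context;

static double prob_zero(const context *c)
{   /* Laplace-smoothed estimate of P(bin = 0) for this context. */
    return (c->n0 + 1.0) / (c->n0 + c->n1 + 2.0);
}

int main(void)
{
    /* A heavily skewed bin sequence, as residual flags often are. */
    const int bins[] = {0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0};
    const int n = sizeof bins / sizeof bins[0];

    context ctx = {0, 0};
    double width = 1.0;            /* current interval width        */
    double bits  = 0.0;            /* total cost = -log2(width)     */

    for (int i = 0; i < n; i++) {
        double p0 = prob_zero(&ctx);
        double p  = bins[i] == 0 ? p0 : 1.0 - p0;
        bits  -= log2(p);          /* exact cost of this bin        */
        width *= p;                /* interval shrinks by factor p  */
        if (bins[i] == 0) ctx.n0++; else ctx.n1++;   /* adapt       */
    }

    printf("%d bins -> %.2f bits total (%.2f bits/bin)\n",
           n, bits, bits / n);
    return 0;
}
```

Running it shows 20 bins costing about 14.5 bits, roughly 0.73 bits per bin, which no variable-length code can match. CABAC's bypass mode corresponds to fixing p0 = 0.5 and skipping the model update, which is cheaper because the interval split degenerates into a simple shift.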
A bypass mode exists for bins assumed to be equiprobable (50/50), which skips context modeling and uses a simpler coding path for speed.
CABAC’s continuous probability updates and arithmetic precision allow it to adapt more finely to local statistics, yielding superior compression.
CAVLC vs. CABAC: Head-to-Head Comparison
| Aspect | CAVLC | CABAC |
|---|---|---|
| Compression Efficiency | Good (baseline for lower profiles) | Excellent (typically 10–15% better than CAVLC) |
| Encoding Complexity | Moderate | High (serial nature limits parallelism) |
| Decoding Complexity | Low to moderate | High (2–4× more cycles than CAVLC on average) |
| Memory Requirements | Low (few tables) | Moderate (many context models) |
| Error Resilience | Better (self-synchronizing to some degree) | Lower (a single bit error can desync the arithmetic coder) |
| Parallelism | Good | Limited (sequential dependency) |
| Profiles | Baseline, Extended | Main, High, and higher-tier profiles |
In practice, CABAC’s bitrate savings are most noticeable at lower bitrates and higher resolutions, where statistical redundancies are more pronounced.
When to Choose Which
Use CAVLC when you need low-complexity decoding (mobile devices, real-time videoconferencing), better error resilience (unreliable networks), or compatibility with Baseline profile devices.
Use CABAC for maximum compression efficiency in controlled environments: Blu-ray authoring, broadcast television, high-quality streaming (Netflix, YouTube, etc.), and archival storage.
Most modern hardware decoders support CABAC efficiently, so the complexity penalty is less relevant today than it was in the early 2000s.
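In software encoders the switch is usually a single flag. Here is a rough sketch against libx264’s C API (b_cabac is the real field in x264_param_t; error handling and the rest of the encoder setup are omitted for brevity):

```c
#include <stdint.h>  /* x264.h requires stdint.h to be included first */
#include <x264.h>    /* requires the x264 development headers         */

/* Minimal sketch: selecting the entropy coder when configuring libx264. */
static x264_t *open_encoder(int width, int height, int use_cabac)
{
    x264_param_t p;
    x264_param_default_preset(&p, "medium", NULL);
    p.i_width  = width;
    p.i_height = height;
    p.b_cabac  = use_cabac;               /* 0 = CAVLC, 1 = CABAC */
    if (!use_cabac)
        /* Baseline mandates CAVLC and disables other advanced tools. */
        x264_param_apply_profile(&p, "baseline");
    return x264_encoder_open(&p);
}
```

On the command line the equivalent is x264's --no-cabac (CABAC is on by default), or cabac=0 via FFmpeg's -x264-params.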
Conclusion
CAVLC and CABAC illustrate the engineering trade-offs at the heart of H.264’s success. CAVLC offers simplicity and robustness, while CABAC pushes closer to the Shannon limit with sophisticated arithmetic coding. Together, they helped H.264 achieve widespread adoption across vastly different use cases.
Newer codecs like HEVC and AV1 build on CABAC’s ideas (both use variants of arithmetic coding), but understanding these two H.264 methods remains valuable for anyone working with legacy content or hybrid systems.
If you’re implementing an encoder or tuning presets in x264/FFmpeg, toggling CABAC off with --no-cabac (x264 enables it by default) is a great way to see the difference in action. Let me know in the comments if you’d like deeper details on any part of the process!