The demand for fast, high-quality video delivery has led the video distribution industry to develop a range of video compression standards, known as video codecs.
Let’s understand what H.264 is, and how its encoding and decoding processes work in video streaming.
What is H.264?
H.264 is the most widely adopted video compression standard in the world. It is used to compress video for recording and for distribution across a wide range of streaming channels. It was jointly developed by the Video Coding Experts Group (VCEG) of the ITU-T and the Moving Picture Experts Group (MPEG) of ISO/IEC, which is why it goes by two names: H.264 on the ITU-T side, and MPEG-4 Part 10 or Advanced Video Coding (AVC) on the ISO/IEC side.
H.264 is in high demand because it delivers good streaming quality at lower bitrates than earlier standards such as H.263, MPEG-2, or MPEG-4 Part 2, and because it is straightforward to implement and use.
All in all, H.264 has been a revolutionary tool for streamers and viewers alike, giving them easy access to high-quality, fast-streaming video on a variety of platforms.
How does an H.264 codec work?
An H.264 codec has two halves: an encoder and a decoder. The encoder processes the raw video block by block and compresses it into a bitstream, and the decoder then turns that bitstream back into video frames for playback.
Encoder Processes
The encoder carries out prediction, transformation, and bitstream encoding to produce a compressed H.264 bitstream. H.264 is a block-oriented standard that uses motion compensation to process the frames of the video.
● Prediction
In this step, the encoder forms a prediction of the current macroblock (16×16 pixels, which can be split into blocks as small as 4×4) from previously coded data, either from frames that have already been coded and transmitted (inter-prediction) or from already coded parts of the current frame (intra-prediction). It then generates a residual by subtracting the prediction from the current macroblock.
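The prediction and residual steps can be sketched in a few lines of Python. This is a minimal illustration, not the full standard: it assumes a single 4×4 block and a DC-style intra prediction (every predicted sample is the rounded mean of the neighbouring samples above and to the left). The function names `dc_predict` and `residual` and the sample values are made up for the example.

```python
def dc_predict(top, left):
    """Fill an N x N block with the rounded mean of the neighbouring samples."""
    neighbours = list(top) + list(left)
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    n = len(top)
    return [[dc] * n for _ in range(n)]


def residual(current, prediction):
    """Subtract the prediction from the current block, sample by sample."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


# Hypothetical sample values: neighbours that were already coded, and the
# block that is being coded now.
top = [100, 102, 104, 106]
left = [98, 100, 102, 104]
current = [[101, 103, 105, 107] for _ in range(4)]

prediction = dc_predict(top, left)
res = residual(current, prediction)
print(prediction[0])  # one row of the prediction block
print(res[0])         # one row of the residual block
```

The residual values are much smaller than the original samples, which is what makes the later transform and entropy coding stages effective.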
● Transformation and quantization
Here, a 4×4 or 8×8 integer transform converts each block of residual samples into a set of coefficients. Each coefficient is the weight of a standard basis pattern, and combining the basis patterns according to their weights recreates the block of residual samples.
The block of transform coefficients is then quantized: each coefficient is divided by an integer value. This reduces the precision of the transform coefficients according to a quantization parameter (QP) set in the encoder. The higher the QP, the more coefficients become zero, which gives higher compression (fewer bits) but poorer decoded quality. A lower QP leaves more non-zero coefficients, which improves the decoded image quality but increases the number of bits.
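As a rough sketch of this stage, the Python below applies the 4×4 integer core transform matrix used by H.264 to a residual block and then quantizes the result. The quantizer is deliberately simplified: `step` is a plain integer standing in for the step size that the QP selects, whereas a real encoder derives the step from the QP (it doubles for every increase of 6 in QP) and folds extra per-coefficient scaling into the same operation. The helper names are illustrative only.

```python
# 4x4 forward core transform matrix (an integer approximation of the DCT).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Compute W = CF * X * CF^T for a 4x4 residual block X."""
    return matmul(matmul(CF, block), transpose(CF))


def quantize(coeffs, step):
    """Simplified uniform quantizer: divide by `step` and round to nearest."""
    def q(c):
        level = (abs(c) + step // 2) // step
        return level if c >= 0 else -level
    return [[q(c) for c in row] for row in coeffs]


residual_block = [[-1, 1, 3, 5] for _ in range(4)]  # hypothetical residual
coeffs = forward_transform(residual_block)

fine = quantize(coeffs, 10)    # small step, stands in for a low QP
coarse = quantize(coeffs, 40)  # large step, stands in for a high QP
print(coeffs[0], fine[0], coarse[0])
```

With the larger step, one more coefficient in the first row collapses to zero, which mirrors the trade-off described above: fewer non-zero values to code, at the cost of precision.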
● Bitstream encoding
The encoding process creates syntax elements, which include the quantized transform coefficients, decoding information, video sequence information, and compression structure information. These syntax elements are then converted into a compact binary code using arithmetic or variable-length coding, which forms the encoded bitstream for storage and transmission.
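To give a feel for variable-length coding, here is a small Python sketch of Exp-Golomb codes, the variable-length scheme that H.264 uses for many of its syntax elements. Small values get short codewords and larger values get longer ones. The function names `ue` and `se` mirror the descriptor names used for unsigned and signed Exp-Golomb elements, and the bits are shown as text strings purely for readability.

```python
def ue(code_num):
    """Unsigned Exp-Golomb: leading zeros, then (code_num + 1) in binary."""
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def se(value):
    """Signed Exp-Golomb: map the signed value to a code number, then use ue."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)


for n in range(5):
    print(n, ue(n))

for v in (0, 1, -1, 2, -2):
    print(v, se(v))
```

Coefficient data is coded with more elaborate context-adaptive schemes (variable-length or arithmetic), but the principle is the same: frequent, small values cost fewer bits.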
Decoder Processes
● Bitstream decoding
The decoder reverses the coding process: it reads the compressed bitstream, extracts the information carried by the syntax elements, and uses it to recreate the sequence of video images.
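As an illustration of extracting syntax elements, the Python sketch below parses a run of unsigned Exp-Golomb codewords, a variable-length code that H.264 uses for many syntax elements. It assumes the bits are given as a text string and that the string contains only complete codewords. The names `decode_ue` and `decode_all` are made up for the example.

```python
def decode_ue(bits, pos=0):
    """Read one unsigned Exp-Golomb codeword starting at `pos`.

    Returns the decoded value and the position of the next codeword.
    """
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    value = int(bits[pos + zeros:end], 2) - 1
    return value, end


def decode_all(bits):
    """Decode every codeword in a string of complete codewords."""
    values, pos = [], 0
    while pos < len(bits):
        value, pos = decode_ue(bits, pos)
        values.append(value)
    return values


# Four codewords back to back: "1", "010", "011", "00100".
print(decode_all("101001100100"))
```

The count of leading zeros tells the parser how long each codeword is, so no separators are needed between elements.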
● Rescaling
The quantized transform coefficients are then multiplied by an integer value to restore their original scale. An inverse transform combines the standard basis patterns, weighted by the rescaled coefficients, to recreate each block of residual samples. Finally, these blocks are combined to form a residual macroblock.
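The following Python sketch shows rescaling followed by an inverse transform on one 4×4 block. It is a mathematical illustration under stated assumptions rather than the decoder's exact arithmetic: it inverts the 4×4 forward core transform matrix exactly using fractions, while a real decoder uses a specified integer inverse transform with the scaling folded into the rescaling step. The integer `step` stands in for the value selected by the QP, and all names and sample values are illustrative.

```python
from fractions import Fraction

# 4x4 forward core transform matrix; its rows are orthogonal.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]
NORMS = [4, 10, 4, 10]  # squared length of each row of CF


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def rescale(levels, step):
    """Multiply each quantized level by the step to restore its scale."""
    return [[level * step for level in row] for row in levels]


def inverse_transform(coeffs):
    """Invert W = CF * X * CF^T, giving X = CF^T * S * CF with S scaled by the row norms."""
    scaled = [[Fraction(coeffs[i][j], NORMS[i] * NORMS[j]) for j in range(4)]
              for i in range(4)]
    out = matmul(matmul(transpose(CF), scaled), CF)
    return [[round(v) for v in row] for row in out]


# Hypothetical quantized levels for one block, quantized with step 10.
levels = [[3, -6, 0, -1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
residual_block = inverse_transform(rescale(levels, 10))
print(residual_block[0])
```

Because quantization discarded some precision, the recovered residual is in general only an approximation of what the encoder started with. The coarser the step, the larger the error.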
● Reconstruction
In this stage, the decoder creates a prediction identical to the one created by the encoder and adds it to the residual, reconstructing a decoded macroblock that becomes part of the video frame displayed to the viewer.
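Reconstruction itself is a simple addition, sketched below in Python for a single block. The sum is clipped to the valid sample range (0 to 255 for 8-bit video) so that rounding errors cannot push a pixel out of range. The function name `reconstruct` and the sample values are illustrative.

```python
def reconstruct(prediction, residual, bit_depth=8):
    """Add the residual to the prediction and clip to the valid sample range."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(prediction, residual)]


prediction = [[102] * 4 for _ in range(4)]           # hypothetical prediction
residual_block = [[-1, 1, 3, 5] for _ in range(4)]   # hypothetical residual

decoded = reconstruct(prediction, residual_block)
print(decoded[0])
```

Since the decoder builds its prediction from the same previously decoded data as the encoder, both sides stay in step from block to block.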
Conclusion
H.264 is considered a highly flexible and affordable video processing standard for the video streaming and delivery industry. Its encoding and decoding processes are complex but easy to follow at a high level, and they deliver excellent compression without sacrificing playback quality.