One of the fundamental laws in the world of hardware engineering (although probably not for much longer) is Moore’s Law, which postulates that the integration density of transistor circuits doubles roughly every two years. As a rule of thumb, this would mean that, when AVC came out, integration density was ~2⁷ times lower than it is today. For more in-depth information about this, please see Overview of the HEVC Standard, which is publicly available and goes into great detail on the various processing steps involved. H.264 already employed a similar prediction mechanism, but was limited to only 9 prediction modes, whereas H.265 now allows for 35, improving prediction accuracy significantly. The idea here is to express such blocks not in terms of their pixel values, but in terms of the pattern that the pixels form, and to then store the mathematical approximation of that pattern, further reducing the size of a frame. The same approach is also taken for blocks within a single frame, where H.265 introduces the concept of a Coding Unit that can be a subset of a CTU and might get subdivided even further into so-called Prediction Blocks.
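To make the subdivision idea a bit more tangible, here is a minimal Python sketch of a quadtree split over a single 64x64 block. It is not how an H.265 encoder actually decides on splits (real encoders weigh bit cost against distortion); the `split_block` name, the variance threshold, and the 8x8 minimum size are all assumptions made purely for illustration.

```python
from statistics import pvariance


def split_block(frame, x, y, size, min_size=8, threshold=100.0):
    """Recursively split a square block into quadrants until it is 'flat'.

    frame is a list of rows of brightness values. Returns a list of
    (x, y, size) leaf blocks. The variance test is an illustrative
    stand-in for a real encoder's split decision.
    """
    pixels = [frame[row][col]
              for row in range(y, y + size)
              for col in range(x, x + size)]
    # Stop when the block is uniform enough or cannot get any smaller.
    if size <= min_size or pvariance(pixels) <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += split_block(frame, x + dx, y + dy, half,
                                  min_size, threshold)
    return leaves


def demo_frame(size=64):
    """Dark background with one bright 16x16 square in the top-left corner."""
    frame = [[16] * size for _ in range(size)]
    for row in range(16):
        for col in range(16):
            frame[row][col] = 235
    return frame


flat_leaves = split_block([[16] * 64 for _ in range(64)], 0, 0, 64)
busy_leaves = split_block(demo_frame(), 0, 0, 64)
print(len(flat_leaves), len(busy_leaves))
```

A completely uniform block stays in one piece, while the block with the bright square is only split in the quadrant that actually contains detail. That is the intuition behind variable block sizes: spend small blocks where the image is busy and large ones where it is not.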
This way, particularly in less busy scenes and videos with a static background, you can significantly cut the number of pixels that need to be stored. Wikipedia has a short summary of how they differ from traditional macroblocks and of the potential efficiency gains.

Going beyond pixels

Another common way to increase compression efficiency on moving pictures is to take a snapshot, use it as a reference frame (I-frame), and express subsequent frames in reference to this snapshot; such a frame is generally referred to as a P-frame. Looking at its successor, we see that H.265 has abandoned these so-called macroblocks for so-called coding tree units (CTUs). These CTUs can not only have variable dimensions depending on the structure of the image, but they can also contain up to 64x64 pixels. The main reason for that is the maximum block size of 16x16 pixels, which gives away a lot of optimization headroom once you exceed a certain pixel count. This wasn’t exactly a development the ITU anticipated in the early 2000s, and so H.264 doesn’t compress high-resolution video particularly well.
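To get a feel for what the block size limit means at higher resolutions, the following Python sketch simply counts how many blocks are needed to tile a frame with 16x16 blocks versus 64x64 blocks. It is back-of-the-envelope arithmetic only: the `blocks_per_frame` helper is a made-up name, and the sketch ignores that CTUs can be smaller than 64x64 and are subdivided further in practice.

```python
import math


def blocks_per_frame(width, height, block_size):
    """Number of square blocks needed to tile a frame.

    Partial blocks at the right and bottom edges count as full blocks.
    """
    return math.ceil(width / block_size) * math.ceil(height / block_size)


RESOLUTIONS = {"1080p": (1920, 1080), "4K": (3840, 2160)}

for name, (w, h) in RESOLUTIONS.items():
    macroblocks = blocks_per_frame(w, h, 16)  # H.264 maximum block size
    ctus = blocks_per_frame(w, h, 64)         # H.265 maximum CTU size
    print(f"{name}: {macroblocks} blocks of 16x16 vs {ctus} blocks of 64x64")
```

For a 4K frame this comes out to 32,400 blocks of 16x16 pixels versus 2,040 blocks of 64x64 pixels, so large uniform areas can be described with far fewer units when the larger block size is available.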
We now have widespread distribution of 1080p videos, and any higher-end smartphone can already record 4K videos without a problem. Of the many things that have changed about the video data itself since 2003, resolution has to stand out the most. I will outline a few of them and link to more in-depth information in case you want to get a deeper understanding of how the two standards work under the hood.
While this is technically not required reading for what we are about to do, it is worth briefly talking about the various improvements H.265 provides over H.264. Of course, it is true that new isn’t always better, so I quickly want to make the case as to why you should want H.265 over H.264. Luckily, its more efficient successor HEVC/H.265 has been around for a few years now and, ever since its initial draft in 2013, has seen continuously improved support by device manufacturers, to the point where even low-end smartphones these days have a hardware decoder for the codec.
While that codec was a huge leap in compression ratio and encoding efficiency compared to MPEG-2, the initial draft of the standard was released all the way back in 2003. And while there have been plenty of revisions on it since, the underlying format is arguably dated. Most videos these days, whether they are recorded with phones and DSLRs or downloaded from the internet, are still encoded with the AVC/H.264 codec.