Their weblog recently published a post on why they chose intra-frame :)
A primer on inter- vs. intra-frame video compression
February 21st, 2008 by Sam
The Elemental blog seems to attract an inordinate amount of spam, but it also gets the occasional reasonable question from folks. One of the more confusing aspects of video compression seems to be inter- vs. intraframe compression. Here’s a brief lesson on both.
Video in its simplest sense can be thought of as a series of still images (hence the early days of film were called “moving pictures” until some marketing genius compressed the words and came up with “movies”). When these still images are flipped past the human eye fast enough, the eye interprets the frames as motion instead of unique still images. From a compression perspective, however, they are still just a series of images displayed in a specified order.
Video compression then focuses on how to take these contiguous frames of video and minimize the amount of information needed to code the picture. The natural first step is to compress each individual image. This is known as intraframe compression, and uses only information contained in the current frame to minimize the image size. As an example, JPEG (the standard file format used for images on the Internet) uses the discrete cosine transform to rid images of high-frequency components, which are generally not perceptible by the human psychovisual system; by throwing this information away, a still image can be coded with much less data. This idea has been refined over generations, with early still-image standards like GIF and JPEG laying the foundation for more complex video standards like Motion JPEG and DV, the standard widely used in MiniDV videocameras.
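To make that idea concrete, here is a rough Python/numpy sketch (not from the original post; the block contents and the coefficient cutoff are invented for illustration). It builds the 8x8 orthonormal DCT that JPEG-style intraframe coders use, throws away the high-frequency coefficients of one block, and reconstructs the block from what remains.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis; JPEG applies this transform to 8x8 pixel blocks.
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def compress_block(block, cutoff=4):
    # 2-D DCT of the block, then zero every coefficient outside the
    # low-frequency corner (here u + v < 4, i.e. 10 of the 64 coefficients)
    # before transforming back. The reconstruction stays close to the original
    # because the discarded high frequencies carry little visible detail.
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T
    u, v = np.indices(coeffs.shape)
    coeffs[u + v >= cutoff] = 0.0        # crude stand-in for JPEG quantization
    return d.T @ coeffs @ d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A smooth gradient plus mild noise stands in for natural image content.
    block = np.linspace(0, 255, 64).reshape(8, 8) + rng.normal(0, 5, size=(8, 8))
    recon = compress_block(block)
    print("max pixel error after keeping 10 of 64 coefficients:",
          round(float(np.abs(block - recon).max()), 2))
```

Real JPEG additionally quantizes and entropy-codes the surviving coefficients rather than simply zeroing a fixed set, but the principle is the same: spend bits only on the frequencies the eye can see.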
As demand for better picture quality at lower bit rates increased, however, the compression achievable by intra-only encoding became insufficient. Temporal compression, or interframe encoding, was introduced in the MPEG-1 standard and has since been refined in the MPEG-2, VC-1 and H.264 codecs. These codecs include intraframe-coded images (I-frames) as described in the preceding paragraph, but they also contain predictive-coded frames (P-frames) and bidirectionally-predictive-coded frames (B-frames). P-frames rely on images that were transmitted earlier in the sequence, and use the data in those frames, with minor changes, to create the current frame. B-frames are similar, but can use data from images both earlier and later in the video sequence. There can be many P- and B-frames between each I-frame, and since most video sequences contain similar images for long stretches of time, dramatically higher compression can be achieved. The number of consecutive interframe images is referred to as the Group of Pictures (GOP) length.
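The toy sketch below (again not from the original post, and far simpler than a real codec) shows the core idea of temporal prediction: store the first frame whole as an "I-frame" and each later frame only as its difference from the previous frame, standing in for P-frame prediction without motion compensation or B-frames.

```python
import numpy as np

def encode_gop(frames):
    # Toy temporal coding: the first frame is stored whole (the "I-frame");
    # each later frame is stored only as its difference from the previous
    # frame, a crude stand-in for P-frame prediction (no motion compensation).
    encoded = [("I", frames[0].copy())]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append(("P", cur - prev))
    return encoded

def decode_gop(encoded):
    # Decoding a P-frame needs every frame back to the last I-frame, which is
    # why random access into long-GOP video is expensive.
    frames, prev = [], None
    for kind, data in encoded:
        prev = data.copy() if kind == "I" else prev + data
        frames.append(prev)
    return frames

if __name__ == "__main__":
    base = np.arange(16, dtype=np.int16).reshape(4, 4)
    # Consecutive frames differ only slightly, so the stored differences are
    # mostly small and compress far better than the full frames would.
    frames = [base + t for t in range(5)]
    decoded = decode_gop(encode_gop(frames))
    print("round-trip exact:", all(np.array_equal(a, b) for a, b in zip(frames, decoded)))
```

Real interframe codecs predict blocks from motion-compensated regions of one or more reference frames rather than taking a raw per-pixel difference, but the payoff is the same: when little changes between frames, little needs to be stored.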
The benefit of intraframe-only compression is that it is generally less computationally expensive to process, since it doesn't require multiple frames to be stored in memory and accessed concurrently. There is also less latency in the encoding process, so compressed images are created much more quickly. Hence, digital videocameras have historically captured intraframe-only formats (DV, DV50, DVCPRO HD, AVC-Intra). However, new generations of consumer camcorders with limited storage capacity are relying on interframe-encoded formats like HDV (a long-GOP version of MPEG-2) and AVCHD (a long-GOP version of H.264). These formats allow high-definition video to be stored on the same MiniDV tapes that previously could only capture standard-definition video. Editing these long-GOP formats is incredibly computationally intensive, since for each image displayed, many temporally adjacent frames need to be decoded first. Hence the need for GPU-accelerated decoding and Elemental's RapiHD™ software!
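To put a number on that editing cost, here is a hypothetical back-of-the-envelope calculation (not from the original post): how many frames must be decoded just to display one arbitrary frame, as a function of GOP length. It ignores B-frame reordering, which can require later reference frames as well and only makes the cost worse.

```python
def frames_to_decode(target, gop_length):
    # Simplified random-access cost: to display frame `target`, decode from the
    # most recent I-frame forward. (B-frame reordering is ignored here.)
    last_i = (target // gop_length) * gop_length
    return target - last_i + 1

if __name__ == "__main__":
    # Intra-only video (GOP length 1) always decodes exactly one frame, while a
    # 15-frame GOP can require up to 15 decodes just to display a single image.
    for gop in (1, 15):
        worst = max(frames_to_decode(t, gop) for t in range(300))
        print(f"GOP length {gop:2d}: worst-case frames decoded = {worst}")
```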