How Video Formats Work
January 31, 2018 by Bjorn Roche
Introduction
In my presentation on GIFs, I focused on how GIFs work and how GIFs compare to modern image formats. Let's look a bit at modern video formats and see how they compress data.
Here at GIPHY, we obviously work a lot with the GIF format, but you might be surprised to learn that we also work a lot with video. Much of our incoming content is video, and we often display and deliver video rather than GIFs because modern video formats both look and compress better than GIFs.
But modern video formats are complex beasts: more complex than GIFs. To understand them, we need to think about color formats, chroma subsampling, resolutions, container formats and codecs. I'm assuming you are already familiar with basics like resolution and frame rates, so let's break down the rest.
Color Format and Chroma Subsampling
Many developers are used to thinking about color in terms of red, green and blue, or RGB. These are the colors we use for display, when defining web colors, and often when working with raw pixel data. You might also be familiar with HSB, CMYK, or other "color spaces" used in programs like Photoshop, but you are probably less familiar with the color space used natively in most video formats: YUV (Footnote: For the purposes of this post, we are making some generalizations about color spaces. You might hear about other color spaces in the same family as YUV that have other names like Y'UV, YCbCr and so on. These are different, but since they belong to the same family, we'll treat them as one).
Historically, YUV's advantage was the ability to add color to black and white television broadcasts without interfering with existing signals or adding unnecessary information. While this kind of compatibility is not needed in the digital age, YUV still has a major advantage over other color spaces: it separates luminance (which can be thought of as brightness) from chrominance (which can be thought of as color), so we can apply different levels of compression to brightness and color. This separation is convenient because color is less significant to our perception of image quality than brightness. By reducing the resolution of the chroma components relative to the luminance components, we can significantly reduce the bandwidth requirements of the video with virtually no visual impact. This is called "chroma subsampling", and is indicated using 4:X:Y notation. For example, 4:2:0 video has half the color resolution in both the horizontal and vertical direction, and is extremely common in consumer formats, such as Blu-ray. 4:4:4 video, on the other hand, uses no chroma subsampling, and is usually considered overkill, even for professional video.
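To make the luma/chroma split concrete, here is a minimal sketch in Python. The function names are ours; the coefficients are the standard full-range BT.601 ones. One function converts an RGB pixel to YUV, the other averages each 2x2 block of a chroma plane the way a 4:2:0 encoder might:

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, U, V), full-range BT.601."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.14713 * r - 0.28886 * g + 0.436 * b + 128  # chroma centered at 128
    v = 0.615 * r - 0.51499 * g - 0.10001 * b + 128
    return round(y), round(u), round(v)

def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (a list of rows),
    halving the resolution in both directions, as in 4:2:0."""
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            block = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(block // 4)
        out.append(row)
    return out

# A white pixel has full luma and neutral chroma:
print(rgb_to_yuv(255, 255, 255))   # → (255, 128, 128)

# A 2x2 chroma plane collapses to a single averaged sample:
print(subsample_420([[100, 104], [96, 100]]))  # → [[100]]
```

Note the arithmetic payoff: full-resolution Y plus quarter-resolution U and V is 1 + 1/4 + 1/4 = 1.5 samples per pixel instead of 3, so 4:2:0 halves the raw data before the codec even runs.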
Codecs
Once the color is converted to YUV and subsampled as required, the video can be further compressed using any number of available codecs, such as H.264, VP8, Sorenson and Cinepak. Despite the wide variety, the codecs have much in common and it is possible to make some broad generalizations.
Like image compression formats, modern video formats can compress spatially, meaning pixels can use information from their neighbors to reduce storage requirements. However, they can also compress temporally, which means that frames can use information from nearby frames to reduce storage requirements. To enable temporal compression, individual frames can be divided into three categories:
- Intra-Frames, or I-Frames, contain a complete image. Because of this, they are not dependent on adjacent frames for display, and do not enable temporal compression. I-Frames are sometimes called "key frames."
- Predicted-Frames, or P-Frames, are dependent only on previous frames for display.
- Bidirectional-Frames, or B-Frames, are dependent on both prior and subsequent frames for display.
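The I/P distinction can be illustrated with a toy delta encoder. To be clear, this format is purely illustrative and not any real codec, and B-Frames are omitted since they would also need to reference future frames. I-frames store every pixel; P-frames store only the pixels that changed since the previous frame:

```python
def encode(frames, gop_size=4):
    """Encode a list of frames (each a flat list of pixel values)."""
    encoded, prev = [], None
    for i, frame in enumerate(frames):
        if i % gop_size == 0 or prev is None:
            encoded.append(("I", list(frame)))           # complete image
        else:
            diffs = [(j, p) for j, p in enumerate(frame) if p != prev[j]]
            encoded.append(("P", diffs))                 # changed pixels only
        prev = frame
    return encoded

def decode(encoded):
    """Rebuild the frame list by applying each P-frame's diffs."""
    frames, prev = [], None
    for kind, data in encoded:
        if kind == "I":
            prev = list(data)
        else:
            prev = list(prev)
            for j, p in data:
                prev[j] = p
        frames.append(prev)
    return frames

frames = [[1, 1, 1, 1], [1, 1, 9, 1], [1, 1, 9, 1]]
enc = encode(frames)
print(enc[1])                  # → ('P', [(2, 9)])
print(decode(enc) == frames)   # → True
```

A static scene compresses extremely well here (the third frame stores nothing at all), which is exactly why temporal compression is such a win for video.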
Complicating the matter is the fact that some modern codecs allow frames to be broken up into "macroblocks" or "slices", which can be individually treated as intra, predicted, or bidirectional. Any given frame can be divided into rectangular regions, and those regions can be treated as separate types of frames.
Obviously B-Frames allow for the most compression, but they can be complex to encode and decode. On the other hand, I-Frames allow for the least compression, but are easy to encode and decode. Because I-Frames can be rendered without reference to other frames, they correspond to points in playback that don't require any buffering. As a result, I-Frames play an important part in "scrubbing," skipping, video editing, streaming, and ensuring consistent picture quality.
A set of images bounded by I-Frames is sometimes called a "Group of Pictures" or GOP. A GOP can be thought of as an atomic unit: it can be operated on and transmitted independently, without reference to other video content. GOPs therefore play an important role in streaming, where videos need to be broken into pieces for efficient delivery, and encoding, where videos need to be broken into pieces for parallel processing.
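Assuming "closed" GOPs, where no frame references anything outside its own group, splitting a stream into GOPs is just a matter of cutting at each I-Frame. A hypothetical helper:

```python
def split_gops(frame_types):
    """Split a sequence of frame-type codes ('I', 'P', 'B') into
    Groups of Pictures, each beginning at an I-Frame."""
    gops, current = [], []
    for t in frame_types:
        if t == "I" and current:
            gops.append(current)   # an I-Frame starts a new group
            current = []
        current.append(t)
    if current:
        gops.append(current)
    return gops

print(split_gops("IPPBIPPI"))
# → [['I', 'P', 'P', 'B'], ['I', 'P', 'P'], ['I']]
```

Each resulting group starts with a frame that decodes on its own, which is what lets a streaming server hand out segments, or an encoder farm process them, independently.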
Container Formats
Once compressed, video must be stored in a "container", such as OGG, AVI, FLV, MP4, MPEG, QuickTime, WebM, etc. Containers can be thought of as the box that the encoded data goes in. A physical box might have a shipping label with information about what's inside, and might contain one or more items within. Similarly, video containers store metadata, ranging from information about the codec itself, to copyright and subtitle information. They also allow multiple video and audio streams to be packaged in one file.
Most container formats are organized into "blocks" or "chunks." These blocks allow readers to skip over sections they are unable to read or not interested in. Of course, readers that are unable to read certain chunks may render the video incorrectly, but in principle this design allows formats to be extensible, and allows readers of different types to gather the minimum amount of information they need as easily as possible.
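Here is a minimal sketch of that tag-plus-length chunk design. The 4-byte-tag, 4-byte-length framing is our own simplification, loosely modeled on RIFF-style containers rather than any specific real format. The length field is what lets a reader jump over chunks it doesn't understand:

```python
import io
import struct

def write_chunk(buf, tag, payload):
    """Write one chunk: 4-byte tag, 4-byte little-endian length, payload."""
    buf.write(struct.pack("<4sI", tag, len(payload)))
    buf.write(payload)

def read_known_chunks(buf, known):
    """Walk the stream, keeping chunks we understand and skipping the rest."""
    chunks = {}
    while True:
        header = buf.read(8)
        if len(header) < 8:
            break
        tag, size = struct.unpack("<4sI", header)
        if tag in known:
            chunks[tag] = buf.read(size)
        else:
            buf.seek(size, io.SEEK_CUR)   # skip: the length tells us how far
    return chunks

buf = io.BytesIO()
write_chunk(buf, b"meta", b"codec=h264")
write_chunk(buf, b"wxyz", b"\x00" * 16)    # unknown chunk, will be skipped
write_chunk(buf, b"data", b"...frames...")
buf.seek(0)
print(read_known_chunks(buf, {b"meta", b"data"}))
# → {b'meta': b'codec=h264', b'data': b'...frames...'}
```

A reader built this way keeps working when new chunk types are added to the format later, which is exactly the extensibility the block design buys.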
Of course, not every container format works like this. Some formats, especially older audio formats, are simply divided into metadata and data. But as requirements for things like interleaving information and extensibility have increased, newer formats are usually more complex. Some formats, like the MPEG Transport stream, include error correction and synchronization, which might be useful over unreliable transports, but usually just add overhead when sent over TCP/IP.
Putting It All Back Together
So far, we've discussed the particulars of video formats in the order you might need to think about them for encoding. To decode a file, you need to think about things in the opposite order: first you need to read the container format, then decode with the codec, and finally convert the (usually) YUV data to RGB for display, taking chroma subsampling into account. Because some of the steps involved are lossy, you might not get back exactly what you started with, but for most modern codecs, the goal is for the result to be as close to the original as possible.
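That final display step can be sketched in Python. As before, the function names are ours and the coefficients are the full-range BT.601 inverse; a real player would use the exact matrix signaled by the file. We upsample the 4:2:0 chroma back to full resolution (nearest-neighbor here, for simplicity) and then convert YUV back to RGB:

```python
def upsample_420(plane):
    """Repeat each chroma sample into a 2x2 block (nearest-neighbor)."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in (0, 1)]   # double horizontally
        out.append(wide)
        out.append(list(wide))                    # double vertically
    return out

def _clamp(x):
    return max(0, min(255, round(x)))

def yuv_to_rgb(y, u, v):
    """Convert one (Y, U, V) sample back to 8-bit RGB, full-range BT.601."""
    u, v = u - 128, v - 128                       # re-center chroma
    r = y + 1.13983 * v
    g = y - 0.39465 * u - 0.58060 * v
    b = y + 2.03211 * u
    return _clamp(r), _clamp(g), _clamp(b)

print(upsample_420([[100]]))        # → [[100, 100], [100, 100]]
print(yuv_to_rgb(255, 128, 128))    # → (255, 255, 255)
```

Because the subsampling step threw information away and the conversions round, a full encode/decode round trip generally won't be bit-exact, which is the lossiness mentioned above.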
-Bjorn Roche, Sr. Media Pipeline Engineer
Source: https://engineering.giphy.com/how-video-formats-work/