What Is a Codec?

This is an installment in our ongoing series of “What Is...?” articles, designed to offer definitions, history, and context around significant terms and issues in the online video industry.

Executive Summary

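Codecs are the oxygen of the streaming media market; no codecs, no streaming media. From shooting video to editing to encoding our streaming media files for delivery, codecs are involved in every step of the process. Many video producers also touch the DVD-ROM and Blu-ray markets, as well as broadcast, and codecs play a role there too.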

Though you probably know what a codec is, do you really know codecs? Certainly not as well as you will after reading this article. First we’ll cover the basics of how codecs work, then we’ll examine the different roles performed by various codecs. Next we’ll look at how H.264 became the most widely used video codec today, and we’ll close with a quick discussion of audio codecs.

Codec Basics

Codecs are compression technologies and have two components: an encoder to compress the files, and a decoder to decompress them. There are codecs for data (PKZIP), still images (JPEG, GIF, PNG), audio (MP3, AAC), and video (Cinepak, MPEG-2, H.264, VP8).

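There are two kinds of codecs: lossless and lossy. Lossless codecs, like PKZIP or PNG, reproduce a file identical to the original upon decompression. There are some lossless video codecs, including the Apple Animation codec and the Lagarith codec, but these can’t compress video to data rates low enough for streaming.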

In contrast to lossless codecs, lossy codecs produce a facsimile of the original file upon decompression, but not the original file. Lossy codecs have one immutable trade-off: the lower the data rate, the less the decompressed file looks (or sounds) like the original. In other words, the more you compress, the more quality you lose.
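The difference between the two kinds of codecs can be sketched in a few lines of Python. This is an illustration only, not how any real video codec works: the lossless half uses the standard library's zlib module, and the lossy half stands in for a real codec with simple quantization. The sample values and the function names are assumptions made up for the example.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and then decompress with zlib; the result matches the input exactly."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(samples: list[int], step: int) -> list[int]:
    """Snap each sample to a multiple of `step` (simple quantization).

    A larger step leaves fewer distinct values to store, which stands in for
    a lower data rate, but moves the restored samples further from the originals.
    """
    return [round(s / step) * step for s in samples]


def mean_abs_error(a: list[int], b: list[int]) -> float:
    """Average absolute difference between two equally long sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical 8-bit sample values, chosen only for this illustration.
samples = [3, 18, 47, 92, 130, 171, 204, 255]
raw = bytes(samples)

# Lossless: the decompressed bytes are identical to the original.
assert lossless_round_trip(raw) == raw

# Lossy: the coarser the quantization, the larger the error.
for step in (4, 16, 64):
    restored = lossy_round_trip(samples, step)
    print(step, restored, mean_abs_error(samples, restored))
```

With these sample values the average error grows as the step grows, which is the trade-off described above in miniature.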

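Lossy compression technologies use two types of compression: intra-frame and inter-frame. Intra-frame compression is essentially still image compression applied to video, with each frame compressed without reference to any other. For example, Motion-JPEG uses only intra-frame compression, encoding each frame as a separate JPEG image. The DV codec also uses only intra-frame compression, as does DVCPRO-HD, which essentially divides each HD frame into four SD DV blocks, all encoded solely via intra-frame compression.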

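In contrast, inter-frame compression uses redundancies between frames to compress video. For example, in a talking head scene, most of the background is static. Inter-frame techniques store the static background information once, then store only the changed information in subsequent frames. Inter-frame compression is much more efficient than intra-frame compression, so most codecs are optimized to search for and exploit redundant information between frames.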

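Early CD-ROM based codecs like Cinepak and Indeo used two types of frames for this operation: key frames and delta frames. Key frames stored the complete frame and were compressed using only intra-frame compression. During encoding, the pixels in delta frames were compared to pixels in previous frames, and redundant information was removed. The remaining data in each delta frame was also compressed using intra-frame techniques as necessary to meet the target data rate of the file.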

Figure 1. Key frames and delta frames as deployed by CD-ROM based codecs.

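This is shown in Figure 1, which is a talking head video of the painter shown on the upper left. During the video, the only regions in the frame that change are the mouth, cigar, and eyes. The four delta frames store only the blocks of pixels that have changed, and refer back to the key frame during decompression for the redundant information.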

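In this scenario, which uses an animated file, inter-frame compression is lossless, because you could recreate the original animation bit for bit from the information stored in the key and delta frames. With real-world video the operation isn’t lossless, but it is very efficient, which explains why talking head videos encode at much higher quality than soccer matches or NASCAR races.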
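A minimal sketch of the key frame and delta frame idea follows. It is a toy under stated assumptions, not the Cinepak or Indeo algorithm: frames are short lists of pixel values rather than blocks, each delta is taken against the immediately preceding frame, and the helper names `encode_delta` and `decode_delta` are invented for the example.

```python
def encode_delta(reference, frame):
    """Store only the positions whose values differ from the reference frame."""
    return {i: v for i, (r, v) in enumerate(zip(reference, frame)) if r != v}


def decode_delta(reference, delta):
    """Rebuild a frame by copying the reference and applying the stored changes."""
    frame = list(reference)
    for i, v in delta.items():
        frame[i] = v
    return frame


# Three tiny 8-pixel frames; only the "mouth" pixels (indexes 5 and 6) change.
frames = [
    [9, 9, 9, 9, 9, 1, 1, 9],
    [9, 9, 9, 9, 9, 2, 1, 9],
    [9, 9, 9, 9, 9, 2, 3, 9],
]

# Encode: keep the first frame whole (the key frame), then one delta per later frame.
key_frame = frames[0]
deltas = []
previous = key_frame
for frame in frames[1:]:
    deltas.append(encode_delta(previous, frame))
    previous = frame

# Decode: start from the key frame and apply each delta in order.
decoded = [key_frame]
for delta in deltas:
    decoded.append(decode_delta(decoded[-1], delta))

assert decoded == frames  # bit-for-bit reconstruction
print(deltas)             # each delta holds a single changed pixel
```

Because the static background never appears in the deltas, the two delta frames together store two values instead of sixteen, and the reconstruction is still exact.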

Long GOP Formats

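Inter-frame techniques have advanced since the CD-ROM days, and most codecs, including MPEG-2, H.264, and VC-1, now use three frame types for compression: I-frames, B-frames, and P-frames, as shown in Figure 2. I-frames are the same as key frames and are compressed solely with intra-frame techniques, making them the largest and least efficient frame type.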

Figure 2. I-, B-, and P-frames as used in most advanced compression technologies.

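B-frames and P-frames are both delta frames. P-frames are the simpler of the two, and can use redundant information in any previous I- or P-frame. B-frames are more complex, and can use redundant information in any previous or subsequent I-, B-, or P-frame. This makes B-frames the most efficient of the three frame types.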

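These multiple frame types are stored in a group of pictures, or GOP, which starts with each I-frame and includes all frames up to, but not including, the subsequent I-frame. Codecs that use all three frame types are often called “long GOP formats,” primarily when the codecs are used in non-linear editing systems. This highlights the second fundamental trade-off of lossy compression technologies: quality versus decoding complexity. That is, the higher the quality a codec delivers, the harder it is to decode, particularly in interactive applications like video editing.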
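The GOP definition above can be expressed as a short Python sketch. It assumes the stream is described by a plain string of frame-type letters in display order, which is a simplification chosen for the example; the function name `split_into_gops` is hypothetical.

```python
def split_into_gops(frame_types: str) -> list[str]:
    """Split a string of frame types (such as "IBBPBBP") into groups of pictures.

    Each GOP starts at an I-frame and runs up to, but not including,
    the next I-frame. Frames before the first I-frame, if any, are
    collected into a leading group of their own.
    """
    gops: list[str] = []
    for frame_type in frame_types:
        if frame_type == "I" or not gops:
            gops.append("")  # open a new group
        gops[-1] += frame_type
    return gops


print(split_into_gops("IBBPBBPIBBPI"))
```

For the sample string, the function returns three groups: "IBBPBBP", "IBBP", and "I".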

The first long-GOP format used in non-linear editing systems was HDV, an MPEG-2 based format, and you can imagine the complexity this introduced. With DV and Motion-JPEG, for example, each frame was completely self-contained, so you could drag the editing playhead to any frame in the video and it would decompress in real time.

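With MPEG-2 based HDV, however, if you dragged the playhead to a B-frame, the non-linear editor had to decompress all the frames referenced by that B-frame, which could be located before or after it on the timeline. On the underpowered computer systems of the time, most of which ran 32-bit operating systems that could address only 2GB of memory, long-GOP formats caused significant latency, which made editing unresponsive.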
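To make the extra work concrete, the sketch below walks a reference map to find every frame an editor would have to decode before it could show one target frame. The GOP layout and the `references` table are hypothetical and simplified (real MPEG-2 streams also reorder frames for decoding), and `frames_needed` is a name invented for this example.

```python
def frames_needed(target: int, references: dict[int, list[int]]) -> list[int]:
    """Return every frame index that must be decoded to display `target`.

    `references` maps a frame index to the frames it depends on; the walk
    follows those dependencies until it reaches frames with none (I-frames).
    """
    needed: set[int] = set()
    stack = [target]
    while stack:
        index = stack.pop()
        if index in needed:
            continue
        needed.add(index)
        stack.extend(references.get(index, []))
    return sorted(needed)


# Hypothetical GOP in display order: I0 B1 B2 P3 B4 B5 P6
references = {
    0: [],      # I-frame: self-contained
    1: [0, 3],  # B-frame: uses the frames before and after it
    2: [0, 3],
    3: [0],     # P-frame: uses the previous I-frame
    4: [3, 6],
    5: [3, 6],
    6: [3],     # P-frame: uses the previous P-frame
}

print(frames_needed(0, references))  # the I-frame needs only itself
print(frames_needed(4, references))  # a B-frame pulls in its whole chain
```

In this layout, landing on the I-frame means decoding one frame, while landing on B-frame 4 means decoding frames 0, 3, 4, and 6. An intra-frame format would decode one frame in either case.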

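As camcorders increasingly relied on long GOP formats like MPEG-2 and H.264 to store their data, a new type of codec, generically called an intermediate codec, arrived on the scene. These include CineForm, Apple ProRes, and Avid DNxHD. These codecs use only intra-frame compression techniques for maximum editing responsiveness, and very high data rates to retain quality.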

Function-Specific Codecs

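What these intermediate codecs highlight is that, although there is some crossover, it’s useful to identify codecs by their function, which includes the following classes: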

Acquisition codecs used in cameras

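These include Motion-JPEG for DV and DVCPROHD, MPEG-2 for Sony XDCAM HD and HDV, and H.264 as used in AVCHD and many digital SLR cameras. Here the role of the codec is to capture at the highest possible quality while meeting the data rate requirements of the on-board storage mechanism.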

Intermediate codecs, as identified above, used primarily during editing

As mentioned, in this role these codecs are designed to optimize editing responsiveness and quality.

Delivery codecs

These include MPEG-2 for DVD, broadcast, and satellite; MPEG-2, VC-1, and H.264 for Blu-ray; and H.264, VP6, WMV, WebM, and multiple other formats for streaming delivery. In this role, the codecs must match the data rate mandated by the delivery platform, which, in the case of streaming, is far below the rates used for acquisition.

Codecs and Container Formats

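It’s important to distinguish codecs from container formats, though sometimes they share the same name. Briefly, container formats, or wrappers, are file formats that can contain specific types of data, including audio, video, closed captioning text, and associated metadata. Though there are some general-purpose container formats, like QuickTime, most container formats target one aspect of the production and distribution pipeline, like MXF for file-based capture on a camcorder, and FLV and WebM for streaming Flash and WebM content.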

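In some instances, container formats have a single or predominant codec, like Windows Media Video and the WMV container format. However, most container formats can hold multiple codecs. QuickTime has perhaps the broadest use, with some camcorders capturing MPEG-2 or H.264 video in the QuickTime container format, and lots of videos distributed on iTunes with an MOV extension.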

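One area of potential confusion relates to MPEG-4, which is both a container format (first specified in MPEG-4 part 1 and later revised as part 14) and a codec (MPEG-4 part 2). Technically, at least from the ISO’s perspective, H.264 is also an MPEG-4 codec (MPEG-4 part 10), and it has largely supplanted use of the MPEG-4 part 2 codec. As a container format, an MP4 file can contain video encoded with the MPEG-2, MPEG-4, VC-1, H.263, and H.264 codecs.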

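Encode your MP4 file with VC-1, however, and neither the QuickTime Player nor any iOS device will be able to play the file. In this regard, when producing a file for distribution, it’s critical to choose a codec and container format that are compatible with the playback capabilities of your target viewers.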
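The point that a file can be valid for its container yet unplayable for the viewer can be sketched as a simple lookup. The MP4 codec list comes from the paragraph above; the player profile, its name `EXAMPLE_PLAYER_CODECS`, and the function `can_play` are hypothetical and exist only for illustration, so they should not be read as the documented capabilities of any real player.

```python
# Video codecs the text lists as valid inside an MP4 container.
MP4_CODECS = {"MPEG-2", "MPEG-4", "VC-1", "H.263", "H.264"}

# Hypothetical playback profile, invented for this example only.
EXAMPLE_PLAYER_CODECS = {"H.264", "MPEG-4"}


def can_play(container_codecs: set[str], player_codecs: set[str], codec: str) -> bool:
    """A file plays only if the container accepts the codec AND the player decodes it."""
    return codec in container_codecs and codec in player_codecs


for codec in ("H.264", "VC-1"):
    playable = can_play(MP4_CODECS, EXAMPLE_PLAYER_CODECS, codec)
    print(f"{codec} in MP4: {'plays' if playable else 'does not play'}")
```

Both checks matter: VC-1 passes the container test here but fails the player test, which mirrors the situation described above.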

Most Roads Lead to H.264

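Historically, video codecs evolved along multiple different paths, and it’s interesting that most of those paths led to H.264, which is why the codec has so much momentum today. One path was through the International Organization for Standardization (ISO), whose standards affected the photography, computer, and consumer electronics markets. The ISO published its first video standard, MPEG-1, in 1993, followed by MPEG-2 in 1994, MPEG-4 in 1999, and AVC/H.264 in 2002.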

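The next path was via the International Telecommunication Union (ITU), which is the leading United Nations agency for information and communication technology issues, and which contributed standards for the telephone, radio, and TV markets. The ITU debuted its first video-conferencing related standard, H.120, in 1984, followed by H.261 in 1990, H.262 in 1994, H.263 in 1995, and H.264, which was jointly developed with the ISO, in 2002.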

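As we’ve seen, digital camcorders initially used the DV codec, then transitioned to MPEG-2, which continues to hold a significant share. AVCHD is an H.264-based format, and prosumer camcorders using this format are growing in popularity, as are camcorders using the H.264-based AVC-Intra format. H.264 completely owns the market for digital SLR cameras like the Canon 7D, virtually all of which use H.264.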

On the delivery side, though most cable TV (CATV) broadcasts are still MPEG-2 based, H.264 is gaining momentum in CATV and is widely used in satellite broadcasting. The streaming markets were first dominated by proprietary codecs, initially RealNetworks’ RealVideo, then Microsoft’s Windows Media Video, then On2’s VP6, with Sorenson Video 3 the predominant QuickTime codec. In 2002, Apple’s QuickTime 6 debuted MPEG-4 support, with H.264 added in QuickTime 7 in 2005. In the same year, the first video-capable iPod shipped, also with H.264 (and MPEG-4) support. In 2007, Adobe added H.264 support to Flash, and Microsoft Silverlight support followed in 2008.

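The one market that H.264 doesn’t dominate is intermediate codecs, a role for which a long GOP format isn’t well suited. Otherwise, virtually all other markets, from iPods to satellite TV, are primarily driven by the H.264 codec.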

Audio Codecs

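Finally, since most video is captured with audio, the audio component must also be addressed. The most widely used audio format for acquisition and editing is PCM, which stands for pulse-code modulation, and which is usually stored in either WAV or AVI format on Windows, or AIFF or MOV on the Mac. PCM is considered uncompressed, so it may be more properly characterized as a file format rather than a codec. To preserve quality, most intermediate codecs simply pass through the uncompressed audio delivered by the camcorder.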

Most delivery formats have an associated lossy audio codec, like MPEG audio and AC-3 Dolby Digital compression on DVDs. Most early streaming technologies, like RealVideo and Windows Media, had proprietary audio components, so RealAudio accompanied RealVideo files, just as Windows Media Audio accompanied Windows Media Video.

This dynamic changed most prominently when Adobe paired the VP6 codec with the MP3 audio codec for Flash distribution. The standards-based audio codec for H.264 video is the Advanced Audio Coding (AAC) codec, while WebM pairs the VP8 codec with the open-source Vorbis codec.