Review: NETINT Quadra T1U Video Processing Unit

This review focuses on the NETINT Quadra T1U and explores its capabilities as a video processing unit (VPU) for high-volume encoding and transcoding of single files, encoding ladders, and live streams.

The Quadra T1U VPU uses the Codensity G5 ASIC (application-specific integrated circuit) chip (Figure 1), housed in a wallet-sized U.2 form factor. NETINT calls the product a VPU because, in addition to transcoding, it performs scaling and overlay onboard and has artificial intelligence capabilities, which I did not test in this review. The cost is around $1,500, and 10–20 units can fit in a single server with the necessary U.2 slots. U.2 slots use the same ultra-high-speed PCIe connection as graphics cards. Each Quadra T1U draws only 17 watts of power and delivers more throughput than a computer that draws 400-plus watts.

Figure 1. The Codensity G5 ASIC chip

The Quadra T1U offers the following features:

  • AV1/H.264/HEVC/YUV encoding
  • VP9/H.264/HEVC/YUV decoding
  • Onboard scaling
  • Onboard overlay
  • Two AI deep neural network engines

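As mentioned, the key component of the Quadra T1U is the ASIC chip. Transcoders with ASIC chips hold significant advantages over CPU-based and GPU-based encoders since they can be designed for a specific purpose, in this case, transcoding. Other key advantages of ASIC chips are that they allow for smaller devices, perform specialized tasks, and improve efficiency by reducing energy consumption.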

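Quadra T1U Setup

Quadra T1U hardware setup is simple. In addition to the U.2 form factor, the Quadra T1U comes in a PCIe form factor, similar to a network card or GPU that users can install themselves. Individuals with previous experience working on computers should be able to set up this device.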

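For software install, NETINT works with FFmpeg and GStreamer and has an SDK with an API. The Quadra T1U ships with scripts that automate the software installation process for users. You can read about installing the product’s hardware and software in this LinkedIn post.

For the testing in this review, NETINT installed the Quadra T1U in a remote server and configured it for me. The company also supplied scripts for my tests. So all I had to do was connect via the Bitvise SSH Client to run my tests.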

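Working With the Quadra T1U

As mentioned earlier, when running the Quadra T1U, you can use FFmpeg or GStreamer scripts or drive it directly through the API. I tested by connecting to the remote computer and running scripts in a terminal. You’ll also need a tool for generating reports for video quality metrics if you want to generate VMAF, SSIM, and PSNR scores.

The reviewer’s guide I used was written for Windows computers, and the open source programs for connecting and measuring quality were all Windows-based. For this reason, I tested using a Windows computer.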

The Bitvise SSH Client and FFMetrics tools recommended by NETINT for my testing both run on Windows. The Bitvise SSH Client is used to connect to the server running the Quadra T1U and perform various tasks. What’s helpful about this tool is that it makes it easy to connect to the server and open multiple terminal windows to run commands. You need multiple terminal windows for functions like reviewing encoding status and CPU usage. The Bitvise SSH Client can be downloaded from go2sm.com/bitvise.

FFMetrics is used to generate VMAF, SSIM, and PSNR scores. To review and test the Quadra T1U, I used Windows Server 2019 on an Amazon Web Services EC2 instance to connect to the Quadra T1U and run testing scripts. I installed the Bitvise SSH Client and FFMetrics on the Windows server. A beta version of FFMetrics can be found here.

It’s worth noting that the Quadra T1U has no GUI. You’ll need some Bash scripting experience using the terminal to run scripts once connected with the Bitvise SSH Client.

Using the Quadra T1U

Once the Quadra T1U is installed and set up, you can begin using the VPU to run encodes. Before you start encoding, though, you’ll need to connect your Bitvise SSH Client to the server and run some basic commands using terminal windows to get the Quadra T1U ready for use.

First, launch the Bitvise SSH Client. Enter your IP information in the Host section along with your port number, as shown in Figure 2 (below). Next, add your username and password. Then click the Log in button.

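Figure 2. Logging in to the Bitvise SSH Client

Next, it’s time to run some commands to start working with the Quadra T1U. Once you’re logged in with Bitvise, you can open terminal windows to run commands or navigate to folders where you may want to run tests.

To open a terminal window, click the New terminal console button on the left in Figure 3 (below). You can open up to 10 terminal windows, a limit I never came close to in my testing.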

Figure 3. Opening a terminal window and an SFTP window

To navigate to directories on the Quadra T1U, click New SFTP window, and choose the directory. First, open a terminal window, and run the following command to initialize the Quadra T1U:

init_rsrc

Second, open another terminal window to monitor throughput during testing, then run this command:

ni_rsrc_mon

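This command runs a monitoring utility that refreshes every 5 seconds. It monitors decoder/encoder/scaler utilization.

Next, open another terminal window. In this window, you’ll track how many instances of FFmpeg are running simultaneously, and you’ll also monitor overall system load. Run the command: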

top

This terminal window tracks the running instances of FFmpeg and the overall system load, including when encoding is not taking place. Overall system utilization is low during Quadra T1U operation and quite high when encoding using CPU-only codecs like x265 and x264.
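If you only want the FFmpeg instance count rather than the full top display, a small helper can report it. This is a minimal sketch, assuming a Linux server where the encodes run as processes named exactly ffmpeg; count_ffmpeg is a hypothetical name, not part of the NETINT tooling.

```shell
#!/usr/bin/env bash
# Count running processes whose name is exactly "ffmpeg" by reading /proc.
# Assumes a Linux host; on systems without /proc the count is simply 0.
count_ffmpeg() {
  local count=0 comm
  for comm in /proc/[0-9]*/comm; do
    # A process may exit between the glob and the read; skip it quietly.
    if [ "$(cat "$comm" 2>/dev/null)" = "ffmpeg" ]; then
      count=$(( count + 1 ))
    fi
  done
  echo "$count"
}

echo "FFmpeg instances running: $(count_ffmpeg)"
```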

Finally, open a fourth window to run scripts. To run a single script, navigate to the folder where your script is, and run chmod +x on the script. It looks like this in the terminal window:

chmod +x scriptname.sh
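Marking a script executable and then launching it can be wrapped in one small helper. This is a sketch, not one of the NETINT-supplied scripts; run_script is a hypothetical name, and scriptname.sh stands in for whichever test script you want to run.

```shell
#!/usr/bin/env bash
# Make a script executable, then run it from its own folder.
run_script() {
  local script="$1"
  if [ ! -f "$script" ]; then
    echo "not found: $script" >&2
    return 1
  fi
  chmod +x "$script"
  # Run in a subshell so the caller's working directory is unchanged.
  ( cd "$(dirname "$script")" && "./$(basename "$script")" )
}

# Usage (scriptname.sh is a placeholder):
#   run_script ./scriptname.sh
```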

Testing the Quadra T1U

The specs of the Ubuntu server used for the testing in this review are as follows:

  • AMD Ryzen 5 5600X 6-core CPU running at 2,200 MHz
  • Two threads per core
  • 12 CPU threads
  • 16GB RAM

The server has six cores and 12 threads, so total available system CPU is 1,200%.
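The 1,200% figure follows from how top reports CPU on Linux by default: each logical CPU (thread) counts as 100%. The arithmetic can be sketched as follows; cpu_capacity_percent is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Total CPU capacity as top reports it: 100% per logical CPU (thread).
cpu_capacity_percent() {
  local threads="$1"
  echo $(( threads * 100 ))
}

# The review server: 6 cores x 2 threads per core = 12 threads.
cpu_capacity_percent 12

# On another Linux machine, nproc (coreutils) supplies the thread count:
#   cpu_capacity_percent "$(nproc)"
```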

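For this review, I was interested in learning whether the Quadra T1U could benefit colleges and universities like mine, The Ohio State University. Given our video needs, the university encodes thousands of videos each week for on-demand viewing. Many of these encodes use encoding ladders. There aren’t as many weekly live streams.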

Here are the questions I hoped to answer in this review:

  • Could encoding with the Quadra T1U provide a significant reduction in CPU usage for single-file encodes and encoding ladders?
  • Could significantly more encodes be performed using the Quadra T1U compared to CPU-based encoding?
  • Would the quality of encoding using the Quadra T1U be the same as FFmpeg encodes or better?

For my testing, NETINT provided guidance and instructions on the best approaches to testing the Quadra T1U in ways that would benefit video engineers and reflect how its customers use the product.

Here are the tests I ran:

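Throughput: Single-File Encoding

  • H.264, HEVC, AV1 (using the Quadra T1U)
  • x264/x265 (using FFmpeg)

Throughput: Ladder Encoding

  • H.264, HEVC, AV1 (using the Quadra T1U)
  • x264/x265 (using FFmpeg)

Quality: H.264/HEVC

  • Throughput-optimized (Quadra T1U vs. FFmpeg)
  • Quality-optimized (Quadra T1U vs. FFmpeg)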

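Throughput Testing: Single File

First, I’ll discuss single-file throughput testing. On the Quadra T1U, I ran a master script to perform 32 simultaneous FFmpeg encodes with one of the selected Quadra hardware codecs. Each encode read a 1080p file from a RAM drive to simulate live operation. Details about the “Football” source are shown in Figure 4 (below). This source was used for the testing throughout the review.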

Figure 4. Details of the source used in this review, as shown in MediaInfo
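For readers who want to reproduce the RAM-drive input, one common approach on Linux is to copy the source into a tmpfs mount such as /dev/shm before encoding. The review does not say how the RAM drive on the test server was created, so treat this as an assumption; stage_in_ram and the file name football_1080p.mp4 are hypothetical.

```shell
#!/usr/bin/env bash
# Copy a source file into a RAM-backed directory and print the new path.
# /dev/shm is a tmpfs mount on most Linux systems (assumed here).
stage_in_ram() {
  local src="$1" ramdir="${2:-/dev/shm}"
  cp -- "$src" "$ramdir/" || return 1
  echo "$ramdir/$(basename "$src")"
}

# Usage (hypothetical file name):
#   input=$(stage_in_ram ./football_1080p.mp4)
```

Reading the input from memory keeps disk I/O from limiting the encodes, which is the point of simulating live operation this way.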

To run each test, I navigated the server and selected and ran a test command similar to this:

./test_32_H264.sh

The 32 in the name shows the number of simultaneous encodes, achieved by calling and running 32 separate encoding scripts. When converting a single file to a single output with each of the three codecs, the Quadra produced 32 simultaneous 30 fps transcodes, which I verified in the encoding logs. In sharp contrast, when encoding with only FFmpeg and the CPU, the server produced just five FFmpeg x264 encodes and three FFmpeg x265 encodes.
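The master-script pattern, one script that starts many encodes at once and waits for all of them, can be sketched in a few lines of shell. This is not NETINT’s script; launch_parallel and encode_one.sh are hypothetical names, and each job’s output goes to its own log file.

```shell
#!/usr/bin/env bash
# Start N copies of a command in the background, log each one, wait for all.
# Prints the number of jobs that exited successfully.
launch_parallel() {
  local count="$1" logdir="$2" pids="" ok=0 i=1 pid
  shift 2
  mkdir -p "$logdir"
  while [ "$i" -le "$count" ]; do
    "$@" > "$logdir/encode_$i.log" 2>&1 &
    pids="$pids $!"
    i=$(( i + 1 ))
  done
  # Wait on each job individually so failures can be counted.
  for pid in $pids; do
    if wait "$pid"; then
      ok=$(( ok + 1 ))
    fi
  done
  echo "$ok"
}

# Usage (encode_one.sh is a placeholder for a single-encode script):
#   launch_parallel 32 ./logs ./encode_one.sh
```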

Figure 5 (below) shows what the script looks like for single-file throughput testing using the Quadra T1U.

Figure 5. The Quadra command string with explanations

Figure 6 (below) shows the command string used for x264 FFmpeg encodes.

Figure 6. The x264 command string with explanations

Finally, Figure 7 (below) shows the command string for x265 encodes.

Figure 7. The x265 command string with explanations

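Once you submit a script to the Quadra for encoding, you’ll see the number of decoders, encoders, scalers, and FFmpeg instances. A ModelLoad of 100 means that your encoding has hit the limit. That’s why the encodes for each codec peaked at 32, which is the maximum capacity of the Quadra T1U.

During a Quadra T1U encode, the Ubuntu top utility shows CPU utilization and the number of FFmpeg encodes running in this terminal window. CPU usage was extremely low when encoding on the Quadra T1U but noticeably higher when running FFmpeg CPU encodes. This happened consistently in my testing.

In my testing, I could sustain only five simultaneous FFmpeg x264 encodes. Since the CPU max is 1,200% and my test showed CPU usage at 963%, another successful encode didn’t seem impossible. But when the system attempted six encodes, it couldn’t maintain a 30 fps frame rate.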
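A rough calculation shows why the sixth encode was borderline. Assuming CPU load scales linearly with the number of encodes (a simplification; real x264 load varies with content and thread scheduling), five encodes at 963% works out to about 193% each, so six would need roughly 1,156% of the 1,200% available, leaving almost no headroom. projected_cpu is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Project total CPU% for a target number of encodes, assuming linear scaling
# from a measured load (measured_percent at measured_encodes).
projected_cpu() {
  awk -v pct="$1" -v n="$2" -v target="$3" \
    'BEGIN { printf "%.1f\n", pct / n * target }'
}

# Measured in the review: 963% at 5 encodes. Project the load for 6.
projected_cpu 963 5 6
```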

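Figure 8 (below) shows a summary of my single-file throughput test results. The Expected column shows the number of encodes included in each master script, and the Actual column shows what happened during testing.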

Figure 8. Results of throughput testing: single file
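The Actual column comes down to counting how many encodes held 30 fps. One way to check that from the logs is sketched below, assuming each encode’s FFmpeg console output (which reports progress as fps= values) was saved to its own log file. last_fps, count_realtime, and the log paths are hypothetical, and the scripts NETINT supplied may verify this differently.

```shell
#!/usr/bin/env bash
# Extract the last "fps=" value FFmpeg reported in a saved log.
# FFmpeg separates progress updates with carriage returns, so split on them.
last_fps() {
  tr '\r' '\n' < "$1" | grep -o 'fps= *[0-9.]*' | tail -n 1 | tr -dc '0-9.'
}

# Count the logs whose final fps is at least the threshold.
count_realtime() {
  local threshold="$1" ok=0 log fps
  shift
  for log in "$@"; do
    fps=$(last_fps "$log")
    # Skip logs that contain no progress line at all.
    [ -n "$fps" ] || continue
    if awk -v f="$fps" -v t="$threshold" 'BEGIN { exit !(f + 0 >= t + 0) }'; then
      ok=$(( ok + 1 ))
    fi
  done
  echo "$ok"
}

# Usage (hypothetical log names):
#   count_realtime 30 ./logs/encode_*.log
```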

Using FFmpeg to control the Quadra T1U, I ran 32 simultaneous encodes for each Quadra hardware codec on one Quadra T1U module. For CPU-based encodes with FFmpeg, the maximum was five for x264 and three for x265. This illustrates the advantages that ASIC-based encoding has over CPU-based encoding and the potential cost savings per stream.
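To put rough numbers on that, the figures quoted in this review (about $1,500 and 17 watts per Quadra T1U, and 32 simultaneous 1080p 30 fps encodes per module) work out to roughly $47 of hardware and about half a watt per stream. The sketch below does the division; it covers only the module itself and ignores the host server’s cost and power, and per_stream is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Divide a module-level total (dollars or watts) by the stream count.
per_stream() {
  awk -v total="$1" -v streams="$2" \
    'BEGIN { printf "%.2f\n", total / streams }'
}

per_stream 1500 32   # hardware dollars per stream
per_stream 17 32     # watts per stream
```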
