

Hardware-Accelerated Video Decoding and Encoding

TorchAudio can make use of hardware-based video decoding and encoding supported by the underlying FFmpeg libraries that are linked at runtime.

Using NVIDIA’s GPU decoder and encoder, it is also possible to pass CUDA tensors around directly, that is, to decode video into a CUDA tensor or encode video from a CUDA tensor without moving data to or from the CPU. This improves video throughput significantly. However, please note that not all video formats are supported by hardware acceleration.

This page goes through how to build FFmpeg with hardware acceleration. For details on the performance of the GPU decoder and encoder, please see Hardware-Accelerated Video Decoding and Encoding.

Overview

Using the GPU decoder and encoder in TorchAudio requires additional FFmpeg configuration.

In the following, we look into how to enable GPU video decoding with NVIDIA’s Video Codec SDK. To use NVENC/NVDEC with TorchAudio, the following items are required.

1. NVIDIA GPU with hardware video decoder/encoder.
2. FFmpeg libraries compiled with NVDEC/NVENC support.

TorchAudio’s official binary distributions are compiled to work with FFmpeg 4 libraries, and they contain the logic required for hardware-based decoding/encoding.
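Before going further, it can help to check whether an FFmpeg build exposes the NVDEC/NVENC codecs at all. The following is a minimal sketch, not part of TorchAudio: it assumes an `ffmpeg` command-line binary is on the PATH and that NVDEC decoders and NVENC encoders are listed with `_cuvid` and `_nvenc` suffixes (for example `h264_cuvid`, `h264_nvenc`). The helper names `parse_codec_names` and `list_codecs` are hypothetical. Note that the `ffmpeg` binary on the PATH is not necessarily built from the same libraries that TorchAudio links at runtime, so treat the result as a hint only.

```python
import shutil
import subprocess


def parse_codec_names(listing, marker):
    """Return codec names containing `marker` from `ffmpeg -decoders`/`-encoders` output."""
    names = []
    for line in listing.splitlines():
        parts = line.split()
        # Codec rows look like: " V..... h264_cuvid  Nvidia CUVID H264 decoder (codec h264)"
        # Legend rows such as " V..... = Video" have "=" in the name column and are skipped.
        if len(parts) >= 2 and marker in parts[1]:
            names.append(parts[1])
    return names


def list_codecs(kind):
    """Run `ffmpeg -hide_banner -decoders` or `-encoders`; return "" if ffmpeg is missing."""
    if shutil.which("ffmpeg") is None:
        return ""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", f"-{kind}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    print("NVDEC decoders:", parse_codec_names(list_codecs("decoders"), "_cuvid"))
    print("NVENC encoders:", parse_codec_names(list_codecs("encoders"), "_nvenc"))
```

An empty list for either line suggests that this FFmpeg build was not compiled with the corresponding NVDEC or NVENC support.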

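Once the requirements are met, decoding video directly into a CUDA tensor might look like the sketch below. It assumes a TorchAudio release that provides `torchaudio.io.StreamReader`, an FFmpeg build that includes the `h264_cuvid` decoder, and an H.264 input. The helper names `nvdec_stream_options` and `decode_first_chunk` are hypothetical and used only for illustration.

```python
def nvdec_stream_options(codec="h264", gpu_index=0):
    """Build keyword arguments for StreamReader.add_video_stream (hypothetical helper)."""
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    return {
        "decoder": f"{codec}_cuvid",      # FFmpeg's NVDEC-backed decoder, e.g. h264_cuvid
        "hw_accel": f"cuda:{gpu_index}",  # keep decoded frames on this GPU
    }


def decode_first_chunk(src, frames_per_chunk=5):
    """Decode the first chunk of `src` with NVDEC and return it as a tensor."""
    # Imported lazily so the helper above can be used without TorchAudio installed.
    from torchaudio.io import StreamReader

    reader = StreamReader(src)
    reader.add_video_stream(frames_per_chunk, **nvdec_stream_options())
    reader.fill_buffer()
    (chunk,) = reader.pop_chunks()
    return chunk
```

If the setup works, the tensor returned by `decode_first_chunk` should report a CUDA device (for example via `chunk.device`), which indicates that the frames were not copied through CPU memory.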
