Audio compression with EnCodec and OpenVINO

Compression is an important part of the Internet today because it enables people to easily share high-quality photos, listen to audio messages, stream their favorite shows, and so much more. Even with today’s state-of-the-art techniques, enjoying these rich multimedia experiences requires a high-speed Internet connection and plenty of storage space. AI helps to overcome these limitations: "Imagine listening to a friend’s audio message in an area with low connectivity and not having it stall or glitch."

In this tutorial, we consider how to use OpenVINO and the EnCodec algorithm for hyper-compression of audio. EnCodec is a real-time, high-fidelity neural audio codec: it uses AI to compress audio heavily while aiming to preserve perceived quality. It was introduced in the paper High Fidelity Neural Audio Compression by Meta AI. More details about this approach can be found in the Meta AI blog and the original repo.

Notebook Contents

This notebook demonstrates how to convert and run the EnCodec model using OpenVINO.

The notebook contains the following steps:

Instantiate and run an EnCodec audio compression pipeline.
Convert the models to OpenVINO IR format using the model conversion API.
Integrate OpenVINO into the EnCodec pipeline.
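The conversion step can be sketched roughly as follows. This is a minimal sketch, not the notebook's own code: the helper names `ir_path`, `convert_to_ir` and `compile_ir` and the directory layout are hypothetical, and this section does not say which EnCodec submodules get converted, so `torch_module` stands for any PyTorch module. The OpenVINO calls used are `ov.convert_model`, `ov.save_model` and `ov.Core().compile_model`.

```python
from pathlib import Path


def ir_path(model_dir, name):
    """Path of the IR .xml file for a component called `name` (hypothetical layout)."""
    return Path(model_dir) / f"{name}.xml"


def convert_to_ir(torch_module, example_input, xml_path):
    """Convert a PyTorch module to OpenVINO IR, skipping work if the file exists."""
    xml_path = Path(xml_path)
    if not xml_path.exists():
        import openvino as ov  # third-party; imported only when conversion is needed

        xml_path.parent.mkdir(parents=True, exist_ok=True)
        # example_input lets OpenVINO trace the module
        ov_model = ov.convert_model(torch_module, example_input=example_input)
        ov.save_model(ov_model, str(xml_path))  # writes the .xml and a matching .bin
    return xml_path


def compile_ir(xml_path, device="CPU"):
    """Load an IR file and compile it for the given device."""
    import openvino as ov  # third-party

    return ov.Core().compile_model(str(xml_path), device)
```

Skipping conversion when the IR file already exists keeps repeated notebook runs cheap. The compiled model returned by `compile_ir` is callable on input arrays, which is the piece that would be plugged back into the pipeline in the integration step.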

As a result, we get a pipeline that accepts an input audio file and converts it to a compressed representation, ready to be saved to disk or sent to a recipient. The compressed representation can then be decompressed back into audio.
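The compress and decompress round trip can be illustrated with the `encodec` Python package. This is a minimal sketch under stated assumptions rather than the notebook's pipeline: the helper names `load_encodec_24khz`, `roundtrip` and `main` are hypothetical, the 24 kHz model and the 6.0 kbps target bandwidth are arbitrary choices, and one second of silence stands in for a real audio file.

```python
def load_encodec_24khz(bandwidth_kbps=6.0, pretrained=True):
    """Build the 24 kHz EnCodec model (downloads weights when pretrained=True)."""
    from encodec import EncodecModel  # third-party; imported lazily

    model = EncodecModel.encodec_model_24khz(pretrained=pretrained)
    model.set_target_bandwidth(bandwidth_kbps)  # assumption: 6.0 kbps chosen arbitrarily
    return model.eval()


def roundtrip(model, wav):
    """Compress `wav` with model.encode, then restore it with model.decode."""
    frames = model.encode(wav)       # compressed representation
    restored = model.decode(frames)  # reconstructed audio
    return frames, restored


def main():
    import torch  # third-party; imported lazily

    model = load_encodec_24khz()
    # Placeholder input: one second of silence, shaped [batch, channels, samples].
    wav = torch.zeros(1, model.channels, model.sample_rate)
    with torch.no_grad():
        frames, restored = roundtrip(model, wav)
    # Each frame is a (codes, scale) pair; the codes are what gets stored or sent.
    codes = torch.cat([frame_codes for frame_codes, _ in frames], dim=-1)
    print("codes:", tuple(codes.shape), "restored:", tuple(restored.shape))
```

Calling `main()` requires `torch` and `encodec` to be installed. `roundtrip` only relies on the model exposing `encode` and `decode`, so the same function would work unchanged once those stages are backed by OpenVINO.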

Installation Instructions

This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to the Installation Guide.