Spaces:
Runtime error
Runtime error
# Accelerate Inference of NLP models with Post-Training Quantization API of NNCF | |
[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/language-quantize-bert/language-quantize-bert.ipynb) | |
This tutorial demonstrates how to apply INT8 quantization to the Natural Language Processing model BERT, | |
using the [Post-Training Quantization API](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/quantizing-models-post-training/basic-quantization-flow.html). | |
The [HuggingFace BERT](https://huggingface.co/docs/transformers/model_doc/bert) [PyTorch](https://pytorch.org/) model, | |
fine-tuned for [Microsoft Research Paraphrase Corpus (MRPC)](https://www.microsoft.com/en-us/download/details.aspx?id=52398) task | |
is used. The code of this tutorial is designed to be extendable to custom models and datasets. | |
## Notebook Contents | |
The tutorial consists of the following steps: | |
* Downloading and preparing the MRPC model and a dataset. | |
* Defining data loading functionality. | |
* Running optimization pipeline. | |
* Comparing F1 score of the original and quantized models. | |
* Comparing performance of the original and quantized models. | |
## Installation Instructions | |
This is a self-contained example that relies solely on its own code.</br> | |
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. | |
For details, please refer to [Installation Guide](../../README.md). | |