Optimized and Quantized DistilBERT with a custom pipeline with handler.py
NOTE: Blog post coming soon
This is a template repository for text classification using Optimum and onnxruntime, set up to support generic inference with the Hugging Face Hub generic Inference API. There are two required steps:
- Specify the requirements by defining a `requirements.txt` file.
- Implement the `handler.py` `__init__` and `__call__` methods. These methods are called by the Inference API. The `__init__` method should load the Optimum model and tokenizer and build the `text-classification` pipeline needed for inference. It is called only once. The `__call__` method performs the actual inference. Make sure to follow the same input/output specifications defined in the template for the pipeline to work.
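The two methods described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual `handler.py`: the class name `PreTrainedPipeline`, the helper `load_onnx_pipeline`, and the `loader` argument are assumptions introduced here, and the return shape simply mirrors what a `text-classification` pipeline usually produces. Check the template for the exact input/output specification.

```python
from typing import Any, Callable, Dict, List


def load_onnx_pipeline(path: str) -> Callable[..., Any]:
    """Hypothetical helper: build a text-classification pipeline on ONNX Runtime."""
    # Imported lazily so this module can be imported without the heavy packages.
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    # Assumption: the ONNX weights in `path` use a file name that Optimum finds
    # by default; a quantized file with another name may need `file_name=...`.
    model = ORTModelForSequenceClassification.from_pretrained(path)
    tokenizer = AutoTokenizer.from_pretrained(path)
    return pipeline("text-classification", model=model, tokenizer=tokenizer)


class PreTrainedPipeline:
    """Sketch of a handler class; the class name is an assumption."""

    def __init__(self, path: str = "", loader: Callable[[str], Any] = load_onnx_pipeline):
        # Called once at startup: load model, tokenizer, and pipeline here.
        self.pipeline = loader(path)

    def __call__(self, inputs: str) -> List[Dict[str, Any]]:
        # Called on every request: run the actual inference on `inputs`.
        return self.pipeline(inputs)
```

The `loader` argument exists only so the class can be exercised without downloading a model, for example by passing a function that returns a stub pipeline. In a deployed handler, the default loader would be used.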
Add `library_name: generic` to the README.
Note: the `generic` community image currently supports only `inputs` as a parameter; no other parameters are supported.