|
--- |
|
pipeline_tag: token-classification |
|
datasets: |
|
- conll2003 |
|
metrics: |
|
- precision |
|
- recall |
|
- f1 |
|
- accuracy |
|
tags: |
|
- distilbert |
|
--- |
|
|
|
**task**: `token-classification` |
|
**Backend:** `sagemaker-training` |
|
**Backend args:** `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': 'avx512_vnni'}` |
|
**Number of evaluation samples:** `1000` |
|
|
|
Fixed parameters: |
|
* **model_name_or_path**: `elastic/distilbert-base-uncased-finetuned-conll03-english` |
|
* **dataset**: |
|
* **path**: `conll2003` |
|
* **eval_split**: `validation` |
|
* **data_keys**: `{'primary': 'tokens'}` |
|
* **ref_keys**: `['ner_tags']` |
|
* **calibration_split**: `train` |
|
* **node_exclusion**: `[]` |
|
* **per_channel**: `False` |
|
* **calibration**: |
|
* **method**: `minmax` |
|
* **num_calibration_samples**: `100` |
|
* **framework**: `onnxruntime` |
|
* **framework_args**: |
|
* **opset**: `11` |
|
* **optimization_level**: `1` |
|
* **aware_training**: `False` |
|
|
|
Benchmarked parameters: |
|
* **quantization_approach**: `dynamic`, `static` |
|
* **operators_to_quantize**: `['Add', 'MatMul']`, `['Add']` |
|
|
|
# Evaluation |
|
## Non-time metrics |
|
| quantization_approach | operators_to_quantize | | precision (original) | precision (optimized) | | recall (original) | recall (optimized) | | f1 (original) | f1 (optimized) | | accuracy (original) | accuracy (optimized) | |
|
| :-------------------: | :-------------------: | :-: | :------------------: | :-------------------: | :-: | :---------------: | :----------------: | :-: | :-----------: | :------------: | :-: | :-----------------: | :------------------: | |
|
| `dynamic` | `['Add', 'MatMul']` | \| | 0.937 | 0.937 | \| | 0.953 | 0.953 | \| | 0.945 | 0.945 | \| | 0.988 | 0.988 | |
|
| `dynamic` | `['Add']` | \| | 0.937 | 0.937 | \| | 0.953 | 0.953 | \| | 0.945 | 0.945 | \| | 0.988 | 0.988 | |
|
| `static` | `['Add', 'MatMul']` | \| | 0.937 | 0.074 | \| | 0.953 | 0.253 | \| | 0.945 | 0.114 | \| | 0.988 | 0.363 | |
|
| `static` | `['Add']` | \| | 0.937 | 0.065 | \| | 0.953 | 0.186 | \| | 0.945 | 0.096 | \| | 0.988 | 0.340 | |
|
|
|
## Time metrics |
|
Time benchmarks were run for 3 seconds per config. |
|
|
|
|
|
Below, time metrics for batch size = 1, input length = 64. |
|
|
|
| quantization_approach | operators_to_quantize | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) | |
|
| :-------------------: | :-------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: | |
|
| `dynamic` | `['Add', 'MatMul']` | \| | 57.64 | 12.30 | \| | 17.67 | 81.33 | |
|
| `dynamic` | `['Add']` | \| | 43.51 | 29.42 | \| | 23.00 | 34.00 | |
|
| `static` | `['Add', 'MatMul']` | \| | 43.05 | 21.11 | \| | 23.33 | 47.67 | |
|
| `static` | `['Add']` | \| | 43.50 | 37.93 | \| | 23.00 | 26.67 | |
|
|
|
|
|
Below, time metrics for batch size = 4, input length = 64. |
|
|
|
| quantization_approach | operators_to_quantize | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) | |
|
| :-------------------: | :-------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: | |
|
| `dynamic` | `['Add', 'MatMul']` | \| | 119.50 | 39.92 | \| | 8.67 | 25.33 | |
|
| `dynamic` | `['Add']` | \| | 119.62 | 107.42 | \| | 8.67 | 9.33 | |
|
| `static` | `['Add', 'MatMul']` | \| | 120.23 | 56.94 | \| | 8.33 | 17.67 | |
|
| `static` | `['Add']` | \| | 119.10 | 130.78 | \| | 8.67 | 7.67 | |
|
|
|
|
|
Below, time metrics for batch size = 8, input length = 64. |
|
|
|
| quantization_approach | operators_to_quantize | | latency_mean (original, ms) | latency_mean (optimized, ms) | | throughput (original, /s) | throughput (optimized, /s) | |
|
| :-------------------: | :-------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: | |
|
| `dynamic` | `['Add', 'MatMul']` | \| | 165.84 | 75.45 | \| | 6.33 | 13.33 | |
|
| `dynamic` | `['Add']` | \| | 214.65 | 211.41 | \| | 4.67 | 5.00 | |
|
| `static` | `['Add', 'MatMul']` | \| | 166.53 | 129.00 | \| | 6.33 | 8.00 | |
|
| `static` | `['Add']` | \| | 214.81 | 256.95 | \| | 4.67 | 4.00 | |
|
|
|
|