add experience

Files changed (7)

README.md +83 -0
runs.json +580 -0
tensorboard/1657610437.7287223/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.1 +3 -0
tensorboard/1657610437.7304575/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.2 +3 -0
tensorboard/1657610437.7316337/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.3 +3 -0
tensorboard/1657610437.7327793/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.4 +3 -0
tensorboard/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.0 +3 -0

README.md ADDED

	@@ -0,0 +1,83 @@

+---
+pipeline_tag: token-classification
+datasets:
+- conll2003
+metrics:
+- precision
+- recall
+- f1
+- accuracy
+tags:
+- distilbert
+---
+**task**: `token-classification`
+**Backend:** `sagemaker-training`
+**Backend args:** `{'instance_type': 'ml.g4dn.2xlarge', 'supported_instructions': 'avx512_vnni'}`
+**Number of evaluation samples:** `1000`
+Fixed parameters:
+* **model_name_or_path**: `elastic/distilbert-base-uncased-finetuned-conll03-english`
+* **dataset**:
+    * **path**: `conll2003`
+    * **eval_split**: `validation`
+    * **data_keys**: `{'primary': 'tokens'}`
+    * **ref_keys**: `['ner_tags']`
+    * **calibration_split**: `train`
+* **node_exclusion**: `[]`
+* **per_channel**: `False`
+* **calibration**:
+    * **method**: `minmax`
+    * **num_calibration_samples**: `100`
+* **framework**: `onnxruntime`
+* **framework_args**:
+    * **opset**: `11`
+    * **optimization_level**: `1`
+* **aware_training**: `False`
+Benchmarked parameters:
+* **quantization_approach**: `dynamic`,  `static`
+* **operators_to_quantize**: `['Add', 'MatMul']`,  `['Add']`
+# Evaluation
+## Non-time metrics
+| quantization_approach | operators_to_quantize |     | precision (original) | precision (optimized) |     | recall (original) | recall (optimized) |     | f1 (original) | f1 (optimized) |     | accuracy (original) | accuracy (optimized) |
+| :-------------------: | :-------------------: | :-: | :------------------: | :-------------------: | :-: | :---------------: | :----------------: | :-: | :-----------: | :------------: | :-: | :-----------------: | :------------------: |
+|       `dynamic`       |  `['Add', 'MatMul']`  |  \|  |        0.937         |         0.937         |  \|  |       0.953       |       0.953        |  \|  |     0.945     |     0.945      |  \|  |        0.988        |        0.988         |
+|       `dynamic`       |       `['Add']`       |  \|  |        0.937         |         0.937         |  \|  |       0.953       |       0.953        |  \|  |     0.945     |     0.945      |  \|  |        0.988        |        0.988         |
+|       `static`        |  `['Add', 'MatMul']`  |  \|  |        0.937         |         0.074         |  \|  |       0.953       |       0.253        |  \|  |     0.945     |     0.114      |  \|  |        0.988        |        0.363         |
+|       `static`        |       `['Add']`       |  \|  |        0.937         |         0.065         |  \|  |       0.953       |       0.186        |  \|  |     0.945     |     0.096      |  \|  |        0.988        |        0.340         |
+## Time metrics
+Time benchmarks were run for 3 seconds per config.
+Below, time metrics for batch size = 1, input length = 64.
+| quantization_approach | operators_to_quantize |     | latency_mean (original, ms) | latency_mean (optimized, ms) |     | throughput (original, /s) | throughput (optimized, /s) |
+| :-------------------: | :-------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+|       `dynamic`       |  `['Add', 'MatMul']`  |  \|  |            57.64            |            12.30             |  \|  |           17.67           |           81.33            |
+|       `dynamic`       |       `['Add']`       |  \|  |            43.51            |            29.42             |  \|  |           23.00           |           34.00            |
+|       `static`        |  `['Add', 'MatMul']`  |  \|  |            43.05            |            21.11             |  \|  |           23.33           |           47.67            |
+|       `static`        |       `['Add']`       |  \|  |            43.50            |            37.93             |  \|  |           23.00           |           26.67            |
+Below, time metrics for batch size = 4, input length = 64.
+| quantization_approach | operators_to_quantize |     | latency_mean (original, ms) | latency_mean (optimized, ms) |     | throughput (original, /s) | throughput (optimized, /s) |
+| :-------------------: | :-------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+|       `dynamic`       |  `['Add', 'MatMul']`  |  \|  |           119.50            |            39.92             |  \|  |           8.67            |           25.33            |
+|       `dynamic`       |       `['Add']`       |  \|  |           119.62            |            107.42            |  \|  |           8.67            |            9.33            |
+|       `static`        |  `['Add', 'MatMul']`  |  \|  |           120.23            |            56.94             |  \|  |           8.33            |           17.67            |
+|       `static`        |       `['Add']`       |  \|  |           119.10            |            130.78            |  \|  |           8.67            |            7.67            |
+Below, time metrics for batch size = 8, input length = 64.
+| quantization_approach | operators_to_quantize |     | latency_mean (original, ms) | latency_mean (optimized, ms) |     | throughput (original, /s) | throughput (optimized, /s) |
+| :-------------------: | :-------------------: | :-: | :-------------------------: | :--------------------------: | :-: | :-----------------------: | :------------------------: |
+|       `dynamic`       |  `['Add', 'MatMul']`  |  \|  |           165.84            |            75.45             |  \|  |           6.33            |           13.33            |
+|       `dynamic`       |       `['Add']`       |  \|  |           214.65            |            211.41            |  \|  |           4.67            |            5.00            |
+|       `static`        |  `['Add', 'MatMul']`  |  \|  |           166.53            |            129.00            |  \|  |           6.33            |            8.00            |
+|       `static`        |       `['Add']`       |  \|  |           214.81            |            256.95            |  \|  |           4.67            |            4.00            |
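
The figures in the README tables can be cross-checked against the raw records in `runs.json`. The sketch below copies the batch size 1 timings and the baseline scores of the static `['Add', 'MatMul']` run and recomputes three derived quantities. The helper names (`latency_speedup`, `forwards_per_second`, `f1_score`) are hypothetical, and the relation "throughput = nb_forwards / duration" is an assumption inferred from the recorded numbers, not something the benchmark output states.

```python
# Values copied from runs.json: static quantization of ['Add', 'MatMul'],
# batch_size = 1, input_length = 64 (only the fields used below).
record = {
    "baseline": {"nb_forwards": 70, "throughput": 23.33,
                 "latency_mean": 43.048573857142856},
    "optimized": {"nb_forwards": 143, "throughput": 47.67,
                  "latency_mean": 21.113699776223775},
}

# Baseline scores from the "others" section of the same record.
baseline_scores = {
    "precision": 0.936836221352711,
    "recall": 0.9533560864618885,
    "f1": 0.9450239639131661,
}

DURATION_S = 3  # time_benchmark_args.duration


def latency_speedup(rec):
    """Ratio of baseline to optimized mean latency (> 1 means faster)."""
    return rec["baseline"]["latency_mean"] / rec["optimized"]["latency_mean"]


def forwards_per_second(run, duration_s=DURATION_S):
    """Assumed definition of throughput: forward passes per second."""
    return round(run["nb_forwards"] / duration_s, 2)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


speedup = latency_speedup(record)
print(f"latency speedup: {speedup:.2f}x")
print("baseline throughput:", forwards_per_second(record["baseline"]))
print("optimized throughput:", forwards_per_second(record["optimized"]))
print("recomputed f1:", f1_score(baseline_scores["precision"],
                                 baseline_scores["recall"]))
```

A speedup below 1 flags a configuration where quantization slowed inference down, as in the static `['Add']` rows at batch sizes 4 and 8. Speed alone is not enough to pick a configuration: both static runs lose almost all accuracy, so the speedup should be read together with the non-time metrics.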

runs.json ADDED

	@@ -0,0 +1,580 @@

+[
+    {
+        "model_name_or_path": "elastic/distilbert-base-uncased-finetuned-conll03-english",
+        "task": "token-classification",
+        "dataset": {
+            "path": "conll2003",
+            "eval_split": "validation",
+            "data_keys": {
+                "primary": "tokens",
+                "secondary": null
+            },
+            "ref_keys": [
+                "ner_tags"
+            ],
+            "name": null,
+            "calibration_split": "train"
+        },
+        "quantization_approach": "static",
+        "operators_to_quantize": [
+            "Add",
+            "MatMul"
+        ],
+        "node_exclusion": [],
+        "aware_training": false,
+        "per_channel": false,
+        "calibration": {
+            "method": "minmax",
+            "num_calibration_samples": 100,
+            "calibration_histogram_percentile": null,
+            "calibration_moving_average": null,
+            "calibration_moving_average_constant": null
+        },
+        "framework": "onnxruntime",
+        "framework_args": {
+            "opset": 11,
+            "optimization_level": 1
+        },
+        "hardware": "Architecture:                    x86_64\nCPU op-mode(s):                  32-bit, 64-bit\nByte Order:                      Little Endian\nAddress sizes:                   46 bits physical, 48 bits virtual\nCPU(s):                          8\nOn-line CPU(s) list:             0-7\nThread(s) per core:              2\nCore(s) per socket:              4\nSocket(s):                       1\nNUMA node(s):                    1\nVendor ID:                       GenuineIntel\nCPU family:                      6\nModel:                           85\nModel name:                      Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz\nStepping:                        7\nCPU MHz:                         3100.244\nBogoMIPS:                        4999.99\nHypervisor vendor:               KVM\nVirtualization type:             full\nL1d cache:                       128 KiB\nL1i cache:                       128 KiB\nL2 cache:                        4 MiB\nL3 cache:                        35.8 MiB\nNUMA node0 CPU(s):               0-7\nVulnerability Itlb multihit:     KVM: Vulnerable\nVulnerability L1tf:              Mitigation; PTE Inversion\nVulnerability Mds:               Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown\nVulnerability Meltdown:          Mitigation; PTI\nVulnerability Spec store bypass: Vulnerable\nVulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization\nVulnerability Spectre v2:        Mitigation; Retpolines, STIBP disabled, RSB filling\nVulnerability Srbds:             Not affected\nVulnerability Tsx async abort:   Not affected\nFlags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni\n",
+        "versions": {
+            "transformers": "4.20.1",
+            "optimum": "1.2.3.dev0",
+            "optimum_hash": "5ac9c0d9fd7e7cca55b2f9935b961ed5b6c50112"
+        },
+        "evaluation": {
+            "time": [
+                {
+                    "batch_size": 4,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 25,
+                        "throughput": 8.33,
+                        "latency_mean": 120.23236996,
+                        "latency_std": 0.989423927986037,
+                        "latency_50": 120.445322,
+                        "latency_90": 121.0641136,
+                        "latency_95": 121.63786520000001,
+                        "latency_99": 122.31954252,
+                        "latency_999": 122.47620775200001
+                    },
+                    "optimized": {
+                        "nb_forwards": 53,
+                        "throughput": 17.67,
+                        "latency_mean": 56.94031537735849,
+                        "latency_std": 2.2044830948358625,
+                        "latency_50": 56.199388,
+                        "latency_90": 60.3284648,
+                        "latency_95": 60.6057082,
+                        "latency_99": 61.70255691999999,
+                        "latency_999": 62.529690292000005
+                    }
+                },
+                {
+                    "batch_size": 8,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 19,
+                        "throughput": 6.33,
+                        "latency_mean": 166.53055257894738,
+                        "latency_std": 1.575841987426849,
+                        "latency_50": 166.638572,
+                        "latency_90": 168.272883,
+                        "latency_95": 168.7129504,
+                        "latency_99": 169.52801488,
+                        "latency_999": 169.711404388
+                    },
+                    "optimized": {
+                        "nb_forwards": 24,
+                        "throughput": 8.0,
+                        "latency_mean": 129.002869375,
+                        "latency_std": 0.6157854643813875,
+                        "latency_50": 129.063924,
+                        "latency_90": 129.7084936,
+                        "latency_95": 129.9355643,
+                        "latency_99": 130.24102448,
+                        "latency_999": 130.313872748
+                    }
+                },
+                {
+                    "batch_size": 1,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 70,
+                        "throughput": 23.33,
+                        "latency_mean": 43.048573857142856,
+                        "latency_std": 1.1204473128323003,
+                        "latency_50": 42.845755,
+                        "latency_90": 43.8944438,
+                        "latency_95": 44.3052485,
+                        "latency_99": 46.73122168000001,
+                        "latency_999": 49.909082367999986
+                    },
+                    "optimized": {
+                        "nb_forwards": 143,
+                        "throughput": 47.67,
+                        "latency_mean": 21.113699776223775,
+                        "latency_std": 0.1930452254945551,
+                        "latency_50": 21.085728,
+                        "latency_90": 21.3874956,
+                        "latency_95": 21.4500651,
+                        "latency_99": 21.640094780000002,
+                        "latency_999": 21.648399938
+                    }
+                }
+            ],
+            "others": {
+                "baseline": {
+                    "precision": 0.936836221352711,
+                    "recall": 0.9533560864618885,
+                    "f1": 0.9450239639131661,
+                    "accuracy": 0.9880421708059153
+                },
+                "optimized": {
+                    "precision": 0.07350512058143377,
+                    "recall": 0.25312855517633676,
+                    "f1": 0.1139272913466462,
+                    "accuracy": 0.3629802589683719
+                }
+            }
+        },
+        "max_eval_samples": 1000,
+        "time_benchmark_args": {
+            "duration": 3,
+            "warmup_runs": 1
+        },
+        "model_type": "distilbert"
+    },
+    {
+        "model_name_or_path": "elastic/distilbert-base-uncased-finetuned-conll03-english",
+        "task": "token-classification",
+        "dataset": {
+            "path": "conll2003",
+            "eval_split": "validation",
+            "data_keys": {
+                "primary": "tokens",
+                "secondary": null
+            },
+            "ref_keys": [
+                "ner_tags"
+            ],
+            "name": null,
+            "calibration_split": "train"
+        },
+        "quantization_approach": "static",
+        "operators_to_quantize": [
+            "Add"
+        ],
+        "node_exclusion": [],
+        "aware_training": false,
+        "per_channel": false,
+        "calibration": {
+            "method": "minmax",
+            "num_calibration_samples": 100,
+            "calibration_histogram_percentile": null,
+            "calibration_moving_average": null,
+            "calibration_moving_average_constant": null
+        },
+        "framework": "onnxruntime",
+        "framework_args": {
+            "opset": 11,
+            "optimization_level": 1
+        },
+        "hardware": "Architecture:                    x86_64\nCPU op-mode(s):                  32-bit, 64-bit\nByte Order:                      Little Endian\nAddress sizes:                   46 bits physical, 48 bits virtual\nCPU(s):                          8\nOn-line CPU(s) list:             0-7\nThread(s) per core:              2\nCore(s) per socket:              4\nSocket(s):                       1\nNUMA node(s):                    1\nVendor ID:                       GenuineIntel\nCPU family:                      6\nModel:                           85\nModel name:                      Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz\nStepping:                        7\nCPU MHz:                         3100.091\nBogoMIPS:                        4999.99\nHypervisor vendor:               KVM\nVirtualization type:             full\nL1d cache:                       128 KiB\nL1i cache:                       128 KiB\nL2 cache:                        4 MiB\nL3 cache:                        35.8 MiB\nNUMA node0 CPU(s):               0-7\nVulnerability Itlb multihit:     KVM: Vulnerable\nVulnerability L1tf:              Mitigation; PTE Inversion\nVulnerability Mds:               Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown\nVulnerability Meltdown:          Mitigation; PTI\nVulnerability Spec store bypass: Vulnerable\nVulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization\nVulnerability Spectre v2:        Mitigation; Retpolines, STIBP disabled, RSB filling\nVulnerability Srbds:             Not affected\nVulnerability Tsx async abort:   Not affected\nFlags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni\n",
+        "versions": {
+            "transformers": "4.20.1",
+            "optimum": "1.2.3.dev0",
+            "optimum_hash": "5ac9c0d9fd7e7cca55b2f9935b961ed5b6c50112"
+        },
+        "evaluation": {
+            "time": [
+                {
+                    "batch_size": 1,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 69,
+                        "throughput": 23.0,
+                        "latency_mean": 43.50449917391305,
+                        "latency_std": 1.1458006326491226,
+                        "latency_50": 43.443712,
+                        "latency_90": 44.833304,
+                        "latency_95": 45.4732784,
+                        "latency_99": 46.1717674,
+                        "latency_999": 46.293552340000005
+                    },
+                    "optimized": {
+                        "nb_forwards": 80,
+                        "throughput": 26.67,
+                        "latency_mean": 37.9267952125,
+                        "latency_std": 0.11734822683861629,
+                        "latency_50": 37.9285515,
+                        "latency_90": 38.085207600000004,
+                        "latency_95": 38.111036399999996,
+                        "latency_99": 38.2064807,
+                        "latency_999": 38.22722057
+                    }
+                },
+                {
+                    "batch_size": 8,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 14,
+                        "throughput": 4.67,
+                        "latency_mean": 214.81155885714287,
+                        "latency_std": 0.6229026122307055,
+                        "latency_50": 214.6879675,
+                        "latency_90": 215.571702,
+                        "latency_95": 215.72494925,
+                        "latency_99": 215.90999385,
+                        "latency_999": 215.951628885
+                    },
+                    "optimized": {
+                        "nb_forwards": 12,
+                        "throughput": 4.0,
+                        "latency_mean": 256.95122358333333,
+                        "latency_std": 1.2773226309110695,
+                        "latency_50": 257.0572985,
+                        "latency_90": 258.7638351,
+                        "latency_95": 258.84763195,
+                        "latency_99": 258.86815838999996,
+                        "latency_999": 258.872776839
+                    }
+                },
+                {
+                    "batch_size": 4,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 26,
+                        "throughput": 8.67,
+                        "latency_mean": 119.1024813076923,
+                        "latency_std": 1.5917975126134987,
+                        "latency_50": 118.759877,
+                        "latency_90": 120.792844,
+                        "latency_95": 121.9356475,
+                        "latency_99": 123.13953675,
+                        "latency_999": 123.40581367499999
+                    },
+                    "optimized": {
+                        "nb_forwards": 23,
+                        "throughput": 7.67,
+                        "latency_mean": 130.78132304347827,
+                        "latency_std": 0.5922745467393132,
+                        "latency_50": 130.955147,
+                        "latency_90": 131.512009,
+                        "latency_95": 131.5393553,
+                        "latency_99": 131.74930052,
+                        "latency_999": 131.801985152
+                    }
+                }
+            ],
+            "others": {
+                "baseline": {
+                    "precision": 0.936836221352711,
+                    "recall": 0.9533560864618885,
+                    "f1": 0.9450239639131661,
+                    "accuracy": 0.9880421708059153
+                },
+                "optimized": {
+                    "precision": 0.06477812995245642,
+                    "recall": 0.18600682593856654,
+                    "f1": 0.09609168380840435,
+                    "accuracy": 0.3400551899808958
+                }
+            }
+        },
+        "max_eval_samples": 1000,
+        "time_benchmark_args": {
+            "duration": 3,
+            "warmup_runs": 1
+        },
+        "model_type": "distilbert"
+    },
+    {
+        "model_name_or_path": "elastic/distilbert-base-uncased-finetuned-conll03-english",
+        "task": "token-classification",
+        "dataset": {
+            "path": "conll2003",
+            "eval_split": "validation",
+            "data_keys": {
+                "primary": "tokens",
+                "secondary": null
+            },
+            "ref_keys": [
+                "ner_tags"
+            ],
+            "name": null,
+            "calibration_split": "train"
+        },
+        "quantization_approach": "dynamic",
+        "operators_to_quantize": [
+            "Add",
+            "MatMul"
+        ],
+        "node_exclusion": [],
+        "aware_training": false,
+        "per_channel": false,
+        "calibration": {
+            "method": "minmax",
+            "num_calibration_samples": 100,
+            "calibration_histogram_percentile": null,
+            "calibration_moving_average": null,
+            "calibration_moving_average_constant": null
+        },
+        "framework": "onnxruntime",
+        "framework_args": {
+            "opset": 11,
+            "optimization_level": 1
+        },
+        "hardware": "Architecture:                    x86_64\nCPU op-mode(s):                  32-bit, 64-bit\nByte Order:                      Little Endian\nAddress sizes:                   46 bits physical, 48 bits virtual\nCPU(s):                          8\nOn-line CPU(s) list:             0-7\nThread(s) per core:              2\nCore(s) per socket:              4\nSocket(s):                       1\nNUMA node(s):                    1\nVendor ID:                       GenuineIntel\nCPU family:                      6\nModel:                           85\nModel name:                      Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz\nStepping:                        7\nCPU MHz:                         3100.009\nBogoMIPS:                        4999.99\nHypervisor vendor:               KVM\nVirtualization type:             full\nL1d cache:                       128 KiB\nL1i cache:                       128 KiB\nL2 cache:                        4 MiB\nL3 cache:                        35.8 MiB\nNUMA node0 CPU(s):               0-7\nVulnerability Itlb multihit:     KVM: Vulnerable\nVulnerability L1tf:              Mitigation; PTE Inversion\nVulnerability Mds:               Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown\nVulnerability Meltdown:          Mitigation; PTI\nVulnerability Spec store bypass: Vulnerable\nVulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization\nVulnerability Spectre v2:        Mitigation; Retpolines, STIBP disabled, RSB filling\nVulnerability Srbds:             Not affected\nVulnerability Tsx async abort:   Not affected\nFlags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni\n",
+        "versions": {
+            "transformers": "4.20.1",
+            "optimum": "1.2.3.dev0",
+            "optimum_hash": "5ac9c0d9fd7e7cca55b2f9935b961ed5b6c50112"
+        },
+        "evaluation": {
+            "time": [
+                {
+                    "batch_size": 1,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 53,
+                        "throughput": 17.67,
+                        "latency_mean": 57.63860111320755,
+                        "latency_std": 0.5448611043553628,
+                        "latency_50": 57.65361,
+                        "latency_90": 58.180421,
+                        "latency_95": 58.392744,
+                        "latency_99": 58.71634352,
+                        "latency_999": 58.721444252
+                    },
+                    "optimized": {
+                        "nb_forwards": 244,
+                        "throughput": 81.33,
+                        "latency_mean": 12.298368512295083,
+                        "latency_std": 0.4560740565346141,
+                        "latency_50": 12.2116125,
+                        "latency_90": 13.001667200000002,
+                        "latency_95": 13.1330103,
+                        "latency_99": 13.2790208,
+                        "latency_999": 13.414312331
+                    }
+                },
+                {
+                    "batch_size": 4,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 26,
+                        "throughput": 8.67,
+                        "latency_mean": 119.50429169230769,
+                        "latency_std": 0.4639465722921096,
+                        "latency_50": 119.446385,
+                        "latency_90": 119.95197,
+                        "latency_95": 120.05153425,
+                        "latency_99": 120.7893855,
+                        "latency_999": 121.00299195000001
+                    },
+                    "optimized": {
+                        "nb_forwards": 76,
+                        "throughput": 25.33,
+                        "latency_mean": 39.91599960526316,
+                        "latency_std": 0.883213781232674,
+                        "latency_50": 39.8835755,
+                        "latency_90": 41.0755615,
+                        "latency_95": 41.48617225,
+                        "latency_99": 42.00973875,
+                        "latency_999": 42.412953375
+                    }
+                },
+                {
+                    "batch_size": 8,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 19,
+                        "throughput": 6.33,
+                        "latency_mean": 165.83700805263157,
+                        "latency_std": 1.7394953701654086,
+                        "latency_50": 165.801757,
+                        "latency_90": 168.0285054,
+                        "latency_95": 168.19460990000002,
+                        "latency_99": 168.78632678,
+                        "latency_999": 168.919463078
+                    },
+                    "optimized": {
+                        "nb_forwards": 40,
+                        "throughput": 13.33,
+                        "latency_mean": 75.448955425,
+                        "latency_std": 1.2544431966810392,
+                        "latency_50": 75.414968,
+                        "latency_90": 77.1854282,
+                        "latency_95": 77.5299735,
+                        "latency_99": 77.80073465000001,
+                        "latency_999": 77.95147686499999
+                    }
+                }
+            ],
+            "others": {
+                "baseline": {
+                    "precision": 0.936836221352711,
+                    "recall": 0.9533560864618885,
+                    "f1": 0.9450239639131661,
+                    "accuracy": 0.9880421708059153
+                },
+                "optimized": {
+                    "precision": 0.9368008948545862,
+                    "recall": 0.9527872582480091,
+                    "f1": 0.9447264523406655,
+                    "accuracy": 0.9879006580343876
+                }
+            }
+        },
+        "max_eval_samples": 1000,
+        "time_benchmark_args": {
+            "duration": 3,
+            "warmup_runs": 1
+        },
+        "model_type": "distilbert"
+    },
+    {
+        "model_name_or_path": "elastic/distilbert-base-uncased-finetuned-conll03-english",
+        "task": "token-classification",
+        "dataset": {
+            "path": "conll2003",
+            "eval_split": "validation",
+            "data_keys": {
+                "primary": "tokens",
+                "secondary": null
+            },
+            "ref_keys": [
+                "ner_tags"
+            ],
+            "name": null,
+            "calibration_split": "train"
+        },
+        "quantization_approach": "dynamic",
+        "operators_to_quantize": [
+            "Add"
+        ],
+        "node_exclusion": [],
+        "aware_training": false,
+        "per_channel": false,
+        "calibration": {
+            "method": "minmax",
+            "num_calibration_samples": 100,
+            "calibration_histogram_percentile": null,
+            "calibration_moving_average": null,
+            "calibration_moving_average_constant": null
+        },
+        "framework": "onnxruntime",
+        "framework_args": {
+            "opset": 11,
+            "optimization_level": 1
+        },
+        "hardware": "Architecture:                    x86_64\nCPU op-mode(s):                  32-bit, 64-bit\nByte Order:                      Little Endian\nAddress sizes:                   46 bits physical, 48 bits virtual\nCPU(s):                          8\nOn-line CPU(s) list:             0-7\nThread(s) per core:              2\nCore(s) per socket:              4\nSocket(s):                       1\nNUMA node(s):                    1\nVendor ID:                       GenuineIntel\nCPU family:                      6\nModel:                           85\nModel name:                      Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz\nStepping:                        7\nCPU MHz:                         2638.487\nBogoMIPS:                        4999.99\nHypervisor vendor:               KVM\nVirtualization type:             full\nL1d cache:                       128 KiB\nL1i cache:                       128 KiB\nL2 cache:                        4 MiB\nL3 cache:                        35.8 MiB\nNUMA node0 CPU(s):               0-7\nVulnerability Itlb multihit:     KVM: Vulnerable\nVulnerability L1tf:              Mitigation; PTE Inversion\nVulnerability Mds:               Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown\nVulnerability Meltdown:          Mitigation; PTI\nVulnerability Spec store bypass: Vulnerable\nVulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization\nVulnerability Spectre v2:        Mitigation; Retpolines, STIBP disabled, RSB filling\nVulnerability Srbds:             Not affected\nVulnerability Tsx async abort:   Not affected\nFlags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni\n",
+        "versions": {
+            "transformers": "4.20.1",
+            "optimum": "1.2.3.dev0",
+            "optimum_hash": "5ac9c0d9fd7e7cca55b2f9935b961ed5b6c50112"
+        },
+        "evaluation": {
+            "time": [
+                {
+                    "batch_size": 1,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 69,
+                        "throughput": 23.0,
+                        "latency_mean": 43.50526027536232,
+                        "latency_std": 1.1770353674252074,
+                        "latency_50": 43.267983,
+                        "latency_90": 45.0357992,
+                        "latency_95": 45.6057136,
+                        "latency_99": 46.708998679999986,
+                        "latency_999": 47.814713768000004
+                    },
+                    "optimized": {
+                        "nb_forwards": 102,
+                        "throughput": 34.0,
+                        "latency_mean": 29.424613480392157,
+                        "latency_std": 0.14890697595200564,
+                        "latency_50": 29.3912705,
+                        "latency_90": 29.646715,
+                        "latency_95": 29.68545545,
+                        "latency_99": 29.80756655,
+                        "latency_999": 29.811399894
+                    }
+                },
+                {
+                    "batch_size": 4,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 26,
+                        "throughput": 8.67,
+                        "latency_mean": 119.6179461923077,
+                        "latency_std": 1.4057848288153165,
+                        "latency_50": 119.394914,
+                        "latency_90": 121.3817145,
+                        "latency_95": 121.8577975,
+                        "latency_99": 122.802906,
+                        "latency_999": 123.0513933
+                    },
+                    "optimized": {
+                        "nb_forwards": 28,
+                        "throughput": 9.33,
+                        "latency_mean": 107.42320235714286,
+                        "latency_std": 0.9405205161982765,
+                        "latency_50": 107.1847235,
+                        "latency_90": 107.6445599,
+                        "latency_95": 108.2160214,
+                        "latency_99": 111.05779109000001,
+                        "latency_999": 111.916852709
+                    }
+                },
+                {
+                    "batch_size": 8,
+                    "input_length": 64,
+                    "baseline": {
+                        "nb_forwards": 14,
+                        "throughput": 4.67,
+                        "latency_mean": 214.6487932857143,
+                        "latency_std": 0.9053003539723654,
+                        "latency_50": 214.552057,
+                        "latency_90": 215.54495519999998,
+                        "latency_95": 216.14476715,
+                        "latency_99": 216.93365343000002,
+                        "latency_999": 217.11115284299999
+                    },
+                    "optimized": {
+                        "nb_forwards": 15,
+                        "throughput": 5.0,
+                        "latency_mean": 211.41319233333334,
+                        "latency_std": 1.1447515204122778,
+                        "latency_50": 211.02957,
+                        "latency_90": 213.090243,
+                        "latency_95": 213.19109559999998,
+                        "latency_99": 213.37423912,
+                        "latency_999": 213.415446412
+                    }
+                }
+            ],
+            "others": {
+                "baseline": {
+                    "precision": 0.936836221352711,
+                    "recall": 0.9533560864618885,
+                    "f1": 0.9450239639131661,
+                    "accuracy": 0.9880421708059153
+                },
+                "optimized": {
+                    "precision": 0.936836221352711,
+                    "recall": 0.9533560864618885,
+                    "f1": 0.9450239639131661,
+                    "accuracy": 0.9880421708059153
+                }
+            }
+        },
+        "max_eval_samples": 1000,
+        "time_benchmark_args": {
+            "duration": 3,
+            "warmup_runs": 1
+        },
+        "model_type": "distilbert"
+    }
+]

tensorboard/1657610437.7287223/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.1 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8e1ceb68fac5c6b0f4db331c175a13093f199c2a166595387f5c6271dcfc8ff2
+size 738

tensorboard/1657610437.7304575/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.2 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c4819b83566afb7799a1297d0dd2e0518c6c74748e52cad94206c7528c26dbdd
+size 728

tensorboard/1657610437.7316337/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.3 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9beb6fc624a7b8dec602eb0c8584fbc0a3a066b2942884d4310edab28ec0a1d0
+size 737

tensorboard/1657610437.7327793/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.4 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7bf6c4db6b1b42996059c35e3e7ff7e6ecbbbde0b26b457b48b9119917cd7a5b
+size 727

tensorboard/events.out.tfevents.1657610437.ip-10-2-224-27.ec2.internal.1.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7ac35b2342711834d9070a406c05e7f888ba13de67ef840d1aa407e0d482f35c
+size 40