from textwrap import dedent

from iso639 import Lang
BANNER_TEXT = """
<div style="text-align: center;">
<h1><a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit Benchmarks</a></h1>
</div>
"""
INTRO_LABEL = """We present comprehensive benchmarks for WhisperKit, our on-device ASR solution, compared against a reference implementation. These benchmarks aim to help developers and enterprises make informed decisions when choosing optimized or compressed variants of machine learning models for production use. Show more."""
INTRO_TEXT = """
<h3 style="display: flex;
justify-content: center;
align-items: center;
"></h3>
\n📈 Key Metrics:
Word Error Rate (WER) (⬇️): The percentage of words incorrectly transcribed. Lower is better.
Quality of Inference (QoI) (⬆️): Percentage of examples where WhisperKit performs no worse than the reference model. Higher is better.
Tokens per Second (⬆️): The number of output tokens generated per second. Higher is better.
Speed (⬆️): Input audio seconds transcribed per second. Higher is better.
🎯 WhisperKit is evaluated across different datasets, with a focus on per-example no-regressions (QoI) and overall accuracy (WER).
\n💻 Our benchmarks include:
Reference: <a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a> (OpenAI's Whisper API)
On-device: <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a> (various versions and optimizations)
ℹ️ Reference Implementation:
<a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a> sets the reference standard. We assume it uses the equivalent of openai/whisper-large-v2 in float16 precision, along with additional undisclosed optimizations from OpenAI. As of 02/29/24, it costs $0.36 per hour of audio and has a 25MB file size limit per request.
\n🔍 We use two primary datasets:
<a href='https://huggingface.co/datasets/argmaxinc/librispeech'>LibriSpeech</a>: ~5 hours of short English audio clips
<a href='https://huggingface.co/datasets/argmaxinc/earnings22'>Earnings22</a>: ~120 hours of English audio from earnings calls
🌐 Multilingual Benchmarks:
These benchmarks aim to demonstrate WhisperKit's capabilities across diverse languages, helping developers assess its suitability for multilingual applications.
\nDataset:
<a href='https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual'>Common Voice 17.0</a>: Short-form audio files (<30s/clip) for a maximum of 400 samples per language from Common Voice 17.0. The test set covers a wide range of languages to test the model's versatility.
\nMetrics:
Average WER: Provides an overall measure of model performance across all languages.
Language-specific WER: Allows for detailed analysis of model performance for each supported language.
Language Detection Accuracy: Measured using a confusion matrix, showing the model's ability to identify the correct language.
Results are shown for both forced (correct language given as input) and unforced (model detects language) scenarios.
🔄 Results are periodically updated using our automated evaluation pipeline on Apple Silicon Macs.
\n🛠️ Developers can use <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a> to reproduce these results or run evaluations on their own custom datasets.
🔗 Links:
- <a href='https://github.com/argmaxinc/WhisperKit'>WhisperKit</a>
- <a href='https://github.com/argmaxinc/whisperkittools'>whisperkittools</a>
- <a href='https://huggingface.co/datasets/argmaxinc/librispeech'>LibriSpeech</a>
- <a href='https://huggingface.co/datasets/argmaxinc/earnings22'>Earnings22</a>
- <a href='https://huggingface.co/datasets/argmaxinc/whisperkit-evals-multilingual'>Common Voice 17.0</a>
- <a href='https://platform.openai.com/docs/guides/speech-to-text'>WhisperOpenAIAPI</a>
"""
METHODOLOGY_TEXT = dedent(
    """
    # Methodology
    ## Overview
    WhisperKit Benchmarks is the one-stop shop for on-device performance and quality testing of WhisperKit models across supported devices, OS versions and audio datasets.
    ## Metrics
    - **Speed factor** (⬆️): Computed as the ratio of input audio length to end-to-end WhisperKit latency for transcribing that audio. A speed factor of N means N seconds of input audio were transcribed in 1 second.
    - **Tok/s (Tokens per second)** (⬆️): Total number of text decoder forward passes divided by the end-to-end processing time.
      - This metric varies with the input data because the pace of speech changes the text decoder's share of overall latency. It should not be confused with the reciprocal of the text decoder latency, which is constant across input files.
    - **WER (Word Error Rate)** (⬇️): The ratio of words incorrectly transcribed when comparing the model's output to reference transcriptions, with lower values indicating better accuracy.
    - **QoI (Quality of Inference)** (⬆️): The ratio of examples where WhisperKit performs no worse than the reference model.
      - This metric does not capture improvements to the reference. It only measures potential regressions.
    - **Parity %**: The percentage difference between a model's Average WER on a given device and its Average WER on the Apple M2 Ultra, where a negative value indicates worse performance compared to the M2 Ultra.
    - **Multilingual results**: Separated into "language hinted" and "language predicted" categories to evaluate performance with and without prior knowledge of the input language.
    ## Data
    - **Short-form**: 5 hours of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech). Proxy for average streaming performance.
    - **Long-form**: 12 hours of earnings call recordings with ~1hr/clip in English with various accents. Built by randomly selecting 10% of the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours). Proxy for average from-file performance.
    - Full datasets are used for English Quality tests and random 10-minute subsets are used for Performance tests.
    - **Multilingual**: Max 400 samples per language with <30s/clip from [Common Voice 17.0 Test Set](https://huggingface.co/datasets/argmaxinc/common_voice_17_0-argmax_subset-400). Common Voice covers 77 of the 99 languages supported by Whisper.
    ## Performance Measurement
    1. On-device testing is conducted with [WhisperKit Regression Test Automations](https://github.com/argmaxinc/WhisperKit/blob/main/BENCHMARKS.md) on iPhones, iPads, and Macs, across different iOS and macOS versions.
    2. Performance is recorded on the 10-minute datasets described above for short- and long-form audio.
    3. Quality metrics are recorded on full datasets on Apple M2 Ultra Mac Studios to allow for fast processing of many configurations and to provide a consistent, high-performance baseline for all evaluations displayed in the English Quality tab.
    4. Quality is also sanity-checked on 10-minute datasets in order to catch potential correctness regressions across different device and OS combinations despite running the same version of WhisperKit.
    5. Results are aggregated and presented in the dashboard, allowing for easy comparison and analysis.
    ## Dashboard Features
    - Performance: Interactive filtering by model, device, OS, and performance metrics
    - Timeline: Visualizations of performance trends
    - English Quality: English transcription quality on short- and long-form audio
    - Multilingual Quality: Multilingual (77 languages) transcription quality on short-form audio with and without language prediction
    - Device Support: Matrix of supported device, OS and model version combinations. Unsupported combinations are marked with :warning:.
    - This methodology ensures a comprehensive and fair evaluation of speech recognition models supported by WhisperKit across a wide range of scenarios and use cases.
    """
)
PERFORMANCE_TEXT = dedent(
    """
    ## Metrics
    - **Speed factor** (⬆️): Computed as the ratio of input audio length to end-to-end WhisperKit latency for transcribing that audio. A speed factor of N means N seconds of input audio were transcribed in 1 second.
    - **Tok/s (Tokens per second)** (⬆️): Total number of text decoder forward passes divided by the end-to-end processing time.
    - **Parity %**: The percentage difference between a model's Average WER on a given device and its Average WER on the Apple M2 Ultra, where a negative value indicates worse performance compared to the M2 Ultra.
    ## Data
    - **Short-form**: 5 hours of English audiobook clips with 30s/clip comprising the [librispeech test set](https://huggingface.co/datasets/argmaxinc/librispeech).
    - **Long-form**: 12 hours of earnings call recordings with ~1hr/clip in English with various accents. Built by randomly selecting 10% of the [earnings22 test set](https://huggingface.co/datasets/argmaxinc/earnings22-12hours).
    """
)
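# The performance metrics described in PERFORMANCE_TEXT can be sketched as
# plain arithmetic. This is a minimal illustration, not the benchmark
# pipeline's actual code: the function and parameter names are hypothetical,
# and the choice of the M2 Ultra WER as the denominator for Parity % is an
# assumption inferred from the text ("negative value indicates worse").

```python
def speed_factor(audio_seconds: float, latency_seconds: float) -> float:
    """Seconds of input audio transcribed per second of end-to-end latency."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return audio_seconds / latency_seconds


def tokens_per_second(decoder_forward_passes: int, latency_seconds: float) -> float:
    """Text decoder forward passes divided by end-to-end processing time."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return decoder_forward_passes / latency_seconds


def parity_percent(device_wer: float, reference_wer: float) -> float:
    """Percentage difference in average WER relative to a reference device.

    Assumption: normalized by the reference WER, with the sign chosen so that
    a higher (worse) WER on the device yields a negative value.
    """
    if reference_wer <= 0:
        raise ValueError("reference_wer must be positive")
    return 100.0 * (reference_wer - device_wer) / reference_wer
```

# For example, a 600-second clip transcribed in 20 seconds has a speed factor
# of 30, and a device WER of 5.5 against a reference WER of 5.0 gives a parity
# of -10 under the assumed formula.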
QUALITY_TEXT = dedent(
    """
    ## Metrics
    - **WER (Word Error Rate)** (⬇️): The ratio of words incorrectly transcribed when comparing the model's output to reference transcriptions, with lower values indicating better accuracy.
    - **QoI (Quality of Inference)** (⬆️): The ratio of examples where WhisperKit performs no worse than the reference model.
      - This metric does not capture improvements to the reference. It only measures potential regressions.
    """
)
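# The quality metrics described in QUALITY_TEXT can be sketched as follows.
# This is an illustrative sketch under stated assumptions, not the evaluation
# pipeline's implementation: WER is computed here as word-level edit distance
# divided by the reference word count, text is split on whitespace with no
# normalization, and "performs no worse" is taken to mean a per-example WER
# that is less than or equal to the reference model's. All names are
# hypothetical.

```python
from typing import Sequence


def word_edit_distance(ref_words: Sequence[str], hyp_words: Sequence[str]) -> int:
    """Minimum number of word substitutions, insertions, and deletions."""
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate of a hypothesis against a reference transcription."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def qoi(
    references: Sequence[str],
    candidate_outputs: Sequence[str],
    baseline_outputs: Sequence[str],
) -> float:
    """Fraction of examples where the candidate's WER is no worse than the baseline's."""
    if not references:
        raise ValueError("at least one example is required")
    no_worse = sum(
        wer(ref, cand) <= wer(ref, base)
        for ref, cand, base in zip(references, candidate_outputs, baseline_outputs)
    )
    return no_worse / len(references)
```

# Because the comparison is "less than or equal", an example where the
# candidate beats the baseline counts the same as a tie, which matches the
# note that QoI only measures potential regressions.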
COL_NAMES = {
    "model.model_version": "Model",
    "device.product_name": "Device",
    "device.os": "OS",
    "average_wer": "Average WER",
    "qoi": "QoI",
    "speed": "Speed",
    "tokens_per_second": "Tok / s",
    "model": "Model",
    "device": "Device",
    "os": "OS",
    "parity": "Parity %",
    "english_wer": "English WER",
    "multilingual_wer": "Multilingual WER",
}
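# A mapping like COL_NAMES is typically applied when turning raw result
# records into display rows. The helper below is a hypothetical sketch of that
# step (the dashboard's real code is not shown in this file); it uses a small
# copy of the mapping so the snippet is self-contained, and it leaves unknown
# keys unchanged.

```python
from typing import Any, Dict, Mapping

# Subset of COL_NAMES, repeated here only to keep the example standalone.
SAMPLE_COL_NAMES = {
    "model.model_version": "Model",
    "device.product_name": "Device",
    "average_wer": "Average WER",
}


def rename_columns(record: Mapping[str, Any], col_names: Mapping[str, str]) -> Dict[str, Any]:
    """Return a copy of `record` with keys replaced by their display names.

    Keys without an entry in `col_names` are kept as-is. If two raw keys map
    to the same display name (for example "model" and "model.model_version"),
    the later one in iteration order wins.
    """
    return {col_names.get(key, key): value for key, value in record.items()}
```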
CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
CITATION_BUTTON_TEXT = r"""@misc{whisperkit-argmax,
title = {WhisperKit},
author = {Argmax, Inc.},
year = {2024},
URL = {https://github.com/argmaxinc/WhisperKit}
}"""
HEADER = """<div align="center">
<div style="position: relative;">
<img
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAbgAAAG5CAYAAAD8liEWAAAdN0lEQVR4Ae3db3IbRbfH8XNGMhUeyzzKCtBdQXxXECW2c133Dc4KYlaQZAUhKwheQewVoLyhXNixhxVgVoBYASKWIWVp+txuxeEBbhxsSzM9an0/VRQQ/liZGc2vT/8VAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABadSsk27g9NANSYDVSz3MxeHRy1dgXRba6fdp1lj8Vs1b9AOzKXbCCqJ2Ky1/ikke/vf9qXihFwAP6s719G92K8jOCDbfP3TjEqXvpQ6Epa+qru6XevP+tJhTIBgP/oFOfFD+FFK6jUJNzOi+MEwy3omGXfrPvKVCpEwAH4u/bkRYtKXVzzjiRMnX6ztfVLWypCwAH4kM6DtbMtQSXW1t6Ea92R9LV/f7O0LRUh4AB8mHPbgkpkkn0hC8KpVPZ7JeAAfJCp3hFUZVUWhVX3eyXgAFymI6iGLVDA+W5KqQgBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBBwBIEgEHAEgSAQcASBIBB+BSW1u/VHZ2FzBrBByASw2HzUU6iDOKzfXTrqAUBByAS2VOuoJSFaaPBKUg4ABcykQfb27+3hGUYnJtjUZEWQg4AB/TLkbFS0EpLq5tR1CKpgDAx/gKY2Nt+ENjqfFwf//TvmBqoXKbhBvVW6lKDzgV6QtQMybW9k8nMwSvymS1OC9+8kG3q+JeNbOs70aNgdzQrfatQa+nN/7vY9jsTt9V6xqjVdHsbnE+3ub5K58KsOA2usNVa/rAc7qqKl/Qqq6KDUT1xF/vvYOj1q7U0Mb94bZ/Sz4Ss1UCaXb8/a4kewg44G9CS32cjbdV9JmgKv3GJ417dekCvehC/CZUroKZI+CAyELQuaw4NiYBVGXgQ+6/Y4fcJNzOi2PhvpemqoBjFiVwif38037mGvcYR65MLWZsTio3wi0JBBzwESHkXGZfCqrhxz/XI+7sMdlVhG7JZBBwwD84PFzJxdyeoBJqsiWRsKtIWgg44AqsobuCSqjoXYmnI0gGAQdcQajiGIurhplEm47vq8eOIBkEHHBV5r4XAHODgAOuyGl2IgDmBgEHXJGp6wuAuUHAAVe0VCxRwQFzhIADACSJgAOuKCz6FgBzg4ADACSJgAMAJImAAwAkiYADACSJgAMAJImAAwAkiYADACSJgAMAJImAAwAkiYADUDcdicQk3ll0mD0CDkDtrK+fdqViGxvDVSHgkkLAAaidzElXKmYmjwRJIeAA1I6JPt7a+qWyampz8/eOOtkSJIWAA1BH7bPTpW+kIsWoeCkRx/5QjqYAQB2ZdDfWhj80lhoP9/fLOaooVG6TcLPqu0RRPgIOf7G1Ze23g7d/dA3dat8a9Ho6ENSSivTDn00l3KPr3qe22rtJFVbX6sVktTgvfvJBt6viXmXF0sm05/Jtdn/vSHPccaJfFOfjbX8Vazux5P39Dfw97sv11P/+lkylZBv3hyaYczYQ1V5DbW//cCWXBVaH5zm89JzITsst7/by2TQ+wku/yIqu/38/W9SXYV2UcX+3utY+y8626nJ/D45apWdPQMDhelR2l1dGT3u92wtZ1UV/ntV2lovWV7N68X3I+v3Tr1T0maB6C3J/qwo4Jpnge
ky2/eD/cZUz3HDBv/wOXq88KfPlFxwerXzlxHYEVdvl/s4WAYfr8+Miv51+8lJQmdBtFVr2UpHCjb+S64/p4YbC/W24xnOpyKLcXwION2JmWzF2m1hUvl/0edkt+z/L89sDpYqrjL+/+bSTZ65jUe4vAYcbU5ex80NFlpycSMVcJrmgEpbZnlQsc81dSRwBhylYV1CJb/NW5QE3Ho8r/5mLqtWq/lpXWTHGQsAB9deXCEI3lqASsWYla6RnqyoEHAAgSQQcACBJBBwAIEkEHAAgSQQcgI9hognmFgEH4FJKwFWBa1wSAg4A4iLgSkLAYRpsuAygtgg4TIOAA1BbBByAS93gFGmgNgg4TCWcBC1Iljr3s6BcNCJKQ8BhKpkUdFMmzFSZAIG5RcBhKqNMVgXJMtG+oGT2o6AUBBymo0bAJUwbnAlXNpUsF5SCgMN0TL8QJOvgYHIOHd2UJcqKjHP3SkLAYVqd9fXTrqBMHYmq+tOmF4WavYp58KhFf7bKRcBhapmTrqBU/9sdxusKdrorKEVmzScSycbGMPnhBQIOUzPRx5ubLBco0zizLYnkIG+dOLEdwWyp7USt3kweSeIIOMxCuzgvjre2fmHJQElCIyLm9X19tPJElAknM6Nu7+D1SrTqLTRI1Um0RlNVCDjMSufszdIPVHKlaZ+dLn0jER28bt2jkpueiT0/eP3ZtkQ0Ho1fSOLjbwEBh1nq+Erup4214UuCrgQmXX9tj2Ne21DJNVzjv8QcE0+uxQb+j51w7Q6PVr6SSLa2rB2eITVNvnoLVEr24P7wJ0HtlTSbqh+2IVKVE3H2aybWl0z7merAjRqDW+1bg15vvnbK2Lg/NIkvXNe8oba3f7iSS0STGbROV1WsY5J9rmp/dKOqffyZmrcZfCof31LLdLKcYnDxN/7fdb9OFspndnIY+T5t+vvkXHbXxPluUY0+lHBw1Co9e4JKfgjmx1bX2mdy1rGmtcPsSFO9GyoHiSEEo8hJHV7k79Uk4P7GBipXayi8fwmHRoc5+bHxSSPf34830WGjO1y1TLq+K+lx/QLPBv4z7YxdczePOBkkVF1vh8PV8aQhoXd8y6Hjf7ntGxBXCqo6NiQIONRG2FC50NFXolm8WVcqu8sro6e93u2oFV89A25KvhpcyuTpt+8WdUcRnrFx5l74SrAeXWfq9paLlSe9PF4PQ6i6CtNnYmG3oPhV1ywRcKid8BJyWXEcrUXoqw4fcvdihlySAXdBM/n6u8PWU4loY+3NrlgWdfq6ifX8ONlDiSRUbL+9OXvmH7RosyzLVlXAMckEVxbW7GSucU9ibd1ksvrbcOmZoBS+y/LJxtrwh5jLEUZFEV7q0RowYZyt6ZrRQj6E29np2XHK4VYlAg7XEkLOMovWug0vYWZolihyIyLPb/vxxHhLEXyw5DEXX4fKLdwDwUwQcLi2yYywiIt+i9GY1m2JQiMi5v6i1tCeROIbb9GWP4QxNyq32SLgcCOm9lwiUdG7glJpmNwQyUHEyS6t1jjazy4iXvNUEXC4kYt1PVHGSszS34EhOpNuzLG4f1pzVpZYE5gm3e7GpuWzRsDhxtTcK4mDPS8rcPZmaSF2u6iD0WjEuFsJCDjcmFMOakwap7VXpmHKtS4BAYcbM3V9QbLU0RVcFadyRzBzBBxuLMuyviBZlum/BZXQxHYqqQsCDjfWGDXmaqNkXNPiTebheU4MAYcbi7kgFigBAZcYAg4A4qOLsgQEHABEdtWjb3A9BByA2rk4tw6YCgEH4DIxq4rqA07j7J6C8hBwAC4TLeBMjAoOUyPgANRO5uxXqZhG+JkoFwEHoHaibAOnwtZziSHgANRPZhECLt7eqiZsi1YGAg5A7YzHk3PZKh2H+9fKeS5ICgEH4FKxzoTL89uDKrsM1exVrLPgUB4CDsClhsNmtGNcqjw13jXka4lkY2PIUTklIeAAXEpNoh16Gk6Nd2I7Uja1nYsT6qMwk0eCUhBwAC5n+
ihWN2Xw+mjlia/kelIWlfzg9coTiWRz8/eOuniNiNQRcAA+pn12uvSNRHT4euWhSQndlb5yO3jduicRjUfjF8IMytIQcJiKCtsbJc+ku7E2PA7VhkRyeLTyVcM1/kvM7U33zIUdUmwn/L9iVm5bW9YO11RNqd5K1JQFFB6ut4O3f3S7cK4Z8A98yBXnxU/+pbzbUNuTcbNf9ffm4udth7/e6A5XffN81X+wVZPsc1V7/31+/+d3MyJN/X/jfvV/cSIuOznIV6KtdZu8d4bDVeeyu2dvhj5cOcW7bCoLYnP9tFu47JGKddNbVKm7B0fLX0oED+4Pf4pxPQ+OWlGe3Y37QxN8hA1U9I/p9vZ+A2MfNJrZwDn5uZnZya1W66TX07malr/Z9RVsNuq6LLvjfz8d323a+fsxN/7X2gTXP6vq+5t8BRe6VYpR8bJw0p08fgKgPNq2P2/SbO8bP/675961qAuncvbmTHw1mPtf3vMvu12psUnj2PRZYUV3Mqrjwq++e5P8//fJwtQMcyHpgJuE23lxLAziAvVjodE5Gd973FhqPNzfr9dQQehS/P307IVvHG8L5lLSk0xC5SaEG1BvJquhIRpzEsuHnJ2eHTsj3OZZsgHnx0q2L1qIAOqvc9EgrQU/tvwiBK9grqVbwSm7AwBzxTdI1/14l0QWKkk/thZtCQFmJ92Ao/UFzB01fSaR+Uoy+mfAbCQZcN3uZGshpuoC88Y3TGNuDXbxGVh8nYgkA+6W3CLcgPnUPh8sdSSSzXddpLw/EpFkwLEzSXU4iRizNsriDS+MnTK0kRD2osQ86gvSpRYtZFSsI0gGAQegXiz7t0SiZnRPJoSAA1Arau62RGJZ9rkgGQQcgFqxTKNVcEhLkgE32fUbwFxSZjFiRqjgANSKGQGH2Ugz4JrjjiBlHUG6NN4sWZuc+I1UUMEBwIXM2a+CZCQZcIVjLQuA6zOdr1PG8XFJBpyKdgTAfDLtSyQqQsAlJM0uSqOCA+aVivtZIvFjcH1BMpIMOBZrAnNMsxOJpRHxZ2PmEq3gOAsOmFf/WjnPJZLRaNQXuimTkVzAcdwFMMdU8l7vdrSAyXP/s1Wo4hKRXMAVpo8EwFwytecSWR0+A2YjqYCbVG8m2wJgHu0eHq7kEtnkM6jbE8y9ZAJuc/P3TuH0pQCYP75bcNktP5WaGBXFE7oq518SARcqt+K8OBa2cALmj6+Wlovle728Pousw1jcqBjdE6OSm2dNmWOTYPNjboWjWxKYLzYQ1ZMw3nV4+FkuNTSZcCKyvb5+uisue6xiW4K5UvuA29qy9tvB23a2VLTHznWcZR0VveMftq4Pto7MmEq8jV5vyr8k2D8PFbOB/x7+Y8Vl7zdOvtidxMR+lMxOWuPWSZ0qto+5GBcMf4gPu66ZtXXyHgobSmT/Fv3LxhL+n/11Fre/Bu9+n5MNKJQZ3hUqPeAe3B/+ZFN0HZ69OZt0pBZF+LssBJBMviYyhdC3br716L9spq5fFEsnbbk1mJcvHBCHDfz3bmfsmrt5/mlfIvrfjeGqK9yqy7I75nxovAuZP8LlfbBOQlit75z83PTBuj/lJJZpJ8FsdIervkv2iWp21xhSKd1cd1Fez7svZ8u1vibIgGuajJOtPIn53QkTydy4eGzOtkdFCDLf8nXhn5i8b/H+0fC1d+FhF/8sNIwLp7Jx/zR0jfYaanv7EWZsHuStMHFlOxzKXDSKl/6zdQWlWYyA8xVbo2g+3I/c6gTmkQ+J3uHrz7YlIt8T9KI4L568+zuVm/PVnsm2H7vf3lgb7jaWGs/396t/L1y8i+75z3BMyJUn/fPgwvTjYvke4QZcXxiTbrpm1On7IQR8EfZEZi0E3XlxHCpDiWRUjB4KW4OVJumAC1/ORtF4SJckcDM+WPKYjcNQuZVc4XSKURFt/WyYqaliO4JSJB1w/sv5nMoNuDnLLNo6sLAMqJTK7e98gK6/28M2CmtoT1CKlANucHDU2hUAN
9ZqjaPt5uHHyZ5JRbTCn/V3BwctdkwpSbIBp2bfC4CpxNrZfzIuVuXkC5PVra1foq1Rm8f1t/Mg3QqOfeSAuTUajao+07E9HDY5RzIx6VZwwsQSYF41TCsPm4bTaAHHou9yJBtwpu+2BgIwf5zKHamY++uWW0hAsgHn1FHBlSzsxiBACTTCno3qqKJSk2zAfTKmixKY0kJ9hyzTfwuSkm4FJw0CDpjOYn2HjAouNelv1QVgHi3UsTIsEygHAQegdv5+phpwEwQcALzTkUj+OBQVM0XAAfgg5aVbGRPrC2Yu2YDLpKCLA5iCcYxLZTJnvwpmLtmAGze0IyhXc9wRAFMzZVlTGdLdycSMCq5kI65x4uxHQUWMvXNLkPBmy8bGqSXLTLuCZKlkuaAajYyAK0HCk0y08r3sFo5Gu8Z051QgK3jpVuXiTDie6xlLN+Ain++UusrP6/oTVRbFlk3NXu3nn/Ylkli768d9Z8Q7PT1VKS8TaP/2pln+cfcLqhgV0U5ANuOsv7JZU7+SSDY2htGGF2KeCddwza8FM5X0OjgTfUwVN3sX1du2RKKqrwTlUdu56DKL8+ML25JI1CTazw4VsxPbEcxM6gu922enS8eE3OyEcCvOi2OJJOzZ993r5Z6gHCr5weuVaD0f4fnyDdNHEovpo0kDLpLXR/7a+3sgmIn0dzLxY3Eh5GI+tKnYXD/tXoRbRyII4Za5xj1BOULl9roV9fqOR+MXEvd067bvfn8pEYV7QCU3G4uxVZcPufBifnD/7BlBd30h2DbWhseF02jhFlq1IdxiTnxIkw18l2/PMrsXs3Lb2rL2/6wNX6pptC7CP5h0H6ydfRO7kgv3xFez9FZMQaVkD+4PfzKp2TlL/mWpZt/7FutJM8v6nywv93s9dhIIwovm/Oys4wq36rLsjjjZinP/bOAH2/ri75Nl0js8XMmlBjbuD01q6p+OXLH3s09N+yruZ5dJ3hq3Tnp5vGc/nArvMvfIxPlw1boNJfTVP3tNlb1vI45JbnWtfdo47TZMV51kd1TfbbCgHzm/rnbv3L85OGqVnj1BJT9kGt3uf1pRzea4E3YoCTfaVO/Ofpq6b83KYgadSfjSzPYFE164vgWa+//3j/6FOhCXnYwuDqLN57QSq0fA2cB/iB3f/5LHDv5Q3Y+drmaZfG7OPz9qHf/L7ascd1PGM1em65zZ9r4xoSonYu77bGnpZH8/3jO/7u+TFratmt2tQ/gRcFcQWn+Fjr4SzeINSuNDdn33yl5dqq5Zih5wvvdhuVh+GLXq8l13blw8Nmfb8xRQ0ansNpYaz2MGXXhnjjP3QiXeTNWAgLuGje5w1XclfFP3sjx1oYXrMvsyxWB7L2bA+Yqnd3i08lAi8kMOL/wFYH3pFPy76uvvDltPJaKNtTe7YvEKg6oCLolJJgd56+Ridh3jaJGEcPuXW/7vlMMtpnB9m64Z+aU4PCbcpmdOnoRrKRGNiiLcx+Tfl8nMogyz63y3WNTW7aJ6P30/ZrdZ6nywPI85gzRUbrG2ZktSmKm5PnwhkeT57YEuwFKEpJYJTKoHFklWzonsMH2/XEsu3vZkYSIJldvshUouTP6QSKyR/hKE5NbBmdpzQZUGrc9Gu4JSfZvHm6ZemEbbdzR1GvHaxtyOrSrJBdzFGBBdZRUJ6wl7vdtc73JFnTFJ12SJ/LWNuZXgdZY+zKMkdzLxVVwuqISq5oKyRQu44rzoCkp19mYp/u4tiUoy4DKTHwWVKDLj6Jqk2aqgXMo1LkuaFZxYX1CJ5rjZF6Qr3qntC0Md63fLkmTAOWUMriq32m+51sA0Mv1cUIokA25JWY9VFSaYpO1jG/piNuwK+3biZtI8LoduMwBYeItxHhwA1FdHUAoCDtOgexJAbSUZcGwbVRkCDpiBmIu9U0YFBwCRvR3cIuBKQMABuBRnLGKeEXAAgCQRcAAQGRsmlIOAA4DI2DChHAQcUH9MQABuINmAS/2co1pQrnFFC
LjExVomkPokomQDzinHuJTOCLiqbHZ/70jFNtdPu4JKDIfNyo/M2dgYJn9MT7pdlKbfC0plme0JKmFZUfmhmIXpI0El1KT6Q08LIeDm1diNdummLJHKyeHhSi6ohBN5XGU31uamrxhNuoJq+MbE5JpX65kkLtmAy/PbA3PyUNhOauZCw6FRNB4KqtT5bbhU2QtpPBq/EBZ5V6ldjIpvqmrEPLg/XIj7m/QsyoO8deKbvveo5GYnXMvMNe6x32f1fIPtyYP14YsyX4JbW9b+n7XhSzWtvsts0Zmsnp0uHZdZyYX7G8LNRJ7IAlBZEBv3h9v+d/tIzHy/szIr7VpsIKon/gu4d3DU2pUF5p8jk/j6/kPsfNKQ/NuD1kwmU4VJLC5zj0zcE74f0fX9uypfymRnFvc3hNr52VlnXOgXdbm//j1SSfYsTMABs1CTgFswFoYZ9iyT3njc7OdT9h50fZg3m+OOFratmt2t+1T59z1QKU3pJ+CAGiLgqmVivZZrfdnLtZSx9FC5jjP3QsXokq1QVQHXFACooRBuh0crpU5muhhLfrix9mZXLGNZRGLYqgtA7YRuuaZrPpWKjIoiTLpgxnViCDgAteMDbqfKmbphWZHvptwRJIWAA1A7DSe5VCxzzV1BUgg4ALXzbT6b5Q/XcVEx0k2ZEAIOQN30JRIl4JJCwAEAkkTAAQCSRMABAJJEwAEAkkTAAQCSRMABAJJEwAEAkkTAAQCSRMABAJJEwAFXFM4OEwBzg4ADrqhYKtoCYG4QcMAVNcwIOGCOEHDAFY2drgqAuUHAAVekKncEwNwg4ICrMukKgLlBwAFXsLk5mUHZEZTOV8rRzmQz5Ty4lBBwwBUUo+KZoBJmUvlp3n/64d8LkkHAAf9gUr2ZbAsqoaqvJBLLpCdIBgEHfEQIt+K8OBZUQkX6371ejhYyh4cruf8QuSAJBBxwiT+FW0dQuhBumWvck8gaRePL8FkEc4+AA/5mc/20++D+2bPifPyDEG7V8FVTCLf9/NO+RBY+Q/gshNz8UwEW1NaWtd8O3rZdY7TqLOtkqnfNXNd/Ldix5EpsoKJTzTo00dwytzfpGqyhtbU3W2qNR6rWUZMbPxcmYRccnqv3Do5alWRP6T/kwf3hTwLUjFGZ3YAN/HXb8f0+eWvcOunlypT6a1r3vQNa2LZqdneRn8FkAm7j/tAEwHxTt7dcrDwh1GYjnEwxztwLFduSBVRVwDUFAD7Cd6/1Dl9/ti2YmYuxxocba292xbJHglIwyQTApcJEi6ZrPhWUYlQUT/yfqIpLQsABuJQfX8jrMLMxVXl+e+C7KXcEpSDgAFzKMtsTlMplLCwvCwEH4FKt1jjevpALYjzmGpeFgANwqV7vNuNDJQvdlIJSEHAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwAIAkEXAAgCQRcACAJBFwABAfR+aUgIAD8EGqwkGcFdHFCrjKfq8EHIAPc/azoBrmvpdFUWHDiYAD8EGm2hNUwjV0VxaFyZ5UhIAD8P/4LrP+wVFrV1CJw8OV3F/0XBIXnquGa+RSEQIOwF+El1DmGvcElWoUjS/DtZeUqT7dzz/tS0UIOAD/4cdHQrhV+RLCO+Gah2ufYsiF35Nldu+718uVdnurlGzj/tAEQG2Fl48TPZHM7Uy6yhCdf29um+gXqtbxY1arMpds4Cu2EzN51XLLu71cWQoBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAoN7+D8MSQdHK7cEnAAAAAElFTkSuQmCC" | |
style="display:block;width:7%;height:auto;"
/>
</div>
</div>"""
EARNINGS22_URL = (
    "https://huggingface.co/datasets/argmaxinc/earnings22-debug/resolve/main/{0}"
)
LIBRISPEECH_URL = (
    "https://huggingface.co/datasets/argmaxinc/librispeech-debug/resolve/main/{0}"
)
AUDIO_URL = (
    "https://huggingface.co/datasets/argmaxinc/whisperkit-test-data/resolve/main/"
)
WHISPER_OPEN_AI_LINK = "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/{}/{}"
BASE_WHISPERKIT_BENCHMARK_URL = "https://huggingface.co/datasets/argmaxinc/whisperkit-evals-dataset/blob/main/benchmark_data"
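# The URL constants above that contain `{0}` or `{}` are str.format templates.
# The sketch below shows one way they might be filled in; the helper name, the
# file name, and the two path segments passed to the second template are
# hypothetical placeholders, since this file does not say what the
# placeholders stand for. Each value is percent-encoded before substitution so
# that spaces or slashes in a name cannot change the URL path.

```python
from urllib.parse import quote

# Copies of two templates from this module, repeated to keep the example standalone.
SAMPLE_FILE_TEMPLATE = (
    "https://huggingface.co/datasets/argmaxinc/earnings22-debug/resolve/main/{0}"
)
SAMPLE_TREE_TEMPLATE = "https://huggingface.co/datasets/argmaxinc/whisperkit-evals/tree/main/WhisperKit/{}/{}"


def fill_url(template: str, *parts: str) -> str:
    """Substitute percent-encoded path segments into a str.format template."""
    return template.format(*(quote(str(part), safe="") for part in parts))


# Hypothetical values, for illustration only.
file_url = fill_url(SAMPLE_FILE_TEMPLATE, "sample 1.mp3")
tree_url = fill_url(SAMPLE_TREE_TEMPLATE, "openai_whisper-tiny", "librispeech")
```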
AVAILABLE_LANGUAGES = [
    "af",
    "am",
    "ar",
    "as",
    "az",
    "ba",
    "be",
    "bg",
    "bn",
    "br",
    "ca",
    "cs",
    "cy",
    "da",
    "de",
    "el",
    "en",
    "es",
    "et",
    "eu",
    "fa",
    "fi",
    "fr",
    "gl",
    "ha",
    "he",
    "hi",
    "hu",
    "hy",
    "id",
    "it",
    "ja",
    "ka",
    "kk",
    "ko",
    "lo",
    "lt",
    "lv",
    "mk",
    "ml",
    "mn",
    "mr",
    "mt",
    "ne",
    "nl",
    "nn",
    "oc",
    "pa",
    "pl",
    "ps",
    "pt",
    "ro",
    "ru",
    "sk",
    "sl",
    "sq",
    "sr",
    "sv",
    "sw",
    "ta",
    "te",
    "th",
    "tk",
    "tr",
    "tt",
    "uk",
    "ur",
    "uz",
    "vi",
    "yi",
    "yo",
    "yue",
    "zh",
]
# Map each language code to its English name via the iso639 package.
LANGUAGE_MAP = {lang: Lang(lang).name for lang in AVAILABLE_LANGUAGES}