wanchichen committed
Commit 756e566
Parent: 8968ae2

Update README.md

Files changed (1): README.md (+4 -4)
README.md CHANGED
@@ -4045,13 +4045,13 @@ language:
  [XEUS - A Cross-lingual Encoder for Universal Speech]()


- XEUS is a large-scale multilingual speech encoder by Carnegie Mellon University's [WAVLab]() that covers over **4000** languages. It is pre-trained on over 1 million hours of publicly available speech datasets. It requires fine-tuning to be used in downstream tasks such as Speech Recognition or Translation. Its hidden states can also be used with k-means for semantic Speech Tokenization. XEUS uses the [E-Branchformer]() architecture and is trained using [HuBERT]()-style masked prediction of discrete speech tokens extracted from [WavLabLM](). During training, the input speech is also augmented with acoustic noise and reverberation, making XEUS more robust. The total model size is 577M parameters.
+ XEUS is a large-scale multilingual speech encoder by Carnegie Mellon University's [WAVLab](https://www.wavlab.org/) that covers over **4000** languages. It is pre-trained on over 1 million hours of publicly available speech datasets. It requires fine-tuning to be used in downstream tasks such as Speech Recognition or Translation. Its hidden states can also be used with k-means for semantic Speech Tokenization. XEUS uses the [E-Branchformer]() architecture and is trained using [HuBERT]()-style masked prediction of discrete speech tokens extracted from [WavLabLM](). During training, the input speech is also augmented with acoustic noise and reverberation, making XEUS more robust. The total model size is 577M parameters.

  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630438615c70c21d0eae6613/BBRKYvTjJmx2B5oyWBLcZ.png)

- XEUS tops the [ML-SUPERB]() multilingual speech recognition leaderboard, outperforming [MMS](), [w2v-BERT 2.0](), and [XLS-R](). XEUS also sets a new state-of-the-art on 4 tasks in the monolingual [SUPERB]() benchmark.
+ XEUS tops the [ML-SUPERB](https://arxiv.org/abs/2305.10615) multilingual speech recognition leaderboard, outperforming [MMS](https://arxiv.org/abs/2305.13516), [w2v-BERT 2.0](https://arxiv.org/abs/2312.05187), and [XLS-R](https://arxiv.org/abs/2111.09296). XEUS also sets a new state-of-the-art on 4 tasks in the monolingual [SUPERB]() benchmark.

- More information about XEUS, including ***download links for our crawled 4000-language dataset***, can be found in the [project page]().
+ More information about XEUS, including ***download links for our crawled 4000-language dataset***, can be found in the [project page](https://www.wavlab.org/activities/2024/xeus/).


  ## Requirements
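
The paragraph in the hunk above mentions that XEUS's hidden states can be clustered with k-means for semantic speech tokenization. Below is a minimal sketch of that step, assuming per-frame hidden states are already available as an array; the feature shapes and the 500-cluster codebook size are illustrative assumptions, not values from the README.

```
# Minimal k-means tokenization sketch over precomputed encoder features.
# `feats` is a stand-in for real XEUS hidden states of shape (n_frames, hidden_dim).
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
feats = rng.standard_normal((2000, 1024)).astype(np.float32)  # placeholder features

# Fit a codebook over the frames; 500 clusters is an arbitrary illustrative choice.
kmeans = MiniBatchKMeans(n_clusters=500, batch_size=1024, random_state=0)
kmeans.fit(feats)

# Map each frame to its nearest centroid -> one discrete token per frame.
tokens = kmeans.predict(feats)
print(tokens[:20])
```

In practice the codebook would be fit once over a large feature corpus and then reused to tokenize new utterances.
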
@@ -4067,7 +4067,7 @@ git lfs install
  git clone https://huggingface.co/espnet/XEUS
  ```

- XEUS supports [Flash Attention](), which can be installed as follows:
+ XEUS supports [Flash Attention](https://github.com/Dao-AILab/flash-attention), which can be installed as follows:

  ```
  pip install flash-attn --no-build-isolation
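
Once flash-attn is installed, a short smoke test can confirm the kernel actually runs. This is a hedged sketch, assuming a CUDA GPU and the package's public `flash_attn_func` entry point, which takes half-precision tensors in (batch, seqlen, num_heads, head_dim) layout.

```
# Smoke test for a flash-attn install; needs a CUDA GPU and fp16/bf16 tensors.
import torch
from flash_attn import flash_attn_func

# flash_attn_func expects (batch, seqlen, num_heads, head_dim) tensors.
q = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v)  # output has the same shape as q
print(out.shape)  # torch.Size([1, 128, 8, 64])
```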