Revise grammar #2 by Corbanp - opened
README.md CHANGED
@@ -20,7 +20,7 @@ widget:
 
 MAGNeT is a text-to-music and text-to-sound model capable of generating high-quality audio samples conditioned on text descriptions.
 It is a masked generative non-autoregressive Transformer trained over a 32kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz.
-Unlike prior work, MAGNeT
+Unlike prior work, MAGNeT requires neither semantic token conditioning nor model cascading, and it generates all 4 codebooks using a single non-autoregressive Transformer.
 
 MAGNeT was published in [Masked Audio Generation using a Single Non-Autoregressive Transformer](https://arxiv.org/abs/2401.04577) by *Alon Ziv, Itai Gat, Gael Le Lan, Tal Remez, Felix Kreuk, Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi*.
 
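For readers checking the figures in the README lines touched by this change, here is a rough sketch of the arithmetic they imply. It assumes that "4 codebooks sampled at 50 Hz" means each codebook emits 50 tokens per second of 32 kHz audio. The function names are hypothetical and are not part of any MAGNeT or EnCodec API.

```python
from typing import Tuple

# Figures quoted in the README text: 32 kHz audio, 4 codebooks, 50 Hz frame rate.
SAMPLE_RATE_HZ = 32_000
FRAME_RATE_HZ = 50
NUM_CODEBOOKS = 4


def token_grid_shape(duration_s: float) -> Tuple[int, int]:
    """Return (codebooks, frames) for a clip of the given duration.

    Assumption: every codebook produces one token per frame.
    """
    frames = int(round(duration_s * FRAME_RATE_HZ))
    return NUM_CODEBOOKS, frames


def samples_per_frame() -> int:
    """Number of raw audio samples summarized by one token frame."""
    return SAMPLE_RATE_HZ // FRAME_RATE_HZ


if __name__ == "__main__":
    codebooks, frames = token_grid_shape(10.0)
    # Total tokens the model has to fill in for a 10-second clip.
    print(codebooks, frames, codebooks * frames)
    print(samples_per_frame())
```

Under these assumptions, a 10-second clip corresponds to a 4 x 500 grid of tokens, which is the grid the single non-autoregressive Transformer is described as generating.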