---
license: cc-by-nc-sa-4.0
---

### Description

This model separates reverb and delay effects from vocals. It can also separate some backing harmonies, though not completely. In the dataset, I applied a random high cut after the reverb and delay effects, so the model is not particularly aggressive in how it handles high frequencies.<br>
You can listen to examples of this model's output [here](https://huggingface.co/Sucial/Dereverb-Echo_Mel_Band_Roformer/tree/main/examples)!

### How to use the model?

Try it with [ZFTurbo's Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
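
As a rough sketch, inference with that repository usually goes through its `inference.py` script, run from the repository root. The snippet below only assembles and prints the command. The flag names follow the repository's README at the time of writing and may change, and the `input/` and `output/` folders are hypothetical placeholders.

```python
import sys


def build_inference_command(config_path, checkpoint_path, input_folder, store_dir):
    """Assemble an inference.py call for a Mel-Band Roformer checkpoint.

    Assumption: flag names match the current Music-Source-Separation-Training
    README; check `python inference.py --help` in your checkout.
    """
    return [
        sys.executable, "inference.py",
        "--model_type", "mel_band_roformer",
        "--config_path", config_path,
        "--start_check_point", checkpoint_path,
        "--input_folder", input_folder,
        "--store_dir", store_dir,
    ]


if __name__ == "__main__":
    cmd = build_inference_command(
        config_path="config_dereverb_echo_mbr_v2.yaml",
        checkpoint_path="dereverb_echo_mbr_v2_sdr_dry_13.4843.ckpt",
        input_folder="input/",   # hypothetical folder holding the wet vocals
        store_dir="output/",     # hypothetical folder for the separated stems
    )
    # Print the command; pass `cmd` to subprocess.run to actually execute it.
    print(" ".join(cmd))
```

The config and checkpoint names above are the V2 files listed in this README; swap in the V1 pair to try the older model.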

### Model

#### V2 Models

Finetuned from: `dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt`<br>
Fine-tuned on 1000+ songs.

Config: [config_dereverb_echo_mbr_v2.yaml](./config_dereverb_echo_mbr_v2.yaml)<br>
Model: [dereverb_echo_mbr_v2_sdr_dry_13.4843.ckpt](./dereverb_echo_mbr_v2_sdr_dry_13.4843.ckpt)<br>
Instr dry sdr: 13.4843 (Std: 4.8675)

#### V1 Models

Configs_256_8_4: [config_dereverb-echo_mel_band_roformer.yaml](./config_dereverb-echo_mel_band_roformer.yaml)<br>
Model_256_8_4: [dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt](./dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt)<br>
Instr dry sdr: 13.1507, Instr other sdr: 6.8830, Metric avg sdr: 10.0169
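
The three numbers above appear to be related by a plain mean: the average SDR matches the mean of the two per-stem SDRs. A minimal check, assuming that is how the metric is aggregated:

```python
def average_sdr(per_stem_sdr):
    """Return the unweighted mean of per-stem SDR values (in dB)."""
    return sum(per_stem_sdr.values()) / len(per_stem_sdr)


# Per-stem scores reported for the 256_8_4 V1 model.
scores = {"dry": 13.1507, "other": 6.8830}
avg = average_sdr(scores)
print(f"average SDR: {avg:.4f} dB")
```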

Configs_128_4_4: [config_dereverb-echo_128_4_4_mel_band_roformer.yaml](./config_dereverb-echo_128_4_4_mel_band_roformer.yaml)<br>
Model_128_4_4: [dereverb-echo_128_4_4_mel_band_roformer_sdr_dry_12.4235.ckpt](./dereverb-echo_128_4_4_mel_band_roformer_sdr_dry_12.4235.ckpt)<br>
Instr dry sdr: 12.4235

Instruments: [dry, other]<br>
Finetuned from: `model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt`<br>
Datasets: 
- Training datasets: 270 songs from [opencpop](https://github.com/wenet-e2e/opencpop) and [GTSinger](https://github.com/GTSinger/GTSinger)
- Validation datasets: 30 songs from my own collection
- All random reverb and delay effects were generated by [this Python script](./scripts/create_reverb_delay.py) and organized into the MUSDB18 dataset format.
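
For orientation, a MUSDB18-style layout typically means one folder per song, holding one audio file per stem plus the mixture. The sketch below lists the files such a folder would contain for the `[dry, other]` instruments and reports any that are missing. The `.wav` file names, the `mixture.wav` name, and the `train/song_001` path are assumptions for illustration and are not taken from the linked script.

```python
from pathlib import Path

# Stems named in this README; the file naming below is an assumption.
INSTRUMENTS = ["dry", "other"]


def expected_song_files(song_dir, instruments=INSTRUMENTS):
    """Return the paths a MUSDB18-style song folder is assumed to contain."""
    song_dir = Path(song_dir)
    names = [f"{name}.wav" for name in instruments] + ["mixture.wav"]
    return [song_dir / name for name in names]


def missing_files(song_dir, instruments=INSTRUMENTS):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_song_files(song_dir, instruments) if not p.is_file()]


if __name__ == "__main__":
    # Hypothetical song folder; replace with a real path from your dataset.
    for path in missing_files("train/song_001"):
        print(f"missing: {path}")
```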

### Thanks

- Mel-Band-Roformer [[Paper](https://arxiv.org/abs/2310.01809), [Repository](https://github.com/lucidrains/BS-RoFormer)]
- [ZFTurbo](https://github.com/ZFTurbo)'s training code [[Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)]
- [CN17161](https://github.com/CN17161) provided GPUs.
- [Glucy-2](https://github.com/Glucy-2) provided technical assistance.