---
license: cc-by-nc-sa-4.0
---

### Description

This model removes reverb and delay effects from vocals. It can also separate some backing harmonies, though not completely. I applied a random high cut after the reverb and delay effects in the dataset, so the model is not particularly aggressive in its handling of high frequencies.<br>
You can listen to examples of this model's output [here](./examples)!

### How to use the model?

Run it with [ZFTurbo's Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training), using the config and checkpoint listed in the Model section.
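As a rough illustration, inference with that repository is usually launched through its `inference.py` script. The sketch below only assembles and prints the command. The folder names `input_vocals` and `separated` are placeholders, the config and checkpoint are assumed to sit in the directory you run from, and the flag names follow the repository's README as I understand it, so check them against the version you clone.

```python
import shlex

# File names as listed in the Model section of this card.
CONFIG = "config_dereverb-echo_mel_band_roformer.yaml"
CHECKPOINT = "dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt"


def build_inference_command(input_dir: str, output_dir: str) -> list[str]:
    """Assemble the argument list for inference.py (flag names assumed from the repo README)."""
    return [
        "python", "inference.py",
        "--model_type", "mel_band_roformer",
        "--config_path", CONFIG,
        "--start_check_point", CHECKPOINT,
        "--input_folder", input_dir,   # folder with the vocal tracks to process
        "--store_dir", output_dir,     # folder where the separated stems are written
    ]


if __name__ == "__main__":
    # Placeholder folder names; this prints the command instead of running it.
    print(shlex.join(build_inference_command("input_vocals", "separated")))
```

Run the printed command from inside the cloned repository, or pass the list to `subprocess.run(..., check=True)` with `cwd` set to the repository path.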

### Model

Configs: [config_dereverb-echo_mel_band_roformer.yaml](./config_dereverb-echo_mel_band_roformer.yaml)<br>
Model: [dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt](./dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt)<br>
Instruments: [dry, other]<br>
Finetuned from: `model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt`<br>
Datasets:

- Training dataset: 270 songs from [opencpop](https://github.com/wenet-e2e/opencpop) and [GTSinger](https://github.com/GTSinger/GTSinger)
- Validation dataset: 30 songs from my own collection
- All random reverb and delay effects were generated by [this Python script](./scripts/create_reverb_delay.py) and arranged in the MUSDB18 dataset format.
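To make the data preparation idea concrete, here is a minimal pure-Python sketch of one way to build a training example: a randomized feedback delay, followed by a random high cut on the effect signal only. It is not the linked script. The parameter ranges, the one-pole filter, and the choice to store the effect signal as the `other` stem are all assumptions made for illustration, and reverb (which would normally be done by convolving with an impulse response) is left out.

```python
import math
import random


def feedback_delay(dry, sample_rate, delay_s, feedback, repeats):
    """Return only the echo (wet) signal: delayed copies of `dry` that decay each repeat."""
    step = int(delay_s * sample_rate)
    wet = [0.0] * (len(dry) + step * repeats)
    gain = 1.0
    for k in range(1, repeats + 1):
        gain *= feedback              # each repeat is quieter than the previous one
        offset = k * step
        for i, x in enumerate(dry):
            wet[offset + i] += gain * x
    return wet


def high_cut(signal, sample_rate, cutoff_hz):
    """One-pole low-pass filter, used here as a simple stand-in for a high-cut EQ."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out


def make_training_pair(dry, sample_rate, rng):
    """Build (mixture, dry, other) stems with randomized, hypothetical effect settings."""
    wet = feedback_delay(
        dry, sample_rate,
        delay_s=rng.uniform(0.1, 0.5),
        feedback=rng.uniform(0.2, 0.6),
        repeats=rng.randint(2, 5),
    )
    # The high cut is applied after the effect, and only to the effect signal.
    wet = high_cut(wet, sample_rate, cutoff_hz=rng.uniform(2000.0, 12000.0))
    dry_padded = dry + [0.0] * (len(wet) - len(dry))
    mixture = [d + w for d, w in zip(dry_padded, wet)]
    return mixture, dry_padded, wet
```

In a MUSDB18-style layout, each song folder would then hold one audio file per stem plus the mixture, with the mixture equal to the sum of the stems.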

Metrics: SDR and related metrics computed on the 30 validation songs.

```
Instr dry sdr: 13.1507 (Std: 4.1088)
Instr dry l1_freq: 53.7715 (Std: 13.3363)
Instr dry si_sdr: 12.7707 (Std: 4.6134)
Instr other sdr: 6.8830 (Std: 2.5547)
Instr other l1_freq: 52.7358 (Std: 11.8587)
Instr other si_sdr: 5.9448 (Std: 2.8721)
Metric avg sdr : 10.0169
Metric avg l1_freq : 53.2536
Metric avg si_sdr : 9.3577
```
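For readers unfamiliar with the metric, the sketch below shows a common definition of SDR (signal energy over error energy, in dB) and how the reported average relates to the per-stem values. The `sdr` helper and its `eps` term are illustrative assumptions and may differ in detail from the evaluation code that produced the numbers above.

```python
import math


def sdr(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB: 10 * log10(sum(ref^2) / sum((ref - est)^2))."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal + eps) / (error + eps))


# Per-stem mean SDR values copied from the validation results above.
per_stem_sdr = {"dry": 13.1507, "other": 6.8830}

# The reported "Metric avg sdr" matches the plain mean of the two stems, up to rounding.
avg_sdr = sum(per_stem_sdr.values()) / len(per_stem_sdr)
print(f"average SDR: {avg_sdr:.4f} dB")
```

The same averaging reproduces the `l1_freq` and `si_sdr` averages from their per-stem values to within rounding of the last digit.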

### Training log

Training logs: [train.log](./train.log)<br>
The following image is a TensorBoard visualization of the training log, generated by [this script](./scripts/start_tensorboard.py).

![image](./tensorboard.png)

### Thanks

- Mel-Band-Roformer [[Paper](https://arxiv.org/abs/2310.01809), [Repository](https://github.com/lucidrains/BS-RoFormer)]
- [ZFTurbo](https://github.com/ZFTurbo)'s training code [[Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)]
- [CN17161](https://github.com/CN17161) provided GPUs.
- [Glucy-2](https://github.com/Glucy-2) provided technical assistance.