Sucial committed on
Commit
2a939ec
1 Parent(s): 13370e6

Update README.md

Files changed (1)
  1. README.md +49 -3
README.md CHANGED
@@ -1,3 +1,49 @@
- ---
- license: cc-by-nc-sa-4.0
- ---
+ ---
+ license: cc-by-nc-sa-4.0
+ ---
+
+ ### Description
+
+ This model separates reverb and delay effects from vocals. It can also partially separate harmonies, but not completely. In the dataset, I applied a random high cut after the reverb and delay effects, so the model is not particularly aggressive in its handling of high frequencies.<br>
+ You can listen to examples of this model's output [here](./examples)!
+
+ ### How to use the model?
+
+ Try it with [ZFTurbo's Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
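As a rough illustration of what running the model with that repository could look like, the sketch below assembles a command line for its `inference.py` script. The flag names (`--model_type`, `--config_path`, `--start_check_point`, `--input_folder`, `--store_dir`) follow that repository's documentation as best recalled and may differ between versions, so check them against the version you install. The helper `build_inference_command` and the folder names are hypothetical.

```python
import shlex


def build_inference_command(
    input_folder,
    store_dir,
    config_path="config_dereverb-echo_mel_band_roformer.yaml",
    checkpoint_path="dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt",
):
    """Return an argv list for the training repo's inference.py (hypothetical helper).

    The flag names are assumptions based on the Music-Source-Separation-Training
    README; verify them against the installed version.
    """
    return [
        "python", "inference.py",
        "--model_type", "mel_band_roformer",
        "--config_path", config_path,
        "--start_check_point", checkpoint_path,
        "--input_folder", input_folder,   # folder with the vocals to process
        "--store_dir", store_dir,         # folder that receives the separated stems
    ]


if __name__ == "__main__":
    # Print the command so it can be copied into a shell inside the cloned repo.
    print(shlex.join(build_inference_command("input/", "separation_results/")))
```

The command is meant to be run from the root of the cloned training repository, with the config and checkpoint from this model page downloaded next to it.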
+
+ ### Model
+
+ Configs: [config_dereverb-echo_mel_band_roformer.yaml](./config_dereverb-echo_mel_band_roformer.yaml)<br>
+ Model: [dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt](./dereverb-echo_mel_band_roformer_sdr_10.0169.ckpt)<br>
+ Instruments: [dry, other]<br>
+ Finetuned from: `model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt`<br>
+ Datasets:
+ - Training datasets: 270 songs from [opencpop](https://github.com/wenet-e2e/opencpop) and [GTSinger](https://github.com/GTSinger/GTSinger)
+ - Validation datasets: 30 songs from my own collection
+ - All random reverb and delay effects are generated by [this Python script](./scripts/create_reverb_delay.py) and sorted into the MUSDB18 dataset format.
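To make the dataset idea concrete, here is a minimal sketch of how one training pair could be built: a dry vocal, an effect-only signal with a random high cut, and their sum as the mixture. This is not the linked `create_reverb_delay.py`; it only models a feedback delay (no reverb), uses a crude FFT brick-wall filter for the high cut, and every function name and parameter range in it is an assumption chosen for illustration.

```python
import numpy as np


def feedback_delay(dry, sr, delay_s, feedback):
    """Return only the echoes (the wet signal) of a mono float signal."""
    step = max(1, int(round(delay_s * sr)))  # delay time in samples
    wet = np.zeros(len(dry), dtype=np.float64)
    gain, shift = 1.0, step
    # Each repeat is the dry signal shifted by one more delay step,
    # scaled down by the feedback factor.
    while shift < len(dry) and gain > 1e-4:
        wet[shift:] += gain * dry[: len(dry) - shift]
        gain *= feedback
        shift += step
    return wet


def high_cut(x, sr, cutoff_hz):
    """Crude high cut: zero every FFT bin above cutoff_hz."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    spec[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spec, n=len(x))


def make_pair(dry, sr, rng):
    """Build (mixture, dry, other) with random effect settings (assumed ranges)."""
    delay_s = rng.uniform(0.05, 0.5)
    feedback = rng.uniform(0.1, 0.6)
    cutoff_hz = rng.uniform(2000.0, 0.45 * sr)
    other = high_cut(feedback_delay(dry, sr, delay_s, feedback), sr, cutoff_hz)
    return dry + other, dry, other
```

The three arrays correspond to the `dry` and `other` stems listed under Instruments plus their mixture; how they are written to disk depends on the dataset layout the training code expects.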
+ Metrics: Based on the SDR values of the 30 validation songs.
+
+ ```
+ Instr dry sdr: 13.1507 (Std: 4.1088)
+ Instr dry l1_freq: 53.7715 (Std: 13.3363)
+ Instr dry si_sdr: 12.7707 (Std: 4.6134)
+ Instr other sdr: 6.8830 (Std: 2.5547)
+ Instr other l1_freq: 52.7358 (Std: 11.8587)
+ Instr other si_sdr: 5.9448 (Std: 2.8721)
+ Metric avg sdr : 10.0169
+ Metric avg l1_freq : 53.2536
+ Metric avg si_sdr : 9.3577
+ ```
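For readers unfamiliar with the metric, the sketch below shows the textbook definition of SDR (signal-to-distortion ratio, in dB) and checks that the reported `Metric avg sdr` is consistent with the plain mean of the two per-instrument values. The exact implementation in the training code may differ (for example in how channels or silent segments are handled), so treat this as an illustration rather than the evaluation code.

```python
import numpy as np


def sdr(reference, estimate, eps=1e-8):
    """Textbook SDR in dB: reference energy over error energy."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal = np.sum(reference ** 2)
    noise = np.sum((reference - estimate) ** 2)
    # eps keeps the ratio finite for silent or perfectly matched signals.
    return 10.0 * np.log10((signal + eps) / (noise + eps))


# Per-instrument mean SDR values copied from the table above.
per_instrument_sdr = {"dry": 13.1507, "other": 6.8830}
avg_sdr = sum(per_instrument_sdr.values()) / len(per_instrument_sdr)
```

Here `avg_sdr` agrees with the reported 10.0169 to within rounding of the printed values, which is also the number that appears in the checkpoint file name.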
+
+ ### Training log
+
+ Training logs: [train.log](./train.log)<br>
+ The image below is the TensorBoard visualization of the training log, generated by [this script](./scripts/start_tensorboard.py).
+ ![image](./tensorboard.png)
+
+ ### Thanks
+
+ - Mel-Band-Roformer [[Paper](https://arxiv.org/abs/2310.01809), [Repository](https://github.com/lucidrains/BS-RoFormer)]
+ - [ZFTurbo](https://github.com/ZFTurbo)'s training code [[Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)]
+ - [CN17161](https://github.com/CN17161) provided GPUs.
+ - [Glucy-2](https://github.com/Glucy-2) provided technical assistance.