# Cell Segmentation

## Training

The data structure used to train cell segmentation networks differs from the one used to train classification networks at the WSI/patient level. Currently, because of the massive number of cells in a WSI, all well-known cell segmentation datasets (such as [PanNuke](https://warwick.ac.uk/fac/cross_fac/tia/data/pannuke), https://doi.org/10.48550/arXiv.2003.10778) provide only patches with cell annotations. Therefore, we use the following dataset structure (with k folds):

```bash
dataset
├── dataset_config.yaml
├── fold0
│   ├── images
│   │   ├── 0_imgname0.png
│   │   ├── 0_imgname1.png
│   │   ├── 0_imgname2.png
...
│   │   └── 0_imgnameN.png
│   ├── labels
│   │   ├── 0_imgname0.npy
│   │   ├── 0_imgname1.npy
│   │   ├── 0_imgname2.npy
...
│   │   └── 0_imgnameN.npy
│   └── types.csv
├── fold1
│   ├── images
│   │   ├── 1_imgname0.png
│   │   ├── 1_imgname1.png
...
│   ├── labels
│   │   ├── 1_imgname0.npy
│   │   ├── 1_imgname1.npy
...
│   └── types.csv
...
└── foldk
    ├── images
    │   ├── k_imgname0.png
    │   ├── k_imgname1.png
...
    ├── labels
    │   ├── k_imgname0.npy
    │   ├── k_imgname1.npy
    └── types.csv
```
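
As a quick sanity check of this layout, the following minimal sketch pairs every image in one fold with its label file by matching file stems. The function name `pair_images_with_labels` is hypothetical and not part of this repository; it only assumes the `images/*.png` and `labels/*.npy` convention shown above.

```python
from pathlib import Path


def pair_images_with_labels(fold_dir):
    """Return sorted (image_path, label_path) pairs for one fold directory.

    Hypothetical helper: assumes <fold>/images/*.png and <fold>/labels/*.npy,
    where an image and its label share the same file stem.
    """
    fold_dir = Path(fold_dir)
    pairs = []
    for image_path in sorted((fold_dir / "images").glob("*.png")):
        label_path = fold_dir / "labels" / f"{image_path.stem}.npy"
        if not label_path.is_file():
            # Fail early so a broken fold is noticed before training starts
            raise FileNotFoundError(f"Missing label for {image_path.name}")
        pairs.append((image_path, label_path))
    return pairs
```

Calling `pair_images_with_labels("dataset/fold0")` would return one tuple per patch, or raise if a label is missing.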

Each `types.csv` file should have the following header:
```csv
img,type                            # Header
foldnum_imgname0.png,SetTypeHere    # Each row is one patch with tissue type
```
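
A file in this format can be read with Python's standard `csv` module. The sketch below is illustrative only: `load_tissue_types` is a hypothetical name, and it assumes that real files contain just the `img,type` header and data rows, without the explanatory `#` comments shown above.

```python
import csv
from pathlib import Path


def load_tissue_types(csv_path):
    """Map each patch file name to its tissue type.

    Hypothetical helper: expects a CSV file with the columns `img` and `type`.
    """
    with Path(csv_path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        return {row["img"]: row["type"] for row in reader}
```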

The labels are numpy masks with the following structure:
TBD

## Add a new dataset
Add the new dataset to the dataset coordinator.

All dataset settings must be configured in the corresponding YAML file, under the data section.

The dataset name is **not** case sensitive!
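
Case-insensitive matching of this kind is usually implemented by normalizing names before comparing them. The following is only a sketch of that idea: `resolve_dataset_name` and the list of known names are hypothetical and do not reflect the actual coordinator code.

```python
def resolve_dataset_name(name, known_names):
    """Return the entry of known_names that matches name, ignoring case.

    Hypothetical helper, not the coordinator's real interface.
    """
    # Build a lowercase lookup table that maps back to the canonical spelling
    lookup = {known.lower(): known for known in known_names}
    try:
        return lookup[name.strip().lower()]
    except KeyError:
        raise ValueError(f"Unknown dataset: {name!r}") from None
```

With this approach, `pannuke`, `PanNuke`, and `PANNUKE` would all resolve to the same dataset.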