---
license: mit
tags:
- dna
- biology
- genomics
---
# Tokenizer for masked language modeling of DNA sequences
```json
"vocab": {
  "[PAD]": 0,
  "[MASK]": 1,
  "[UNK]": 2,
  "a": 3,
  "c": 4,
  "g": 5,
  "t": 6
},
```
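
As an illustration of how this vocabulary maps a sequence to token IDs, here is a minimal pure-Python sketch. It assumes character-level tokenization (one token per base), lowercasing of the input, and mapping of any other symbol (such as `N`) to `[UNK]`; none of this is confirmed by the excerpt above, and the helper names `encode` and `mask_positions` are hypothetical rather than part of the tokenizer's API.

```python
# Vocabulary copied from the excerpt above.
VOCAB = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2, "a": 3, "c": 4, "g": 5, "t": 6}


def encode(sequence, max_length=None):
    """Map each base to its ID (assumption: one token per character).

    Input is lowercased (assumption) and unknown symbols become [UNK].
    If max_length is given, the result is truncated or padded with [PAD].
    """
    ids = [VOCAB.get(base, VOCAB["[UNK]"]) for base in sequence.lower()]
    if max_length is not None:
        ids = ids[:max_length]
        ids += [VOCAB["[PAD]"]] * (max_length - len(ids))
    return ids


def mask_positions(ids, positions):
    """Return a copy of ids with the given positions replaced by [MASK]."""
    masked = list(ids)
    for i in positions:
        masked[i] = VOCAB["[MASK]"]
    return masked
```

For masked language modeling, a model would be trained to predict the original ID at each `[MASK]` position; for example, `mask_positions(encode("acgt"), [1])` hides the `c`.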