gonzalobenegas committed on
Commit
fc3ac98
1 Parent(s): 352620e

Update README.md

Files changed (1)
  1. README.md +13 -0
README.md CHANGED
@@ -1,3 +1,16 @@
  ---
  license: mit
  ---
+ # Tokenizer for masked language modeling of DNA sequences
+
+ ```json
+ "vocab": {
+ "[PAD]": 0,
+ "[MASK]": 1,
+ "[UNK]": 2,
+ "a": 3,
+ "c": 4,
+ "g": 5,
+ "t": 6
+ },
+ ```
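
The vocabulary added in this commit maps each lowercase nucleotide to an integer id and reserves ids 0 to 2 for special tokens. A minimal sketch of how such a vocabulary might be applied is shown below. It is plain Python and does not use the repository's actual tokenizer files. The helper names `encode` and `mask_tokens`, the lowercasing step, the 15% masking rate, and the `-100` "ignore" label are all assumptions made for illustration, not something the README states.

```python
import random

# Vocabulary copied from the README diff above.
VOCAB = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2, "a": 3, "c": 4, "g": 5, "t": 6}


def encode(seq, max_len=None):
    """Map each character to its id (hypothetical helper).

    Assumption: input is lowercased first, since the vocab is lowercase.
    Any character outside a/c/g/t (for example 'n') becomes [UNK].
    """
    ids = [VOCAB.get(ch, VOCAB["[UNK]"]) for ch in seq.lower()]
    if max_len is not None:
        # Truncate, then right-pad with [PAD] up to max_len.
        ids = ids[:max_len]
        ids = ids + [VOCAB["[PAD]"]] * (max_len - len(ids))
    return ids


def mask_tokens(ids, mask_prob=0.15, seed=0):
    """Replace a random subset of non-padding ids with [MASK] (hypothetical helper).

    Returns (masked_ids, labels). Labels hold the original id at masked
    positions and -100 elsewhere; -100 is an assumed "ignore" marker.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token_id in ids:
        if token_id != VOCAB["[PAD]"] and rng.random() < mask_prob:
            masked.append(VOCAB["[MASK]"])
            labels.append(token_id)
        else:
            masked.append(token_id)
            labels.append(-100)
    return masked, labels


if __name__ == "__main__":
    ids = encode("ACGTN", max_len=8)
    print(ids)  # [3, 4, 5, 6, 2, 0, 0, 0]
    print(mask_tokens(ids))
```

Character-level encoding keeps the vocabulary at seven entries, so every position in a sequence corresponds to exactly one token, which is convenient when masked positions need to be mapped back to genomic coordinates.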