tokenizer-dna-mlm / README.md
metadata
license: mit
tags:
  - dna
  - biology
  - genomics

Tokenizer for masked language modeling of DNA sequences

    "vocab": {
      "[PAD]": 0,
      "[MASK]": 1,
      "[UNK]": 2,
      "a": 3,
      "c": 4,
      "g": 5,
      "t": 6
    },
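
The vocabulary above can be exercised with a minimal sketch in plain Python. This is not the tokenizer's actual implementation: it assumes one token per nucleotide, assumes input is lowercased before lookup (the vocabulary only lists lowercase letters), and assumes any other character falls back to `[UNK]`. The helper names `encode` and `pad` are hypothetical and exist only for illustration.

```python
# Vocabulary copied from the excerpt above.
VOCAB = {"[PAD]": 0, "[MASK]": 1, "[UNK]": 2, "a": 3, "c": 4, "g": 5, "t": 6}


def encode(sequence, mask_positions=()):
    """Map each character to its ID (hypothetical helper).

    Assumptions: character-level tokenization, input lowercased before
    lookup, and characters outside the vocabulary mapped to [UNK].
    """
    masked = set(mask_positions)
    ids = []
    for i, ch in enumerate(sequence.lower()):
        if i in masked:
            # Replace the nucleotide with [MASK], as in masked language modeling.
            ids.append(VOCAB["[MASK]"])
        else:
            ids.append(VOCAB.get(ch, VOCAB["[UNK]"]))
    return ids


def pad(ids, length):
    """Right-pad a list of IDs with [PAD] up to `length` (hypothetical helper)."""
    return ids + [VOCAB["[PAD]"]] * (length - len(ids))


print(encode("ACGTN", mask_positions=[1]))  # [3, 1, 5, 6, 2]
print(pad(encode("acg"), 5))                # [3, 4, 5, 0, 0]
```

In the first call, position 1 is masked and `N` is not in the vocabulary, so it maps to `[UNK]`.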