Upload 6 files #1, opened by mmokoatle
This repository contains the trained model for our manuscript, which is currently under review at BMC Bioinformatics. The model, simcse-dna, is based on the original implementation of SimCSE. We adapted the original model to DNA downstream tasks by training it on a small sample of k-mer tokens generated from the human reference genome. It can be used to generate sentence embeddings for DNA tasks.
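For readers unfamiliar with k-mer tokens, here is a minimal sketch of how a DNA sequence might be split into overlapping k-mers and joined into a space-separated "sentence" before it is passed to a sentence-embedding model. The function names `kmer_tokenize` and `kmer_sentence`, the default `k=6`, and the default `stride=1` are illustrative assumptions, not the preprocessing used in the manuscript.

```python
def kmer_tokenize(sequence: str, k: int = 6, stride: int = 1) -> list[str]:
    """Split a DNA sequence into k-mers of length k, stepping by stride.

    k=6 and stride=1 are assumed defaults for illustration only.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive integers")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields no k-mers (empty list).
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def kmer_sentence(sequence: str, k: int = 6, stride: int = 1) -> str:
    """Join the k-mers with spaces so they read like words in a sentence."""
    return " ".join(kmer_tokenize(sequence, k, stride))


if __name__ == "__main__":
    print(kmer_sentence("ACGTACGTAC", k=6))
```

The resulting string can then be handled like ordinary text by a tokenizer whose vocabulary consists of k-mers; check the model's own configuration for the k value it expects.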
mmokoatle changed pull request status to merged