---
license: mit
language:
- rna
- dna
tags:
- Genomic-Language-Modeling
- OmniGenome Foundation Model
---
|
|
|
# Multi-species Foundation Model for Universal RNA and DNA Downstream Tasks
|
|
|
# Notes

We keep updating the checkpoints; the current checkpoint has been trained for 0.85 epochs.
|
|
|
## Training Examples

Training examples are available on GitHub: [https://github.com/yangheng95/OmniGenome](https://github.com/yangheng95/OmniGenome)
|
|
|
## Usage

This model can serve as a drop-in replacement for genomic foundation models such as CDSBERT, Nucleotide Transformers, DNABERT2, etc.

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("yangheng/OmniGenome-52M", trust_remote_code=True)
```
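Once the model is loaded, per-nucleotide embeddings can be extracted with the standard Hugging Face API. The sketch below assumes the remote code follows common `transformers` conventions (an `AutoTokenizer` and a `last_hidden_state` output); the example RNA sequence is illustrative.

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "yangheng/OmniGenome-52M"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModel.from_pretrained(name, trust_remote_code=True)

# An illustrative RNA sequence; any nucleotide string works the same way.
sequence = "AUGGCCAUUGUAAUGGGCCGCUGA"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, sequence_length, hidden_size)
embeddings = outputs.last_hidden_state
print(embeddings.shape)
```

These embeddings can then feed a task-specific head for any of the downstream tasks listed below.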
|
|
|
## Subtasks

- Secondary Structure Prediction
- Genome Sequence Classification
- Genome Sequence Regression
- Single Nucleotide Repair
- Genome Masked Language Modeling
- etc.
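The masked language modeling subtask above can be exercised directly through `transformers`. This is a minimal sketch assuming the remote code exposes a masked-LM head via `AutoModelForMaskedLM` and that the tokenizer defines a `mask_token`; neither is confirmed by this card.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "yangheng/OmniGenome-52M"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(name, trust_remote_code=True)

# Mask a single nucleotide in an illustrative RNA sequence.
seq = "AUGGCC" + tokenizer.mask_token + "UUGUAA"
inputs = tokenizer(seq, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Decode the most likely token at the masked position.
mask_idx = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted = tokenizer.decode(logits[0, mask_idx].argmax(dim=-1))
print(predicted)
```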
|
|
|
Parts of the code are adapted from ESM2.