---
license: cc-by-nc-sa-4.0
tags:
- Helical
- rna
- mrna
- biology
- transformers
- mamba2
- sequence
- genomics
library_name: transformers
---

# Mamba2-mRNA

Mamba2-mRNA is a state-space model built on the Mamba2 architecture and trained on mRNA data at single-nucleotide resolution. Compared with traditional transformer models, it processes sequences faster, handles long sequences efficiently, and needs less memory. Its state-space approach captures both local and long-range dependencies in mRNA, and the single-nucleotide resolution allows precise prediction and analysis of genetic elements.
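
To make "single-nucleotide resolution" concrete: each nucleotide is treated as its own token instead of being grouped into codons or longer k-mers. The sketch below illustrates the idea with a hypothetical four-letter vocabulary. `VOCAB` and `tokenize` are illustrative names only and are not the tokenizer that ships with the model.

```python
# Illustrative only: a hypothetical character-level vocabulary,
# not the tokenizer actually used by Mamba2-mRNA.
VOCAB = {"A": 0, "C": 1, "G": 2, "U": 3}


def tokenize(sequence: str) -> list:
    """Map each nucleotide of an mRNA string to one integer token."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(VOCAB)
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return [VOCAB[base] for base in sequence]


tokens = tokenize("AUGGCC")
print(tokens)       # one token per nucleotide
print(len(tokens))  # token count equals sequence length
```

Under this scheme a 60-nucleotide sequence becomes 60 tokens, which is why efficient handling of long sequences matters for this kind of model.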

# Helical<a name="helical"></a>

#### Install the package

Run the following to install the [Helical](https://github.com/helicalAI/helical) package via pip:

```console
pip install --upgrade helical
```

#### Generate Embeddings

```python
from helical import Mamba2mRNA, Mamba2mRNAConfig
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

input_sequences = ["ACU"*20, "AUG"*20, "AUG"*20, "ACU"*20, "AUU"*20]

mamba2_mrna_config = Mamba2mRNAConfig(batch_size=5, device=device)
mamba2_mrna = Mamba2mRNA(configurer=mamba2_mrna_config)

# prepare data for input to the model
processed_input_data = mamba2_mrna.process_data(input_sequences)

# generate the embeddings for the input data
embeddings = mamba2_mrna.get_embeddings(processed_input_data)
```
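
Once embeddings are available, a common next step is to compare sequences with each other. The sketch below assumes `get_embeddings` returns something convertible to a NumPy array, holding either one vector per sequence (2-D) or one vector per token (3-D). That layout is an assumption and should be checked against the installed version. The helpers `pool` and `cosine_similarity_matrix` are hypothetical, and random data stands in for real embeddings so the block runs on its own.

```python
import numpy as np


def pool(embeddings) -> np.ndarray:
    """Return one vector per sequence, mean-pooling over tokens if needed."""
    arr = np.asarray(embeddings, dtype=np.float64)
    if arr.ndim == 3:  # assumed layout: (n_sequences, n_tokens, hidden_dim)
        arr = arr.mean(axis=1)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2-D or 3-D array, got shape {arr.shape}")
    return arr


def cosine_similarity_matrix(vectors: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


# Stand-in data: 5 sequences, 60 tokens, 16 hidden dimensions (all arbitrary).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(5, 60, 16))

similarity = cosine_similarity_matrix(pool(fake_embeddings))
print(similarity.shape)
```

With real data, `pool(embeddings)` would replace `pool(fake_embeddings)`. Identical input sequences, such as the two `"ACU"*20` entries in the example above, would be expected to score close to 1.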

#### Fine-Tuning

Classification fine-tuning example:

```python
from helical import Mamba2mRNAFineTuningModel, Mamba2mRNAConfig
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

input_sequences = ["ACU"*20, "AUG"*20, "AUG"*20, "ACU"*20, "AUU"*20]
labels = [0, 2, 2, 0, 1]

mamba2_mrna_config = Mamba2mRNAConfig(batch_size=5, device=device, max_length=100)
mamba2_mrna_fine_tune = Mamba2mRNAFineTuningModel(
    mamba2_mrna_config=mamba2_mrna_config,
    fine_tuning_head="classification",
    output_size=3,
)

# prepare data for input to the model
train_dataset = mamba2_mrna_fine_tune.process_data(input_sequences)

# fine-tune the model with the relevant training labels
mamba2_mrna_fine_tune.train(train_dataset=train_dataset, train_labels=labels)

# get outputs from the fine-tuned model on a processed dataset
outputs = mamba2_mrna_fine_tune.get_outputs(train_dataset)
```
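
The outputs of the fine-tuned model still need to be turned into class predictions. The sketch below assumes `get_outputs` returns an array-like of shape `(n_sequences, output_size)` with one score per class. That is an assumption about the return value, not documented behavior. `predict_labels` and `accuracy` are hypothetical helpers, and the scores are made up so the block runs on its own.

```python
import numpy as np


def predict_labels(outputs) -> np.ndarray:
    """Pick the highest-scoring class for each row of scores."""
    scores = np.asarray(outputs)
    if scores.ndim != 2:
        raise ValueError(f"expected shape (n_sequences, n_classes), got {scores.shape}")
    return scores.argmax(axis=1)


def accuracy(predicted, expected) -> float:
    """Fraction of predictions that match the expected labels."""
    predicted = np.asarray(predicted)
    expected = np.asarray(expected)
    return float((predicted == expected).mean())


# Made-up scores for 5 sequences and 3 classes, standing in for real outputs.
fake_outputs = np.array([
    [2.1, 0.3, -1.0],
    [-0.5, 0.2, 1.7],
    [0.1, 0.0, 0.9],
    [1.2, 1.1, 0.4],
    [0.3, 0.2, 0.8],
])
labels = [0, 2, 2, 0, 1]

predicted = predict_labels(fake_outputs)
print(predicted, accuracy(predicted, labels))
```

Scoring the model on the same sequences it was trained on, as the example above does, only confirms that the pipeline runs. A held-out set is needed to estimate real performance.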