
mamba130-proteinpretrain-quinoa

Full-model finetuning of Mamba-130M-HF on the "research" split (quinoa protein sequences) of the GreenBeing-Proteins dataset.

Due to the limits of a V100 GPU, training ran for 510 steps with a batch size of 3, covering roughly 5% of the research split.

Requires the GitHub main branch of Transformers (Mamba is not yet included in a release).
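
A minimal loading sketch, assuming the standard AutoTokenizer / AutoModelForCausalLM API; the amino-acid prompt below is an illustrative placeholder, not taken from the dataset:

```python
# Transformers must be installed from the GitHub main branch for Mamba support:
#   pip install git+https://github.com/huggingface/transformers.git

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "monsoon-nlp/mamba130-proteinpretrain-quinoa"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative placeholder: continue a short amino-acid sequence
inputs = tokenizer("MSKGEELFT", return_tensors="pt")
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(generated[0]))
```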

Considering training on natural language + proteins, or new "biotokens".

More details TBD

Training procedure

Notebook: https://colab.research.google.com/drive/1W1rB6rRt8krHZSVYQ_TjbnD9OwzFQeGL

Training hyperparameters

The following hyperparameters were used during training (a matching TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 3
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
  • mixed_precision_training: Native AMP
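
A minimal Trainer configuration sketch reproducing the values above, assuming the standard transformers.TrainingArguments API; the output directory is a placeholder, and the Adam betas/epsilon and linear scheduler (library defaults) are written out for clarity:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mamba130-proteinpretrain-quinoa",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    fp16=True,  # mixed precision (native AMP)
)
```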

Framework versions

  • Transformers 4.40.0.dev0
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
