base_model: westlake-repl/SaProt_35M_AF2
library_name: peft

Base model: westlake-repl/SaProt_35M_AF2

Model Card for Model ID

This model is trained on a single-site deep mutational scanning dataset and can be used to predict the fitness score of mutant amino acid sequences of the protein PTEN_HUMAN (Phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase).

Protein Function

Dual-specificity protein phosphatase, dephosphorylating tyrosine-, serine- and threonine-phosphorylated proteins. Also functions as a lipid phosphatase, removing the phosphate in the D3 position of the inositol ring of PtdIns(3,4,5)P3/phosphatidylinositol 3,4,5-trisphosphate, PtdIns(3,4)P2/phosphatidylinositol 3,4-diphosphate and PtdIns3P/phosphatidylinositol 3-phosphate with a preference for PtdIns(3,4,5)P3. Furthermore, this enzyme can also act as a cytosolic inositol 3-phosphatase acting on Ins(1,3,4,5,6)P5/inositol 1,3,4,5,6 pentakisphosphate and possibly Ins(1,3,4,5)P4/1D-myo-inositol 1,3,4,5-tetrakisphosphate.

Task type

Protein-level regression

Dataset description

The dataset comes from the paper "Deep generative models of genetic variation capture the effects of mutations" and is also available as a SaprotHub dataset.

Each label is the fitness score of a mutant amino acid sequence. Scores are unbounded real values, ranging from negative to positive infinity; a smaller score means a more stable mutant.

Model input type

Amino acid sequence
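Because the model scores single-site mutants, the input is typically built by substituting one residue in the wild-type sequence. The sketch below shows one way to do that. The helper name `apply_mutation`, the `A4G`-style mutation notation, and the toy sequence are illustrative assumptions, not part of this model's tooling, and the toy sequence is not the real PTEN sequence.

```python
import re


def apply_mutation(wild_type: str, mutation: str) -> str:
    """Apply a single-site substitution such as 'A4G' (1-indexed position).

    Hypothetical helper for illustration; the notation is an assumption.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation string: {mutation!r}")

    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(wild_type):
        raise ValueError(f"Position {pos} is outside the sequence")
    if wild_type[pos - 1] != ref:
        raise ValueError(
            f"Expected {ref} at position {pos}, found {wild_type[pos - 1]}"
        )

    # Replace exactly one residue and leave the rest of the sequence intact.
    return wild_type[: pos - 1] + alt + wild_type[pos:]


toy_wild_type = "MKTAYIAK"  # toy sequence, NOT the real PTEN sequence
mutant = apply_mutation(toy_wild_type, "A4G")
print(mutant)
```

Checking the reference residue before substituting catches off-by-one errors between 0-indexed and 1-indexed position conventions, which are a common source of silently wrong inputs.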

Performance

0.62 Spearman's ρ
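Spearman's ρ measures how well the ranking of predicted scores matches the ranking of measured scores, so it is insensitive to the scale of the predictions. As a minimal sketch of how the metric is computed (this is not the evaluation code used for this model, and the numbers are made up for illustration), it is the Pearson correlation of the two rank vectors:

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(predicted, measured):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(predicted), average_ranks(measured))


# Made-up scores for illustration only.
predicted = [0.1, 0.4, 0.35, 0.8]
measured = [1.0, 2.0, 3.0, 4.0]
rho = spearman_rho(predicted, measured)
print(round(rho, 2))  # 0.8
```

In practice a library routine such as `scipy.stats.spearmanr` gives the same statistic.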

LoRA config

lora_dropout: 0.0

lora_alpha: 16

target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]

modules_to_save: ["classifier"]
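The settings above map onto the keyword arguments of `peft.LoraConfig`. The following is a sketch of an equivalent configuration, not the author's training script. Note that the LoRA rank `r` is not stated in this card, so it is deliberately left unset here and `peft` falls back to its library default, which may differ from the rank actually used for this adapter.

```python
# LoRA settings copied from this model card.
# The rank `r` is not listed in the card and is intentionally omitted.
lora_kwargs = {
    "lora_dropout": 0.0,
    "lora_alpha": 16,
    "target_modules": [
        "query",
        "key",
        "value",
        "intermediate.dense",
        "output.dense",
    ],
    "modules_to_save": ["classifier"],
}

try:
    from peft import LoraConfig

    # Unspecified fields (including `r`) take the peft defaults.
    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    # peft is not installed; the plain dictionary is still usable as a record.
    lora_config = None

print(lora_kwargs["target_modules"])
```

Listing `classifier` under `modules_to_save` means the regression head is trained in full and stored with the adapter, while the attention and feed-forward projections receive low-rank updates.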

Training config

optimizer class: AdamW

betas: (0.9, 0.98)

weight_decay: 0.01

learning rate: 1e-4

epoch: 50

batch size: 64

precision: 16-mixed
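As a rough sketch of how these hyperparameters could be wired into PyTorch (an assumption, since the card does not name the training framework), the optimizer can be built with `torch.optim.AdamW`. The dictionary keys and the `build_optimizer` helper are hypothetical names chosen for this example. The `16-mixed` value matches the precision string accepted by PyTorch Lightning's `Trainer`, which suggests but does not confirm that Lightning was used.

```python
# Hyperparameters copied from this model card; key names are illustrative.
training_config = {
    "optimizer": "AdamW",
    "betas": (0.9, 0.98),
    "weight_decay": 0.01,
    "lr": 1e-4,
    "max_epochs": 50,
    "batch_size": 64,
    "precision": "16-mixed",
}


def build_optimizer(parameters, config=training_config):
    """Build an AdamW optimizer from the config (hypothetical helper)."""
    import torch  # imported lazily so the config can be inspected without torch

    return torch.optim.AdamW(
        parameters,
        lr=config["lr"],
        betas=config["betas"],
        weight_decay=config["weight_decay"],
    )


print(training_config["lr"], training_config["betas"])
```

With a PEFT-wrapped model, only the parameters with `requires_grad=True` would normally be passed, for example `build_optimizer(p for p in model.parameters() if p.requires_grad)`.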