Label Semantics:

Label 0: Non-crystallizable (Negative)

Label 1: Crystallizable (Positive)

Dataset

  1. DeepCrystal Train
  2. DeepCrystal Test
  3. BCrystal Test
  4. SP Test
  5. TR Test

Model

ESMCrystal_t12_35M_v2

ESMCrystal_t12_35M_v2 is a state-of-the-art protein crystallization prediction model fine-tuned from esm2_t12_35M_UR50D, which has 12 layers and 35M parameters (approx. 136 MB). It uses transfer learning to predict whether an input protein sequence will crystallize.
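A minimal inference sketch, assuming the checkpoint loads through the Transformers `AutoModelForSequenceClassification` class and exposes two output logits in the label order given above. `MODEL_ID` is a hypothetical placeholder, not the real repository path, and the helpers `softmax`, `decode`, and `predict` are illustrative names rather than part of the released model.

```python
import math

# Hypothetical repository id: replace with the actual Hub path of this model.
MODEL_ID = "your-namespace/ESMCrystal_t12_35M_v2"

# Label order taken from the "Label Semantics" section of this card.
ID2LABEL = {0: "non-crystallizable", 1: "crystallizable"}


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Map a pair of logits to (label, probability of that label)."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]


def predict(sequence, model_id=MODEL_ID):
    """Classify one protein sequence (downloads the checkpoint on first use)."""
    # Heavy dependencies are imported lazily so the helpers above stay usable
    # without torch or transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    inputs = tokenizer(sequence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode(logits)


# Decoding a made-up pair of logits, without touching the network.
print(decode([0.3, 1.7]))
```

With the real repository id filled in, `predict("MKTAYIAKQRQISFVKSHFSRQ")` would return a label and its probability; the sequence here is only an illustrative input.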

Accuracy:

| Dataset | Accuracy |
| --- | --- |
| DeepCrystal Test | 0.8161222339304531 |
| BCrystal Test | 0.8052602126468943 |
| SP Test | 0.7637130801687764 |
| TR Test | 0.8389328063241107 |

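Comparison Table:

| Dataset | Count | Positives | Negatives | TP | FN | FP | TN | Precision (PPV) | Recall | Specificity | NPV | F1 | Accuracy | ROC AUC | MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepCrystal Test | 1898 | 898 | 1000 | 579 | 319 | 30 | 970 | 0.95073892 | 0.64476615 | 0.97 | 0.75252133 | 0.76841407 | 0.81612223 | 0.9403 | 0.657526117 |
| BCrystal Test | 1787 | 891 | 896 | 573 | 318 | 30 | 866 | 0.95024876 | 0.64309764 | 0.96651786 | 0.73141892 | 0.76706827 | 0.80526021 | 0.9396 | 0.644635696 |
| SP Test | 237 | 148 | 89 | 97 | 51 | 5 | 84 | 0.95098039 | 0.65540541 | 0.94382022 | 0.62222222 | 0.776 | 0.76371308 | 0.9293 | 0.586069704 |
| TR Test | 1012 | 374 | 638 | 225 | 149 | 14 | 624 | 0.94142259 | 0.60160428 | 0.97805643 | 0.80724450 | 0.73409462 | 0.83893281 | 0.9562 | 0.658766192 |

Positives are crystallizable sequences (label 1). FN counts crystallizable sequences predicted as non-crystallizable, and FP counts non-crystallizable sequences predicted as crystallizable. Precision (PPV) = TP / (TP + FP), Recall = TP / (TP + FN), Specificity = TN / (TN + FP), NPV = TN / (TN + FN). MCC is the Matthews correlation coefficient.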

Graphs

ROC-AUC Curve

  • DeepCrystal Test: ROC-AUC curve [figure]

  • BCrystal Test: ROC-AUC curve [figure]

  • SP Test: ROC-AUC curve [figure]

  • TR Test: ROC-AUC curve [figure]

PR-AUC Curve

  • DeepCrystal Test: PR-AUC curve [figure]

  • BCrystal Test: PR-AUC curve [figure]

  • SP Test: PR-AUC curve [figure]

  • TR Test: PR-AUC curve [figure]

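Final scores:

  • on DeepCrystal test:

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| non-crystallizable | 0.75 | 0.97 | 0.85 | 1000 |
| crystallizable | 0.95 | 0.64 | 0.77 | 898 |
| accuracy | | | 0.82 | 1898 |
| macro avg | 0.85 | 0.81 | 0.81 | 1898 |
| weighted avg | 0.85 | 0.82 | 0.81 | 1898 |

  • on BCrystal test:

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| non-crystallizable | 0.73 | 0.97 | 0.83 | 896 |
| crystallizable | 0.95 | 0.64 | 0.77 | 891 |
| accuracy | | | 0.81 | 1787 |
| macro avg | 0.84 | 0.80 | 0.80 | 1787 |
| weighted avg | 0.84 | 0.81 | 0.80 | 1787 |

  • on SP test:

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| non-crystallizable | 0.62 | 0.94 | 0.75 | 89 |
| crystallizable | 0.95 | 0.66 | 0.78 | 148 |
| accuracy | | | 0.76 | 237 |
| macro avg | 0.79 | 0.80 | 0.76 | 237 |
| weighted avg | 0.83 | 0.76 | 0.77 | 237 |

  • on TR test:

| Class | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| non-crystallizable | 0.81 | 0.98 | 0.88 | 638 |
| crystallizable | 0.94 | 0.60 | 0.73 | 374 |
| accuracy | | | 0.84 | 1012 |
| macro avg | 0.87 | 0.79 | 0.81 | 1012 |
| weighted avg | 0.86 | 0.84 | 0.83 | 1012 |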

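Confusion matrix (rows: actual crystallizable, actual non-crystallizable; columns: predicted crystallizable, predicted non-crystallizable):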

  • on DeepCrystal test:
    | 579 | 319 |
    |  30 | 970 |
  • on BCrystal test:
    | 573 | 318 |
    |  30 | 866 |
  • on SP test:
    | 97 |  51 |
    |  5 |  84 |
  • on TR test:
    | 225 | 149 |
    |  14 | 624 |
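The per-class scores can be re-derived from these counts. The sketch below assumes each matrix is read with rows as actual classes and columns as predicted classes, crystallizable first, which is the reading that matches the class supports (898 and 1000 for DeepCrystal). `binary_metrics` is a hypothetical helper name, not part of any library.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)    # PPV: correct among predicted positives
    recall = tp / (tp + fn)       # sensitivity: positives that were found
    specificity = tn / (tn + fp)  # negatives that were correctly rejected
    npv = tn / (tn + fn)          # correct among predicted negatives
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "npv": npv,
        "f1": f1,
        "accuracy": accuracy,
    }


# DeepCrystal test counts, read row by row from the matrix above
# (assumed layout: [[TP, FN], [FP, TN]]).
deepcrystal = binary_metrics(tp=579, fn=319, fp=30, tn=970)
for name, value in deepcrystal.items():
    print(f"{name}: {value:.4f}")
```

Under this reading, the model rarely labels a non-crystallizable protein as crystallizable, while a sizeable share of crystallizable proteins is missed.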

Metrics

ROC AUC score:

  • on DeepCrystal test: 0.9403474387527841

  • on BCrystal test: 0.9395705567580568

  • on SP test: 0.9293197692074097

  • on TR test: 0.9561924798417515
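For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen crystallizable sequence receives a higher score than a randomly chosen non-crystallizable one. The sketch below computes it by direct pairwise comparison; the labels and scores are made-up toy values, not outputs of this model, and `roc_auc` is a hypothetical helper.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = crystallizable, 0 = non-crystallizable.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.35, 0.4, 0.2, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

Because ROC AUC does not depend on a decision threshold, it can be high even when recall at the default threshold is moderate, as in the confusion matrices above.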

Matthews correlation coefficient (MCC):

  • on DeepCrystal test: 0.6575261170551334

  • on BCrystal test: 0.6446356961702661

  • on SP test: 0.586069703866632

  • on TR test: 0.6587661924247377
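As a worked check, the DeepCrystal value follows from the standard MCC definition applied to that test set's confusion-matrix counts, taking TP = 579, FN = 319, FP = 30, TN = 970 (rows read as actual classes; MCC is unchanged if FP and FN are exchanged):

```latex
\[
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}
{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
\]
\[
\mathrm{MCC}_{\text{DeepCrystal}}
= \frac{579 \cdot 970 - 30 \cdot 319}{\sqrt{609 \cdot 898 \cdot 1000 \cdot 1289}}
= \frac{552060}{\sqrt{704930898000}}
\approx 0.6575
\]
```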

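NPV (TN / (TN + FN)):

  • on DeepCrystal test: 0.75252133

  • on BCrystal test: 0.73141892

  • on SP test: 0.62222222

  • on TR test: 0.80724450

PPV (precision, TP / (TP + FP)):

  • on DeepCrystal test: 0.95073892

  • on BCrystal test: 0.95024876

  • on SP test: 0.95098039

  • on TR test: 0.94142259

Specificity (TN / (TN + FP)):

  • on DeepCrystal test: 0.97

  • on BCrystal test: 0.9665178571428571

  • on SP test: 0.9438202247191011

  • on TR test: 0.9780564263322884

Sensitivity (recall, TP / (TP + FN)):

  • on DeepCrystal test: 0.6447661469933185

  • on BCrystal test: 0.6430976430976431

  • on SP test: 0.6554054054054054

  • on TR test: 0.6016042780748663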

Researchers:

Credits:
