jaykmr/ESMCrystal_t12_35M_v2


Label Semantics:

Label 0: Non-crystallizable (Negative)

Label 1: Crystallizable (Positive)

Dataset

Model

ESMCrystal_t12_35M_v2

ESMCrystal_t12_35M_v2 is a state-of-the-art protein crystallization prediction model fine-tuned from esm2_t12_35M_UR50D, which has 12 layers and 35M parameters (approx. 136 MB). It uses transfer learning to predict whether an input protein sequence will crystallize.
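
A minimal inference sketch follows. It assumes the checkpoint loads with the standard `AutoTokenizer` and `AutoModelForSequenceClassification` classes from `transformers` (this card does not state the loading code) and that the logit order follows the label semantics above. The helper names `decode_logits` and `predict_crystallization` and the example sequence are illustrative, not part of the model.

```python
import math

# Label mapping taken from the "Label Semantics" section of this card.
ID2LABEL = {0: "Non-crystallizable", 1: "Crystallizable"}


def decode_logits(logits):
    """Turn raw two-class logits into (label, probability) via softmax."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def predict_crystallization(sequence, model_id="jaykmr/ESMCrystal_t12_35M_v2"):
    """Classify one amino-acid sequence.

    Assumption: the checkpoint is loadable through the Auto* classes.
    """
    # Imported lazily so decode_logits stays usable without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(sequence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return decode_logits(logits)


# Example call (hypothetical toy sequence, downloads the checkpoint on first use):
# label, prob = predict_crystallization("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQ")
```

Keeping the softmax and label lookup in a separate function makes the label convention easy to check without downloading the model.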

Accuracy:

Dataset	Accuracy
DeepCrystal test	0.8161222339304531
BCrystal test	0.8052602126468943
SP test	0.7637130801687764
TR test	0.8389328063241107

Comparison Table:

Dataset	Count	Positives	Negatives	TP	FN	FP	TN	Recall	Precision	F1	Accuracy	ROC AUC	Matthews Correlation Coefficient	Sensitivity	Specificity
DeepCrystal Test	1898	898	1000	579	319	30	970	0.64476615	0.95073892	0.76841407	0.81612223	0.9403	0.657526117	0.64476615	0.97
BCrystal Test	1787	891	896	573	318	30	866	0.64309764	0.95024876	0.76706827	0.80526021	0.9396	0.644635696	0.64309764	0.96651786
SP Test	237	148	89	97	51	5	84	0.65540541	0.95098039	0.776	0.76371308	0.9293	0.586069704	0.65540541	0.94382022
TR Test	1012	374	638	225	149	14	624	0.60160428	0.94142259	0.73409462	0.83893281	0.9562	0.658766192	0.60160428	0.97805643

Graphs

[Figures: ROC-AUC curves for the DeepCrystal, BCrystal, SP, and TR test sets]

[Figures: PR-AUC curves for the DeepCrystal, BCrystal, SP, and TR test sets]

Final scores:

on DeepCrystal test:

	precision	recall	f1-score	support
non-crystallizable	0.75	0.97	0.85	1000
crystallizable	0.95	0.64	0.77	898
accuracy			0.82	1898
macro avg	0.85	0.81	0.81	1898
weighted avg	0.85	0.82	0.81	1898

on BCrystal test:

	precision	recall	f1-score	support
non-crystallizable	0.73	0.97	0.83	896
crystallizable	0.95	0.64	0.77	891
accuracy			0.81	1787
macro avg	0.84	0.80	0.80	1787
weighted avg	0.84	0.81	0.80	1787

on SP test:

	precision	recall	f1-score	support
non-crystallizable	0.62	0.94	0.75	89
crystallizable	0.95	0.66	0.78	148
accuracy			0.76	237
macro avg	0.79	0.80	0.76	237
weighted avg	0.83	0.76	0.77	237

on TR test:

	precision	recall	f1-score	support
non-crystallizable	0.81	0.98	0.88	638
crystallizable	0.94	0.60	0.73	374
accuracy			0.84	1012
macro avg	0.87	0.79	0.81	1012
weighted avg	0.86	0.84	0.83	1012
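
The `macro avg` and `weighted avg` rows in the reports above differ only in how the two per-class scores are combined. The sketch below recomputes both for precision on the SP test from the rounded per-class values, so it only approximates the reported rows. The function name `macro_and_weighted` is illustrative.

```python
def macro_and_weighted(values, supports):
    """Return (macro average, support-weighted average) of per-class scores."""
    macro = sum(values) / len(values)  # every class counts equally
    weighted = sum(v * s for v, s in zip(values, supports)) / sum(supports)
    return macro, weighted


# Per-class precision and support from the SP test report above
# (order: non-crystallizable, crystallizable).
precision = [0.62, 0.95]
support = [89, 148]

macro, weighted = macro_and_weighted(precision, support)
print(f"macro avg = {macro:.3f}, weighted avg = {weighted:.3f}")
```

The results, roughly 0.785 and 0.826, are in line with the reported 0.79 and 0.83. The weighted average sits closer to the crystallizable score because that class has more samples in the SP test.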

Confusion matrix (rows: actual class, columns: predicted class; crystallizable first):

on DeepCrystal test:

    | 579 | 319 |
    |  30 | 970 |

on BCrystal test:

    | 573 | 318 |
    |  30 | 866 |

on SP test:

    | 97 |  51 |
    |  5 |  84 |

on TR test:

    | 225 | 149 |
    |  14 | 624 |
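
The summary metrics in this card can be rederived from the four counts in each matrix. The sketch below does this for the DeepCrystal test. It assumes the first row of each matrix holds the actual crystallizable sequences, because the row sums (898 and 1000) match the class supports in the classification reports. Under that reading 319 is the number of missed positives and 30 the number of false alarms. The function name `binary_metrics` is illustrative.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Derive common binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)        # positive predictive value (PPV)
    recall = tp / (tp + fn)           # sensitivity, true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    npv = tn / (tn + fn)              # negative predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# DeepCrystal test counts, with the first matrix row read as actual crystallizable.
deepcrystal = binary_metrics(tp=579, fn=319, fp=30, tn=970)
for name, value in deepcrystal.items():
    print(f"{name}: {value:.4f}")
```

With these counts, precision and recall for the crystallizable class agree with the 0.95 and 0.64 in the classification report, and accuracy and the Matthews coefficient agree with the values listed under Metrics.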

Metrics

ROC AUC score:

on DeepCrystal test: 0.9403474387527841
on BCrystal test: 0.9395705567580568
on SP test: 0.9293197692074097
on TR test: 0.9561924798417515
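
For readers unfamiliar with the score, ROC AUC can be read as the probability that a randomly chosen crystallizable sequence receives a higher predicted score than a randomly chosen non-crystallizable one. The sketch below computes it by direct pair counting on hypothetical toy data. It is not the code used to produce the numbers above, which this card does not specify.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(P * N) time, which is fine
    for small inputs but slow for large test sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = crystallizable, scores = predicted probability of label 1.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs are ordered correctly: 0.75
```

Because the score depends only on ranking, it is independent of the decision threshold, which is why it can be high (about 0.94 here) even when recall at the default threshold is modest.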

Matthews Correlation Coefficient (MCC):

on DeepCrystal test: 0.6575261170551334
on BCrystal test: 0.6446356961702661
on SP test: 0.586069703866632
on TR test: 0.6587661924247377

Specificity (TN / (TN + FP)):

on DeepCrystal test: 0.97
on BCrystal test: 0.9665178571428571
on SP test: 0.9438202247191011
on TR test: 0.9780564263322884

Sensitivity (TP / (TP + FN)):

on DeepCrystal test: 0.6447661469933185
on BCrystal test: 0.6430976430976431
on SP test: 0.6554054054054054
on TR test: 0.6016042780748663

Researchers:

Credits:


Safetensors: model size 34M params; tensor types I64, F32
