---
license: mit
---
|
|
|
# ESM-2 QLoRA for Binding Site Prediction
|
|
|
This model adds more QLoRA adapter layers than its predecessor, placing adapters on all of the weight matrices rather than a subset. The gap between the train and test metrics is, once again, smaller for this model than for the model with fewer adapter layers (adapters on only the query, key, and value matrices). So, adapting more of the weight matrices in this larger ESM-2 model reduces overfitting and acts as a stronger regularizer. For comparison, see [this model](https://huggingface.co/AmelieSchreiber/esm2_t12_35M_qlora_binding_sites_v0), which has QLoRA adapters on only the query, key, and value matrices. This model was trained on [this dataset](https://huggingface.co/datasets/AmelieSchreiber/1111K_binding_sites). Note that the dataset is too small for a model of this size, so some overfitting is expected, but it is clearly reduced by including more adapter layers in the QLoRA setup.
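
For illustration, the sketch below shows one way to attach QLoRA adapters to all of the ESM-2 weight matrices with 🤗 PEFT and bitsandbytes. The base checkpoint, LoRA rank, and other hyperparameters here are assumptions for the example, not the exact configuration this model was trained with.

```python
import torch
from transformers import AutoModelForTokenClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Assumed base checkpoint; substitute the ESM-2 size actually used.
base_model = AutoModelForTokenClassification.from_pretrained(
    "facebook/esm2_t12_35M_UR50D",
    num_labels=2,  # binding site vs. non-binding site, per residue
    quantization_config=bnb_config,
)
base_model = prepare_model_for_kbit_training(base_model)

# Adapters on all weight matrices: attention query/key/value plus the
# attention-output and feed-forward "dense" projections, rather than
# query/key/value alone. Rank and alpha are illustrative values.
lora_config = LoraConfig(
    task_type="TOKEN_CLS",
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "key", "value", "dense"],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```

PEFT matches `target_modules` by module-name suffix, so `"dense"` picks up both the attention output projection and the feed-forward layers in each ESM-2 block, which is what extends the adapters beyond query, key, and value.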
|
|
|
## Testing for Overfitting |
|
|
|
```python
Train metrics:
{'eval_loss': 0.17861589789390564,
 'eval_accuracy': 0.9336392007583741,
 'eval_precision': 0.24007189695313816,
 'eval_recall': 0.9234520216135872,
 'eval_f1': 0.38107489676203077,
 'eval_auc': 0.9286608447868842,
 'eval_mcc': 0.4519203165484902}

Test metrics:
{'eval_loss': 0.2265990674495697,
 'eval_accuracy': 0.913988661430497,
 'eval_precision': 0.1725452162312655,
 'eval_recall': 0.8272126203209694,
 'eval_f1': 0.28553230637278637,
 'eval_auc': 0.8715212375759034,
 'eval_mcc': 0.3539008454498742}
```
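
Metrics like these can be produced by evaluating the same fine-tuned model on both splits. The sketch below is an assumption about the evaluation setup (token-level scikit-learn metrics, with padded positions masked out at label `-100`), not the exact script used for this card; `trainer`, `train_dataset`, and `test_dataset` are placeholders for a fitted 🤗 `Trainer` and its datasets.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, f1_score, matthews_corrcoef,
    precision_score, recall_score, roc_auc_score,
)

def compute_metrics(eval_pred):
    """Token-level metrics, ignoring positions labeled -100
    (padding and special tokens, by the usual Hugging Face convention)."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    mask = labels != -100
    y_true, y_pred = labels[mask], preds[mask]
    # Softmax probability of the positive (binding-site) class for AUC.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    y_score = probs[..., 1][mask]
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }

def overfitting_gap(trainer, train_dataset, test_dataset):
    """Evaluate one model on both splits; a smaller gap between the
    two metric dictionaries indicates less overfitting."""
    train_metrics = trainer.evaluate(eval_dataset=train_dataset)
    test_metrics = trainer.evaluate(eval_dataset=test_dataset)
    return train_metrics, test_metrics
```

`Trainer.evaluate` prefixes each returned key with `eval_`, which matches the dictionaries shown above.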