AmelieSchreiber committed
Commit 287b523
1 Parent(s): 7e9a955
Update README.md

README.md CHANGED
@@ -8,7 +8,9 @@ In this model we added in more QLoRA adapter layers, modifying all of the weight
 train and test metrics, again, are smaller for this model than for the model with fewer adapter layers (only using query, key, and value
 matrices). So, we see that adapting more of the weight matrices in this larger ESM-2 model decreases overfitting and serves as a better
 regularizer. For comparison, see [this model](https://huggingface.co/AmelieSchreiber/esm2_t12_35M_qlora_binding_sites_v0) which only
-has QLoRA adapters on the query, key, and value matrices.
+has QLoRA adapters on the query, key, and value matrices. This model was trained on [this dataset](https://huggingface.co/datasets/AmelieSchreiber/1111K_binding_sites).
+Note that this dataset is too small for this model, so overfitting is expected, but overfitting is clearly reduced by including more adapter
+layers in the QLoRA.

 ## Testing for Overfitting

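The change above contrasts adapters on the query, key, and value matrices only versus adapters on more of the weight matrices. As a minimal sketch of how that difference is typically expressed with a PEFT `LoraConfig` (the module names below are assumptions for illustration, not taken from this repository's training script):

```python
from peft import LoraConfig

# Sketch only: hyperparameters and module names are assumed, not from this repo.

# Adapters on the attention query/key/value projections only.
qkv_only_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    target_modules=["query", "key", "value"],
)

# Adapters on more of the weight matrices (attention output and
# feed-forward projections as well), as described for this model.
more_modules_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    target_modules=["query", "key", "value", "dense"],
)
```

Targeting more modules spreads additional low-rank adapters through the network, which is the configuration difference the model card credits with reduced overfitting.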