ibm
/

biomed.omics.bl.sm.ma-ted-458m.dti_bindingdb_pkd

Safetensors

PyTorch

biomed-multi-alignment

drug-target-interaction

ibm

mammal

SagiPolaczek

moshe-raboh committed 28 days ago

Commit

603c70e

•

1 Parent(s): 4c8d3fa

Update README.md (#1)

- Update README.md (fc4233c400af4a7e9d648c7c9078bbc473c2630a)

Co-authored-by: Moshe Raboh <moshe-raboh@users.noreply.huggingface.co>

Files changed (1)

README.md +91 -4

README.md CHANGED

@@ -1,8 +1,95 @@
 ---
 tags:
-- model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]

 ---
 tags:
+- protein
+- small-molecule
+- dti
+- ibm
+- mammal
+- pytorch
+- transformers
+library_name: biomed
+license: apache-2.0
+base_model:
+- ibm/biomed.omics.bl.sm.ma-ted-400m
 ---
+Accurate prediction of drug-target binding affinity is essential in the early stages of drug discovery.
+This model is an example of fine-tuning `ibm/biomed.omics.bl.sm.ma-ted-400m` for this task.
+It predicts binding affinity as pKd, the negative logarithm of the dissociation constant, which reflects the strength of the interaction between a small molecule (drug) and a protein (target).
+The expected inputs for the model are the amino acid sequence of the target and the SMILES representation of the drug.
+The benchmark used for fine-tuning is defined at: `https://tdcommons.ai/multi_pred_tasks/dti/`
+We also harmonize the values using `data.harmonize_affinities(mode = 'max_affinity')` and transform them to log scale.
+By default, we use the Drug+Target cold split, as provided by tdcommons.
+## Model Summary
+- **Developers:** IBM Research
+- **GitHub Repository:** https://github.com/BiomedSciAI/biomed-multi-alignment
+- **Paper:** TBD
+- **Release Date:** Oct 28th, 2024
+- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+## Usage
+Using `ibm/biomed.omics.bl.sm.ma-ted-400m` requires installing [biomed-multi-alignment](https://github.com/BiomedSciAI/biomed-multi-alignment):
+```
+pip install git+https://github.com/BiomedSciAI/biomed-multi-alignment.git
+```
+A simple example for a task already supported by `ibm/biomed.omics.bl.sm.ma-ted-400m`:
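+```python
+# Import paths as used in the biomed-multi-alignment repository
+from fuse.data.tokenizers.modular_tokenizer.op import ModularTokenizerOp
+from mammal.examples.dti_bindingdb_kd.task import DtiBindingdbKdTask
+from mammal.model import Mammal
+
+# Illustrative inputs: target amino acid sequence and drug SMILES string
+target_seq = "NLMKRCTRGFRKLGKCTTLEEEKCKTLYPRGQCTCSDSKMNTHSCDCKSC"
+drug_seq = "CC(=O)NCCC1=CNc2c1cc(OC)cc2"
+
+# Load Model
+nn_model = Mammal.from_pretrained("ibm/biomed.omics.bl.sm.ma-ted-400m.dti_bindingdb_pkd")
+# Load Tokenizer
+tokenizer_op = ModularTokenizerOp.from_pretrained("ibm/biomed.omics.bl.sm.ma-ted-400m.dti_bindingdb_pkd")
+# Convert to MAMMAL style
+sample_dict = {"target_seq": target_seq, "drug_seq": drug_seq}
+sample_dict = DtiBindingdbKdTask.data_preprocessing(
+    sample_dict=sample_dict,
+    tokenizer_op=tokenizer_op,
+    target_sequence_key="target_seq",
+    drug_sequence_key="drug_seq",
+    norm_y_mean=None,
+    norm_y_std=None,
+    device=nn_model.device,
+)
+# Forward pass - encoder_only mode, which supports scalar predictions
+batch_dict = nn_model.forward_encoder_only([sample_dict])
+# Post-process the model's output.
+# norm_y_mean / norm_y_std are the label mean and standard deviation used
+# during fine-tuning; their values are not given in this README, so define
+# them before running this step.
+batch_dict = DtiBindingdbKdTask.process_model_output(
+    batch_dict,
+    scalars_preds_processed_key="model.out.dti_bindingdb_kd",
+    norm_y_mean=norm_y_mean,
+    norm_y_std=norm_y_std,
+)
+ans = {
+    "model.out.dti_bindingdb_kd": float(batch_dict["model.out.dti_bindingdb_kd"][0])
+}
+# Print prediction
+print(f"{ans=}")
+```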
+For more advanced usage, see our detailed example at: `https://github.com/BiomedSciAI/biomed-multi-alignment`
+## Citation
+If you found our work useful, please consider giving the repo a star and citing our paper:
+```
+@article{TBD,
+  title={TBD},
+  author={IBM Research Team},
+  journal={arXiv preprint arXiv:TBD},
+  year={2024}
+}
+```
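
The README describes the prediction target as pKd, the negative logarithm of the dissociation constant, and mentions harmonizing repeated measurements with `mode = 'max_affinity'`. The sketch below illustrates both ideas in plain Python. It is not the benchmark's own code: the function names are hypothetical, Kd is assumed to be given in nanomolar, and "max affinity" is assumed to mean keeping the strongest binding measurement (the smallest Kd) for each drug-target pair. The exact log transform used by the benchmark may differ, for example by adding a small offset.

```python
import math
from collections import defaultdict


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Convert a dissociation constant in nM to pKd = -log10(Kd in mol/L)."""
    if kd_nanomolar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_nanomolar * 1e-9)


def harmonize_max_affinity(records):
    """Keep one Kd per (drug, target) pair: the smallest, i.e. strongest binding.

    `records` is an iterable of (drug, target, kd_nanomolar) tuples.
    This mirrors the idea of 'max_affinity' harmonization under the
    assumption stated above; it is not the tdcommons implementation.
    """
    best = defaultdict(lambda: math.inf)
    for drug, target, kd in records:
        best[(drug, target)] = min(best[(drug, target)], kd)
    return dict(best)


if __name__ == "__main__":
    # Hypothetical measurements: the same pair measured twice, plus one other pair
    records = [
        ("drugA", "targetX", 1000.0),
        ("drugA", "targetX", 10.0),
        ("drugB", "targetX", 1.0),
    ]
    for pair, kd in harmonize_max_affinity(records).items():
        print(pair, round(kd_nm_to_pkd(kd), 3))
```

A higher pKd means tighter binding: a Kd of 1 nM corresponds to a pKd of 9, while 1000 nM (1 µM) corresponds to a pKd of 6.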