juliocesar-io committed
Commit 0245944
1 Parent(s): 8faf658

updated Docker build image

Files changed (3):
  1. Dockerfile +5 -10
  2. README.md +118 -0
  3. requirements.txt +2 -1
Dockerfile CHANGED
@@ -1,13 +1,8 @@
- # Use NVIDIA PyTorch image as the base
- FROM nvcr.io/nvidia/pytorch:22.03-py3
+ FROM pytorch/pytorch:1.13.1-cuda11.6-cudnn8-runtime
 
- RUN apt-get update && apt-get install -y libxrender1
-
- # Base pytorch
- RUN conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge
-
- # Set required versions for each core dependency using cu116
- RUN pip install torch-scatter==2.0.9 torch-sparse==0.6.14 torch-cluster==1.6.0 torch-spline-conv==1.2.1 torch-geometric==2.1.0 -f https://data.pyg.org/whl/torch-1.12.0+cu116.html
+ RUN apt-get update && apt-get install -y libxrender1 build-essential
+ RUN pip install torch-sparse -f https://data.pyg.org/whl/torch-1.13.1+cu116.html
+ RUN pip install torch-scatter==2.0.9 torch-cluster==1.6.0 torch-spline-conv==1.2.1 torch-geometric==2.1.0 -f https://data.pyg.org/whl/torch-1.13.1+cu116.html
 
  # Create a new user named "user" with UID 1000
  RUN useradd -m -u 1000 user
@@ -41,4 +36,4 @@ RUN pip install --no-cache-dir --upgrade pip
  RUN pip install --user .
 
  # Set the default command to bash
- CMD ["/bin/bash"]
+ CMD ["python", "app.py"]
README.md CHANGED
@@ -8,3 +8,121 @@ app_port: 7860
  app_file: app.py
  pinned: false
  ---
+
+ # PLA-Net: Predicting Protein-Ligand Interactions with Deep Graph Networks
+
+ Forked version of [PLA-Net](https://github.com/BCV-Uniandes/PLA-Net)
+
+ ## Background
+
+ **PLA-Net** is a deep learning model designed to predict interactions between small organic molecules (ligands) and any of the 102 target proteins in the Alzheimer's Disease (AD) dataset. By transforming molecular and protein sequences into graph representations, PLA-Net leverages Graph Convolutional Networks (GCNs) to analyze and predict target-ligand interaction probabilities. Developed by [BCV-Uniandes](https://github.com/BCV-Uniandes/PLA-Net).
+
+ ## Key Features
+
+ - **Graph-Based Input Representation**
+   - **Ligand Module (LM):** Converts SMILES sequences of molecules into graph representations.
+   - **Protein Module (PM):** Transforms FASTA sequences of proteins into graph structures.
+
+ - **Deep Graph Convolutional Networks**
+   - Each module employs a deep GCN followed by an average pooling layer to extract meaningful features from the input graphs.
+
+ - **Interaction Prediction**
+   - The feature representations from the LM and PM are concatenated.
+   - A fully connected layer processes the combined features to predict the interaction probability between the ligand and the target protein (see the sketch after this list).
+
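+ As a rough illustration of this two-branch design, here is a minimal PyTorch Geometric sketch. It is not the repository's actual code: the class names, dimensions, and layer count are hypothetical, and a generic `GCNConv` stands in for whatever graph convolution the paper uses. It only assumes `torch` and `torch_geometric`, both of which the Dockerfile installs.
+
+ ```python
+ import torch
+ import torch.nn as nn
+ from torch_geometric.nn import GCNConv, global_mean_pool
+
+
+ class GraphBranch(nn.Module):
+     """Deep GCN over one input graph (ligand or protein), then average pooling."""
+
+     def __init__(self, in_dim, hidden_dim, num_layers=4):
+         super().__init__()
+         dims = [in_dim] + [hidden_dim] * num_layers
+         self.convs = nn.ModuleList(
+             [GCNConv(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])]
+         )
+
+     def forward(self, x, edge_index, batch):
+         for conv in self.convs:
+             x = conv(x, edge_index).relu()
+         return global_mean_pool(x, batch)  # one feature vector per graph
+
+
+ class InteractionHead(nn.Module):
+     """Concatenates LM and PM features and predicts an interaction probability."""
+
+     def __init__(self, lm_dim, pm_dim):
+         super().__init__()
+         self.fc = nn.Linear(lm_dim + pm_dim, 1)
+
+     def forward(self, lm_feat, pm_feat):
+         fused = torch.cat([lm_feat, pm_feat], dim=-1)
+         return torch.sigmoid(self.fc(fused))  # probability in [0, 1]
+ ```
+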
+ ## Quick Start
+
+ If you want to run PLA-Net without installing it, you can use it for free on this [Hugging Face Space](https://huggingface.co/spaces/juliocesar-io/PLA-Net).
+
+ ## Docker Install
+
+ To prevent conflicts with the host machine, it is recommended to run PLA-Net in a Docker container.
+
+ First, make sure you have an NVIDIA GPU and the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) installed. Then build the image with the following command:
+
+ ```bash
+ docker build -t pla-net:latest .
+ ```
+
+ ### Inference
+
+ To run inference, run the following command. It runs inference for the target protein `ada` with the SMILES in the `input_smiles.csv` file and saves the predictions to the `output_predictions.csv` file:
+
+ ```bash
+ docker run \
+ -it --rm --gpus all \
+ -v "$(pwd)":/home/user/output \
+ pla-net:latest \
+ python /home/user/app/scripts/pla_net_inference.py \
+ --use_gpu \
+ --target ada \
+ --target_list /home/user/app/data/datasets/AD/Targets_Fasta.csv \
+ --target_checkpoint_path /home/user/app/pretrained-models/BINARY_ada \
+ --input_file_smiles /home/user/app/example/input_smiles.csv \
+ --output_file /home/user/output/output_predictions.csv
+ ```
+
+ Args:
+
+ - `use_gpu`: Use GPU for inference.
+ - `target`: Target protein ID from the list of targets. Check the list of available targets in the [Targets_Fasta.csv](https://github.com/juliocesar-io/PLA-Net/blob/main/data/datasets/AD/Targets_Fasta.csv) file.
+ - `target_list`: Path to the target list CSV file.
+ - `target_checkpoint_path`: Path to the target checkpoint (e.g. `/home/user/app/pretrained-models/BINARY_ada`); there is one checkpoint per target.
+ - `input_file_smiles`: Path to the input SMILES file.
+ - `output_file`: Path to the output predictions file.
+
+ The prediction file has the following format:
+
+ ```bash
+ target,smiles,interaction_probability,interaction_class
+ ada,Cn4c(CCC(=O)Nc3ccc2ccn(CC[C@H](CO)n1cnc(C(N)=O)c1)c2c3)nc5ccccc45,0.9994347542524338,1
+ ```
+
+ Where `interaction_class` is 1 if the interaction probability is greater than 0.5, and 0 otherwise.
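+
+ For quick post-processing of that file, a small pandas sketch like the following works (a hypothetical helper, not part of the repository; it assumes only the four columns shown above and that pandas is installed):
+
+ ```python
+ import pandas as pd
+
+ preds = pd.read_csv("output_predictions.csv")
+
+ # Keep predicted binders (interaction_probability > 0.5, i.e. interaction_class == 1),
+ # sorted by model confidence.
+ binders = preds[preds["interaction_class"] == 1].sort_values(
+     "interaction_probability", ascending=False
+ )
+ print(binders[["target", "smiles", "interaction_probability"]].to_string(index=False))
+ ```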
+
+ ### Gradio Server
+
+ We provide a simple graphical user interface to run PLA-Net with Gradio. To use it, run the following command:
+
+ ```bash
+ docker run \
+ -it --rm --gpus all \
+ -p 7860:7860 \
+ pla-net:latest \
+ python app.py
+ ```
+
+ Then open your browser and go to `http://localhost:7860/` to access the web interface.
+
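+ The same server can also be queried programmatically with the `gradio_client` package. A hedged sketch: the endpoint names and parameters are specific to `app.py` and are not documented here, so the example only connects and lists them rather than calling one.
+
+ ```python
+ from gradio_client import Client
+
+ # Connect to the locally running Gradio server started above.
+ client = Client("http://localhost:7860/")
+
+ # Print the endpoints and parameters exposed by app.py, since the
+ # exact prediction signature is app-specific.
+ client.view_api()
+ ```
+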
+ ## Local Install
+
+ To run inference with PLA-Net locally, install the dependencies and activate the environment with the following commands:
+
+ ```bash
+ conda env create -f environment.yml
+ conda activate pla-net
+ ```
+
+ Now you can run inference with PLA-Net locally. In the project folder, run the following command:
+
+ ```bash
+ python scripts/pla_net_inference.py \
+ --use_gpu \
+ --target ada \
+ --target_list data/datasets/AD/Targets_Fasta.csv \
+ --target_checkpoint_path pretrained-models/BINARY_ada \
+ --input_file_smiles example/input_smiles.csv \
+ --output_file example/output_predictions.csv
+ ```
+
+ ## Models
+
+ You can download the pre-trained models from [Hugging Face](https://huggingface.co/juliocesar-io/PLA-Net).
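+
+ One way to fetch them programmatically is with the `huggingface_hub` client (a hedged example: the package is not in `requirements.txt`, and the local directory name is just an assumption chosen to match the inference commands above):
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download the pretrained checkpoints from the model repo into ./pretrained-models,
+ # so paths like pretrained-models/BINARY_ada match the commands above.
+ snapshot_download(
+     repo_id="juliocesar-io/PLA-Net",
+     local_dir="pretrained-models",
+ )
+ ```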
+
+ ## Training
+
+ To train each of the components of our method (LM, LM+Advs, LMPM, and PLA-Net), refer to the `planet.sh` file and run the desired models.
+
+ To evaluate each of the components of our method (LM, LM+Advs, LMPM, and PLA-Net), run the corresponding bash file in the `inference` folder.
requirements.txt CHANGED
@@ -6,4 +6,5 @@ h5py==3.11.0
  scipy==1.9.0
  numpy==1.24.4
  gradio==4.43.0
- fastapi==0.112.4
+ fastapi==0.112.4
+ jinja2==3.1.4