---
library_name: transformers
tags:
- CodonTransformer
- Computational Biology
- Machine Learning
- Bioinformatics
- Synthetic Biology
license: apache-2.0
pipeline_tag: token-classification
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c9888b3137cc529d0761c4/GqKutRwiGGif69Gjd8Df3.png)

**CodonTransformer** is the ultimate tool for codon optimization, transforming protein sequences into optimized DNA sequences specific to your target organisms. Whether you are a researcher or a practitioner in genetic engineering, CodonTransformer provides a comprehensive suite of features to facilitate your work. By leveraging the Transformer architecture and a user-friendly Jupyter notebook, it reduces the complexity of codon optimization, saving you time and effort.

**Note:** this is the pretrained model. We recommend using the fine-tuned model available at https://huggingface.co/adibvafa/CodonTransformer.
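To see why codon optimization is a non-trivial search problem: under the standard genetic code, each amino acid is encoded by up to six synonymous codons, so the number of DNA sequences encoding even a short peptide grows multiplicatively, and a model must pick host-appropriate codons from that space. A quick standalone illustration in pure Python (not part of the CodonTransformer API; the helper names here are hypothetical):

```python
from math import prod

# Standard genetic code, built from the canonical TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES),
                       AMINO_ACIDS))

# Group codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

def num_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode `peptide`."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)

print(num_encodings("MALWMR"))       # 1*4*6*1*1*6 = 144
print(num_encodings("MALWMR" * 10))  # grows multiplicatively with length
```

Every one of those sequences encodes the same protein, but they differ substantially in how well a given host expresses them; that is the choice CodonTransformer learns to make.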
17
+
18
+ ## Use Case
19
+ **For an interactive demo, check out our [Google Colab Notebook.](https://adibvafa.github.io/CodonTransformer/GoogleColab)**
20
+ <br></br>
21
+ After installing CodonTransformer, you can use:
22
+ ```python
23
+ import torch
24
+ from transformers import AutoTokenizer, BigBirdForMaskedLM
25
+ from CodonTransformer.CodonPrediction import predict_dna_sequence
26
+ from CodonTransformer.CodonJupyter import format_model_output
27
+ DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
28
+
29
+
30
+ # Load model and tokenizer
31
+ tokenizer = AutoTokenizer.from_pretrained("adibvafa/CodonTransformer")
32
+ model = BigBirdForMaskedLM.from_pretrained("adibvafa/CodonTransformer").to(DEVICE)
33
+
34
+
35
+ # Set your input data
36
+ protein = "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGG"
37
+ organism = "Escherichia coli general"
38
+
39
+
40
+ # Predict with CodonTransformer
41
+ output = predict_dna_sequence(
42
+ protein=protein,
43
+ organism=organism,
44
+ device=DEVICE,
45
+ tokenizer_object=tokenizer,
46
+ model_object=model,
47
+ attention_type="original_full",
48
+ )
49
+ print(format_model_output(output))
50
+ ```
The output is:
<br>

```text
-----------------------------
|          Organism          |
-----------------------------
Escherichia coli general

-----------------------------
|        Input Protein       |
-----------------------------
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGG

-----------------------------
|       Processed Input      |
-----------------------------
M_UNK A_UNK L_UNK W_UNK M_UNK R_UNK L_UNK L_UNK P_UNK L_UNK L_UNK A_UNK L_UNK L_UNK A_UNK L_UNK W_UNK G_UNK P_UNK D_UNK P_UNK A_UNK A_UNK A_UNK F_UNK V_UNK N_UNK Q_UNK H_UNK L_UNK C_UNK G_UNK S_UNK H_UNK L_UNK V_UNK E_UNK A_UNK L_UNK Y_UNK L_UNK V_UNK C_UNK G_UNK E_UNK R_UNK G_UNK F_UNK F_UNK Y_UNK T_UNK P_UNK K_UNK T_UNK R_UNK R_UNK E_UNK A_UNK E_UNK D_UNK L_UNK Q_UNK V_UNK G_UNK Q_UNK V_UNK E_UNK L_UNK G_UNK G_UNK __UNK

-----------------------------
|        Predicted DNA       |
-----------------------------
ATGGCTTTATGGATGCGTCTGCTGCCGCTGCTGGCGCTGCTGGCGCTGTGGGGCCCGGACCCGGCGGCGGCGTTTGTGAATCAGCACCTGTGCGGCAGCCACCTGGTGGAAGCGCTGTATCTGGTGTGCGGTGAGCGCGGCTTCTTCTACACGCCCAAAACCCGCCGCGAAGCGGAAGATCTGCAGGTGGGCCAGGTGGAGCTGGGCGGCTAA
```
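The output above can be sanity-checked without the model: the "Processed Input" is each residue (plus a trailing `_` stop symbol) paired with an unknown-codon placeholder, and the "Predicted DNA" translates back to the input protein under the standard genetic code. A pure-Python check (a sketch, independent of the CodonTransformer API):

```python
protein = "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGG"
predicted_dna = (
    "ATGGCTTTATGGATGCGTCTGCTGCCGCTGCTGGCGCTGCTGGCGCTGTGGGGC"
    "CCGGACCCGGCGGCGGCGTTTGTGAATCAGCACCTGTGCGGCAGCCACCTGGTG"
    "GAAGCGCTGTATCTGGTGTGCGGTGAGCGCGGCTTCTTCTACACGCCCAAAACC"
    "CGCCGCGAAGCGGAAGATCTGCAGGTGGGCCAGGTGGAGCTGGGCGGCTAA"
)

# "Processed Input": every residue, plus a trailing "_" stop symbol,
# paired with an unknown-codon placeholder token.
processed_input = " ".join(f"{aa}_UNK" for aa in protein + "_")
assert processed_input.startswith("M_UNK A_UNK L_UNK")
assert processed_input.endswith("G_UNK __UNK")

# Standard genetic code, built from the canonical TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES),
                       AMINO_ACIDS))

def translate(dna: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

# The predicted DNA encodes the input protein followed by one stop codon.
assert translate(predicted_dna) == protein + "*"
```

This only verifies that the prediction is a faithful encoding of the protein; which synonymous codons were chosen for the target organism is the model's contribution.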

## Additional Resources
- **Project Website** <br>
  https://adibvafa.github.io/CodonTransformer/

- **GitHub Repository** <br>
  https://github.com/Adibvafa/CodonTransformer

- **Google Colab Demo** <br>
  https://adibvafa.github.io/CodonTransformer/GoogleColab

- **PyPI Package** <br>
  https://pypi.org/project/CodonTransformer/

- **Paper** <br>
  TBD