hogru committed
Commit db585d3
1 Parent(s): a159286

Update README.md

Files changed (1):
  1. README.md +6 -4
README.md CHANGED
@@ -3,6 +3,9 @@ license: mit
 tags:
 - chemistry
 - smiles
+widget:
+- text: "^"
+  example_title: "Sample molecule | SMILES"
 ---
 
 # Model Card for Model hogru/MolReactGen-GuacaMol-Molecules
@@ -44,7 +47,6 @@ The main use of this model is to pass the master's examination of the author ;-)
 The model can be used in a Hugging Face text generation pipeline. For the intended use case, a wrapper around the raw text generation pipeline is needed; this wrapper is [`generate.py` in the repository](https://github.com/hogru/MolReactGen/blob/main/src/molreactgen/generate.py).
 The model has a default `GenerationConfig()` (`generation_config.json`) that can be overridden. Depending on the number of molecules to be generated (`num_return_sequences` in the `JSON` file), generation might take a while; the wrapper above shows a progress bar while it runs.
 
-
 ## Bias, Risks, and Limitations
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
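For context on the usage described in the hunk above, a minimal sketch of the raw pipeline (not the repository's `generate.py` wrapper) might look like this. The prompt `"^"` is taken from the widget example added in this commit; the sequence count and sampling flag are illustrative assumptions, not the defaults from `generation_config.json`.

```python
# Minimal sketch: raw Hugging Face text-generation pipeline with the model's
# default GenerationConfig loaded and then overridden.
from transformers import GenerationConfig, pipeline

model_id = "hogru/MolReactGen-GuacaMol-Molecules"

# Load the default generation settings shipped as generation_config.json,
# then override the number of molecules to generate.
gen_config = GenerationConfig.from_pretrained(model_id)
gen_config.num_return_sequences = 10  # larger values take correspondingly longer

generator = pipeline("text-generation", model=model_id)
outputs = generator(
    "^",                            # start token, as in the widget example above
    generation_config=gen_config,
    do_sample=True,                 # assumption: sampling yields distinct sequences
)
print([o["generated_text"] for o in outputs])
```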
@@ -63,11 +65,11 @@ The model generates molecules that are similar to the GuacaMol training data, wh
 
 The default Hugging Face `Trainer()` has been used, with an `EarlyStoppingCallback()`.
 
-#### Preprocessing
+### Preprocessing
 
 The training data was pre-processed with a `PreTrainedTokenizerFast()` trained on the training data, using a character-level pre-tokenizer and Unigram as the sub-word tokenization algorithm with a vocabulary size of 88. Other tokenizers can be configured.
 
-#### Training Hyperparameters
+### Training Hyperparameters
 
 - **Batch size:** 64
 - **Gradient accumulation steps:** 4
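As a rough illustration of the preprocessing described in the hunk above, a character-level Unigram tokenizer with a vocabulary of 88 could be built with the `tokenizers` library along these lines. The training file name and the unknown token are placeholders, not values from the card or repository.

```python
# Sketch of the card's tokenizer setup: character-level pre-tokenization,
# Unigram as the sub-word model, vocabulary size 88, wrapped as a
# PreTrainedTokenizerFast. File name and unk token are placeholders.
from tokenizers import Regex, Tokenizer, models, pre_tokenizers, trainers
from transformers import PreTrainedTokenizerFast

tokenizer = Tokenizer(models.Unigram())
# Split on every single character ("." matches any char, kept in isolation).
tokenizer.pre_tokenizer = pre_tokenizers.Split(Regex("."), behavior="isolated")

trainer = trainers.UnigramTrainer(vocab_size=88, unk_token="<unk>")
tokenizer.train(files=["train_smiles.txt"], trainer=trainer)  # placeholder file

fast_tokenizer = PreTrainedTokenizerFast(tokenizer_object=tokenizer)
print(fast_tokenizer.vocab_size)
```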
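Similarly, the training setup from the same hunk (stock `Trainer()` with an `EarlyStoppingCallback()`, batch size 64, gradient accumulation 4) might be wired up as below. Only the two hyperparameters come from the card; the datasets, output directory, evaluation cadence, and patience value are illustrative assumptions.

```python
# Sketch of the described training loop: default Trainer + EarlyStoppingCallback.
from transformers import (
    AutoModelForCausalLM,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

model = AutoModelForCausalLM.from_pretrained("hogru/MolReactGen-GuacaMol-Molecules")

args = TrainingArguments(
    output_dir="molreactgen-out",         # placeholder
    per_device_train_batch_size=64,       # Batch size: 64 (from the card)
    gradient_accumulation_steps=4,        # Gradient accumulation steps: 4 (from the card)
    eval_strategy="epoch",                # assumed: early stopping needs periodic eval
    save_strategy="epoch",
    load_best_model_at_end=True,          # required by EarlyStoppingCallback
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,          # placeholder: tokenized train split
    eval_dataset=eval_dataset,            # placeholder: tokenized eval split
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience assumed
)
trainer.train()
```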
@@ -86,7 +88,7 @@ More configuration (options) can be found in the [`conf`](https://github.com/hog
 
 Please see the slides / the poster mentioned above.
 
-#### Metrics
+### Metrics
 
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 