Given text extracted from pages of a sustainability report, this model extracts the scope 1, 2 and 3 emissions in JSON format. The JSON object also contains the pages containing this information. For example, the [2022 sustainability report by the Bristol-Myers Squibb Company](https://www.bms.com/assets/bms/us/en-us/pdf/bmy-2022-esg-report.pdf) leads to the following output: `{"scope_1":202290,"scope_2":161907,"scope_3":1696100,"sources":[88,89]}`.
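
The output is plain JSON, so it can be consumed with the standard library. A minimal sketch using the example output above (the field names are taken from that example; summing the scopes is only for illustration):

```python
import json

# Example model output for the report linked above.
raw = '{"scope_1":202290,"scope_2":161907,"scope_3":1696100,"sources":[88,89]}'

emissions = json.loads(raw)

# Basic schema check: the three scope values plus the pages they were found on.
assert set(emissions) == {"scope_1", "scope_2", "scope_3", "sources"}

total = emissions["scope_1"] + emissions["scope_2"] + emissions["scope_3"]
print(total)                 # 2060297
print(emissions["sources"])  # [88, 89]
```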
This LoRA reaches an emission value extraction accuracy of 65% (up from 46% for the base model) and a source citation accuracy of 77% (base model: 52%) on the [corporate-emission-reports](https://huggingface.co/datasets/nopperl/corporate-emission-reports) dataset. For more information, refer to the [GitHub repo](https://github.com/nopperl/corporate_emission_reports).
## Intended uses & limitations
Using [transformers](https://github.com/huggingface/transformers) as the inference engine:

```shell
python -m corporate_emissions_reports.inference --model_path mistralai/Mistral-7B-Instruct-v0.2 --lora nopperl/emissions-extraction-lora --model_context_size 32768 --engine hf https://www.bms.com/assets/bms/us/en-us/pdf/bmy-2022-esg-report.pdf
```

Compare to base model without LoRA:

```shell
python -m corporate_emissions_reports.inference --model_path mistralai/Mistral-7B-Instruct-v0.2 --model_context_size 32768 --engine hf https://www.bms.com/assets/bms/us/en-us/pdf/bmy-2022-esg-report.pdf
```

Alternatively, it is possible to use [llama.cpp](https://github.com/ggerganov/llama.cpp) as the inference engine. In this case, follow the installation instructions in the [package readme](https://github.com/nopperl/corporate_emission_reports/blob/main/README.md); in particular, the model needs to be downloaded beforehand. Then:

```shell
python -m corporate_emissions_reports.inference --model mistral --lora ./emissions-extraction-lora/ggml-adapter-model.bin https://www.bms.com/assets/bms/us/en-us/pdf/bmy-2022-esg-report.pdf
```

Compare to base model without LoRA:

```shell
python -m corporate_emissions_reports.inference --model mistral https://www.bms.com/assets/bms/us/en-us/pdf/bmy-2022-esg-report.pdf
```

#### Programmatically