---
license: apache-2.0
base_model: sentence-transformers/all-mpnet-base-v2
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: IKT_classifier_mitigation_best
  results: []
widget:
- text: "Existing Gas Turbine power plant (570 MW)  Installation of prepaid meter  Bring down total T&D loss to a single digit by 2030 Transport  Improvement of road traffic congestion improvement in fuel efficiency)  Widening of roads (2 to 4 lanes) and improving road quality  Construct NMT and bicycle lanes  Electronic Road Pricing (ERP) or congestion charging  Reduction of private cars and encourage electric and hybrid vehicles  Development of Urban Transport Master Plans (UTMP) to improve transport systems in line with the Urban Plan/ City Plan for all major cities and urban area  Introducing Intelligent Transport System (ITS) based public transport management system to ensure better performance, enhance reliability, safety and service  Establish charging station network and electric buses in major cities  Modal shift from road to rail (25% modal shift of passenger-km) through different Transport projects such as BRT, MRT in major cities, Multi-modal hub creation, new bridges etc."
  example_title: ['Act. mob.', 'Pub. transport improvement']
- text: "Energy efficiency improvement measures include market transformation to energy efficient lighting that showed significant drop in electricity consumption that reached 40% in some buildings as well as improved energy efficiency in industrial sector through energy management systems and simple energy optimization measures.
  • Low Carbon Transport: The further expansion in the Greater Cairo underground metro network included the operation of stage 4 of length 11.5 km (Phase I: 2019, Phase II: 2020) of the third Cairo metro line as a progress towards achieving the modal shift to low carbon mass transit.14 The third line is the first metro to link east and west Cairo and is expected to serve 2 million passenger trips per day.15 The concept of high quality service buses has been introduced to Egypt targeting car owners to use the newly public transportation system that is integrated with the existing mass transit systems."
  example_title: ['Public transport improvement']
- text: "Potential Actions Unconditional Contribution The targeted GHG emission reduction for unconditional contributions will be implemented through a set of mitigation actions. The potential mitigations actions are elaborated in Table 4. Table 4: Possible Mitigation Actions to deliver the Unconditional Contribution Sector Description Actions by 2030 Energy Power  Implementation of renewable energy projects  Enhanced efficiency of existing power plants  Use of improved technology for power generation Transport  Improvement of fuel efficiency for transport sub- sector  Increase use of less emission- based transport system and improve Inland Water Transport System Power  Implementation of renewable energy projects of 911.8 MW  Grid-connected Solar-581 MW, Wind-149 MW, MW, Solar Mini-grid-56.8 MW  Installation of new Combined Cycle Gas based power plant (3208 MW)  Efficiency improvement of Existing Gas Turbine power plant (570 MW)  Installation of prepaid meter Transport  Improvement of road traffic congestion improvement in"
  example_title: ['Veh. impr.', 'Impr.
    infrastructure']
---

# IKT_classifier_mitigation_best

This model is a fine-tuned version of [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) on the [GIZ/policy_qa_v0_1](https://huggingface.co/datasets/GIZ/policy_qa_v0_1) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6517
- Precision Micro: 0.3667
- Precision Weighted: 0.4273
- Precision Samples: 0.4539
- Recall Micro: 0.7543
- Recall Weighted: 0.7543
- Recall Samples: 0.7982
- F1-score: 0.5422
- Accuracy: 0.1654

## Model description

The model is a multi-label text classifier based on [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) and fine-tuned on text sourced from national climate policy documents.

## Intended uses & limitations

The classifier assigns the following classes to denote Mitigation categories as portrayed in extracted passages from the documents. The Mitigation categories are based on a taxonomy defined by the TraCS Climate Strategies for Transport (implemented by GIZ and funded by the International Climate Initiative (IKI) of the German Federal Ministry for Economic Affairs and Climate Action (BMWK)):

|index|Category|
|---|---|
|0|Active mobility|
|1|Alternative fuels|
|2|Aviation improvements|
|3|Comprehensive transport planning|
|4|Digital solutions|
|5|Economic instruments|
|6|Education and behavioral change|
|7|Electric mobility|
|8|Freight efficiency improvements|
|9|Improve infrastructure|
|10|Labels|
|11|Land use|
|12|Public transport improvement|
|13|Shipping improvements|
|14|Transport demand management|
|15|Vehicle improvements|

The intended use is for climate policy researchers and analysts seeking to automate the process of reviewing lengthy, non-standardized PDF documents to produce summaries and reports.

Due to inconsistencies in the training data, the classifier performance leaves room for improvement.
The classifier exhibits reasonable multi-label validation metrics (F1 ~ 0.5): precision is low, meaning many predicted labels are false positives (precision ~ 0.4), but the model casts a wide net and captures most true positives (recall ~ 0.75). When tested on real-world unseen test data, the performance was similar to training validation (F1 ~ 0.5). However, testing was based on a small out-of-sample dataset containing its own inconsistencies. Therefore, classification may prove better or worse in practice.

## Training and evaluation data

The training dataset consists of labelled passages from 2 sources:
- [ClimateWatch NDC Sector data](https://www.climatewatchdata.org/data-explorer/historical-emissions?historical-emissions-data-sources=climate-watch&historical-emissions-gases=all-ghg&historical-emissions-regions=All%20Selected&historical-emissions-sectors=total-including-lucf%2Ctotal-including-lucf&page=1). Here we utilized the QA dataset (CW_NDC_data_Sector).
- [IKI TraCS Climate Strategies for Transport Tracker](https://changing-transport.org/wp-content/uploads/20220722_Tracker_Database.xlsx) implemented by GIZ and funded by the International Climate Initiative (IKI) of the German Federal Ministry for Economic Affairs and Climate Action (BMWK).

The combined dataset [GIZ/policy_qa_v0_1](https://huggingface.co/datasets/GIZ/policy_qa_v0_1) contains ~85k rows. Each row is stored three times (the original plus two duplicates), once per sequence length (denoted by the values 'small', 'medium', and 'large', which correspond to sequence lengths of 60, 85, and 150 respectively, as indicated in the 'strategy' column). This effectively means the useful size of the dataset is about 1/3 of its row count, and the 'strategy' value should be selected based on the use case. For this training, we utilized the 'medium' samples, from the IKITracs data only. Furthermore, for each row, the 'context' column contains 3 samples of varying quality. The approach used to assess quality and select samples is described below.

The pre-processing operations used to produce the final training dataset were as follows:

1. The dataset is filtered on the 'medium' value in the 'strategy' column (sequence length = 85), selecting only IKITracs samples.
2. For ClimateWatch, all rows are removed, as no taxonomical alignment was found with the IKITracs labels inherent to the dataset.
3. For IKITracs, labels are assigned based on the presence of 'parameter' values matching the category mapping taxonomy defined by TraCS (ref. below).
4. If 'context_translated' is available and the 'language' is not English, 'context' is replaced with 'context_translated'. This results in the model being trained on English translations of original text samples.
5. The dataset is "exploded", i.e., the text samples in the 'context' column, which are lists, are converted into separate rows, and labels are merged to align with the associated samples.
6. The 'match_onanswer' and 'answerWordcount' columns are used conditionally to select high-quality samples (a high % of word matches in 'match_onanswer' is preferred, but a lower % is accepted if 'answerWordcount' is high).
7. Data is then augmented using sentence shuffle from the ```albumentations``` library and insertions from ```nlpaug```. This is done to increase the number of training samples available for under-represented classes. Given the large number of classes for this classifier, it is unsurprising that some categories have very low representation. In this case, classes with fewer than 1/3 as many instances as the most represented class are categorized as under-represented, and each instance is augmented to effectively double the number of instances for these classes.
8. To address the remaining class imbalances, the ratio of negative instances to positive instances for each class is computed to produce a weights array. This array is passed to a custom multi-label trainer function, which is used during hyperparameter tuning and final model training.
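
As an illustration of steps 7 and 8, the following is a minimal sketch in plain Python, not the authors' implementation. The function names, the toy label matrix, and the choice to give a weight of 0.0 to a class with no positives are assumptions made for this example; the card itself specifies only the 1/3 threshold and the negative-to-positive ratio.

```python
# Hypothetical helpers illustrating steps 7 and 8; names are not from the original code.

def positives_per_class(labels):
    """Count positive instances per class in a multi-hot label matrix (rows = samples)."""
    return [sum(column) for column in zip(*labels)]


def underrepresented_classes(labels, fraction=1 / 3):
    """Indices of classes with fewer positives than `fraction` of the largest class."""
    counts = positives_per_class(labels)
    threshold = fraction * max(counts)
    return [index for index, count in enumerate(counts) if count < threshold]


def negative_to_positive_weights(labels):
    """Per-class ratio of negative to positive instances.

    Assumption: a class with no positives gets weight 0.0 to avoid division by zero.
    """
    n_samples = len(labels)
    return [
        (n_samples - count) / count if count else 0.0
        for count in positives_per_class(labels)
    ]


# Toy multi-hot labels: 8 samples, 3 classes (purely illustrative data).
toy_labels = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]

print(positives_per_class(toy_labels))        # -> [6, 3, 1]
print(underrepresented_classes(toy_labels))   # -> [2]
print(negative_to_positive_weights(toy_labels))
```

A per-class array of this kind is commonly supplied as `pos_weight` to `torch.nn.BCEWithLogitsLoss`; whether the custom trainer uses it that way is not stated in this card.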

### **Parameter to category mapping taxonomy**

|index|Category|Parameter|
|---|---|---|
|0|Active mobility|S\_Activemobility , S\_Cycling , S\_Walking |
|1|Alternative fuels|I\_Altfuels , I\_Biofuel , I\_Ethanol , I\_Hydrogen , I\_LPGCNGLNG , I\_RE |
|2|Aviation improvements|I\_Aircraftfleet , I\_Airtraffic , I\_Aviation , I\_Capacityairport , I\_CO2certificate , I\_Jetfuel |
|3|Comprehensive transport planning|A\_Complan , A\_LATM , A\_Natmobplan , A\_SUMP |
|4|Digital solutions|I\_Autonomous , I\_DataModelling , I\_ITS , I\_Other , S\_Maas , S\_Ondemand , S\_Sharedmob |
|5|Economic instruments|A\_Economic , A\_Emistrad , A\_Finance , A\_Fossilfuelsubs , A\_Fueltax , A\_Procurement , A\_Roadcharging , A\_Vehicletax |
|6|Education and behavioral change|I\_Campaigns , I\_Capacity , I\_Ecodriving , I\_Education |
|7|Electric mobility|I\_Emobility , I\_Emobilitycharging , I\_Emobilitypurchase , I\_ICEdiesel , I\_Smartcharging , S\_Micromobility |
|8|Freight efficiency improvements|I\_Freighteff , I\_Load , S\_Railfreight |
|9|Improve infrastructure|S\_Infraexpansion , S\_Infraimprove , S\_Intermodality |
|10|Labels|I\_Efficiencylabel , I\_Freightlabel , I\_Fuellabel , I\_Transportlabel , I\_Vehiclelabel |
|11|Land use|A\_Density , A\_Landuse , A\_Mixuse |
|12|Public transport improvement|S\_BRT , S\_PTIntegration , S\_PTPriority , S\_PublicTransport |
|13|Shipping improvements|I\_Onshorepower , I\_PortInfra , I\_Shipefficiency , I\_Shipping |
|14|Transport demand management|A\_Caraccess , A\_Commute , A\_Parkingprice , A\_TDM , A\_Teleworking , A\_Work , S\_Parking |
|15|Vehicle improvements|A\_LEZ , I\_Efficiencystd , I\_Fuelqualimprove , I\_Inspection , I\_Lowemissionincentive , I\_Vehicleeff , I\_Vehicleimprove , I\_VehicleRestrictions , I\_Vehiclescrappage |

## Training procedure

The model hyperparameters were tuned using ```optuna``` over 10 trials on a truncated training and validation dataset.
The model was then trained over 5 epochs using the best hyperparameters identified.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3.6181464293180716e-05
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 300.0
- num_epochs: 5

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision Micro | Precision Weighted | Precision Samples | Recall Micro | Recall Weighted | Recall Samples | F1-score | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------------:|:------------------:|:-----------------:|:------------:|:---------------:|:--------------:|:--------:|:--------:|
| No log        | 1.0   | 398  | 1.0635          | 0.1718          | 0.2238             | 0.1763            | 0.7714       | 0.7714          | 0.7945         | 0.2794   | 0.0      |
| 1.2442        | 2.0   | 796  | 0.8827          | 0.2167          | 0.2522             | 0.2388            | 0.7543       | 0.7543          | 0.7863         | 0.3518   | 0.0      |
| 0.9539        | 3.0   | 1194 | 0.7579          | 0.2710          | 0.3279             | 0.2979            | 0.7543       | 0.7543          | 0.7932         | 0.4134   | 0.0150   |
| 0.8265        | 4.0   | 1592 | 0.6773          | 0.3377          | 0.3943             | 0.3937            | 0.7429       | 0.7429          | 0.7901         | 0.4961   | 0.0752   |
| 0.8265        | 5.0   | 1990 | 0.6517          | 0.3667          | 0.4273             | 0.4539            | 0.7543       | 0.7543          | 0.7982         | 0.5422   | 0.1654   |

### Framework versions

- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.13.1
- Tokenizers 0.13.3
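
As a reading aid for the training results table, the sketch below shows how a multi-label classifier can combine high recall with low precision and a much lower accuracy. It assumes that "Accuracy" means exact-match (subset) accuracy, which is what scikit-learn's `accuracy_score` computes for multi-label input; the card does not state how accuracy or the F1 averaging was computed. The function names and toy data are hypothetical.

```python
# Hypothetical helpers for illustration only; not the evaluation code used for this model.

def micro_precision_recall(y_true, y_pred):
    """Micro-averaged precision and recall over all (sample, class) label slots."""
    true_positives = sum(
        t * p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    predicted = sum(sum(row) for row in y_pred)
    actual = sum(sum(row) for row in y_true)
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def exact_match_accuracy(y_true, y_pred):
    """Share of samples whose full predicted label set equals the true label set."""
    matches = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return matches / len(y_true)


# Toy data: the classifier finds every true label but also over-predicts.
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 1, 0], [0, 1, 1], [1, 1, 0], [1, 0, 1]]

precision, recall = micro_precision_recall(y_true, y_pred)
print(precision, recall)                     # -> 0.625 1.0
print(exact_match_accuracy(y_true, y_pred))  # -> 0.25
```

Under this assumption, a single extra predicted label makes a whole sample count as wrong, so accuracy can stay low even when most true labels are recovered.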