---
license: mit
pretty_name: SynthIE
datasets:
- martinjosifoski/SynthIE
---
This repository hosts the pre-trained models from the paper [Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction](https://arxiv.org/abs/2303.04132). It is a companion to the project's [homepage and GitHub repository](https://github.com/epfl-dlab/SynthIE), which contains all the details.

The repository contains five models:

- **SynthIE-base-FE** (`synthie_base_fe.ckpt`): *FLAN-T5-base*, fine-tuned on the *SynthIE-code dataset* following the *fully-expanded* output linearization
- **SynthIE-base-SC** (`synthie_base_sc.ckpt`): *FLAN-T5-base*, fine-tuned on the *SynthIE-code dataset* following the *subject-collapsed* output linearization
- **SynthIE-large-FE** (`synthie_large_fe.ckpt`): *FLAN-T5-large*, fine-tuned on the *SynthIE-code dataset* following the *fully-expanded* output linearization
- **GenIE-base-FE** (`genie_base_fe.ckpt`): *FLAN-T5-base*, fine-tuned on the *REBEL dataset* following the *fully-expanded* output linearization
- **GenIE-base-SC** (`genie_base_sc.ckpt`): *FLAN-T5-base*, fine-tuned on the *REBEL dataset* following the *subject-collapsed* output linearization

The [demo notebook](https://github.com/epfl-dlab/SynthIE/blob/main/notebooks/demo.ipynb) in the project's GitHub [repository](https://github.com/epfl-dlab/SynthIE) shows how to download, load, and use the models, as well as the other resources released with the paper, such as the datasets.
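
As a rough illustration of the download step only, the sketch below maps the model names listed in this card to their checkpoint files and fetches one with `huggingface_hub.hf_hub_download`. The repository ID is an assumption (it is taken from the dataset entry in this card's metadata and may differ for the model files), and `download_checkpoint` and `CHECKPOINTS` are hypothetical helper names, not part of the project's code. Loading a checkpoint requires the project's own code, so the demo notebook remains the reference for that step.

```python
from pathlib import Path

# Checkpoint filenames exactly as listed in this model card.
CHECKPOINTS = {
    "SynthIE-base-FE": "synthie_base_fe.ckpt",
    "SynthIE-base-SC": "synthie_base_sc.ckpt",
    "SynthIE-large-FE": "synthie_large_fe.ckpt",
    "GenIE-base-FE": "genie_base_fe.ckpt",
    "GenIE-base-SC": "genie_base_sc.ckpt",
}

# Assumption: the checkpoints are hosted under this Hub repository ID
# (copied from the card metadata); adjust it if the files live elsewhere.
REPO_ID = "martinjosifoski/SynthIE"


def download_checkpoint(model_name: str, local_dir: str = "checkpoints") -> Path:
    """Download one checkpoint from the Hugging Face Hub and return its local path."""
    # Imported lazily so the name-to-file mapping works without the package installed.
    from huggingface_hub import hf_hub_download

    filename = CHECKPOINTS[model_name]  # raises KeyError for unknown model names
    path = hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=local_dir)
    return Path(path)
```

Calling `download_checkpoint("SynthIE-base-SC")` would then place `synthie_base_sc.ckpt` under `checkpoints/`, provided the assumed repository ID is correct and `huggingface_hub` is installed.
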

For more information, please refer to the project's GitHub [repository](https://github.com/epfl-dlab/SynthIE) and the [paper](https://arxiv.org/abs/2303.04132).