---
license: mit
pretty_name: SynthIE
datasets:
- martinjosifoski/SynthIE
---

This repository hosts the pre-trained models from the paper [Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction](https://arxiv.org/abs/2303.04132). It is a companion to the project's [homepage and GitHub repository](https://github.com/epfl-dlab/SynthIE), which contains all the details.

The repository contains 5 models:
- **SynthIE-base-FE** (`synthie_base_fe.ckpt`): *FLAN-T5-base*, finetuned on the *SynthIE-code dataset* following the *fully-expanded* output linearization
- **SynthIE-base-SC** (`synthie_base_sc.ckpt`): *FLAN-T5-base*, finetuned on the *SynthIE-code dataset* following the *subject-collapsed* output linearization
- **SynthIE-large-FE** (`synthie_large_fe.ckpt`): *FLAN-T5-large*, finetuned on the *SynthIE-code dataset* following the *fully-expanded* output linearization
<br><br>
- **GenIE-base-FE** (`genie_base_fe.ckpt`): *FLAN-T5-base*, finetuned on the *REBEL dataset* following the *fully-expanded* output linearization
- **GenIE-base-SC** (`genie_base_sc.ckpt`): *FLAN-T5-base*, finetuned on the *REBEL dataset* following the *subject-collapsed* output linearization
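
The checkpoint files listed above can be fetched individually from the Hugging Face Hub. The following is a minimal sketch, not the project's official loading code: it assumes the `huggingface_hub` package is installed, and the helper names `checkpoint_filename` and `download_checkpoint` are hypothetical. The `repo_id` of this model repository is not stated here, so it is left as a required argument. Loading a downloaded checkpoint into a model is covered by the project's demo notebook and is not shown.

```python
# Checkpoint filenames exactly as listed in this model card.
CHECKPOINTS = {
    "SynthIE-base-FE": "synthie_base_fe.ckpt",
    "SynthIE-base-SC": "synthie_base_sc.ckpt",
    "SynthIE-large-FE": "synthie_large_fe.ckpt",
    "GenIE-base-FE": "genie_base_fe.ckpt",
    "GenIE-base-SC": "genie_base_sc.ckpt",
}


def checkpoint_filename(model_name: str) -> str:
    """Map a model name from the list above to its checkpoint filename."""
    try:
        return CHECKPOINTS[model_name]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"Unknown model {model_name!r}; expected one of: {known}") from None


def download_checkpoint(model_name: str, repo_id: str) -> str:
    """Download one checkpoint and return its local file path.

    `repo_id` is the Hub id of this model repository; it is an assumption
    left to the caller because the card does not spell it out.
    """
    # Imported lazily so the mapping above is usable without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=checkpoint_filename(model_name))
```

Calling `download_checkpoint("SynthIE-base-FE", repo_id=...)` with this repository's id would return the path of the cached `.ckpt` file.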

The [demo notebook](https://github.com/epfl-dlab/SynthIE/blob/main/notebooks/demo.ipynb) in the project's GitHub [repository](https://github.com/epfl-dlab/SynthIE) provides instructions on how to download, load, and use the models, as well as the other resources released with the paper, such as the datasets.<br>
For more information, please refer to the project's GitHub [repository](https://github.com/epfl-dlab/SynthIE) and the [paper](https://arxiv.org/abs/2303.04132).