File size: 1,813 Bytes
07e12dc
 
 
 
15d2ec1
 
 
7be731f
07e12dc
7be731f
 
 
07e12dc
7be731f
 
aa9e512
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
---
library_name: transformers
tags: []
---

<img src="https://github.com/edchengg/gollie-transfusion/raw/main/assets/gollie-tf-example.png" style="height: 150px;">

# Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction

## Summary
We propose TransFusion, a framework in which models are fine-tuned to use English translations of low-resource language data, enabling more precise predictions through annotation fusion. 
Based on TransFusion, we introduce GoLLIE-TF, a cross-lingual instruction-tuned LLM for IE tasks, designed to close the performance gap between high and low-resource languages.

- 📖 Paper: [Translation and Fusion Improves Zero-shot Cross-lingual Information Extraction](https://arxiv.org/abs/2305.13582)
- 🤗 Model: [GoLLIE-7B-TF](https://huggingface.co/ychenNLP/GoLLIE-7B-TF)
- 🚀 Example Jupyter Notebooks: [GoLLIE-TF Notebooks](notebooks/tf.ipynb)


**Important**: This is based on GoLLIE README (Our flash attention implementation has small numerical differences compared to the attention implementation in Huggingface.
You must use the flag `trust_remote_code=True` or you will get inferior results. Flash attention requires an available CUDA GPU. Running GOLLIE 
pre-trained models on a CPU is not supported. We plan to address this in future releases. First, install flash attention 2:)
```bash
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
```

Then you can load the model using

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("HiTZ/GoLLIE-7B")
model = AutoModelForCausalLM.from_pretrained("HiTZ/GoLLIE-7B", trust_remote_code=True, torch_dtype=torch.bfloat16)
model.to("cuda")
```