Update README.md
README.md
CHANGED
@@ -1,3 +1,38 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
metrics:
|
4 |
+
- exact_match
|
5 |
+
---
|
6 |
+
|
7 |
+
# LLoCO: Learning Long Contexts Offline
|
8 |
+
[**Paper**](https://arxiv.org/abs/2404.07979) | [**Code**](https://github.com/jeffreysijuntan/lloco)
|
9 |
+
|
10 |
+
Lloco-7b-quality is a LoRA adapter checkpoint finetuned from [AutoCompressor-Llama-2-7b-6k](https://huggingface.co/princeton-nlp/AutoCompressor-Llama-2-7b-6k/) and [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf)
|
11 |
+
using the **LLoCO** method in [LLoCO: Learning Long Contexts Offline](https://arxiv.org/abs/2404.07979). It is instruction-tuned on the QuALITY training set.
|
12 |
+
|
13 |
+
**LLoCO** enables LLMs to process long contexts efficiently by learning contexts offline through context compression and in-domain parameter-efficient finetuning with LoRA. This approach extends the effective context window of a 4k-token LLaMA2-7B model to handle up to 128k tokens, while using
|
14 |
+
30x fewer tokens and achieving up to 7.62x inference speed-up.
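
As a rough back-of-the-envelope illustration of the 30x figure, the sketch below estimates how many tokens remain after compression. It assumes a uniform 30x ratio applied to the whole input, which is a simplification; `compressed_length` and `COMPRESSION_RATIO` are names made up for this example and are not part of the LLoCO code.

```python
# Illustrative arithmetic only: assumes a uniform 30x compression ratio.
COMPRESSION_RATIO = 30  # taken from the "30x fewer tokens" figure above


def compressed_length(num_tokens: int, ratio: int = COMPRESSION_RATIO) -> int:
    """Estimate the token count after compression, rounding up."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return -(-num_tokens // ratio)  # ceiling division on integers


if __name__ == "__main__":
    for n in (4_000, 32_000, 128_000):
        print(f"{n:>7} input tokens -> ~{compressed_length(n)} after compression")
```

Under this assumption, a 128,000-token document shrinks to roughly 4.3k tokens, which is on the order of the base model's native 4k window.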
|
15 |
+
|
16 |
+
## Released LoRA Checkpoint
|
17 |
+
| Model | LoRA Rank | Dataset | Link |
|
18 |
+
|:----------------|-----------|-------------|--------------------------------------------------------|
|
19 |
+
| Lloco-7b-quality| 8 | QuALITY | [link](https://huggingface.co/xiuyul/Lloco-7b-quality/)|
|
20 |
+
| Lloco-7b-qasper | 8 | Qasper | [link](https://huggingface.co/xiuyul/Lloco-7b-qasper/) |
|
21 |
+
| Lloco-7b-qmsum | 8 | QMSum | [link](https://huggingface.co/xiuyul/Lloco-7b-qmsum/) |
|
22 |
+
| Lloco-7b-nqa | 8 | NarrativeQA | [link](https://huggingface.co/xiuyul/Lloco-7b-nqa/) |
|
23 |
+
| Lloco-7b-hqa | 8 | HotpotQA | [link](https://huggingface.co/xiuyul/Lloco-7b-hqa/) |
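
To fetch one of the checkpoints listed above, a minimal sketch using `huggingface_hub.snapshot_download` might look like the following. The repository IDs are derived from the links in the table; the helper names `repo_id_for` and `download_adapter` are hypothetical. Applying the adapter to the base model is handled by the code repository and is not shown here.

```python
# Sketch: download a released LoRA checkpoint from the Hugging Face Hub.
# Requires `pip install huggingface_hub`; helper names are illustrative.

# Short names taken from the table of released checkpoints.
CHECKPOINTS = ("quality", "qasper", "qmsum", "nqa", "hqa")


def repo_id_for(name: str) -> str:
    """Map a short dataset name to its Hub repository ID."""
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {name!r}; expected one of {CHECKPOINTS}")
    return f"xiuyul/Lloco-7b-{name}"


def download_adapter(name: str) -> str:
    """Download the checkpoint files and return the local directory path."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(repo_id=repo_id_for(name))


# Example (needs network access): local_dir = download_adapter("quality")
```

The download is kept in its own function so that resolving a repository ID never triggers network access.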
|
24 |
+
|
25 |
+
## Citation
|
26 |
+
If you find this project useful, please consider citing:
|
27 |
+
|
28 |
+
```
|
29 |
+
@article{tan2024lloco,
|
30 |
+
title={LLoCO: Learning Long Contexts Offline},
|
31 |
+
author={Tan, Sijun and Li, Xiuyu and Patil, Shishir and Wu, Ziyang and Zhang, Tianjun and Keutzer, Kurt and Gonzalez, Joseph E and Popa, Raluca Ada},
|
32 |
+
journal={arXiv preprint arXiv:2404.07979},
|
33 |
+
year={2024}
|
34 |
+
}
|
35 |
+
```
|
36 |
+
|
37 |
+
## Evaluation
|
38 |
+
Check out [LLoCO: Learning Long Contexts Offline](https://arxiv.org/abs/2404.07979) for evaluation results on various long-context tasks, such as long-document question answering and summarization.
|