Abstract
Document retrieval techniques form the foundation of large-scale information systems. The prevailing approach is to construct a bi-encoder and compute semantic similarity between the query and each document. However, such a scalar similarity conveys limited information and impedes our understanding of the retrieval results. In addition, this computation emphasizes global semantics and ignores the fine-grained semantic relationship between the query and the complex text within a document. In this paper, we propose a new method called Generation Augmented Retrieval (GeAR) that incorporates well-designed fusion and decoding modules. These enable GeAR to generate the relevant text from a document based on the fused representation of the query and the document, thus learning to "focus on" fine-grained information. Moreover, when used as a retriever, GeAR adds no computational burden over bi-encoders. To support training under the new framework, we introduce a pipeline that efficiently synthesizes high-quality data using large language models. GeAR exhibits competitive retrieval and localization performance across diverse scenarios and datasets. Furthermore, the qualitative analysis and the text generated by GeAR provide novel insights into the interpretation of retrieval results. The code, data, and models will be released after technical review to facilitate future research.
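As a point of reference for the bi-encoder baseline the abstract contrasts against, here is a minimal sketch of how such retrieval reduces each query-document pair to a single scalar similarity. The `embed` function below is a hypothetical stand-in for a learned encoder (not GeAR's actual model); only the scoring pattern is illustrative.

```python
# Sketch of the bi-encoder baseline: each text is encoded independently
# into one vector, and retrieval reduces to a scalar similarity per pair.
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Hypothetical stand-in for a learned encoder: a deterministic
    pseudo-embedding, L2-normalized so dot product equals cosine similarity."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def score(query: str, document: str) -> float:
    """Cosine similarity between independently encoded query and document.
    Note: the two texts never interact before this single scalar is produced,
    which is exactly the information bottleneck the abstract points out."""
    return float(embed(query) @ embed(document))

docs = [
    "GeAR fuses query and document representations.",
    "Bi-encoders compute one scalar similarity per pair.",
]
query = "fine-grained document retrieval"
ranking = sorted(docs, key=lambda d: score(query, d), reverse=True)
```

GeAR keeps this same cheap scoring path at retrieval time (hence no added cost over bi-encoders), while its fusion and decoding modules additionally learn to generate the query-relevant span from the document.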
Community
TLDR: We propose Generation Augmented Retrieval (GeAR), a framework that enhances document retrieval by incorporating fine-grained semantic focus and interpretable generation capabilities, while maintaining the efficiency of bi-encoders.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation (2024)
- Enhancing Multimodal Query Representation via Visual Dialogues for End-to-End Knowledge Retrieval (2024)
- jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images (2024)
- Adaptive Two-Phase Finetuning LLMs for Japanese Legal Text Retrieval (2024)
- KG-Retriever: Efficient Knowledge Indexing for Retrieval-Augmented Large Language Models (2024)
- EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation (2024)
- VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation (2024)