documentation
README.md
CHANGED
@@ -16,11 +16,14 @@ license: apache-2.0
 
 ## Introduction
 
-Question/Answering on scientific documents using LLMs
-
-Differently to most of the
+Question/Answering on scientific documents using LLMs: ChatGPT-3.5-turbo, Mistral-7b-instruct and Zephyr-7b-beta.
+The Streamlit application demonstrates the implementation of RAG (Retrieval Augmented Generation) on scientific documents, which we are developing at NIMS (National Institute for Materials Science) in Tsukuba, Japan.
+Unlike most similar projects, we focus on scientific articles.
+We target only the full text, extracted using [Grobid](https://github.com/kermitt2/grobid), which provides cleaner results than a raw PDF2Text converter (which is comparable to what most other solutions use).
 
-
+Additionally, this frontend visualises named entities in the LLM responses, extracting <span style="color:yellow">physical quantities and measurements</span> (with [grobid-quantities](https://github.com/kermitt2/grobid-quantities)) and <span style="color:blue">materials</span> mentions (with [grobid-superconductors](https://github.com/lfoppiano/grobid-superconductors)).
+
+The conversation is backed by a sliding-window memory (the 4 most recent messages), which helps refer to information previously discussed in the chat.
 
 **Demos**:
 - (on HuggingFace spaces): https://lfoppiano-document-qa.hf.space/