# DocumentIQA: Document Insight Question/Answer

## Introduction

Question answering on scientific documents: upload your document and ask questions about it.
In our implementation we use [Grobid](https://github.com/kermitt2/grobid) for text extraction instead of the raw PDF2Text converter.
Thanks to Grobid, we can precisely extract the abstract and the full text.
This is just the beginning, and publishing the project early might help gather more feedback.
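
As a rough illustration of why structured extraction helps, the sketch below separates the abstract from the body paragraphs in a TEI XML document, which is the format Grobid produces. It is not this project's code: the sample fragment is hand-written and heavily simplified, and the helper names (`extract_abstract`, `extract_body_paragraphs`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# TEI namespace used in Grobid's XML output.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, hand-written fragment that mimics the overall TEI layout.
# Real Grobid output is much richer (authors, references, coordinates, ...).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <abstract><div><p>We study superconductors.</p></div></abstract>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <div><head>Introduction</head><p>First paragraph.</p></div>
      <div><head>Methods</head><p>Second paragraph.</p></div>
    </body>
  </text>
</TEI>"""


def _text(element):
    """Return the element's full text with whitespace normalised."""
    return " ".join("".join(element.itertext()).split())


def extract_abstract(root):
    """Join all paragraphs found under the header's <abstract> element."""
    paragraphs = root.findall(".//tei:profileDesc/tei:abstract//tei:p", TEI_NS)
    return " ".join(_text(p) for p in paragraphs)


def extract_body_paragraphs(root):
    """Return the body paragraphs as a list of strings, in document order."""
    return [_text(p) for p in root.findall(".//tei:text/tei:body//tei:p", TEI_NS)]


root = ET.fromstring(SAMPLE_TEI)
print(extract_abstract(root))
print(extract_body_paragraphs(root))
```

Because the abstract and the body live in separate parts of the XML tree, they can be indexed or queried independently, which a flat PDF-to-text dump does not allow.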

**NOTE**: This project focuses on scientific articles. Uploading books or other large documents might not work as expected.

**Work in progress**

https://document-insights.streamlit.app/

**An OpenAI or HuggingFace API key is required**


### Screencast
This screencast shows an older version:

https://github.com/lfoppiano/document-qa/assets/15426/b3882119-5a87-40f5-a2de-ad47447eb40c


### Acknowledgement

This project is developed at the [National Institute for Materials Science](https://www.nims.go.jp) (NIMS) in Japan.