Update README.md
README.md (CHANGED)
@@ -42,14 +42,19 @@ Run the service:
- With GPU support:
```
-docker run --rm --name pdf-document-layout-analysis --gpus '"device=0"' -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.
+docker run --rm --name pdf-document-layout-analysis --gpus '"device=0"' -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.21
```

- Without GPU support:
```
-docker run --rm --name pdf-document-layout-analysis -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.
+docker run --rm --name pdf-document-layout-analysis -p 5060:5060 --entrypoint ./start.sh huridocs/pdf-document-layout-analysis:v0.0.21
```

+[OPTIONAL] OCR the PDF. Check supported languages (curl localhost:5060/info):
+
+curl -X POST -F 'language=en' -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060/ocr --output ocr_document.pdf
+
+
Get the segments from a PDF:

curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
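The segmentation call above can also be scripted. A minimal sketch in Python using `requests`, assuming the service returns a JSON array of segment objects (the exact response fields are not specified in this diff):

```
# Minimal sketch of the segmentation request above, using Python requests.
# Assumes the service at localhost:5060 returns a JSON array of segments;
# the field names inside each segment are an assumption, inspect them first.
import requests

with open("pdf_name.pdf", "rb") as pdf_file:
    response = requests.post("http://localhost:5060", files={"file": pdf_file})

response.raise_for_status()
for segment in response.json():
    print(segment)  # e.g. segment type, text, coordinates
```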
@@ -77,6 +82,12 @@ Start the service:

make start

+
+[OPTIONAL] OCR the PDF. Check supported languages (curl localhost:5060/info):
+
+curl -X POST -F 'language=en' -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060/ocr --output ocr_document.pdf
+
+
Get the segments from a PDF:

curl -X POST -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5060
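The OCR step can be scripted the same way. A sketch mirroring the two curl commands, assuming `/info` returns JSON that lists the supported languages and `/ocr` responds with the bytes of the searchable PDF:

```
# Sketch of the OCR flow above: check supported languages, then OCR a PDF.
# Assumes /info returns JSON and /ocr returns the searchable PDF bytes,
# mirroring `curl localhost:5060/info` and the /ocr command above.
import requests

print(requests.get("http://localhost:5060/info").json())  # supported languages

with open("pdf_name.pdf", "rb") as pdf_file:
    response = requests.post(
        "http://localhost:5060/ocr",
        data={"language": "en"},
        files={"file": pdf_file},
    )

response.raise_for_status()
with open("ocr_document.pdf", "wb") as output_pdf:
    output_pdf.write(response.content)  # same effect as curl --output
```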
@@ -110,6 +121,9 @@ Even though the visual model using more resources than the others, generally it'
"sees" the whole page and has an idea about all the context. On the other hand, LightGBM models perform slightly worse,
but they are much faster and more resource-friendly: they only require CPU power.

+The service converts PDFs to text-searchable PDFs using [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) and [ocrmypdf](https://ocrmypdf.readthedocs.io/en/latest/index.html).
+
+
## Data

As we mentioned, we use the visual model trained on the [DocLayNet](https://github.com/DS4SD/DocLayNet) dataset.
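Since the added note attributes the conversion to Tesseract OCR and ocrmypdf, the same text-searchable conversion can be reproduced directly through ocrmypdf's Python API. A sketch of the general technique, not the service's actual internals:

```
# Sketch: make a PDF text-searchable with ocrmypdf (which drives Tesseract).
# This approximates the conversion described above; it is not the service's code.
import ocrmypdf

# `language` takes Tesseract language codes such as "eng".
ocrmypdf.ocr("pdf_name.pdf", "ocr_document.pdf", language="eng")
```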