nbroad/donut-base-ascii

	---
	license: apache-2.0
	---

	# donut-base-ascii

	This is `naver-clova-ix/donut-base` with all non-ASCII tokens removed from the vocabulary. That makes the model suited to basic English use cases where the text consists mainly of `a-z`, `A-Z`, `0-9`, and basic punctuation.


	The original model, `naver-clova-ix/donut-base`, has no token for `"1"`, so one has been added. The notebook `remove-donut-tokens.ipynb` details the whole process.
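The notebook remains the authoritative description of the conversion. As a rough illustration of the idea only, the sketch below filters a toy vocabulary down to ASCII tokens and appends a missing `"1"`. It assumes a SentencePiece-style vocabulary in which U+2581 marks a word boundary and must be ignored when testing for ASCII. The names `filter_vocab`, `is_ascii_token`, and `toy_vocab` are hypothetical and do not come from the notebook.

```python
# Illustrative sketch only; not the code from remove-donut-tokens.ipynb.
# Assumption: tokens follow the SentencePiece convention where U+2581
# marks a word boundary, so that marker is ignored in the ASCII check.
WORD_BOUNDARY = "\u2581"


def is_ascii_token(token: str) -> bool:
    """Return True if the token is ASCII once the boundary marker is removed."""
    return token.replace(WORD_BOUNDARY, "").isascii()


def filter_vocab(vocab: dict, add: tuple = ()) -> dict:
    """Keep ASCII tokens in their original id order, append any tokens in
    `add` that are missing, and renumber the ids contiguously from 0."""
    ordered = sorted(vocab.items(), key=lambda item: item[1])
    kept = [token for token, _ in ordered if is_ascii_token(token)]
    for token in add:
        if token not in kept:
            kept.append(token)
    return {token: new_id for new_id, token in enumerate(kept)}


# Hypothetical toy vocabulary: two special tokens, an English word piece,
# a Korean token, a digit, an accented letter, and the bare boundary marker.
toy_vocab = {
    "<s>": 0,
    "</s>": 1,
    WORD_BOUNDARY + "the": 2,
    "\ud55c\uad6d": 3,
    "2": 4,
    "\u00e9": 5,
    WORD_BOUNDARY: 6,
}

new_vocab = filter_vocab(toy_vocab, add=("1",))
```

In a real conversion the decoder's embedding rows would presumably have to be selected and reordered to match the new ids, with a fresh row for `"1"`. That step is not shown here.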


	This model has not been trained beyond the original checkpoint.
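Because no further training has been done, the checkpoint can be loaded the same way as any other Donut base model and then fine-tuned for a downstream task. The snippet below is a minimal sketch using the `transformers` classes `DonutProcessor` and `VisionEncoderDecoderModel`. It assumes this repository ships both tokenizer and image-processor files; if it does not, the processor from the original checkpoint may be needed instead. The helper name `load_donut_ascii` is hypothetical.

```python
# Minimal loading sketch. Assumption: the repository contains the processor
# files as well as the model weights.
MODEL_ID = "nbroad/donut-base-ascii"


def load_donut_ascii(model_id: str = MODEL_ID):
    """Load the processor and model from the Hugging Face Hub."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    return processor, model
```

Calling `processor, model = load_donut_ascii()` downloads the weights on first use. Comparing `len(processor.tokenizer)` with `model.config.decoder.vocab_size` is a quick way to check that the reduced vocabulary and the decoder embeddings agree.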