Jawi
Collection
Models for historical documents in Jawi (an adaptation of the Perso-Arabic script for the Malay language)
A YOLO-based model for detecting different regions in historical newspapers written in Malay in the Jawi script.
Note: For higher accuracy, scale the image down before running detection. Ideally, images should be scaled down to between roughly 400 × 600 and 900 × 1100 pixels for both models. Images that are too large (i.e. above 2000 × 2000) will produce erroneous results.
A helper function for image scaling would be as follows:
def scale_down_image(image, scale_factor=1/4):
    # image is expected to be a PIL image; a resized copy is returned
    new_size = (int(image.width * scale_factor), int(image.height * scale_factor))
    return image.resize(new_size)
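A fixed scale factor of 1/4 will not suit every scan, so it may help to derive the factor from the image dimensions instead. The following is a minimal sketch, assuming the upper bound of 900 × 1100 mentioned in the note above; the names `fit_scale_factor` and `scaled_size` are hypothetical helpers, not part of this model or of any library.

```python
def fit_scale_factor(width, height, max_width=900, max_height=1100):
    """Return a factor <= 1 that shrinks (width, height) to fit in the box.

    The 900 x 1100 defaults are an assumption taken from the recommended
    upper range in the note; images already inside the box are left as-is.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return min(1.0, max_width / width, max_height / height)


def scaled_size(width, height, max_width=900, max_height=1100):
    """Return the (width, height) an image would have after scaling."""
    factor = fit_scale_factor(width, height, max_width, max_height)
    # Clamp to at least 1 pixel so extreme aspect ratios stay valid.
    return max(1, int(width * factor)), max(1, int(height * factor))
```

The resulting factor can be passed as `scale_factor` to the helper above, for example `scale_down_image(image, fit_scale_factor(image.width, image.height))`.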
Classes are as follows:
calligraphy
footer
header
image
row
text
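To work with the detected regions, it can be convenient to group them by class name. The sketch below is illustrative only: it assumes detections have already been extracted as `(class_id, confidence, box)` tuples, and that class indices follow the order of the list above. Both are assumptions, so check the class mapping of the loaded model before relying on it. `group_by_class` is a hypothetical helper.

```python
from collections import defaultdict

# Class names as listed above. The index order is an ASSUMPTION;
# verify it against the class mapping reported by the loaded model.
CLASS_NAMES = ["calligraphy", "footer", "header", "image", "row", "text"]


def group_by_class(detections, min_confidence=0.25):
    """Group (class_id, confidence, box) tuples by class name.

    Detections below min_confidence are dropped. The 0.25 default is
    an arbitrary illustrative threshold, not a recommended value.
    """
    grouped = defaultdict(list)
    for class_id, confidence, box in detections:
        if confidence >= min_confidence:
            grouped[CLASS_NAMES[class_id]].append((confidence, box))
    return dict(grouped)
```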
Base model
Ultralytics/YOLOv8