Training languages in the model card
#9 opened by fyvo
The model card does not show the proportion of Arabic in the training data. The distribution of languages from the Niger-Congo family contains 'Kuganda', probably a misspelling of 'Luganda', a language spoken in Uganda. It is difficult to tell for certain, as the corpora for Niger-Congo languages are not documented individually.
fyvo changed pull request status to open
Thanks for pointing this out!
I think it is worth opening a PR on the main bloom repo as well, since the model cards were copied from there.
Also cc-ing @cakiki in case I missed anything.