license: apache-2.0
language:
  - hi
  - en

This repository contains the PyTorch model parameters and associated data used to train a small transformer model from scratch. The model is trained to translate Hindi written in Latin script (hindi_latin) into English.

The training dataset used to create the model is included among the files. The training data is semi-synthetic.

Steps for creating the dataset: Obtain actual user questions in Hindi and their human translations in English. Prompt GPT to create variations of key words, taking phonetics into account and giving it a user persona.
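
The steps above might be sketched roughly as follows. This is only an illustration, not the script used to build the dataset: the function names `build_variation_prompt` and `pair_variations`, the prompt wording, the example sentence, and the record keys `hindi_latin` and `english` are all hypothetical. It also assumes that each generated variation keeps the meaning of the original question, so the human translation can be reused as its target. The call to GPT itself is left out.

```python
def build_variation_prompt(question_hi_latin, translation_en, persona, n_variations=5):
    """Assemble a prompt asking an LLM for phonetic spelling variations.

    The wording here is a hypothetical example, not the original prompt.
    """
    return (
        f"You are writing as this user persona: {persona}.\n"
        f"Original question (Hindi in Latin script): {question_hi_latin}\n"
        f"Human English translation: {translation_en}\n"
        f"Write {n_variations} variations of the original question. "
        "Respell key words the way they sound, as different people might type them, "
        "and keep the meaning unchanged so the English translation still applies."
    )


def pair_variations(variations, translation_en):
    """Pair each generated variation with the original human translation.

    Assumption: variations preserve meaning, so the translation is reused.
    Blank lines and exact duplicates are dropped.
    """
    seen = set()
    pairs = []
    for variation in variations:
        text = variation.strip()
        if text and text not in seen:
            seen.add(text)
            pairs.append({"hindi_latin": text, "english": translation_en})
    return pairs


# Hypothetical seed example (not taken from the dataset).
prompt = build_variation_prompt(
    "mera balance kitna hai", "What is my balance?", "a college student", 3
)
# In practice the prompt would be sent to GPT; the list below stands in for its reply.
pairs = pair_variations(
    ["mera balans kitna h", " mera balans kitna h ", ""], "What is my balance?"
)
```

Reusing the human translation as the target for every variation keeps the English side of the data human-written, while only the Hindi (Latin script) side is synthetic.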