| language | thumbnail | tags | license | datasets | metrics |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| English-Greek | lighteternal/SSE-TUC-mt-en-el-cased | NMT, EN-EL | Apache 2.0 | Opus, CC-Matrix | BLEU, chrF |

# English to Greek NMT from Hellenic Army Academy (SSE) and Technical University of Crete (TUC)

## Model description

Trained with the Fairseq framework, using the transformer_iwslt_de_en architecture.\
BPE segmentation (20k codes).\
Mixed-case model.

#### How to use

```
from transformers import FSMTTokenizer, FSMTForConditionalGeneration

mname = "lighteternal/SSE-TUC-mt-en-el-cased"

tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

text = "Katerina is the best name for a girl."

# Encode the source sentence and generate candidate translations with beam search
encoded = tokenizer.encode(text, return_tensors="pt")
outputs = model.generate(encoded, num_beams=5, num_return_sequences=5, early_stopping=True)

# Decode and print each returned hypothesis
for i, output in enumerate(outputs, start=1):
    decoded = tokenizer.decode(output, skip_special_tokens=True)
    print(f"{i}: {decoded}")
```

## Training data

Consolidated corpus from Opus and CC-Matrix (~6.6GB in total).

## Eval results

Results on the Tatoeba test set (EN-EL):

| BLEU | chrF |
| ------ | ------ |
| 76.9 | 0.733 |

Results on the XNLI parallel test set (EN-EL):

| BLEU | chrF |
| ------ | ------ |
| 65.4 | 0.624 |

### BibTeX entry and citation info

TODO
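For quick experiments, the checkpoint should also be usable through the transformers `pipeline` API, which wraps the tokenize/generate/decode steps shown above. This is a minimal sketch, not part of the original card; it assumes the generic `"translation"` task resolves for this FSMT checkpoint:

```
from transformers import pipeline

# Convenience wrapper around FSMTForConditionalGeneration.generate()
# (assumption: the generic "translation" task works for this checkpoint)
translator = pipeline("translation", model="lighteternal/SSE-TUC-mt-en-el-cased")

result = translator("Katerina is the best name for a girl.")
print(result[0]["translation_text"])
```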
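The BLEU/chrF figures above could be reproduced along the following lines with sacrebleu. This is a sketch under stated assumptions: the parallel test files `test.en`/`test.el` are hypothetical placeholders, translation is done sentence by sentence for brevity, and the chrF scale (0-1 vs. 0-100) depends on the sacrebleu version:

```
import sacrebleu
import torch
from transformers import FSMTTokenizer, FSMTForConditionalGeneration

mname = "lighteternal/SSE-TUC-mt-en-el-cased"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

# Hypothetical parallel test files, one sentence per line
with open("test.en") as f:
    sources = [line.strip() for line in f]
with open("test.el") as f:
    references = [line.strip() for line in f]

# Translate each source sentence with beam search
hypotheses = []
for src in sources:
    encoded = tokenizer.encode(src, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(encoded, num_beams=5, early_stopping=True)
    hypotheses.append(tokenizer.decode(out[0], skip_special_tokens=True))

# sacrebleu expects a list of hypotheses and a list of reference streams
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references])
print(f"BLEU: {bleu.score:.1f}  chrF: {chrf.score:.3f}")
```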