facebook/fastspeech2-en-ljspeech · adding pauses and dealing with numbers

Jun 17, 2023

edited Jun 17, 2023

Just wanted to share what worked for me.

I noticed the model has a bit of trouble with numbers and punctuation, but it handles a ',' quite well (it reads it as a short pause).
So I processed my text with:

text = text.replace(".", ",").replace("!", ",").replace("?", ",").replace(":", ",").replace(";", ",")
text = text.replace("(", ",").replace(")", ",").replace("[", ",").replace("]", ",").replace("{", ",").replace("}", ",")
text = text.replace('"', ",").replace("“", ",").replace("”", ",")
text = text.replace("-", " ").replace("_", " ").replace("—", " ").replace("–", " ").replace("…", " ")
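The same replacements can also be done in one pass with a translation table. This is only a sketch of an equivalent approach, not something tested with the model; the names TO_COMMA, TO_SPACE, PAUSE_TABLE and normalize_punctuation are made up for illustration.

```python
# Characters the snippet above turns into a comma (a pause) or a space.
TO_COMMA = '.!?:;()[]{}"“”'
TO_SPACE = "-_—–…"

# Build one table mapping every listed character to its replacement.
PAUSE_TABLE = str.maketrans(
    {**dict.fromkeys(TO_COMMA, ","), **dict.fromkeys(TO_SPACE, " ")}
)


def normalize_punctuation(text: str) -> str:
    """Replace pause-like punctuation with ',' and dash-like characters with ' '."""
    return text.translate(PAUSE_TABLE)
```

Usage would be `text = normalize_punctuation(text)`. Adding or removing a character then only means editing one of the two strings.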

In addition, I saw it has a bit of a problem pronouncing numbers such as years, so before doing the replacements above I processed the text with:

from num2words import num2words
import re

def convert_numbers_to_text(text):
    # Regular expression pattern to match numbers
    pattern = r'\b\d+\b'
    
    def replace(match):
        number = int(match.group())
        return num2words(number)
    
    # Replace numbers in the text with their textual representation
    converted_text = re.sub(pattern, replace, text)

    return converted_text

text = convert_numbers_to_text(text)
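One thing to watch for: the pattern `\b\d+\b` treats "1,234" as two separate numbers ("1" and "234"), so they get spelled out separately. A possible workaround, sketched here under the assumption that a comma followed by exactly three digits is a thousands separator, is to strip those commas before converting. The helper name strip_thousands_separators is hypothetical.

```python
import re


def strip_thousands_separators(text: str) -> str:
    """Remove commas that sit between a digit and a group of exactly three digits."""
    # Assumption: "1,234" means 1234. A list such as "1,200" would be merged too.
    return re.sub(r"(?<=\d),(?=\d{3}\b)", "", text)
```

It would run first, e.g. `text = convert_numbers_to_text(strip_thousands_separators(text))`.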

Hope it helps you too.

aXlireza

Jun 23, 2023

That did the job, thanks a lot!

aXlireza

Jun 25, 2023

I actually ended up splitting the paragraphs at the periods and feeding the resulting sentences to the model separately, which gave a better result.
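Roughly, the splitting could look like the sketch below. It is only an illustration: split_sentences and synthesize_by_sentence are made-up names, and `synthesize` stands for whatever function you already use to run the model on a piece of text. The split has to happen before any step that replaces "." with ",", and it will also break after abbreviations like "Dr.".

```python
import re
from typing import Callable, List


def split_sentences(paragraph: str) -> List[str]:
    """Split after each period that is followed by whitespace, keeping the period."""
    parts = re.split(r"(?<=\.)\s+", paragraph.strip())
    return [part for part in parts if part]


def synthesize_by_sentence(paragraph: str, synthesize: Callable[[str], object]) -> list:
    """Call the (hypothetical) synthesize function once per sentence."""
    return [synthesize(sentence) for sentence in split_sentences(paragraph)]
```

The per-sentence outputs then still need to be joined in order to get audio for the full paragraph.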