Voice cloning using transformers

#1
by anon7463435254 - opened

Hi, thank you for your amazing work. I would like to know if and how voice cloning can be added to the script using Transformers. Thank you in advance.

Thank you! Here is some sample code:

import torchaudio

# Assumes `processor` and `model` are already loaded, as in your existing Transformers script.

# Load the reference voice and resample it to 24 kHz
reference_audio, sr = torchaudio.load("reference_voice.wav")
reference_audio = torchaudio.functional.resample(reference_audio, sr, 24000)

# Generate speech with the cloned voice
inputs = processor(
    text="Clone this voice and say hello!",
    voice_samples=[reference_audio.numpy()],
    return_tensors="pt"
).to(model.device)

output = model.generate(**inputs)
