Voice cloning using transformers
#1, opened by anon7463435254
Hi, thank you for your amazing work. I would like to know if and how voice cloning can be added to the script using Transformers. Thank you in advance.
Thank you! Here is some sample code:
import torchaudio

# `processor` and `model` are assumed to be loaded already (not shown here).

# Load the reference voice; torchaudio.load returns a (channels, samples) tensor
reference_audio, sr = torchaudio.load("reference_voice.wav")
# Resample to 24 kHz
reference_audio = torchaudio.functional.resample(reference_audio, sr, 24000)

# Generate with the cloned voice
inputs = processor(
    text="Clone this voice and say hello!",
    voice_samples=[reference_audio.numpy()],
    return_tensors="pt",
).to(model.device)
output = model.generate(**inputs)
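If the reference file is stereo, `reference_audio.numpy()` is a 2-D array of shape (channels, samples). In case the processor expects a mono 1-D waveform (an assumption, not something confirmed in this thread), a small helper along these lines could normalize the sample first. The name `prepare_reference` is hypothetical, and the linear-interpolation resampling is a crude NumPy stand-in for `torchaudio.functional.resample`, used only to keep the sketch dependency-light:

```python
import numpy as np

TARGET_SR = 24000  # sample rate used in the snippet above


def prepare_reference(waveform, sr, target_sr=TARGET_SR):
    """Return a mono, 1-D float32 array at target_sr.

    waveform: array-like of shape (channels, samples) or (samples,).
    Hypothetical helper; the expected input format is an assumption.
    """
    audio = np.asarray(waveform, dtype=np.float32)
    if audio.ndim == 2:
        # Average the channels to get a mono signal
        audio = audio.mean(axis=0)
    if audio.ndim != 1:
        raise ValueError("expected a 1-D or 2-D waveform")
    if sr != target_sr and audio.size > 0:
        n_out = int(round(audio.size * target_sr / sr))
        # Output sample positions expressed in input-sample units
        src_pos = np.arange(n_out) * (sr / target_sr)
        # Linear interpolation (no anti-aliasing filter; a rough stand-in)
        audio = np.interp(src_pos, np.arange(audio.size), audio)
        audio = audio.astype(np.float32)
    return audio
```

With this, the raw tensor from `torchaudio.load` could be passed as `voice_samples=[prepare_reference(reference_audio.numpy(), sr)]` in place of the explicit resample step. For real use, keeping `torchaudio.functional.resample` and only adding the mono downmix is likely the better choice, since it filters properly when downsampling.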