---
license: mit
language:
- en
base_model:
- yl4579/StyleTTS2-LibriTTS
pipeline_tag: text-to-speech
---
This model is a direct ONNX conversion of [yl4579/StyleTTS2-LibriTTS](https://huggingface.co/yl4579/StyleTTS2-LibriTTS) without modification to its weights and therefore inherits the [original license](https://github.com/yl4579/StyleTTS2?tab=readme-ov-file#license). It currently powers a [WebUI that performs TTS inference on CPU](https://hexgrad.com/). To facilitate lazy loading, it is chunked into four parts.
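
As a rough illustration of what lazy loading over four parts could look like, here is a minimal sketch that creates each ONNX Runtime session only on first use and caches it afterwards. It is not the WebUI's actual code: the `LazyParts` class, the `load_onnx_session` helper, and the `part1.onnx` to `part4.onnx` file names are all hypothetical placeholders, so substitute the real file names from this repository.

```python
from typing import Callable, Dict, Generic, List, TypeVar

T = TypeVar("T")


class LazyParts(Generic[T]):
    """Load each model part on first access and cache the result."""

    def __init__(self, paths: Dict[str, str], loader: Callable[[str], T]) -> None:
        self._paths = dict(paths)
        self._loader = loader
        self._cache: Dict[str, T] = {}

    def __getitem__(self, name: str) -> T:
        # Only call the loader the first time a part is requested.
        if name not in self._cache:
            self._cache[name] = self._loader(self._paths[name])
        return self._cache[name]

    def loaded(self) -> List[str]:
        """Return the names of the parts loaded so far."""
        return sorted(self._cache)


def load_onnx_session(path: str):
    # Imported here so onnxruntime is only needed once a part is actually loaded.
    import onnxruntime as ort

    # The conversion was intended for CPU usage, so request the CPU provider.
    return ort.InferenceSession(path, providers=["CPUExecutionProvider"])


# Hypothetical file names; replace with the actual files in this repository.
PART_PATHS = {f"part{i}": f"part{i}.onnx" for i in range(1, 5)}

# Nothing is read from disk here; sessions are created on first access.
parts = LazyParts(PART_PATHS, load_onnx_session)
```

With this sketch, `parts["part1"]` would create the first session on demand, and later lookups of the same key would reuse the cached session.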

The original PyTorch checkpoint, trained on a subset of LibriTTS by the authors of StyleTTS 2, can be considered a "base model" and has some limitations. It should not be confused with Kokoro, a StyleTTS 2 variant that uses a different decoder architecture and is currently hosted only at [https://hf.co/spaces/hexgrad/Kokoro-TTS](https://hf.co/spaces/hexgrad/Kokoro-TTS).

This repository will likely remain dormant and undocumented. You are welcome to use these ONNX models subject to the original license, but be advised that in most cases PyTorch is likely the more suitable option. The ONNX conversion was not optimized for performance and was intended for CPU use. Informal benchmarking shows that this ONNX descendant is substantially outpaced by its PyTorch ancestor in GPU environments.