---
title: ttsdoc
emoji: 🌖
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 4.41.0
app_file: app.py
pinned: false
license: apache-2.0
---
# ttsdoc 🌖

ttsdoc is a Text-to-Speech (TTS) application that reads your documents aloud. It uses the Parler TTS Mini v1 model to generate high-quality audio from typed text or from uploaded PDF, TXT, and DOCX files.

## Features

- 📄 Support for PDF, TXT, and DOCX file uploads
- ✍️ Direct text input option
- 🗣️ Customizable voice descriptions
- ⏱️ Adjustable maximum audio duration
- 🚀 GPU-accelerated audio generation

## How to Use

1. Upload a PDF, TXT, or DOCX file, or enter text directly.
2. Customize the voice description if desired.
3. Adjust the maximum audio duration.
4. Click "Generate Audio" to create the TTS output.
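
The first step above (a file upload or direct text) can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function names `extract_text` and `resolve_input_text` are hypothetical, the choice to prefer the file over typed text is an assumption, and `pypdf` and `python-docx` are assumed dependencies that the Space may or may not use.

```python
from pathlib import Path
from typing import Optional


def extract_text(path: str) -> str:
    """Return the plain text of a TXT, PDF, or DOCX file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".txt":
        return Path(path).read_text(encoding="utf-8", errors="replace")
    if suffix == ".pdf":
        # Assumed dependency: pypdf. Imported lazily so TXT input works without it.
        from pypdf import PdfReader
        reader = PdfReader(path)
        # extract_text() can return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        # Assumed dependency: python-docx (imported as `docx`).
        import docx
        return "\n".join(p.text for p in docx.Document(path).paragraphs)
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


def resolve_input_text(file_path: Optional[str], typed_text: str) -> str:
    """Prefer the uploaded file; fall back to directly entered text."""
    text = extract_text(file_path) if file_path else typed_text
    text = text.strip()
    if not text:
        raise ValueError("No text to synthesize.")
    return text
```

For example, `resolve_input_text(None, "Hello there.")` returns the typed text unchanged, while passing a path to a `.txt` file returns that file's contents with surrounding whitespace removed.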

## Tips for Best Results

- Long texts are voiced only up to the specified maximum duration; raise the limit or split the text if you need the rest.
- Experiment with different voice descriptions to achieve the desired output.
- Use punctuation to control pacing and intonation in the generated speech.
- For optimal quality, try to keep individual sentences or paragraphs concise.
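
One way to follow the last tip is to pre-split long text into short, sentence-aligned chunks before pasting it in. The sketch below is only an illustration: `split_into_chunks` is a hypothetical helper, the 200-character default is an arbitrary assumption, and the regex splits naively on `.`, `!`, and `?`, so abbreviations such as "e.g." will be split too.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("One. Two! Three? Four.", max_chars=10):
        print(chunk)
```

Each chunk can then be generated separately, which keeps every request well within the maximum audio duration.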

## Technical Details

- This demo uses the Parler TTS Mini v1 model.
- Audio generation is GPU-accelerated for faster processing.
- Maximum file size for uploads: 5 MB.
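
The details above could be wired together along these lines. This is a sketch under assumptions, not the Space's actual `app.py`: the generation calls follow the usage published on the Parler TTS Mini v1 model card, `within_upload_limit` and `synthesize` are hypothetical names, "5MB" is interpreted as 5 MiB, and the `parler_tts`, `transformers`, `torch`, and `soundfile` packages are assumed to be installed.

```python
import os

MODEL_ID = "parler-tts/parler-tts-mini-v1"
MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumption: "5MB" means 5 MiB


def within_upload_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file at `path` is no larger than the upload limit."""
    return os.path.getsize(path) <= limit


def synthesize(text: str, description: str, out_path: str = "out.wav") -> str:
    """Generate speech for `text` in the voice given by `description`."""
    # Heavy dependencies are imported lazily so the size check works without them.
    import soundfile as sf
    import torch
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    # Use the GPU when one is available, otherwise fall back to the CPU.
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(MODEL_ID).to(device)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    # The voice description conditions the model; the text is the spoken prompt.
    input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

    generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
    audio = generation.cpu().numpy().squeeze()
    sf.write(out_path, audio, model.config.sampling_rate)
    return out_path
```

Calling `synthesize("Hello there.", "A calm female voice with clear audio.")` would download the model on first use and write `out.wav`; on a machine without a GPU it still runs, only more slowly.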

## License

This project is licensed under the Apache 2.0 License.

---

Powered by [Gradio](https://gradio.app) and [Hugging Face](https://huggingface.co)