Improving the Quality of Neural TTS Using Long-form Content and Multi-speaker Multi-style Modeling Apple Machine Learning Research
Neural text-to-speech (TTS) can provide quality close to natural speech if an adequate amount of high-quality speech material is available for training. However, acquiring speech data for TTS training is costly and time-consuming, especially if the goal is to generate different speaking styles. In this…