The new age of TTS models

This paper presents NaturalSpeech 2, an upgraded text-to-speech (TTS) system designed to better capture the diversity and nuances of human speech, including different speaker identities, prosodies, styles, and even singing. The new system addresses the shortcomings of existing TTS systems, such as unstable prosody, word skipping/repetition…