From Machine Learning

Best architecture for seamless Bilingual TTS? (Azure / English + Korean) [D]

Hi guys, I'm building a language learning app (React Native/Expo frontend, Python backend) and I’ve hit a frustrating wall with Text-to-Speech. I need the app to read sentences that mix English instructions and Korean examples (e.g., "To say hello, we use the phrase 안녕하세요.").

Since native pronunciation is critical for a learning app, I'm struggling to find a solution that sounds natural. I'm currently using Azure Cognitive Services, and I'm stuck between two bad options:

Approach 1: The Multilingual Voice (en-US-AvaMultilingualNeural)

The Good: Seamless reading, zero pauses mid-sentence.

The Bad: Because it's an English-first model, the Korean comes out with a slight robotic, Americanized accent. It doesn't sound like a true native speaker, which defeats the purpose of teaching pronunciation. There is also some scratchiness and a lack of smoothness when it reads Korean words.
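For reference, here is a minimal sketch of how the single-voice request can be built on the Python side. It wraps the Korean part in the standard SSML `<lang>` element so the multilingual voice is told explicitly which language it is reading. The helper name `build_multilingual_ssml` and the `(language, text)` input shape are my own assumptions for illustration, and I can't promise the tag changes the accent at all.

```python
from xml.sax.saxutils import escape

def build_multilingual_ssml(parts, voice="en-US-AvaMultilingualNeural"):
    """Build one SSML document that uses a single multilingual voice.

    `parts` is a list of (language_code, text) tuples (hypothetical input shape).
    Text that is not en-US is wrapped in <lang xml:lang="..."> as a language hint.
    """
    body = []
    for lang, text in parts:
        safe = escape(text)  # escape &, <, > so the SSML stays well-formed
        if lang == "en-US":
            body.append(safe)
        else:
            body.append(f'<lang xml:lang="{lang}">{safe}</lang>')
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice}">{"".join(body)}</voice>'
        "</speak>"
    )

ssml = build_multilingual_ssml([
    ("en-US", "To say hello, we use the phrase "),
    ("ko-KR", "안녕하세요."),
])
```

The resulting string is what gets sent to the synthesizer as SSML; everything stays inside one `<voice>` element, which is why there is no pause in this approach.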

Approach 2: SSML Voice Switching (Ava for EN, SunHi for KO)

The Good: Perfect English, perfect native Korean.

The Bad: Switching <voice> tags mid-sentence causes Azure to pause for a fraction of a second while it unloads/loads the neural models. It completely ruins the natural flow of the audio, making it sound very disjointed.
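Here is roughly how the voice-switching SSML gets generated, as a sketch rather than my exact code. It splits the sentence into English and Korean runs by checking for Hangul syllables, then emits one `<voice>` element per run. The function names, the regex, and the exact voice names (`en-US-AvaNeural`, `ko-KR-SunHiNeural`) are assumptions for illustration. Punctuation that follows a Korean run is kept with it so a lone "." doesn't trigger an extra switch.

```python
import re
from xml.sax.saxutils import escape

# Runs of Hangul syllables, optionally separated by whitespace.
HANGUL_RUN = re.compile(r"([\uAC00-\uD7A3]+(?:\s+[\uAC00-\uD7A3]+)*)")

# Assumed voice names; swap in whichever voices you actually use.
VOICES = {"en": "en-US-AvaNeural", "ko": "ko-KR-SunHiNeural"}

def split_by_language(text):
    """Split mixed text into ("en" | "ko", chunk) segments."""
    segments = []
    # re.split with one capturing group puts the Hangul matches at odd indices.
    for i, chunk in enumerate(HANGUL_RUN.split(text)):
        if not chunk:
            continue
        lang = "ko" if i % 2 == 1 else "en"
        is_only_punctuation = not any(c.isalnum() for c in chunk)
        if lang == "en" and segments and is_only_punctuation:
            # Attach stray punctuation/whitespace to the previous segment.
            prev_lang, prev_text = segments[-1]
            segments[-1] = (prev_lang, prev_text + chunk)
        else:
            segments.append((lang, chunk))
    return segments

def build_switching_ssml(text):
    """Emit one <voice> element per language run."""
    body = "".join(
        f'<voice name="{VOICES[lang]}">{escape(chunk)}</voice>'
        for lang, chunk in split_by_language(text)
    )
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f"{body}</speak>"
    )

switching_ssml = build_switching_ssml("To say hello, we use the phrase 안녕하세요.")
```

Every boundary between two `<voice>` elements in that output is a spot where the pause shows up, so a sentence with several Korean words scattered through it gets several pauses.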

My Questions:

Is there an SSML trick in Azure to pre-load voices or eliminate that micro-pause when switching voices?

How do the big apps handle this? If I use two separate models for Korean and English, the two voices will sound noticeably different from each other.

Should I migrate away from standard Azure Speech and use the Azure OpenAI voices (alloy, nova) instead? Are they truly seamless for bilingual text?

Any advice on the best tech stack or architecture for this would be massively appreciated!

submitted by /u/Lumpy-Simple9185