Text-to-Speech (TTS) Analysis System with Client-Side Processing

In today's digital age, text-to-speech (TTS) technology has become a crucial tool for increasing accessibility and enhancing user experiences across platforms, from devices for the visually impaired to smart assistants, so developing it efficiently and quickly is essential. This article describes the development of a text-to-speech analysis and synthesis system that uses client-side processing, an approach in which TTS conversion occurs directly in the user's web browser, thereby reducing server load and improving response speed. The work covers the TTS process, user interface (UI) development, and the Web Speech API implementation. To assess the quality of the synthesized voices, a systematic evaluation was conducted using the internationally standardized Mean Opinion Score (MOS) on two Thai Microsoft client-side TTS voices, Pattara and Kanya, measuring clarity, naturalness, and fluidity. The results of this project not only serve as a prototype for an effective TTS system but also provide insights for the future development of synthetic voices that are more natural and more closely approximate human speech.
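As an illustration only, the client-side approach and the MOS calculation described above could be sketched as follows. This is not the authors' implementation: the helper names `pickVoice`, `speakInBrowser`, and `meanOpinionScore`, the `VoiceLike` type, and the 1 to 5 rating range are assumptions made for this sketch. Only `speechSynthesis`, `SpeechSynthesisUtterance`, `getVoices()`, and `speak()` come from the Web Speech API itself.

```typescript
// Minimal structural view of the two SpeechSynthesisVoice fields used here.
// (Hypothetical type, defined so the sketch also compiles outside a browser.)
interface VoiceLike {
  name: string;
  lang: string;
}

// Pick a voice whose language tag starts with langPrefix (e.g. "th"),
// preferring one whose name contains preferredName (e.g. "Kanya").
function pickVoice<V extends VoiceLike>(
  voices: V[],
  langPrefix: string,
  preferredName?: string
): V | undefined {
  const prefix = langPrefix.toLowerCase();
  const matches = voices.filter((v) => v.lang.toLowerCase().startsWith(prefix));
  if (preferredName) {
    const wanted = preferredName.toLowerCase();
    const named = matches.find((v) => v.name.toLowerCase().includes(wanted));
    if (named) return named;
  }
  return matches[0];
}

// Speak text in the browser via the Web Speech API; no server round trip.
// Returns false when the API is unavailable (for example, under Node.js).
function speakInBrowser(
  text: string,
  langPrefix: string = "th",
  preferredName?: string
): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  const utterance = new g.SpeechSynthesisUtterance(text);
  // Note: getVoices() may return an empty list until the browser fires
  // the "voiceschanged" event; in that case the default voice is used.
  const voice = pickVoice(
    g.speechSynthesis.getVoices() as VoiceLike[],
    langPrefix,
    preferredName
  );
  if (voice) {
    utterance.voice = voice;
    utterance.lang = voice.lang;
  }
  g.speechSynthesis.speak(utterance);
  return true;
}

// Mean Opinion Score: arithmetic mean of listener ratings.
// Assumes the common 1 (bad) to 5 (excellent) scale.
function meanOpinionScore(ratings: number[]): number {
  if (ratings.length === 0) throw new Error("no ratings supplied");
  for (const r of ratings) {
    if (r < 1 || r > 5) throw new Error(`rating out of range: ${r}`);
  }
  return ratings.reduce((sum, r) => sum + r, 0) / ratings.length;
}
```

In a browser that exposes a Thai voice, `speakInBrowser("สวัสดี", "th", "Kanya")` would request the Kanya voice if it is installed and otherwise fall back to any available Thai voice. Ratings for each criterion (clarity, naturalness, fluidity) could then be averaged separately with `meanOpinionScore`.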

Taskeow Srisod, Prachyanun Nilsook, Sasitorn Issaro, Oraphan Amnuaysin, Thananan Areepong, Orawan Saeung, Thani Jintasuttisak and Thamasan Suwanroj.
Text-to-Speech (TTS) Analysis System with Client-Side Processing.
Journal of Theoretical and Applied Information Technology, 15th April 2026, Vol. 104, No. 7.
https://doi.org/10.5281/zenodo.19593882

https://www.jatit.org/volumes/Vol104No7/18Vol104No7.pdf