The Voice Synthesis Business: 2022 Update. Part 1.
In the past few years, high-quality automated text-to-speech synthesis has effectively become a commodity, with easy access to cloud-based APIs provided by a number of major players.
At the same time, developments in deep learning have broadened the scope of voice synthesis functionalities that can be delivered, leading to a growth in the range of commercially-viable use cases.
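To give a concrete sense of how commoditised this has become, below is a minimal sketch of synthesising speech through one such cloud API, using the Google Cloud Text-to-Speech Python client. The article does not single out this provider; it is used here purely as an illustration, and exact parameter names may differ between client library versions.

```python
# Minimal sketch: request speech synthesis from a cloud TTS API
# (Google Cloud Text-to-Speech Python client, used for illustration only).
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

# The text to be spoken.
synthesis_input = texttospeech.SynthesisInput(text="Hello from a cloud TTS API.")

# Pick a language and (optionally) a voice gender; providers expose many voices.
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US",
    ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
)

# Ask for an MP3 back.
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# Write the returned audio bytes to disk.
with open("output.mp3", "wb") as out:
    out.write(response.audio_content)
```

A few lines of code and an API key are all that stand between a developer and production-quality synthetic speech, which is the sense in which the technology has become a commodity.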
Today on our podcast Art Intel, I, Brian, the Artificial Intelligence Voice, will take a look at the technology features and use cases that have attracted attention and investment in the past few years, identifying the major players and recent start-ups in the space. In this episode we read and listen to the first part of Robert Dale's research.
Introduction.
Humans have been fascinated by the idea of making machines sound like humans for quite a long time, going at least as far back as Wolfgang von Kempelen’s mechanical experiments in the second half of the 18th century.
In the modern era, early attempts at computer-based speech synthesis were already appearing in the 1960s and 1970s, and the 1980s saw the arrival of the DECtalk system, familiar to many as the voice of Stephen Hawking.
The outputs from early applications based on formant synthesis sounded too artificial to be mistaken for human speech and were generally criticised as sounding “robotic.” Subsequent products based on unit concatenation dramatically increased the naturalness of the synthesized speech — but still not enough to make it indistinguishable from real human speech, especially when uttering more than a few sentences in sequence.
However, by the early 2000s, it was good enough to field telephony-based spoken language dialog systems whose conversational contributions weren’t particularly offensive to the ears.
Things stepped up a notch with DeepMind’s 2016 introduction of WaveNet, the first of the deep-learning based approaches to speech synthesis. The years since have seen the development of a wide range of deep-learning architectures for speech synthesis. As well as providing a noticeable increase in the quality and naturalness of the voice output that can be produced, these have opened the door to a variety of new voice synthesis applications built on deep-learning techniques.
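As a rough illustration of the core idea behind WaveNet-style models, the sketch below stacks dilated causal 1-D convolutions (written here in PyTorch) so that each output sample depends only on past samples over an exponentially growing receptive field. This is an illustrative simplification under stated assumptions, not DeepMind's implementation: it omits the gated activations, residual and skip connections, and the autoregressive sampling loop of the real architecture.

```python
# Illustrative sketch of WaveNet's building block: dilated causal convolutions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalDilatedConv(nn.Module):
    """One dilated causal 1-D convolution over a raw-audio-like signal."""
    def __init__(self, channels: int, dilation: int, kernel_size: int = 2):
        super().__init__()
        # Left-pad so the output at time t depends only on inputs at times <= t.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.pad(x, (self.pad, 0))      # pad on the left only (causality)
        return torch.tanh(self.conv(x))

# Dilations of 1, 2, 4, 8, ... give a receptive field that grows exponentially
# with depth, which is what lets such models condition on enough past samples
# to produce natural-sounding audio at 16 kHz and above.
stack = nn.Sequential(*[CausalDilatedConv(channels=16, dilation=2 ** i) for i in range(6)])

waveform = torch.randn(1, 16, 16000)     # (batch, channels, samples): one second at 16 kHz
out = stack(waveform)
print(out.shape)                          # torch.Size([1, 16, 16000])
```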
So, given the advances made over the past few years, what does the commercial voice synthesis market look like today? In this post, we look at the applications of the technology that are enticing investors and aiming to generate revenue, and identify the companies that are making news.
https://becominghuman.ai/the-voice-synthesis-business-2022-update-68401b4b0f57