“If we look at the dubbing industry alone at the moment, it’s roughly worth $2.5B and that correlates to roughly 50,000 hours of content dubbed through traditional human dubbing. But where we’re playing is not necessarily in that $2.5B, we’re playing within a $10B market of suppressed existing demand – content that is not currently being localised,” says Amir Jirbandey, Head of Growth at Papercup, a next-gen AI dubbing service which automates translation and dubbing using synthetic voices.
With the rise of AVOD and FAST channels – and their particular popularity in developing economies – Jirbandey believes AI dubbing represents an opportunity for content owners to localise content at speed and cost-effectively. He remarks: “People who have a huge back catalogue of content which is just sitting there collecting dust are now able to monetise it in a way they weren’t able to before.”
Some content owners and service providers have already started exploring AI dubbing as an alternative to traditional dubbing houses. Cinedigm used Papercup’s AI dubbing service to dub all 31 seasons of The Joy of Painting – a popular instructional art show hosted by Bob Ross – in Latin American Spanish, which it intends to distribute as a dedicated Bob Ross channel on Pluto TV and Tubi. Jirbandey says: “This gives you a small snapshot as to how content that might otherwise have been forgotten is now being revitalised and localised in new territories.”
One of the main benefits of AI dubbing is the slashing of turnaround times. According to Jirbandey, a typical feature-length film or documentary may take between one and three months to dub through traditional dubbing houses, while AI dubbing can take between one and three weeks (although this will differ based on the complexity of the project). Jirbandey believes AI dubbing is also significantly less expensive.
The process of AI dubbing through Papercup involves two steps. First, the platform takes a video file as input and automatically transcribes the audio into text. Through machine translation, the platform then translates the text into the target language, after which the company’s proprietary speech synthesis engine creates the synthetic voice in the target language. The voice is then pushed through Papercup’s dubbing tool, which was built specifically for synthetic voices.
“The second step is something which we’ve created called ‘human-in-the-loop,’” says Jirbandey. This involves human translators going through the resulting translation “almost word for word” to check the accuracy and quality of the dubbing, paraphrasing where necessary.
Papercup also allows these translators to vary the emotion and expression of the synthetic voice based on the content. Jirbandey elaborates: “They have a plethora of tools available to them to finesse the voice to sound as natural as possible. We can adjust the voice to change the age range, the tone, the emotion, the intonations, emphasis and speed. All of these different tools and techniques allow us to create a very premium voice-over.”
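For readers who think in code, the two-step workflow described above can be sketched roughly as follows. This is an illustrative toy only, not Papercup's actual system: every function name, the `Segment` type, the canned transcript and the `es-419` language tag are assumptions made for the example, and each stage is a stub standing in for a real speech-to-text, machine translation or speech synthesis component. As a simplification, the sketch synthesises audio once, after the human review, rather than producing a draft voice first.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    """One line of dialogue moving through the (hypothetical) pipeline."""
    source_text: str
    translated_text: str = ""
    voice: Dict[str, str] = field(default_factory=dict)
    reviewed: bool = False


def transcribe(video_path: str) -> List[Segment]:
    # Stub for speech-to-text: returns canned lines instead of reading the file.
    return [Segment("Welcome back."), Segment("Let's paint a happy little tree.")]


def machine_translate(segments: List[Segment], target_lang: str) -> List[Segment]:
    # Stub for machine translation: a tiny lookup table; unknown lines are
    # passed through with a language tag so the gap is visible to a reviewer.
    table = {"Welcome back.": "Bienvenidos de nuevo."}
    for seg in segments:
        seg.translated_text = table.get(
            seg.source_text, f"[{target_lang}] {seg.source_text}"
        )
    return segments


def human_review(
    segments: List[Segment],
    corrections: Dict[int, str],
    voice_notes: Dict[int, Dict[str, str]],
) -> List[Segment]:
    # Human-in-the-loop step: a translator fixes wording and tunes the voice
    # (emotion, speed and so on) for individual segments.
    for index, seg in enumerate(segments):
        if index in corrections:
            seg.translated_text = corrections[index]
        if index in voice_notes:
            seg.voice.update(voice_notes[index])
        seg.reviewed = True
    return segments


def synthesize(segment: Segment) -> str:
    # Stub for speech synthesis: describes the audio rather than producing it.
    emotion = segment.voice.get("emotion", "neutral")
    return f"<audio emotion={emotion}>{segment.translated_text}</audio>"


def dub(
    video_path: str,
    target_lang: str,
    corrections: Dict[int, str],
    voice_notes: Dict[int, Dict[str, str]],
) -> List[str]:
    # Step one: automated transcription and translation.
    segments = machine_translate(transcribe(video_path), target_lang)
    # Step two: human review, then synthesis of the approved text.
    segments = human_review(segments, corrections, voice_notes)
    return [synthesize(seg) for seg in segments]


clips = dub(
    "episode.mp4",
    "es-419",
    corrections={1: "Pintemos un arbolito feliz."},
    voice_notes={1: {"emotion": "warm"}},
)
```

The point of the sketch is the division of labour: the automated stages produce a complete first pass, and the reviewer only touches the segments that need a better translation or a different delivery.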
While Papercup has developed means to capture a range of emotions through synthetic voices, currently the technology still lends itself more strongly to factual and informative content, which is less demanding in terms of expression. This has included documentaries, business reporting, sports commentary and the news, with Papercup already working with Bloomberg and Sky News to localise content in international markets.
As the company continues to develop the expressive range of the technology, however, Jirbandey anticipates Papercup will dub almost all video types. He says: “Within the next three to five years, beyond video, we want to start tackling audiobooks, online radio and podcasts, which will unlock a huge pool of additional content which has not necessarily been looked at for localisation before.” He estimates this market, combined with the video market, could be as large as $100B.
Two of the main challenges currently facing the company are scaling the human-in-the-loop aspect of the process and creating efficiencies to further reduce dubbing turnaround times. Another challenge is commercialising the service: “This is a brand new invention. There is no specific category which we belong to. We’re creating a new market so a lot of education needs to be done.”
Dubbing could potentially be more popular among audiences than subtitled content. Netflix reported in 2019 that 72% of U.S. viewers preferred the dubbed version of the popular show Money Heist to the subtitled version, and 85% preferred the dubbed version of Dark.