Product Updates: Dubbing & More New AI Voice Features

May 9, 2023

“When you are having fun, and creating something you love, it shows in the product.”
– Tom Ford, American Fashion Designer

Today is one of those moments where we’re excited to share new product updates that we love! Our team is constantly looking for ways to improve our generative voice AI products based on our customers’ needs, and these new features are designed to enhance your AI voice capabilities. This week, we’ve launched multiple new features to both Pro and trial users.

First off, we have a very exciting update to our localization capabilities: we’re now suited to be your one-stop shop for dubbing audio and video content. In support of our Localize tool, users are now able to translate and localize their AI voices in up to 62 languages. In addition, to expand your AI voice generation options, we have extended our realistic speech-to-speech voice generator to all users. Lastly, for Pro customers, we’re improving API responsiveness with batching! Without further ado, we’ll share the details of these new product updates and show you how to take advantage of each based on your needs.

Dubbing: Free Language Translation and Localization 

As highlighted earlier, we’ve taken the next step to provide our customers with a comprehensive AI voice dubbing solution! To complement our Localize tool, we’ve launched language translation. Users can now translate text or audio and then localize the translated text into up to 62 languages for full-service dubbing, which streamlines the dubbing workflow into a single tool. Translation runs through our Large Language Model (LLM) integration (think ChatGPT), which provides accurate and efficient translation for your custom AI voices. For those unfamiliar with them, LLMs are machine learning models that use deep learning algorithms to understand and generate human language. The model behind our integration has been trained on a very large set of data to generate translations.

Translation and Localization Explained

Translation

Language translation is the process of converting text or speech from one language to another language to enable communication.

Localization

Language localization is the adaptation of content, products, or services to fit the linguistic, cultural, and regional preferences of a specific region.

Watch all the way through to see how you can give your translations context.

Capture Emotions With Free Speech-to-Speech

In September 2022, we launched speech-to-speech AI voice generation for our Pro users. Our real-time speech-to-speech AI voice generator gives users the ability to easily capture the naturalness of human speech. The great news is that we’ve rolled out our realistic speech-to-speech generator to all Resemble AI users! All users, both Pro subscribers and trial accounts, now have the ability to generate custom AI voices using realistic speech-to-speech. Until now, all users have had access to our text-to-speech AI voice generator. With this product update, having both text-to-speech and speech-to-speech gives users greater flexibility when generating custom AI voices. For example, a user with a script can generate a realistic AI voice by entering text into our app via text-to-speech conversion. Conversely, a user who wants to capture the emotions and imperfections of natural speech can read the script aloud and use speech-to-speech conversion to generate their custom AI voice. Another bonus is that speech-to-speech voice generation is not only available for your custom AI voices but can also be applied to our robust marketplace of free voices. Below is an explanation of both input methods.

Speech-to-Speech and Text-to-Speech Explained

Speech-to-Speech

Speech-to-speech converts speech from one form to another using AI algorithms. It involves three components: automatic speech recognition (ASR) to transcribe the speech into text, machine translation (MT) to convert the text into the target language, and text-to-speech (TTS) synthesis to generate the output voice.
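To make the three stages concrete, here is a minimal Python sketch of how they chain together. It is only an illustration of the ASR, MT, and TTS sequence described above, not our actual implementation: the function names are hypothetical, and the three stand-in stages simply pass text around as bytes so the example runs without any models.

```python
from typing import Callable


def speech_to_speech(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Chain the three stages: ASR, then MT, then TTS."""
    text = transcribe(audio)       # ASR: speech -> text
    translated = translate(text)   # MT: text -> text in the target language
    return synthesize(translated)  # TTS: text -> speech


# Hypothetical stand-ins for real models, used only to show the data flow.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_mt(text: str) -> str:
    # Toy English-to-Spanish lookup; unknown text passes through unchanged.
    return {"hello": "hola"}.get(text, text)


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = speech_to_speech(b"hello", fake_asr, fake_mt, fake_tts)
```

Passing the stages in as functions keeps the sketch independent of any particular model or service; each stand-in could be swapped for a real component with the same input and output types.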

Text-to-Speech

Text-to-speech (TTS) technology converts written text into spoken words using AI algorithms. The algorithm synthesizes human-like speech by analyzing the input text, understanding the context, and generating corresponding speech patterns.

Try out our free speech-to-speech voice generator at https://app.resemble.ai/.

Batching Released To Optimize API Requests

Lastly, for clients integrating with our real-time API, we’re introducing batching to enhance processing efficiency. Batching is a common technique for boosting performance and accommodating larger workloads in data processing, machine learning, computer graphics, and network communications. In this case, batching optimizes our model’s machine learning processes. In simple terms, users can now create a bunch (or batch) of generated content with one API request, rather than sending a separate API request for each individual audio clip. This gives you the flexibility to process generated content in bulk and reduces the load on your CPU. By cutting down on the overhead of handling many separate requests, we’re able to leverage parallel processing and streamline error handling. This should lead to more responsive voice generation.
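The idea of batching can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not our API: the voice identifier and the JSON field names ("voice", "clips", "text") are hypothetical, and no network request is made. It only shows how several clip texts can be grouped so that each group travels in a single request body instead of one request per clip.

```python
import json
from typing import Iterator


def chunk(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_payload(voice_id: str, texts: list[str]) -> str:
    """Serialize several clip texts into one JSON request body.

    The field names used here are hypothetical placeholders.
    """
    return json.dumps({"voice": voice_id, "clips": [{"text": t} for t in texts]})


lines = [
    "Welcome back.",
    "Your order has shipped.",
    "Thanks for calling.",
    "Goodbye.",
]

# Four clips with a batch size of three become two request bodies, not four.
payloads = [build_batch_payload("my-voice", group) for group in chunk(lines, 3)]
```

The batch size is a trade-off: larger batches mean fewer requests, while smaller batches return the first results sooner.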

Our team is thrilled to have rolled out these new generative voice AI features and products to new and existing customers! These updates are intended to make your AI voice projects more dynamic, efficient, and successful. We’re excited to see how you implement dubbing and speech-to-speech, and how you benefit from better responsiveness through our API integration. If you’re interested in a demo of these products, please schedule some time with one of our experts by clicking the button below. Ciao until the next product update.
