How does it work?
What does custom speech recognition do?
Customizing speech recognition to your slides improves the accuracy of the subtitles for that presentation, in some cases by up to 30%.
Presentation Translator improves the accuracy of your subtitles by learning from the content of your slides and slide notes, which makes it more effective at recognizing the words you are likely to use while presenting.
Without customization, you might see inaccuracies in speech recognition when you use industry-specific vocabulary, technical terms, acronyms, and product or place names. Customization reduces the errors in your subtitles for specialized vocabulary, as long as the words are present in your slides or slide notes.
The first time you customize speech recognition for your presentation, it can take up to 5 minutes for Presentation Translator to finish learning. After the first time, the subtitles will start instantaneously unless you update the content of your slides. Tip: start the custom speech recognition during a practice run so that you don’t experience a delay when you are ready to start presenting to an audience.
How does the custom speech recognition feature work?
The custom speech recognition feature works by training unique language models with the content of your slides. The language models behind Microsoft’s world-class speech recognition engine have been optimized for common usage scenarios.
The language model is a probability distribution over sequences of words and helps the system decide among sequences of words that sound similar, based on the likelihood of the word sequences themselves. For example, “recognize speech” and “wreck a nice beach” sound alike but the first hypothesis is far more likely to occur, and therefore will be assigned a higher score by the language model.
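To make the idea of scoring word sequences concrete, here is a minimal sketch of a toy bigram language model in Python. It is not the model used by the Microsoft speech recognition engine. The training corpus, the function names, and the add-one smoothing are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Tiny made-up training corpus (an assumption for illustration only);
# a real recognizer learns from vastly more text.
CORPUS = [
    "the system can recognize speech",
    "we recognize speech in real time",
    "speech recognition is hard",
    "it is a nice day",
]

def tokenize(sentence):
    # Add start and end markers so sentence boundaries are modeled too.
    return ["<s>"] + sentence.lower().split() + ["</s>"]

def train_bigram_model(sentences):
    context_counts = Counter()  # how often each word starts a word pair
    bigram_counts = Counter()   # how often each adjacent word pair occurs
    vocab = set()
    for sentence in sentences:
        tokens = tokenize(sentence)
        vocab.update(tokens[1:])
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))
    # Reserve one extra vocabulary slot for words never seen in training.
    return context_counts, bigram_counts, len(vocab) + 1

def log_probability(sentence, model):
    # Sum of add-one smoothed log P(word | previous word).
    context_counts, bigram_counts, vocab_size = model
    tokens = tokenize(sentence)
    return sum(
        math.log((bigram_counts[(prev, word)] + 1)
                 / (context_counts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )

model = train_bigram_model(CORPUS)
for hypothesis in ("recognize speech", "wreck a nice beach"):
    print(hypothesis, round(log_probability(hypothesis, model), 2))
```

With this toy corpus, “recognize speech” receives the higher (less negative) score, so a recognizer choosing between the two similar-sounding hypotheses would prefer it.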
If your presentation uses particular vocabulary items, such as product names or jargon that rarely occur in typical speech, it is likely that you can obtain improved performance by customizing the language model.
For example, if your presentation is about the automotive industry, it might contain terms like “powertrain” or “catalytic converter” or “limited slip differential.” Customizing the language model enables the system to learn these terms.
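The effect of customization can be sketched with a toy word-count model in Python. This is an illustration only, not how Presentation Translator adapts its models: the general text, the slide text, the `SLIDE_WEIGHT` value, and the misrecognition “limited sleep differential” are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical everyday text standing in for a general-purpose model.
GENERAL_TEXT = [
    "i need more sleep",
    "sleep is important for health",
    "we have limited time",
]

# Hypothetical slide content for an automotive presentation.
SLIDE_TEXT = [
    "a limited slip differential improves traction",
    "the powertrain includes a catalytic converter",
]

SLIDE_WEIGHT = 5  # assumption: slide words count more than general text

def count_words(sentences, weight=1):
    counts = Counter()
    for sentence in sentences:
        for word in sentence.lower().split():
            counts[word] += weight
    return counts

def log_score(phrase, counts):
    # Add-one smoothed unigram log-probability of the phrase.
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen words
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in phrase.lower().split()
    )

base = count_words(GENERAL_TEXT)
adapted = base + count_words(SLIDE_TEXT, weight=SLIDE_WEIGHT)

spoken = "limited slip differential"
misheard = "limited sleep differential"  # hypothetical misrecognition

for name, counts in (("base", base), ("adapted", adapted)):
    best = max((spoken, misheard), key=lambda p: log_score(p, counts))
    print(name, "model prefers:", best)
```

In this sketch the base model prefers the misheard phrase because “sleep” is common in everyday text, while the adapted model prefers the phrase that actually appears in the slides.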
When you use the Customize speech recognition feature in Presentation Translator, your presentation content - including notes from the slides - is securely transmitted to the Microsoft Translator transcription service to create an adapted language model based on this data. Data used for customization is not de-identified and is retained in full, along with the adapted model, by the service for thirty (30) days from last use to support your future presentations and use of the language modeling.
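Because retention is counted from the last use rather than from the first upload, each new use restarts the 30-day window. The short Python sketch below illustrates that arithmetic. The function names are hypothetical, and treating the final day as still retained is an assumption; the text does not describe how the service enforces the window.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # retention period stated in the text

def retention_end(last_use: date) -> date:
    # The window is measured from the most recent use of the adapted model.
    return last_use + timedelta(days=RETENTION_DAYS)

def is_retained(last_use: date, today: date) -> bool:
    # Assumption: data is still retained on the final day of the window.
    return today <= retention_end(last_use)

first_use = date(2024, 1, 1)
later_use = date(2024, 1, 20)   # presenting again restarts the window

print(retention_end(first_use))
print(retention_end(later_use))
```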
When subtitling starts, Translator live presentation mode is activated, which mutes audience members so that speakers can present and subtitles are displayed on screen without interruption.
At the end of the presentation, the presenter can decide to save a transcript of the presentation (including audience participation) in a text file.
1) Audience participation
Because this capability is powered by the Translator live feature, sharing the conversation code lets up to 500 audience members join the conversation on their own devices in their preferred language.
2) Audience personalization
Presenters can choose how the subtitles appear to the audience. By default, subtitles appear near the top of the screen for easy visibility, or they can be docked at the bottom of each slide.
3) Audience Q&A
When the audience is "unmuted" from the subtitle menu bar, audience members can join the conversation and comment in an interactive, multilingual Q&A session. The audience’s questions and comments are then displayed in the subtitle box in the presenter’s chosen subtitling language.