HoloLens Development Notes - Speech Recognition (Dictation Recognition)

HoloLens supports three forms of voice input:

Voice Command
Dictation
Grammar Recognizer

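The earlier post, HoloLens Development Notes - Speech Recognition (Voice Command), covered how to use Voice Command. This post covers dictation.

Dictation Recognition

Dictation converts speech to text (Speech to Text). On HoloLens it is mostly used wherever text has to be typed, for example when searching in Edge. HoloLens normally has no conventional physical keyboard, so typing means using gestures on a virtual keyboard: the user turns their head to place the gaze cursor on the desired key, then confirms that letter with an air-tap gesture. This is clearly very inconvenient.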


Entering text by converting speech to text can therefore greatly improve efficiency.

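The dictation feature converts the user's speech into text input, and also supports hypothesis results and event registration. The Start() and Stop() methods enable and disable dictation. When you are finished with the recognizer, call Dispose() to release the resources it uses. If you do not, the garbage collector will eventually reclaim those resources, but at an additional performance cost.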
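The Start()/Stop()/Dispose() lifecycle described above can be sketched as a small toggle component. This is only an illustration under stated assumptions: the class name DictationToggle and the Toggle() method are hypothetical, something in your scene (a button or gesture handler) is assumed to call Toggle(), and no KeywordRecognizer is assumed to be running at the time.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical helper (not part of the original script): toggles dictation on and off.
public class DictationToggle : MonoBehaviour
{
    private DictationRecognizer recognizer;

    void Awake()
    {
        recognizer = new DictationRecognizer();
    }

    // Assumed to be called by a button or gesture handler in your scene.
    public void Toggle()
    {
        if (recognizer.Status == SpeechSystemStatus.Running)
        {
            recognizer.Stop();   // Disable dictation.
        }
        else
        {
            recognizer.Start();  // Enable dictation.
        }
    }

    void OnDestroy()
    {
        // Release the recognizer's resources explicitly instead of waiting for the GC.
        if (recognizer.Status == SpeechSystemStatus.Running)
        {
            recognizer.Stop();
        }
        recognizer.Dispose();
    }
}
```

Checking Status before calling Stop() avoids stopping a recognizer that has already finished, for example after a timeout.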

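Keep the following in mind when using dictation recognition:

The Microphone capability must be enabled in your app. In Unity, go to Edit -> Project Settings -> Player -> Windows Store -> Publishing Settings -> Capabilities and make sure Microphone is checked.
HoloLens must be connected to Wi-Fi, otherwise dictation recognition will not work.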
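A rough pre-flight check for the two requirements above might look like the following sketch. The class name DictationPreflight and its methods are hypothetical. Note that Application.internetReachability only reports whether Unity sees a possible network route; it does not guarantee that the speech service is actually reachable.

```csharp
using UnityEngine;

// Hypothetical helper: warns early when dictation is unlikely to work.
public class DictationPreflight : MonoBehaviour
{
    // True when Unity reports some kind of network path (Wi-Fi or otherwise).
    public static bool HasNetwork()
    {
        return Application.internetReachability != NetworkReachability.NotReachable;
    }

    // True when at least one recording device is visible to Unity.
    public static bool HasMicrophone()
    {
        return Microphone.devices.Length > 0;
    }

    void Start()
    {
        if (!HasNetwork())
        {
            Debug.LogWarning("No network connection detected; dictation will not work.");
        }

        if (!HasMicrophone())
        {
            Debug.LogWarning("No microphone detected; check that the Microphone capability is enabled.");
        }
    }
}
```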

MicrophoneManager.cs

using System.Text;
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Windows.Speech;

public class MicrophoneManager : MonoBehaviour
{
    [Tooltip("A text area for the recognizer to display the recognized strings.")]
    public Text DictationDisplay;

    private DictationRecognizer dictationRecognizer;

    // Caches the text currently displayed in the text box.
    // Initialized here so the event handlers never see a null StringBuilder.
    private StringBuilder textSoFar = new StringBuilder();

    void Awake()
    {
        // Create a new DictationRecognizer and assign it to the dictationRecognizer field.
        dictationRecognizer = new DictationRecognizer();

        // Register for the DictationHypothesis event.
        // It is fired while the user is talking. As the recognizer listens, it provides the text it has heard so far.
        dictationRecognizer.DictationHypothesis += DictationRecognizer_DictationHypothesis;

        // Register for the DictationResult event.
        // It is fired after the user pauses, typically at the end of a sentence, and returns the full recognized string.
        dictationRecognizer.DictationResult += DictationRecognizer_DictationResult;

        // Register for the DictationComplete event.
        // It is fired when the recognizer stops, whether from Stop() being called, a timeout occurring, or some other error.
        dictationRecognizer.DictationComplete += DictationRecognizer_DictationComplete;

        // Register for the DictationError event.
        // It is fired when an error occurs, typically when there is no network connection
        // or the connection drops during recognition.
        dictationRecognizer.DictationError += DictationRecognizer_DictationError;

        // Shut down the PhraseRecognitionSystem, which controls the KeywordRecognizers.
        // Dictation can only be started after keyword recognition has been shut down.
        PhraseRecognitionSystem.Shutdown();

        // Start dictation recognition.
        dictationRecognizer.Start();

    }

    /// <summary>
    /// This event is fired while the user is talking. As the recognizer listens, it provides text of what it's heard so far.
    /// </summary>
    /// <param name="text">The currently hypothesized recognition.</param>
    private void DictationRecognizer_DictationHypothesis(string text)
    {
        // Set DictationDisplay text to be textSoFar and new hypothesized text
        // We don't want to append to textSoFar yet, because the hypothesis may have changed on the next event
        DictationDisplay.text = textSoFar.ToString() + " " + text + "...";
    }

    /// <summary>
    /// This event is fired after the user pauses, typically at the end of a sentence. The full recognized string is returned here.
    /// </summary>
    /// <param name="text">The text that was heard by the recognizer.</param>
    /// <param name="confidence">A representation of how confident (rejected, low, medium, high) the recognizer is of this recognition.</param>
    private void DictationRecognizer_DictationResult(string text, ConfidenceLevel confidence)
    {
        // Append the latest recognized text to textSoFar, separating sentences with ". ".
        textSoFar.Append(text + ". ");

        // Show the accumulated text.
        DictationDisplay.text = textSoFar.ToString();
    }

    /// <summary>
    /// This event is fired when the recognizer stops, whether from Stop() being called, a timeout occurring, or some other error.
    /// Typically, this will simply return "Complete". In this case, we check to see if the recognizer timed out.
    /// </summary>
    /// <param name="cause">An enumerated reason for the session completing.</param>
    private void DictationRecognizer_DictationComplete(DictationCompletionCause cause)
    {
        // A timeout means the user has been silent for too long. By default this happens after
        // 5 seconds of initial silence once dictation starts, or after 20 seconds of silence
        // following a recognized result.
        if (cause == DictationCompletionCause.TimeoutExceeded)
        {
            // Stop any recording on the default microphone (a null device name selects the default).
            Microphone.End(null);

            DictationDisplay.text = "Dictation has timed out. Please press the record button again.";

            // Notify other components on this GameObject; no error is logged if none handles the message.
            SendMessage("ResetAfterTimeout", SendMessageOptions.DontRequireReceiver);
        }
    }

    /// <summary>
    /// This event is fired when an error occurs.
    /// </summary>
    /// <param name="error">The string representation of the error reason.</param>
    /// <param name="hresult">The int representation of the hresult.</param>
    private void DictationRecognizer_DictationError(string error, int hresult)
    {
        // Show the error string and its HRESULT in the display.
        DictationDisplay.text = error + "\nHRESULT: " + hresult;
    }


    void OnDestroy()
    {
        // Stop dictation if it is still running, unregister the handlers,
        // and release the recognizer's resources.
        if (dictationRecognizer.Status == SpeechSystemStatus.Running)
        {
            dictationRecognizer.Stop();
        }
        dictationRecognizer.DictationHypothesis -= DictationRecognizer_DictationHypothesis;
        dictationRecognizer.DictationResult -= DictationRecognizer_DictationResult;
        dictationRecognizer.DictationComplete -= DictationRecognizer_DictationComplete;
        dictationRecognizer.DictationError -= DictationRecognizer_DictationError;
        dictationRecognizer.Dispose();
    }

}

HoloLens can run only one kind of speech recognition at a time, so to use dictation recognition you must first shut down the KeywordRecognizer.
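If an app needs both voice commands and dictation, one possible approach is to shut down keyword recognition just before dictation starts and restart it once dictation completes. The sketch below assumes that pattern; the class name RecognizerSwitcher and the BeginDictation() method are hypothetical, and the surrounding app is assumed to register its KeywordRecognizers elsewhere.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical helper: hands the microphone back and forth between
// keyword recognition and dictation.
public class RecognizerSwitcher : MonoBehaviour
{
    private DictationRecognizer dictationRecognizer;

    void Awake()
    {
        dictationRecognizer = new DictationRecognizer();
        dictationRecognizer.DictationComplete += OnDictationComplete;
    }

    // Assumed to be called when the user asks to start dictating.
    public void BeginDictation()
    {
        // Keyword recognition must be shut down before dictation can start.
        if (PhraseRecognitionSystem.Status == SpeechSystemStatus.Running)
        {
            PhraseRecognitionSystem.Shutdown();
        }
        dictationRecognizer.Start();
    }

    private void OnDictationComplete(DictationCompletionCause cause)
    {
        // Dictation has stopped (Stop(), timeout, or error), so resume keyword recognition.
        PhraseRecognitionSystem.Restart();
    }

    void OnDestroy()
    {
        dictationRecognizer.DictationComplete -= OnDictationComplete;
        dictationRecognizer.Dispose();
    }
}
```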

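DictationRecognizer has two timeouts:

If the recognizer starts and hears no audio within the first 5 seconds, it times out.
If the recognizer has produced a result but then hears no audio for 20 seconds, it times out.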
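Both timeouts are exposed as properties on DictationRecognizer, so they can be adjusted before calling Start(). The following is a minimal sketch; the class name DictationTimeouts is hypothetical, and the values 10 and 30 are arbitrary examples rather than recommendations.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical helper: shows how the two dictation timeouts can be changed.
public class DictationTimeouts : MonoBehaviour
{
    private DictationRecognizer recognizer;

    void Awake()
    {
        recognizer = new DictationRecognizer();

        // Time allowed before the first audio is heard (default: 5 seconds).
        recognizer.InitialSilenceTimeoutSeconds = 10f;

        // Time allowed in silence after a recognized result (default: 20 seconds).
        recognizer.AutoSilenceTimeoutSeconds = 30f;

        recognizer.DictationComplete += OnDictationComplete;
    }

    private void OnDictationComplete(DictationCompletionCause cause)
    {
        if (cause == DictationCompletionCause.TimeoutExceeded)
        {
            Debug.Log("Dictation stopped because one of the silence timeouts was exceeded.");
        }
    }

    void OnDestroy()
    {
        recognizer.DictationComplete -= OnDictationComplete;
        recognizer.Dispose();
    }
}
```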

Last edited: 2017.12.04 17:12:19
