PLATINUM: A method to extract excitation signals for voice synthesis system

Roughly two types of systems for voice synthesis have been proposed. One is based on the time domain pitch synchronous overlap-add (TD-PSOLA [2]), which synthesizes a voice using the short time waveform directly extracted from the input signal. The other is based on a vocoder [3], which analyzes a voice in terms of its pitch (fundamental frequency; F0) and timbre (spectral envelope) and synthesizes it with the estimated parameters.

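TD-PSOLA synthesizes speech directly from short-time waveforms extracted from the input signal; a vocoder analyzes the pitch (fundamental frequency, F0) and timbre (spectral envelope) of the speech and synthesizes it from these estimated parameters.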

TD-PSOLA and vocoders have trade-offs. TD-PSOLA synthesizes voice with better quality than vocoders; however, vocoders can manipulate pitch and voice timbre independently.

TD-PSOLA gives better synthesis quality, but a vocoder can control pitch and timbre independently.

STRAIGHT [7] and TANDEM-STRAIGHT [8] have been proposed to solve this problem. They use pitch synchronous analysis [9] to improve the estimation performance of the spectral envelope.

What is pitch synchronous analysis? To be continued.

Furthermore, aperiodicity is used as the parameter to represent not only the periodic signal but also the aperiodic signal.

Aperiodicity (AP) can represent both periodic and aperiodic signals.

In STRAIGHT and TANDEM-STRAIGHT, the aperiodicity is defined as the spectrum to synthesize both periodic and aperiodic signals. The periodic and the aperiodic spectra are calculated using the spectral envelope and aperiodicity, and the periodic and aperiodic signals are individually calculated.

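AP is a spectrum from which both periodic and aperiodic signals can be synthesized. The periodic and aperiodic spectra are computed from the spectral envelope and AP, and the periodic and aperiodic signals are computed separately.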

This approach cannot represent the phase of the input voice because the periodic signal is calculated as the minimum phase response, and the vocal tract response h(t) generally includes not only minimum phase response but also maximum phase response. To accurately synthesize a voice, it is essential to extract the phase of the input voice. We used a waveform-based parameter as a new parameter instead of aperiodicity.

To be continued.
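
To make the minimum-phase limitation concrete, here is a minimal sketch (not from the paper) comparing two hypothetical 2-tap filters, `h_min` and `h_max`. They have the same magnitude response at every frequency but different phase, so a parameter that stores only amplitude cannot tell them apart. The filter values and the helper name `freq_response` are assumptions chosen for illustration.

```python
import cmath
import math

def freq_response(h, omega):
    """Evaluate H(omega) = sum_n h[n] * exp(-j*omega*n) for an FIR filter h."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))

# Two hypothetical 2-tap filters (illustrative values only).
h_min = [1.0, 0.5]  # zero at z = -0.5, inside the unit circle: minimum phase
h_max = [0.5, 1.0]  # zero at z = -2.0, outside the unit circle: maximum phase

# Compare the two responses on a few frequencies strictly between 0 and pi.
omegas = [k * math.pi / 8 for k in range(1, 8)]
mag_diff = max(
    abs(abs(freq_response(h_min, w)) - abs(freq_response(h_max, w)))
    for w in omegas
)
phase_diff = max(
    abs(cmath.phase(freq_response(h_min, w)) - cmath.phase(freq_response(h_max, w)))
    for w in omegas
)

print(f"largest magnitude difference: {mag_diff:.2e}")
print(f"largest phase difference:     {phase_diff:.3f} rad")
```

The magnitude difference is at the level of floating-point rounding, while the phase difference is clearly nonzero. This is the information that is lost when the periodic signal is rebuilt as a minimum phase response.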

PLATINUM extracts the waveform-based parameter to reconstruct the input voice.

**PLATINUM extracts a waveform-based parameter to reconstruct the input voice.**

The proposed system equals vocoder-based systems except that it uses the excitation signal instead of aperiodicity, which therefore suggests that it is possible for the proposed system to independently manipulate the F0 and spectral envelope like vocoder-based systems.

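The system is equivalent to vocoder-based systems, except that it replaces AP with an excitation signal. This suggests that it can control pitch (F0) and timbre (spectral envelope) independently.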

The observed spectrum Y(ω) is defined as the product of the spectral envelope H(ω) and the target spectrum X(ω) for reconstructing the waveform, Y(ω) = H(ω) X(ω). The target spectrum X(ω) is therefore given by

X(ω) = Y(ω) / H(ω)

Since the phase of H(ω) for vocoder-based systems is generally the minimum phase, the maximum phase of the input voice is included in X(ω). The power of X(ω) is nearly flat, provided that the spectral envelope is accurately estimated. If H(ω) does not include any zeros, the inverse spectrum can be calculated reliably.

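The observed spectrum Y is the product of the spectral envelope H and the target spectrum X used to reconstruct the waveform, so X is obtained by dividing Y by H. Because the phase of the spectral envelope is minimum phase, the maximum-phase part of the input signal ends up in X. The power of X is nearly flat, provided the spectral envelope is estimated accurately. If H contains no zeros, its inverse can be computed reliably.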
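
The division of Y by H can be sketched on a single toy frame. This is a minimal illustration, not the paper's implementation: the frame length `N`, the pulse excitation `x_true`, the decaying response `h` standing in for the spectral envelope, and the naive `dft`/`idft` helpers are all assumptions made for the example.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT, returning complex samples."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def excitation_spectrum(y, h):
    """Return X = Y / H per frequency bin; assumes H has no zeros on the DFT grid."""
    Y = dft(y)
    H = dft(h)
    return [Yk / Hk for Yk, Hk in zip(Y, H)]

# Toy data (assumed, not from the paper): a single pulse as the excitation and
# a decaying impulse response standing in for the spectral envelope.
N = 16
x_true = [0.0] * N
x_true[3] = 1.0
h = [0.6 ** t for t in range(N)]

# Observed frame: circular convolution of x_true and h, i.e. Y = H * X per bin.
y = [sum(x_true[m] * h[(t - m) % N] for m in range(N)) for t in range(N)]

X = excitation_spectrum(y, h)
x_rec = [v.real for v in idft(X)]
max_err = max(abs(a - b) for a, b in zip(x_rec, x_true))
print(f"max reconstruction error: {max_err:.2e}")
```

In this toy case |X| is exactly flat because the excitation is a single pulse and H is known exactly. With a real voice, H is only an estimate, which is why the flatness of X depends on how accurately the spectral envelope was estimated.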

To estimate X(ω), determining the temporal positions for windowing is an important problem. PLATINUM uses the F0 contour and waveform. First, the voiced section is estimated based on the F0 contour, and the temporal position with the maximum value of y(t)^2 is then extracted as the basic temporal position. The other positions are automatically calculated based on the basic position and F0 contour.

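When estimating X, determining the window positions is the key step. PLATINUM uses the F0 contour and the waveform. First, the voiced section is estimated from the F0 contour; then the time at which y(t)^2 is largest is taken as the basic temporal position. The remaining positions are computed automatically from the basic position and the F0 contour.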
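
The position search can be sketched as follows. This is a simplified reading of the description above, not the paper's exact procedure: the function name `window_positions`, its parameters, and the rule of stepping one local pitch period (fs / F0) at a time are assumptions, and the input is assumed to be a single voiced section.

```python
def window_positions(y, f0, fs, frame_period):
    """Pick window centers (sample indices) for one voiced section.

    y            : waveform samples
    f0           : F0 contour in Hz, one value per frame (all > 0 here)
    fs           : sampling rate in Hz
    frame_period : frame shift of the F0 contour in seconds
    Assumes the whole of y lies inside a single voiced section.
    """
    def f0_at(sample):
        # Map a sample index to the F0 frame that covers it.
        frame = min(int(sample / fs / frame_period), len(f0) - 1)
        return f0[frame]

    # Basic position: the sample where y(t)^2 is largest.
    base = max(range(len(y)), key=lambda t: y[t] ** 2)

    positions = [base]
    # Step forward one local pitch period at a time.
    t = base
    while True:
        t = t + round(fs / f0_at(t))
        if t >= len(y):
            break
        positions.append(t)
    # Step backward in the same way.
    t = base
    while True:
        t = t - round(fs / f0_at(t))
        if t < 0:
            break
        positions.append(t)
    return sorted(positions)

# Toy voiced section (assumed values): 100 Hz pulses at 8 kHz, so the pitch
# period is 80 samples; the strongest pulse is negative, at sample 170.
fs = 8000
y = [0.0] * 400
for p in (10, 90, 170, 250, 330):
    y[p] = 0.5
y[170] = -1.0
f0 = [100.0] * 10  # one value every 5 ms

positions = window_positions(y, f0, fs, frame_period=0.005)
print(positions)  # [10, 90, 170, 250, 330]
```

Squaring the waveform means the basic position is found regardless of the sign of the strongest peak, as with the negative pulse at sample 170 here.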

Summary

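F0 (the fundamental frequency) represents how high or low the pitch is: female voices tend to be higher, male voices lower. SP (the spectral envelope) represents timbre, which is what makes a guitar and a piano sound different. AP (aperiodicity) describes the noise-like, non-periodic part of the voice, such as breathiness. It does not carry the content of what is said: in an utterance like "你好吗" ("How are you?"), the content is carried mainly by how SP changes over time, and the four Mandarin tones by the F0 contour. Replacing AP with an extracted excitation signal gives better results.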
