HMM-based Mandarin Singing Voice Synthesis

A brief description:

Musical analysis + speech synthesis technique = singing synthesis
Input: music score, Output: singing voice

Details

This is an HMM-based Mandarin singing voice synthesis system.

During training, F0 and spectral features are extracted from recorded singing voices using a source-filter model. HMMs are then trained to model these acoustic parameters (F0 and spectrum).
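The README does not say which F0 extractor the system uses. As a rough illustration of what "extracting F0" from a voiced frame involves, the sketch below picks the strongest autocorrelation peak inside a plausible pitch range. The function name `estimate_f0`, the search range, and the test signal are assumptions made for this example, not part of the project.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=80.0, f0_max=800.0):
    """Estimate the F0 (Hz) of one voiced frame from its autocorrelation peak.

    Illustrative only: real extractors add voicing decisions, windowing,
    and peak interpolation. Returns 0.0 if no positive peak is found.
    """
    # Candidate pitch periods, expressed as lags in samples.
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)

    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Biased (unnormalized) autocorrelation: shorter lags sum more
        # terms, which discourages picking a multiple of the true period.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag if best_lag else 0.0


# Hypothetical test signal: 50 ms of a pure 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
f0 = estimate_f0(frame, sample_rate)
```

Running this per frame over a recording would give an F0 contour; the spectral envelope would be extracted separately as the "filter" half of the source-filter model.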

During synthesis, an input music score is analyzed to obtain musical information, and acoustic parameters are then generated by selecting HMMs based on this information. Finally, the source-filter model is used to synthesize the singing voice from the acoustic parameters.
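To make "musical information" concrete, the sketch below turns one score note into the two quantities a synthesizer ultimately has to honor: a target pitch in Hz and a duration in frames. It is only a score-derived target, not the HMM parameter generation itself. The helper names and the 5 ms frame shift are assumptions for illustration; the tempo of 130 quarter notes per minute matches the demo score.

```python
def midi_to_hz(midi_note):
    """Equal-tempered frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def note_seconds(denominator, tempo_bpm, beat_unit=4):
    """Length in seconds of a 1/denominator note.

    `tempo_bpm` counts 1/beat_unit notes per minute, so a tempo of
    "4 = 130" means 130 quarter notes per minute.
    """
    beats = beat_unit / denominator
    return beats * 60.0 / tempo_bpm


def target_f0_track(midi_note, denominator, tempo_bpm, frame_shift=0.005):
    """Flat per-frame F0 target for one note (frame_shift is an assumed value)."""
    n_frames = round(note_seconds(denominator, tempo_bpm) / frame_shift)
    return [midi_to_hz(midi_note)] * n_frames


# Example: an eighth note on B4 (MIDI 71) at 130 quarter notes per minute.
track = target_f0_track(71, 8, 130)
```

A trained model would replace the flat track with a generated contour, but the note's pitch and length remain the constraints that keep the output in tune and in time.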

This system can synthesize a natural-sounding singing voice with precise tempo and pitch.

Training Dataset

dataset

Tools:

Demo:

LilyPond input:

\paper{
  top-margin = 2.5\cm
  bottom-margin = 2\cm
  left-margin = 3\cm
  right-margin = 3\cm
  markup-system-spacing = 
    #'((padding . 3))
}

\version "2.14.0"
\header {
  title="3644913_1_1"
}
\score {
  \relative  c  {
    \time  4/4 
    \tempo  4 = 130 
    \key  e \major

  b''8 b b4 b8 b b r8 
  | % 2
  gis b cis b16 gis fis8 gis b r8 
  | % 3
  b gis b gis16 fis e8 gis fis r8 
  | % 4
  gis gis fis4 cis8 e fis r8 
  | % 5
  cis'4 cis8 b gis cis b4 
  | % 6
  b8 gis fis gis b4 r4 
  | % 7
  b8 gis fis gis b gis fis gis 
  | % 8
  cis, e fis gis e2 
  | % 9
  

   }
  \addlyrics {
    啦 啦 啦, 啦 啦 啦, 我 是 卖 报 的 小 行 家, 不 等 天 明 去 等 派 报. 一 面 走, 一 面 叫, 今 天 的 新 _ 闻 真 _ 正 _ 好, 七 个 铜 板 就 _ 买 _ 两 _ 份 _ 报.
  }
  \addlyrics {
    \teeny
    "la5" "la5" "la5" "la5" "la5" "la5" "wo3" "shi4" "mai4" "bao4" "de5" "xiao3" "hang2" "jia1" "bu4" "deng3" "tian1" "ming2" "qu4" "deng3" "pai4" "bao4" "yi2" "mian4" "zou3" "yi2" "mian4" "jiao4" "jin1" "tian1" "de5" "xin1" _ "wen2" "zhen1" _ "zheng4" _ "hao3" "qi1" "ge5" "tong2" "ban3" "jiu4" _ "mai3" _ "liang3" _ "fen4" _ "bao4"
  }
  \midi{}
  \layout{}
}
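The second lyrics line of the score carries one quoted pinyin syllable with a tone digit per note, with `_` marking a note that continues the previous syllable. How the system itself reads this line is not described here; the following is a minimal sketch, with the hypothetical helper `parse_pinyin_lyrics`, of splitting such a line into syllable and tone pairs.

```python
import re

# A lowercase pinyin syllable followed by a tone digit, where 5 is the
# neutral tone as in "la5" and "de5".
_SYLLABLE = re.compile(r"^([a-z]+)([1-5])$")


def parse_pinyin_lyrics(line):
    """Parse a line of quoted pinyin tokens into (syllable, tone) pairs.

    A bare "_" yields None, meaning the note extends the previous syllable.
    Raises ValueError on any token that is not pinyin plus a tone digit.
    """
    parsed = []
    for token in line.split():
        if token == "_":
            parsed.append(None)
            continue
        match = _SYLLABLE.match(token.strip('"'))
        if match is None:
            raise ValueError(f"unrecognized lyric token: {token!r}")
        parsed.append((match.group(1), int(match.group(2))))
    return parsed


# A fragment taken from the demo score above.
syllables = parse_pinyin_lyrics('"de5" "xin1" _ "wen2"')
```

Each resulting entry lines up with one note of the melody, which is what lets the tone and syllable identity be attached to a pitch and duration.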

Output:

Publication

Li X, Wang Z. A HMM-based Mandarin Chinese singing voice synthesis system. IEEE/CAA Journal of Automatica Sinica, 2016, 3(2): 192-202.