1 Star 0 Fork 0

Nice / MTTS

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
克隆/下载
贡献代码
同步代码
取消
提示: 由于 Git 不支持空文件夾,创建文件夹后会生成空的 .keep 文件
Loading...
README
MIT

本项目已停止维护

推荐:https://github.com/thuhcsi/Crystal

欢迎加入

  • 语音合成交流QQ群:882726654

Build Status

A Demo of MTTS Mandarin/Chinese Text to Speech FrontEnd

Mandarin/Chinese Text to Speech based on statistical parametric speech synthesis using merlin toolkit

这只是一个语音合成前端的Demo,没有提供文本正则化,韵律预测功能,文字转拼音使用pypinyin,分词使用结巴分词,这两者的准确度也达不到商用水平。

其他语音合成项目传送门,端到端是不错的方向,自然度要优于merlin。

This is only a demo of mandarin frontend which is lack of some parts like "text normalization" and "prosody prediction", and the phone set && Question Set this project use havn't fully tested yet.

一个粗略的文档:A draft documentation written in Mandarin

Data

There is no open-source mandarin speech synthesis dataset on the internet, this proj used thchs30 dataset to demostrate speech synthesis

UPDATE

open-source mandarin speech synthesis data from data-banker company, 开源的中文语音合成数据,感谢标贝公司

【数据下载】https://weixinxcxdb.oss-cn-beijing.aliyuncs.com/gwYinPinKu/BZNSYP.rar 【数据说明】http://www.data-baker.com/open_source.html

Generated Samples

Listen to https://jackiexiao.github.io/MTTS/

How To Reproduce

  1. First, you need data contain wav and txt (prosody mark is optional)
  2. Second, generate HTS label using this project
  3. Using merlin/egs/mandarin_voice to train and generate Mandarin Voice

Context related annotation & Question Set

Install

Python : python3.6
System: linux(tested on ubuntu16.04)

pip install jieba pypinyin
sudo apt-get install libatlas3-base

Run bash tools/install_mtts.sh
Or download file by yourself

Run Demo

bash run_demo.sh

Usage

1. Generate HTS Label by wav and text

  • Usage: Run python src/mtts.py txtfile wav_directory_path output_directory_path (Absolute path or relative path) Then you will get HTS label, if you have your own acoustic model trained by monthreal-forced-aligner, add-a your_acoustic_model.zip, otherwise, this project use thchs30.zip acoustic model as default
  • Attention: Currently only support Chinese Character, txt should not have any Arabia number or English alphabet(不可包含阿拉伯数字和英文字符)

txtfile example

A_01 这是一段文本
A_02 这是第二段文本

wav_directory example(Sampleing Rate should larger than 16khz)

A_01.wav  
A_02.wav  

2. Generate HTS Label by text with or without alignment file

  • Usage: Run python src/mandarin_frontend.py txtfile output_directory_path
  • or import mandarin_frontend
from mandarin_frontend import txt2label

result = txt2label('向香港特别行政区同胞澳门和台湾同胞海外侨胞')
[print(line) for line in result]

# with prosody mark and alignment file (sfs file)
# result = txt2label('向#1香港#2特别#1行政区#1同胞#4澳门#2和#1台湾#1同胞#4海外#1侨胞',
            sfsfile='example_file/example.sfs')

see source code for more information, but pay attention to the alignment file(sfs file), the format is endtime phone_type not start_time, phone_type(which is different from speech ocean's data)

3. Forced-alignment

This project use Montreal-Forced-Aligner to do forced alignment, if you want to get a better alignment, use your data to train a alignment-model, see mfa: algin-using-only-the-dataset

  1. We trained the acoustic model using thchs30 dataset, see misc/thchs30.zip, the dictionary we use mandarin_mtts.lexicon. If you use larger dataset than thchs30, you may get better alignment.
  2. If you want to use mfa's (montreal-forced-aligner) pre-trained mandarin model, this is the dictionary you need mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon

Prosody Mark

You can generate HTS Label without prosody mark. we assume that word segment is smaller than prosodic word(which is adjusted in code)

"#0","#1", "#2","#3" and "#4" are the prosody labeling symbols.

  • #0 stands for word segment
  • #1 stands for prosodic word
  • #2 stands for stressful word (actually in this project we regrad it as #1)
  • #3 stands for prosodic phrase
  • #4 stands for intonational phrase

Improvement to be done in future

  • Text Normalization
  • Better Chinese word segment
  • G2P: Polyphone Problem
  • Better Label format and Question Set
  • Improvement of prosody analyse
  • Better alignment

Contributor

  • Jackiexiao
  • willian56
MIT License Copyright (c) 2017-2018 Jackiexiao <707610215@qq.com> Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

简介

暂无描述 展开 收起
Python 等 2 种语言
MIT
取消

发行版

暂无发行版

贡献者

全部

近期动态

加载更多
不能加载更多了
1
https://gitee.com/jqying/MTTS.git
git@gitee.com:jqying/MTTS.git
jqying
MTTS
MTTS
master

搜索帮助