diff --git a/README-CN.md b/README-CN.md
index 263c9f7..72978bd 100644
--- a/README-CN.md
+++ b/README-CN.md
@@ -7,7 +7,7 @@
 ### [English](README.md) | 中文
 
 ## 特性
-🌍 **中文** 支持普通话并使用多种中文数据集进行测试：adatatang_200zh, SLR68
+🌍 **中文** 支持普通话并使用多种中文数据集进行测试：aidatatang_200zh, magicdata
 
 🤩 **PyTorch** 适用于 pytorch，已在 1.9.0 版本（最新于 2021 年 8 月）中测试，GPU Tesla T4 和 GTX 2060
 
@@ -29,7 +29,7 @@
 
 * 下载 数据集并解压：确保您可以访问 *train* 文件夹中的所有音频文件（如.wav）
 * 使用音频和梅尔频谱图进行预处理： `python synthesizer_preprocess_audio.py <datasets_root>`
-可以传入参数 --dataset `{dataset}` 支持 adatatang_200zh, SLR68
+可以传入参数 --dataset `{dataset}` 支持 aidatatang_200zh, magicdata
 > 假如你下载的 `aidatatang_200zh`文件放在D盘，`train`文件路径为 `D:\data\aidatatang_200zh\corpus\train` , 你的`datasets_root`就是 `D:\data\`
 
 * 预处理嵌入：
diff --git a/README.md b/README.md
index 37b22e9..ec39421 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@
 > English | [中文](README-CN.md)
 
 ## Features
-🌍 **Chinese** supported mandarin and tested with multiple datasets: aidatatang_200zh, SLR68
+🌍 **Chinese** supported mandarin and tested with multiple datasets: aidatatang_200zh, magicdata
 
 🤩 **PyTorch** worked for pytorch, tested in version of 1.9.0(latest in August 2021), with GPU Tesla T4 and GTX 2060
 
@@ -31,7 +31,7 @@
 
 * Download aidatatang_200zh or SLR68 dataset and unzip: make sure you can access all .wav in *train* folder
 * Preprocess with the audios and the mel spectrograms: `python synthesizer_preprocess_audio.py <datasets_root>`
-Allow parameter `--dataset {dataset}` to support adatatang_200zh, SLR68
+Allow parameter `--dataset {dataset}` to support aidatatang_200zh, magicdata
 
 * Preprocess the embeddings: `python synthesizer_preprocess_embeds.py <datasets_root>/SV2TTS/synthesizer`
diff --git a/synthesizer/preprocess.py b/synthesizer/preprocess.py
index 17a1123..60fd2cc 100644
--- a/synthesizer/preprocess.py
+++ b/synthesizer/preprocess.py
@@ -14,7 +14,7 @@ data_info = {
         "trans_filepath": "transcript/aidatatang_200_zh_transcript.txt",
         "speak_func": preprocess_speaker_general
     },
-    "SLR68": {
+    "magicdata": {
         "subfolders": ["train"],
         "trans_filepath": "train/TRANS.txt",
         "speak_func": preprocess_speaker_general
diff --git a/synthesizer_preprocess_audio.py b/synthesizer_preprocess_audio.py
index 501c13f..2af7b9f 100644
--- a/synthesizer_preprocess_audio.py
+++ b/synthesizer_preprocess_audio.py
@@ -7,7 +7,7 @@ import argparse
 
 recognized_datasets = [
     "aidatatang_200zh",
-    "SLR68",
+    "magicdata",
 ]
 
 if __name__ == "__main__":
@@ -35,7 +35,7 @@ if __name__ == "__main__":
        "Use this option when dataset does not include alignments\
    (these are used to split long audio files into sub-utterances.)")
    parser.add_argument("--dataset", type=str, default="aidatatang_200zh", help=\
-        "Name of the dataset to process.")
+        "Name of the dataset to process, allowed values: magicdata, aidatatang_200zh.")
    args = parser.parse_args()
 
    # Process the arguments
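Note for reviewers: the rename only takes effect if the same dataset name appears both in `recognized_datasets` (synthesizer_preprocess_audio.py) and as a key of `data_info` (synthesizer/preprocess.py). The sketch below is illustrative, not code from this PR: `resolve_dataset_paths` is a hypothetical helper, the `corpus/train` subfolder for aidatatang_200zh is inferred from the README example path, and the real `data_info` additionally carries a `speak_func` entry per dataset.

```python
# Minimal sketch (assumptions noted above): how the --dataset value could map onto
# a data_info-style table to locate audio folders and the transcript file.
from pathlib import Path

data_info = {
    "aidatatang_200zh": {
        "subfolders": ["corpus/train"],  # assumed from the README example path
        "trans_filepath": "transcript/aidatatang_200_zh_transcript.txt",
    },
    "magicdata": {
        "subfolders": ["train"],
        "trans_filepath": "train/TRANS.txt",
    },
}

def resolve_dataset_paths(datasets_root: Path, dataset: str):
    """Return (audio subfolders, transcript file) for a recognized dataset name."""
    if dataset not in data_info:
        raise ValueError(f"Unknown dataset {dataset!r}; expected one of {sorted(data_info)}")
    info = data_info[dataset]
    subfolders = [datasets_root / dataset / sub for sub in info["subfolders"]]
    transcript = datasets_root / dataset / info["trans_filepath"]
    return subfolders, transcript

if __name__ == "__main__":
    # Example: magicdata unpacked under D:\data\magicdata\train
    folders, trans = resolve_dataset_paths(Path("D:/data"), "magicdata")
    print(folders, trans)
```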