Rename SLR68 to magicdata to keep the dataset naming convention consistent

(cherry picked from commit bbdad858ebc4d0ee3b720ba22ae3e0ce9732a734)
2024-03-22 13:11:31 +08:00 · 2021-08-17 20:55:28 +08:00 · feb1c7cb88
commit feb1c7cb88
parent e66d29872f
4 changed files with 7 additions and 7 deletions
--- a/README-CN.md
+++ b/README-CN.md
@@ -7,7 +7,7 @@
 ### [English](README.md)  | 中文

 ## 特性
-🌍 **中文** 支持普通话并使用多种中文数据集进行测试：adatatang_200zh, SLR68
+🌍 **中文** 支持普通话并使用多种中文数据集进行测试：adatatang_200zh, magicdata

 🤩 **PyTorch** 适用于 pytorch，已在 1.9.0 版本（最新于 2021 年 8 月）中测试，GPU Tesla T4 和 GTX 2060

@@ -29,7 +29,7 @@
 * 下载 数据集并解压：确保您可以访问 *train* 文件夹中的所有音频文件（如.wav）
 * 使用音频和梅尔频谱图进行预处理：
 `python synthesizer_preprocess_audio.py <datasets_root>`
-可以传入参数 --dataset `{dataset}` 支持 adatatang_200zh, SLR68
+可以传入参数 --dataset `{dataset}` 支持 adatatang_200zh, magicdata
 > 假如你下载的 `aidatatang_200zh`文件放在D盘，`train`文件路径为 `D:\data\aidatatang_200zh\corpus\train` , 你的`datasets_root`就是 `D:\data\`

 * 预处理嵌入：
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@
 > English | [中文](README-CN.md) 

 ## Features
-🌍 **Chinese** supported mandarin and tested with multiple datasets: aidatatang_200zh, SLR68
+🌍 **Chinese** supported mandarin and tested with multiple datasets: aidatatang_200zh, magicdata

 🤩 **PyTorch** worked for pytorch, tested in version of 1.9.0(latest in August 2021), with GPU Tesla T4 and GTX 2060

@@ -31,7 +31,7 @@
 * Download aidatatang_200zh or SLR68 dataset and unzip: make sure you can access all .wav in *train* folder
 * Preprocess with the audios and the mel spectrograms:
 `python synthesizer_preprocess_audio.py <datasets_root>`
-Allow parameter `--dataset {dataset}` to support adatatang_200zh, SLR68
+Allow parameter `--dataset {dataset}` to support adatatang_200zh, magicdata
 * Preprocess the embeddings:
 `python synthesizer_preprocess_embeds.py <datasets_root>/SV2TTS/synthesizer`

--- a/synthesizer/preprocess.py
+++ b/synthesizer/preprocess.py
@@ -14,7 +14,7 @@ data_info = {
        "trans_filepath": "transcript/aidatatang_200_zh_transcript.txt",
        "speak_func": preprocess_speaker_general
    },
-    "SLR68": {
+    "magicdata": {
        "subfolders": ["train"],
        "trans_filepath": "train/TRANS.txt",
        "speak_func": preprocess_speaker_general
--- a/synthesizer_preprocess_audio.py
+++ b/synthesizer_preprocess_audio.py
@@ -7,7 +7,7 @@ import argparse

 recognized_datasets = [
    "aidatatang_200zh",
-    "SLR68",
+    "magicdata",
 ]

 if __name__ == "__main__":
@@ -35,7 +35,7 @@ if __name__ == "__main__":
        "Use this option when dataset does not include alignments\
        (these are used to split long audio files into sub-utterances.)")
    parser.add_argument("--dataset", type=str, default="aidatatang_200zh", help=\
-        "Name of the dataset to process.")
+        "Name of the dataset to process, allowing values: magicdata, aidatatang_200zh.")
    args = parser.parse_args()

    # Process the arguments
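
Note on the rename: after this change, callers that still pass `--dataset SLR68` no longer match a key in `data_info`. The sketch below is not part of this commit; it shows one way the old name could be mapped to the new one and validated at parse time. `recognized_datasets` mirrors the list in the diff, while `legacy_aliases`, `normalize_dataset`, and `build_parser` are hypothetical names introduced only for illustration.

```python
import argparse

# Mirrors the list in synthesizer_preprocess_audio.py after this commit.
recognized_datasets = ["aidatatang_200zh", "magicdata"]

# Hypothetical alias table (not in the repository): old name -> new name.
legacy_aliases = {"SLR68": "magicdata"}


def normalize_dataset(name: str) -> str:
    """Map a legacy dataset name to its current key, then validate it."""
    name = legacy_aliases.get(name, name)
    if name not in recognized_datasets:
        raise argparse.ArgumentTypeError(
            f"unknown dataset {name!r}; expected one of: "
            + ", ".join(recognized_datasets)
        )
    return name


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --dataset option accepts old and new names."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--dataset",
        type=normalize_dataset,  # argparse applies this to string defaults too
        default="aidatatang_200zh",
        help="Name of the dataset to process.",
    )
    return parser
```

With this in place, `build_parser().parse_args(["--dataset", "SLR68"]).dataset` resolves to `"magicdata"`, and an unrecognized name makes argparse exit with a usage error instead of failing later on a `data_info` lookup.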