From 630023c7b25ae33872aa37ad4552c597c98f27dc Mon Sep 17 00:00:00 2001
From: babysor00
Date: Sun, 29 Aug 2021 10:55:59 +0800
Subject: [PATCH] format readme and add paper references

---
 README-CN.md                    | 11 ++++++++++-
 README.md                       | 11 ++++++++++-
 synthesizer_preprocess_audio.py |  2 +-
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/README-CN.md b/README-CN.md
index 7a657e9..c792dd6 100644
--- a/README-CN.md
+++ b/README-CN.md
@@ -2,7 +2,6 @@
 ![mockingbird](https://user-images.githubusercontent.com/12797292/131216767-6eb251d6-14fc-4951-8324-2722f0cd4c63.jpg)
 [![MIT License](https://img.shields.io/badge/license-MIT-blue.svg?style=flat)](http://choosealicense.com/licenses/mit/)
-> 该库是从仅支持英语的[Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning) 分叉出来的。
 ### [English](README.md) | 中文
@@ -71,3 +70,13 @@
 - [ ] 支持parallel tacotron
 - [ ] 服务化与容器化
 - [ ] 🙏 欢迎补充
+
+## 引用及论文
+> 该库一开始从仅支持英语的[Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning) 分叉出来的,鸣谢作者。
+
+| URL | Designation | 标题 | 实现源码 |
+| --- | ----------- | ----- | --------------------- |
+|[**1806.04558**](https://arxiv.org/pdf/1806.04558.pdf) | **SV2TTS** | **Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis** | This repo |
+|[1802.08435](https://arxiv.org/pdf/1802.08435.pdf) | WaveRNN (vocoder) | Efficient Neural Audio Synthesis | [fatchord/WaveRNN](https://github.com/fatchord/WaveRNN) |
+|[1703.10135](https://arxiv.org/pdf/1703.10135.pdf) | Tacotron (synthesizer) | Tacotron: Towards End-to-End Speech Synthesis | [fatchord/WaveRNN](https://github.com/fatchord/WaveRNN)
+|[1710.10467](https://arxiv.org/pdf/1710.10467.pdf) | GE2E (encoder)| Generalized End-To-End Loss for Speaker Verification | 本代码库 |
\ No newline at end of file
diff --git a/README.md b/README.md
index 80e3c8d..b237dac 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,6 @@
 [![MIT License](https://img.shields.io/badge/license-MIT-blue.svg?style=flat)](http://choosealicense.com/licenses/mit/)
-> This repository is forked from [Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning) which only support English.
 > English | [中文](README-CN.md)
@@ -74,3 +73,13 @@ or
 - [ ] Support parallel tacotron
 - [ ] Service orianted and docterize
 - 🙏 Welcome to add more
+
+## Reference
+> This repository is forked from [Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning) which only support English.
+
+| URL | Designation | Title | Implementation source |
+| --- | ----------- | ----- | --------------------- |
+|[**1806.04558**](https://arxiv.org/pdf/1806.04558.pdf) | **SV2TTS** | **Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis** | This repo |
+|[1802.08435](https://arxiv.org/pdf/1802.08435.pdf) | WaveRNN (vocoder) | Efficient Neural Audio Synthesis | [fatchord/WaveRNN](https://github.com/fatchord/WaveRNN) |
+|[1703.10135](https://arxiv.org/pdf/1703.10135.pdf) | Tacotron (synthesizer) | Tacotron: Towards End-to-End Speech Synthesis | [fatchord/WaveRNN](https://github.com/fatchord/WaveRNN)
+|[1710.10467](https://arxiv.org/pdf/1710.10467.pdf) | GE2E (encoder)| Generalized End-To-End Loss for Speaker Verification | This repo |
\ No newline at end of file
diff --git a/synthesizer_preprocess_audio.py b/synthesizer_preprocess_audio.py
index 871f5e7..7c322e7 100644
--- a/synthesizer_preprocess_audio.py
+++ b/synthesizer_preprocess_audio.py
@@ -42,7 +42,7 @@ if __name__ == "__main__":
     # Process the arguments
     if not hasattr(args, "out_dir"):
         args.out_dir = args.datasets_root.joinpath("SV2TTS", "synthesizer")
-    assert args.dataset in recognized_datasets, 'not surpport such dataset'
+    assert args.dataset in recognized_datasets, 'is not supported, please vote for it in https://github.com/babysor/MockingBird/issues/10'
     # Create directories
     assert args.datasets_root.exists()
     args.out_dir.mkdir(exist_ok=True, parents=True)