Variational Inference with adversarial learning for end-to-end Singing Voice Synthesis

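Unlike VISinger, this project is simply VITS without MAS (Monotonic Alignment Search) and the DurationPredictor.

As a learning project, this is as far as it goes: pitch prediction is the part that still needs improvement.

Pitch and duration prediction will be developed as add-ons.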

Training steps

1 Download the dataset segments.zip and unzip it

segments
|-- test.txt
|-- train.txt
|-- transcriptions.txt
`-- wavs
    |-- 2001000001.wav
    |-- 2001000002.wav
    |-- 2001000003.wav
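
Before moving on, it can help to confirm that the archive unpacked into the layout shown above. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and the expected entries are taken only from the directory tree above.

```python
from pathlib import Path

# Entries expected directly under the unzipped "segments" directory,
# taken from the directory tree shown above.
EXPECTED_ENTRIES = ("test.txt", "train.txt", "transcriptions.txt", "wavs")


def missing_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


def count_wavs(root):
    """Count the .wav files in `root`/wavs (0 if the folder is missing)."""
    wav_dir = Path(root) / "wavs"
    if not wav_dir.is_dir():
        return 0
    return sum(1 for p in wav_dir.iterdir() if p.suffix.lower() == ".wav")


if __name__ == "__main__":
    missing = missing_entries("segments")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("layout ok,", count_wavs("segments"), "wav files")
```

Run from the directory that contains segments/, the script prints either the missing entries or the number of wav files found.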

2 Convert the sample rate: this project uses 32 kHz

python util/resample.py -w segments/wavs/ -o data_svs/wavs -s 32000
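
One way to sanity-check the resampled output is to read back the header of each file and compare its sample rate with the 32000 passed via -s above. This is a sketch under assumptions: it supposes that util/resample.py writes PCM WAV files that Python's standard wave module can open, and the function name is hypothetical.

```python
import wave
from pathlib import Path

TARGET_RATE = 32000  # the 32 kHz rate passed as "-s 32000" above


def wrong_rate_files(wav_dir, target_rate=TARGET_RATE):
    """Return (file name, sample rate) for every .wav not at `target_rate`."""
    wav_dir = Path(wav_dir)
    if not wav_dir.is_dir():
        return []
    mismatches = []
    for path in sorted(wav_dir.glob("*.wav")):
        # Assumption: files are PCM WAV, which the wave module can read.
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
        if rate != target_rate:
            mismatches.append((path.name, rate))
    return mismatches


if __name__ == "__main__":
    for name, rate in wrong_rate_files("data_svs/wavs"):
        print(f"{name}: {rate} Hz")
```

An empty result means every file in data_svs/wavs reports the target rate.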

3 Generate the data labels

python util/generate_label.py --config configs/singing_base.yaml --data data_svs/ --file segments/transcriptions.txt

This produces data_svs/labels.txt, where each line has the format: wave path|label path|score path|pitch path|slurs path
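
To make the five-field line format concrete, here is a small sketch of how such a line could be parsed. The class and function names are hypothetical, and the field names simply mirror the format description above; the project's own loader may differ.

```python
from typing import NamedTuple


class LabelEntry(NamedTuple):
    """One line of data_svs/labels.txt; field names are illustrative."""
    wave_path: str
    label_path: str
    score_path: str
    pitch_path: str
    slurs_path: str


def parse_label_line(line):
    """Split an 'a|b|c|d|e' line into a LabelEntry, validating the field count."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != len(LabelEntry._fields):
        raise ValueError(f"expected 5 '|'-separated fields, got {len(fields)}")
    return LabelEntry(*fields)


def read_labels(path):
    """Parse every non-empty line of a labels file."""
    with open(path, encoding="utf-8") as handle:
        return [parse_label_line(line) for line in handle if line.strip()]
```

For example, read_labels("data_svs/labels.txt")[0].pitch_path would give the pitch file path of the first entry. Rejecting lines with the wrong field count surfaces a malformed label file early rather than during training.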

4 Split the training index

python util/generate_label.py --file data_svs/labels.txt

This generates filelists/singing_train.txt and filelists/singing_valid.txt

5 Start training

python svs_train.py -c configs/singing_base.yaml -n vits_svs

6 Train the pitch model

python pit_train.py -c configs/singing_base.yaml -n pitch
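
The index-splitting step above turns data_svs/labels.txt into a training list and a validation list. How the project chooses the validation entries is not stated here, so the following is only a sketch of one common approach: a seeded shuffle with a fixed-size hold-out. The function names, valid_count, and seed are all hypothetical.

```python
import random


def split_index(lines, valid_count=10, seed=1234):
    """Shuffle `lines` reproducibly and hold out `valid_count` for validation.

    `valid_count` and `seed` are illustrative defaults, not the project's.
    Returns (train_lines, valid_lines).
    """
    lines = [line for line in lines if line.strip()]
    if not 0 < valid_count < len(lines):
        raise ValueError("valid_count must be between 1 and len(lines) - 1")
    shuffled = lines[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[valid_count:], shuffled[:valid_count]


def write_filelists(labels_path, train_path, valid_path, valid_count=10):
    """Read the labels file and write the train/validation index files."""
    with open(labels_path, encoding="utf-8") as handle:
        train, valid = split_index(handle.read().splitlines(), valid_count)
    for path, rows in ((train_path, train), (valid_path, valid)):
        with open(path, "w", encoding="utf-8") as out:
            out.write("\n".join(rows) + "\n")
```

A fixed seed keeps the split identical between runs, so validation results stay comparable across experiments.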

Inference

0 Export the model

python svs_export.py --config configs/singing_base.yaml --model chkpt/vits_svs/vits_svs_****.pt
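
The export command takes a checkpoint path of the form chkpt/vits_svs/vits_svs_****.pt. If several checkpoints have accumulated, a helper like the one below could pick the newest. This is a sketch resting on an assumption: that the elided **** part is a numeric counter (step or epoch), which this README does not confirm. The function name is hypothetical.

```python
import re
from pathlib import Path

# Assumption: the "****" in vits_svs_****.pt is a numeric step/epoch counter.
_CKPT_PATTERN = re.compile(r"_(\d+)\.pt$")


def latest_checkpoint(chkpt_dir):
    """Return the .pt file with the highest numeric suffix, or None."""
    best_path, best_step = None, -1
    for path in Path(chkpt_dir).glob("*.pt"):
        match = _CKPT_PATTERN.search(path.name)
        # Compare as integers so that 1000 ranks above 900.
        if match and int(match.group(1)) > best_step:
            best_path, best_step = path, int(match.group(1))
    return best_path
```

Calling latest_checkpoint("chkpt/vits_svs") would then return the path to pass to --model, or None when no matching file exists.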

1 Run inference: F0 is generated from the musical score

python svs_infer.py --config configs/singing_base.yaml --model svs_opencpop.pt

2 Synthesize a complete song (using the release model)

python svs_song.py --config configs/singing_base.yaml --model svs_opencpop.pt

Inference using pitch prediction (results are poor)

0 Export the models

python svs_export.py --config configs/singing_base.yaml --model chkpt/vits_svs/vits_svs_****.pt

python pit_export.py --config configs/singing_base.yaml --model chkpt/pitch/pitch_****.pt

1 Run inference

python svs_infer_pitch.py --config configs/singing_base.yaml --model svs_opencpop.pt --pitch pit_opencpop.pt

2 Synthesize a complete song (using the release models)

python svs_song_pitch.py --config configs/singing_base.yaml --model svs_opencpop.pt --pitch pit_opencpop.pt
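
The export and inference commands in this section can be chained from a single script. The sketch below only assembles the command lines shown above and, by default, prints them instead of running them. The function names are hypothetical, and because this README does not say what file names the export scripts write, the exported model paths are passed in explicitly rather than inferred.

```python
import subprocess
import sys


def pitch_pipeline_commands(config, svs_ckpt, pitch_ckpt, svs_model, pitch_model,
                            song=False):
    """Build the argv lists for export + inference, in the order shown above."""
    script = "svs_song_pitch.py" if song else "svs_infer_pitch.py"
    return [
        [sys.executable, "svs_export.py", "--config", config, "--model", svs_ckpt],
        [sys.executable, "pit_export.py", "--config", config, "--model", pitch_ckpt],
        [sys.executable, script, "--config", config,
         "--model", svs_model, "--pitch", pitch_model],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it too unless `dry_run` is set."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure
```

With dry_run=True nothing is executed, which makes it easy to inspect the three commands before committing to a long run.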

Data

https://wenet.org.cn/opencpop/

Singing voice synthesis references

https://github.com/SJTMusicTeam/Muskits

https://github.com/MoonInTheRiver/DiffSinger

VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis

Model design references

https://github.com/NVIDIA/BigVGAN

https://github.com/jaywalnut310/vits

https://github.com/mindslab-ai/univnet

https://github.com/PlayVoice/so-vits-svc-5.0

https://github.com/shivammehta25/Matcha-TTS

RoFormer: Enhanced Transformer with Rotary Position Embedding

Diffusion Pitch

https://github.com/thuhcsi/DiffVar

https://github.com/hayeong0/Diff-HierVC

https://github.com/tonnetonne814/SiFi-VITS2-44100-Ja

Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

Diffusion Pitch of Diff-HierVC
