Retrieval-based-Voice-Conversion

An easy-to-use Voice Conversion framework based on VITS.

Note

This project is currently under development. It is provided as a library and an API in the rvc package.

Installation and usage

Standard Setup

First, create a working directory for your project and run the command below inside it. The assets folder holds the models needed for inference and training, and the result folder holds the results of training.

rvc init

This creates an assets folder and a .env file in your working directory.

Warning

The working directory should be empty, or at least must not already contain an assets folder.

Custom Setup

If you have already downloaded the models or want to change the default configuration, edit the .env file. If you do not have a .env file yet, create one with:

rvc env create

To download the models, run

rvc dlmodel

or, to download them into a specific directory,

rvc dlmodel {download_dir}

Finally, specify the location of the models in the .env file, and you are done!
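After editing the .env file by hand, a quick sanity check can catch entries that point nowhere. The sketch below is a minimal, hedged example: it parses plain KEY=VALUE lines and reports keys whose value is not an existing path. It assumes every entry is meant to be a path, which may not hold for your file, and the helper names `read_env` and `missing_paths` are illustrative and not part of rvc.

```python
from pathlib import Path


def read_env(env_path):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    entries = {}
    for raw in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip whitespace and one layer of surrounding quotes.
        entries[key.strip()] = value.strip().strip("'\"")
    return entries


def missing_paths(entries, base_dir="."):
    """Return the keys whose value is empty or not an existing file/directory.

    Assumption: every entry is a path, relative to base_dir or absolute.
    """
    base = Path(base_dir)
    return sorted(
        key for key, value in entries.items()
        if not value or not (base / value).exists()
    )
```

Calling `missing_paths(read_env(".env"))` from the working directory returns an empty list when every entry resolves to something on disk.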

Library Usage

Inference Audio

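from pathlib import Path

from dotenv import load_dotenv
from scipy.io import wavfile

from rvc.modules.vc.modules import VC


def main():
    vc = VC()
    # Load the voice model; "{model.pth}" is a placeholder for your model file.
    vc.get_vc("{model.pth}")
    # Convert the input audio; the first argument is the speaker/singer ID.
    tgt_sr, audio_opt, times, _ = vc.vc_inference(
        1, Path("{InputAudio}")
    )
    # Write the converted audio at the target sample rate.
    wavfile.write("{OutputAudio}", tgt_sr, audio_opt)


if __name__ == "__main__":
    # Load the settings from the .env file before running inference.
    load_dotenv("{envPath}")
    main()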
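
The same call can be applied to a whole folder of inputs. The following is a sketch under stated assumptions, not part of the library: `plan_outputs` and `convert_folder` are hypothetical helpers, and the inference and writing steps are passed in as callables so the helper itself does not depend on rvc or scipy.

```python
from pathlib import Path


def plan_outputs(input_dir, output_dir, pattern="*.wav"):
    """Pair every matching input file with an output path of the same name."""
    out_root = Path(output_dir)
    return [
        (src, out_root / src.name)
        for src in sorted(Path(input_dir).glob(pattern))
    ]


def convert_folder(infer, write, input_dir, output_dir):
    """Run `infer` on each input file and hand the result to `write`.

    infer(path) must return (sample_rate, audio);
    write(path, sample_rate, audio) must save the audio.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for src, dst in plan_outputs(input_dir, output_dir):
        sample_rate, audio = infer(src)
        write(dst, sample_rate, audio)
        written.append(dst)
    return written
```

With the objects from the example above, `infer` could be `lambda p: vc.vc_inference(1, p)[:2]` and `write` could be `lambda p, sr, audio: wavfile.write(str(p), sr, audio)`, assuming `vc_inference` keeps returning the sample rate and audio as its first two values.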

CLI Usage

Inference Audio

rvc infer -m {model.pth} -i {input.wav} -o {output.wav}

| option | flag | type | default value | description |
|---|---|---|---|---|
| modelPath | -m | Path | *required | Model path or filename (read from the directory set in env) |
| inputPath | -i | Path | *required | Input audio path or folder |
| outputPath | -o | Path | *required | Output audio path or folder |
| sid | -s | int | 0 | Speaker/Singer ID |
| f0_up_key | -fu | int | 0 | Transpose (integer number of semitones; raise by an octave: 12, lower by an octave: -12) |
| f0_method | -fm | str | rmvpe | Pitch extraction algorithm (pm, harvest, crepe, rmvpe) |
| f0_file | -ff | Path \| None | None | F0 curve file (optional). One pitch per line. Replaces the default F0 and pitch modulation |
| index_file | -if | Path \| None | None | Path to the feature index file |
| index_rate | -ir | float | 0.75 | Search feature ratio (controls accent strength; too high a value causes artifacts) |
| filter_radius | -fr | int | 3 | If >=3, apply median filtering to the harvested pitch results. The value is the filter radius; filtering can reduce breathiness |
| resample_sr | -rsr | int | 0 | Resample the output audio in post-processing to the final sample rate. Set to 0 for no resampling |
| rms_mix_rate | -rmr | float | 0.25 | Adjusts the volume envelope scaling. The closer to 0, the more the output mimics the volume of the original vocals; relatively low values can help mask noise and make the volume sound more natural. The closer to 1, the more consistently loud the output |
| protect | -p | float | 0.33 | Protect voiceless consonants and breath sounds to prevent artifacts such as tearing in electronic music. Set to 0.5 to disable. Decrease the value to increase protection, but it may reduce indexing accuracy |
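
To make the transpose option concrete: in equal temperament, each semitone multiplies the pitch by 2^(1/12), so 12 semitones double it and -12 halve it. The sketch below shows that arithmetic and assembles an `rvc infer` argument list using only flags from the table. The helper names are hypothetical, and the command is printed rather than executed.

```python
import shlex


def semitone_ratio(f0_up_key):
    """Frequency ratio for a transpose of f0_up_key semitones."""
    return 2.0 ** (f0_up_key / 12.0)


def build_infer_command(model, input_path, output_path,
                        f0_up_key=0, f0_method="rmvpe"):
    """Assemble an `rvc infer` argument list from flags in the table."""
    return [
        "rvc", "infer",
        "-m", str(model),
        "-i", str(input_path),
        "-o", str(output_path),
        "-fu", str(f0_up_key),
        "-fm", f0_method,
    ]


if __name__ == "__main__":
    cmd = build_infer_command("model.pth", "input.wav", "output.wav", f0_up_key=12)
    print(shlex.join(cmd))
    print(semitone_ratio(12))   # 2.0: one octave up doubles the pitch
    print(semitone_ratio(-12))  # 0.5: one octave down halves it
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` once rvc is installed and the paths are real.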

API Usage

First, start up the server.

rvc-api

or

poetry run poe rvc-api

Inference Audio

Get as blob

curl -X 'POST' \
      'http://127.0.0.1:8000/inference?res_type=blob' \
      -H 'accept: application/json' \
      -H 'Content-Type: multipart/form-data' \
      -F 'modelpath={model.pth}' \
      -F 'input={input audio path}'

Get as JSON (includes time)

curl -X 'POST' \
      'http://127.0.0.1:8000/inference?res_type=json' \
      -H 'accept: application/json' \
      -H 'Content-Type: multipart/form-data' \
      -F 'modelpath={model.pth}' \
      -F 'input={input audio path}'
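
The same requests can be issued from Python. The sketch below uses only the standard library and mirrors the curl examples: the two form fields are sent as multipart/form-data and `res_type` goes in the query string. The function names are hypothetical, and nothing is assumed about the response layout, so the raw bytes are returned as-is.

```python
import urllib.parse
import urllib.request
import uuid


def build_inference_request(model, input_path, res_type="blob",
                            base_url="http://127.0.0.1:8000"):
    """Build (but do not send) the multipart POST from the curl examples."""
    boundary = uuid.uuid4().hex
    fields = {"modelpath": model, "input": input_path}
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")
    query = urllib.parse.urlencode({"res_type": res_type})
    return urllib.request.Request(
        f"{base_url}/inference?{query}",
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send_inference_request(request):
    """Send the request to a running rvc-api server and return the raw body."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

For example, `send_inference_request(build_inference_request("{model.pth}", "{input audio path}", res_type="json"))` corresponds to the second curl command and needs the server to be running.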

Docker Usage

Build and run via script:

./docker-run.sh

Or build and run manually:

Build:
```
docker build -t "rvc" .
```

Run:

docker run -it \
  -p 8000:8000 \
  -v "${PWD}/assets/weights:/weights:ro" \
  -v "${PWD}/assets/indices:/indices:ro" \
  -v "${PWD}/assets/audios:/audios:ro" \
  "rvc"

Note that this assumes the weights, indices, and input audio files are stored in the assets folder of the current directory.
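
Since the run command mounts three subfolders of assets, it can help to confirm they exist before starting the container. This is a minimal sketch; `missing_asset_dirs` is a hypothetical helper, and the folder names are taken from the -v options above.

```python
from pathlib import Path

# Subfolders of ./assets mounted by the docker run command.
EXPECTED_SUBDIRS = ("weights", "indices", "audios")


def missing_asset_dirs(project_dir="."):
    """Return the expected assets subfolders that are not present."""
    assets = Path(project_dir) / "assets"
    return [name for name in EXPECTED_SUBDIRS if not (assets / name).is_dir()]


if __name__ == "__main__":
    missing = missing_asset_dirs()
    if missing:
        print("Missing under assets:", ", ".join(missing))
    else:
        print("All expected assets folders are present.")
```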
