Vietnamese Voice Conversion

Overview

This thesis develops a voice conversion model for Vietnamese based on the Phoneme Hallucinator model, with two adaptations: (1) a Text2SSL module is added to supply more context information before the kNN algorithm is applied, and (2) to create a more diverse dataset, we apply the spectrogram-resize (SR) data augmentation idea from the FreeVC model, which distorts speaker information without changing content information, to generate more "speakers".
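The two ideas named above, kNN matching over speech features and spectrogram-resize augmentation, can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names `knn_convert` and `spectrogram_resize`, the use of cosine similarity, the default `k=4`, and the padding rule (repeating the top frequency bin) are all assumptions made for illustration.

```python
import numpy as np


def knn_convert(source_feats, target_feats, k=4):
    """Replace each source frame with the mean of its k nearest target frames.

    source_feats: (n_src, dim) array of frame-level features (hypothetical).
    target_feats: (n_tgt, dim) array of features from the target speaker.
    Cosine similarity is an assumed choice of similarity measure.
    """
    eps = 1e-8
    src = source_feats / np.maximum(
        np.linalg.norm(source_feats, axis=1, keepdims=True), eps)
    tgt = target_feats / np.maximum(
        np.linalg.norm(target_feats, axis=1, keepdims=True), eps)
    sim = src @ tgt.T  # (n_src, n_tgt) cosine similarities
    # Indices of the k most similar target frames for every source frame.
    idx = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    return target_feats[idx].mean(axis=1)


def spectrogram_resize(mel, ratio):
    """Stretch or squeeze a mel-spectrogram along the frequency axis.

    mel: (n_mels, n_frames) array. The output has the same shape.
    ratio > 1 stretches and then cuts the extra top bins; ratio < 1 squeezes
    and pads the top by repeating the highest bin (a simplification).
    """
    n_mels, n_frames = mel.shape
    new_bins = max(1, int(round(n_mels * ratio)))
    old_pos = np.arange(n_mels)
    new_pos = np.linspace(0, n_mels - 1, new_bins)
    # Linear interpolation along frequency, one frame (column) at a time.
    resized = np.stack(
        [np.interp(new_pos, old_pos, mel[:, t]) for t in range(n_frames)],
        axis=1)
    if new_bins >= n_mels:
        return resized[:n_mels]
    pad = np.repeat(resized[-1:], n_mels - new_bins, axis=0)
    return np.concatenate([resized, pad], axis=0)
```

In this sketch, calling `spectrogram_resize` with several ratios on the same utterance would yield several frequency-warped variants that keep the frame timing, and therefore the content, unchanged.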

The proposed model

Comparing different methods

This section compares the baseline model and the proposed model.

Source Target Baseline Model Proposed Model
[trangntt] Female to Female Conversion
[trangntt] Male to Female Conversion
[nguyenlm] Male to Male Conversion
[nguyenlm] Female to Male Conversion
[thanhpv] Male to Male Conversion
[thanhpv] Female to Male Conversion
Le Minh Nguyen (nguyenlm)
Research Engineer

A software engineer who loves NLP and speech technology.
