Posts

Vietnamese Voice Conversion

Mar 9, 2024

Overview

This thesis develops a voice conversion model for Vietnamese based on the Phoneme Hallucinator model, with two adaptations: (1) a Text2SSL module is added to obtain more context information before running the kNN algorithm; (2) to create a more diverse dataset, we apply the spectrogram-resize (SR) data augmentation idea from the FreeVC model, which distorts speaker information without changing content information, to generate more “speakers”.
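As a rough illustration of the spectrogram-resize idea, here is a minimal sketch in plain Python. It assumes the mel-spectrogram is a list of frames, each a list of frequency-bin values, and that resizing happens along the frequency axis only. The function names, the linear interpolation, and the padding rule (repeat the highest-frequency bin, with no added noise) are assumptions made for this example, not the thesis code.

```python
def resize_bins(frame, new_len):
    """Linearly interpolate one frame (a list of bin values) to new_len bins."""
    old_len = len(frame)
    if new_len == 1 or old_len == 1:
        return [frame[0]] * new_len
    out = []
    for i in range(new_len):
        # Position of output bin i on the original bin axis.
        pos = i * (old_len - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        out.append(frame[lo] * (1 - frac) + frame[hi] * frac)
    return out


def spectrogram_resize(mel, ratio):
    """Stretch or squeeze the frequency axis by `ratio`, keeping the bin count.

    ratio > 1: the stretched frame is cropped back to the original bin count.
    ratio < 1: the squeezed frame is padded with its highest-frequency value
               (a simplification assumed for this sketch).
    """
    n_bins = len(mel[0])
    new_len = max(1, round(n_bins * ratio))
    augmented = []
    for frame in mel:
        resized = resize_bins(frame, new_len)
        if new_len >= n_bins:
            resized = resized[:n_bins]
        else:
            resized = resized + [resized[-1]] * (n_bins - new_len)
        augmented.append(resized)
    return augmented


# Hypothetical usage: one frame with four bins, squeezed and stretched.
mel = [[0.0, 1.0, 2.0, 3.0]]
squeezed = spectrogram_resize(mel, 0.5)
stretched = spectrogram_resize(mel, 1.5)
```

Shifting energy along the frequency axis changes formant positions and pitch cues, which is why the same utterance can sound like a different speaker while the time structure, and so the content, is unchanged.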

Postnet Layer

Mar 9, 2022

Generally speaking, the postnet layer receives a mel-spectrogram and predicts another mel-spectrogram with additional information. That makes the output mel-spectrogram more detailed, and hence improves the quality of the synthesized audio.
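To make the idea concrete, here is a toy sketch that assumes a Tacotron 2-style postnet: a stack of 1-D convolutions over time predicts a residual, which is added back to the input. A real postnet mixes all mel channels with learned weights; this example processes a single mel bin's track over time, and the kernels and function names are made up for illustration.

```python
import math


def conv1d_same(signal, kernel):
    """Sliding-window weighted sum over time with zero padding.

    This is the cross-correlation form used by deep learning frameworks;
    the output has the same length as the input.
    """
    half = len(kernel) // 2
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + k - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def postnet_residual(track, kernels):
    """Pass the track through stacked conv layers, tanh on all but the last."""
    hidden = track
    for i, kernel in enumerate(kernels):
        hidden = conv1d_same(hidden, kernel)
        if i < len(kernels) - 1:
            hidden = [math.tanh(v) for v in hidden]
    return hidden


def refine(track, kernels):
    """Return the input plus the predicted residual."""
    residual = postnet_residual(track, kernels)
    return [m + r for m, r in zip(track, residual)]


# Hypothetical usage: a short track and two hand-picked 3-tap kernels.
coarse = [1.0, 2.0, 3.0]
refined = refine(coarse, [[0.1, 0.2, 0.1], [0.0, 0.5, 0.0]])
```

Because the network only has to predict a correction, all-zero kernels leave the input unchanged, so the layers can focus on the fine detail that the coarse prediction misses.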

KNN-VC vs Phoneme Hallucinator [23/03/2024] ?

Mar 9, 2022

Overview

Comparing different methods

This section compares Phoneme Hallucinator and Phoneme Hallucinator + Text2SSL.

Source	Target	Phoneme Hallucinator	Phoneme Hallucinator + Text2SSL

KNN-VC vs Phoneme Hallucinator [09/03/2024] ?

Mar 9, 2022

Overview

Comparing different methods

This section compares kNN-VC and Phoneme Hallucinator.

Source	Target	kNN-VC	Phoneme Hallucinator

How to kill zombie processes using GPU?

Mar 9, 2022

The trick for killing zombie processes using GPU in Linux 😃.

Fix "[Errno 32] Broken pipe" in Python

Mar 9, 2022

One day, I tried to run a Python script that used multiprocessing, and after a while the program crashed with the [Errno 32] Broken pipe error…