Differences from the wav2lip source code and model
Great work. The production efficiency in particular is impressively high, reaching real-time performance.
I would like to ask: from an engineering standpoint, how does this differ from the wav2lip source code and model? From the paper's description, the image reconstruction part appears to be the same as in wav2lip.
Did you train new models on additional data on top of wav2lip, and then fine-tune on the specific person afterwards?
Thanks
Hi @xingruispace,
First of all, thank you for your interest in our demo!
We've received similar questions before about whether we succeeded in training Wav2Lip with data from a single person.
To clarify: we did not start from the Wav2Lip code. We implemented our model from scratch in PyTorch Lightning.
As for the data, we recorded single-person video and trained from scratch, without any pre-training on LRS2, the dataset used by Wav2Lip.
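For a concrete picture of what "from scratch in PyTorch Lightning" can look like, here is a minimal sketch of a Lightning module for a Wav2Lip-style audio-to-lip reconstruction model. This is not the authors' actual code: the module names, the 96x96 crop size, and the plain L1 reconstruction loss are illustrative assumptions.

```python
import pytorch_lightning as pl
import torch
import torch.nn as nn


class LipSyncModule(pl.LightningModule):
    """Hypothetical sketch of a Wav2Lip-style lip-sync model wrapped
    in PyTorch Lightning. All submodules and loss choices here are
    placeholders, not the demo authors' implementation."""

    def __init__(self, lr: float = 1e-4):
        super().__init__()
        self.save_hyperparameters()
        # Placeholder encoders/decoder; a real model would use
        # convolutional audio and face encoders as in the Wav2Lip paper.
        self.audio_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(256))
        self.face_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(256))
        self.decoder = nn.LazyLinear(3 * 96 * 96)  # assumed 96x96 RGB mouth crop
        self.recon_loss = nn.L1Loss()

    def forward(self, mel, face):
        # Fuse audio and face features, then decode a mouth-region image.
        z = self.audio_encoder(mel) + self.face_encoder(face)
        return self.decoder(z).view(-1, 3, 96, 96)

    def training_step(self, batch, batch_idx):
        # Batches come from the recorded single-person video,
        # with no LRS2 pre-training involved.
        mel, face, target = batch
        pred = self(mel, face)
        loss = self.recon_loss(pred, target)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
```

Training would then run through the standard Lightning loop, e.g. `pl.Trainer(max_epochs=...).fit(model, train_dataloader)`, with the dataloader yielding `(mel, face, target)` triples cut from the single-person recording.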
You can find the details of our model, and how it differs from the wav2lip paper, in our model card: Is your demo made with Wav2Lip?.
Have a nice one.
Thank you very much for the detailed answer and for sharing your experience with the differences between the models. Best wishes for your studies and work!
@xingruispace
Thank you! Thanks!
Wish you good luck, too :)