CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
Paper • 2412.06782 • Published • 6
digital human, human pose, audio to face, text to motion, talking head, 3d human reconstruction, etc.