add hf links
README.md
CHANGED
@@ -25,7 +25,7 @@ We train VidTok on a large-scale video dataset and evaluation reveal that VidTok
 Resources and technical documentation:
 
 + [GitHub](https://github.com/microsoft/VidTok)
-+ [arXiv](https://arxiv.org/abs)
++ [arXiv](https://arxiv.org/abs/)
 
 
 ## Model Performance
@@ -34,18 +34,18 @@ The following table shows model performance evaluated on 30 test videos in [MCL_
 
 | Model | Regularizer | Causal | VCR | PSNR | SSIM | LPIPS | FVD |
 |------|------|------|------|------|------|------|------|
-| [
-| [
-| [
-| [
-| [
-| [
-| [
-| [
-| [
-| [
-| [
-| [
+| [vidtok_kl_causal_488_4chn](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_kl_causal_488_4chn.ckpt) | KL-4chn | ✔️ | 4x8x8 | 29.64 | 0.852| 0.114| 194.2|
+| [vidtok_kl_causal_488_8chn](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_kl_causal_488_8chn.ckpt) | KL-8chn | ✔️ |4x8x8 | 31.83 | 0.897| 0.083| 109.3|
+| [vidtok_kl_causal_488_16chn](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_kl_causal_488_16chn.ckpt) | KL-16chn | ✔️ | 4x8x8 | 35.04 |0.942 |0.047 | 78.9|
+| [vidtok_kl_causal_41616_4chn](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_kl_causal_41616_4chn.ckpt) | KL-4chn | ✔️ | 4x16x16 | 25.05 | 0.711| 0.228| 549.1| |
+| [vidtok_kl_noncausal_488_4chn](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_kl_noncausal_488_4chn.ckpt) | KL-4chn | ✖️ | 4x8x8 | 30.60 | 0.876 | 0.098| 157.9|
+| [vidtok_kl_noncausal_41616_4chn](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_kl_noncausal_41616_4chn.ckpt) | KL-4chn | ✖️ | 4x16x16 | 26.06 | 0.751 | 0.190|423.2 |
+| [vidtok_fsq_causal_488_262144](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_fsq_causal_488_262144.ckpt) | FSQ-262,144 | ✔️ | 4x8x8 | 29.82 | 0.867 |0.106 | 160.1|
+| [vidtok_fsq_causal_488_32768](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_fsq_causal_488_32768.ckpt) | FSQ-32,768 | ✔️ | 4x8x8 | 29.16 | 0.854 | 0.117| 196.9|
+| [vidtok_fsq_causal_488_4096](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_fsq_causal_488_4096.ckpt) | FSQ-4096 | ✔️ | 4x8x8 | 28.36 | 0.832 | 0.133| 218.1|
+| [vidtok_fsq_causal_41616_262144](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_fsq_causal_41616_262144.ckpt) | FSQ-262,144 | ✔️ | 4x16x16 | 25.38 | 0.738 |0.206 | 430.1|
+| [vidtok_fsq_noncausal_488_262144](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_fsq_noncausal_488_262144.ckpt) | FSQ-262,144 | ✖️ | 4x8x8 | 30.78 | 0.889| 0.091| 132.1|
+| [vidtok_fsq_noncausal_41616_262144](https://huggingface.co/microsoft/VidTok/blob/main/checkpoints/vidtok_fsq_noncausal_41616_262144.ckpt) | FSQ-262,144 | ✖️ | 4x16x16 | 26.37 | 0.772| 0.171| 357.0|
 
 ## Training
 ### Training Data
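
The checkpoint links added in this change appear to follow one naming pattern: `vidtok_<regularizer>_<causal|noncausal>_<compression code>_<size>`, where the fields line up with the Regularizer, Causal, and VCR columns of the table. The sketch below is a reviewer's illustration of that pattern, not code from the VidTok repository. The `CheckpointInfo` class, the `parse_checkpoint_name` function, and the `_COMPRESSION` lookup are hypothetical names, and the mapping of `488` to `4x8x8` and `41616` to `4x16x16` is inferred only from the twelve rows in the diff.

```python
from dataclasses import dataclass

# URL prefix copied from the links added in the diff.
BASE_URL = "https://huggingface.co/microsoft/VidTok/blob/main/checkpoints"

# Assumption: only these two compression codes occur (inferred from the table).
_COMPRESSION = {"488": "4x8x8", "41616": "4x16x16"}


@dataclass(frozen=True)
class CheckpointInfo:
    """Hypothetical container for the fields encoded in a checkpoint name."""
    regularizer: str  # "kl" or "fsq"
    causal: bool
    compression: str  # value of the VCR column, e.g. "4x8x8"
    size: str         # "4chn"-style channel count (KL) or codebook size (FSQ)
    url: str


def parse_checkpoint_name(name: str) -> CheckpointInfo:
    """Split a name such as 'vidtok_kl_causal_488_4chn' into its fields."""
    # Unpacking raises ValueError if the name does not have exactly five parts.
    prefix, regularizer, causality, code, size = name.split("_")
    if prefix != "vidtok":
        raise ValueError(f"unexpected prefix: {prefix!r}")
    if regularizer not in ("kl", "fsq"):
        raise ValueError(f"unexpected regularizer: {regularizer!r}")
    if causality not in ("causal", "noncausal"):
        raise ValueError(f"unexpected causality flag: {causality!r}")
    if code not in _COMPRESSION:
        raise ValueError(f"unexpected compression code: {code!r}")
    return CheckpointInfo(
        regularizer=regularizer,
        causal=(causality == "causal"),
        compression=_COMPRESSION[code],
        size=size,
        url=f"{BASE_URL}/{name}.ckpt",
    )


if __name__ == "__main__":
    info = parse_checkpoint_name("vidtok_fsq_noncausal_41616_262144")
    print(info.compression, info.causal, info.url)
```

A parser like this could be used to check that each row's link text, URL, and column values agree, which is the kind of mismatch that is easy to introduce when editing twelve similar rows by hand.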