Hunyuan Video Lora - AnimeStills

Prompt
an anime illustration of kitsune, girl, blue eyes, braided hair, multicoloured hair, brown hair, pink hair, brown fox ears, brown fox tail, fantasy school uniform, open shoulders, masterpiece, best quality, with professional photography composition, dynamic lighting, well-balanced color and contrast, clear separation of subject and background, detailed, and storytelling.

EXPERIMENTAL: The model generates noisy, low-resolution, illustration-like images. Its natural-language understanding and composition make it useful for guiding more refined models such as SDXL, but treat the results with caution if you plan to use them directly. Results may also look like 'old-time anime' because of the dataset used.

An experimental model that uses HunyuanVideo as an image generator. It outputs images at 768 resolution.

In a typical HunyuanVideo workflow, set 'frame' to 1 and add this lora to get an anime illustration-like output.

Trigger words

Start your prompt with 'an anime illustration of' to trigger the image generation.
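As a small illustration of the trigger phrase, the sketch below prefixes a description with it when it is missing. The helper name `build_prompt` is hypothetical and not part of any HunyuanVideo tooling; it only manipulates strings.

```python
# Trigger phrase taken from this model card.
TRIGGER = "an anime illustration of"


def build_prompt(description: str, trigger: str = TRIGGER) -> str:
    """Return the description prefixed with the trigger phrase.

    Hypothetical helper: the phrase is added only if the text does not
    already start with it (compared case-insensitively).
    """
    text = description.strip()
    if text.lower().startswith(trigger.lower()):
        return text
    return f"{trigger} {text}"


if __name__ == "__main__":
    print(build_prompt("kitsune, girl, blue eyes, braided hair"))
```

The resulting string can then be used as the positive prompt in whatever HunyuanVideo workflow you already run.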

Resolutions

Use one of the following resolutions for the best results:

(768, 768)
(672, 864), (864, 672)
(608, 960), (960, 608)
(544, 1088), (1088, 544)
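All of the resolutions above cover roughly the same pixel area as 768 x 768, so choosing one is mostly a matter of aspect ratio. The sketch below picks the listed resolution whose aspect ratio is closest to a desired size. It assumes the pairs are (width, height), which the card does not state; since both orientations are listed, swap the values if your workflow expects the other order. The function name `nearest_resolution` is hypothetical.

```python
import math

# Resolutions listed in this model card.
# Assumption: each pair is (width, height); the card does not say which.
RESOLUTIONS = [
    (768, 768),
    (672, 864), (864, 672),
    (608, 960), (960, 608),
    (544, 1088), (1088, 544),
]


def nearest_resolution(width: int, height: int) -> tuple[int, int]:
    """Return the listed resolution closest in aspect ratio to width x height.

    Ratios are compared in log space so that, for example, 2:1 and 1:2
    are treated as equally far from 1:1.
    """
    target = math.log(width / height)
    return min(
        RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )


if __name__ == "__main__":
    print(nearest_resolution(1200, 800))
    print(nearest_resolution(600, 1200))
```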

Training

The model was trained on a tag-balanced dataset of 2k best pixiv illustrations, at a resolution of 768, for 856 effective epochs (214 epochs * 4 repeats per epoch).

Training took about 3 days on an 8 x H100 cluster. The loss was still decreasing steadily when training ended, so further training could be beneficial.
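For a rough sense of scale, the figures above can be combined as follows. This is back-of-the-envelope arithmetic only: "2k" is treated as exactly 2000 images and "about 3 days" as exactly 72 hours, both of which are approximations, and the variable names are illustrative.

```python
# Figures quoted in this model card.
epochs = 214
repeats_per_epoch = 4
num_gpus = 8

# Approximations (assumptions): "2k" images and "about 3 days".
approx_images = 2000
approx_days = 3

# Effective passes over the dataset.
effective_epochs = epochs * repeats_per_epoch

# Approximate number of image samples seen during training.
approx_samples_seen = approx_images * effective_epochs

# Approximate compute budget in GPU-hours.
approx_gpu_hours = approx_days * 24 * num_gpus

if __name__ == "__main__":
    print(f"effective epochs: {effective_epochs}")
    print(f"samples seen (approx.): {approx_samples_seen}")
    print(f"GPU-hours (approx.): {approx_gpu_hours}")
```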

Download model

Weights for this model are available in Safetensors format.

Download them in the Files & versions tab.

Limitations

Outputs may be deformed, fail to follow the prompt, turn realistic, or contain NSFW content, due to the limited size of the dataset and the limitations of LoRA models.