🐣 Please follow me for new updates https://twitter.com/camenduru
🔥 Please join our Discord server https://discord.gg/k5BwmmvJJU
Potat 1️⃣
First Open-Source 1024x576 Text To Video Model 🥳
https://huggingface.co/vdo/potat1-5000/tree/main
https://huggingface.co/vdo/potat1-10000/tree/main
https://huggingface.co/vdo/potat1-10000-base-text-encoder/tree/main
https://huggingface.co/vdo/potat1-15000/tree/main
https://huggingface.co/vdo/potat1-20000/tree/main
https://huggingface.co/vdo/potat1-25000/tree/main
https://huggingface.co/vdo/potat1-30000/tree/main
https://huggingface.co/vdo/potat1-35000/tree/main
https://huggingface.co/vdo/potat1-40000/tree/main
https://huggingface.co/vdo/potat1-45000/tree/main
https://huggingface.co/vdo/potat1-50000/tree/main
https://huggingface.co/vdo/potat1-50000-base-text-encoder/tree/main = https://huggingface.co/camenduru/potat1 (you are here)
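The checkpoint repositories above follow a regular naming pattern, so a specific training step can be fetched programmatically. This is a minimal sketch, assuming the `huggingface_hub` package is installed. The helper names `checkpoint_repo_id` and `download_checkpoint` are hypothetical, and only the repositories listed above are treated as existing.

```python
# Steps for which a checkpoint repository is listed above (5000 to 50000).
LISTED_STEPS = tuple(range(5000, 50001, 5000))
# Steps that also have a "-base-text-encoder" variant listed above.
BASE_TEXT_ENCODER_STEPS = (10000, 50000)


def checkpoint_repo_id(steps: int, base_text_encoder: bool = False) -> str:
    """Build the repo id of a listed Potat 1 checkpoint (hypothetical helper)."""
    if steps not in LISTED_STEPS:
        raise ValueError(f"no checkpoint listed for {steps} steps")
    if base_text_encoder and steps not in BASE_TEXT_ENCODER_STEPS:
        raise ValueError(f"no base-text-encoder variant listed for {steps} steps")
    suffix = "-base-text-encoder" if base_text_encoder else ""
    return f"vdo/potat1-{steps}{suffix}"


def download_checkpoint(steps: int, base_text_encoder: bool = False) -> str:
    """Download one checkpoint repository and return its local path."""
    # Imported lazily so the naming helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=checkpoint_repo_id(steps, base_text_encoder))
```

For example, `download_checkpoint(50000, base_text_encoder=True)` would fetch `vdo/potat1-50000-base-text-encoder`, which the list above equates with `camenduru/potat1`.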
Info
Prototype Model
Trained with https://lambdalabs.com ❤ 1xA100 (40GB)
2197 clips, 68388 tagged frames (salesforce/blip2-opt-6.7b-coco)
train_steps: 10000
Dataset & Config
https://huggingface.co/camenduru/potat1_dataset/tree/main
Finetuning
https://github.com/Breakthrough/PySceneDetect
https://github.com/ExponentialML/Video-BLIP2-Preprocessor
https://github.com/ExponentialML/Text-To-Video-Finetuning
https://github.com/camenduru/Text-To-Video-Finetuning-colab
Base Model
https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis
https://www.modelscope.cn/models/damo/text-to-video-synthesis
Thanks to damo-vilab ❤ ExponentialML ❤ kabachuha ❤ @DiffusersLib ❤ @LambdaAPI ❤ @cerspense ❤ @CiaraRowles1 ❤ @p1atdev_art ❤
Thanks to Orellius ❤ (important bug report)
Please try it 🐣
https://github.com/camenduru/text-to-video-synthesis-colab
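For a local run outside Colab, a rough sketch with the `diffusers` library might look like the following. It assumes `camenduru/potat1` loads as a standard diffusers text-to-video pipeline and that a CUDA GPU with `accelerate` installed is available. The frame count, step count, and the helper names `generation_kwargs` and `generate` are illustrative assumptions, not settings from this card. Only the 1024x576 resolution comes from the text above.

```python
def generation_kwargs(prompt: str, num_frames: int = 24, steps: int = 25) -> dict:
    """Collect pipeline arguments; only the 1024x576 size comes from the model card."""
    return {
        "prompt": prompt,
        "width": 1024,
        "height": 576,
        "num_frames": num_frames,      # assumed value
        "num_inference_steps": steps,  # assumed value
    }


def generate(prompt: str, output_path: str = "potat1.mp4") -> str:
    """Run the model once and write the result to a video file."""
    # Heavy dependencies are imported lazily so the helper above stays importable.
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    pipe = DiffusionPipeline.from_pretrained(
        "camenduru/potat1", torch_dtype=torch.float16
    )
    pipe.enable_model_cpu_offload()  # moves submodules to the GPU only when needed
    pipe.vae.enable_slicing()        # decodes frames in slices to save memory

    frames = pipe(**generation_kwargs(prompt)).frames
    # Newer diffusers releases return a batch of videos; older ones return
    # the list of frames directly. Unwrap the batch when it is present.
    if getattr(frames[0], "ndim", 3) == 4:
        frames = frames[0]
    return export_to_video(frames, output_path)
```

Calling `generate("a potato rolling down a grassy hill")` would write `potat1.mp4` to the working directory. At this resolution, memory use is high, so the offloading and slicing calls are kept in the sketch.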
Potat 2️⃣ is in the oven ✨