Compare commits

...

1 Commits

Author SHA1 Message Date
sayakpaul
6bc78b3902 fix: video rendering on svd. 2023-12-26 09:36:16 +05:30

View File

@@ -44,7 +44,7 @@ pipe = StableVideoDiffusionPipeline.from_pretrained(
pipe.enable_model_cpu_offload() pipe.enable_model_cpu_offload()
# Load the conditioning image # Load the conditioning image
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png?download=true") image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576)) image = image.resize((1024, 576))
generator = torch.manual_seed(42) generator = torch.manual_seed(42)
@@ -58,6 +58,11 @@ export_to_video(frames, "generated.mp4", fps=7)
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket_generated.mp4" type="video/mp4" /> <source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket_generated.mp4" type="video/mp4" />
</video> </video>
| **Source Image** | **Video** |
|:------------:|:-----:|
| ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png) | ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/output_rocket.gif) |
<Tip> <Tip>
Since generating videos is more memory intensive we can use the `decode_chunk_size` argument to control how many frames are decoded at once. This will reduce the memory usage. It's recommended to tweak this value based on your GPU memory. Since generating videos is more memory intensive we can use the `decode_chunk_size` argument to control how many frames are decoded at once. This will reduce the memory usage. It's recommended to tweak this value based on your GPU memory.
Setting `decode_chunk_size=1` will decode one frame at a time and will use the least amount of memory but the video might have some flickering. Setting `decode_chunk_size=1` will decode one frame at a time and will use the least amount of memory but the video might have some flickering.
@@ -120,7 +125,7 @@ pipe = StableVideoDiffusionPipeline.from_pretrained(
pipe.enable_model_cpu_offload() pipe.enable_model_cpu_offload()
# Load the conditioning image # Load the conditioning image
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png?download=true") image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576)) image = image.resize((1024, 576))
generator = torch.manual_seed(42) generator = torch.manual_seed(42)
@@ -128,7 +133,5 @@ frames = pipe(image, decode_chunk_size=8, generator=generator, motion_bucket_id=
export_to_video(frames, "generated.mp4", fps=7) export_to_video(frames, "generated.mp4", fps=7)
``` ```
<video width="1024" height="576" controls> ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/output_rocket_with_conditions.gif)
<source src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket_generated_motion.mp4" type="video/mp4">
</video>