Mirror of https://github.com/huggingface/diffusers.git, synced 2025-12-07 13:04:15 +08:00.
Compare commits
Comparing fix_indent...controlnet: 321 commits.
Commit SHAs:

47ee2a737a, e0e9f81971, 5d848ec07c, 4974b84564, 83062fb872, b6d7e31d10, 94fc2d3fe6,
503e359204, 53e9aacc10, 41424466e3, 95de1981c9, 0b45b58867, d3986f18be, ee6a3a993d,
b300517305, ac07b6dc6a, 46ab56a468, 038ff70023, 00eca4b887, 30132aba30, a17d6d6858,
8efd9ce787, 299c16d0f5, 69f49195ac, ed224f94ba, 531e719163, 4fbd310fd2, 2ea28d69dc,
a1cb106459, 5dd8e04d4b, 165af7edd3, 6c5f0de713, e64fdcf2ce, ec64f371b1, cd6e1f1171,
6f2b310a17, e3cd6cae50, e5ee05da76, e6ff752840, 3f9c746fb2, 1f22c98820, b4226bd6a7,
46fac824be, b33b64f595, 9d9744075e, d9a3b69806, f7e5954d5e, 8e19c073e5, f6df16cbb8,
b24f78349c, 3ce905c9d0, f539497ab4, 39dfb7abbd, 196835695e, 0d4dfbbd0a, ada3bb941b,
b5814c5555, 9940573618, 59433ca1ae, 534f5d54fa, 40aa47b998, 1bc0d37ffe, eb942b866a,
687bc27727, 6246c70d21, 577b8a2783, 13f0c8b219, fa1bdce3d4, ca6cdc77a9, f4977abcd8,
df8559a7f9, 8f206a5873, 8da360aa12, 869bad3e52, 01ee0978cc, 56b68459f5, 2ca264244b,
b9e1c30d0e, 03cd62520f, 001b14023e, f55873b783, ccb93dcad1, ec953047bc, 9a2600ede9,
5f150c4cef, 66f8bd6869, 64a8cd627a, 5d3923b670, 9451235e5a, c2b6ac4e34, 06b01ea87e,
f4fc75035f, 8f2d13c684, fcfa270fbd, 56dac1cedc, 3daebe2b44, abd922bd0c, fa633ed6de,
2cad1a8465, e6cf21906d, 7db935a141, fa9bc029b4, 2e31a759b5, e51862bbed, 8492db2332,
f57e7bd92c, 3e3d46924b, d71ecad8cd, ac49f97a75, 04bafcbbc2, 7081a25618, 848f9fe6ce,
8a692739c0, 5aa31bd674, 88aa7f6ebf, ad310af0d6, d603ccb614, fd0f469568, ae84e405a3,
3a66113306, 7f16187182, f11b922b4f, 3dd4168d4c, 1c47d1fc05, bbf70c8739, 738c986957,
c09bb588d3, 66a7160f9d, f05ee56b2f, 34cc7f9b98, 53605ed00a, bb1b76d3bf, e4b8f173b9,
f0216b7756, d5f444de4b, 5a54dc9e95, 6fedbd850a, 1b3cfb1b10, af13a90ebd, 3067da1261,
6bceaea3fe, baf9924be7, d8d208acde, e0f33dfca4, 15b125bb0e, 12004bf3a7, d2fc5ebb95,
779eef95b4, d5b8d1ca04, eba7e7a6d7, 31de879fb4, 07349c25fe, 8974c50bff, c18058b405,
2938d5a672, d4ade821cd, 3a7e481611, d649d6c6f3, 777063e1bf, 104afbce84, c0f5346a20,
087daee2f0, 7e164d98a8, e6d1728e0a, 8f2c7b4df0, 2e387dad5f, 9efe1e52c3, 37b09517b9,
4343ce2c8e, 0ca7b68198, 3cf4f9c735, 40dd9cb2bd, 30bcda7de6, 9ea62d119a, a326d61118,
e7696e20f9, 4b89aeffe1, 0a1daadef8, 371f765908, 75aee39eac, 215e6804d3, 9254d1f39a,
e1bdcc7af3, 84905ca728, 6f336650c3, 06a042cd0e, 8772496586, 35fd84be27, f2756253e6,
0071478d9e, 7c8cab313e, ca9ed5e8d1, 98b6bee1a1, ab7113487c, 59c307f1d5, 159885adc6,
7337eea59b, f07899a57c, a83cc0c0bc, db5194a45d, e6c9c2513f, d643b6691f, f5c9be3a0a,
1824d0050e, 30e5e81d58, 8de78001df, 3ac2357794, 17808a091e, 491a933a1b, aa82df52e7,
a11b0f83b7, 1835510524, 4a3d52850b, 97d004b9b4, 76696dca55, 17612de451, 994360f7a5,
e6a48db633, 4f1df69d1a, 15f6b22466, e6fd9ada3a, 493228a708, 8bf046b7fb, bb99623d09,
fdf55b1f1c, c6f8c310c3, 64909f17b7, f09ca909c8, a5fc62f819, fbdf26bac5, 13001ee315,
65329aed98, 02338c9317, 15ed53d272, 9cc59ba089, adcbe674a4, ec9840a5db, 093a03a1a1,
c3369f5673, 04cd6adf8c, 66722dbea7, 2e8d18e699, 03373de0db, 56bea6b4a1, d7dc0ffd79,
97ee616971, 0fc62d1702, f4d3f913f4, 1cab64b3be, 8d7dc85312, 87a92f779c, 0db766ba77,
8e94663503, b09b90e24c, 058b47553e, 7f58a76f48, 09b7bfce91, 5d8b1987ec, acd1962769,
5b1b80a5b6, 8581d9bce4, c101066227, d4c7ab7bf1, ea9dc3fa90, b4220e97b1, dc85b578c2,
0d927c7542, 5b93338235, 7c1c705f60, 9e72016468, 3e9716f22b, 87bfbc320d, a517f665a4,
16748d1eba, c9081a8abd, 0eb68d9ddb, 9941b3f124, 16b9f98b48, fee93c81eb, 5308cce994,
318556b20e, 6620eda357, 1f0705adcf, 5e96333cb2, da95a28ff6, d66d554dc2, c7df846dec,
8e7bbfbe5a, e2773c6255, ac61eefc9f, f95615b823, a9288b49c9, c54419658b, 6382663dc8,
58b8dce129, a65ca8a059, 5ca062e011, 619e3ab6f6, 9e2804f720, 9112028ed8, dce06680d2,
dd63168319, 1040dfd9cc, 49a4b377c1, dff35a86e4, 8842bcadb9, 181280baba, 53f498d2a4,
990860911f, 23eed39702, fefed44543, 814f56d2fe, 96d6e16550, c11de13588, 357855f8fc,
f825221b5d, 119d734f6e, cb4b3f0b78, 3d574b3bbe, 09903774d9, d6a70d8ba8
.github/ISSUE_TEMPLATE/bug-report.yml (vendored, 38 changes)

@@ -66,32 +66,32 @@ body:
       Questions on DiffusionPipeline (Saving, Loading, From pretrained, ...):

       Questions on pipelines:
-      - Stable Diffusion @yiyixuxu @DN6 @sayakpaul @patrickvonplaten
-      - Stable Diffusion XL @yiyixuxu @sayakpaul @DN6 @patrickvonplaten
-      - Kandinsky @yiyixuxu @patrickvonplaten
-      - ControlNet @sayakpaul @yiyixuxu @DN6 @patrickvonplaten
-      - T2I Adapter @sayakpaul @yiyixuxu @DN6 @patrickvonplaten
-      - IF @DN6 @patrickvonplaten
-      - Text-to-Video / Video-to-Video @DN6 @sayakpaul @patrickvonplaten
-      - Wuerstchen @DN6 @patrickvonplaten
+      - Stable Diffusion @yiyixuxu @DN6 @sayakpaul
+      - Stable Diffusion XL @yiyixuxu @sayakpaul @DN6
+      - Kandinsky @yiyixuxu
+      - ControlNet @sayakpaul @yiyixuxu @DN6
+      - T2I Adapter @sayakpaul @yiyixuxu @DN6
+      - IF @DN6
+      - Text-to-Video / Video-to-Video @DN6 @sayakpaul
+      - Wuerstchen @DN6
       - Other: @yiyixuxu @DN6

       Questions on models:
-      - UNet @DN6 @yiyixuxu @sayakpaul @patrickvonplaten
-      - VAE @sayakpaul @DN6 @yiyixuxu @patrickvonplaten
-      - Transformers/Attention @DN6 @yiyixuxu @sayakpaul @DN6 @patrickvonplaten
+      - UNet @DN6 @yiyixuxu @sayakpaul
+      - VAE @sayakpaul @DN6 @yiyixuxu
+      - Transformers/Attention @DN6 @yiyixuxu @sayakpaul @DN6

-      Questions on Schedulers: @yiyixuxu @patrickvonplaten
+      Questions on Schedulers: @yiyixuxu

-      Questions on LoRA: @sayakpaul @patrickvonplaten
+      Questions on LoRA: @sayakpaul

-      Questions on Textual Inversion: @sayakpaul @patrickvonplaten
+      Questions on Textual Inversion: @sayakpaul

       Questions on Training:
-      - DreamBooth @sayakpaul @patrickvonplaten
-      - Text-to-Image Fine-tuning @sayakpaul @patrickvonplaten
-      - Textual Inversion @sayakpaul @patrickvonplaten
-      - ControlNet @sayakpaul @patrickvonplaten
+      - DreamBooth @sayakpaul
+      - Text-to-Image Fine-tuning @sayakpaul
+      - Textual Inversion @sayakpaul
+      - ControlNet @sayakpaul

       Questions on Tests: @DN6 @sayakpaul @yiyixuxu

@@ -99,7 +99,7 @@ body:

       Questions on JAX- and MPS-related things: @pcuenca

-      Questions on audio pipelines: @DN6 @patrickvonplaten
+      Questions on audio pipelines: @DN6

.github/PULL_REQUEST_TEMPLATE.md (vendored, 10 changes)

@@ -38,13 +38,13 @@ members/contributors who may be interested in your PR.

 Core library:

-- Schedulers: @yiyixuxu and @patrickvonplaten
-- Pipelines: @patrickvonplaten and @sayakpaul
-- Training examples: @sayakpaul and @patrickvonplaten
-- Docs: @stevhliu and @yiyixuxu
+- Schedulers: @yiyixuxu
+- Pipelines: @sayakpaul @yiyixuxu @DN6
+- Training examples: @sayakpaul
+- Docs: @stevhliu and @sayakpaul
 - JAX and MPS: @pcuenca
 - Audio: @sanchit-gandhi
-- General functionalities: @patrickvonplaten and @sayakpaul
+- General functionalities: @sayakpaul @yiyixuxu @DN6

 Integrations:

.github/workflows/benchmark.yml (vendored, 6 changes)

@@ -1,6 +1,7 @@
 name: Benchmarking tests

 on:
+  workflow_dispatch:
   schedule:
     - cron: "30 1 1,15 * *" # every 2 weeks on the 1st and the 15th of every month at 1:30 AM

@@ -31,8 +32,9 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install pandas
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install pandas peft
       - name: Environment
         run: |
           python utils/print_env.py
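
A pattern repeats throughout the workflow diffs on this page: dependency installs move from plain `pip` to `uv pip` inside a virtual environment, and the `python -m venv /opt/venv && export PATH=...` line reappears at the top of nearly every `run:` step. The repetition is deliberate: each `run:` step executes in a fresh shell, so an `export` made in one step does not survive into the next. A minimal sketch of the pattern (job and step names here are illustrative, and the `.[quality,test]` extras target is assumed from the repository's `setup.py`):

```yaml
jobs:
  example:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install dependencies
        run: |
          # create the venv once and put it on PATH for this step's shell
          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
          python -m pip install --upgrade pip uv
          python -m uv pip install -e .[quality,test]
      - name: Run tests
        run: |
          # fresh shell here: the PATH export above is gone, so repeat it
          export PATH="/opt/venv/bin:$PATH"
          python -m pytest tests/
```

Appending `/opt/venv/bin` to the `$GITHUB_PATH` file once would persist the PATH change across steps; these diffs instead repeat the export in each step.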
.github/workflows/build_docker_images.yml (vendored, 69 changes)

@@ -1,21 +1,58 @@
-name: Build Docker images (nightly)
+name: Test, build, and push Docker images

 on:
+  pull_request: # During PRs, we just check if the changes Dockerfiles can be successfully built
+    branches:
+      - main
+    paths:
+      - "docker/**"
   workflow_dispatch:
   schedule:
     - cron: "0 0 * * *" # every day at midnight

 concurrency:
-  group: docker-image-builds
-  cancel-in-progress: false
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true

 env:
   REGISTRY: diffusers
+  CI_SLACK_CHANNEL: ${{ secrets.CI_DOCKER_CHANNEL }}

 jobs:
-  build-docker-images:
+  test-build-docker-images:
     runs-on: ubuntu-latest
+    if: github.event_name == 'pull_request'
+    steps:
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v1
+
+      - name: Check out code
+        uses: actions/checkout@v3
+
+      - name: Find Changed Dockerfiles
+        id: file_changes
+        uses: jitterbit/get-changed-files@v1
+        with:
+          format: 'space-delimited'
+          token: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Build Changed Docker Images
+        run: |
+          CHANGED_FILES="${{ steps.file_changes.outputs.all }}"
+          for FILE in $CHANGED_FILES; do
+            if [[ "$FILE" == docker/*Dockerfile ]]; then
+              DOCKER_PATH="${FILE%/Dockerfile}"
+              DOCKER_TAG=$(basename "$DOCKER_PATH")
+              echo "Building Docker image for $DOCKER_TAG"
+              docker build -t "$DOCKER_TAG" "$DOCKER_PATH"
+            fi
+          done
+        if: steps.file_changes.outputs.all != ''
+
+  build-and-push-docker-images:
+    runs-on: ubuntu-latest
+    if: github.event_name != 'pull_request'
+
     permissions:
       contents: read
       packages: write
@@ -50,3 +87,27 @@ jobs:
           context: ./docker/${{ matrix.image-name }}
           push: true
           tags: ${{ env.REGISTRY }}/${{ matrix.image-name }}:latest

+      - name: Post to a Slack channel
+        id: slack
+        uses: slackapi/slack-github-action@6c661ce58804a1a20f6dc5fbee7f0381b469e001
+        with:
+          # Slack channel id, channel name, or user id to post message.
+          # See also: https://api.slack.com/methods/chat.postMessage#channels
+          channel-id: ${{ env.CI_SLACK_CHANNEL }}
+          # For posting a rich message using Block Kit
+          payload: |
+            {
+              "text": "${{ matrix.image-name }} Docker Image build result: ${{ job.status }}\n${{ github.event.head_commit.url }}",
+              "blocks": [
+                {
+                  "type": "section",
+                  "text": {
+                    "type": "mrkdwn",
+                    "text": "${{ matrix.image-name }} Docker Image build result: ${{ job.status }}\n${{ github.event.head_commit.url }}"
+                  }
+                }
+              ]
+            }
+        env:
+          SLACK_BOT_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
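
The concurrency change above is worth unpacking. The old fixed group `docker-image-builds` with `cancel-in-progress: false` never cancelled anything; the new expression keys the group on the workflow name plus `github.head_ref`, which is only populated on pull requests. A sketch of the resulting behavior:

```yaml
concurrency:
  # On a PR, head_ref is the PR branch, so a new push cancels the
  # superseded run of this workflow for that branch.
  # On schedule/push/dispatch events head_ref is empty, run_id is used
  # instead, and every run gets its own group: nothing is cancelled there.
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true
```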
.github/workflows/build_documentation.yml (vendored, 4 changes)

@@ -7,6 +7,10 @@ on:
       - doc-builder*
       - v*-release
       - v*-patch
+    paths:
+      - "src/diffusers/**.py"
+      - "examples/**"
+      - "docs/**"

 jobs:
   build:
.github/workflows/build_pr_documentation.yml (vendored, 4 changes)

@@ -2,6 +2,10 @@ name: Build PR Documentation

 on:
   pull_request:
+    paths:
+      - "src/diffusers/**.py"
+      - "examples/**"
+      - "docs/**"

 concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
.github/workflows/nightly_tests.yml (vendored, 42 changes)

@@ -12,6 +12,7 @@ env:
   PYTEST_TIMEOUT: 600
   RUN_SLOW: yes
   RUN_NIGHTLY: yes
+  SLACK_API_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}

 jobs:
   run_nightly_tests:
@@ -60,9 +61,11 @@ jobs:

       - name: Install dependencies
         run: |
-          python -m pip install -e .[quality,test]
-          python -m pip install -U git+https://github.com/huggingface/transformers
-          python -m pip install git+https://github.com/huggingface/accelerate
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers
+          python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate
+          python -m uv pip install pytest-reportlog

       - name: Environment
         run: |
@@ -73,19 +76,23 @@ jobs:
         env:
           HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "not Flax and not Onnx" \
             --make-reports=tests_${{ matrix.config.report }} \
-            tests/
+            --report-log=${{ matrix.config.report }}.log \
+            tests/

       - name: Run nightly Flax TPU tests
         if: ${{ matrix.config.framework == 'flax' }}
         env:
           HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 0 \
             -s -v -k "Flax" \
             --make-reports=tests_${{ matrix.config.report }} \
+            --report-log=${{ matrix.config.report }}.log \
             tests/

       - name: Run nightly ONNXRuntime CUDA tests
@@ -93,9 +100,11 @@ jobs:
         env:
           HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "Onnx" \
             --make-reports=tests_${{ matrix.config.report }} \
+            --report-log=${{ matrix.config.report }}.log \
             tests/

       - name: Failure short reports
@@ -108,6 +117,12 @@ jobs:
         with:
           name: ${{ matrix.config.report }}_test_reports
           path: reports

+      - name: Generate Report and Notify Channel
+        if: always()
+        run: |
+          pip install slack_sdk tabulate
+          python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY

   run_nightly_tests_apple_m1:
     name: Nightly PyTorch MPS tests on MacOS
@@ -132,10 +147,11 @@ jobs:
       - name: Install dependencies
         shell: arch -arch arm64 bash {0}
         run: |
-          ${CONDA_RUN} python -m pip install --upgrade pip
-          ${CONDA_RUN} python -m pip install -e .[quality,test]
-          ${CONDA_RUN} python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
-          ${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate
+          ${CONDA_RUN} python -m pip install --upgrade pip uv
+          ${CONDA_RUN} python -m uv pip install -e [quality,test]
+          ${CONDA_RUN} python -m uv pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
+          ${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate
+          ${CONDA_RUN} python -m uv pip install pytest-reportlog

       - name: Environment
         shell: arch -arch arm64 bash {0}
@@ -148,7 +164,9 @@ jobs:
           HF_HOME: /System/Volumes/Data/mnt/cache
           HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
         run: |
-          ${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps tests/
+          ${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
+            --report-log=tests_torch_mps.log \
+            tests/

       - name: Failure short reports
         if: ${{ failure() }}
@@ -160,3 +178,9 @@ jobs:
         with:
           name: torch_mps_test_reports
           path: reports

+      - name: Generate Report and Notify Channel
+        if: always()
+        run: |
+          pip install slack_sdk tabulate
+          python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
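
The nightly changes above add two cooperating pieces: `pytest-reportlog` writes a JSON-lines event log per test run via `--report-log`, and a final step that runs even on failure feeds those logs to `scripts/log_reports.py`, which posts to Slack (hence the new `SLACK_API_TOKEN` env var) and appends a summary to the job page. A condensed sketch of the data flow, with step bodies abbreviated from the diff:

```yaml
- name: Run nightly tests
  run: |
    # --report-log emits one JSON object per pytest event into the .log file
    python -m pytest --report-log=torch_cuda.log tests/
- name: Generate Report and Notify Channel
  if: always()  # run the reporting step even when the test step failed
  run: |
    pip install slack_sdk tabulate
    python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
```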
.github/workflows/notify_slack_about_release.yml (vendored, new file, 23 lines)

@@ -0,0 +1,23 @@
+name: Notify Slack about a release
+
+on:
+  workflow_dispatch:
+  release:
+    types: [published]
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Setup Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.8'
+
+      - name: Notify Slack about the release
+        env:
+          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
+        run: pip install requests && python utils/notify_slack_about_release.py
.github/workflows/pr_dependency_test.yml (vendored, 10 changes)

@@ -4,6 +4,8 @@ on:
   pull_request:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
   push:
     branches:
       - main
@@ -23,10 +25,12 @@ jobs:
           python-version: "3.8"
       - name: Install dependencies
         run: |
-          python -m pip install --upgrade pip
-          pip install -e .
-          pip install pytest
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m pip install --upgrade pip uv
+          python -m uv pip install -e .
+          python -m uv pip install pytest
       - name: Check for soft dependencies
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           pytest tests/others/test_dependencies.py
.github/workflows/pr_flax_dependency_test.yml (vendored, 16 changes)

@@ -4,6 +4,8 @@ on:
   pull_request:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
   push:
     branches:
       - main
@@ -23,12 +25,14 @@ jobs:
           python-version: "3.8"
       - name: Install dependencies
         run: |
-          python -m pip install --upgrade pip
-          pip install -e .
-          pip install "jax[cpu]>=0.2.16,!=0.3.2"
-          pip install "flax>=0.4.1"
-          pip install "jaxlib>=0.1.65"
-          pip install pytest
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m pip install --upgrade pip uv
+          python -m uv pip install -e .
+          python -m uv pip install "jax[cpu]>=0.2.16,!=0.3.2"
+          python -m uv pip install "flax>=0.4.1"
+          python -m uv pip install "jaxlib>=0.1.65"
+          python -m uv pip install pytest
       - name: Check for soft dependencies
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           pytest tests/others/test_dependencies.py
.github/workflows/pr_quality.yml (vendored, deleted, 49 changes)

@@ -1,49 +0,0 @@
-name: Run code quality checks
-
-on:
-  pull_request:
-    branches:
-      - main
-  push:
-    branches:
-      - main
-
-concurrency:
-  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
-  cancel-in-progress: true
-
-jobs:
-  check_code_quality:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v3
-      - name: Set up Python
-        uses: actions/setup-python@v4
-        with:
-          python-version: "3.8"
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install .[quality]
-      - name: Check quality
-        run: |
-          ruff check examples tests src utils scripts
-          ruff format examples tests src utils scripts --check
-
-  check_repository_consistency:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v3
-      - name: Set up Python
-        uses: actions/setup-python@v4
-        with:
-          python-version: "3.8"
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install .[quality]
-      - name: Check quality
-        run: |
-          python utils/check_copies.py
-          python utils/check_dummies.py
-          make deps_table_check_updated
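
Deleting this workflow does not drop the checks: as the pr_test_peft_backend.yml and pr_tests.yml diffs below show, the same `check_code_quality` and `check_repository_consistency` jobs are recreated inside the test workflows, and the test jobs are gated on them with `needs:`, so lint or consistency failures now stop the test matrix from starting at all. A stripped-down sketch of that gating, with step bodies replaced by placeholders:

```yaml
jobs:
  check_code_quality:
    runs-on: ubuntu-latest
    steps:
      - run: echo "ruff check / ruff format --check go here"
  check_repository_consistency:
    needs: check_code_quality   # runs only after the lint job succeeds
    runs-on: ubuntu-latest
    steps:
      - run: echo "check_copies.py / check_dummies.py / deps table check go here"
  run_fast_tests:
    needs: [check_code_quality, check_repository_consistency]  # both must pass
    runs-on: ubuntu-latest
    steps:
      - run: echo "pytest matrix goes here"
```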
.github/workflows/pr_test_fetcher.yml (vendored, 13 changes)

@@ -33,7 +33,8 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
       - name: Environment
         run: |
           python utils/print_env.py
@@ -89,15 +90,18 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m pip install -e [quality,test]
           python -m pip install accelerate

       - name: Environment
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python utils/print_env.py

       - name: Run all selected tests on CPU
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 2 --dist=loadfile -v --make-reports=${{ matrix.modules }}_tests_cpu ${{ fromJson(needs.setup_pr_tests.outputs.test_map)[matrix.modules] }}

       - name: Failure short reports
@@ -144,15 +148,18 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m pip install -e [quality,test]

       - name: Environment
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python utils/print_env.py

       - name: Run Hub tests for models, schedulers, and pipelines on a staging env
         if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           HUGGINGFACE_CO_STAGING=true python -m pytest \
             -m "is_staging_test" \
             --make-reports=tests_${{ matrix.config.report }} \
.github/workflows/pr_test_peft_backend.yml (vendored, 55 changes)

@@ -4,6 +4,9 @@ on:
   pull_request:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
+      - "tests/**.py"

 concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -16,7 +19,44 @@ env:
   PYTEST_TIMEOUT: 60

 jobs:
+  check_code_quality:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.8"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[quality]
+      - name: Check quality
+        run: |
+          ruff check examples tests src utils scripts
+          ruff format examples tests src utils scripts --check
+
+  check_repository_consistency:
+    needs: check_code_quality
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.8"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[quality]
+      - name: Check quality
+        run: |
+          python utils/check_copies.py
+          python utils/check_dummies.py
+          make deps_table_check_updated
+
   run_fast_tests:
+    needs: [check_code_quality, check_repository_consistency]
     strategy:
       fail-fast: false
       matrix:
@@ -44,22 +84,25 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
           if [ "${{ matrix.lib-versions }}" == "main" ]; then
-            python -m pip install -U git+https://github.com/huggingface/peft.git
-            python -m pip install -U git+https://github.com/huggingface/transformers.git
-            python -m pip install -U git+https://github.com/huggingface/accelerate.git
+            python -m uv pip install -U peft@git+https://github.com/huggingface/peft.git
+            python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git
+            python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
           else
-            python -m pip install -U peft transformers accelerate
+            python -m uv pip install -U peft transformers accelerate
           fi

       - name: Environment
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python utils/print_env.py

       - name: Run fast PyTorch LoRA CPU tests with PEFT backend
         run: |
-          python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
             -s -v \
             --make-reports=tests_${{ matrix.config.report }} \
             tests/lora/test_lora_layers_peft.py
.github/workflows/pr_tests.yml (vendored, 76 changes)

@@ -4,6 +4,14 @@ on:
   pull_request:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
+      - "benchmarks/**.py"
+      - "examples/**.py"
+      - "scripts/**.py"
+      - "tests/**.py"
+      - ".github/**.yml"
+      - "utils/**.py"
   push:
     branches:
       - ci-*
@@ -19,7 +27,44 @@ env:
   PYTEST_TIMEOUT: 60

 jobs:
+  check_code_quality:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.8"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[quality]
+      - name: Check quality
+        run: |
+          ruff check examples tests src utils scripts
+          ruff format examples tests src utils scripts --check
+
+  check_repository_consistency:
+    needs: check_code_quality
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.8"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[quality]
+      - name: Check quality
+        run: |
+          python utils/check_copies.py
+          python utils/check_dummies.py
+          make deps_table_check_updated
+
   run_fast_tests:
+    needs: [check_code_quality, check_repository_consistency]
     strategy:
       fail-fast: false
       matrix:
@@ -34,11 +79,6 @@ jobs:
             runner: docker-cpu
             image: diffusers/diffusers-pytorch-cpu
             report: torch_cpu_models_schedulers
-          - name: LoRA
-            framework: lora
-            runner: docker-cpu
-            image: diffusers/diffusers-pytorch-cpu
-            report: torch_cpu_lora
           - name: Fast Flax CPU tests
             framework: flax
             runner: docker-cpu
@@ -71,16 +111,19 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install accelerate
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install accelerate

       - name: Environment
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python utils/print_env.py

       - name: Run fast PyTorch Pipeline CPU tests
         if: ${{ matrix.config.framework == 'pytorch_pipelines' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "not Flax and not Onnx" \
             --make-reports=tests_${{ matrix.config.report }} \
@@ -89,22 +132,16 @@ jobs:
       - name: Run fast PyTorch Model Scheduler CPU tests
         if: ${{ matrix.config.framework == 'pytorch_models' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "not Flax and not Onnx and not Dependency" \
             --make-reports=tests_${{ matrix.config.report }} \
             tests/models tests/schedulers tests/others

-      - name: Run fast PyTorch LoRA CPU tests
-        if: ${{ matrix.config.framework == 'lora' }}
-        run: |
-          python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
-            -s -v -k "not Flax and not Onnx and not Dependency" \
-            --make-reports=tests_${{ matrix.config.report }} \
-            tests/lora

       - name: Run fast Flax TPU tests
         if: ${{ matrix.config.framework == 'flax' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "Flax" \
             --make-reports=tests_${{ matrix.config.report }} \
@@ -113,7 +150,8 @@ jobs:
       - name: Run example PyTorch CPU tests
         if: ${{ matrix.config.framework == 'pytorch_examples' }}
         run: |
-          python -m pip install peft
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install peft
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             --make-reports=tests_${{ matrix.config.report }} \
             examples
@@ -130,6 +168,7 @@ jobs:
           path: reports

   run_staging_tests:
+    needs: [check_code_quality, check_repository_consistency]
     strategy:
       fail-fast: false
       matrix:
@@ -161,15 +200,18 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]

       - name: Environment
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python utils/print_env.py

       - name: Run Hub tests for models, schedulers, and pipelines on a staging env
         if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           HUGGINGFACE_CO_STAGING=true python -m pytest \
             -m "is_staging_test" \
             --make-reports=tests_${{ matrix.config.report }} \
.github/workflows/pr_torch_dependency_test.yml (vendored, 12 changes)

@@ -4,6 +4,8 @@ on:
   pull_request:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
   push:
     branches:
       - main
@@ -23,10 +25,12 @@ jobs:
           python-version: "3.8"
       - name: Install dependencies
         run: |
-          python -m pip install --upgrade pip
-          pip install -e .
-          pip install torch torchvision torchaudio
-          pip install pytest
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m pip install --upgrade pip uv
+          python -m uv pip install -e .
+          python -m uv pip install torch torchvision torchaudio
+          python -m uv pip install pytest
       - name: Check for soft dependencies
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           pytest tests/others/test_dependencies.py
.github/workflows/push_tests.yml (vendored, 53 changes)

@@ -4,7 +4,10 @@ on:
   push:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
+      - "examples/**.py"
+      - "tests/**.py"

 env:
   DIFFUSERS_IS_CI: yes
@@ -18,7 +21,7 @@ env:
 jobs:
   setup_torch_cuda_pipeline_matrix:
     name: Setup Torch Pipelines CUDA Slow Tests Matrix
-    runs-on: docker-gpu
+    runs-on: [single-gpu, nvidia-gpu, t4, ci]
     container:
       image: diffusers/diffusers-pytorch-cpu # this is a CPU image, but we need it to fetch the matrix
       options: --shm-size "16gb" --ipc host
@@ -32,8 +35,9 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install git+https://github.com/huggingface/accelerate.git
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git

       - name: Environment
         run: |
@@ -58,10 +62,9 @@ jobs:
     needs: setup_torch_cuda_pipeline_matrix
     strategy:
       fail-fast: false
-      max-parallel: 1
       matrix:
         module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
-    runs-on: docker-gpu
+    runs-on: [single-gpu, nvidia-gpu, t4, ci]
     container:
       image: diffusers/diffusers-pytorch-cuda
       options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
@@ -76,8 +79,9 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install git+https://github.com/huggingface/accelerate.git
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
       - name: Environment
         run: |
           python utils/print_env.py
@@ -125,8 +129,9 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install git+https://github.com/huggingface/accelerate.git
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git

       - name: Environment
         run: |
@@ -174,9 +179,10 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install git+https://github.com/huggingface/accelerate.git
-          python -m pip install git+https://github.com/huggingface/peft.git
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
+          python -m uv pip install peft@git+https://github.com/huggingface/peft.git

       - name: Environment
         run: |
@@ -224,8 +230,9 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install git+https://github.com/huggingface/accelerate.git
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git

       - name: Environment
         run: |
@@ -271,8 +278,9 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
-          python -m pip install git+https://github.com/huggingface/accelerate.git
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git

       - name: Environment
         run: |
@@ -320,7 +328,8 @@ jobs:
           nvidia-smi
       - name: Install dependencies
         run: |
-          python -m pip install -e .[quality,test,training]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test,training]
       - name: Environment
         run: |
           python utils/print_env.py
@@ -360,7 +369,8 @@ jobs:
           nvidia-smi
       - name: Install dependencies
         run: |
-          python -m pip install -e .[quality,test,training]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test,training]
       - name: Environment
         run: |
           python utils/print_env.py
@@ -401,16 +411,19 @@ jobs:

       - name: Install dependencies
         run: |
-          python -m pip install -e .[quality,test,training]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test,training]

       - name: Environment
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python utils/print_env.py

       - name: Run example tests on GPU
         env:
           HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v --make-reports=examples_torch_cuda examples/

       - name: Failure short reports
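
Two scheduling changes in this file are easy to miss. `runs-on` moves from the single `docker-gpu` label to a label array; for self-hosted runners, an array means a runner must carry every listed label to be eligible for the job. And dropping `max-parallel: 1` removes the one-at-a-time throttle on the pipeline matrix, letting matrix jobs fan out across runners. A sketch of the combined effect:

```yaml
strategy:
  fail-fast: false
  # no max-parallel: matrix jobs may now run concurrently
  matrix:
    module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
# the job waits for a self-hosted runner registered with all four labels
runs-on: [single-gpu, nvidia-gpu, t4, ci]
```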
.github/workflows/push_tests_fast.yml (vendored, 14 changes)

@@ -4,6 +4,10 @@ on:
   push:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
+      - "examples/**.py"
+      - "tests/**.py"

 concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -65,15 +69,18 @@ jobs:
       - name: Install dependencies
         run: |
           apt-get update && apt-get install libsndfile1-dev libgl1 -y
-          python -m pip install -e .[quality,test]
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]

       - name: Environment
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python utils/print_env.py

       - name: Run fast PyTorch CPU tests
         if: ${{ matrix.config.framework == 'pytorch' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "not Flax and not Onnx" \
             --make-reports=tests_${{ matrix.config.report }} \
@@ -82,6 +89,7 @@ jobs:
       - name: Run fast Flax TPU tests
         if: ${{ matrix.config.framework == 'flax' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "Flax" \
             --make-reports=tests_${{ matrix.config.report }} \
@@ -90,6 +98,7 @@ jobs:
       - name: Run fast ONNXRuntime CPU tests
         if: ${{ matrix.config.framework == 'onnxruntime' }}
         run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "Onnx" \
             --make-reports=tests_${{ matrix.config.report }} \
@@ -98,7 +107,8 @@ jobs:
       - name: Run example PyTorch CPU tests
         if: ${{ matrix.config.framework == 'pytorch_examples' }}
         run: |
-          python -m pip install peft
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install peft
           python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
             --make-reports=tests_${{ matrix.config.report }} \
             examples
.github/workflows/push_tests_mps.yml (vendored, 13 changes)

@@ -4,6 +4,9 @@ on:
   push:
     branches:
       - main
+    paths:
+      - "src/diffusers/**.py"
+      - "tests/**.py"

 env:
   DIFFUSERS_IS_CI: yes
@@ -41,11 +44,11 @@ jobs:
       - name: Install dependencies
         shell: arch -arch arm64 bash {0}
         run: |
-          ${CONDA_RUN} python -m pip install --upgrade pip
-          ${CONDA_RUN} python -m pip install -e .[quality,test]
-          ${CONDA_RUN} python -m pip install torch torchvision torchaudio
-          ${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate.git
-          ${CONDA_RUN} python -m pip install transformers --upgrade
+          ${CONDA_RUN} python -m pip install --upgrade pip uv
+          ${CONDA_RUN} python -m uv pip install -e [quality,test]
+          ${CONDA_RUN} python -m uv pip install torch torchvision torchaudio
+          ${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
+          ${CONDA_RUN} python -m uv pip install transformers --upgrade

       - name: Environment
         shell: arch -arch arm64 bash {0}
.github/workflows/pypi_publish.yaml (new file, 79 lines)
@@ -0,0 +1,79 @@
+# Adapted from https://blog.deepjyoti30.dev/pypi-release-github-action
+
+name: PyPI release
+
+on:
+  workflow_dispatch:
+  push:
+    tags:
+      - "*"
+
+jobs:
+  find-and-checkout-latest-branch:
+    runs-on: ubuntu-latest
+    outputs:
+      latest_branch: ${{ steps.set_latest_branch.outputs.latest_branch }}
+    steps:
+      - name: Checkout Repo
+        uses: actions/checkout@v3
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.8'
+
+      - name: Fetch latest branch
+        id: fetch_latest_branch
+        run: |
+          pip install -U requests packaging
+          LATEST_BRANCH=$(python utils/fetch_latest_release_branch.py)
+          echo "Latest branch: $LATEST_BRANCH"
+          echo "latest_branch=$LATEST_BRANCH" >> $GITHUB_ENV
+
+      - name: Set latest branch output
+        id: set_latest_branch
+        run: echo "::set-output name=latest_branch::${{ env.latest_branch }}"
+
+  release:
+    needs: find-and-checkout-latest-branch
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout Repo
+        uses: actions/checkout@v3
+        with:
+          ref: ${{ needs.find-and-checkout-latest-branch.outputs.latest_branch }}
+
+      - name: Setup Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.8"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -U setuptools wheel twine torch
+
+      - name: Build the dist files
+        run: python setup.py bdist_wheel && python setup.py sdist
+
+      - name: Publish to the test PyPI
+        env:
+          TWINE_USERNAME: ${{ secrets.TEST_PYPI_USERNAME }}
+          TWINE_PASSWORD: ${{ secrets.TEST_PYPI_PASSWORD }}
+        run: twine upload dist/* -r pypitest --repository-url=https://test.pypi.org/legacy/
+
+      - name: Test installing diffusers and importing
+        run: |
+          pip install diffusers && pip uninstall diffusers -y
+          pip install -i https://testpypi.python.org/pypi diffusers
+          python -c "from diffusers import __version__; print(__version__)"
+          python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('fusing/unet-ldm-dummy-update'); pipe()"
+          python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('hf-internal-testing/tiny-stable-diffusion-pipe', safety_checker=None); pipe('ah suh du')"
+          python -c "from diffusers import *"
+
+      - name: Publish to PyPI
+        env:
+          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
+          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
+        run: twine upload dist/* -r pypi
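The `Fetch latest branch` step above defers to `utils/fetch_latest_release_branch.py`, whose contents are not part of this diff. Purely for orientation, a script doing that job — under the assumptions that it uses the `requests` and `packaging` packages installed just before it and that release branches follow a `v<version>-release` naming scheme — might look like the sketch below; treat every detail as an assumption.

```python
# Hypothetical sketch of utils/fetch_latest_release_branch.py -- NOT the actual script.
# Assumes GitHub's branches API and a "v<version>-release" branch naming convention.
import requests
from packaging import version

branches = []
page = 1
while True:
    resp = requests.get(
        "https://api.github.com/repos/huggingface/diffusers/branches",
        params={"page": page, "per_page": 100},
    ).json()
    if not resp:
        break
    branches.extend(b["name"] for b in resp)
    page += 1

# Keep only release-style branches and print the highest version.
release_branches = [b for b in branches if b.startswith("v") and b.endswith("-release")]
print(max(release_branches, key=lambda b: version.parse(b[1:].replace("-release", ""))))
```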
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -77,7 +77,7 @@ Please refer to the [How to use Stable Diffusion in Apple Silicon](https://huggi

 ## Quickstart

-Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 16000+ checkpoints):
+Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 22000+ checkpoints):

 ```python
 from diffusers import DiffusionPipeline

@@ -219,7 +219,7 @@ Also, say 👋 in our public Discord channel <a href="https://discord.gg/G7tWnz9
 - https://github.com/deep-floyd/IF
 - https://github.com/bentoml/BentoML
 - https://github.com/bmaltais/kohya_ss
-- +7000 other amazing GitHub repositories 💪
+- +9000 other amazing GitHub repositories 💪

 Thank you for using us ❤️.
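For orientation, the quickstart code block that the context lines open continues roughly as below; this is a minimal sketch, and the checkpoint name is an assumption rather than something shown in the diff.

```python
import torch
from diffusers import DiffusionPipeline

# Any of the 22000+ Hub checkpoints can be loaded this way; the ID here is illustrative.
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.to("cuda")

image = pipe("An image of a squirrel in Picasso style").images[0]
```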
@@ -141,6 +141,7 @@ class LCMLoRATextToImageBenchmark(TextToImageBenchmark):
         super().__init__(args)
         self.pipe.load_lora_weights(self.lora_id)
         self.pipe.fuse_lora()
+        self.pipe.unload_lora_weights()
         self.pipe.scheduler = LCMScheduler.from_config(self.pipe.scheduler.config)

     def get_result_filepath(self, args):
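The one-line addition calls `unload_lora_weights()` right after `fuse_lora()`: once the LoRA deltas are baked into the base weights, the separate adapter modules are redundant, and dropping them avoids applying the adapter twice during the benchmark. A minimal sketch of the same pattern outside the benchmark harness (model and adapter IDs are illustrative):

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")  # attach the LCM-LoRA adapter
pipe.fuse_lora()            # bake the LoRA deltas into the base weights
pipe.unload_lora_weights()  # drop the now-redundant adapter modules
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

image = pipe("a photo of an astronaut", num_inference_steps=4, guidance_scale=1.0).images[0]
```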
@@ -235,6 +236,35 @@ class InpaintingBenchmark(ImageToImageBenchmark):
     )


+class IPAdapterTextToImageBenchmark(TextToImageBenchmark):
+    url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png"
+    image = load_image(url)
+
+    def __init__(self, args):
+        pipe = self.pipeline_class.from_pretrained(args.ckpt, torch_dtype=torch.float16).to("cuda")
+        pipe.load_ip_adapter(
+            args.ip_adapter_id[0],
+            subfolder="models" if "sdxl" not in args.ip_adapter_id[1] else "sdxl_models",
+            weight_name=args.ip_adapter_id[1],
+        )
+
+        if args.run_compile:
+            pipe.unet.to(memory_format=torch.channels_last)
+            print("Run torch compile")
+            pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
+
+        pipe.set_progress_bar_config(disable=True)
+        self.pipe = pipe
+
+    def run_inference(self, pipe, args):
+        _ = pipe(
+            prompt=PROMPT,
+            ip_adapter_image=self.image,
+            num_inference_steps=args.num_inference_steps,
+            num_images_per_prompt=args.batch_size,
+        )
+
+
 class ControlNetBenchmark(TextToImageBenchmark):
     pipeline_class = StableDiffusionControlNetPipeline
     aux_network_class = ControlNetModel
benchmarks/benchmark_ip_adapters.py (new file, 32 lines)
@@ -0,0 +1,32 @@
+import argparse
+import sys
+
+
+sys.path.append(".")
+from base_classes import IPAdapterTextToImageBenchmark  # noqa: E402
+
+
+IP_ADAPTER_CKPTS = {
+    "runwayml/stable-diffusion-v1-5": ("h94/IP-Adapter", "ip-adapter_sd15.bin"),
+    "stabilityai/stable-diffusion-xl-base-1.0": ("h94/IP-Adapter", "ip-adapter_sdxl.bin"),
+}
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        "--ckpt",
+        type=str,
+        default="runwayml/stable-diffusion-v1-5",
+        choices=list(IP_ADAPTER_CKPTS.keys()),
+    )
+    parser.add_argument("--batch_size", type=int, default=1)
+    parser.add_argument("--num_inference_steps", type=int, default=50)
+    parser.add_argument("--model_cpu_offload", action="store_true")
+    parser.add_argument("--run_compile", action="store_true")
+    args = parser.parse_args()
+
+    args.ip_adapter_id = IP_ADAPTER_CKPTS[args.ckpt]
+    benchmark_pipe = IPAdapterTextToImageBenchmark(args)
+    args.ckpt = f"{args.ckpt} (IP-Adapter)"
+    benchmark_pipe.benchmark(args)
@@ -72,7 +72,7 @@ def main():
             command += " --run_compile"
         run_command(command.split())

-    elif file == "benchmark_sd_inpainting.py":
+    elif file in ["benchmark_sd_inpainting.py", "benchmark_ip_adapters.py"]:
         sdxl_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
         command = f"python {file} --ckpt {sdxl_ckpt}"
         run_command(command.split())
@@ -23,13 +23,13 @@ ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 # follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
-RUN python3 -m pip install --no-cache-dir --upgrade pip && \
-    python3 -m pip install --upgrade --no-cache-dir \
+RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
+    python3 -m uv pip install --upgrade --no-cache-dir \
         clu \
         "jax[cpu]>=0.2.16,!=0.3.2" \
         "flax>=0.4.1" \
         "jaxlib>=0.1.65" && \
-    python3 -m pip install --no-cache-dir \
+    python3 -m uv pip install --no-cache-dir \
         accelerate \
         datasets \
         hf-doc-builder \
@@ -23,15 +23,15 @@ ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 # follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
-RUN python3 -m pip install --no-cache-dir --upgrade pip && \
+RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3 -m pip install --no-cache-dir \
         "jax[tpu]>=0.2.16,!=0.3.2" \
         -f https://storage.googleapis.com/jax-releases/libtpu_releases.html && \
-    python3 -m pip install --upgrade --no-cache-dir \
+    python3 -m uv pip install --upgrade --no-cache-dir \
         clu \
         "flax>=0.4.1" \
         "jaxlib>=0.1.65" && \
-    python3 -m pip install --no-cache-dir \
+    python3 -m uv pip install --no-cache-dir \
         accelerate \
         datasets \
         hf-doc-builder \
@@ -22,14 +22,14 @@ RUN python3 -m venv /opt/venv
 ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
-RUN python3 -m pip install --no-cache-dir --upgrade pip && \
-    python3 -m pip install --no-cache-dir \
-        torch \
-        torchvision \
-        torchaudio \
+RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
+    python3 -m uv pip install --no-cache-dir \
+        torch==2.1.2 \
+        torchvision==0.16.2 \
+        torchaudio==2.1.2 \
         onnxruntime \
         --extra-index-url https://download.pytorch.org/whl/cpu && \
-    python3 -m pip install --no-cache-dir \
+    python3 -m uv pip install --no-cache-dir \
         accelerate \
         datasets \
         hf-doc-builder \
@@ -1,4 +1,4 @@
-FROM nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04
+FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
 LABEL maintainer="Hugging Face"
 LABEL repository="diffusers"

@@ -22,14 +22,14 @@ RUN python3 -m venv /opt/venv
 ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
-RUN python3 -m pip install --no-cache-dir --upgrade pip && \
-    python3 -m pip install --no-cache-dir \
+RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
+    python3 -m uv pip install --no-cache-dir \
         torch \
         torchvision \
         torchaudio \
         "onnxruntime-gpu>=1.13.1" \
         --extra-index-url https://download.pytorch.org/whl/cu117 && \
-    python3 -m pip install --no-cache-dir \
+    python3 -m uv pip install --no-cache-dir \
         accelerate \
         datasets \
         hf-doc-builder \
@@ -24,8 +24,8 @@ RUN python3.9 -m venv /opt/venv
 ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
-RUN python3.9 -m pip install --no-cache-dir --upgrade pip && \
-    python3.9 -m pip install --no-cache-dir \
+RUN python3.9 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
+    python3.9 -m uv pip install --no-cache-dir \
         torch \
         torchvision \
         torchaudio \
@@ -40,7 +40,6 @@ RUN python3.9 -m pip install --no-cache-dir --upgrade pip && \
         numpy \
         scipy \
         tensorboard \
-        transformers \
-        omegaconf
+        transformers

 CMD ["/bin/bash"]
@@ -23,14 +23,14 @@ RUN python3 -m venv /opt/venv
 ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
-RUN python3 -m pip install --no-cache-dir --upgrade pip && \
-    python3 -m pip install --no-cache-dir \
+RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
+    python3 -m uv pip install --no-cache-dir \
         torch \
         torchvision \
         torchaudio \
         invisible_watermark \
         --extra-index-url https://download.pytorch.org/whl/cpu && \
-    python3 -m pip install --no-cache-dir \
+    python3 -m uv pip install --no-cache-dir \
         accelerate \
         datasets \
         hf-doc-builder \
@@ -40,6 +40,6 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
         numpy \
         scipy \
         tensorboard \
-        transformers
+        transformers matplotlib

 CMD ["/bin/bash"]
@@ -23,8 +23,8 @@ RUN python3 -m venv /opt/venv
 ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
-RUN python3 -m pip install --no-cache-dir --upgrade pip && \
-    python3 -m pip install --no-cache-dir \
+RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
+    python3 -m uv pip install --no-cache-dir \
         torch \
         torchvision \
         torchaudio \
@@ -40,7 +40,6 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
         scipy \
         tensorboard \
         transformers \
-        omegaconf \
         pytorch-lightning

 CMD ["/bin/bash"]
@@ -23,13 +23,13 @@ RUN python3 -m venv /opt/venv
 ENV PATH="/opt/venv/bin:$PATH"

 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
-RUN python3 -m pip install --no-cache-dir --upgrade pip && \
+RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3 -m pip install --no-cache-dir \
         torch \
         torchvision \
         torchaudio \
         invisible_watermark && \
-    python3 -m pip install --no-cache-dir \
+    python3 -m uv pip install --no-cache-dir \
         accelerate \
         datasets \
         hf-doc-builder \
@@ -40,7 +40,6 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
         scipy \
         tensorboard \
         transformers \
-        omegaconf \
         xformers

 CMD ["/bin/bash"]
@@ -1,5 +1,5 @@
 <!---
-Copyright 2023- The HuggingFace Team. All rights reserved.
+Copyright 2024- The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -18,7 +18,7 @@
   - local: tutorials/basic_training
     title: Train a diffusion model
   - local: tutorials/using_peft_for_inference
-    title: Inference with PEFT
+    title: Load LoRAs for inference
   - local: tutorials/fast_diffusion
     title: Accelerate inference of text-to-image diffusion models
   title: Tutorials
@@ -52,12 +52,18 @@
     title: Image-to-image
   - local: using-diffusers/inpaint
     title: Inpainting
+  - local: using-diffusers/text-img2vid
+    title: Text or image-to-video
   - local: using-diffusers/depth2img
     title: Depth-to-image
   title: Tasks
 - sections:
   - local: using-diffusers/textual_inversion_inference
     title: Textual inversion
+  - local: using-diffusers/ip_adapter
+    title: IP-Adapter
+  - local: using-diffusers/merge_loras
+    title: Merge LoRAs
   - local: training/distributed_inference
     title: Distributed inference with multiple GPUs
   - local: using-diffusers/reusing_seeds
@@ -98,6 +104,8 @@
     title: Latent Consistency Model-LoRA
   - local: using-diffusers/inference_with_lcm
     title: Latent Consistency Model
+  - local: using-diffusers/inference_with_tcd_lora
+    title: Trajectory Consistency Distillation-LoRA
   - local: using-diffusers/svd
     title: Stable Video Diffusion
   title: Specific pipeline examples
@@ -228,6 +236,8 @@
     title: UNet3DConditionModel
   - local: api/models/unet-motion
     title: UNetMotionModel
+  - local: api/models/uvit2d
+    title: UViT2DModel
   - local: api/models/vq
     title: VQModel
   - local: api/models/autoencoderkl
@@ -282,6 +292,8 @@
     title: DiffEdit
   - local: api/pipelines/dit
     title: DiT
+  - local: api/pipelines/i2vgenxl
+    title: I2VGen-XL
   - local: api/pipelines/pix2pix
     title: InstructPix2Pix
   - local: api/pipelines/kandinsky
@@ -294,12 +306,16 @@
     title: Latent Consistency Models
   - local: api/pipelines/latent_diffusion
     title: Latent Diffusion
+  - local: api/pipelines/ledits_pp
+    title: LEDITS++
   - local: api/pipelines/panorama
     title: MultiDiffusion
   - local: api/pipelines/musicldm
     title: MusicLDM
   - local: api/pipelines/paint_by_example
     title: Paint by Example
+  - local: api/pipelines/pia
+    title: Personalized Image Animator (PIA)
   - local: api/pipelines/pixart
     title: PixArt-α
   - local: api/pipelines/self_attention_guidance
@@ -308,6 +324,8 @@
     title: Semantic Guidance
   - local: api/pipelines/shap_e
     title: Shap-E
+  - local: api/pipelines/stable_cascade
+    title: Stable Cascade
 - sections:
   - local: api/pipelines/stable_diffusion/overview
     title: Overview
@@ -315,6 +333,8 @@
     title: Text-to-image
   - local: api/pipelines/stable_diffusion/img2img
     title: Image-to-image
+  - local: api/pipelines/stable_diffusion/svd
+    title: Image-to-video
   - local: api/pipelines/stable_diffusion/inpaint
     title: Inpainting
   - local: api/pipelines/stable_diffusion/depth2img
@@ -384,6 +404,10 @@
     title: EulerAncestralDiscreteScheduler
   - local: api/schedulers/euler
     title: EulerDiscreteScheduler
+  - local: api/schedulers/edm_euler
+    title: EDMEulerScheduler
+  - local: api/schedulers/edm_multistep_dpm_solver
+    title: EDMDPMSolverMultistepScheduler
   - local: api/schedulers/heun
     title: HeunDiscreteScheduler
   - local: api/schedulers/ipndm
@@ -406,6 +430,8 @@
     title: ScoreSdeVeScheduler
   - local: api/schedulers/score_sde_vp
     title: ScoreSdeVpScheduler
+  - local: api/schedulers/tcd
+    title: TCDScheduler
   - local: api/schedulers/unipc
     title: UniPCMultistepScheduler
   - local: api/schedulers/vq_diffusion
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -20,14 +20,14 @@ An attention processor is a class for applying different types of attention mech
 ## AttnProcessor2_0
 [[autodoc]] models.attention_processor.AttnProcessor2_0

-## FusedAttnProcessor2_0
-[[autodoc]] models.attention_processor.FusedAttnProcessor2_0
+## AttnAddedKVProcessor
+[[autodoc]] models.attention_processor.AttnAddedKVProcessor

-## LoRAAttnProcessor
-[[autodoc]] models.attention_processor.LoRAAttnProcessor
+## AttnAddedKVProcessor2_0
+[[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0

-## LoRAAttnProcessor2_0
-[[autodoc]] models.attention_processor.LoRAAttnProcessor2_0
+## CrossFrameAttnProcessor
+[[autodoc]] pipelines.text_to_video_synthesis.pipeline_text_to_video_zero.CrossFrameAttnProcessor

 ## CustomDiffusionAttnProcessor
 [[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor
@@ -35,26 +35,23 @@ An attention processor is a class for applying different types of attention mech
 ## CustomDiffusionAttnProcessor2_0
 [[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor2_0

-## AttnAddedKVProcessor
-[[autodoc]] models.attention_processor.AttnAddedKVProcessor
+## CustomDiffusionXFormersAttnProcessor
+[[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor

-## AttnAddedKVProcessor2_0
-[[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0
+## FusedAttnProcessor2_0
+[[autodoc]] models.attention_processor.FusedAttnProcessor2_0

 ## LoRAAttnAddedKVProcessor
 [[autodoc]] models.attention_processor.LoRAAttnAddedKVProcessor

-## XFormersAttnProcessor
-[[autodoc]] models.attention_processor.XFormersAttnProcessor
-
 ## LoRAXFormersAttnProcessor
 [[autodoc]] models.attention_processor.LoRAXFormersAttnProcessor

-## CustomDiffusionXFormersAttnProcessor
-[[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor
-
 ## SlicedAttnProcessor
 [[autodoc]] models.attention_processor.SlicedAttnProcessor

 ## SlicedAttnAddedKVProcessor
 [[autodoc]] models.attention_processor.SlicedAttnAddedKVProcessor
+
+## XFormersAttnProcessor
+[[autodoc]] models.attention_processor.XFormersAttnProcessor
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -12,14 +12,18 @@ specific language governing permissions and limitations under the License.

 # IP-Adapter

-[IP-Adapter](https://hf.co/papers/2308.06721) is a lightweight adapter that enables prompting a diffusion model with an image. This method decouples the cross-attention layers of the image and text features. The image features are generated from an image encoder. Files generated from IP-Adapter are only ~100MBs.
+[IP-Adapter](https://hf.co/papers/2308.06721) is a lightweight adapter that enables prompting a diffusion model with an image. This method decouples the cross-attention layers of the image and text features. The image features are generated from an image encoder.

 <Tip>

-Learn how to load an IP-Adapter checkpoint and image in the [IP-Adapter](../../using-diffusers/loading_adapters#ip-adapter) loading guide.
+Learn how to load an IP-Adapter checkpoint and image in the IP-Adapter [loading](../../using-diffusers/loading_adapters#ip-adapter) guide, and you can see how to use it in the [usage](../../using-diffusers/ip_adapter) guide.

 </Tip>

 ## IPAdapterMixin

 [[autodoc]] loaders.ip_adapter.IPAdapterMixin
+
+## IPAdapterMaskProcessor
+
+[[autodoc]] image_processor.IPAdapterMaskProcessor
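Putting the loader documented here together with the checkpoint names used by the benchmark script earlier in this diff, a minimal usage sketch looks as follows; the prompt is illustrative, and the image URL is the one the benchmark class uses.

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# Same repo/subfolder/weight layout as in the IP-Adapter benchmark above.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")

image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png"
)
result = pipe(prompt="best quality, high quality", ip_adapter_image=image).images[0]
```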
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -30,8 +30,8 @@ To learn more about how to load single file weights, see the [Load different Sta

 ## FromOriginalVAEMixin

-[[autodoc]] loaders.single_file.FromOriginalVAEMixin
+[[autodoc]] loaders.autoencoder.FromOriginalVAEMixin

 ## FromOriginalControlnetMixin

-[[autodoc]] loaders.single_file.FromOriginalControlnetMixin
+[[autodoc]] loaders.controlnet.FromOriginalControlNetMixin
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -33,6 +33,9 @@ model = AutoencoderKL.from_single_file(url)
 ## AutoencoderKL

 [[autodoc]] AutoencoderKL
+  - decode
+  - encode
+  - all

 ## AutoencoderKLOutput

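The hunk header above references `AutoencoderKL.from_single_file(url)`; for context, a compact sketch of that loading path (the checkpoint URL is illustrative, not taken from this diff):

```python
from diffusers import AutoencoderKL

# Load a VAE directly from a single .safetensors checkpoint file.
url = "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.safetensors"
model = AutoencoderKL.from_single_file(url)
```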
@@ -1,6 +1,18 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
 # Consistency Decoder

 Consistency decoder can be used to decode the latents from the denoising UNet in the [`StableDiffusionPipeline`]. This decoder was introduced in the [DALL-E 3 technical report](https://openai.com/dall-e-3).

 The original codebase can be found at [openai/consistencydecoder](https://github.com/openai/consistencydecoder).
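A hedged sketch of how the consistency decoder is typically swapped in as the pipeline VAE; the `ConsistencyDecoderVAE` class and the `openai/consistency-decoder` repo are assumptions based on the diffusers API, not stated in this hunk.

```python
import torch
from diffusers import ConsistencyDecoderVAE, StableDiffusionPipeline

# Replace the default VAE decoder with the consistency decoder.
vae = ConsistencyDecoderVAE.from_pretrained("openai/consistency-decoder", torch_dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", vae=vae, torch_dtype=torch.float16
).to("cuda")

image = pipe("a whimsical forest cottage, detailed").images[0]
```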
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -24,4 +24,4 @@ The abstract from the paper is:

 ## PriorTransformerOutput

-[[autodoc]] models.prior_transformer.PriorTransformerOutput
+[[autodoc]] models.transformers.prior_transformer.PriorTransformerOutput
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -38,4 +38,4 @@ It is assumed one of the input classes is the masked latent pixel. The predicted

 ## Transformer2DModelOutput

-[[autodoc]] models.transformer_2d.Transformer2DModelOutput
+[[autodoc]] models.transformers.transformer_2d.Transformer2DModelOutput
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -16,8 +16,8 @@ A Transformer model for video-like data.

 ## TransformerTemporalModel

-[[autodoc]] models.transformer_temporal.TransformerTemporalModel
+[[autodoc]] models.transformers.transformer_temporal.TransformerTemporalModel

 ## TransformerTemporalModelOutput

-[[autodoc]] models.transformer_temporal.TransformerTemporalModelOutput
+[[autodoc]] models.transformers.transformer_temporal.TransformerTemporalModelOutput
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -22,4 +22,4 @@ The abstract from the paper is:
 [[autodoc]] UNetMotionModel

 ## UNet3DConditionOutput
-[[autodoc]] models.unet_3d_condition.UNet3DConditionOutput
+[[autodoc]] models.unets.unet_3d_condition.UNet3DConditionOutput
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -22,4 +22,4 @@ The abstract from the paper is:
 [[autodoc]] UNet1DModel

 ## UNet1DOutput
-[[autodoc]] models.unet_1d.UNet1DOutput
+[[autodoc]] models.unets.unet_1d.UNet1DOutput
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -22,10 +22,10 @@ The abstract from the paper is:
 [[autodoc]] UNet2DConditionModel

 ## UNet2DConditionOutput
-[[autodoc]] models.unet_2d_condition.UNet2DConditionOutput
+[[autodoc]] models.unets.unet_2d_condition.UNet2DConditionOutput

 ## FlaxUNet2DConditionModel
-[[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionModel
+[[autodoc]] models.unets.unet_2d_condition_flax.FlaxUNet2DConditionModel

 ## FlaxUNet2DConditionOutput
-[[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionOutput
+[[autodoc]] models.unets.unet_2d_condition_flax.FlaxUNet2DConditionOutput
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -22,4 +22,4 @@ The abstract from the paper is:
 [[autodoc]] UNet2DModel

 ## UNet2DOutput
-[[autodoc]] models.unet_2d.UNet2DOutput
+[[autodoc]] models.unets.unet_2d.UNet2DOutput
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -22,4 +22,4 @@ The abstract from the paper is:
 [[autodoc]] UNet3DConditionModel

 ## UNet3DConditionOutput
-[[autodoc]] models.unet_3d_condition.UNet3DConditionOutput
+[[autodoc]] models.unets.unet_3d_condition.UNet3DConditionOutput
docs/source/en/api/models/uvit2d.md (new file, 39 lines)
@@ -0,0 +1,39 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# UVit2DModel
+
+The [U-ViT](https://hf.co/papers/2301.11093) model is a vision transformer (ViT) based UNet. This model incorporates elements from ViT (considers all inputs such as time, conditions and noisy image patches as tokens) and a UNet (long skip connections between the shallow and deep layers). The skip connection is important for predicting pixel-level features. An additional 3x3 convolutional block is applied prior to the final output to improve image quality.
+
+The abstract from the paper is:
+
+*Currently, applying diffusion models in pixel space of high resolution images is difficult. Instead, existing approaches focus on diffusion in lower dimensional spaces (latent diffusion), or have multiple super-resolution levels of generation referred to as cascades. The downside is that these approaches add additional complexity to the diffusion framework. This paper aims to improve denoising diffusion for high resolution images while keeping the model as simple as possible. The paper is centered around the research question: How can one train a standard denoising diffusion models on high resolution images, and still obtain performance comparable to these alternate approaches? The four main findings are: 1) the noise schedule should be adjusted for high resolution images, 2) It is sufficient to scale only a particular part of the architecture, 3) dropout should be added at specific locations in the architecture, and 4) downsampling is an effective strategy to avoid high resolution feature maps. Combining these simple yet effective techniques, we achieve state-of-the-art on image generation among diffusion models without sampling modifiers on ImageNet.*
+
+## UVit2DModel
+
+[[autodoc]] UVit2DModel
+
+## UVit2DConvEmbed
+
+[[autodoc]] models.unets.uvit_2d.UVit2DConvEmbed
+
+## UVitBlock
+
+[[autodoc]] models.unets.uvit_2d.UVitBlock
+
+## ConvNextBlock
+
+[[autodoc]] models.unets.uvit_2d.ConvNextBlock
+
+## ConvMlmLayer
+
+[[autodoc]] models.unets.uvit_2d.ConvMlmLayer
@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.

 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
| Pipeline | Tasks | Demo
|---|---|:---:|
| [AnimateDiffPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/animatediff/pipeline_animatediff.py) | *Text-to-Video Generation with AnimateDiff* |
| [AnimateDiffVideoToVideoPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/animatediff/pipeline_animatediff_video2video.py) | *Video-to-Video Generation with AnimateDiff* |

## Available checkpoints

Motion Adapter checkpoints can be found under [guoyww](https://huggingface.co/guoyww).

## Usage example

### AnimateDiffPipeline

AnimateDiff works with a MotionAdapter checkpoint and a Stable Diffusion model checkpoint. The MotionAdapter is a collection of Motion Modules that are responsible for adding coherent motion across image frames. These modules are applied after the ResNet and Attention blocks in the Stable Diffusion UNet.

The following example demonstrates how to use a *MotionAdapter* checkpoint with Diffusers for inference based on Stable Diffusion 1.4/1.5.
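The pipeline setup itself falls outside this diff hunk; a minimal sketch of it, assembled from the FreeInit example further down (which uses the same adapter, model, and scheduler settings), looks like this:

```python
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

# minimal sketch, mirroring the FreeInit example below
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE", motion_adapter=adapter, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, clip_sample=False, beta_schedule="linear",
    timestep_spacing="linspace", steps_offset=1,
)

output = pipe(
    prompt="masterpiece, best quality, sunset over a calm ocean",  # illustrative prompt
    negative_prompt="bad quality, worse quality",
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25,
    generator=torch.Generator("cpu").manual_seed(42),
)
export_to_gif(output.frames[0], "animation.gif")
```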
<Tip>

AnimateDiff tends to work better with finetuned Stable Diffusion models. If you plan on using a scheduler that can clip samples, make sure to disable it by setting `clip_sample=False` in the scheduler, as this can also have an adverse effect on generated samples.

</Tip>

### AnimateDiffVideoToVideoPipeline

AnimateDiff can also be used to generate visually similar videos, or to edit the style, characters, background, and more, starting from an initial video, letting you seamlessly explore creative possibilities.

```python
import imageio
import requests
import torch
from diffusers import AnimateDiffVideoToVideoPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif
from io import BytesIO
from PIL import Image

# Load the motion adapter
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)
# load SD 1.5 based finetuned model
model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffVideoToVideoPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16).to("cuda")
scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)
pipe.scheduler = scheduler

# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

# helper function to load videos
def load_video(file_path: str):
    images = []

    if file_path.startswith(('http://', 'https://')):
        # If the file_path is a URL
        response = requests.get(file_path)
        response.raise_for_status()
        content = BytesIO(response.content)
        vid = imageio.get_reader(content)
    else:
        # Assuming it's a local file path
        vid = imageio.get_reader(file_path)

    for frame in vid:
        pil_image = Image.fromarray(frame)
        images.append(pil_image)

    return images

video = load_video("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-input-1.gif")

output = pipe(
    video=video,
    prompt="panda playing a guitar, on a boat, in the ocean, high quality",
    negative_prompt="bad quality, worse quality",
    guidance_scale=7.5,
    num_inference_steps=25,
    strength=0.5,
    generator=torch.Generator("cpu").manual_seed(42),
)
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
```

Here are some sample outputs:

<table>
    <tr>
        <th align=center>Source Video</th>
        <th align=center>Output Video</th>
    </tr>
    <tr>
        <td align=center>
        raccoon playing a guitar
        <br />
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-input-1.gif" alt="raccoon playing a guitar" style="width: 300px;" />
        </td>
        <td align=center>
        panda playing a guitar
        <br/>
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-output-1.gif" alt="panda playing a guitar" style="width: 300px;" />
        </td>
    </tr>
    <tr>
        <td align=center>
        closeup of margot robbie, fireworks in the background, high quality
        <br />
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-input-2.gif" alt="closeup of margot robbie, fireworks in the background, high quality" style="width: 300px;" />
        </td>
        <td align=center>
        closeup of tony stark, robert downey jr, fireworks
        <br/>
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-output-2.gif" alt="closeup of tony stark, robert downey jr, fireworks" style="width: 300px;" />
        </td>
    </tr>
</table>

## Using Motion LoRAs

Motion LoRAs are a collection of LoRAs that work with the `guoyww/animatediff-motion-adapter-v1-5-2` checkpoint. These LoRAs are responsible for adding specific types of motion to the animations. A minimal sketch of loading one is shown below.
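The Motion LoRA usage example also falls outside this diff hunk; the following sketch shows the pattern. The `zoom-out` checkpoint is an assumption, and any Motion LoRA from the collection works the same way.

```python
# continuing from the AnimateDiffPipeline sketch above:
# a Motion LoRA is loaded on top of the motion adapter
pipe.load_lora_weights("guoyww/animatediff-motion-lora-zoom-out", adapter_name="zoom-out")

output = pipe(
    prompt="masterpiece, best quality, sunset over a calm ocean, zooming out",
    negative_prompt="bad quality, worse quality",
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25,
    generator=torch.Generator("cpu").manual_seed(42),
)
```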

## Using FreeInit

[FreeInit: Bridging Initialization Gap in Video Diffusion Models](https://arxiv.org/abs/2312.07537) by Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu.

FreeInit is an effective method that improves the temporal consistency and overall quality of videos generated with video diffusion models without any additional training. It can be applied to AnimateDiff, ModelScope, VideoCrafter, and various other video generation models seamlessly at inference time, and works by iteratively refining the latent initialization noise. More details can be found in the paper.

The following example demonstrates the usage of FreeInit.

```python
import torch
from diffusers import MotionAdapter, AnimateDiffPipeline, DDIMScheduler
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16).to("cuda")
pipe.scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    beta_schedule="linear",
    clip_sample=False,
    timestep_spacing="linspace",
    steps_offset=1
)

# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()

# enable FreeInit
# Refer to the enable_free_init documentation for a full list of configurable parameters
pipe.enable_free_init(method="butterworth", use_fast_sampling=True)

# run inference
output = pipe(
    prompt="a panda playing a guitar, on a boat, in the ocean, high quality",
    negative_prompt="bad quality, worse quality",
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=20,
    generator=torch.Generator("cpu").manual_seed(666),
)

# disable FreeInit
pipe.disable_free_init()

frames = output.frames[0]
export_to_gif(frames, "animation.gif")
```

<Tip warning={true}>

FreeInit is not really free - the improved quality comes at the cost of extra computation. It requires sampling a few extra times depending on the `num_iters` parameter that is set when enabling it. Setting the `use_fast_sampling` parameter to `True` can improve the overall performance (at the cost of lower quality compared to when `use_fast_sampling=False`, but still better results than vanilla video generation models).

</Tip>
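To make the cost/quality trade-off above explicit, here is a hedged sketch of enabling FreeInit with the parameters named in the warning; the specific values are illustrative, so consult the `enable_free_init` documentation for the defaults.

```python
# illustrative values only -- see the enable_free_init docs for defaults
pipe.enable_free_init(
    num_iters=3,             # each iteration repeats sampling, so runtime scales with this
    method="gaussian",       # noise-refinement filter; "butterworth" is used in the example above
    use_fast_sampling=True,  # coarse-to-fine sampling: faster, at some quality cost
)
```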
<Tip>

Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.

</Tip>

## Using AnimateLCM

[AnimateLCM](https://animatelcm.github.io/) is a motion module checkpoint and an [LCM LoRA](https://huggingface.co/docs/diffusers/using-diffusers/inference_with_lcm_lora) that have been created using a consistency learning strategy that decouples the distillation of the image generation priors and the motion generation priors.

```python
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")
pipe = AnimateDiffPipeline.from_pretrained("emilianJR/epiCRealism", motion_adapter=adapter)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")

pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="sd15_lora_beta.safetensors", adapter_name="lcm-lora")

pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

output = pipe(
    prompt="A space rocket with trails of smoke behind it launching into space from the desert, 4k, high resolution",
    negative_prompt="bad quality, worse quality, low resolution",
    num_frames=16,
    guidance_scale=1.5,
    num_inference_steps=6,
    generator=torch.Generator("cpu").manual_seed(0),
)
frames = output.frames[0]
export_to_gif(frames, "animatelcm.gif")
```

<table>
    <tr>
        <td><center>
        A space rocket, 4K.
        <br>
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatelcm-output.gif" alt="A space rocket, 4K" style="width: 300px;" />
        </center></td>
    </tr>
</table>

AnimateLCM is also compatible with existing [Motion LoRAs](https://huggingface.co/collections/dn6/animatediff-motion-loras-654cb8ad732b9e3cf4d3c17e).

```python
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")
pipe = AnimateDiffPipeline.from_pretrained("emilianJR/epiCRealism", motion_adapter=adapter)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")

pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="sd15_lora_beta.safetensors", adapter_name="lcm-lora")
pipe.load_lora_weights("guoyww/animatediff-motion-lora-tilt-up", adapter_name="tilt-up")

pipe.set_adapters(["lcm-lora", "tilt-up"], [1.0, 0.8])
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

output = pipe(
    prompt="A space rocket with trails of smoke behind it launching into space from the desert, 4k, high resolution",
    negative_prompt="bad quality, worse quality, low resolution",
    num_frames=16,
    guidance_scale=1.5,
    num_inference_steps=6,
    generator=torch.Generator("cpu").manual_seed(0),
)
frames = output.frames[0]
export_to_gif(frames, "animatelcm-motion-lora.gif")
```

<table>
    <tr>
        <td><center>
        A space rocket, 4K.
        <br>
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatelcm-motion-lora.gif" alt="A space rocket, 4K" style="width: 300px;" />
        </center></td>
    </tr>
</table>

## AnimateDiffPipeline

[[autodoc]] AnimateDiffPipeline
	- all
	- __call__

## AnimateDiffVideoToVideoPipeline

[[autodoc]] AnimateDiffVideoToVideoPipeline
	- all
	- __call__

## AnimateDiffPipelineOutput
docs/source/en/api/pipelines/i2vgenxl.md (new file, 57 lines)
# I2VGen-XL

[I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models](https://hf.co/papers/2311.04145.pdf) by Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qing, Xiang Wang, Deli Zhao, and Jingren Zhou.

The abstract from the paper is:

*Video synthesis has recently made remarkable strides benefiting from the rapid development of diffusion models. However, it still encounters challenges in terms of semantic accuracy, clarity and spatio-temporal continuity. They primarily arise from the scarcity of well-aligned text-video data and the complex inherent structure of videos, making it difficult for the model to simultaneously ensure semantic and qualitative excellence. In this report, we propose a cascaded I2VGen-XL approach that enhances model performance by decoupling these two factors and ensures the alignment of the input data by utilizing static images as a form of crucial guidance. I2VGen-XL consists of two stages: i) the base stage guarantees coherent semantics and preserves content from input images by using two hierarchical encoders, and ii) the refinement stage enhances the video's details by incorporating an additional brief text and improves the resolution to 1280×720. To improve the diversity, we collect around 35 million single-shot text-video pairs and 6 billion text-image pairs to optimize the model. By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos. Through extensive experiments, we have investigated the underlying principles of I2VGen-XL and compared it with current top methods, which can demonstrate its effectiveness on diverse data. The source code and models will be publicly available at [this https URL](https://i2vgen-xl.github.io/).*

The original codebase can be found [here](https://github.com/ali-vilab/i2vgen-xl/). The model checkpoints can be found [here](https://huggingface.co/ali-vilab/).

<Tip>

Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines. Also, to learn more about reducing the memory usage of this pipeline, refer to the ["Reduce memory usage"](../../using-diffusers/svd#reduce-memory-usage) section.

</Tip>

Sample output with I2VGenXL:

<table>
    <tr>
        <td><center>
        library.
        <br>
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/i2vgen-xl-example.gif" alt="library" style="width: 300px;" />
        </center></td>
    </tr>
</table>

## Notes

* I2VGenXL always uses a `clip_skip` value of 1. This means it leverages the penultimate layer representations from the text encoder of CLIP.
* It can generate videos of a quality that is often on par with [Stable Video Diffusion](../../using-diffusers/svd) (SVD).
* Unlike SVD, it additionally accepts text prompts as inputs.
* It can generate higher resolution videos.
* When using the [`DDIMScheduler`] (which is the default for this pipeline), fewer than 50 inference steps leads to bad results; a minimal usage sketch follows this list.
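The page as diffed here carries no usage snippet, so the following is a minimal sketch rather than the official example: the `ali-vilab/i2vgen-xl` repo id and its fp16 variant are assumptions based on the checkpoint link above, and the conditioning image is borrowed from the PIA example further down.

```python
import torch
from diffusers import I2VGenXLPipeline
from diffusers.utils import export_to_gif, load_image

# minimal sketch -- repo id and fp16 variant are assumptions, adjust as needed
pipeline = I2VGenXLPipeline.from_pretrained(
    "ali-vilab/i2vgen-xl", torch_dtype=torch.float16, variant="fp16"
)
pipeline.enable_model_cpu_offload()  # reduce memory usage

image = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/pix2pix/cat_6.png?download=true"
).convert("RGB")

frames = pipeline(
    prompt="a cat in a field, high quality",
    image=image,
    num_inference_steps=50,  # the default DDIMScheduler needs >= 50 steps (see Notes)
    guidance_scale=9.0,
    generator=torch.Generator("cpu").manual_seed(0),
).frames[0]
export_to_gif(frames, "i2v.gif")
```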
## I2VGenXLPipeline

[[autodoc]] I2VGenXLPipeline
	- all
	- __call__

## I2VGenXLPipelineOutput

[[autodoc]] pipelines.i2vgen_xl.pipeline_i2vgen_xl.I2VGenXLPipelineOutput
docs/source/en/api/pipelines/ledits_pp.md (new file, 54 lines)
# LEDITS++

LEDITS++ was proposed in [LEDITS++: Limitless Image Editing using Text-to-Image Models](https://huggingface.co/papers/2311.16711) by Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos.

The abstract from the paper is:

*Text-to-image diffusion models have recently received increasing interest for their astonishing ability to produce high-fidelity images from solely text inputs. Subsequent research efforts aim to exploit and apply their capabilities to real image editing. However, existing image-to-image methods are often inefficient, imprecise, and of limited versatility. They either require time-consuming fine-tuning, deviate unnecessarily strongly from the input image, and/or lack support for multiple, simultaneous edits. To address these issues, we introduce LEDITS++, an efficient yet versatile and precise textual image manipulation technique. LEDITS++'s novel inversion approach requires no tuning nor optimization and produces high-fidelity results with a few diffusion steps. Second, our methodology supports multiple simultaneous edits and is architecture-agnostic. Third, we use a novel implicit masking technique that limits changes to relevant image regions. We propose the novel TEdBench++ benchmark as part of our exhaustive evaluation. Our results demonstrate the capabilities of LEDITS++ and its improvements over previous methods. The project page is available at https://leditsplusplus-project.static.hf.space .*

<Tip>

You can find additional information about LEDITS++ on the [project page](https://leditsplusplus-project.static.hf.space/index.html) and try it out in a [demo](https://huggingface.co/spaces/editing-images/leditsplusplus).

</Tip>

<Tip warning={true}>

Due to some backward compatibility issues with the current diffusers implementation of [`~schedulers.DPMSolverMultistepScheduler`], this implementation of LEDITS++ can no longer guarantee perfect inversion. This issue is unlikely to have any noticeable effects on applied use cases. However, we provide an alternative implementation that guarantees perfect inversion in a dedicated [GitHub repo](https://github.com/ml-research/ledits_pp).

</Tip>

We provide two distinct pipelines based on different pre-trained models. A minimal sketch of the invert-then-edit workflow is shown below.
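Since the diffed page itself has no usage snippet, the following is a hedged sketch of the two-step workflow; the checkpoint id, the example image, and all parameter values are illustrative assumptions, not the official example.

```python
import torch
from diffusers import LEditsPPPipelineStableDiffusion
from diffusers.utils import load_image

# minimal sketch -- checkpoint and parameter values are illustrative assumptions
pipe = LEditsPPPipelineStableDiffusion.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/pix2pix/cat_6.png?download=true"
)

# 1) invert the real image into the model's latent space (no tuning or optimization)
_ = pipe.invert(image=image, num_inversion_steps=50, skip=0.1)

# 2) apply one or more simultaneous edits via editing prompts
edited = pipe(
    editing_prompt=["sunglasses"],  # concepts to add (or remove, when reversed)
    edit_guidance_scale=8.0,        # strength of the edit
    edit_threshold=0.75,            # implicit masking threshold limiting changed regions
).images[0]
edited.save("edited.png")
```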
## LEditsPPPipelineStableDiffusion

[[autodoc]] pipelines.ledits_pp.LEditsPPPipelineStableDiffusion
	- all
	- __call__
	- invert

## LEditsPPPipelineStableDiffusionXL

[[autodoc]] pipelines.ledits_pp.LEditsPPPipelineStableDiffusionXL
	- all
	- __call__
	- invert

## LEditsPPDiffusionPipelineOutput

[[autodoc]] pipelines.ledits_pp.pipeline_output.LEditsPPDiffusionPipelineOutput
	- all

## LEditsPPInversionPipelineOutput

[[autodoc]] pipelines.ledits_pp.pipeline_output.LEditsPPInversionPipelineOutput
	- all
The table below lists all the pipelines currently available in 🤗 Diffusers and the tasks they support:

| [Latent Consistency Models](latent_consistency_models) | text2image |
| [Latent Diffusion](latent_diffusion) | text2image, super-resolution |
| [LDM3D](stable_diffusion/ldm3d_diffusion) | text2image, text-to-3D, text-to-pano, upscaling |
| [LEDITS++](ledits_pp) | image editing |
| [MultiDiffusion](panorama) | text2image |
| [MusicLDM](musicldm) | text2audio |
| [Paint by Example](paint_by_example) | inpainting |
docs/source/en/api/pipelines/pia.md (new file, 167 lines)
# Image-to-Video Generation with PIA (Personalized Image Animator)

## Overview

[PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models](https://arxiv.org/abs/2312.13964) by Yiming Zhang, Zhening Xing, Yanhong Zeng, Youqing Fang, Kai Chen

Recent advancements in personalized text-to-image (T2I) models have revolutionized content creation, empowering non-experts to generate stunning images with unique styles. While promising, adding realistic motions into these personalized images by text poses significant challenges in preserving distinct styles, high-fidelity details, and achieving motion controllability by text. In this paper, we present PIA, a Personalized Image Animator that excels in aligning with condition images, achieving motion controllability by text, and compatibility with various personalized T2I models without specific tuning. To achieve these goals, PIA builds upon a base T2I model with well-trained temporal alignment layers, allowing for the seamless transformation of any personalized T2I model into an image animation model. A key component of PIA is the introduction of the condition module, which utilizes the condition frame and inter-frame affinity as input to transfer appearance information guided by the affinity hint for individual frame synthesis in the latent space. This design mitigates the challenges of appearance-related image alignment and allows for a stronger focus on aligning with motion-related guidance.

[Project page](https://pi-animator.github.io/)

## Available Pipelines

| Pipeline | Tasks | Demo
|---|---|:---:|
| [PIAPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/pia/pipeline_pia.py) | *Image-to-Video Generation with PIA* |

## Available checkpoints

Motion Adapter checkpoints for PIA can be found under the [OpenMMLab org](https://huggingface.co/openmmlab/PIA-condition-adapter). These checkpoints are meant to work with any model based on Stable Diffusion 1.5.

## Usage example

PIA works with a MotionAdapter checkpoint and a Stable Diffusion 1.5 model checkpoint. The MotionAdapter is a collection of Motion Modules that are responsible for adding coherent motion across image frames. These modules are applied after the ResNet and Attention blocks in the Stable Diffusion UNet. In addition to the motion modules, PIA also replaces the input convolution layer of the SD 1.5 UNet model with a 9-channel input convolution layer.

The following example demonstrates how to use PIA to generate a video from a single image.

```python
import torch
from diffusers import (
    EulerDiscreteScheduler,
    MotionAdapter,
    PIAPipeline,
)
from diffusers.utils import export_to_gif, load_image

adapter = MotionAdapter.from_pretrained("openmmlab/PIA-condition-adapter")
pipe = PIAPipeline.from_pretrained("SG161222/Realistic_Vision_V6.0_B1_noVAE", motion_adapter=adapter, torch_dtype=torch.float16)

pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

image = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/pix2pix/cat_6.png?download=true"
)
image = image.resize((512, 512))
prompt = "cat in a field"
negative_prompt = "wrong white balance, dark, sketches, worst quality, low quality"

generator = torch.Generator("cpu").manual_seed(0)
output = pipe(image=image, prompt=prompt, generator=generator)
frames = output.frames[0]
export_to_gif(frames, "pia-animation.gif")
```
Here are some sample outputs:

<table>
    <tr>
        <td><center>
        cat in a field.
        <br>
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/pia-default-output.gif" alt="cat in a field" style="width: 300px;" />
        </center></td>
    </tr>
</table>

<Tip>

If you plan on using a scheduler that can clip samples, make sure to disable it by setting `clip_sample=False` in the scheduler, as this can also have an adverse effect on generated samples. Additionally, the PIA checkpoints can be sensitive to the beta schedule of the scheduler. We recommend setting this to `linear`.

</Tip>
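To make the tip concrete, here is a minimal sketch of a scheduler configured accordingly; choosing `DDIMScheduler` is an assumption for illustration (it exposes `clip_sample`, and the FreeInit example below uses it as well).

```python
from diffusers import DDIMScheduler

# sketch: disable sample clipping and use a linear beta schedule, per the tip above
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config,
    clip_sample=False,
    beta_schedule="linear",
)
```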

## Using FreeInit

[FreeInit: Bridging Initialization Gap in Video Diffusion Models](https://arxiv.org/abs/2312.07537) by Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu.

FreeInit is an effective method that improves the temporal consistency and overall quality of videos generated with video diffusion models without any additional training. It can be applied to PIA, AnimateDiff, ModelScope, VideoCrafter, and various other video generation models seamlessly at inference time, and works by iteratively refining the latent initialization noise. More details can be found in the paper.

The following example demonstrates the usage of FreeInit.

```python
import torch
from diffusers import (
    DDIMScheduler,
    MotionAdapter,
    PIAPipeline,
)
from diffusers.utils import export_to_gif, load_image

adapter = MotionAdapter.from_pretrained("openmmlab/PIA-condition-adapter")
pipe = PIAPipeline.from_pretrained("SG161222/Realistic_Vision_V6.0_B1_noVAE", motion_adapter=adapter)

# enable FreeInit
# Refer to the enable_free_init documentation for a full list of configurable parameters
pipe.enable_free_init(method="butterworth", use_fast_sampling=True)

# Memory saving options
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
image = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/pix2pix/cat_6.png?download=true"
)
image = image.resize((512, 512))
prompt = "cat in a field"
negative_prompt = "wrong white balance, dark, sketches, worst quality, low quality"

generator = torch.Generator("cpu").manual_seed(0)

output = pipe(image=image, prompt=prompt, generator=generator)
frames = output.frames[0]
export_to_gif(frames, "pia-freeinit-animation.gif")
```
<table>
    <tr>
        <td><center>
        cat in a field.
        <br>
        <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/pia-freeinit-output-cat.gif" alt="cat in a field" style="width: 300px;" />
        </center></td>
    </tr>
</table>

<Tip warning={true}>

FreeInit is not really free - the improved quality comes at the cost of extra computation. It requires sampling a few extra times depending on the `num_iters` parameter that is set when enabling it. Setting the `use_fast_sampling` parameter to `True` can improve the overall performance (at the cost of lower quality compared to when `use_fast_sampling=False`, but still better results than vanilla video generation models).

</Tip>

## PIAPipeline

[[autodoc]] PIAPipeline
	- all
	- __call__
	- enable_freeu
	- disable_freeu
	- enable_free_init
	- disable_free_init
	- enable_vae_slicing
	- disable_vae_slicing
	- enable_vae_tiling
	- disable_vae_tiling

## PIAPipelineOutput

[[autodoc]] pipelines.pia.PIAPipelineOutput
	- all
	- __call__

## SemanticStableDiffusionPipelineOutput

[[autodoc]] pipelines.semantic_stable_diffusion.pipeline_output.SemanticStableDiffusionPipelineOutput
	- all
docs/source/en/api/pipelines/stable_cascade.md
Normal file
229
docs/source/en/api/pipelines/stable_cascade.md
Normal file
@@ -0,0 +1,229 @@
|
# Stable Cascade

This model is built upon the [Würstchen](https://openreview.net/forum?id=gU58d5QeGv) architecture and its main
difference to other models like Stable Diffusion is that it works in a much smaller latent space. Why is this
important? The smaller the latent space, the **faster** you can run inference and the **cheaper** the training becomes.
How small is the latent space? Stable Diffusion uses a compression factor of 8, resulting in a 1024x1024 image being
encoded to 128x128. Stable Cascade achieves a compression factor of 42, meaning that it is possible to encode a
1024x1024 image to 24x24, while maintaining crisp reconstructions. The text-conditional model is then trained in the
highly compressed latent space. Previous versions of this architecture achieved a 16x cost reduction over Stable
Diffusion 1.5.

Therefore, this kind of model is well suited for usages where efficiency is important. Furthermore, all known extensions
like finetuning, LoRA, ControlNet, IP-Adapter, LCM, etc. are possible with this method as well.

The original codebase can be found at [Stability-AI/StableCascade](https://github.com/Stability-AI/StableCascade).

## Model Overview

Stable Cascade consists of three models: Stage A, Stage B and Stage C, representing a cascade to generate images,
hence the name "Stable Cascade".

Stage A & B are used to compress images, similar to the job of the VAE in Stable Diffusion.
However, with this setup, a much higher compression of images can be achieved. While the Stable Diffusion models use a
spatial compression factor of 8, encoding an image with resolution of 1024 x 1024 to 128 x 128, Stable Cascade achieves
a compression factor of 42. This encodes a 1024 x 1024 image to 24 x 24, while being able to accurately decode the
image. This comes with the great benefit of cheaper training and inference. Furthermore, Stage C is responsible
for generating the small 24 x 24 latents given a text prompt.

The Stage C model operates on the small 24 x 24 latents and denoises the latents conditioned on text prompts. The model is also the largest component in the Cascade pipeline and is meant to be used with the `StableCascadePriorPipeline`.

The Stage B and Stage A models are used with the `StableCascadeDecoderPipeline` and are responsible for generating the final image given the small 24 x 24 latents.

<Tip warning={true}>

There are some restrictions on data types that can be used with the Stable Cascade models. The official checkpoints for the `StableCascadePriorPipeline` do not support the `torch.float16` data type. Please use `torch.bfloat16` instead.

In order to use the `torch.bfloat16` data type with the `StableCascadeDecoderPipeline` you need to have PyTorch 2.2.0 or higher installed. This also means that using the `StableCascadeCombinedPipeline` with `torch.bfloat16` requires PyTorch 2.2.0 or higher, since it calls the `StableCascadeDecoderPipeline` internally.

If it is not possible to install PyTorch 2.2.0 or higher in your environment, the `StableCascadeDecoderPipeline` can be used on its own with the `torch.float16` data type. You can download the full precision or `bf16` variant weights for the pipeline and cast the weights to `torch.float16`.

</Tip>
## Usage example
|
||||||
|
|
||||||
|
```python
import torch
from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline

prompt = "an image of a shiba inu, donning a spacesuit and helmet"
negative_prompt = ""

# The official prior checkpoint does not support float16, so load it in bfloat16.
prior = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=torch.bfloat16)
# The decoder can run in float16; its bf16 variant weights are cast at load time.
decoder = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.float16)

prior.enable_model_cpu_offload()
prior_output = prior(
    prompt=prompt,
    height=1024,
    width=1024,
    negative_prompt=negative_prompt,
    guidance_scale=4.0,
    num_images_per_prompt=1,
    num_inference_steps=20
)

decoder.enable_model_cpu_offload()
decoder_output = decoder(
    # Cast the prior's bfloat16 image embeddings to the decoder's float16 dtype.
    image_embeddings=prior_output.image_embeddings.to(torch.float16),
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=10
).images[0]
decoder_output.save("cascade.png")
```
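
If you prefer a single object that wires both stages together, the `StableCascadeCombinedPipeline` mentioned in the tip above wraps the prior and the decoder. A minimal sketch, assuming the combined pipeline loads from the same `stabilityai/stable-cascade` repository and relying on the default decoder guidance scale of 0.0 (note that `torch.bfloat16` here requires PyTorch 2.2.0 or higher, as explained in the tip):

```python
import torch
from diffusers import StableCascadeCombinedPipeline

# Wraps the prior (Stage C) and the decoder (Stages B & A) in a single pipeline.
pipe = StableCascadeCombinedPipeline.from_pretrained(
    "stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()

image = pipe(
    prompt="an image of a shiba inu, donning a spacesuit and helmet",
    negative_prompt="",
    width=1024,
    height=1024,
    prior_num_inference_steps=20,  # Stage C (prior) steps
    prior_guidance_scale=4.0,
    num_inference_steps=10,        # Stage B (decoder) steps
).images[0]
image.save("cascade-combined.png")
```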
## Using the Lite Versions of the Stage B and Stage C models
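
The lite versions are smaller variants of the Stage B and Stage C models. As the example below shows, they are loaded as standalone `StableCascadeUNet`s from the `prior_lite` and `decoder_lite` subfolders of the official repositories and then passed to the pipelines explicitly: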
```python
from diffusers import (
    StableCascadeDecoderPipeline,
    StableCascadePriorPipeline,
    StableCascadeUNet,
)

prompt = "an image of a shiba inu, donning a spacesuit and helmet"
negative_prompt = ""

# Load the lite UNets from their subfolders, then hand them to the pipelines.
prior_unet = StableCascadeUNet.from_pretrained("stabilityai/stable-cascade-prior", subfolder="prior_lite")
decoder_unet = StableCascadeUNet.from_pretrained("stabilityai/stable-cascade", subfolder="decoder_lite")

prior = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", prior=prior_unet)
decoder = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade", decoder=decoder_unet)

prior.enable_model_cpu_offload()
prior_output = prior(
    prompt=prompt,
    height=1024,
    width=1024,
    negative_prompt=negative_prompt,
    guidance_scale=4.0,
    num_images_per_prompt=1,
    num_inference_steps=20
)

decoder.enable_model_cpu_offload()
decoder_output = decoder(
    image_embeddings=prior_output.image_embeddings,
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=10
).images[0]
decoder_output.save("cascade.png")
```

## Loading original checkpoints with `from_single_file`

Loading checkpoints in the original format is supported via the `from_single_file` method of `StableCascadeUNet`.

```python
import torch
from diffusers import (
    StableCascadeDecoderPipeline,
    StableCascadePriorPipeline,
    StableCascadeUNet,
)

prompt = "an image of a shiba inu, donning a spacesuit and helmet"
negative_prompt = ""

# Load the Stage C (prior) and Stage B (decoder) UNets from the original single-file checkpoints.
prior_unet = StableCascadeUNet.from_single_file(
    "https://huggingface.co/stabilityai/stable-cascade/resolve/main/stage_c_bf16.safetensors",
    torch_dtype=torch.bfloat16
)
decoder_unet = StableCascadeUNet.from_single_file(
    "https://huggingface.co/stabilityai/stable-cascade/resolve/main/stage_b_bf16.safetensors",
    torch_dtype=torch.bfloat16
)

prior = StableCascadePriorPipeline.from_pretrained("stabilityai/stable-cascade-prior", prior=prior_unet, torch_dtype=torch.bfloat16)
decoder = StableCascadeDecoderPipeline.from_pretrained("stabilityai/stable-cascade", decoder=decoder_unet, torch_dtype=torch.bfloat16)

prior.enable_model_cpu_offload()
prior_output = prior(
    prompt=prompt,
    height=1024,
    width=1024,
    negative_prompt=negative_prompt,
    guidance_scale=4.0,
    num_images_per_prompt=1,
    num_inference_steps=20
)

decoder.enable_model_cpu_offload()
decoder_output = decoder(
    image_embeddings=prior_output.image_embeddings,
    prompt=prompt,
    negative_prompt=negative_prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=10
).images[0]
decoder_output.save("cascade-single-file.png")
```

## Uses

### Direct Use

The model is intended for research purposes for now. Possible research areas and tasks include:

- Research on generative models.
- Safe deployment of models which have the potential to generate harmful content.
- Probing and understanding the limitations and biases of generative models.
- Generation of artworks and use in design and other artistic processes.
- Applications in educational or creative tools.

Excluded uses are described below.

### Out-of-Scope Use

The model was not trained to produce factual or true representations of people or events,
so using it to generate such content is out of scope for this model's abilities.
The model should not be used in any way that violates Stability AI's [Acceptable Use Policy](https://stability.ai/use-policy).

## Limitations and Bias

### Limitations
- Faces and people in general may not be generated properly.
- The autoencoding part of the model is lossy.

## StableCascadeCombinedPipeline

[[autodoc]] StableCascadeCombinedPipeline
	- all
	- __call__

## StableCascadePriorPipeline

[[autodoc]] StableCascadePriorPipeline
	- all
	- __call__

## StableCascadePriorPipelineOutput

[[autodoc]] pipelines.stable_cascade.pipeline_stable_cascade_prior.StableCascadePriorPipelineOutput

## StableCascadeDecoderPipeline

[[autodoc]] StableCascadeDecoderPipeline
	- all
	- __call__