Compare commits

2 Commits

Author      SHA1        Message                                    Date
Sayak Paul  7ca6df34da  Merge branch 'main' into more-lora-fixes   2023-12-22 19:44:55 +05:30
Dhruv Nair  a9880092ce  update                                     2023-12-22 11:36:57 +00:00
1120 changed files with 35373 additions and 136020 deletions


@@ -66,32 +66,32 @@ body:
 Questions on DiffusionPipeline (Saving, Loading, From pretrained, ...):
 Questions on pipelines:
-- Stable Diffusion @yiyixuxu @DN6 @sayakpaul
-- Stable Diffusion XL @yiyixuxu @sayakpaul @DN6
-- Kandinsky @yiyixuxu
-- ControlNet @sayakpaul @yiyixuxu @DN6
-- T2I Adapter @sayakpaul @yiyixuxu @DN6
-- IF @DN6
-- Text-to-Video / Video-to-Video @DN6 @sayakpaul
-- Wuerstchen @DN6
+- Stable Diffusion @yiyixuxu @DN6 @sayakpaul @patrickvonplaten
+- Stable Diffusion XL @yiyixuxu @sayakpaul @DN6 @patrickvonplaten
+- Kandinsky @yiyixuxu @patrickvonplaten
+- ControlNet @sayakpaul @yiyixuxu @DN6 @patrickvonplaten
+- T2I Adapter @sayakpaul @yiyixuxu @DN6 @patrickvonplaten
+- IF @DN6 @patrickvonplaten
+- Text-to-Video / Video-to-Video @DN6 @sayakpaul @patrickvonplaten
+- Wuerstchen @DN6 @patrickvonplaten
 - Other: @yiyixuxu @DN6
 Questions on models:
-- UNet @DN6 @yiyixuxu @sayakpaul
-- VAE @sayakpaul @DN6 @yiyixuxu
-- Transformers/Attention @DN6 @yiyixuxu @sayakpaul @DN6
-Questions on Schedulers: @yiyixuxu
-Questions on LoRA: @sayakpaul
-Questions on Textual Inversion: @sayakpaul
+- UNet @DN6 @yiyixuxu @sayakpaul @patrickvonplaten
+- VAE @sayakpaul @DN6 @yiyixuxu @patrickvonplaten
+- Transformers/Attention @DN6 @yiyixuxu @sayakpaul @DN6 @patrickvonplaten
+Questions on Schedulers: @yiyixuxu @patrickvonplaten
+Questions on LoRA: @sayakpaul @patrickvonplaten
+Questions on Textual Inversion: @sayakpaul @patrickvonplaten
 Questions on Training:
-- DreamBooth @sayakpaul
-- Text-to-Image Fine-tuning @sayakpaul
-- Textual Inversion @sayakpaul
-- ControlNet @sayakpaul
+- DreamBooth @sayakpaul @patrickvonplaten
+- Text-to-Image Fine-tuning @sayakpaul @patrickvonplaten
+- Textual Inversion @sayakpaul @patrickvonplaten
+- ControlNet @sayakpaul @patrickvonplaten
 Questions on Tests: @DN6 @sayakpaul @yiyixuxu
@@ -99,7 +99,7 @@ body:
 Questions on JAX- and MPS-related things: @pcuenca
-Questions on audio pipelines: @DN6
+Questions on audio pipelines: @DN6 @patrickvonplaten


@@ -1,4 +1,4 @@
 contact_links:
-- name: Questions / Discussions
-url: https://github.com/huggingface/diffusers/discussions
+- name: Forum
+url: https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63
 about: General usage questions and community discussions


@@ -38,13 +38,13 @@ members/contributors who may be interested in your PR.
 Core library:
-- Schedulers: @yiyixuxu
-- Pipelines: @sayakpaul @yiyixuxu @DN6
-- Training examples: @sayakpaul
-- Docs: @stevhliu and @sayakpaul
+- Schedulers: @williamberman and @patrickvonplaten
+- Pipelines: @patrickvonplaten and @sayakpaul
+- Training examples: @sayakpaul and @patrickvonplaten
+- Docs: @stevhliu and @yiyixuxu
 - JAX and MPS: @pcuenca
 - Audio: @sanchit-gandhi
-- General functionalities: @sayakpaul @yiyixuxu @DN6
+- General functionalities: @patrickvonplaten and @sayakpaul
 Integrations:


@@ -1,7 +1,6 @@
 name: Benchmarking tests
 on:
-workflow_dispatch:
 schedule:
 - cron: "30 1 1,15 * *" # every 2 weeks on the 1st and the 15th of every month at 1:30 AM
@@ -31,15 +30,15 @@ jobs:
 nvidia-smi
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install pandas peft
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install pandas
 - name: Environment
 run: |
 python utils/print_env.py
 - name: Diffusers Benchmarking
 env:
-HF_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
 BASE_PATH: benchmark_outputs
 run: |
 export TOTAL_GPU_MEMORY=$(python -c "import torch; print(torch.cuda.get_device_properties(0).total_memory / (1024**3))")
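
For reference, the TOTAL_GPU_MEMORY export above unrolls into the following PyTorch snippet (a sketch; it assumes at least one visible CUDA device):

    import torch

    # Query device 0 and convert its total memory from bytes to GiB.
    props = torch.cuda.get_device_properties(0)
    total_gpu_memory = props.total_memory / (1024 ** 3)
    print(total_gpu_memory)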


@@ -1,57 +1,20 @@
-name: Test, build, and push Docker images
+name: Build Docker images (nightly)
 on:
-pull_request: # During PRs, we just check if the changes Dockerfiles can be successfully built
-branches:
-- main
-paths:
-- "docker/**"
 workflow_dispatch:
 schedule:
 - cron: "0 0 * * *" # every day at midnight
 concurrency:
-group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
-cancel-in-progress: true
+group: docker-image-builds
+cancel-in-progress: false
 env:
 REGISTRY: diffusers
-CI_SLACK_CHANNEL: ${{ secrets.CI_DOCKER_CHANNEL }}
 jobs:
-test-build-docker-images:
-runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
+build-docker-images:
+runs-on: ubuntu-latest
-if: github.event_name == 'pull_request'
-steps:
-- name: Set up Docker Buildx
-uses: docker/setup-buildx-action@v1
-- name: Check out code
-uses: actions/checkout@v3
-- name: Find Changed Dockerfiles
-id: file_changes
-uses: jitterbit/get-changed-files@v1
-with:
-format: 'space-delimited'
-token: ${{ secrets.GITHUB_TOKEN }}
-- name: Build Changed Docker Images
-run: |
-CHANGED_FILES="${{ steps.file_changes.outputs.all }}"
-for FILE in $CHANGED_FILES; do
-if [[ "$FILE" == docker/*Dockerfile ]]; then
-DOCKER_PATH="${FILE%/Dockerfile}"
-DOCKER_TAG=$(basename "$DOCKER_PATH")
-echo "Building Docker image for $DOCKER_TAG"
-docker build -t "$DOCKER_TAG" "$DOCKER_PATH"
-fi
-done
-if: steps.file_changes.outputs.all != ''
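
The removed PR-time job above rebuilds only images whose Dockerfile the pull request touched. A rough Python equivalent of that selection logic (a sketch; CHANGED_FILES stands in for the space-delimited output of the get-changed-files step and is not part of the workflow):

    import os
    import subprocess

    # Space-delimited list of changed paths, as produced by the previous step.
    changed_files = os.environ.get("CHANGED_FILES", "").split()

    for path in changed_files:
        # Only rebuild images whose Dockerfile changed, e.g. docker/<name>/Dockerfile.
        if path.startswith("docker/") and path.endswith("Dockerfile"):
            context = path.rsplit("/Dockerfile", 1)[0]
            tag = os.path.basename(context)
            print(f"Building Docker image for {tag}")
            subprocess.run(["docker", "build", "-t", tag, context], check=True)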
-build-and-push-docker-images:
-runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
-if: github.event_name != 'pull_request'
 permissions:
 contents: read
@@ -69,18 +32,17 @@ jobs:
 - diffusers-flax-tpu
 - diffusers-onnxruntime-cpu
 - diffusers-onnxruntime-cuda
-- diffusers-doc-builder
 steps:
 - name: Checkout repository
 uses: actions/checkout@v3
-- name: Set up Docker Buildx
-uses: docker/setup-buildx-action@v1
 - name: Login to Docker Hub
 uses: docker/login-action@v2
 with:
 username: ${{ env.REGISTRY }}
 password: ${{ secrets.DOCKERHUB_TOKEN }}
 - name: Build and push
 uses: docker/build-push-action@v3
 with:
@@ -88,14 +50,3 @@ jobs:
 context: ./docker/${{ matrix.image-name }}
 push: true
 tags: ${{ env.REGISTRY }}/${{ matrix.image-name }}:latest
-- name: Post to a Slack channel
-id: slack
-uses: huggingface/hf-workflows/.github/actions/post-slack@main
-with:
-# Slack channel id, channel name, or user id to post message.
-# See also: https://api.slack.com/methods/chat.postMessage#channels
-slack_channel: ${{ env.CI_SLACK_CHANNEL }}
-title: "🤗 Results of the ${{ matrix.image-name }} Docker Image build"
-status: ${{ job.status }}
-slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}


@@ -7,10 +7,6 @@ on:
 - doc-builder*
 - v*-release
 - v*-patch
-paths:
-- "src/diffusers/**.py"
-- "examples/**"
-- "docs/**"
 jobs:
 build:
@@ -21,7 +17,7 @@ jobs:
 package: diffusers
 notebook_folder: diffusers_doc
 languages: en ko zh ja pt
-custom_container: diffusers/diffusers-doc-builder
 secrets:
 token: ${{ secrets.HUGGINGFACE_PUSH }}
 hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}


@@ -2,10 +2,6 @@ name: Build PR Documentation
 on:
 pull_request:
-paths:
-- "src/diffusers/**.py"
-- "examples/**"
-- "docs/**"
 concurrency:
 group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -20,4 +16,3 @@ jobs:
 install_libgl1: true
 package: diffusers
 languages: en ko zh ja pt
-custom_container: diffusers/diffusers-doc-builder


@@ -0,0 +1,14 @@
+name: Delete doc comment
+on:
+workflow_run:
+workflows: ["Delete doc comment trigger"]
+types:
+- completed
+jobs:
+delete:
+uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
+secrets:
+comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}


@@ -0,0 +1,12 @@
+name: Delete doc comment trigger
+on:
+pull_request:
+types: [ closed ]
+jobs:
+delete:
+uses: huggingface/doc-builder/.github/workflows/delete_doc_comment_trigger.yml@main
+with:
+pr_number: ${{ github.event.number }}


@@ -1,7 +1,6 @@
-name: Nightly and release tests on main/release branch
+name: Nightly tests on main
 on:
-workflow_dispatch:
 schedule:
 - cron: "0 0 * * *" # every day at midnight
@@ -13,330 +12,95 @@ env:
 PYTEST_TIMEOUT: 600
 RUN_SLOW: yes
 RUN_NIGHTLY: yes
-PIPELINE_USAGE_CUTOFF: 5000
-SLACK_API_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
 jobs:
-setup_torch_cuda_pipeline_matrix:
-name: Setup Torch Pipelines Matrix
-runs-on: diffusers/diffusers-pytorch-cpu
-outputs:
-pipeline_test_matrix: ${{ steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix }}
-steps:
-- name: Checkout diffusers
-uses: actions/checkout@v3
-with:
-fetch-depth: 2
-- name: Set up Python
-uses: actions/setup-python@v4
-with:
-python-version: "3.8"
-- name: Install dependencies
-run: |
-pip install -e .
-pip install huggingface_hub
-- name: Fetch Pipeline Matrix
-id: fetch_pipeline_matrix
-run: |
-matrix=$(python utils/fetch_torch_cuda_pipeline_test_matrix.py)
-echo $matrix
-echo "pipeline_test_matrix=$matrix" >> $GITHUB_OUTPUT
-- name: Pipeline Tests Artifacts
-if: ${{ always() }}
-uses: actions/upload-artifact@v2
-with:
-name: test-pipelines.json
-path: reports
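
For context, utils/fetch_torch_cuda_pipeline_test_matrix.py prints a JSON list of pipeline test modules, which the step above appends to $GITHUB_OUTPUT so later jobs can expand it with fromJson. A minimal sketch of that output contract (the module names here are placeholders, not the script's real output):

    import json
    import os

    # Placeholder module list; the real script derives it from pipeline usage stats.
    matrix = ["stable_diffusion", "controlnet", "kandinsky"]

    line = f"pipeline_test_matrix={json.dumps(matrix)}"
    print(line)

    # Appending key=value to $GITHUB_OUTPUT exposes the value to later jobs
    # as steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix.
    with open(os.environ["GITHUB_OUTPUT"], "a") as f:
        f.write(line + "\n")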
-run_nightly_tests_for_torch_pipelines:
-name: Torch Pipelines CUDA Nightly Tests
-needs: setup_torch_cuda_pipeline_matrix
+run_nightly_tests:
 strategy:
 fail-fast: false
 matrix:
-module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
-container:
-image: diffusers/diffusers-pytorch-cuda
-options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
+config:
+- name: Nightly PyTorch CUDA tests on Ubuntu
+framework: pytorch
+runner: docker-gpu
+image: diffusers/diffusers-pytorch-cuda
+report: torch_cuda
+- name: Nightly Flax TPU tests on Ubuntu
+framework: flax
+runner: docker-tpu
+image: diffusers/diffusers-flax-tpu
+report: flax_tpu
+- name: Nightly ONNXRuntime CUDA tests on Ubuntu
+framework: onnxruntime
+runner: docker-gpu
+image: diffusers/diffusers-onnxruntime-cuda
+report: onnx_cuda
+name: ${{ matrix.config.name }}
+runs-on: ${{ matrix.config.runner }}
+container:
+image: ${{ matrix.config.image }}
+options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ ${{ matrix.config.runner == 'docker-tpu' && '--privileged' || '--gpus 0'}}
+defaults:
+run:
+shell: bash
 steps:
 - name: Checkout diffusers
 uses: actions/checkout@v3
 with:
 fetch-depth: 2
 - name: NVIDIA-SMI
-run: nvidia-smi
+if: ${{ matrix.config.runner == 'docker-gpu' }}
+run: |
+nvidia-smi
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
-python -m uv pip install pytest-reportlog
+python -m pip install -e .[quality,test]
+python -m pip install -U git+https://github.com/huggingface/transformers
+python -m pip install git+https://github.com/huggingface/accelerate
 - name: Environment
 run: |
 python utils/print_env.py
-- name: Nightly PyTorch CUDA checkpoint (pipelines) tests
+- name: Run nightly PyTorch CUDA tests
+if: ${{ matrix.config.framework == 'pytorch' }}
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
-# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
-CUBLAS_WORKSPACE_CONFIG: :16:8
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 run: |
 python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
 -s -v -k "not Flax and not Onnx" \
---make-reports=tests_pipeline_${{ matrix.module }}_cuda \
---report-log=tests_pipeline_${{ matrix.module }}_cuda.log \
-tests/pipelines/${{ matrix.module }}
+--make-reports=tests_${{ matrix.config.report }} \
+tests/
-- name: Failure short reports
-if: ${{ failure() }}
-run: |
-cat reports/tests_pipeline_${{ matrix.module }}_cuda_stats.txt
-cat reports/tests_pipeline_${{ matrix.module }}_cuda_failures_short.txt
-- name: Test suite reports artifacts
-if: ${{ always() }}
-uses: actions/upload-artifact@v2
-with:
-name: pipeline_${{ matrix.module }}_test_reports
-path: reports
-- name: Generate Report and Notify Channel
-if: always()
-run: |
-pip install slack_sdk tabulate
-python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
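
scripts/log_reports.py, installed alongside slack_sdk and tabulate just above, summarizes the pytest report logs and posts them to the channel configured via SLACK_API_TOKEN. A minimal sketch of that kind of post via slack_sdk (the channel name and message text are illustrative, not the script's real values):

    import os
    from slack_sdk import WebClient

    client = WebClient(token=os.environ["SLACK_API_TOKEN"])
    client.chat_postMessage(
        channel="#diffusers-ci-nightly",  # illustrative channel
        text="Nightly pipeline tests: 2 failures, see attached report",  # illustrative
    )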
-run_nightly_tests_for_other_torch_modules:
-name: Torch Non-Pipelines CUDA Nightly Tests
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
-container:
-image: diffusers/diffusers-pytorch-cuda
-options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
-defaults:
-run:
-shell: bash
-strategy:
-matrix:
-module: [models, schedulers, others, examples]
-steps:
-- name: Checkout diffusers
-uses: actions/checkout@v3
-with:
-fetch-depth: 2
-- name: Install dependencies
-run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
-python -m uv pip install pytest-reportlog
-- name: Environment
-run: python utils/print_env.py
-- name: Run nightly PyTorch CUDA tests for non-pipeline modules
-if: ${{ matrix.module != 'examples'}}
-env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
-# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
-CUBLAS_WORKSPACE_CONFIG: :16:8
-run: |
-python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
--s -v -k "not Flax and not Onnx" \
---make-reports=tests_torch_${{ matrix.module }}_cuda \
---report-log=tests_torch_${{ matrix.module }}_cuda.log \
-tests/${{ matrix.module }}
-- name: Run nightly example tests with Torch
-if: ${{ matrix.module == 'examples' }}
-env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
-# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
-CUBLAS_WORKSPACE_CONFIG: :16:8
-run: |
-python -m uv pip install peft@git+https://github.com/huggingface/peft.git
-python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
--s -v --make-reports=examples_torch_cuda \
---report-log=examples_torch_cuda.log \
-examples/
-- name: Failure short reports
-if: ${{ failure() }}
-run: |
-cat reports/tests_torch_${{ matrix.module }}_cuda_stats.txt
-cat reports/tests_torch_${{ matrix.module }}_cuda_failures_short.txt
-- name: Test suite reports artifacts
-if: ${{ always() }}
-uses: actions/upload-artifact@v2
-with:
-name: torch_${{ matrix.module }}_cuda_test_reports
-path: reports
-- name: Generate Report and Notify Channel
-if: always()
-run: |
-pip install slack_sdk tabulate
-python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
-run_lora_nightly_tests:
-name: Nightly LoRA Tests with PEFT and TORCH
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
-container:
-image: diffusers/diffusers-pytorch-cuda
-options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
-defaults:
-run:
-shell: bash
-steps:
-- name: Checkout diffusers
-uses: actions/checkout@v3
-with:
-fetch-depth: 2
-- name: Install dependencies
-run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
-python -m uv pip install peft@git+https://github.com/huggingface/peft.git
-python -m uv pip install pytest-reportlog
-- name: Environment
-run: python utils/print_env.py
-- name: Run nightly LoRA tests with PEFT and Torch
-env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
-# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
-CUBLAS_WORKSPACE_CONFIG: :16:8
-run: |
-python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
--s -v -k "not Flax and not Onnx" \
---make-reports=tests_torch_lora_cuda \
---report-log=tests_torch_lora_cuda.log \
-tests/lora
-- name: Failure short reports
-if: ${{ failure() }}
-run: |
-cat reports/tests_torch_lora_cuda_stats.txt
-cat reports/tests_torch_lora_cuda_failures_short.txt
-- name: Test suite reports artifacts
-if: ${{ always() }}
-uses: actions/upload-artifact@v2
-with:
-name: torch_lora_cuda_test_reports
-path: reports
-- name: Generate Report and Notify Channel
-if: always()
-run: |
-pip install slack_sdk tabulate
-python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
-run_flax_tpu_tests:
-name: Nightly Flax TPU Tests
-runs-on: docker-tpu
-if: github.event_name == 'schedule'
-container:
-image: diffusers/diffusers-flax-tpu
-options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --privileged
-defaults:
-run:
-shell: bash
-steps:
-- name: Checkout diffusers
-uses: actions/checkout@v3
-with:
-fetch-depth: 2
-- name: Install dependencies
-run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
-python -m uv pip install pytest-reportlog
-- name: Environment
-run: python utils/print_env.py
 - name: Run nightly Flax TPU tests
+if: ${{ matrix.config.framework == 'flax' }}
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 run: |
 python -m pytest -n 0 \
 -s -v -k "Flax" \
---make-reports=tests_flax_tpu \
---report-log=tests_flax_tpu.log \
+--make-reports=tests_${{ matrix.config.report }} \
 tests/
-- name: Failure short reports
-if: ${{ failure() }}
-run: |
-cat reports/tests_flax_tpu_stats.txt
-cat reports/tests_flax_tpu_failures_short.txt
-- name: Test suite reports artifacts
-if: ${{ always() }}
-uses: actions/upload-artifact@v2
-with:
-name: flax_tpu_test_reports
-path: reports
-- name: Generate Report and Notify Channel
-if: always()
-run: |
-pip install slack_sdk tabulate
-python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
-run_nightly_onnx_tests:
-name: Nightly ONNXRuntime CUDA tests on Ubuntu
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
-container:
-image: diffusers/diffusers-onnxruntime-cuda
-options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
-steps:
-- name: Checkout diffusers
-uses: actions/checkout@v3
-with:
-fetch-depth: 2
-- name: NVIDIA-SMI
-run: nvidia-smi
-- name: Install dependencies
-run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
-python -m uv pip install pytest-reportlog
-- name: Environment
-run: python utils/print_env.py
 - name: Run nightly ONNXRuntime CUDA tests
+if: ${{ matrix.config.framework == 'onnxruntime' }}
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 run: |
 python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
 -s -v -k "Onnx" \
---make-reports=tests_onnx_cuda \
---report-log=tests_onnx_cuda.log \
+--make-reports=tests_${{ matrix.config.report }} \
 tests/
 - name: Failure short reports
 if: ${{ failure() }}
-run: |
-cat reports/tests_onnx_cuda_stats.txt
-cat reports/tests_onnx_cuda_failures_short.txt
+run: cat reports/tests_${{ matrix.config.report }}_failures_short.txt
 - name: Test suite reports artifacts
 if: ${{ always() }}
@@ -345,16 +109,9 @@ jobs:
 name: ${{ matrix.config.report }}_test_reports
 path: reports
-- name: Generate Report and Notify Channel
-if: always()
-run: |
-pip install slack_sdk tabulate
-python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
 run_nightly_tests_apple_m1:
 name: Nightly PyTorch MPS tests on MacOS
 runs-on: [ self-hosted, apple-m1 ]
-if: github.event_name == 'schedule'
 steps:
 - name: Checkout diffusers
@@ -375,11 +132,10 @@ jobs:
 - name: Install dependencies
 shell: arch -arch arm64 bash {0}
 run: |
-${CONDA_RUN} python -m pip install --upgrade pip uv
-${CONDA_RUN} python -m uv pip install -e [quality,test]
-${CONDA_RUN} python -m uv pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
-${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate
-${CONDA_RUN} python -m uv pip install pytest-reportlog
+${CONDA_RUN} python -m pip install --upgrade pip
+${CONDA_RUN} python -m pip install -e .[quality,test]
+${CONDA_RUN} python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
+${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate
 - name: Environment
 shell: arch -arch arm64 bash {0}
@@ -390,11 +146,9 @@ jobs:
 shell: arch -arch arm64 bash {0}
 env:
 HF_HOME: /System/Volumes/Data/mnt/cache
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 run: |
-${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
---report-log=tests_torch_mps.log \
-tests/
+${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps tests/
 - name: Failure short reports
 if: ${{ failure() }}
@@ -406,9 +160,3 @@ jobs:
 with:
 name: torch_mps_test_reports
 path: reports
-- name: Generate Report and Notify Channel
-if: always()
-run: |
-pip install slack_sdk tabulate
-python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
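
Several of the CUDA jobs above pin CUBLAS_WORKSPACE_CONFIG=:16:8, which pairs with PyTorch's deterministic-algorithms mode described in the linked randomness note. A minimal sketch of the pattern (it assumes the env var is already set before any CUDA work runs):

    import torch

    # CUBLAS_WORKSPACE_CONFIG=":16:8" (or ":4096:8") must be in the environment
    # so cuBLAS uses deterministic workspaces on CUDA >= 10.2.
    torch.use_deterministic_algorithms(True)  # raise on nondeterministic ops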


@@ -1,23 +0,0 @@
-name: Notify Slack about a release
-on:
-workflow_dispatch:
-release:
-types: [published]
-jobs:
-build:
-runs-on: ubuntu-latest
-steps:
-- uses: actions/checkout@v3
-- name: Setup Python
-uses: actions/setup-python@v4
-with:
-python-version: '3.8'
-- name: Notify Slack about the release
-env:
-SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
-run: pip install requests && python utils/notify_slack_about_release.py


@@ -4,8 +4,6 @@ on:
 pull_request:
 branches:
 - main
-paths:
-- "src/diffusers/**.py"
 push:
 branches:
 - main
@@ -25,12 +23,10 @@ jobs:
 python-version: "3.8"
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pip install --upgrade pip uv
-python -m uv pip install -e .
-python -m uv pip install pytest
+python -m pip install --upgrade pip
+pip install -e .
+pip install pytest
 - name: Check for soft dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 pytest tests/others/test_dependencies.py


@@ -4,8 +4,6 @@ on:
 pull_request:
 branches:
 - main
-paths:
-- "src/diffusers/**.py"
 push:
 branches:
 - main
@@ -25,14 +23,12 @@ jobs:
 python-version: "3.8"
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pip install --upgrade pip uv
-python -m uv pip install -e .
-python -m uv pip install "jax[cpu]>=0.2.16,!=0.3.2"
-python -m uv pip install "flax>=0.4.1"
-python -m uv pip install "jaxlib>=0.1.65"
-python -m uv pip install pytest
+python -m pip install --upgrade pip
+pip install -e .
+pip install "jax[cpu]>=0.2.16,!=0.3.2"
+pip install "flax>=0.4.1"
+pip install "jaxlib>=0.1.65"
+pip install pytest
 - name: Check for soft dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 pytest tests/others/test_dependencies.py

.github/workflows/pr_quality.yml (new file, 49 lines)

@@ -0,0 +1,49 @@
+name: Run code quality checks
+on:
+pull_request:
+branches:
+- main
+push:
+branches:
+- main
+concurrency:
+group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+cancel-in-progress: true
+jobs:
+check_code_quality:
+runs-on: ubuntu-latest
+steps:
+- uses: actions/checkout@v3
+- name: Set up Python
+uses: actions/setup-python@v4
+with:
+python-version: "3.8"
+- name: Install dependencies
+run: |
+python -m pip install --upgrade pip
+pip install .[quality]
+- name: Check quality
+run: |
+ruff check examples tests src utils scripts
+ruff format examples tests src utils scripts --check
+check_repository_consistency:
+runs-on: ubuntu-latest
+steps:
+- uses: actions/checkout@v3
+- name: Set up Python
+uses: actions/setup-python@v4
+with:
+python-version: "3.8"
+- name: Install dependencies
+run: |
+python -m pip install --upgrade pip
+pip install .[quality]
+- name: Check quality
+run: |
+python utils/check_copies.py
+python utils/check_dummies.py
+make deps_table_check_updated


@@ -15,7 +15,7 @@ concurrency:
 jobs:
 setup_pr_tests:
 name: Setup PR Tests
-runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runs-on: docker-cpu
 container:
 image: diffusers/diffusers-pytorch-cpu
 options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
@@ -32,8 +32,8 @@ jobs:
 fetch-depth: 0
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
 - name: Environment
 run: |
 python utils/print_env.py
@@ -73,7 +73,7 @@ jobs:
 max-parallel: 2
 matrix:
 modules: ${{ fromJson(needs.setup_pr_tests.outputs.matrix) }}
-runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runs-on: docker-cpu
 container:
 image: diffusers/diffusers-pytorch-cpu
 options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
@@ -88,18 +88,16 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pip install -e [quality,test]
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
 python -m pip install accelerate
 - name: Environment
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 python utils/print_env.py
 - name: Run all selected tests on CPU
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 python -m pytest -n 2 --dist=loadfile -v --make-reports=${{ matrix.modules }}_tests_cpu ${{ fromJson(needs.setup_pr_tests.outputs.test_map)[matrix.modules] }}
 - name: Failure short reports
@@ -123,7 +121,7 @@ jobs:
 config:
 - name: Hub tests for models, schedulers, and pipelines
 framework: hub_tests_pytorch
-runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runner: docker-cpu
 image: diffusers/diffusers-pytorch-cpu
 report: torch_hub
@@ -145,18 +143,16 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pip install -e [quality,test]
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
 - name: Environment
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 python utils/print_env.py
 - name: Run Hub tests for models, schedulers, and pipelines on a staging env
 if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 HUGGINGFACE_CO_STAGING=true python -m pytest \
 -m "is_staging_test" \
 --make-reports=tests_${{ matrix.config.report }} \


@@ -4,9 +4,6 @@ on:
 pull_request:
 branches:
 - main
-paths:
-- "src/diffusers/**.py"
-- "tests/**.py"
 concurrency:
 group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -19,50 +16,7 @@ env:
 PYTEST_TIMEOUT: 60
 jobs:
-check_code_quality:
-runs-on: ubuntu-latest
-steps:
-- uses: actions/checkout@v3
-- name: Set up Python
-uses: actions/setup-python@v4
-with:
-python-version: "3.8"
-- name: Install dependencies
-run: |
-python -m pip install --upgrade pip
-pip install .[quality]
-- name: Check quality
-run: make quality
-- name: Check if failure
-if: ${{ failure() }}
-run: |
-echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
-check_repository_consistency:
-needs: check_code_quality
-runs-on: ubuntu-latest
-steps:
-- uses: actions/checkout@v3
-- name: Set up Python
-uses: actions/setup-python@v4
-with:
-python-version: "3.8"
-- name: Install dependencies
-run: |
-python -m pip install --upgrade pip
-pip install .[quality]
-- name: Check repo consistency
-run: |
-python utils/check_copies.py
-python utils/check_dummies.py
-make deps_table_check_updated
-- name: Check if failure
-if: ${{ failure() }}
-run: |
-echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
 run_fast_tests:
-needs: [check_code_quality, check_repository_consistency]
 strategy:
 fail-fast: false
 matrix:
@@ -71,7 +25,7 @@ jobs:
 name: LoRA - ${{ matrix.lib-versions }}
-runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runs-on: docker-cpu
 container:
 image: diffusers/diffusers-pytorch-cpu
@@ -89,25 +43,23 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
 if [ "${{ matrix.lib-versions }}" == "main" ]; then
-python -m pip install -U peft@git+https://github.com/huggingface/peft.git
-python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git
-python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
+python -m pip install -U git+https://github.com/huggingface/peft.git
+python -m pip install -U git+https://github.com/huggingface/transformers.git
+python -m pip install -U git+https://github.com/huggingface/accelerate.git
 else
-python -m uv pip install -U peft transformers accelerate
+python -m pip install -U peft transformers accelerate
 fi
 - name: Environment
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 python utils/print_env.py
 - name: Run fast PyTorch LoRA CPU tests with PEFT backend
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
+python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
 -s -v \
 --make-reports=tests_${{ matrix.config.report }} \
-tests/lora/
+tests/lora/test_lora_layers_peft.py


@@ -4,14 +4,6 @@ on:
 pull_request:
 branches:
 - main
-paths:
-- "src/diffusers/**.py"
-- "benchmarks/**.py"
-- "examples/**.py"
-- "scripts/**.py"
-- "tests/**.py"
-- ".github/**.yml"
-- "utils/**.py"
 push:
 branches:
 - ci-*
@@ -27,72 +19,34 @@ env:
 PYTEST_TIMEOUT: 60
 jobs:
-check_code_quality:
-runs-on: ubuntu-latest
-steps:
-- uses: actions/checkout@v3
-- name: Set up Python
-uses: actions/setup-python@v4
-with:
-python-version: "3.8"
-- name: Install dependencies
-run: |
-python -m pip install --upgrade pip
-pip install .[quality]
-- name: Check quality
-run: make quality
-- name: Check if failure
-if: ${{ failure() }}
-run: |
-echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
-check_repository_consistency:
-needs: check_code_quality
-runs-on: ubuntu-latest
-steps:
-- uses: actions/checkout@v3
-- name: Set up Python
-uses: actions/setup-python@v4
-with:
-python-version: "3.8"
-- name: Install dependencies
-run: |
-python -m pip install --upgrade pip
-pip install .[quality]
-- name: Check repo consistency
-run: |
-python utils/check_copies.py
-python utils/check_dummies.py
-make deps_table_check_updated
-- name: Check if failure
-if: ${{ failure() }}
-run: |
-echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
 run_fast_tests:
-needs: [check_code_quality, check_repository_consistency]
 strategy:
 fail-fast: false
 matrix:
 config:
 - name: Fast PyTorch Pipeline CPU tests
 framework: pytorch_pipelines
-runner: [ self-hosted, intel-cpu, 32-cpu, 256-ram, ci ]
+runner: docker-cpu
 image: diffusers/diffusers-pytorch-cpu
 report: torch_cpu_pipelines
 - name: Fast PyTorch Models & Schedulers CPU tests
 framework: pytorch_models
-runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runner: docker-cpu
 image: diffusers/diffusers-pytorch-cpu
 report: torch_cpu_models_schedulers
+- name: LoRA
+framework: lora
+runner: docker-cpu
+image: diffusers/diffusers-pytorch-cpu
+report: torch_cpu_lora
 - name: Fast Flax CPU tests
 framework: flax
-runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runner: docker-cpu
 image: diffusers/diffusers-flax-cpu
 report: flax_cpu
 - name: PyTorch Example CPU tests
 framework: pytorch_examples
-runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runner: docker-cpu
 image: diffusers/diffusers-pytorch-cpu
 report: torch_example_cpu
@@ -116,20 +70,18 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install accelerate
 - name: Environment
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 python utils/print_env.py
 - name: Run fast PyTorch Pipeline CPU tests
 if: ${{ matrix.config.framework == 'pytorch_pipelines' }}
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pytest -n 8 --max-worker-restart=0 --dist=loadfile \
+python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
 -s -v -k "not Flax and not Onnx" \
 --make-reports=tests_${{ matrix.config.report }} \
 tests/pipelines
@@ -137,17 +89,23 @@ jobs:
 - name: Run fast PyTorch Model Scheduler CPU tests
 if: ${{ matrix.config.framework == 'pytorch_models' }}
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
+python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
 -s -v -k "not Flax and not Onnx and not Dependency" \
 --make-reports=tests_${{ matrix.config.report }} \
 tests/models tests/schedulers tests/others
+- name: Run fast PyTorch LoRA CPU tests
+if: ${{ matrix.config.framework == 'lora' }}
+run: |
+python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
+-s -v -k "not Flax and not Onnx and not Dependency" \
+--make-reports=tests_${{ matrix.config.report }} \
+tests/lora
 - name: Run fast Flax TPU tests
 if: ${{ matrix.config.framework == 'flax' }}
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
+python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
 -s -v -k "Flax" \
 --make-reports=tests_${{ matrix.config.report }} \
 tests
@@ -155,9 +113,8 @@ jobs:
 - name: Run example PyTorch CPU tests
 if: ${{ matrix.config.framework == 'pytorch_examples' }}
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install peft timm
-python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
+python -m pip install peft
+python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
 --make-reports=tests_${{ matrix.config.report }} \
 examples
@@ -173,14 +130,13 @@ jobs:
 path: reports
 run_staging_tests:
-needs: [check_code_quality, check_repository_consistency]
 strategy:
 fail-fast: false
 matrix:
 config:
 - name: Hub tests for models, schedulers, and pipelines
 framework: hub_tests_pytorch
-runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runner: docker-cpu
 image: diffusers/diffusers-pytorch-cpu
 report: torch_hub
@@ -204,18 +160,16 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
 - name: Environment
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 python utils/print_env.py
 - name: Run Hub tests for models, schedulers, and pipelines on a staging env
 if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 HUGGINGFACE_CO_STAGING=true python -m pytest \
 -m "is_staging_test" \
 --make-reports=tests_${{ matrix.config.report }} \


@@ -4,8 +4,6 @@ on:
 pull_request:
 branches:
 - main
-paths:
-- "src/diffusers/**.py"
 push:
 branches:
 - main
@@ -25,12 +23,10 @@ jobs:
 python-version: "3.8"
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m pip install --upgrade pip uv
-python -m uv pip install -e .
-python -m uv pip install torch torchvision torchaudio
-python -m uv pip install pytest
+python -m pip install --upgrade pip
+pip install -e .
+pip install torch torchvision torchaudio
+pip install pytest
 - name: Check for soft dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
 pytest tests/others/test_dependencies.py


@@ -4,10 +4,7 @@ on:
 push:
 branches:
 - main
-paths:
-- "src/diffusers/**.py"
-- "examples/**.py"
-- "tests/**.py"
 env:
 DIFFUSERS_IS_CI: yes
@@ -21,9 +18,10 @@ env:
 jobs:
 setup_torch_cuda_pipeline_matrix:
 name: Setup Torch Pipelines CUDA Slow Tests Matrix
-runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
+runs-on: docker-gpu
 container:
-image: diffusers/diffusers-pytorch-cpu
+image: diffusers/diffusers-pytorch-cpu # this is a CPU image, but we need it to fetch the matrix
+options: --shm-size "16gb" --ipc host
 outputs:
 pipeline_test_matrix: ${{ steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix }}
 steps:
@@ -33,17 +31,21 @@ jobs:
 fetch-depth: 2
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install git+https://github.com/huggingface/accelerate.git
 - name: Environment
 run: |
 python utils/print_env.py
 - name: Fetch Pipeline Matrix
 id: fetch_pipeline_matrix
 run: |
 matrix=$(python utils/fetch_torch_cuda_pipeline_test_matrix.py)
 echo $matrix
 echo "pipeline_test_matrix=$matrix" >> $GITHUB_OUTPUT
 - name: Pipeline Tests Artifacts
 if: ${{ always() }}
 uses: actions/upload-artifact@v2
@@ -56,13 +58,13 @@ jobs:
 needs: setup_torch_cuda_pipeline_matrix
 strategy:
 fail-fast: false
-max-parallel: 8
+max-parallel: 1
 matrix:
 module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
+runs-on: docker-gpu
 container:
 image: diffusers/diffusers-pytorch-cuda
-options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0 --privileged
+options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
 steps:
 - name: Checkout diffusers
 uses: actions/checkout@v3
@@ -71,23 +73,17 @@ jobs:
 - name: NVIDIA-SMI
 run: |
 nvidia-smi
-- name: Tailscale
-uses: huggingface/tailscale-action@v1
-with:
-authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
-slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
-slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install git+https://github.com/huggingface/accelerate.git
 - name: Environment
 run: |
 python utils/print_env.py
 - name: Slow PyTorch CUDA checkpoint tests on Ubuntu
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
 CUBLAS_WORKSPACE_CONFIG: :16:8
 run: |
@@ -95,12 +91,6 @@ jobs:
 -s -v -k "not Flax and not Onnx" \
 --make-reports=tests_pipeline_${{ matrix.module }}_cuda \
 tests/pipelines/${{ matrix.module }}
-- name: Tailscale Wait
-if: ${{ failure() || runner.debug == '1' }}
-uses: huggingface/tailscale-action@v1
-with:
-waitForSSH: true
-authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
 - name: Failure short reports
 if: ${{ failure() }}
 run: |
@@ -116,16 +106,16 @@ jobs:
 torch_cuda_tests:
 name: Torch CUDA Tests
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
+runs-on: docker-gpu
 container:
 image: diffusers/diffusers-pytorch-cuda
-options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
+options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
 defaults:
 run:
 shell: bash
 strategy:
 matrix:
-module: [models, schedulers, lora, others, single_file]
+module: [models, schedulers, lora, others]
 steps:
 - name: Checkout diffusers
 uses: actions/checkout@v3
@@ -134,9 +124,9 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install git+https://github.com/huggingface/accelerate.git
 - name: Environment
 run: |
@@ -144,7 +134,7 @@ jobs:
 - name: Run slow PyTorch CUDA tests
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
 CUBLAS_WORKSPACE_CONFIG: :16:8
 run: |
@@ -168,10 +158,10 @@ jobs:
 peft_cuda_tests:
 name: PEFT CUDA Tests
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
+runs-on: docker-gpu
 container:
 image: diffusers/diffusers-pytorch-cuda
-options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
+options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
 defaults:
 run:
 shell: bash
@@ -183,10 +173,10 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
-python -m pip install -U peft@git+https://github.com/huggingface/peft.git
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install git+https://github.com/huggingface/accelerate.git
+python -m pip install git+https://github.com/huggingface/peft.git
 - name: Environment
 run: |
@@ -194,12 +184,12 @@ jobs:
 - name: Run slow PEFT CUDA tests
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
 CUBLAS_WORKSPACE_CONFIG: :16:8
 run: |
 python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
--s -v -k "not Flax and not Onnx and not PEFTLoRALoading" \
+-s -v -k "not Flax and not Onnx" \
 --make-reports=tests_peft_cuda \
 tests/lora/
@@ -221,7 +211,7 @@ jobs:
 runs-on: docker-tpu
 container:
 image: diffusers/diffusers-flax-tpu
-options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --privileged
+options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --privileged
 defaults:
 run:
 shell: bash
@@ -233,9 +223,9 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install git+https://github.com/huggingface/accelerate.git
 - name: Environment
 run: |
@@ -243,7 +233,7 @@ jobs:
 - name: Run slow Flax TPU tests
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 run: |
 python -m pytest -n 0 \
 -s -v -k "Flax" \
@@ -265,10 +255,10 @@ jobs:
 onnx_cuda_tests:
 name: ONNX CUDA Tests
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
+runs-on: docker-gpu
 container:
 image: diffusers/diffusers-onnxruntime-cuda
-options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --gpus 0
+options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
 defaults:
 run:
 shell: bash
@@ -280,9 +270,9 @@ jobs:
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test]
-python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
+apt-get update && apt-get install libsndfile1-dev libgl1 -y
+python -m pip install -e .[quality,test]
+python -m pip install git+https://github.com/huggingface/accelerate.git
 - name: Environment
 run: |
@@ -290,7 +280,7 @@ jobs:
 - name: Run slow ONNXRuntime CUDA tests
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 run: |
 python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
 -s -v -k "Onnx" \
@@ -313,11 +303,11 @@ jobs:
 run_torch_compile_tests:
 name: PyTorch Compile CUDA tests
-runs-on: [single-gpu, nvidia-gpu, t4, ci]
+runs-on: docker-gpu
 container:
 image: diffusers/diffusers-pytorch-compile-cuda
-options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
+options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
 steps:
 - name: Checkout diffusers
@@ -330,14 +320,13 @@ jobs:
 nvidia-smi
 - name: Install dependencies
 run: |
-python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
-python -m uv pip install -e [quality,test,training]
+python -m pip install -e .[quality,test,training]
 - name: Environment
 run: |
 python utils/print_env.py
 - name: Run example tests on GPU
 env:
-HF_TOKEN: ${{ secrets.HF_TOKEN }}
+HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
 run: |
 python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
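
The -k "compile" filter above selects tests that exercise torch.compile against the library's modules. The core API under test, in miniature (a sketch; the module and shapes are arbitrary):

    import torch

    model = torch.nn.Linear(8, 8).cuda()
    compiled = torch.compile(model)  # JIT-compiles on first call
    out = compiled(torch.randn(2, 8, device="cuda"))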
- name: Failure short reports - name: Failure short reports
@@ -354,11 +343,11 @@ jobs:
run_xformers_tests: run_xformers_tests:
name: PyTorch xformers CUDA tests name: PyTorch xformers CUDA tests
runs-on: [single-gpu, nvidia-gpu, t4, ci] runs-on: docker-gpu
container: container:
image: diffusers/diffusers-pytorch-xformers-cuda image: diffusers/diffusers-pytorch-xformers-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
steps: steps:
- name: Checkout diffusers - name: Checkout diffusers
@@ -371,14 +360,13 @@ jobs:
nvidia-smi nvidia-smi
- name: Install dependencies - name: Install dependencies
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH" python -m pip install -e .[quality,test,training]
python -m uv pip install -e [quality,test,training]
- name: Environment - name: Environment
run: | run: |
python utils/print_env.py python utils/print_env.py
- name: Run example tests on GPU - name: Run example tests on GPU
env: env:
HF_TOKEN: ${{ secrets.HF_TOKEN }} HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: | run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/ python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/
- name: Failure short reports - name: Failure short reports
@@ -395,11 +383,11 @@ jobs:
run_examples_tests: run_examples_tests:
name: Examples PyTorch CUDA tests on Ubuntu name: Examples PyTorch CUDA tests on Ubuntu
runs-on: [single-gpu, nvidia-gpu, t4, ci] runs-on: docker-gpu
container: container:
image: diffusers/diffusers-pytorch-cuda image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
steps: steps:
- name: Checkout diffusers - name: Checkout diffusers
@@ -413,20 +401,16 @@ jobs:
- name: Install dependencies - name: Install dependencies
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH" python -m pip install -e .[quality,test,training]
python -m uv pip install -e [quality,test,training]
- name: Environment - name: Environment
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py python utils/print_env.py
- name: Run example tests on GPU - name: Run example tests on GPU
env: env:
HF_TOKEN: ${{ secrets.HF_TOKEN }} HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install timm
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v --make-reports=examples_torch_cuda examples/ python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v --make-reports=examples_torch_cuda examples/
- name: Failure short reports - name: Failure short reports
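For context on what the `-k "compile"` filter in the jobs above selects: a minimal sketch (not part of this diff) of the torch.compile pattern those CUDA tests exercise, assuming the `runwayml/stable-diffusion-v1-5` checkpoint and a CUDA device are available.

```python
# Minimal sketch of the pattern the "compile" CUDA tests exercise:
# compile the pipeline's UNet once, then run inference as usual.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.unet.to(memory_format=torch.channels_last)
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
image = pipe("an astronaut riding a horse", num_inference_steps=25).images[0]
```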

View File

@@ -4,10 +4,6 @@ on:
push: push:
branches: branches:
- main - main
paths:
- "src/diffusers/**.py"
- "examples/**.py"
- "tests/**.py"
concurrency: concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }} group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -29,22 +25,22 @@ jobs:
config: config:
- name: Fast PyTorch CPU tests on Ubuntu - name: Fast PyTorch CPU tests on Ubuntu
framework: pytorch framework: pytorch
runner: [ self-hosted, intel-cpu, 8-cpu, ci ] runner: docker-cpu
image: diffusers/diffusers-pytorch-cpu image: diffusers/diffusers-pytorch-cpu
report: torch_cpu report: torch_cpu
- name: Fast Flax CPU tests on Ubuntu - name: Fast Flax CPU tests on Ubuntu
framework: flax framework: flax
runner: [ self-hosted, intel-cpu, 8-cpu, ci ] runner: docker-cpu
image: diffusers/diffusers-flax-cpu image: diffusers/diffusers-flax-cpu
report: flax_cpu report: flax_cpu
- name: Fast ONNXRuntime CPU tests on Ubuntu - name: Fast ONNXRuntime CPU tests on Ubuntu
framework: onnxruntime framework: onnxruntime
runner: [ self-hosted, intel-cpu, 8-cpu, ci ] runner: docker-cpu
image: diffusers/diffusers-onnxruntime-cpu image: diffusers/diffusers-onnxruntime-cpu
report: onnx_cpu report: onnx_cpu
- name: PyTorch Example CPU tests on Ubuntu - name: PyTorch Example CPU tests on Ubuntu
framework: pytorch_examples framework: pytorch_examples
runner: [ self-hosted, intel-cpu, 8-cpu, ci ] runner: docker-cpu
image: diffusers/diffusers-pytorch-cpu image: diffusers/diffusers-pytorch-cpu
report: torch_example_cpu report: torch_example_cpu
@@ -68,19 +64,17 @@ jobs:
- name: Install dependencies - name: Install dependencies
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH" apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m uv pip install -e [quality,test] python -m pip install -e .[quality,test]
- name: Environment - name: Environment
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py python utils/print_env.py
- name: Run fast PyTorch CPU tests - name: Run fast PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch' }} if: ${{ matrix.config.framework == 'pytorch' }}
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH" python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \ -s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \ --make-reports=tests_${{ matrix.config.report }} \
tests/ tests/
@@ -88,8 +82,7 @@ jobs:
- name: Run fast Flax TPU tests - name: Run fast Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }} if: ${{ matrix.config.framework == 'flax' }}
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH" python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Flax" \ -s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \ --make-reports=tests_${{ matrix.config.report }} \
tests/ tests/
@@ -97,8 +90,7 @@ jobs:
- name: Run fast ONNXRuntime CPU tests - name: Run fast ONNXRuntime CPU tests
if: ${{ matrix.config.framework == 'onnxruntime' }} if: ${{ matrix.config.framework == 'onnxruntime' }}
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH" python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \ -s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \ --make-reports=tests_${{ matrix.config.report }} \
tests/ tests/
@@ -106,9 +98,8 @@ jobs:
- name: Run example PyTorch CPU tests - name: Run example PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch_examples' }} if: ${{ matrix.config.framework == 'pytorch_examples' }}
run: | run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH" python -m pip install peft
python -m uv pip install peft timm python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
--make-reports=tests_${{ matrix.config.report }} \ --make-reports=tests_${{ matrix.config.report }} \
examples examples

View File

@@ -4,9 +4,6 @@ on:
push: push:
branches: branches:
- main - main
paths:
- "src/diffusers/**.py"
- "tests/**.py"
env: env:
DIFFUSERS_IS_CI: yes DIFFUSERS_IS_CI: yes
@@ -23,7 +20,7 @@ concurrency:
jobs: jobs:
run_fast_tests_apple_m1: run_fast_tests_apple_m1:
name: Fast PyTorch MPS tests on MacOS name: Fast PyTorch MPS tests on MacOS
runs-on: macos-13-xlarge runs-on: [ self-hosted, apple-m1 ]
steps: steps:
- name: Checkout diffusers - name: Checkout diffusers
@@ -44,11 +41,11 @@ jobs:
- name: Install dependencies - name: Install dependencies
shell: arch -arch arm64 bash {0} shell: arch -arch arm64 bash {0}
run: | run: |
${CONDA_RUN} python -m pip install --upgrade pip uv ${CONDA_RUN} python -m pip install --upgrade pip
${CONDA_RUN} python -m uv pip install -e [quality,test] ${CONDA_RUN} python -m pip install -e .[quality,test]
${CONDA_RUN} python -m uv pip install torch torchvision torchaudio ${CONDA_RUN} python -m pip install torch torchvision torchaudio
${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git ${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate.git
${CONDA_RUN} python -m uv pip install transformers --upgrade ${CONDA_RUN} python -m pip install transformers --upgrade
- name: Environment - name: Environment
shell: arch -arch arm64 bash {0} shell: arch -arch arm64 bash {0}
@@ -59,7 +56,7 @@ jobs:
shell: arch -arch arm64 bash {0} shell: arch -arch arm64 bash {0}
env: env:
HF_HOME: /System/Volumes/Data/mnt/cache HF_HOME: /System/Volumes/Data/mnt/cache
HF_TOKEN: ${{ secrets.HF_TOKEN }} HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: | run: |
${CONDA_RUN} python -m pytest -n 0 -s -v --make-reports=tests_torch_mps tests/ ${CONDA_RUN} python -m pytest -n 0 -s -v --make-reports=tests_torch_mps tests/
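The macOS job above assumes a working Metal backend; a minimal sketch (not from the diff) of the MPS availability check PyTorch provides:

```python
# Minimal sketch: verify the Metal (MPS) backend the macOS jobs rely on.
import torch

if torch.backends.mps.is_available():
    device = torch.device("mps")
    x = torch.ones(2, 2, device=device)  # tiny smoke test on the Apple GPU
    print(x.sum().item())
else:
    print("MPS backend not available; tests would fall back to CPU or skip.")
```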

View File

@@ -1,81 +0,0 @@
# Adapted from https://blog.deepjyoti30.dev/pypi-release-github-action
name: PyPI release
on:
workflow_dispatch:
push:
tags:
- "*"
jobs:
find-and-checkout-latest-branch:
runs-on: ubuntu-latest
outputs:
latest_branch: ${{ steps.set_latest_branch.outputs.latest_branch }}
steps:
- name: Checkout Repo
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.8'
- name: Fetch latest branch
id: fetch_latest_branch
run: |
pip install -U requests packaging
LATEST_BRANCH=$(python utils/fetch_latest_release_branch.py)
echo "Latest branch: $LATEST_BRANCH"
echo "latest_branch=$LATEST_BRANCH" >> $GITHUB_ENV
- name: Set latest branch output
id: set_latest_branch
run: echo "::set-output name=latest_branch::${{ env.latest_branch }}"
release:
needs: find-and-checkout-latest-branch
runs-on: ubuntu-latest
steps:
- name: Checkout Repo
uses: actions/checkout@v3
with:
ref: ${{ needs.find-and-checkout-latest-branch.outputs.latest_branch }}
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -U setuptools wheel twine
pip install -U torch --index-url https://download.pytorch.org/whl/cpu
pip install -U transformers
- name: Build the dist files
run: python setup.py bdist_wheel && python setup.py sdist
- name: Publish to the test PyPI
env:
TWINE_USERNAME: ${{ secrets.TEST_PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.TEST_PYPI_PASSWORD }}
run: twine upload dist/* -r pypitest --repository-url=https://test.pypi.org/legacy/
- name: Test installing diffusers and importing
run: |
pip install diffusers && pip uninstall diffusers -y
pip install -i https://testpypi.python.org/pypi diffusers
python -c "from diffusers import __version__; print(__version__)"
python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('fusing/unet-ldm-dummy-update'); pipe()"
python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('hf-internal-testing/tiny-stable-diffusion-pipe', safety_checker=None); pipe('ah suh du')"
python -c "from diffusers import *"
- name: Publish to PyPI
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: twine upload dist/* -r pypi
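The "Test installing diffusers and importing" step in the deleted workflow above boils down to an import smoke test; here is the same check written as a single Python script rather than shell one-liners:

```python
# Minimal sketch of the post-publish smoke test: import the freshly
# installed wheel and run a tiny pipeline end to end.
from diffusers import DiffusionPipeline, __version__

print(__version__)
pipe = DiffusionPipeline.from_pretrained(
    "hf-internal-testing/tiny-stable-diffusion-pipe", safety_checker=None
)
pipe("ah suh du")
```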

View File

@@ -1,73 +0,0 @@
name: Check running SLOW tests from a PR (only GPU)
on:
workflow_dispatch:
inputs:
docker_image:
default: 'diffusers/diffusers-pytorch-cuda'
description: 'Name of the Docker image'
required: true
branch:
description: 'PR Branch to test on'
required: true
test:
description: 'Tests to run (e.g.: `tests/models`).'
required: true
env:
DIFFUSERS_IS_CI: yes
IS_GITHUB_CI: "1"
HF_HOME: /mnt/cache
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 600
RUN_SLOW: yes
jobs:
run_tests:
name: "Run a test on our runner from a PR"
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: ${{ github.event.inputs.docker_image }}
options: --gpus 0 --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Validate test files input
id: validate_test_files
env:
PY_TEST: ${{ github.event.inputs.test }}
run: |
if [[ ! "$PY_TEST" =~ ^tests/ ]]; then
echo "Error: The input string must start with 'tests/'."
exit 1
fi
if [[ ! "$PY_TEST" =~ ^tests/(models|pipelines) ]]; then
echo "Error: The input string must contain either 'models' or 'pipelines' after 'tests/'."
exit 1
fi
if [[ "$PY_TEST" == *";"* ]]; then
echo "Error: The input string must not contain ';'."
exit 1
fi
echo "$PY_TEST"
- name: Checkout PR branch
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.branch }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
- name: Install pytest
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install peft
- name: Run tests
env:
PY_TEST: ${{ github.event.inputs.test }}
run: |
pytest "$PY_TEST"

View File

@@ -1,46 +0,0 @@
name: SSH into runners
on:
workflow_dispatch:
inputs:
runner_type:
description: 'Type of runner to test (a10 or t4)'
required: true
docker_image:
description: 'Name of the Docker image'
required: true
env:
IS_GITHUB_CI: "1"
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache
DIFFUSERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
jobs:
ssh_runner:
name: "SSH"
runs-on: [single-gpu, nvidia-gpu, "${{ github.event.inputs.runner_type }}", ci]
container:
image: ${{ github.event.inputs.docker_image }}
options: --gpus all --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Tailscale # In order to be able to SSH when a test fails
uses: huggingface/tailscale-action@v1
with:
authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
waitForSSH: true

View File

@@ -1,30 +0,0 @@
name: Update Diffusers metadata
on:
workflow_dispatch:
push:
branches:
- main
- update_diffusers_metadata*
jobs:
update_metadata:
runs-on: ubuntu-22.04
defaults:
run:
shell: bash -l {0}
steps:
- uses: actions/checkout@v3
- name: Setup environment
run: |
pip install --upgrade pip
pip install datasets pandas
pip install .[torch]
- name: Update metadata
env:
HF_TOKEN: ${{ secrets.SAYAK_HF_TOKEN }}
run: |
python utils/update_metadata.py --commit_sha ${{ github.sha }}

View File

@@ -19,16 +19,6 @@ authors:
family-names: Rasul family-names: Rasul
- given-names: Mishig - given-names: Mishig
family-names: Davaadorj family-names: Davaadorj
- given-names: Dhruv
family-names: Nair
- given-names: Sayak
family-names: Paul
- given-names: Steven
family-names: Liu
- given-names: William
family-names: Berman
- given-names: Yiyi
family-names: Xu
- given-names: Thomas - given-names: Thomas
family-names: Wolf family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers' repository-code: 'https://github.com/huggingface/diffusers'

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -355,7 +355,7 @@ You will need basic `git` proficiency to be able to contribute to
manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference. Git](https://git-scm.com/book/en/v2) is a very good reference.
Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/42f25d601a910dceadaee6c44345896b4cfa9928/setup.py#L270)): Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/main/setup.py#L265)):
1. Fork the [repository](https://github.com/huggingface/diffusers) by 1. Fork the [repository](https://github.com/huggingface/diffusers) by
clicking on the 'Fork' button on the repository's page. This creates a copy of the code clicking on the 'Fork' button on the repository's page. This creates a copy of the code

View File

@@ -42,7 +42,6 @@ repo-consistency:
quality: quality:
ruff check $(check_dirs) setup.py ruff check $(check_dirs) setup.py
ruff format --check $(check_dirs) setup.py ruff format --check $(check_dirs) setup.py
doc-builder style src/diffusers docs/source --max_len 119 --check_only
python utils/check_doc_toc.py python utils/check_doc_toc.py
# Format source code automatically and check if there are any problems left that need manual fixing # Format source code automatically and check if there are any problems left that need manual fixing
@@ -56,7 +55,6 @@ extra_style_checks:
style: style:
ruff check $(check_dirs) setup.py --fix ruff check $(check_dirs) setup.py --fix
ruff format $(check_dirs) setup.py ruff format $(check_dirs) setup.py
doc-builder style src/diffusers docs/source --max_len 119
${MAKE} autogenerate_code ${MAKE} autogenerate_code
${MAKE} extra_style_checks ${MAKE} extra_style_checks

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -77,7 +77,7 @@ Please refer to the [How to use Stable Diffusion in Apple Silicon](https://huggi
## Quickstart ## Quickstart
Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 25.000+ checkpoints): Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 16000+ checkpoints):
```python ```python
from diffusers import DiffusionPipeline from diffusers import DiffusionPipeline
@@ -219,7 +219,7 @@ Also, say 👋 in our public Discord channel <a href="https://discord.gg/G7tWnz9
- https://github.com/deep-floyd/IF - https://github.com/deep-floyd/IF
- https://github.com/bentoml/BentoML - https://github.com/bentoml/BentoML
- https://github.com/bmaltais/kohya_ss - https://github.com/bmaltais/kohya_ss
- +11.000 other amazing GitHub repositories 💪 - +7000 other amazing GitHub repositories 💪
Thank you for using us ❤️. Thank you for using us ❤️.
@@ -238,7 +238,7 @@ We also want to thank @heejkoo for the very helpful overview of papers, code and
```bibtex ```bibtex
@misc{von-platen-etal-2022-diffusers, @misc{von-platen-etal-2022-diffusers,
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf}, author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
title = {Diffusers: State-of-the-art diffusion models}, title = {Diffusers: State-of-the-art diffusion models},
year = {2022}, year = {2022},
publisher = {GitHub}, publisher = {GitHub},

View File

@@ -141,7 +141,6 @@ class LCMLoRATextToImageBenchmark(TextToImageBenchmark):
super().__init__(args) super().__init__(args)
self.pipe.load_lora_weights(self.lora_id) self.pipe.load_lora_weights(self.lora_id)
self.pipe.fuse_lora() self.pipe.fuse_lora()
self.pipe.unload_lora_weights()
self.pipe.scheduler = LCMScheduler.from_config(self.pipe.scheduler.config) self.pipe.scheduler = LCMScheduler.from_config(self.pipe.scheduler.config)
def get_result_filepath(self, args): def get_result_filepath(self, args):
@@ -236,35 +235,6 @@ class InpaintingBenchmark(ImageToImageBenchmark):
) )
class IPAdapterTextToImageBenchmark(TextToImageBenchmark):
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png"
image = load_image(url)
def __init__(self, args):
pipe = self.pipeline_class.from_pretrained(args.ckpt, torch_dtype=torch.float16).to("cuda")
pipe.load_ip_adapter(
args.ip_adapter_id[0],
subfolder="models" if "sdxl" not in args.ip_adapter_id[1] else "sdxl_models",
weight_name=args.ip_adapter_id[1],
)
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
ip_adapter_image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class ControlNetBenchmark(TextToImageBenchmark): class ControlNetBenchmark(TextToImageBenchmark):
pipeline_class = StableDiffusionControlNetPipeline pipeline_class = StableDiffusionControlNetPipeline
aux_network_class = ControlNetModel aux_network_class = ControlNetModel

View File

@@ -1,32 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import IPAdapterTextToImageBenchmark # noqa: E402
IP_ADAPTER_CKPTS = {
"runwayml/stable-diffusion-v1-5": ("h94/IP-Adapter", "ip-adapter_sd15.bin"),
"stabilityai/stable-diffusion-xl-base-1.0": ("h94/IP-Adapter", "ip-adapter_sdxl.bin"),
}
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="runwayml/stable-diffusion-v1-5",
choices=list(IP_ADAPTER_CKPTS.keys()),
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
args.ip_adapter_id = IP_ADAPTER_CKPTS[args.ckpt]
benchmark_pipe = IPAdapterTextToImageBenchmark(args)
args.ckpt = f"{args.ckpt} (IP-Adapter)"
benchmark_pipe.benchmark(args)

View File

@@ -72,7 +72,7 @@ def main():
command += " --run_compile" command += " --run_compile"
run_command(command.split()) run_command(command.split())
elif file in ["benchmark_sd_inpainting.py", "benchmark_ip_adapters.py"]: elif file == "benchmark_sd_inpainting.py":
sdxl_ckpt = "stabilityai/stable-diffusion-xl-base-1.0" sdxl_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
command = f"python {file} --ckpt {sdxl_ckpt}" command = f"python {file} --ckpt {sdxl_ckpt}"
run_command(command.split()) run_command(command.split())

View File

@@ -1,51 +0,0 @@
FROM ubuntu:20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.10 \
python3-pip \
libgl1 \
zip \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
Jinja2 \
librosa \
numpy \
scipy \
tensorboard \
transformers \
matplotlib \
setuptools==69.5.1
CMD ["/bin/bash"]

View File

@@ -4,36 +4,32 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
curl \ curl \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
libgl1 \ python3.8 \
python3.10 \
python3-pip \ python3-pip \
python3.10-venv && \ python3.8-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
# follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container # follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m uv pip install --upgrade --no-cache-dir \ python3 -m pip install --upgrade --no-cache-dir \
clu \ clu \
"jax[cpu]>=0.2.16,!=0.3.2" \ "jax[cpu]>=0.2.16,!=0.3.2" \
"flax>=0.4.1" \ "flax>=0.4.1" \
"jaxlib>=0.1.65" && \ "jaxlib>=0.1.65" && \
python3 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \

View File

@@ -4,38 +4,34 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
curl \ curl \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
libgl1 \ python3.8 \
python3.10 \
python3-pip \ python3-pip \
python3.10-venv && \ python3.8-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
# follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container # follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
"jax[tpu]>=0.2.16,!=0.3.2" \ "jax[tpu]>=0.2.16,!=0.3.2" \
-f https://storage.googleapis.com/jax-releases/libtpu_releases.html && \ -f https://storage.googleapis.com/jax-releases/libtpu_releases.html && \
python3 -m uv pip install --upgrade --no-cache-dir \ python3 -m pip install --upgrade --no-cache-dir \
clu \ clu \
"flax>=0.4.1" \ "flax>=0.4.1" \
"jaxlib>=0.1.65" && \ "jaxlib>=0.1.65" && \
python3 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \

View File

@@ -4,36 +4,32 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
curl \ curl \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
libgl1 \ python3.8 \
python3.10 \
python3-pip \ python3-pip \
python3.10-venv && \ python3.8-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
torch==2.1.2 \ torch \
torchvision==0.16.2 \ torchvision \
torchaudio==2.1.2 \ torchaudio \
onnxruntime \ onnxruntime \
--extra-index-url https://download.pytorch.org/whl/cpu && \ --extra-index-url https://download.pytorch.org/whl/cpu && \
python3 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \

View File

@@ -1,39 +1,35 @@
FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04 FROM nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04
LABEL maintainer="Hugging Face" LABEL maintainer="Hugging Face"
LABEL repository="diffusers" LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
curl \ curl \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
libgl1 \ python3.8 \
python3.10 \
python3-pip \ python3-pip \
python3.10-venv && \ python3.8-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3.10 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
torch \ torch \
torchvision \ torchvision \
torchaudio \ torchaudio \
"onnxruntime-gpu>=1.13.1" \ "onnxruntime-gpu>=1.13.1" \
--extra-index-url https://download.pytorch.org/whl/cu117 && \ --extra-index-url https://download.pytorch.org/whl/cu117 && \
python3.10 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \

View File

@@ -4,11 +4,8 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
@@ -16,23 +13,24 @@ RUN apt install -y bash \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
libgl1 \ libgl1 \
python3.10 \ python3.9 \
python3.9-dev \
python3-pip \ python3-pip \
python3.10-venv && \ python3.9-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3.9 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3.9 -m pip install --no-cache-dir --upgrade pip && \
python3.10 -m uv pip install --no-cache-dir \ python3.9 -m pip install --no-cache-dir \
torch \ torch \
torchvision \ torchvision \
torchaudio \ torchaudio \
invisible_watermark && \ invisible_watermark && \
python3.10 -m pip install --no-cache-dir \ python3.9 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \
@@ -42,6 +40,7 @@ RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
numpy \ numpy \
scipy \ scipy \
tensorboard \ tensorboard \
transformers transformers \
omegaconf
CMD ["/bin/bash"] CMD ["/bin/bash"]

View File

@@ -4,36 +4,33 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
curl \ curl \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
python3.10 \ python3.8 \
python3-pip \ python3-pip \
libgl1 \ libgl1 \
python3.10-venv && \ python3.8-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3.10 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
torch \ torch \
torchvision \ torchvision \
torchaudio \ torchaudio \
invisible_watermark \ invisible_watermark \
--extra-index-url https://download.pytorch.org/whl/cpu && \ --extra-index-url https://download.pytorch.org/whl/cpu && \
python3.10 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \
@@ -43,6 +40,6 @@ RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
numpy \ numpy \
scipy \ scipy \
tensorboard \ tensorboard \
transformers matplotlib transformers
CMD ["/bin/bash"] CMD ["/bin/bash"]

View File

@@ -4,11 +4,8 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
@@ -16,23 +13,23 @@ RUN apt install -y bash \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
libgl1 \ libgl1 \
python3.10 \ python3.8 \
python3-pip \ python3-pip \
python3.10-venv && \ python3.8-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3.10 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
torch \ torch \
torchvision \ torchvision \
torchaudio \ torchaudio \
invisible_watermark && \ invisible_watermark && \
python3.10 -m pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \
@@ -43,6 +40,7 @@ RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
scipy \ scipy \
tensorboard \ tensorboard \
transformers \ transformers \
omegaconf \
pytorch-lightning pytorch-lightning
CMD ["/bin/bash"] CMD ["/bin/bash"]

View File

@@ -4,11 +4,8 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \ RUN apt update && \
&& apt-get install -y software-properties-common \ apt install -y bash \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \ build-essential \
git \ git \
git-lfs \ git-lfs \
@@ -16,23 +13,23 @@ RUN apt install -y bash \
ca-certificates \ ca-certificates \
libsndfile1-dev \ libsndfile1-dev \
libgl1 \ libgl1 \
python3.10 \ python3.8 \
python3-pip \ python3-pip \
python3.10-venv && \ python3.8-venv && \
rm -rf /var/lib/apt/lists rm -rf /var/lib/apt/lists
# make sure to use venv # make sure to use venv
RUN python3.10 -m venv /opt/venv RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH" ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py) # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3.10 -m pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
torch \ torch \
torchvision \ torchvision \
torchaudio \ torchaudio \
invisible_watermark && \ invisible_watermark && \
python3.10 -m uv pip install --no-cache-dir \ python3 -m pip install --no-cache-dir \
accelerate \ accelerate \
datasets \ datasets \
hf-doc-builder \ hf-doc-builder \
@@ -43,6 +40,7 @@ RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
scipy \ scipy \
tensorboard \ tensorboard \
transformers \ transformers \
omegaconf \
xformers xformers
CMD ["/bin/bash"] CMD ["/bin/bash"]

View File

@@ -1,5 +1,5 @@
<!--- <!---
Copyright 2024- The HuggingFace Team. All rights reserved. Copyright 2023- The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
@@ -242,10 +242,10 @@ Here's an example of a tuple return, comprising several objects:
``` ```
Returns: Returns:
`tuple(torch.Tensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs: `tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
- **loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.Tensor` of shape `(1,)` -- - **loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --
Total loss is the sum of the masked language modeling loss and the next sequence prediction (classification) loss. Total loss is the sum of the masked language modeling loss and the next sequence prediction (classification) loss.
- **prediction_scores** (`torch.Tensor` of shape `(batch_size, sequence_length, config.vocab_size)`) -- - **prediction_scores** (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --
Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax). Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
``` ```
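To see the convention in context, here is a hypothetical method (not from the guide) whose docstring follows the `torch.Tensor` style adopted on the new side of this diff:

```python
# Hypothetical example following the documented convention: annotate shapes
# and use `torch.Tensor` (the new style) rather than `torch.FloatTensor`.
import torch

def score_tokens(hidden_states: torch.Tensor) -> torch.Tensor:
    r"""
    Args:
        hidden_states (`torch.Tensor` of shape `(batch_size, sequence_length, dim)`):
            Hidden states produced by the encoder.

    Returns:
        `torch.Tensor` of shape `(batch_size, sequence_length)`: Per-token scores.
    """
    return hidden_states.mean(dim=-1)
```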

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -18,17 +18,18 @@
- local: tutorials/basic_training - local: tutorials/basic_training
title: Train a diffusion model title: Train a diffusion model
- local: tutorials/using_peft_for_inference - local: tutorials/using_peft_for_inference
title: Load LoRAs for inference title: Inference with PEFT
- local: tutorials/fast_diffusion
title: Accelerate inference of text-to-image diffusion models
title: Tutorials title: Tutorials
- sections: - sections:
- sections:
- local: using-diffusers/loading_overview
title: Overview
- local: using-diffusers/loading - local: using-diffusers/loading
title: Load pipelines title: Load pipelines, models, and schedulers
- local: using-diffusers/schedulers
title: Load and compare different schedulers
- local: using-diffusers/custom_pipeline_overview - local: using-diffusers/custom_pipeline_overview
title: Load community pipelines and components title: Load community pipelines and components
- local: using-diffusers/schedulers
title: Load schedulers and models
- local: using-diffusers/using_safetensors - local: using-diffusers/using_safetensors
title: Load safetensors title: Load safetensors
- local: using-diffusers/other-formats - local: using-diffusers/other-formats
@@ -37,8 +38,10 @@
title: Load adapters title: Load adapters
- local: using-diffusers/push_to_hub - local: using-diffusers/push_to_hub
title: Push files to the Hub title: Push files to the Hub
title: Load pipelines and adapters title: Loading & Hub
- sections: - sections:
- local: using-diffusers/pipeline_overview
title: Overview
- local: using-diffusers/unconditional_image_generation - local: using-diffusers/unconditional_image_generation
title: Unconditional image generation title: Unconditional image generation
- local: using-diffusers/conditional_image_generation - local: using-diffusers/conditional_image_generation
@@ -47,56 +50,56 @@
title: Image-to-image title: Image-to-image
- local: using-diffusers/inpaint - local: using-diffusers/inpaint
title: Inpainting title: Inpainting
- local: using-diffusers/text-img2vid
title: Text or image-to-video
- local: using-diffusers/depth2img - local: using-diffusers/depth2img
title: Depth-to-image title: Depth-to-image
title: Generative tasks title: Tasks
- sections: - sections:
- local: using-diffusers/overview_techniques - local: using-diffusers/textual_inversion_inference
title: Overview title: Textual inversion
- local: training/distributed_inference - local: training/distributed_inference
title: Distributed inference with multiple GPUs title: Distributed inference with multiple GPUs
- local: using-diffusers/merge_loras
title: Merge LoRAs
- local: using-diffusers/callback
title: Pipeline callbacks
- local: using-diffusers/reusing_seeds - local: using-diffusers/reusing_seeds
title: Reproducible pipelines title: Improve image quality with deterministic generation
- local: using-diffusers/image_quality - local: using-diffusers/control_brightness
title: Controlling image quality title: Control image brightness
- local: using-diffusers/weighted_prompts - local: using-diffusers/weighted_prompts
title: Prompt techniques title: Prompt weighting
title: Inference techniques - local: using-diffusers/freeu
- sections: title: Improve generation quality with FreeU
title: Techniques
- sections:
- local: using-diffusers/pipeline_overview
title: Overview
- local: using-diffusers/sdxl - local: using-diffusers/sdxl
title: Stable Diffusion XL title: Stable Diffusion XL
- local: using-diffusers/sdxl_turbo - local: using-diffusers/sdxl_turbo
title: SDXL Turbo title: SDXL Turbo
- local: using-diffusers/kandinsky - local: using-diffusers/kandinsky
title: Kandinsky title: Kandinsky
- local: using-diffusers/ip_adapter
title: IP-Adapter
- local: using-diffusers/controlnet - local: using-diffusers/controlnet
title: ControlNet title: ControlNet
- local: using-diffusers/t2i_adapter
title: T2I-Adapter
- local: using-diffusers/inference_with_lcm
title: Latent Consistency Model
- local: using-diffusers/textual_inversion_inference
title: Textual inversion
- local: using-diffusers/shap-e - local: using-diffusers/shap-e
title: Shap-E title: Shap-E
- local: using-diffusers/diffedit - local: using-diffusers/diffedit
title: DiffEdit title: DiffEdit
- local: using-diffusers/inference_with_tcd_lora - local: using-diffusers/distilled_sd
title: Trajectory Consistency Distillation-LoRA title: Distilled Stable Diffusion inference
- local: using-diffusers/callback
title: Pipeline callbacks
- local: using-diffusers/reproducibility
title: Create reproducible pipelines
- local: using-diffusers/custom_pipeline_examples
title: Community pipelines
- local: using-diffusers/contribute_pipeline
title: Contribute a community pipeline
- local: using-diffusers/inference_with_lcm_lora
title: Latent Consistency Model-LoRA
- local: using-diffusers/inference_with_lcm
title: Latent Consistency Model
- local: using-diffusers/svd - local: using-diffusers/svd
title: Stable Video Diffusion title: Stable Video Diffusion
- local: using-diffusers/marigold_usage
title: Marigold Computer Vision
title: Specific pipeline examples title: Specific pipeline examples
- sections: - sections:
- local: training/overview - local: training/overview
title: Overview title: Overview
- local: training/create_dataset - local: training/create_dataset
@@ -121,7 +124,6 @@
- local: training/instructpix2pix - local: training/instructpix2pix
title: InstructPix2Pix title: InstructPix2Pix
title: Models title: Models
isExpanded: false
- sections: - sections:
- local: training/text_inversion - local: training/text_inversion
title: Textual Inversion title: Textual Inversion
@@ -136,9 +138,16 @@
- local: training/ddpo - local: training/ddpo
title: Reinforcement learning training with DDPO title: Reinforcement learning training with DDPO
title: Methods title: Methods
isExpanded: false
title: Training title: Training
- sections:
- local: using-diffusers/other-modalities
title: Other Modalities
title: Taking Diffusers Beyond Images
title: Using Diffusers
- sections: - sections:
- local: optimization/opt_overview
title: Overview
- sections:
- local: optimization/fp16 - local: optimization/fp16
title: Speed up inference title: Speed up inference
- local: optimization/memory - local: optimization/memory
@@ -149,10 +158,7 @@
title: xFormers title: xFormers
- local: optimization/tome - local: optimization/tome
title: Token merging title: Token merging
- local: optimization/deepcache title: General optimizations
title: DeepCache
- local: optimization/tgate
title: TGATE
- sections: - sections:
- local: using-diffusers/stable_diffusion_jax_how_to - local: using-diffusers/stable_diffusion_jax_how_to
title: JAX/Flax title: JAX/Flax
@@ -162,14 +168,14 @@
title: OpenVINO title: OpenVINO
- local: optimization/coreml - local: optimization/coreml
title: Core ML title: Core ML
title: Optimized model formats title: Optimized model types
- sections: - sections:
- local: optimization/mps - local: optimization/mps
title: Metal Performance Shaders (MPS) title: Metal Performance Shaders (MPS)
- local: optimization/habana - local: optimization/habana
title: Habana Gaudi title: Habana Gaudi
title: Optimized hardware title: Optimized hardware
title: Accelerate inference and reduce memory title: Optimization
- sections: - sections:
- local: conceptual/philosophy - local: conceptual/philosophy
title: Philosophy title: Philosophy
@@ -191,7 +197,6 @@
- local: api/outputs - local: api/outputs
title: Outputs title: Outputs
title: Main Classes title: Main Classes
isExpanded: false
- sections: - sections:
- local: api/loaders/ip_adapter - local: api/loaders/ip_adapter
title: IP-Adapter title: IP-Adapter
@@ -203,10 +208,7 @@
title: Textual Inversion title: Textual Inversion
- local: api/loaders/unet - local: api/loaders/unet
title: UNet title: UNet
- local: api/loaders/peft
title: PEFT
title: Loaders title: Loaders
isExpanded: false
- sections: - sections:
- local: api/models/overview - local: api/models/overview
title: Overview title: Overview
@@ -220,8 +222,6 @@
title: UNet3DConditionModel title: UNet3DConditionModel
- local: api/models/unet-motion - local: api/models/unet-motion
title: UNetMotionModel title: UNetMotionModel
- local: api/models/uvit2d
title: UViT2DModel
- local: api/models/vq - local: api/models/vq
title: VQModel title: VQModel
- local: api/models/autoencoderkl - local: api/models/autoencoderkl
@@ -234,12 +234,6 @@
title: ConsistencyDecoderVAE title: ConsistencyDecoderVAE
- local: api/models/transformer2d - local: api/models/transformer2d
title: Transformer2D title: Transformer2D
- local: api/models/pixart_transformer2d
title: PixArtTransformer2D
- local: api/models/dit_transformer2d
title: DiTTransformer2D
- local: api/models/hunyuan_transformer_2d
title: HunyuanDiT2DModel
- local: api/models/transformer_temporal - local: api/models/transformer_temporal
title: Transformer Temporal title: Transformer Temporal
- local: api/models/prior_transformer - local: api/models/prior_transformer
@@ -247,7 +241,6 @@
- local: api/models/controlnet - local: api/models/controlnet
title: ControlNet title: ControlNet
title: Models title: Models
isExpanded: false
- sections: - sections:
- local: api/pipelines/overview - local: api/pipelines/overview
title: Overview title: Overview
@@ -287,10 +280,6 @@
title: DiffEdit title: DiffEdit
- local: api/pipelines/dit - local: api/pipelines/dit
title: DiT title: DiT
- local: api/pipelines/hunyuandit
title: Hunyuan-DiT
- local: api/pipelines/i2vgenxl
title: I2VGen-XL
- local: api/pipelines/pix2pix - local: api/pipelines/pix2pix
title: InstructPix2Pix title: InstructPix2Pix
- local: api/pipelines/kandinsky - local: api/pipelines/kandinsky
@@ -303,30 +292,20 @@
title: Latent Consistency Models title: Latent Consistency Models
- local: api/pipelines/latent_diffusion - local: api/pipelines/latent_diffusion
title: Latent Diffusion title: Latent Diffusion
- local: api/pipelines/ledits_pp
title: LEDITS++
- local: api/pipelines/marigold
title: Marigold
- local: api/pipelines/panorama - local: api/pipelines/panorama
title: MultiDiffusion title: MultiDiffusion
- local: api/pipelines/musicldm - local: api/pipelines/musicldm
title: MusicLDM title: MusicLDM
- local: api/pipelines/paint_by_example - local: api/pipelines/paint_by_example
title: Paint by Example title: Paint by Example
- local: api/pipelines/pia
title: Personalized Image Animator (PIA)
- local: api/pipelines/pixart - local: api/pipelines/pixart
title: PixArt-α title: PixArt-α
- local: api/pipelines/pixart_sigma
title: PixArt-Σ
- local: api/pipelines/self_attention_guidance - local: api/pipelines/self_attention_guidance
title: Self-Attention Guidance title: Self-Attention Guidance
- local: api/pipelines/semantic_stable_diffusion - local: api/pipelines/semantic_stable_diffusion
title: Semantic Guidance title: Semantic Guidance
- local: api/pipelines/shap_e - local: api/pipelines/shap_e
title: Shap-E title: Shap-E
- local: api/pipelines/stable_cascade
title: Stable Cascade
- sections: - sections:
- local: api/pipelines/stable_diffusion/overview - local: api/pipelines/stable_diffusion/overview
title: Overview title: Overview
@@ -334,8 +313,6 @@
title: Text-to-image title: Text-to-image
- local: api/pipelines/stable_diffusion/img2img - local: api/pipelines/stable_diffusion/img2img
title: Image-to-image title: Image-to-image
- local: api/pipelines/stable_diffusion/svd
title: Image-to-video
- local: api/pipelines/stable_diffusion/inpaint - local: api/pipelines/stable_diffusion/inpaint
title: Inpainting title: Inpainting
- local: api/pipelines/stable_diffusion/depth2img - local: api/pipelines/stable_diffusion/depth2img
@@ -354,12 +331,10 @@
title: Latent upscaler title: Latent upscaler
- local: api/pipelines/stable_diffusion/upscale - local: api/pipelines/stable_diffusion/upscale
title: Super-resolution title: Super-resolution
- local: api/pipelines/stable_diffusion/k_diffusion
title: K-Diffusion
- local: api/pipelines/stable_diffusion/ldm3d_diffusion - local: api/pipelines/stable_diffusion/ldm3d_diffusion
title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
- local: api/pipelines/stable_diffusion/adapter - local: api/pipelines/stable_diffusion/adapter
title: T2I-Adapter title: Stable Diffusion T2I-Adapter
- local: api/pipelines/stable_diffusion/gligen - local: api/pipelines/stable_diffusion/gligen
title: GLIGEN (Grounded Language-to-Image Generation) title: GLIGEN (Grounded Language-to-Image Generation)
title: Stable Diffusion title: Stable Diffusion
@@ -378,7 +353,6 @@
- local: api/pipelines/wuerstchen - local: api/pipelines/wuerstchen
title: Wuerstchen title: Wuerstchen
title: Pipelines title: Pipelines
isExpanded: false
- sections: - sections:
- local: api/schedulers/overview - local: api/schedulers/overview
title: Overview title: Overview
@@ -402,10 +376,6 @@
title: DPMSolverSDEScheduler title: DPMSolverSDEScheduler
- local: api/schedulers/singlestep_dpm_solver - local: api/schedulers/singlestep_dpm_solver
title: DPMSolverSinglestepScheduler title: DPMSolverSinglestepScheduler
- local: api/schedulers/edm_multistep_dpm_solver
title: EDMDPMSolverMultistepScheduler
- local: api/schedulers/edm_euler
title: EDMEulerScheduler
- local: api/schedulers/euler_ancestral - local: api/schedulers/euler_ancestral
title: EulerAncestralDiscreteScheduler title: EulerAncestralDiscreteScheduler
- local: api/schedulers/euler - local: api/schedulers/euler
@@ -432,14 +402,11 @@
title: ScoreSdeVeScheduler title: ScoreSdeVeScheduler
- local: api/schedulers/score_sde_vp - local: api/schedulers/score_sde_vp
title: ScoreSdeVpScheduler title: ScoreSdeVpScheduler
- local: api/schedulers/tcd
title: TCDScheduler
- local: api/schedulers/unipc - local: api/schedulers/unipc
title: UniPCMultistepScheduler title: UniPCMultistepScheduler
- local: api/schedulers/vq_diffusion - local: api/schedulers/vq_diffusion
title: VQDiffusionScheduler title: VQDiffusionScheduler
title: Schedulers title: Schedulers
isExpanded: false
- sections: - sections:
- local: api/internal_classes_overview - local: api/internal_classes_overview
title: Overview title: Overview
@@ -453,8 +420,5 @@
title: Utilities title: Utilities
- local: api/image_processor - local: api/image_processor
title: VAE Image Processor title: VAE Image Processor
- local: api/video_processor
title: Video Processor
title: Internal classes title: Internal classes
isExpanded: false
title: API title: API

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -20,14 +20,14 @@ An attention processor is a class for applying different types of attention mech
## AttnProcessor2_0 ## AttnProcessor2_0
[[autodoc]] models.attention_processor.AttnProcessor2_0 [[autodoc]] models.attention_processor.AttnProcessor2_0
## AttnAddedKVProcessor ## FusedAttnProcessor2_0
[[autodoc]] models.attention_processor.AttnAddedKVProcessor [[autodoc]] models.attention_processor.FusedAttnProcessor2_0
## AttnAddedKVProcessor2_0 ## LoRAAttnProcessor
[[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0 [[autodoc]] models.attention_processor.LoRAAttnProcessor
## CrossFrameAttnProcessor ## LoRAAttnProcessor2_0
[[autodoc]] pipelines.text_to_video_synthesis.pipeline_text_to_video_zero.CrossFrameAttnProcessor [[autodoc]] models.attention_processor.LoRAAttnProcessor2_0
## CustomDiffusionAttnProcessor ## CustomDiffusionAttnProcessor
[[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor [[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor
@@ -35,26 +35,26 @@ An attention processor is a class for applying different types of attention mech
## CustomDiffusionAttnProcessor2_0 ## CustomDiffusionAttnProcessor2_0
[[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor2_0 [[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor2_0
## CustomDiffusionXFormersAttnProcessor ## AttnAddedKVProcessor
[[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor [[autodoc]] models.attention_processor.AttnAddedKVProcessor
## FusedAttnProcessor2_0 ## AttnAddedKVProcessor2_0
[[autodoc]] models.attention_processor.FusedAttnProcessor2_0 [[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0
## LoRAAttnAddedKVProcessor ## LoRAAttnAddedKVProcessor
[[autodoc]] models.attention_processor.LoRAAttnAddedKVProcessor [[autodoc]] models.attention_processor.LoRAAttnAddedKVProcessor
## XFormersAttnProcessor
[[autodoc]] models.attention_processor.XFormersAttnProcessor
## LoRAXFormersAttnProcessor ## LoRAXFormersAttnProcessor
[[autodoc]] models.attention_processor.LoRAXFormersAttnProcessor [[autodoc]] models.attention_processor.LoRAXFormersAttnProcessor
## CustomDiffusionXFormersAttnProcessor
[[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor
## SlicedAttnProcessor ## SlicedAttnProcessor
[[autodoc]] models.attention_processor.SlicedAttnProcessor [[autodoc]] models.attention_processor.SlicedAttnProcessor
## SlicedAttnAddedKVProcessor ## SlicedAttnAddedKVProcessor
[[autodoc]] models.attention_processor.SlicedAttnAddedKVProcessor [[autodoc]] models.attention_processor.SlicedAttnAddedKVProcessor
## XFormersAttnProcessor
[[autodoc]] models.attention_processor.XFormersAttnProcessor
## AttnProcessorNPU
[[autodoc]] models.attention_processor.AttnProcessorNPU
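These processors can be swapped onto a model at runtime. As a minimal sketch (the `runwayml/stable-diffusion-v1-5` checkpoint is just an illustrative choice), here is how one of them might be applied to a pipeline's UNet:

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.models.attention_processor import AttnProcessor2_0

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Route every attention layer through PyTorch 2.0 scaled dot-product attention
pipe.unet.set_attn_processor(AttnProcessor2_0())

image = pipe("an astronaut riding a horse on mars").images[0]
```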

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -25,11 +25,3 @@ All pipelines with [`VaeImageProcessor`] accept PIL Image, PyTorch tensor, or Nu
The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs. The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs.
[[autodoc]] image_processor.VaeImageProcessorLDM3D [[autodoc]] image_processor.VaeImageProcessorLDM3D
## PixArtImageProcessor
[[autodoc]] image_processor.PixArtImageProcessor
## IPAdapterMaskProcessor
[[autodoc]] image_processor.IPAdapterMaskProcessor
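For orientation, a minimal sketch of using the processor on its own, outside a pipeline (`input.png` is a placeholder image):

```python
from PIL import Image
from diffusers.image_processor import VaeImageProcessor

processor = VaeImageProcessor(vae_scale_factor=8)

# PIL image -> normalized torch.Tensor, resized to a multiple of the VAE scale factor
image = Image.open("input.png").convert("RGB")  # placeholder input image
tensor = processor.preprocess(image)

# torch.Tensor in [-1, 1] -> list of PIL images
pil_images = processor.postprocess(tensor, output_type="pil")
```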

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -12,18 +12,14 @@ specific language governing permissions and limitations under the License.
# IP-Adapter # IP-Adapter
[IP-Adapter](https://hf.co/papers/2308.06721) is a lightweight adapter that enables prompting a diffusion model with an image. This method decouples the cross-attention layers of the image and text features. The image features are generated from an image encoder. [IP-Adapter](https://hf.co/papers/2308.06721) is a lightweight adapter that enables prompting a diffusion model with an image. This method decouples the cross-attention layers of the image and text features. The image features are generated from an image encoder. Files generated from IP-Adapter are only ~100MBs.
<Tip> <Tip>
Learn how to load an IP-Adapter checkpoint and image in the IP-Adapter [loading](../../using-diffusers/loading_adapters#ip-adapter) guide, and you can see how to use it in the [usage](../../using-diffusers/ip_adapter) guide. Learn how to load an IP-Adapter checkpoint and image in the [IP-Adapter](../../using-diffusers/loading_adapters#ip-adapter) loading guide.
</Tip> </Tip>
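As a rough sketch of the workflow (the `h94/IP-Adapter` weights are the commonly used checkpoint; `reference.png` is a placeholder):

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Attach the IP-Adapter image-projection layers and cross-attention processors
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # how strongly the image prompt steers generation

reference = load_image("reference.png")  # placeholder reference image
image = pipe(prompt="a polar bear, best quality", ip_adapter_image=reference).images[0]
```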
## IPAdapterMixin ## IPAdapterMixin
[[autodoc]] loaders.ip_adapter.IPAdapterMixin [[autodoc]] loaders.ip_adapter.IPAdapterMixin
## IPAdapterMaskProcessor
[[autodoc]] image_processor.IPAdapterMaskProcessor

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,25 +0,0 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# PEFT
Diffusers supports loading adapters such as [LoRA](../../using-diffusers/loading_adapters) with the [PEFT](https://huggingface.co/docs/peft/index) library with the [`~loaders.peft.PeftAdapterMixin`] class. This allows modeling classes in Diffusers like [`UNet2DConditionModel`] to load an adapter.
<Tip>
Refer to the [Inference with PEFT](../../tutorials/using_peft_for_inference.md) tutorial for an overview of how to use PEFT in Diffusers for inference.
</Tip>
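A minimal sketch of how the mixin lets you manage multiple adapters on a pipeline (the LoRA repo paths below are hypothetical):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Each LoRA is registered with PEFT under an adapter name
pipe.load_lora_weights("path/to/style-lora", adapter_name="style")    # hypothetical LoRA repo/path
pipe.load_lora_weights("path/to/detail-lora", adapter_name="detail")  # hypothetical LoRA repo/path

# Activate both adapters with per-adapter weights
pipe.set_adapters(["style", "detail"], adapter_weights=[1.0, 0.5])

# Temporarily bypass all adapters
pipe.disable_lora()
```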
## PeftAdapterMixin
[[autodoc]] loaders.peft.PeftAdapterMixin

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -10,134 +10,13 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License. specific language governing permissions and limitations under the License.
--> -->
# Loading Pipelines and Models via `from_single_file` # Single files
The `from_single_file` method allows you to load supported pipelines using a single checkpoint file as opposed to Diffusers' multi-folder format. This is useful if you are working with Stable Diffusion Web UIs (such as A1111) that rely on a single file format to distribute all the components of a model. Diffusers supports loading pretrained pipeline (or model) weights stored in a single file, such as a `ckpt` or `safetensors` file. These single file types are typically produced from community trained models. There are three classes for loading single file weights:
The `from_single_file` method also supports loading models in their originally distributed format. This means that supported models that have been finetuned with other services can be loaded directly into Diffusers model objects and pipelines. - [`FromSingleFileMixin`] supports loading pretrained pipeline weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
- [`FromOriginalVAEMixin`] supports loading a pretrained [`AutoencoderKL`] from VAE weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
## Pipelines that currently support `from_single_file` loading - [`FromOriginalControlnetMixin`] supports loading pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
- [`StableDiffusionPipeline`]
- [`StableDiffusionImg2ImgPipeline`]
- [`StableDiffusionInpaintPipeline`]
- [`StableDiffusionControlNetPipeline`]
- [`StableDiffusionControlNetImg2ImgPipeline`]
- [`StableDiffusionControlNetInpaintPipeline`]
- [`StableDiffusionUpscalePipeline`]
- [`StableDiffusionXLPipeline`]
- [`StableDiffusionXLImg2ImgPipeline`]
- [`StableDiffusionXLInpaintPipeline`]
- [`StableDiffusionXLInstructPix2PixPipeline`]
- [`StableDiffusionXLControlNetPipeline`]
- [`StableDiffusionXLKDiffusionPipeline`]
- [`LatentConsistencyModelPipeline`]
- [`LatentConsistencyModelImg2ImgPipeline`]
- [`StableDiffusionControlNetXSPipeline`]
- [`StableDiffusionXLControlNetXSPipeline`]
- [`LEditsPPPipelineStableDiffusion`]
- [`LEditsPPPipelineStableDiffusionXL`]
- [`PIAPipeline`]
## Models that currently support `from_single_file` loading
- [`UNet2DConditionModel`]
- [`StableCascadeUNet`]
- [`AutoencoderKL`]
- [`ControlNetModel`]
## Usage Examples
## Loading a Pipeline using `from_single_file`
```python
from diffusers import StableDiffusionXLPipeline
ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path)
```
## Setting components in a Pipeline using `from_single_file`
Set components of a pipeline by passing them directly to the `from_single_file` method. For example, here we are swapping out the pipeline's default scheduler with the `DDIMScheduler`.
```python
from diffusers import StableDiffusionXLPipeline, DDIMScheduler
ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
scheduler = DDIMScheduler()
pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path, scheduler=scheduler)
```
Here we are passing in a ControlNet model to the `StableDiffusionControlNetPipeline`.
```python
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
ckpt_path = "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.safetensors"
controlnet = ControlNetModel.from_pretrained("lllyasviel/control_v11p_sd15_canny")
pipe = StableDiffusionControlNetPipeline.from_single_file(ckpt_path, controlnet=controlnet)
```
## Loading a Model using `from_single_file`
```python
from diffusers import StableCascadeUNet
ckpt_path = "https://huggingface.co/stabilityai/stable-cascade/blob/main/stage_b_lite.safetensors"
model = StableCascadeUNet.from_single_file(ckpt_path)
```
## Using a Diffusers model repository to configure single file loading
Under the hood, `from_single_file` will try to automatically determine a model repository to use to configure the components of a pipeline. You can also explicitly set the model repository to configure the pipeline with the `config` argument.
```python
from diffusers import StableDiffusionXLPipeline
ckpt_path = "https://huggingface.co/segmind/SSD-1B/blob/main/SSD-1B.safetensors"
repo_id = "segmind/SSD-1B"
pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path, config=repo_id)
```
In the example above, since we explicitly passed `repo_id="segmind/SSD-1B"` to the `config` argument, `from_single_file` will use this [configuration file](https://huggingface.co/segmind/SSD-1B/blob/main/unet/config.json) from the `unet` subfolder of `"segmind/SSD-1B"` to configure the `unet` component of the pipeline. Similarly, it will use the `config.json` file from the `vae` subfolder to configure the `vae` model, the `config.json` file from the `text_encoder` folder to configure the `text_encoder`, and so on.
<Tip>
Most of the time you do not need to explicitly set a `config` argument. `from_single_file` will automatically map the checkpoint to the appropriate model repository. However, this option can be useful in cases where model components in the checkpoint might have been changed from what was originally distributed, or in cases where a checkpoint file might not have the necessary metadata to correctly determine the configuration to use for the pipeline.
</Tip>
## Override configuration options when using single file loading
Override the default model or pipeline configuration options by providing the relevant arguments directly to the `from_single_file` method. Any argument supported by the model or pipeline class can be configured in this way:
### Setting a pipeline configuration option
```python
from diffusers import StableDiffusionXLInstructPix2PixPipeline
ckpt_path = "https://huggingface.co/stabilityai/cosxl/blob/main/cosxl_edit.safetensors"
pipe = StableDiffusionXLInstructPix2PixPipeline.from_single_file(ckpt_path, config="diffusers/sdxl-instructpix2pix-768", is_cosxl_edit=True)
```
### Setting a model configuration option
```python
from diffusers import UNet2DConditionModel
ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
model = UNet2DConditionModel.from_single_file(ckpt_path, upcast_attention=True)
```
<Tip> <Tip>
@@ -145,116 +24,14 @@ To learn more about how to load single file weights, see the [Load different Sta
</Tip> </Tip>
## Working with local files
As of `diffusers>=0.28.0`, the `from_single_file` method will attempt to configure a pipeline or model by first inferring the model type from the keys in the checkpoint file. This inferred model type is then used to determine the appropriate model repository on the Hugging Face Hub to configure the model or pipeline.
For example, any single file checkpoint based on the Stable Diffusion XL base model will use the [`stabilityai/stable-diffusion-xl-base-1.0`](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) model repository to configure the pipeline.
If you are working in an environment with restricted internet access, it is recommended that you download the config files and checkpoints for the model to your preferred directory and pass the local paths to the `pretrained_model_link_or_path` and `config` arguments of the `from_single_file` method.
```python
from diffusers import StableDiffusionXLPipeline
from huggingface_hub import hf_hub_download, snapshot_download
my_local_checkpoint_path = hf_hub_download(
repo_id="segmind/SSD-1B",
filename="SSD-1B.safetensors"
)
my_local_config_path = snapshot_download(
repo_id="segmind/SSD-1B",
allow_patterns=["*.json", "**/*.json", "*.txt", "**/*.txt"]
)
pipe = StableDiffusionXLPipeline.from_single_file(my_local_checkpoint_path, config=my_local_config_path, local_files_only=True)
```
By default this will download the checkpoints and config files to the [Hugging Face Hub cache directory](https://huggingface.co/docs/huggingface_hub/en/guides/manage-cache). You can also specify a local directory to download the files to by passing the `local_dir` argument to the `hf_hub_download` and `snapshot_download` functions.
```python
from diffusers import StableDiffusionXLPipeline
from huggingface_hub import hf_hub_download, snapshot_download
my_local_checkpoint_path = hf_hub_download(
repo_id="segmind/SSD-1B",
filename="SSD-1B.safetensors",
local_dir="my_local_checkpoints"
)
my_local_config_path = snapshot_download(
repo_id="segmind/SSD-1B",
allow_patterns=["*.json", "**/*.json", "*.txt", "**/*.txt"],
local_dir="my_local_config"
)
pipe = StableDiffusionXLPipeline.from_single_file(my_local_checkpoint_path, config=my_local_config_path, local_files_only=True)
```
## Working with local files on file systems that do not support symlinking
By default, the `from_single_file` method relies on the `huggingface_hub` caching mechanism to fetch and store checkpoints and config files for models and pipelines. If you are working with a file system that does not support symlinking, it is recommended that you first download the checkpoint file to a local directory and disable symlinking by passing the `local_dir_use_symlinks=False` argument to the `hf_hub_download` and `snapshot_download` functions.
```python
from huggingface_hub import hf_hub_download, snapshot_download
my_local_checkpoint_path = hf_hub_download(
repo_id="segmind/SSD-1B",
filename="SSD-1B.safetensors",
local_dir="my_local_checkpoints",
local_dir_use_symlinks=False
)
print("My local checkpoint: ", my_local_checkpoint_path)
my_local_config_path = snapshot_download(
repo_id="segmind/SSD-1B",
allow_patterns=["*.json", "**/*.json", "*.txt", "**/*.txt"],
local_dir_use_symlinks=False,
)
print("My local config: ", my_local_config_path)
```
Then pass the local paths to the `pretrained_model_link_or_path` and `config` arguments of the `from_single_file` method.
```python
pipe = StableDiffusionXLPipeline.from_single_file(my_local_checkpoint_path, config=my_local_config_path, local_files_only=True)
```
<Tip>
As of `huggingface_hub>=0.23.0`, the `local_dir_use_symlinks` argument isn't necessary for the `hf_hub_download` and `snapshot_download` functions.
</Tip>
## Using the original configuration file of a model
If you would like to configure the model components in a pipeline using the original YAML configuration file, you can pass a local path or url to the original configuration file via the `original_config` argument.
```python
from diffusers import StableDiffusionXLPipeline
ckpt_path = "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors"
repo_id = "stabilityai/stable-diffusion-xl-base-1.0"
original_config = "https://raw.githubusercontent.com/Stability-AI/generative-models/main/configs/inference/sd_xl_base.yaml"
pipe = StableDiffusionXLPipeline.from_single_file(ckpt_path, original_config=original_config)
```
<Tip>
When using `original_config` with `local_files_only=True`, Diffusers will attempt to infer the components of the pipeline from the type signatures of the pipeline class, rather than attempting to fetch the configuration files from a model repository on the Hugging Face Hub. This prevents backwards-breaking changes in existing code that might not be able to connect to the internet to fetch the necessary configuration files.
This is not as reliable as providing a path to a local model repository using the `config` argument and might lead to errors when configuring the pipeline. To avoid this, please run the pipeline with `local_files_only=False` once to download the appropriate pipeline configuration files to the local cache.
</Tip>
## FromSingleFileMixin ## FromSingleFileMixin
[[autodoc]] loaders.single_file.FromSingleFileMixin [[autodoc]] loaders.single_file.FromSingleFileMixin
## FromOriginalModelMixin ## FromOriginalVAEMixin
[[autodoc]] loaders.single_file_model.FromOriginalModelMixin [[autodoc]] loaders.single_file.FromOriginalVAEMixin
## FromOriginalControlnetMixin
[[autodoc]] loaders.single_file.FromOriginalControlnetMixin

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -33,9 +33,6 @@ model = AutoencoderKL.from_single_file(url)
## AutoencoderKL ## AutoencoderKL
[[autodoc]] AutoencoderKL [[autodoc]] AutoencoderKL
- decode
- encode
- all
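A minimal round-trip sketch with `encode`/`decode`, assuming the SD 1.5 VAE under `runwayml/stable-diffusion-v1-5` and a placeholder `input.png`:

```python
import torch
from PIL import Image
from diffusers import AutoencoderKL
from diffusers.image_processor import VaeImageProcessor

vae = AutoencoderKL.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="vae")
processor = VaeImageProcessor(vae_scale_factor=8)

image = processor.preprocess(Image.open("input.png").convert("RGB"))  # placeholder image

with torch.no_grad():
    # encode returns a latent distribution; sample it and apply the scaling factor
    latents = vae.encode(image).latent_dist.sample() * vae.config.scaling_factor
    # decode expects unscaled latents
    decoded = vae.decode(latents / vae.config.scaling_factor).sample

reconstruction = processor.postprocess(decoded, output_type="pil")[0]
```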
## AutoencoderKLOutput ## AutoencoderKLOutput

View File

@@ -1,15 +1,3 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Consistency Decoder # Consistency Decoder
Consistency decoder can be used to decode the latents from the denoising UNet in the [`StableDiffusionPipeline`]. This decoder was introduced in the [DALL-E 3 technical report](https://openai.com/dall-e-3). Consistency decoder can be used to decode the latents from the denoising UNet in the [`StableDiffusionPipeline`]. This decoder was introduced in the [DALL-E 3 technical report](https://openai.com/dall-e-3).
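A minimal sketch of swapping it in for the default VAE, assuming the `openai/consistency-decoder` checkpoint:

```python
import torch
from diffusers import ConsistencyDecoderVAE, StableDiffusionPipeline

vae = ConsistencyDecoderVAE.from_pretrained("openai/consistency-decoder", torch_dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", vae=vae, torch_dtype=torch.float16
).to("cuda")

image = pipe("horse", generator=torch.manual_seed(0)).images[0]
```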

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,19 +0,0 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# DiTTransformer2D
A Transformer model for image-like data from [DiT](https://huggingface.co/papers/2212.09748).
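A hedged sketch of loading the transformer on its own; the `facebook/DiT-XL-2-256` repository and its `transformer` subfolder layout are assumptions here:

```python
from diffusers import DiTTransformer2DModel

# Class-conditional DiT weights; the repo layout with a "transformer" subfolder is assumed
model = DiTTransformer2DModel.from_pretrained("facebook/DiT-XL-2-256", subfolder="transformer")
```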
## DiTTransformer2DModel
[[autodoc]] DiTTransformer2DModel

View File

@@ -1,20 +0,0 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# HunyuanDiT2DModel
A Diffusion Transformer model for 2D data from [Hunyuan-DiT](https://github.com/Tencent/HunyuanDiT).
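A hedged sketch of loading the model; the Diffusers-format `Tencent-Hunyuan/HunyuanDiT-Diffusers` repository and its `transformer` subfolder are assumptions here:

```python
import torch
from diffusers import HunyuanDiT2DModel

# The Diffusers-format repo and its "transformer" subfolder are assumed
model = HunyuanDiT2DModel.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-Diffusers", subfolder="transformer", torch_dtype=torch.float16
)
```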
## HunyuanDiT2DModel
[[autodoc]] HunyuanDiT2DModel

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,19 +0,0 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# PixArtTransformer2D
A Transformer model for image-like data from [PixArt-Alpha](https://huggingface.co/papers/2310.00426) and [PixArt-Sigma](https://huggingface.co/papers/2403.04692).
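A hedged sketch of loading the model; the `PixArt-alpha/PixArt-XL-2-1024-MS` repository and its `transformer` subfolder are assumptions here:

```python
import torch
from diffusers import PixArtTransformer2DModel

# The PixArt-Alpha repo and its "transformer" subfolder are assumed
model = PixArtTransformer2DModel.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS", subfolder="transformer", torch_dtype=torch.float16
)
```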
## PixArtTransformer2DModel
[[autodoc]] PixArtTransformer2DModel

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -24,4 +24,4 @@ The abstract from the paper is:
## PriorTransformerOutput ## PriorTransformerOutput
[[autodoc]] models.transformers.prior_transformer.PriorTransformerOutput [[autodoc]] models.prior_transformer.PriorTransformerOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -38,4 +38,4 @@ It is assumed one of the input classes is the masked latent pixel. The predicted
## Transformer2DModelOutput ## Transformer2DModelOutput
[[autodoc]] models.transformers.transformer_2d.Transformer2DModelOutput [[autodoc]] models.transformer_2d.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -16,8 +16,8 @@ A Transformer model for video-like data.
## TransformerTemporalModel ## TransformerTemporalModel
[[autodoc]] models.transformers.transformer_temporal.TransformerTemporalModel [[autodoc]] models.transformer_temporal.TransformerTemporalModel
## TransformerTemporalModelOutput ## TransformerTemporalModelOutput
[[autodoc]] models.transformers.transformer_temporal.TransformerTemporalModelOutput [[autodoc]] models.transformer_temporal.TransformerTemporalModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -22,4 +22,4 @@ The abstract from the paper is:
[[autodoc]] UNetMotionModel [[autodoc]] UNetMotionModel
## UNet3DConditionOutput ## UNet3DConditionOutput
[[autodoc]] models.unets.unet_3d_condition.UNet3DConditionOutput [[autodoc]] models.unet_3d_condition.UNet3DConditionOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -22,4 +22,4 @@ The abstract from the paper is:
[[autodoc]] UNet1DModel [[autodoc]] UNet1DModel
## UNet1DOutput ## UNet1DOutput
[[autodoc]] models.unets.unet_1d.UNet1DOutput [[autodoc]] models.unet_1d.UNet1DOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -22,10 +22,10 @@ The abstract from the paper is:
[[autodoc]] UNet2DConditionModel [[autodoc]] UNet2DConditionModel
## UNet2DConditionOutput ## UNet2DConditionOutput
[[autodoc]] models.unets.unet_2d_condition.UNet2DConditionOutput [[autodoc]] models.unet_2d_condition.UNet2DConditionOutput
## FlaxUNet2DConditionModel ## FlaxUNet2DConditionModel
[[autodoc]] models.unets.unet_2d_condition_flax.FlaxUNet2DConditionModel [[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionModel
## FlaxUNet2DConditionOutput ## FlaxUNet2DConditionOutput
[[autodoc]] models.unets.unet_2d_condition_flax.FlaxUNet2DConditionOutput [[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -22,4 +22,4 @@ The abstract from the paper is:
[[autodoc]] UNet2DModel [[autodoc]] UNet2DModel
## UNet2DOutput ## UNet2DOutput
[[autodoc]] models.unets.unet_2d.UNet2DOutput [[autodoc]] models.unet_2d.UNet2DOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -22,4 +22,4 @@ The abstract from the paper is:
[[autodoc]] UNet3DConditionModel [[autodoc]] UNet3DConditionModel
## UNet3DConditionOutput ## UNet3DConditionOutput
[[autodoc]] models.unets.unet_3d_condition.UNet3DConditionOutput [[autodoc]] models.unet_3d_condition.UNet3DConditionOutput

View File

@@ -1,39 +0,0 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UVit2DModel
The [U-ViT](https://hf.co/papers/2301.11093) model is a vision transformer (ViT) based UNet. This model incorporates elements from ViT (considers all inputs such as time, conditions and noisy image patches as tokens) and a UNet (long skip connections between the shallow and deep layers). The skip connection is important for predicting pixel-level features. An additional 3x3 convolutional block is applied prior to the final output to improve image quality.
The abstract from the paper is:
*Currently, applying diffusion models in pixel space of high resolution images is difficult. Instead, existing approaches focus on diffusion in lower dimensional spaces (latent diffusion), or have multiple super-resolution levels of generation referred to as cascades. The downside is that these approaches add additional complexity to the diffusion framework. This paper aims to improve denoising diffusion for high resolution images while keeping the model as simple as possible. The paper is centered around the research question: How can one train a standard denoising diffusion models on high resolution images, and still obtain performance comparable to these alternate approaches? The four main findings are: 1) the noise schedule should be adjusted for high resolution images, 2) It is sufficient to scale only a particular part of the architecture, 3) dropout should be added at specific locations in the architecture, and 4) downsampling is an effective strategy to avoid high resolution feature maps. Combining these simple yet effective techniques, we achieve state-of-the-art on image generation among diffusion models without sampling modifiers on ImageNet.*
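A hedged sketch of loading the model standalone; the `amused/amused-256` checkpoint layout with a `transformer` subfolder is an assumption here:

```python
from diffusers import UVit2DModel

# The aMUSEd checkpoint layout with a "transformer" subfolder is assumed
model = UVit2DModel.from_pretrained("amused/amused-256", subfolder="transformer")
```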
## UVit2DModel
[[autodoc]] UVit2DModel
## UVit2DConvEmbed
[[autodoc]] models.unets.uvit_2d.UVit2DConvEmbed
## UVitBlock
[[autodoc]] models.unets.uvit_2d.UVitBlock
## ConvNextBlock
[[autodoc]] models.unets.uvit_2d.ConvNextBlock
## ConvMlmLayer
[[autodoc]] models.unets.uvit_2d.ConvMlmLayer

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -12,20 +12,14 @@ specific language governing permissions and limitations under the License.
# aMUSEd # aMUSEd
aMUSEd was introduced in [aMUSEd: An Open MUSE Reproduction](https://huggingface.co/papers/2401.01808) by Suraj Patil, William Berman, Robin Rombach, and Patrick von Platen. Amused is a lightweight text-to-image model based on the [muse](https://arxiv.org/pdf/2301.00704.pdf) architecture. Amused is particularly useful in applications that require a lightweight and fast model, such as generating many images quickly at once.
Amused is a lightweight text-to-image model based on the [MUSE](https://arxiv.org/abs/2301.00704) architecture. Amused is particularly useful in applications that require a lightweight and fast model, such as generating many images quickly at once.
Amused is a VQ-VAE token-based transformer that can generate an image in fewer forward passes than many diffusion models. In contrast with MUSE, it uses the smaller text encoder CLIP-L/14 instead of T5-XXL. Due to its small parameter count and few-forward-pass generation process, amused can generate many images quickly. This benefit is seen particularly at larger batch sizes. Amused is a VQ-VAE token-based transformer that can generate an image in fewer forward passes than many diffusion models. In contrast with MUSE, it uses the smaller text encoder CLIP-L/14 instead of T5-XXL. Due to its small parameter count and few-forward-pass generation process, amused can generate many images quickly. This benefit is seen particularly at larger batch sizes.
The abstract from the paper is:
*We present aMUSEd, an open-source, lightweight masked image model (MIM) for text-to-image generation based on MUSE. With 10 percent of MUSE's parameters, aMUSEd is focused on fast image generation. We believe MIM is under-explored compared to latent diffusion, the prevailing approach for text-to-image generation. Compared to latent diffusion, MIM requires fewer inference steps and is more interpretable. Additionally, MIM can be fine-tuned to learn additional styles with only a single image. We hope to encourage further exploration of MIM by demonstrating its effectiveness on large-scale text-to-image generation and releasing reproducible training code. We also release checkpoints for two models which directly produce images at 256x256 and 512x512 resolutions.*
| Model | Params | | Model | Params |
|-------|--------| |-------|--------|
| [amused-256](https://huggingface.co/amused/amused-256) | 603M | | [amused-256](https://huggingface.co/huggingface/amused-256) | 603M |
| [amused-512](https://huggingface.co/amused/amused-512) | 608M | | [amused-512](https://huggingface.co/huggingface/amused-512) | 608M |
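A minimal text-to-image sketch with the 256px checkpoint (assuming the `amused/amused-256` repository and its fp16 variant):

```python
import torch
from diffusers import AmusedPipeline

pipe = AmusedPipeline.from_pretrained(
    "amused/amused-256", variant="fp16", torch_dtype=torch.float16
).to("cuda")

image = pipe("cowboy", generator=torch.manual_seed(8)).images[0]
```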
## AmusedPipeline ## AmusedPipeline
@@ -34,15 +28,3 @@ The abstract from the paper is:
- all - all
- enable_xformers_memory_efficient_attention - enable_xformers_memory_efficient_attention
- disable_xformers_memory_efficient_attention - disable_xformers_memory_efficient_attention
[[autodoc]] AmusedImg2ImgPipeline
- __call__
- all
- enable_xformers_memory_efficient_attention
- disable_xformers_memory_efficient_attention
[[autodoc]] AmusedInpaintPipeline
- __call__
- all
- enable_xformers_memory_efficient_attention
- disable_xformers_memory_efficient_attention

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved. <!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at the License. You may obtain a copy of the License at
@@ -25,7 +25,6 @@ The abstract of the paper is the following:
| Pipeline | Tasks | Demo | Pipeline | Tasks | Demo
|---|---|:---:| |---|---|:---:|
| [AnimateDiffPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/animatediff/pipeline_animatediff.py) | *Text-to-Video Generation with AnimateDiff* | | [AnimateDiffPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/animatediff/pipeline_animatediff.py) | *Text-to-Video Generation with AnimateDiff* |
| [AnimateDiffVideoToVideoPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/animatediff/pipeline_animatediff_video2video.py) | *Video-to-Video Generation with AnimateDiff* |
## Available checkpoints ## Available checkpoints
@@ -33,29 +32,22 @@ Motion Adapter checkpoints can be found under [guoyww](https://huggingface.co/gu
## Usage example ## Usage example
### AnimateDiffPipeline
AnimateDiff works with a MotionAdapter checkpoint and a Stable Diffusion model checkpoint. The MotionAdapter is a collection of Motion Modules that are responsible for adding coherent motion across image frames. These modules are applied after the Resnet and Attention blocks in Stable Diffusion UNet. AnimateDiff works with a MotionAdapter checkpoint and a Stable Diffusion model checkpoint. The MotionAdapter is a collection of Motion Modules that are responsible for adding coherent motion across image frames. These modules are applied after the Resnet and Attention blocks in Stable Diffusion UNet.
The following example demonstrates how to use a *MotionAdapter* checkpoint with Diffusers for inference based on StableDiffusion-1.4/1.5. The following example demonstrates how to use a *MotionAdapter* checkpoint with Diffusers for inference based on StableDiffusion-1.4/1.5.
```python ```python
import torch import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter from diffusers import MotionAdapter, AnimateDiffPipeline, DDIMScheduler
from diffusers.utils import export_to_gif from diffusers.utils import export_to_gif
# Load the motion adapter # Load the motion adapter
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16) adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
# load SD 1.5 based finetuned model # load SD 1.5 based finetuned model
model_id = "SG161222/Realistic_Vision_V5.1_noVAE" model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16) pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter)
scheduler = DDIMScheduler.from_pretrained( scheduler = DDIMScheduler.from_pretrained(
model_id, model_id, subfolder="scheduler", clip_sample=False, timestep_spacing="linspace", steps_offset=1
subfolder="scheduler",
clip_sample=False,
timestep_spacing="linspace",
beta_schedule="linear",
steps_offset=1,
) )
pipe.scheduler = scheduler pipe.scheduler = scheduler
@@ -78,7 +70,6 @@ output = pipe(
) )
frames = output.frames[0] frames = output.frames[0]
export_to_gif(frames, "animation.gif") export_to_gif(frames, "animation.gif")
``` ```
Here are some sample outputs: Here are some sample outputs:
@@ -97,190 +88,28 @@ Here are some sample outputs:
<Tip> <Tip>
AnimateDiff tends to work better with finetuned Stable Diffusion models. If you plan on using a scheduler that can clip samples, make sure to disable it by setting `clip_sample=False` in the scheduler as this can also have an adverse effect on generated samples. Additionally, the AnimateDiff checkpoints can be sensitive to the beta schedule of the scheduler. We recommend setting this to `linear`. AnimateDiff tends to work better with finetuned Stable Diffusion models. If you plan on using a scheduler that can clip samples, make sure to disable it by setting `clip_sample=False` in the scheduler as this can also have an adverse effect on generated samples.
</Tip> </Tip>
### AnimateDiffSDXLPipeline
AnimateDiff can also be used with SDXL models. This is currently an experimental feature as only a beta release of the motion adapter checkpoint is available.
```python
import torch
from diffusers.models import MotionAdapter
from diffusers import AnimateDiffSDXLPipeline, DDIMScheduler
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-sdxl-beta", torch_dtype=torch.float16)
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
scheduler = DDIMScheduler.from_pretrained(
model_id,
subfolder="scheduler",
clip_sample=False,
timestep_spacing="linspace",
beta_schedule="linear",
steps_offset=1,
)
pipe = AnimateDiffSDXLPipeline.from_pretrained(
model_id,
motion_adapter=adapter,
scheduler=scheduler,
torch_dtype=torch.float16,
variant="fp16",
).to("cuda")
# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
output = pipe(
prompt="a panda surfing in the ocean, realistic, high quality",
negative_prompt="low quality, worst quality",
num_inference_steps=20,
guidance_scale=8,
width=1024,
height=1024,
num_frames=16,
)
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
```
### AnimateDiffVideoToVideoPipeline
AnimateDiff can also be used to generate visually similar videos or enable style/character/background or other edits starting from an initial video, allowing you to seamlessly explore creative possibilities.
```python
import imageio
import requests
import torch
from diffusers import AnimateDiffVideoToVideoPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif
from io import BytesIO
from PIL import Image
# Load the motion adapter
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)
# load SD 1.5 based finetuned model
model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffVideoToVideoPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16).to("cuda")
scheduler = DDIMScheduler.from_pretrained(
model_id,
subfolder="scheduler",
clip_sample=False,
timestep_spacing="linspace",
beta_schedule="linear",
steps_offset=1,
)
pipe.scheduler = scheduler
# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()
# helper function to load videos
def load_video(file_path: str):
images = []
if file_path.startswith(('http://', 'https://')):
# If the file_path is a URL
response = requests.get(file_path)
response.raise_for_status()
content = BytesIO(response.content)
vid = imageio.get_reader(content)
else:
# Assuming it's a local file path
vid = imageio.get_reader(file_path)
for frame in vid:
pil_image = Image.fromarray(frame)
images.append(pil_image)
return images
video = load_video("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-input-1.gif")
output = pipe(
video=video,
prompt="panda playing a guitar, on a boat, in the ocean, high quality",
negative_prompt="bad quality, worse quality",
guidance_scale=7.5,
num_inference_steps=25,
strength=0.5,
generator=torch.Generator("cpu").manual_seed(42),
)
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
```
Here are some sample outputs:
<table>
<tr>
<th align=center>Source Video</th>
<th align=center>Output Video</th>
</tr>
<tr>
<td align=center>
raccoon playing a guitar
<br />
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-input-1.gif"
alt="racoon playing a guitar"
style="width: 300px;" />
</td>
<td align=center>
panda playing a guitar
<br/>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-output-1.gif"
alt="panda playing a guitar"
style="width: 300px;" />
</td>
</tr>
<tr>
<td align=center>
closeup of margot robbie, fireworks in the background, high quality
<br />
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-input-2.gif"
alt="closeup of margot robbie, fireworks in the background, high quality"
style="width: 300px;" />
</td>
<td align=center>
closeup of tony stark, robert downey jr, fireworks
<br/>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-vid2vid-output-2.gif"
alt="closeup of tony stark, robert downey jr, fireworks"
style="width: 300px;" />
</td>
</tr>
</table>
## Using Motion LoRAs ## Using Motion LoRAs
Motion LoRAs are a collection of LoRAs that work with the `guoyww/animatediff-motion-adapter-v1-5-2` checkpoint. These LoRAs are responsible for adding specific types of motion to the animations. Motion LoRAs are a collection of LoRAs that work with the `guoyww/animatediff-motion-adapter-v1-5-2` checkpoint. These LoRAs are responsible for adding specific types of motion to the animations.
```python ```python
import torch import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter from diffusers import MotionAdapter, AnimateDiffPipeline, DDIMScheduler
from diffusers.utils import export_to_gif from diffusers.utils import export_to_gif
# Load the motion adapter # Load the motion adapter
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16) adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
# load SD 1.5 based finetuned model # load SD 1.5 based finetuned model
model_id = "SG161222/Realistic_Vision_V5.1_noVAE" model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16) pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter)
pipe.load_lora_weights( pipe.load_lora_weights("guoyww/animatediff-motion-lora-zoom-out", adapter_name="zoom-out")
"guoyww/animatediff-motion-lora-zoom-out", adapter_name="zoom-out"
)
scheduler = DDIMScheduler.from_pretrained( scheduler = DDIMScheduler.from_pretrained(
model_id, model_id, subfolder="scheduler", clip_sample=False, timestep_spacing="linspace", steps_offset=1
subfolder="scheduler",
clip_sample=False,
beta_schedule="linear",
timestep_spacing="linspace",
steps_offset=1,
) )
pipe.scheduler = scheduler pipe.scheduler = scheduler
@@ -303,7 +132,6 @@ output = pipe(
) )
frames = output.frames[0] frames = output.frames[0]
export_to_gif(frames, "animation.gif") export_to_gif(frames, "animation.gif")
``` ```
<table> <table>
@@ -332,30 +160,21 @@ Then you can use the following code to combine Motion LoRAs.
```python ```python
import torch import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter from diffusers import MotionAdapter, AnimateDiffPipeline, DDIMScheduler
from diffusers.utils import export_to_gif from diffusers.utils import export_to_gif
# Load the motion adapter # Load the motion adapter
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16) adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
# load SD 1.5 based finetuned model # load SD 1.5 based finetuned model
model_id = "SG161222/Realistic_Vision_V5.1_noVAE" model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16) pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter)
pipe.load_lora_weights( pipe.load_lora_weights("diffusers/animatediff-motion-lora-zoom-out", adapter_name="zoom-out")
"diffusers/animatediff-motion-lora-zoom-out", adapter_name="zoom-out", pipe.load_lora_weights("diffusers/animatediff-motion-lora-pan-left", adapter_name="pan-left")
)
pipe.load_lora_weights(
"diffusers/animatediff-motion-lora-pan-left", adapter_name="pan-left",
)
pipe.set_adapters(["zoom-out", "pan-left"], adapter_weights=[1.0, 1.0]) pipe.set_adapters(["zoom-out", "pan-left"], adapter_weights=[1.0, 1.0])
scheduler = DDIMScheduler.from_pretrained( scheduler = DDIMScheduler.from_pretrained(
model_id, model_id, subfolder="scheduler", clip_sample=False, timestep_spacing="linspace", steps_offset=1
subfolder="scheduler",
clip_sample=False,
timestep_spacing="linspace",
beta_schedule="linear",
steps_offset=1,
) )
pipe.scheduler = scheduler pipe.scheduler = scheduler
@@ -378,7 +197,6 @@ output = pipe(
) )
frames = output.frames[0] frames = output.frames[0]
export_to_gif(frames, "animation.gif") export_to_gif(frames, "animation.gif")
``` ```
<table> <table>
@@ -393,193 +211,23 @@ export_to_gif(frames, "animation.gif")
</tr> </tr>
</table> </table>
## Using FreeInit
[FreeInit: Bridging Initialization Gap in Video Diffusion Models](https://arxiv.org/abs/2312.07537) by Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu.
FreeInit is an effective method that improves the temporal consistency and overall quality of videos generated using video diffusion models without any additional training. It can be applied to AnimateDiff, ModelScope, VideoCrafter, and various other video generation models seamlessly at inference time, and works by iteratively refining the latent-initialization noise. More details can be found in the paper.
The following example demonstrates the usage of FreeInit.
```python
import torch
from diffusers import MotionAdapter, AnimateDiffPipeline, DDIMScheduler
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16).to("cuda")
pipe.scheduler = DDIMScheduler.from_pretrained(
model_id,
subfolder="scheduler",
beta_schedule="linear",
clip_sample=False,
timestep_spacing="linspace",
steps_offset=1
)
# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
# enable FreeInit
# Refer to the enable_free_init documentation for a full list of configurable parameters
pipe.enable_free_init(method="butterworth", use_fast_sampling=True)
# run inference
output = pipe(
prompt="a panda playing a guitar, on a boat, in the ocean, high quality",
negative_prompt="bad quality, worse quality",
num_frames=16,
guidance_scale=7.5,
num_inference_steps=20,
generator=torch.Generator("cpu").manual_seed(666),
)
# disable FreeInit
pipe.disable_free_init()
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
```
<Tip warning={true}>
FreeInit is not really free - the improved quality comes at the cost of extra computation. It requires sampling a few extra times depending on the `num_iters` parameter that is set when enabling it. Setting the `use_fast_sampling` parameter to `True` can improve overall performance, at the cost of lower quality compared to `use_fast_sampling=False`, though the results are still better than vanilla video generation models.
</Tip>
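As a rough guide, more refinement iterations improve quality and increase runtime roughly in proportion. Below is a minimal sketch of a quality-oriented configuration, assuming `pipe` is the `AnimateDiffPipeline` set up above (the parameter values are illustrative, not tuned recommendations):

```python
# Each FreeInit iteration re-runs sampling, so runtime grows with `num_iters`.
pipe.enable_free_init(
    method="butterworth",     # frequency filter used to mix low- and high-frequency noise
    num_iters=5,              # more refinement iterations than the default of 3
    use_fast_sampling=False,  # full-quality sampling in every iteration
)
```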
<Tip>
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
</Tip>
<table>
<tr>
<th align=center>Without FreeInit enabled</th>
<th align=center>With FreeInit enabled</th>
</tr>
<tr>
<td align=center>
panda playing a guitar
<br />
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-no-freeinit.gif"
alt="panda playing a guitar"
style="width: 300px;" />
</td>
<td align=center>
panda playing a guitar
<br/>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-freeinit.gif"
alt="panda playing a guitar"
style="width: 300px;" />
</td>
</tr>
</table>
## Using AnimateLCM
[AnimateLCM](https://animatelcm.github.io/) is a motion module checkpoint and an [LCM LoRA](https://huggingface.co/docs/diffusers/using-diffusers/inference_with_lcm_lora) that have been created using a consistency learning strategy that decouples the distillation of the image generation priors and the motion generation priors.
```python
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")
pipe = AnimateDiffPipeline.from_pretrained("emilianJR/epiCRealism", motion_adapter=adapter)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")
pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="sd15_lora_beta.safetensors", adapter_name="lcm-lora")
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()
output = pipe(
prompt="A space rocket with trails of smoke behind it launching into space from the desert, 4k, high resolution",
negative_prompt="bad quality, worse quality, low resolution",
num_frames=16,
guidance_scale=1.5,
num_inference_steps=6,
generator=torch.Generator("cpu").manual_seed(0),
)
frames = output.frames[0]
export_to_gif(frames, "animatelcm.gif")
```
<table>
<tr>
<td><center>
A space rocket, 4K.
<br>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatelcm-output.gif"
alt="A space rocket, 4K"
style="width: 300px;" />
</center></td>
</tr>
</table>
AnimateLCM is also compatible with existing [Motion LoRAs](https://huggingface.co/collections/dn6/animatediff-motion-loras-654cb8ad732b9e3cf4d3c17e).
```python
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")
pipe = AnimateDiffPipeline.from_pretrained("emilianJR/epiCRealism", motion_adapter=adapter)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")
pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="sd15_lora_beta.safetensors", adapter_name="lcm-lora")
pipe.load_lora_weights("guoyww/animatediff-motion-lora-tilt-up", adapter_name="tilt-up")
pipe.set_adapters(["lcm-lora", "tilt-up"], [1.0, 0.8])
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()
output = pipe(
prompt="A space rocket with trails of smoke behind it launching into space from the desert, 4k, high resolution",
negative_prompt="bad quality, worse quality, low resolution",
num_frames=16,
guidance_scale=1.5,
num_inference_steps=6,
generator=torch.Generator("cpu").manual_seed(0),
)
frames = output.frames[0]
export_to_gif(frames, "animatelcm-motion-lora.gif")
```
<table>
<tr>
<td><center>
A space rocket, 4K.
<br>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatelcm-motion-lora.gif"
alt="A space rocket, 4K"
style="width: 300px;" />
</center></td>
</tr>
</table>
## AnimateDiffPipeline
[[autodoc]] AnimateDiffPipeline
- all
- __call__

## AnimateDiffSDXLPipeline
[[autodoc]] AnimateDiffSDXLPipeline
- all
- __call__

## AnimateDiffVideoToVideoPipeline
[[autodoc]] AnimateDiffVideoToVideoPipeline
- all
- __call__

## AnimateDiffPipelineOutput

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -20,8 +20,7 @@ The abstract of the paper is the following:
*Although audio generation shares commonalities across different types of audio, such as speech, music, and sound effects, designing models for each type requires careful consideration of specific objectives and biases that can significantly differ from those of other types. To bring us closer to a unified perspective of audio generation, this paper proposes a framework that utilizes the same learning method for speech, music, and sound effect generation. Our framework introduces a general representation of audio, called "language of audio" (LOA). Any audio can be translated into LOA based on AudioMAE, a self-supervised pre-trained representation learning model. In the generation process, we translate any modalities into LOA by using a GPT-2 model, and we perform self-supervised audio generation learning with a latent diffusion model conditioned on LOA. The proposed framework naturally brings advantages such as in-context learning abilities and reusable self-supervised pretrained AudioMAE and latent diffusion models. Experiments on the major benchmarks of text-to-audio, text-to-music, and text-to-speech demonstrate state-of-the-art or competitive performance against previous approaches. Our code, pretrained model, and demo are available at [this https URL](https://audioldm.github.io/audioldm2).*
This pipeline was contributed by [sanchit-gandhi](https://huggingface.co/sanchit-gandhi) and [Nguyễn Công Tú Anh](https://github.com/tuanh123789). The original codebase can be found at [haoheliu/audioldm2](https://github.com/haoheliu/audioldm2).

## Tips
@@ -37,8 +36,6 @@ See table below for details on the three checkpoints:
| [audioldm2](https://huggingface.co/cvssp/audioldm2) | Text-to-audio | 350M | 1.1B | 1150k |
| [audioldm2-large](https://huggingface.co/cvssp/audioldm2-large) | Text-to-audio | 750M | 1.5B | 1150k |
| [audioldm2-music](https://huggingface.co/cvssp/audioldm2-music) | Text-to-music | 350M | 1.1B | 665k |
| [audioldm2-gigaspeech](https://huggingface.co/anhnct/audioldm2_gigaspeech) | Text-to-speech | 350M | 1.1B | 10k |
| [audioldm2-ljspeech](https://huggingface.co/anhnct/audioldm2_ljspeech) | Text-to-speech | 350M | 1.1B | |

### Constructing a prompt
@@ -56,7 +53,7 @@ See table below for details on the three checkpoints:
* The quality of the generated waveforms can vary significantly based on the seed. Try generating with different seeds until you find a satisfactory generation.
* Multiple waveforms can be generated in one go: set `num_waveforms_per_prompt` to a value greater than 1. Automatic scoring will be performed between the generated waveforms and prompt text, and the audios ranked from best to worst accordingly.
The following example demonstrates how to construct good music and speech generation using the aforementioned tips: [example](https://huggingface.co/docs/diffusers/main/en/api/pipelines/audioldm2#diffusers.AudioLDM2Pipeline.__call__.example).
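For quick reference, here is a condensed sketch along the same lines (the prompts and parameter values are illustrative):

```python
import scipy
import torch
from diffusers import AudioLDM2Pipeline

pipe = AudioLDM2Pipeline.from_pretrained("cvssp/audioldm2", torch_dtype=torch.float16).to("cuda")

# descriptive prompt plus a negative prompt, per the tips above
prompt = "The sound of a hammer hitting a wooden surface"
negative_prompt = "Low quality."

audio = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=200,
    audio_length_in_s=10.0,
    num_waveforms_per_prompt=3,  # candidates are scored and ranked best to worst
    generator=torch.Generator("cuda").manual_seed(0),
).audios

# index 0 is the best-scoring waveform; AudioLDM2 generates 16 kHz audio
scipy.io.wavfile.write("output.wav", rate=16000, data=audio[0])
```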
<Tip>

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,10 +12,42 @@ specific language governing permissions and limitations under the License.

# AutoPipeline

The `AutoPipeline` is designed to make it easy to load a checkpoint for a task without needing to know the specific pipeline class. Based on the task, the `AutoPipeline` automatically retrieves the correct pipeline class from the checkpoint `model_index.json` file.
To seamlessly switch between tasks with the same checkpoint without reallocating additional memory, use the `from_pipe()` method to transfer the components from the original pipeline to the new one, as shown in the sketch after the example below.
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
"runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipeline(prompt, num_inference_steps=25).images[0]
```
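A minimal sketch of switching tasks with `from_pipe()`, continuing from the pipeline above (the prompt is illustrative):

```py
from diffusers import AutoPipelineForImage2Image

# transfers the already-loaded components; no second copy of the weights is allocated
pipeline_img2img = AutoPipelineForImage2Image.from_pipe(pipeline)
image = pipeline_img2img(
    "Astronaut in a jungle, cold color palette, oil painting style",
    image=image,
).images[0]
```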
<Tip>
Check out the [AutoPipeline](../../tutorials/autopipeline) tutorial to learn how to use this API!
</Tip>

`AutoPipeline` supports text-to-image, image-to-image, and inpainting for the following diffusion models:
- [Stable Diffusion](./stable_diffusion/overview)
- [ControlNet](./controlnet)
- [Stable Diffusion XL (SDXL)](./stable_diffusion/stable_diffusion_xl)
- [DeepFloyd IF](./deepfloyd_if)
- [Kandinsky 2.1](./kandinsky)
- [Kandinsky 2.2](./kandinsky_v22)

## AutoPipelineForText2Image

@@ -1,37 +0,0 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Hunyuan-DiT
![chinese elements understanding](https://github.com/gnobitab/diffusers-hunyuan/assets/1157982/39b99036-c3cb-4f16-bb1a-40ec25eda573)
[Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding](https://arxiv.org/abs/2405.08748) from Tencent Hunyuan.
The abstract from the paper is:
*We present Hunyuan-DiT, a text-to-image diffusion transformer with fine-grained understanding of both English and Chinese. To construct Hunyuan-DiT, we carefully design the transformer structure, text encoder, and positional encoding. We also build from scratch a whole data pipeline to update and evaluate data for iterative model optimization. For fine-grained language understanding, we train a Multimodal Large Language Model to refine the captions of the images. Finally, Hunyuan-DiT can perform multi-turn multimodal dialogue with users, generating and refining images according to the context. Through our holistic human evaluation protocol with more than 50 professional human evaluators, Hunyuan-DiT sets a new state-of-the-art in Chinese-to-image generation compared with other open-source models.*
You can find the original codebase at [Tencent/HunyuanDiT](https://github.com/Tencent/HunyuanDiT) and all the available checkpoints at [Tencent-Hunyuan](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT).
**Highlights**: HunyuanDiT supports Chinese/English-to-image and multi-resolution generation.
HunyuanDiT has the following components:
* It uses a diffusion transformer as the backbone
* It combines two text encoders, a bilingual CLIP and a multilingual T5 encoder
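A minimal text-to-image sketch, assuming the diffusers-format checkpoint [Tencent-Hunyuan/HunyuanDiT-Diffusers](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT-Diffusers) (the prompt is illustrative):

```python
import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-Diffusers", torch_dtype=torch.float16
).to("cuda")

# prompts can be written in Chinese or English
image = pipe(prompt="一个宇航员在骑马").images[0]
image.save("astronaut.png")
```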
## HunyuanDiTPipeline
[[autodoc]] HunyuanDiTPipeline
- all
- __call__

@@ -1,58 +0,0 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# I2VGen-XL
[I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models](https://hf.co/papers/2311.04145.pdf) by Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qing, Xiang Wang, Deli Zhao, and Jingren Zhou.
The abstract from the paper is:
*Video synthesis has recently made remarkable strides benefiting from the rapid development of diffusion models. However, it still encounters challenges in terms of semantic accuracy, clarity and spatio-temporal continuity. They primarily arise from the scarcity of well-aligned text-video data and the complex inherent structure of videos, making it difficult for the model to simultaneously ensure semantic and qualitative excellence. In this report, we propose a cascaded I2VGen-XL approach that enhances model performance by decoupling these two factors and ensures the alignment of the input data by utilizing static images as a form of crucial guidance. I2VGen-XL consists of two stages: i) the base stage guarantees coherent semantics and preserves content from input images by using two hierarchical encoders, and ii) the refinement stage enhances the video's details by incorporating an additional brief text and improves the resolution to 1280×720. To improve the diversity, we collect around 35 million single-shot text-video pairs and 6 billion text-image pairs to optimize the model. By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos. Through extensive experiments, we have investigated the underlying principles of I2VGen-XL and compared it with current top methods, which can demonstrate its effectiveness on diverse data. The source code and models will be publicly available at [this https URL](https://i2vgen-xl.github.io/).*
The original codebase can be found [here](https://github.com/ali-vilab/i2vgen-xl/). The model checkpoints can be found [here](https://huggingface.co/ali-vilab/).
<Tip>
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines. Also, to learn more about reducing the memory usage of this pipeline, refer to the "Reduce memory usage" section [here](../../using-diffusers/svd#reduce-memory-usage).
</Tip>
Sample output with I2VGenXL:
<table>
<tr>
<td><center>
library.
<br>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/i2vgen-xl-example.gif"
alt="library"
style="width: 300px;" />
</center></td>
</tr>
</table>
## Notes
* I2VGenXL always uses a `clip_skip` value of 1. This means it leverages the penultimate layer representations from the text encoder of CLIP.
* It can generate videos of quality that is often on par with [Stable Video Diffusion](../../using-diffusers/svd) (SVD).
* Unlike SVD, it additionally accepts text prompts as inputs.
* It can generate higher resolution videos.
* When using the [`DDIMScheduler`] (which is the default for this pipeline), fewer than 50 inference steps leads to poor results (see the sketch below).
* This implementation is a 1-stage variant of I2VGenXL. The main figure in the [I2VGen-XL](https://arxiv.org/abs/2311.04145) paper shows a 2-stage variant; however, the 1-stage variant works well. See [this discussion](https://github.com/huggingface/diffusers/discussions/7952) for more details.
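A minimal image-to-video sketch reflecting these notes (the input image URL is illustrative, and 50 inference steps keeps the default [`DDIMScheduler`] in its working range):

```python
import torch
from diffusers import I2VGenXLPipeline
from diffusers.utils import export_to_gif, load_image

pipeline = I2VGenXLPipeline.from_pretrained(
    "ali-vilab/i2vgen-xl", torch_dtype=torch.float16, variant="fp16"
)
pipeline.enable_model_cpu_offload()

# illustrative input image URL; any RGB image of the subject works
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/i2vgen-xl-example-input.png")
prompt = "Papers were floating in the air on a table in the library"

frames = pipeline(
    prompt=prompt,
    image=image,
    num_inference_steps=50,  # fewer than 50 steps tends to degrade quality
    negative_prompt="distorted, blurry, low resolution, static",
    guidance_scale=9.0,
    generator=torch.manual_seed(8888),
).frames[0]
export_to_gif(frames, "i2v.gif")
```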
## I2VGenXLPipeline
[[autodoc]] I2VGenXLPipeline
- all
- __call__
## I2VGenXLPipelineOutput
[[autodoc]] pipelines.i2vgen_xl.pipeline_i2vgen_xl.I2VGenXLPipelineOutput

Some files were not shown because too many files have changed in this diff.