diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-02-17 00:06:20 +08:00

Author	SHA1	Message	Date
Sayak Paul	8ade2c94d0	[Hunyuan DiT] feat: enable fusing qkv projections when doing attention (#8396 ) * feat: introduce qkv fusion for Hunyuan * fix copies	2024-12-23 13:02:12 +05:30
leaps	942bc38be9	Update code example in pipeline_stable_unclip_img2img.py EXAMPLE_DOC_STRING (#8401 ) Update code example in pipeline_stable_unclip_img2img.py Previous code caused an error when run	2024-12-23 13:02:12 +05:30
Sayak Paul	4f77f52853	[Transformer2DModel] Handle `norm_type` safely while remapping (#8370 ) * handle norm_type of transformer2d_model safely. * log an info when old model class is being returned. * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * remove extra stuff --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2024-12-23 13:02:12 +05:30
Sayak Paul	9b1c118692	[HunyuanDiT] minor docs changes in hunyuandit (#8395 ) minor docs changes in hunyuandit	2024-12-23 13:02:12 +05:30
townwish4git	c0a81199d3	Fix AsymmetricAutoencoderKL forward (#8378 )	2024-12-23 13:02:12 +05:30
Marçal Comajoan Cara	26e7b4bd9a	Update transformer2d.md title (#8375 ) * Update transformer2d.md title For the other classes (e.g., UNet2DModel) the title of the documentation coincides with the name of the class, but that was not the case for Transformer2DModel. * Update model docs titles for consistency with class names	2024-12-23 13:02:12 +05:30
Dhruv Nair	92b915aca7	Update slow test actions (#8381 ) * update * update * update * update	2024-12-23 13:02:12 +05:30
XCL	d6c7e17867	Tencent Hunyuan Team - Updated Doc for HunyuanDiT (#8383 ) * add hunyuandit doc * update hunyuandit doc * update hunyuandit 2d model * update toctree.yml for hunyuandit	2024-12-23 13:02:12 +05:30
XCL	43510582cd	Tencent Hunyuan Team: add HunyuanDiT related updates (#8240 ) * Hunyuan Team: add HunyuanDiT related updates --------- Co-authored-by: XCLiu <liuxc1996@gmail.com> Co-authored-by: yiyixuxu <yixu310@gmail.com>	2024-12-23 13:02:12 +05:30
39th president of the United States, probably	7f58fad44d	Fix DREAM training (#8302 ) Co-authored-by: Jimmy <39@🇺🇸.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com>	2024-12-23 13:02:12 +05:30
Anton Obukhov	a495ed3e8b	Fix marigold documentation (#8372 ) * rename prs-eth/marigold-lcm-v1-0 into prs-eth/marigold-depth-lcm-v1-0 * update image paths in https://huggingface.co/datasets/huggingface/documentation-images to use main branch * fix relative paths to other diffusers pages * Update docs/source/en/using-diffusers/marigold_usage.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-12-23 13:02:12 +05:30
Sayak Paul	b9e55fcbf7	[Core] Introduce class variants for `Transformer2DModel` (#7647 ) * init for patches * finish patched model. * continuous transformer * vectorized transformer2d. * style. * inits. * fix-copies. * introduce DiTTransformer2DModel. * fixes * use REMAPPING as suggested by @DN6 * better logging. * add pixart transformer model. * inits. * caption_channels. * attention masking. * fix use_additional_conditions. * remove print. * debug * flatten * fix: assertion for sigma * handle remapping for modeling_utils * add tests for dit transformer2d * quality * placeholder for pixart tests * pixart tests * add _no_split_modules * add docs. * check * check * check * check * fix tests * fix tests * move Transformer output to modeling_output * move errors better and bring back use_additional_conditions attribute. * add unnecessary things from DiT. * clean up pixart * fix remapping * fix device_map things in pixart2d. * replace Transformer2DModel with appropriate classes in dit, pixart tests * empty * legacy mixin classes./ * use a remapping dict for fetching class names. * change to specifc model types in the pipeline implementations. * move _fetch_remapped_cls_from_config to modeling_loading_utils.py * fix dependency problems. * add deprecation note.	2024-12-23 13:02:12 +05:30
Dhruv Nair	f6553106f7	Change checkpoint key used to identify CLIP models in single file checkpoints (#8319 ) update	2024-12-23 13:02:12 +05:30
Jonah	6f0153747c	Fix depth pipeline "input/weight type should be the same" error at fp16 (#8321 ) Fix "input/weight type should be the same" Co-authored-by: YiYi Xu <yixu310@gmail.com>	2024-12-23 13:02:12 +05:30
satani99	760e5993de	Modularize train_text_to_image_lora_sdxl inferencing during and after training in example (#8335 ) * Modularized the train_lora_sdxl file * Modularized the train_lora_sdxl file * Modularized the train_lora_sdxl file --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:12 +05:30
Genius Patrick	a1cfb0accc	fix(training): lr scheduler doesn't work properly in distributed scenarios (#8312 )	2024-12-23 13:02:12 +05:30
Dhruv Nair	4be03cfa82	Fix StableDiffusionPipeline when `text_encoder=None` (#8297 ) * update * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:12 +05:30
Tolga Cangöz	b2abd9e371	Fix Copying Mechanism typo/bug (#8232 ) * Fix copying mechanism typos * fix copying mecha * Revert, since they are in TODO * Fix copying mechanism	2024-12-23 13:02:12 +05:30
Steven Liu	c62a927ba1	[docs] Files and formats (#7874 ) * files and formats * fix callout * feedback * code sample * feedback	2024-12-23 13:02:12 +05:30
Steven Liu	8ccc3f225c	[docs] DeepFloyd training (#8224 ) deepfloyd training Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:12 +05:30
Tolga Cangöz	64e3d998b3	Simplify `platform_info` assignment in `diffusers-cli env` (#8298 ) chore: Simplify `platform_info` assignment	2024-12-23 13:02:12 +05:30
satani99	630a71361f	Modularize train_text_to_image_lora SD inferencing during and after training in example (#8283 ) * Modularized the train_lora file * Modularized the train_lora file * Modularized the train_lora file * Modularized the train_lora file * Modularized the train_lora file --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:12 +05:30
Sayak Paul	e135b4dadf	post release v0.28.0 (#8286 ) * post release v0.28.0 * style	2024-12-23 13:02:12 +05:30
Sayak Paul	93c3abef71	[Core] Refactor `IPAdapterPlusImageProjection` a bit (#7994 ) * use IPAdapterPlusImageProjectionBlock in IPAdapterPlusImageProjection * reposition IPAdapterPlusImageProjection * refactor complete? * fix heads param retrieval. * update test dict creation method.	2024-12-23 13:02:12 +05:30
Sayak Paul	3a932ec0bd	move `vqmodel` to `models.autoencoders`. (#8292 ) move vqmodel to models.autoencoders.	2024-12-23 13:02:12 +05:30
Sayak Paul	84f7039fb3	[Post release 0.28.0] remove deprecated blocks. (#8291 ) * remove deprecated blocks. * update the location paths.	2024-12-23 13:02:12 +05:30
Vladimir Mandic	57843c9fac	fix pixart-sigma negative prompt handling (#8299 ) * fix negative prompt * fix --------- Co-authored-by: yiyixuxu <yixu310@gmail,com> Co-authored-by: YiYi Xu <yixu310@gmail.com>	2024-12-23 13:02:12 +05:30
Steven Liu	e73f80f05a	[docs] Outpaint (#7964 ) * first draft * edits	2024-12-23 13:02:12 +05:30
Steven Liu	6e0c2947e7	[docs] Scheduler features (#7990 ) * noise schedule * sigmas and zero snr * feedback * feedback	2024-12-23 13:02:12 +05:30
Álvaro Somoza	f59e4b1df9	Fix object has no attribute 'flush' when using without a console (#8271 ) fix	2024-12-23 13:02:12 +05:30
Sajad Norouzi	b52b0d247e	Add Kohya fix to SD pipeline for high resolution generation (#7633 ) add kohya high resolution fix.	2024-12-23 13:02:12 +05:30
Sayak Paul	c512cc9845	change to yiyi's address. (#7981 ) * change to yiyi's address. * update to diffusers@huggingface.co	2024-12-23 13:02:12 +05:30
Sayak Paul	a33f850d5c	[LoRA] attempt at fixing onetrainer lora. (#8242 ) * attempt at fixing onetrainer lora. * fix	2024-12-23 13:02:12 +05:30
Jiwook Han	36ce49851d	Fix typo in `philosophy.md` (#8303 ) fix typo in philosophy.md	2024-12-23 13:02:12 +05:30
Álvaro Somoza	eedcdafe25	[docs] Add controlnet example to marigold (#8289 ) * initial doc * fix wrong LCM sentence * implement binary colormap without requiring matplotlib update section about Marigold for ControlNet update formatting of marigold_usage.md * fix indentation --------- Co-authored-by: anton <anton.obukhov@gmail.com>	2024-12-23 13:02:12 +05:30
Sayak Paul	879ad6d38c	install wget. (#8285 )	2024-12-23 13:02:12 +05:30
Anton Obukhov	0be111f3d0	[Pipeline] Marigold depth and normals estimation (#7847 ) * implement marigold depth and normals pipelines in diffusers core * remove bibtex * remove deprecations * remove save_memory argument * remove validate_vae * remove config output * remove batch_size autodetection * remove presets logic move default denoising_steps and processing_resolution into the model config make default ensemble_size 1 * remove no_grad * add fp16 to the example usage * implement is_matplotlib_available use is_matplotlib_available, is_scipy_available for conditional imports in the marigold depth pipeline * move colormap, visualize_depth, and visualize_normals into export_utils.py * make the denoising loop more lucid fix the outputs to always be 4d tensors or lists of pil images support a 4d input_image case attempt to support model_cpu_offload_seq move check_inputs into a separate function change default batch_size to 1, remove any logic to make it bigger implicitly * style * rename denoising_steps into num_inference_steps * rename input_image into image * rename input_latent into latents * remove decode_image change decode_prediction to use the AutoencoderKL.decode method * move clean_latent outside of progress_bar * refactor marigold-reusable image processing bits into MarigoldImageProcessor class * clean up the usage example docstring * make ensemble functions members of the pipelines * add early checks in check_inputs rename E into ensemble_size in depth ensembling * fix vae_scale_factor computation * better compatibility with torch.compile better variable naming * move export_depth_to_png to export_utils * remove encode_prediction * improve visualize_depth and visualize_normals to accept multi-dimensional data and lists remove visualization functions from the pipelines move exporting depth as 16-bit PNGs functionality from the depth pipeline update example docstrings * do not shortcut vae.config variables * change all asserts to raise ValueError * rename output_prediction_type to output_type * better variable names clean up variable deletion code * better variable names * pass desc and leave kwargs into the diffusers progress_bar implement nested progress bar for images and steps loops * implement scale_invariant and shift_invariant flags in the ensemble_depth function add scale_invariant and shift_invariant flags readout from the model config further refactor ensemble_depth support ensembling without alignment add ensemble_depth docstring * fix generator device placement checks * move encode_empty_text body into the pipeline call * minor empty text encoding simplifications * adjust pipelines' class docstrings to explain the added construction arguments * improve the scipy failure condition add comments improve docstrings change the default use_full_z_range to True * make input image values range check configurable in the preprocessor refactor load_image_canonical in preprocessor to reject unknown types and return the image in the expected 4D format of tensor and on right device support a list of everything as inputs to the pipeline, change type to PipelineImageInput implement a check that all input list elements have the same dimensions improve docstrings of pipeline outputs remove check_input pipeline argument * remove forgotten print * add prediction_type model config * add uncertainty visualization into export utils fix NaN values in normals uncertainties * change default of output_uncertainty to False better handle the case of an attempt to export or visualize none * fix `output_uncertainty=False` * remove kwargs fix check_inputs according to the new inputs of the pipeline * rename prepare_latent into prepare_latents as in other pipelines annotate prepare_latents in normals pipeline with "Copied from" annotate encode_image in normals pipeline with "Copied from" * move nested-capable `progress_bar` method into the pipelines revert the original `progress_bar` method in pipeline_utils * minor message improvement * fix cpu offloading * move colormap, visualize_depth, export_depth_to_16bit_png, visualize_normals, visualize_uncertainty to marigold_image_processing.py update example docstrings * fix missing comma * change torch.FloatTensor to torch.Tensor * fix importing of MarigoldImageProcessor * fix vae offloading fix batched image encoding remove separate encode_image function and use vae.encode instead * implement marigold's intial tests relax generator checks in line with other pipelines implement return_dict __call__ argument in line with other pipelines * fix num_images computation * remove MarigoldImageProcessor and outputs from import structure update tests * update docstrings * update init * update * style * fix * fix * up * up * up * add simple test * up * update expected np input/output to be channel last * move expand_tensor_or_array into the MarigoldImageProcessor * rewrite tests to follow conventions - hardcoded slices instead of image artifacts write more smoke tests * add basic docs. * add anton's contribution statement * remove todos. * fix assertion values for marigold depth slow tests * fix assertion values for depth normals. * remove print * support AutoencoderTiny in the pipelines * update documentation page add Available Pipelines section add Available Checkpoints section add warning about num_inference_steps * fix missing import in docstring fix wrong value in visualize_depth docstring * [doc] add marigold to pipelines overview * [doc] add section "usage examples" * fix an issue with latents check in the pipelines * add "Frame-by-frame Video Processing with Consistency" section * grammarly * replace tables with images with css-styled images (blindly) * style * print * fix the assertions. * take from the github runner. * take the slices from action artifacts * style. * update with the slices from the runner. * remove unnecessary code blocks. * Revert "[doc] add marigold to pipelines overview" This reverts commit a505165150afd8dab23c474d1a054ea505a56a5f. * remove invitation for new modalities * split out marigold usage examples * doc cleanup --------- Co-authored-by: yiyixuxu <yixu310@gmail.com> Co-authored-by: yiyixuxu <yixu310@gmail,com> Co-authored-by: sayakpaul <spsayakpaul@gmail.com>	2024-12-23 13:02:12 +05:30
Dhruv Nair	bc984f82ee	Add zip package to doc builder image (#8284 ) update	2024-12-23 13:02:12 +05:30
Sayak Paul	1be1282cbc	[Workflows] add a more secure way to run tests from a PR. (#7969 ) * add a more secure way to run tests from a PR. * make pytest more secure. * address dhruv's comments. * improve validation check. * Update .github/workflows/run_tests_from_a_pr.yml Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2024-12-23 13:02:12 +05:30
Dhaivat Bhatt	d1ddc5eb78	Add details about 1-stage implementation in I2VGen-XL docs (#8282 ) * Add details about 1-stage implementation * Add details about 1-stage implementation	2024-12-23 13:02:12 +05:30
Tolga Cangöz	d8d7a0e307	Fix CPU Offloading Usage & Typos (#8230 ) * Fix typos * Fix `pipe.enable_model_cpu_offload()` usage * Fix cpu offloading * Update numbers	2024-12-23 13:02:12 +05:30
Tolga Cangöz	c69aff7798	Fix a grammatical error in the `raise` messages (#8272 ) Fix grammatical error	2024-12-23 13:02:12 +05:30
Yue Wu	6c302b4b09	sampling bug fix in diffusers tutorial "basic_training.md" (#8223 ) sampling bug fix in basic_training.md In the diffusers basic training tutorial, setting the manual seed argument (generator=torch.manual_seed(config.seed)) in the pipeline call inside evaluate() function rewinds the dataloader shuffling, leading to overfitting due to the model seeing same sequence of training examples after every evaluation call. Using generator=torch.Generator(device='cpu').manual_seed(config.seed) avoids this.	2024-12-23 13:02:12 +05:30
Dhruv Nair	915c094068	Clean up `from_single_file` docs (#8268 ) * update * update	2024-12-23 13:02:12 +05:30
Lucain	ebb3b44eec	Respect `resume_download` deprecation V2 (#8267 ) * Fix resume_downoad FutureWarning * only resume download	2024-12-23 13:02:12 +05:30
Sayak Paul	7425b46859	[Chore] run the documentation workflow in a custom container. (#8266 ) run the documentation workflow in a custom container.	2024-12-23 13:02:12 +05:30
Yifan Zhou	ae1fb33d4c	[Community Pipeline] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation (#8239 ) * code and doc * update paper link * remove redundant codes * add example video --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:12 +05:30
Dhruv Nair	d57bfe7405	Use `freedesktop_os_release()` in diffusers cli for Python >=3.10 (#8235 ) * update * update	2024-12-23 13:02:12 +05:30
Dhruv Nair	84b6effa66	Create custom container for doc builder (#8263 ) * update * update	2024-12-23 13:02:12 +05:30
Dhruv Nair	6534924ae1	Fix resize issue in SVD pipeline with VideoProcessor (#8229 ) update Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:12 +05:30

4913 Commits