Stable Diffusion

4538 readers
2 users here now

Discuss matters related to our favourite AI Art generation technology


founded 2 years ago
MEGATHREAD (lemmy.dbzer0.com)
submitted 2 years ago by [email protected] to c/[email protected]

This is a copy of the /r/stablediffusion wiki, to help people who need access to that information.


Howdy and welcome to r/stablediffusion! I'm u/Sandcheeze and I have collected these resources and links to help you enjoy Stable Diffusion, whether you are here for the first time or looking to add more customization to your image generations.

If you'd like to show support, feel free to send us kind words or check out our Discord. Donations are appreciated but not necessary; being a great part of the community is all we ask for.

Note: The community resources provided here are not endorsed, vetted, nor provided by Stability AI.

# Stable Diffusion

Local Installation

Active Community Repos/Forks to install on your PC and keep it local.

Online Websites

Websites with usable Stable Diffusion right in your browser. No need to install anything.

Mobile Apps

Stable Diffusion on your mobile device.

Tutorials

Learn how to improve your Stable Diffusion skills, whether you are a beginner or an expert.

Dream Booth

How to train a custom model, plus resources for doing so.

Models

Specially trained towards certain subjects and/or styles.

Embeddings

Tokens trained on specific subjects and/or styles.

Bots

Either bots you can self-host, or bots you can use directly on various websites and services such as Discord, Reddit, etc.

3rd Party Plugins

SD plugins for programs such as Discord, Photoshop, Krita, Blender, Gimp, etc.

Other useful tools

# Community

Games

  • PictionAIry: (Video | 2-6 players) - the image-guessing game where AI does the drawing!

Podcasts

Databases or Lists

Still updating this with more links as I collect them all here.

FAQ

How do I use Stable Diffusion?

  • Check out our guides section above!

Will it run on my machine?

  • Stable Diffusion requires a GPU with at least 4 GB of VRAM to run locally, and beefier graphics cards (10, 20, or 30 Series Nvidia cards) are needed to generate high-resolution or high-step images. Alternatively, anyone can run it online through DreamStudio or by hosting it on their own GPU compute cloud server. (A minimal local-generation example follows this list.)
  • Only Nvidia cards are officially supported.
  • AMD support is available here unofficially.
  • Apple M1 Chip support is available here unofficially.
  • Intel based Macs currently do not work with Stable Diffusion.
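For a quick sanity check that your machine can run it, here is a minimal local text-to-image sketch using the Hugging Face diffusers library. The model id and the memory-saving options are illustrative rather than an official recommendation; half precision plus attention slicing is what usually lets the pipeline fit on ~4 GB cards.

```python
# Minimal local text-to-image sketch with diffusers (illustrative model id).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # placeholder checkpoint; any SD 1.x works
    torch_dtype=torch.float16,          # half precision to reduce VRAM use
)
pipe = pipe.to("cuda")                  # requires an Nvidia GPU with CUDA
pipe.enable_attention_slicing()         # trades a little speed for lower VRAM

image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```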

How do I get a website or resource added here?

If you have a suggestion for a website or a project to add to our list, or if you would like to contribute to the wiki, please don't hesitate to reach out to us via modmail or message me.

submitted 1 day ago* (last edited 1 day ago) by [email protected] to c/[email protected]

Abstract

Style transfer involves transferring the style from a reference image to the content of a target image. Recent advancements in LoRA-based (Low-Rank Adaptation) methods have shown promise in effectively capturing the style of a single image. However, these approaches still face significant challenges such as content inconsistency, style misalignment, and content leakage. In this paper, we comprehensively analyze the limitations of the standard diffusion parameterization, which learns to predict noise, in the context of style transfer. To address these issues, we introduce ConsisLoRA, a LoRA-based method that enhances both content and style consistency by optimizing the LoRA weights to predict the original image rather than noise. We also propose a two-step training strategy that decouples the learning of content and style from the reference image. To effectively capture both the global structure and local details of the content image, we introduce a stepwise loss transition strategy. Additionally, we present an inference guidance method that enables continuous control over content and style strengths during inference. Through both qualitative and quantitative evaluations, our method demonstrates significant improvements in content and style consistency while effectively reducing content leakage.

Paper: https://arxiv.org/abs/2503.10614

Code: https://github.com/000linlin/ConsisLoRA (coming soon)

Project Page: https://consislora.github.io/
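The official code is not yet released, so purely as an unofficial illustration of the core idea in the abstract (supervising the LoRA-adapted model against the original image rather than the added noise), the two objectives can be sketched as below. All function and variable names are placeholders, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def epsilon_pred_loss(model, x0, t, alphas_cumprod):
    """Standard diffusion objective: the network predicts the added noise."""
    noise = torch.randn_like(x0)
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise      # forward diffusion
    return F.mse_loss(model(x_t, t), noise)

def x0_pred_loss(model, x0, t, alphas_cumprod):
    """ConsisLoRA-style target (sketch): the LoRA-adapted network's output is
    interpreted as an estimate of the original image x0 and supervised against
    it directly, instead of against the noise."""
    noise = torch.randn_like(x0)
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * noise
    return F.mse_loss(model(x_t, t), x0)
```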

submitted 2 weeks ago* (last edited 2 weeks ago) by [email protected] to c/[email protected]
 
 

Model Description

TensorArt-TurboX-SD3.5Large is a highly optimized variant of Stable Diffusion 3.5 Large, achieving 6x faster generation speed with minimal quality loss. It surpasses the official Stable Diffusion 3.5 Large Turbo in image detail, diversity, richness, and realism.


Repo: https://github.com/nygaard91/Pixel-Perfect-AI-Art-Converter

AI-generated pixel art often isn't truly pixel perfect, is oversized for game assets, or is otherwise unsuitable for professional use; this tool provides a practical, manual approach that can better meet those needs.
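The repo implements its own manual workflow; purely as a hedged illustration of what "pixel perfect" usually involves (snapping the image onto an exact grid, limiting the palette, then nearest-neighbour upscaling), a minimal Pillow sketch might look like this. The grid size, palette size, and scale factor are made-up parameters, not the tool's defaults.

```python
# Sketch: snap an AI-generated "pixel art" image onto a true pixel grid.
from PIL import Image

def to_pixel_perfect(path, grid_size=64, palette_colors=32, scale=8):
    img = Image.open(path).convert("RGB")
    # One colour sample per logical pixel (assumes a roughly square image).
    small = img.resize((grid_size, grid_size), Image.BOX)
    # Limit the palette so each cell uses a flat colour.
    small = small.quantize(colors=palette_colors).convert("RGB")
    # Nearest-neighbour upscale keeps every logical pixel a crisp square.
    return small.resize((grid_size * scale, grid_size * scale), Image.NEAREST)

to_pixel_perfect("ai_art.png").save("pixel_perfect.png")
```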


Abstract

Recent advancements in text-to-image generative systems have been largely driven by diffusion models. However, single-stage text-to-image diffusion models still face challenges in terms of computational efficiency and the refinement of image details. To tackle these issues, we propose CogView3, an innovative cascaded framework that enhances the performance of text-to-image diffusion. CogView3 is the first model implementing relay diffusion in the realm of text-to-image generation, executing the task by first creating low-resolution images and subsequently applying relay-based super-resolution. This methodology not only results in competitive text-to-image outputs but also greatly reduces both training and inference costs. Our experimental results demonstrate that CogView3 outperforms SDXL, the current state-of-the-art open-source text-to-image diffusion model, by 77.0% in human evaluations, all while requiring only about 1/2 of the inference time. The distilled variant of CogView3 achieves comparable performance while utilizing only 1/10 of the inference time of SDXL.

Paper: https://arxiv.org/abs/2403.05121

Code: https://github.com/THUDM/CogView4

Demo: https://huggingface.co/spaces/THUDM-HF-SPACE/CogView4

Model Weights: https://huggingface.co/THUDM/CogView4-6B/tree/main
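As an unofficial sketch of the cascade described in the abstract (a cheap low-resolution base pass followed by relay-based super-resolution that partially re-noises and refines the upsampled image), with placeholder model objects and method names rather than the released CogView API:

```python
# Conceptual sketch of a relay-diffusion cascade; base_model, relay_model and
# their methods are placeholders, not the actual CogView3/CogView4 interface.
def cascaded_generate(prompt, base_model, relay_model, upscale=2):
    # Stage 1: low-resolution text-to-image pass (cheap to train and sample).
    low_res = base_model.generate(prompt, height=512, width=512)

    # Stage 2: relay super-resolution. The upsampled image is partially
    # re-noised and denoised at the target resolution, so the relay model
    # refines details instead of generating from pure noise.
    upsampled = low_res.resize((512 * upscale, 512 * upscale))
    noisy = relay_model.add_relay_noise(upsampled, start_t=0.3)  # hypothetical start point
    return relay_model.denoise(noisy, prompt)
```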


Abstract

In 3D modeling, designers often use an existing 3D model as a reference to create new ones. This practice has inspired the development of Phidias, a novel generative model that uses diffusion for reference-augmented 3D generation. Given an image, our method leverages a retrieved or user-provided 3D reference model to guide the generation process, thereby enhancing the generation quality, generalization ability, and controllability. Our model integrates three key components: 1) meta-ControlNet that dynamically modulates the conditioning strength, 2) dynamic reference routing that mitigates misalignment between the input image and 3D reference, and 3) self-reference augmentations that enable self-supervised training with a progressive curriculum. Collectively, these designs result in a clear improvement over existing methods. Phidias establishes a unified framework for 3D generation using text, image, and 3D conditions with versatile applications.

Paper: https://arxiv.org/abs/2409.11406

Code: https://github.com/3DTopia/Phidias-Diffusion

Models: https://huggingface.co/ZhenweiWang/Phidias-Diffusion/tree/main

Project Page: https://rag-3d.github.io/
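As a rough, hedged illustration of the first component only (a meta module that modulates how strongly the 3D reference conditions each denoising step), with module and variable names invented for this sketch rather than taken from the paper's code:

```python
import torch
import torch.nn as nn

class MetaGate(nn.Module):
    """Sketch of a meta-ControlNet-style gate: predicts a per-step scalar in
    (0, 1) from the timestep embedding and scales the reference-conditioned
    residuals before they are added back into the backbone features."""
    def __init__(self, embed_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(embed_dim, 128), nn.SiLU(), nn.Linear(128, 1))

    def forward(self, timestep_emb, control_residuals):
        strength = torch.sigmoid(self.mlp(timestep_emb))          # conditioning strength
        return [strength.view(-1, 1, 1, 1) * r for r in control_residuals]
```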


Abstract

Multi-layer image generation is a fundamental task that enables users to isolate, select, and edit specific image layers, thereby revolutionizing interactions with generative models. In this paper, we introduce the Anonymous Region Transformer (ART), which facilitates the direct generation of variable multi-layer transparent images based on a global text prompt and an anonymous region layout. Inspired by Schema theory, which suggests that knowledge is organized in frameworks (schemas) that enable people to interpret and learn from new information by linking it to prior knowledge, this anonymous region layout allows the generative model to autonomously determine which set of visual tokens should align with which text tokens, in contrast to the previously dominant semantic layout for the image generation task. In addition, the layer-wise region crop mechanism, which only selects the visual tokens belonging to each anonymous region, significantly reduces attention computation costs and enables the efficient generation of images with numerous distinct layers (e.g., 50+). When compared to the full attention approach, our method is over 12 times faster and exhibits fewer layer conflicts. Furthermore, we propose a high-quality multi-layer transparent image autoencoder that supports the direct encoding and decoding of the transparency of variable multi-layer images in a joint manner. By enabling precise control and scalable layer generation, ART establishes a new paradigm for interactive content creation.

Paper: https://arxiv.org/abs/2502.18364

Code: https://github.com/microsoft/art-msra

Demo: http://20.65.136.27:8060/

Project Page: https://art-msra.github.io/
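The layer-wise region crop is the part that makes 50+ layers tractable. A hedged PyTorch sketch of the idea, where each anonymous region attends only to its own visual tokens plus the shared text tokens, follows; all names are invented for illustration and this is not the released code:

```python
import torch

def region_cropped_attention(visual_tokens, text_tokens, region_ids, attn):
    """visual_tokens: (N, D), text_tokens: (T, D), region_ids: (N,) ints,
    attn: any self-attention block taking (B, L, D). Each region is processed
    on its own cropped token set, avoiding full attention over all layers."""
    out = torch.zeros_like(visual_tokens)
    for r in region_ids.unique():
        mask = region_ids == r
        n = int(mask.sum())
        local = torch.cat([visual_tokens[mask], text_tokens], dim=0)
        updated = attn(local.unsqueeze(0)).squeeze(0)   # attention within the crop
        out[mask] = updated[:n]                         # keep only the visual part
    return out
```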

Type (lemmy.fish)
submitted 1 month ago by [email protected] to c/[email protected]
submitted 1 month ago* (last edited 1 month ago) by [email protected] to c/[email protected]
 
 

Abstract

Diffusion models (DMs) have become the leading choice for generative tasks across diverse domains. However, their reliance on multiple sequential forward passes significantly limits real-time performance. Previous acceleration methods have primarily focused on reducing the number of sampling steps or reusing intermediate results, failing to leverage variations across spatial regions within the image due to the constraints of convolutional U-Net structures. By harnessing the flexibility of Diffusion Transformers (DiTs) in handling a variable number of tokens, we introduce RAS, a novel, training-free sampling strategy that dynamically assigns different sampling ratios to regions within an image based on the focus of the DiT model. Our key observation is that during each sampling step, the model concentrates on semantically meaningful regions, and these areas of focus exhibit strong continuity across consecutive steps. Leveraging this insight, RAS updates only the regions currently in focus, while other regions are updated using cached noise from the previous step. The model's focus is determined based on the output from the preceding step, capitalizing on the temporal consistency we observed. We evaluate RAS on Stable Diffusion 3 and Lumina-Next-T2I, achieving speedups of up to 2.36x and 2.51x, respectively, with minimal degradation in generation quality. Additionally, a user study reveals that RAS delivers comparable quality under human evaluation while achieving a 1.6x speedup. Our approach makes a significant step towards more efficient diffusion transformers, enhancing their potential for real-time applications.

Paper: https://arxiv.org/abs/2502.10389

Paper Summary: https://www.aimodels.fyi/papers/arxiv/region-adaptive-sampling-diffusion-transformers

Code: https://github.com/microsoft/RAS

Project Page: https://microsoft.github.io/RAS/
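A hedged sketch of the sampling loop's core trick (re-evaluating only the tokens the model is currently focused on and reusing cached results for the rest); the focus metric and all names below are illustrative stand-ins, not the released Microsoft implementation:

```python
import torch

def ras_step(dit, x_t, t, prev_output, focus_ratio=0.5):
    """x_t, prev_output: (N, D) token sequences from a DiT. Tokens judged most
    important from the previous step's output are re-evaluated this step; the
    remaining tokens simply reuse the cached previous-step result."""
    importance = prev_output.norm(dim=-1)          # crude stand-in for the paper's focus metric
    k = int(focus_ratio * x_t.shape[0])
    focus_idx = importance.topk(k).indices

    output = prev_output.clone()                   # cached result for out-of-focus regions
    output[focus_idx] = dit(x_t[focus_idx], t)     # fresh compute only for focus regions
    return output
```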


"This feature is launching first for Gold-tier Civitai Subscribers as we refine and improve its functionality" - so they're most likely planning to expand it to all users at some point.

Pretty disappointed with the direction CivitAI is going.
