Running Stable Diffusion locally means free, private, unlimited image generation — if you have the right hardware. This guide covers setup on both Mac (Apple Silicon) and Windows (NVIDIA GPU).


Hardware Requirements

Minimum (functional):

  • NVIDIA GPU with 6GB VRAM (GTX 1060 6GB or better; the 3GB variant of that card falls short)
  • OR Apple Silicon Mac (M1 or better, 16GB RAM recommended)
  • 20GB free storage for models

Recommended:

  • NVIDIA RTX 3080/4070 or better (10-24GB VRAM)
  • OR Apple Silicon Mac with 32GB+ RAM
  • 50-100GB storage for multiple models and LoRAs

CPU-only: Works but generation takes minutes per image instead of seconds.
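
As a rough illustration, the NVIDIA thresholds above can be turned into a quick self-check. This is a minimal sketch, not an official tool: the function names classify_hardware and check_this_machine are hypothetical, the script assumes a bash shell (Git Bash on Windows), and it ignores Apple Silicon, where unified RAM is the relevant figure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: thresholds come from the Minimum/Recommended lists above.

# classify_hardware VRAM_GB FREE_DISK_GB
# Prints "recommended", "minimum", or "below-minimum".
classify_hardware() {
  local vram_gb=$1 free_gb=$2
  if [ "$vram_gb" -ge 10 ] && [ "$free_gb" -ge 50 ]; then
    echo "recommended"
  elif [ "$vram_gb" -ge 6 ] && [ "$free_gb" -ge 20 ]; then
    echo "minimum"
  else
    echo "below-minimum"
  fi
}

# check_this_machine
# Reads total VRAM of the first GPU (MiB) and free disk space in the
# current directory (KiB), then classifies them. NVIDIA systems only.
check_this_machine() {
  command -v nvidia-smi >/dev/null 2>&1 || { echo "nvidia-smi not found" >&2; return 1; }
  local vram_mib free_kb
  vram_mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
  free_kb=$(df -Pk . | awk 'NR==2 {print $4}')
  # Round VRAM to the nearest GB so a card reporting slightly under 6144 MiB still counts as 6.
  classify_hardware $(( (vram_mib + 512) / 1024 )) $(( free_kb / 1024 / 1024 ))
}
```

Run check_this_machine from the drive where you plan to store models, since the free-space figure is taken from the current directory.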


Choose Your Interface

AUTOMATIC1111 (A1111): Most features, largest community, steeper learning curve. Best for power users.

ComfyUI: Node-based workflow builder. Harder to learn, more powerful for complex pipelines. Best for automation.

Forge: Fork of A1111 with better performance on newer models. Recommended for 2026.

This guide covers both A1111/Forge and a quick ComfyUI start.


Windows Setup: AUTOMATIC1111 / Forge

Prerequisites

Install these in order:

  1. Python 3.10.x — NOT 3.11+, NOT the latest
  2. Git
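  3. An up-to-date NVIDIA GPU driver (from the NVIDIA website). A separate CUDA toolkit install is not needed, because the web UI installs a CUDA-enabled PyTorch build on first run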

Installation

# Clone the repository
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui

# The first run installs everything automatically (run from Command Prompt or PowerShell)
.\webui-user.bat

First run takes 15-30 minutes — it downloads dependencies automatically.

Forge installs the same way:

git clone https://github.com/lllyasviel/stable-diffusion-webui-forge
cd stable-diffusion-webui-forge
.\webui-user.bat

Forge is often 20-40% faster than A1111 on the same hardware, though the gain depends on the GPU and is usually largest on cards with limited VRAM.


Mac Setup (Apple Silicon)

Prerequisites

# Install Homebrew if not installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install dependencies
brew install cmake protobuf rust python@3.10 git wget
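
Because the Python version matters on both platforms, it can help to verify it before the first launch. The following is a minimal sketch with hypothetical function names (is_python_310, check_python); it assumes the interpreter is reachable as python3.10, which is the name Homebrew gives it. On Windows, pass python as the argument from Git Bash.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for confirming the interpreter is a 3.10.x release.

# is_python_310 VERSION_STRING
# Succeeds only for strings such as "Python 3.10.6".
is_python_310() {
  case "$1" in
    "Python 3.10."*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_python [INTERPRETER]
# Runs "<interpreter> --version" and reports whether it is 3.10.x.
check_python() {
  local py=${1:-python3.10}
  local version
  version=$("$py" --version 2>&1) || { echo "$py not found"; return 1; }
  if is_python_310 "$version"; then
    echo "OK: $version"
  else
    echo "Unsupported: $version (expected 3.10.x)"
    return 1
  fi
}
```

Call check_python with no arguments on a Mac, or check_python python on Windows.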

Installation

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh

Mac uses MPS (Metal Performance Shaders) instead of CUDA. Performance is good on M2/M3 chips.


Downloading Models

Models go in: stable-diffusion-webui/models/Stable-diffusion/

Realistic Imagery:

  • Stable Diffusion XL Base 1.0 (from HuggingFace: stabilityai/stable-diffusion-xl-base-1.0)
  • Realistic Vision v6 (civitai.com — excellent for portraits)

Artistic:

  • Dreamshaper XL
  • SDXL-Lightning (4-step, very fast)

Download via command line:

# From HuggingFace
cd stable-diffusion-webui/models/Stable-diffusion/
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

Or download .safetensors files from Civitai.com and move them to the models directory.


First Generation

  1. Open http://localhost:7860 in your browser (A1111 and Forge serve the web UI on this port by default)
  2. Select a model from the Checkpoint dropdown
  3. Type a positive prompt and optionally a negative prompt
  4. Click Generate

Basic Prompt Structure

[Subject], [setting/context], [style/medium], [lighting], [quality modifiers]

Example:

Positive: A woman reading a book in a cozy library, afternoon sunlight through tall windows, oil painting, warm colors, highly detailed, masterpiece

Negative: ugly, bad anatomy, deformed, low quality, blurry, watermark, text, nsfw
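
If you assemble prompts in scripts, the comma-separated structure above is easy to generate. This is a small sketch; build_prompt is a hypothetical helper, not part of A1111, and it simply joins whatever parts you pass in order.

```shell
#!/usr/bin/env bash
# build_prompt PART...
# Joins the non-empty parts with ", " in the order given.
build_prompt() {
  local out="" part
  for part in "$@"; do
    [ -n "$part" ] || continue
    if [ -n "$out" ]; then out="$out, $part"; else out="$part"; fi
  done
  printf '%s\n' "$out"
}

# Rebuilds the example positive prompt from its components.
build_prompt \
  "A woman reading a book in a cozy library" \
  "afternoon sunlight through tall windows" \
  "oil painting" \
  "warm colors" \
  "highly detailed, masterpiece"
```

Keeping the parts separate makes it simple to swap one element, such as the style, while holding the rest of the prompt constant.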

Key Settings Explained

Sampling Method: DPM++ 2M Karras (balanced), Euler a (creative), DDIM (fast)

Sampling Steps:

  • Fast: 15-20 steps (lower quality)
  • Standard: 25-30 steps (good balance)
  • Detailed: 40-50 steps (diminishing returns)

CFG Scale: How closely to follow the prompt

  • Low (3-5): More creative, less literal
  • Medium (7): Default, good balance
  • High (10-15): Strict prompt following, can be harsh

Size:

  • SDXL native: 1024×1024
  • SD 1.5-based models: 512×512 native
  • Portrait: 832×1216 or 768×1024
  • Landscape: 1216×832
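
The same settings can be sent to A1111's HTTP API, which is available when the web UI is launched with the --api flag. The sketch below makes several assumptions: the function names txt2img_payload and generate are hypothetical, the prompts must not contain double quotes or backslashes (no JSON escaping is done), and the sampler name shown may need adjusting because newer versions list the sampler and scheduler separately.

```shell
#!/usr/bin/env bash
# txt2img_payload PROMPT NEGATIVE [STEPS] [CFG] [WIDTH] [HEIGHT]
# Prints a JSON request body using the "standard" settings above as defaults.
# Assumption: prompts contain no double quotes or backslashes.
txt2img_payload() {
  local prompt=$1 negative=$2
  local steps=${3:-28} cfg=${4:-7} width=${5:-1024} height=${6:-1024}
  printf '{"prompt": "%s", "negative_prompt": "%s", "sampler_name": "DPM++ 2M Karras", "steps": %s, "cfg_scale": %s, "width": %s, "height": %s}\n' \
    "$prompt" "$negative" "$steps" "$cfg" "$width" "$height"
}

# generate PROMPT NEGATIVE [STEPS] [CFG] [WIDTH] [HEIGHT]
# Posts the payload to a local A1111 instance started with --api.
generate() {
  txt2img_payload "$@" |
    curl -s -X POST http://localhost:7860/sdapi/v1/txt2img \
      -H 'Content-Type: application/json' -d @-
}
```

For example, generate "a lighthouse at dusk, oil painting" "blurry, watermark" 28 7 1216 832 requests a landscape image; the response is JSON, with the generated images expected as base64 strings in its images field.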


ComfyUI Setup

ComfyUI is more complex but enables powerful automation:

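git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

# Install dependencies in a virtual environment to keep them separate from A1111
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Run
python main.py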

Access at http://localhost:8188

ComfyUI uses a node-based workflow. Each node represents an operation; you connect them to build generation pipelines. Start with the default workflow and expand from there.


LoRA Models

LoRA (Low-Rank Adaptation) models fine-tune specific styles or characters.

Install: Drop .safetensors files into models/Lora/

Use in prompt:

<lora:model_name:0.7>

Replace model_name with the LoRA's file name without the extension. The number (0.7) is the weight (0.1-1.0). Higher = stronger influence.
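
When prompts are generated by a script, a small helper can produce the tag and catch weights outside the range given above. This is only a sketch: lora_tag is a hypothetical function, and the 0.1-1.0 check mirrors this guide's recommendation rather than a limit enforced by the web UI.

```shell
#!/usr/bin/env bash
# lora_tag NAME [WEIGHT]
# Prints <lora:NAME:WEIGHT>. The weight defaults to 0.7 and is rejected
# if it falls outside the 0.1-1.0 range recommended in this guide.
lora_tag() {
  local name=$1 weight=${2:-0.7}
  if ! awk -v w="$weight" 'BEGIN { exit !(w + 0 >= 0.1 && w + 0 <= 1.0) }'; then
    echo "weight $weight is outside 0.1-1.0" >&2
    return 1
  fi
  printf '<lora:%s:%s>\n' "$name" "$weight"
}
```

For instance, lora_tag model_name prints the tag shown above, ready to append to a positive prompt.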

Find LoRAs: Civitai.com has thousands of community-trained LoRAs for styles, characters, and concepts.


Optimization Tips

For NVIDIA GPUs:

  • Add --xformers to the COMMANDLINE_ARGS line in webui-user.bat (for example, set COMMANDLINE_ARGS=--xformers) for faster generation
  • Use --medvram with 6-8GB VRAM GPUs
  • Use --lowvram with 4GB VRAM (slower but functional)

For Apple Silicon:

  • Use --no-half if you see color artifacts
  • Generation speed is best on M2 Pro/Max and later chips

Model format: Prefer .safetensors over .ckpt files — faster to load and safer.
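
The NVIDIA memory flags above can be picked mechanically from the amount of VRAM. The following is a minimal sketch with a hypothetical function name (suggest_vram_flag); it assumes a 5GB card should be treated like the 6-8GB tier, which the list above does not specify.

```shell
#!/usr/bin/env bash
# suggest_vram_flag VRAM_GB
# Prints --lowvram (4GB or less), --medvram (up to 8GB), or nothing.
suggest_vram_flag() {
  local vram_gb=$1
  if [ "$vram_gb" -le 4 ]; then
    echo "--lowvram"
  elif [ "$vram_gb" -le 8 ]; then
    echo "--medvram"
  fi
}

# Example: compose launch arguments for an 8GB card.
COMMANDLINE_ARGS="--xformers $(suggest_vram_flag 8)"
echo "$COMMANDLINE_ARGS"
```

The resulting string is what goes on the COMMANDLINE_ARGS line of webui-user.bat.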


Common Issues

CUDA out of memory: Reduce image resolution or add --medvram flag

Black/green images on Mac: Add --no-half flag

Slow on CPU: Normal — CPU generation is 50-100x slower than GPU. Consider cloud options.

Model not showing: Check the file is in the correct folder and refresh the checkpoint list
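
For the "model not showing" case, listing what is actually in the models folder is a quick first check. This is a sketch under the assumption that checkpoints use the .safetensors or .ckpt extension; list_checkpoints is a hypothetical helper.

```shell
#!/usr/bin/env bash
# list_checkpoints MODELS_DIR
# Prints the file names of .safetensors and .ckpt files in or under
# MODELS_DIR, sorted. Fails if the directory does not exist.
list_checkpoints() {
  local dir=$1
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  find "$dir" -type f \( -name '*.safetensors' -o -name '*.ckpt' \) \
    -exec basename {} \; | sort
}
```

Running list_checkpoints stable-diffusion-webui/models/Stable-diffusion should show the file you downloaded; if it does not, the file is in the wrong folder or the download did not complete.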


Privacy

Everything runs locally. No images or prompts are sent anywhere. This makes local Stable Diffusion ideal for:

  • NSFW content (subject to local laws)
  • Unreleased product images
  • Client work with confidentiality requirements
  • Any use case where you don’t want a third party seeing your generations