In the world of AI-generated art, input images are more than just starting points—they’re powerful tools for guiding the creative process. With ComfyUI, input images can be combined with features like ControlNet, depth maps, or edge detection to create outputs that are highly detailed, structurally consistent, and uniquely yours.

In this guide, we’ll explore the different ways you can use input images in ComfyUI, including methods, combinations, and exciting use cases to unlock your creative potential.

Why Use Input Images in ComfyUI?

Input images serve as visual anchors, helping to:

  • Guide Composition: Control the layout, perspective, or structure of your output.
  • Incorporate Style: Extract and reuse colors, textures, or artistic aesthetics.
  • Add Consistency: Maintain a unified look across a series of images.

Whether you’re enhancing a photograph, reinterpreting a sketch, or creating variations of an existing design, input images give you unparalleled control.

Key Methods for Using Input Images in ComfyUI

1. Image-to-Image (I2I)

This is the simplest way to use an input image. It serves as a base that Stable Diffusion modifies based on your text prompt.

  • How It Works:
    • Load an image using a Load Image node.
    • Encode the image to a latent with a VAE Encode node, then connect that latent to a KSampler node; the denoise value controls how strongly the image is changed (lower values preserve more of the original).
    • Add a CLIP Text Encode node to influence the final result with a prompt.
  • Use Cases:
    • Transforming a photo into an artwork (e.g., oil painting, anime style).
    • Adding surreal or stylistic elements to an image.
    • Enhancing resolution or details in an image while preserving its essence.
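
The steps above can be sketched in ComfyUI's API (JSON) workflow format, which is just a dictionary of nodes wired together by `["node_id", output_index]` references. The node class names below are stock ComfyUI; the checkpoint and image file names are placeholders you would swap for your own.

```python
# Minimal image-to-image graph in ComfyUI's API (JSON) format.
# File names ("sd15.safetensors", "input.png") are placeholders.
i2i_workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd15.safetensors"}},
    "2": {"class_type": "LoadImage",
          "inputs": {"image": "input.png"}},
    "3": {"class_type": "VAEEncode",            # pixels -> latent
          "inputs": {"pixels": ["2", 0], "vae": ["1", 2]}},
    "4": {"class_type": "CLIPTextEncode",       # positive prompt
          "inputs": {"text": "oil painting, rich texture", "clip": ["1", 1]}},
    "5": {"class_type": "CLIPTextEncode",       # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "6": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["4", 0],
                     "negative": ["5", 0], "latent_image": ["3", 0],
                     "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     # denoise < 1.0 keeps part of the input image;
                     # lower values preserve more of the original
                     "denoise": 0.6}},
    "7": {"class_type": "VAEDecode",            # latent -> pixels
          "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
    "8": {"class_type": "SaveImage",
          "inputs": {"images": ["7", 0], "filename_prefix": "i2i"}},
}
```

The key difference from text-to-image is node 3: instead of an Empty Latent Image, the sampler starts from the encoded input image, and `denoise` decides how far it drifts from it.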

2. ControlNet

ControlNet allows you to use structural data (like depth maps or edge detection) from an input image to guide the output. It’s one of the most powerful tools for image control.

  • Common ControlNet Models:
    • Depth Maps: Use grayscale depth data to maintain perspective and spatial arrangement.
    • Canny Edge Detection: Extract outlines from an image to preserve structure.
    • Pose Estimation: Guide the composition with human poses detected in an input image.
  • How It Works:
    • Use a ControlNet node to load structural data from your input image.
    • Combine this with a CLIP Text Encode node for additional creative guidance.
    • Adjust the ControlNet strength (and, with the advanced Apply node, the start/end percentages) to balance structure against style.
  • Use Cases:
    • Reimagining line art with textures and colors.
    • Keeping architectural designs precise while applying creative styles.
    • Transforming posed photos into stylized artworks.
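
In graph terms, a ControlNet slots into the conditioning path rather than the image path. The fragment below (a sketch, assuming upstream nodes "2" for the Load Image and "4" for the positive CLIP Text Encode, as in a basic graph) shows the wiring; the model file name is a placeholder.

```python
# Fragment: inserting a ControlNet into the conditioning path.
# Upstream node ids ("2", "4") and the model file are assumptions.
controlnet_nodes = {
    "10": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_v11p_sd15_canny.pth"}},
    "11": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["4", 0],   # positive prompt conditioning
                      "control_net": ["10", 0],
                      "image": ["2", 0],          # preprocessed control image
                      # strength trades structural fidelity for creative freedom
                      "strength": 0.8}},
}
```

The KSampler's `positive` input would then point at `["11", 0]` instead of the raw prompt encoding, so the structural constraint rides along with the text guidance.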

3. Depth-to-Image

Using depth maps extracted from an input image allows for detailed spatial guidance.

  • How It Works:
    • Generate a depth map with an estimator such as MiDaS (available through the ControlNet auxiliary preprocessor custom nodes), or load a precomputed depth image with a Load Image node.
    • Feed the depth map into a ControlNet Apply node loaded with a depth ControlNet model.
    • Combine with a prompt to define the style or additional elements.
  • Use Cases:
    • Creating realistic landscapes with consistent perspective.
    • Adding depth and dimension to flat compositions.
    • Maintaining proportional accuracy in architectural or environmental designs.
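
If you bring your own depth data (from MiDaS, a renderer, or a sensor), it usually needs rescaling to an 8-bit grayscale image before it can serve as a control image. This is a generic preprocessing sketch, not a ComfyUI node:

```python
import numpy as np

def normalize_depth(depth: np.ndarray) -> np.ndarray:
    """Rescale a raw depth map (arbitrary range) to 0-255 uint8
    grayscale, the form a depth control image is typically saved in."""
    d = depth.astype(np.float64)
    d -= d.min()                 # shift so the nearest/lowest value is 0
    rng = d.max()
    if rng > 0:
        d /= rng                 # scale the full range into [0, 1]
    return (d * 255).round().astype(np.uint8)

# Toy example: four raw depth values with an arbitrary range
raw = np.array([[0.5, 1.0], [2.0, 4.5]])
gray = normalize_depth(raw)
```

Save the result as a PNG and load it with a Load Image node; whether white means near or far depends on the depth model you pair it with, so check the model's convention.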

4. Sketch or Edge Maps

Sketches and edge maps let you control the structure of the output without needing a highly detailed base image.

  • How It Works:
    • Generate an edge map with Canny edge detection (ComfyUI includes a core Canny node) or provide a hand-drawn sketch.
    • Feed the map into a ControlNet Apply node loaded with a Canny or line-art ControlNet model.
    • Use prompts to guide textures, colors, and additional details.
  • Use Cases:
    • Turning rough sketches into fully realized artworks.
    • Reimagining outlines from existing images with creative styles.
    • Generating textures for 3D models or game assets.
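
Edge extraction can happen in-graph. The fragment below is a sketch using ComfyUI's core Canny node; node ids for the upstream image ("2"), prompt conditioning ("4"), and ControlNet loader ("10") are assumptions, as is the threshold pairing.

```python
# Fragment: in-graph Canny edge detection feeding a ControlNet.
# Upstream node ids and threshold values are illustrative assumptions.
edge_nodes = {
    "20": {"class_type": "Canny",
           "inputs": {"image": ["2", 0],        # source image to trace
                      "low_threshold": 0.3,     # weak-edge cutoff
                      "high_threshold": 0.7}},  # strong-edge cutoff
    "21": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["4", 0],
                      "control_net": ["10", 0], # a Canny ControlNet model
                      "image": ["20", 0],       # the extracted edge map
                      "strength": 1.0}},
}
```

Lower thresholds keep more (noisier) edges; a hand-drawn sketch can replace node 20 entirely by wiring a Load Image node into the Apply node instead.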

5. Latent Image Processing

Latent image processing manipulates the “hidden” data that Stable Diffusion uses to generate images.

  • How It Works:
    • Use a VAE Encode node to convert an input image into a latent representation.
    • Modify the latent space using additional nodes like noise or masking.
    • Decode the latent image back into pixel space with a VAE Decode node.
  • Use Cases:
    • Subtly altering images while preserving their core features.
    • Combining latent data from multiple images to create hybrids.
    • Experimenting with abstract or surreal results.
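
One of the simplest latent manipulations is blending two encoded images. This is a conceptual numpy sketch of the math, not a specific ComfyUI node; in a graph it would sit between two VAE Encode nodes and the sampler.

```python
import numpy as np

def blend_latents(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Linear interpolation between two latent tensors.
    t = 0.0 returns a unchanged; t = 1.0 returns b."""
    return (1.0 - t) * a + t * b

# Toy latents shaped like SD 1.5's (batch, channels, height/8, width/8)
lat_a = np.zeros((1, 4, 64, 64))
lat_b = np.ones((1, 4, 64, 64))
hybrid = blend_latents(lat_a, lat_b, 0.25)  # mostly lat_a, a hint of lat_b
```

Decoding such a hybrid directly tends to look muddy; running it through the sampler at a moderate denoise value lets the model resolve the blend into a coherent image.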

Combining Methods for Advanced Control

The real magic happens when you combine these methods. Here are a few ideas to get started:

Line Art + Depth Map

  • Workflow:
    • Use a line drawing as an edge map with a ControlNet Apply node and a Canny or line-art model.
    • Chain a second ControlNet Apply node carrying a depth model and your depth map.
    • Combine with a prompt like “a fantasy castle, watercolor style.”
  • Result: A richly detailed scene that respects both the structure and depth of your input.
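
Stacking ControlNets in ComfyUI works by chaining: the conditioning output of the first Apply node becomes the conditioning input of the second, so both constraints reach the sampler. A sketch, with node ids, file references, and strengths as placeholder assumptions:

```python
# Fragment: two chained ControlNetApply nodes (edges + depth).
# Upstream node ids ("4" prompt, "10"/"12" loaders, "20"/"22" maps)
# are illustrative assumptions.
stacked = {
    "30": {"class_type": "ControlNetApply",       # line art / edges
           "inputs": {"conditioning": ["4", 0],
                      "control_net": ["10", 0],
                      "image": ["20", 0],
                      "strength": 0.9}},
    "31": {"class_type": "ControlNetApply",       # depth
           "inputs": {"conditioning": ["30", 0],  # <- chained, not ["4", 0]
                      "control_net": ["12", 0],
                      "image": ["22", 0],
                      "strength": 0.6}},
}
```

The sampler's positive input then points at `["31", 0]`. Giving the edge map the higher strength, as here, usually keeps line work crisp while the depth map only nudges the spatial layout.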

Image-to-Image with LoRAs

  • Workflow:
    • Load an input image into a Load Image node.
    • Apply artistic LoRAs (e.g., watercolor, cyberpunk) in a LoRA Loader node.
    • Use a prompt to refine the style or add new elements.
  • Result: A stylized reimagining of your input image.
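
A LoRA patches both the model and the CLIP encoder, so in the graph it sits between the checkpoint loader and everything downstream. A sketch, with the LoRA file name as a placeholder:

```python
# Fragment: a LoRA Loader patching model and CLIP.
# The lora_name is a placeholder for a file in your loras folder.
lora_nodes = {
    "40": {"class_type": "LoraLoader",
           "inputs": {"model": ["1", 0],   # from the checkpoint loader
                      "clip": ["1", 1],
                      "lora_name": "watercolor_style.safetensors",
                      "strength_model": 0.8,   # how hard the LoRA shifts the UNet
                      "strength_clip": 0.8}},  # ...and the text encoder
}
# Downstream: the KSampler takes its model from ["40", 0], and the
# CLIP Text Encode nodes take their clip from ["40", 1].
```

Dropping `strength_clip` relative to `strength_model` is a common trick when a LoRA changes the look you want but drags the prompt interpretation too far with it.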

Pose + Texture Guidance

  • Workflow:
    • Extract a pose from a photo using a pose estimation tool.
    • Use the pose map with ControlNet to guide the composition.
    • Apply a texture or style LoRA for added flair.
  • Result: A dynamic, stylized character or scene that retains the pose’s integrity.
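
Putting this combination together, the LoRA patches the model, the pose map constrains the conditioning, and both meet at the sampler. A sketch, with upstream node ids and all file names as placeholder assumptions:

```python
# Fragment: pose ControlNet + style LoRA feeding one sampler.
# Node ids "2" (pose map image), "4"/"5" (prompts), "3" (latent)
# and all file names are illustrative assumptions.
combo = {
    "50": {"class_type": "LoraLoader",
           "inputs": {"model": ["1", 0], "clip": ["1", 1],
                      "lora_name": "ink_style.safetensors",
                      "strength_model": 0.7, "strength_clip": 0.7}},
    "51": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_v11p_sd15_openpose.pth"}},
    "52": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["4", 0],  # prompt conditioning
                      "control_net": ["51", 0],
                      "image": ["2", 0],         # extracted pose map
                      "strength": 1.0}},
    "53": {"class_type": "KSampler",
           "inputs": {"model": ["50", 0],        # LoRA-patched model
                      "positive": ["52", 0],     # pose-constrained conditioning
                      "negative": ["5", 0], "latent_image": ["3", 0],
                      "seed": 0, "steps": 25, "cfg": 7.0,
                      "sampler_name": "euler", "scheduler": "normal",
                      "denoise": 1.0}},
}
```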

Tips for Using Input Images in ComfyUI

  • Start Simple: Begin with a single input method (e.g., image-to-image) before experimenting with combinations.
  • Use Quality Input Images: High-quality images produce better results, especially when extracting structural data like depth maps or edge maps.
  • Adjust Parameters Dynamically: Use Primitive nodes to tweak weights or settings for ControlNet, LoRAs, or prompts on the fly.
  • Experiment with Strength Settings: In ControlNet or image-to-image workflows, strength (denoise, in the image-to-image case) controls how much the input image influences the final result. Adjust to find the right balance.
  • Save and Compare: Use an image-grid or compare node (available in several custom node packs) to view variations side by side.
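
When you want to iterate quickly on settings, you can also queue API-format workflows programmatically instead of clicking through the graph. ComfyUI's local server accepts a `POST` to `/prompt` with the workflow wrapped in a `{"prompt": ...}` payload; the sketch below builds that payload but does not call the server, so it runs stand-alone.

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"  # default local ComfyUI server

def build_payload(workflow: dict) -> bytes:
    """Wrap an API-format workflow for ComfyUI's /prompt endpoint."""
    return json.dumps({"prompt": workflow}).encode("utf-8")

def queue_prompt(workflow: dict) -> None:
    """POST the workflow so ComfyUI queues and executes it.
    (Defined but not called here, so the sketch needs no server.)"""
    req = urllib.request.Request(
        COMFY_URL, data=build_payload(workflow),
        headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

# Build a payload for a trivial one-node workflow
payload = build_payload({"1": {"class_type": "LoadImage",
                               "inputs": {"image": "input.png"}}})
```

Export any graph with "Save (API Format)" in the ComfyUI menu to get a workflow dict you can load, tweak in a loop (seeds, strengths, prompts), and send through `queue_prompt`.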

Creative Use Cases for Input Images

  • Photo Enhancement: Transform a photograph into a stylized artwork while retaining key details like composition and color balance.
  • Concept Art: Use sketches and depth maps to create polished concept art for games, movies, or other creative projects.
  • Character Design: Combine pose estimation with LoRAs to generate consistent, dynamic characters.
  • Architectural Visualization: Use depth maps or edge detection to create accurate yet artistic renderings of buildings and environments.
  • Abstract Explorations: Experiment with latent image processing to create surreal or abstract interpretations of your input image.

Final Thoughts: Let Your Input Images Guide the Way

Input images are the backbone of many advanced workflows in ComfyUI. Whether you’re transforming sketches into polished pieces, reimagining photos with artistic styles, or guiding the AI with depth and edge maps, the possibilities are endless. By combining methods and experimenting with tools like ControlNet, you can unlock a new level of precision and creativity in your AI-generated art.

So, grab your favorite input image, load up ComfyUI, and start exploring the boundless potential of guided image generation.