Building Clipper: An AI Image Generator You Control

May 7, 2025

“If you’ve ever pasted 50 prompts into an image generator one-by-one, this is for you. I hit my limit and built Clipper to solve it.”

📖 Summary

In the previous post I shared a research paper, Cross-Modal Cognitive Mapping. The paper is about turning your conversations into images to gradually map your thought patterns. Its implementation is an application called Prism.

One component of that app is image generation from prompts or from your conversations. All of the foundation models support this, but the process is pretty janky: you generate the prompt, paste it into a text box, and download the image. I just spent a week doing this while building a prompt toolkit, and the whole time I kept wishing I had already built the app I’m going to share with you now.

⚙️ Clipper

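Read the documentation here: Clipper Docs

Clipper performs a really simple task: it takes a prompt or a file of prompts, generates the images through WebUI Forge, and logs what it produced: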

 ┌──────────────────────────────────────┐
 │  📥 Prompt or Prompt File (Text)     │
 └──────────────────────────────────────┘
                  │
                  ▼
 ┌──────────────────────────────────────┐
 │  🎨 Image Generation (WebUI Forge)   │
 └──────────────────────────────────────┘
                  │
                  ▼
 ┌──────────────────────────────────────┐
 │  📝 Log Output (prompt_log.jsonl)    │
 └──────────────────────────────────────┘

As you go along you may find you need something like this. I found lots of cases where I wanted to generate a large number of images. You get really good quality from OpenAI, but there is a limit to how many images you can generate, and pasting into a text box and waiting on each result is slow.

Clipper depends on stable-diffusion-webui-forge.

🔧 Installing Web UI Forge

To use Clipper you’ll need a working installation of Stable Diffusion WebUI Forge, a highly optimized fork of the popular AUTOMATIC1111 WebUI focused on speed and performance.

Go to https://github.com/lllyasviel/stable-diffusion-webui-forge.git
There is a releases section in the middle left of the main GitHub page. Download the latest version.
It is a large zip file.
Extract it, run update.bat, and then run run.bat.
You will need to install a model. Civitai has lots of models to choose from. Be careful: there is some NSFW content there.
Once you have downloaded the model, put it in this directory: C:\webui_forge\webui\models\Stable-diffusion
You will need to enable the API. To do this, change the COMMANDLINE_ARGS of the application in webui-user.bat.

rem C:\webui_forge\webui\webui-user.bat

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
rem change this line to enable the API
set COMMANDLINE_ARGS=--api

call webui.bat

Once launched, the Web UI will be accessible at http://127.0.0.1:7860 by default. This becomes the foundation that Clipper communicates with to generate images.
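
Before pointing Clipper at the server, it can help to confirm that the API actually came up. The following is a minimal sketch and not part of Clipper: it assumes Forge exposes the same /sdapi/v1/sd-models route as the AUTOMATIC1111 API, and list_models is a hypothetical helper name.

```python
import json
import urllib.error
import urllib.request


def list_models(base_url="http://127.0.0.1:7860", timeout=5.0,
                opener=urllib.request.urlopen):
    """Return the model titles reported by the server, or None if unreachable.

    Assumption: the server exposes the AUTOMATIC1111-style route
    /sdapi/v1/sd-models, which is only served when --api is set.
    `opener` is injectable so the function can be tried without a server.
    """
    url = base_url.rstrip("/") + "/sdapi/v1/sd-models"
    try:
        with opener(url, timeout=timeout) as resp:
            models = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return None
    return [m.get("title", "") for m in models]
```

With Forge running, list_models() should return the checkpoints you placed in models\Stable-diffusion. A result of None suggests the server is down or was started without --api.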

There are a lot more details in the GitHub repo: stable-diffusion-webui-forge.

🎯 When Would You Use Clipper?

🖼 Generating chapter or article illustrations in bulk
🧪 Creating concept art from text prompts
💻 Mocking up UIs for app design or product dev
🔄 Building a dataset of prompt/image pairs
🧠 Visualizing thought structures from conversation history (as used in Prism)

📜 The core code

Clipper is a lightweight Python interface designed to automate prompt-based image generation using WebUI Forge. It wraps the image generation API and logs the results so you can use them in your own pipelines.

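import base64
import json
import logging
import os
import re
import time
from datetime import datetime
from typing import List

import requests

from clipper.config import ClipperConfig  # assumed location of the config class


class Clipper: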
    def __init__(self, config: ClipperConfig):
        self.config = config
        self._setup_logging()

    def _setup_logging(self):
        logging.basicConfig(
            level=logging.INFO,
            format="%(asctime)s [%(levelname)s] %(message)s",
            handlers=[
                logging.FileHandler(self.config.log_file, encoding='utf-8'),
                logging.StreamHandler()
            ]
        )

    def _sanitize_filename(self, prompt: str) -> str:
        slug = re.sub(r'[^\w\s-]', '', prompt).strip().lower()
        return re.sub(r'[-\s]+', '-', slug)[:50]

    def _log_prompt_metadata(self, prompt: str, filename: str, timestamp: str):
        log_entry = {
            "prompt": prompt,
            "filename": filename,
            "timestamp": timestamp,
            "width": self.config.width,
            "height": self.config.height,
            "cfg_scale": self.config.cfg_scale,
            "steps": self.config.steps
        }
        with open(self.config.prompt_log, "a", encoding="utf-8") as f:
            f.write(json.dumps(log_entry) + "\n")

    def generate_image(self, prompt: str):
        timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        base_name = self._sanitize_filename(prompt)
        filename = f"{base_name}_{timestamp}.png"
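        output_path = os.path.join(self.config.output_dir, filename)
        # Make sure the output directory exists before writing to it.
        os.makedirs(self.config.output_dir, exist_ok=True)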

        payload = {
            "prompt": prompt,
            "steps": self.config.steps,
            "width": self.config.width,
            "height": self.config.height,
            "cfg_scale": self.config.cfg_scale,
            # "enable_hr": True,
            # "hr_scale": 2,
            # "hr_upscaler": "RealESRGAN_x4plus",
            # "denoising_strength": 0.4
        }

        try:
            response = requests.post(self.config.api_url, json=payload)
            response.raise_for_status()
            r = response.json()

            if "images" in r:
                image_data = r["images"][0]
                image_bytes = base64.b64decode(image_data.split(",", 1)[-1])

                with open(output_path, "wb") as f:
                    f.write(image_bytes)

                logging.info(f"✅ Saved: {filename}")
                self._log_prompt_metadata(prompt, filename, timestamp)
            else:
                logging.error(f"❌ No image returned for: {prompt}")

        except Exception as e:
            logging.exception(f"❌ Error generating image for: {prompt} {e}")

    def run_batch(self, prompts: List[str]):
        logging.info(f"📦 Starting batch: {len(prompts)} prompts")
        for i, prompt in enumerate(prompts):
            logging.info(f"[{i+1}/{len(prompts)}] {prompt[:60]}")
            self.generate_image(prompt)
            time.sleep(self.config.delay)
        logging.info("🎉 Batch complete")
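
The class above reads all of its settings from a ClipperConfig object, which isn’t shown here. As a rough sketch only, a dataclass carrying the attributes the code references could look like this. The field names come from the code above, and the defaults for width, height, cfg_scale, and steps mirror the sample log entry later in this post. The log_file name and the /sdapi/v1/txt2img path in api_url are assumptions, not confirmed values from the package.

```python
from dataclasses import dataclass


@dataclass
class ClipperConfig:
    """Hypothetical stand-in for the real config class; fields mirror what Clipper reads."""
    api_url: str = "http://127.0.0.1:7860/sdapi/v1/txt2img"  # assumed endpoint path
    output_dir: str = "generated_images"
    prompt_log: str = "prompt_log.jsonl"
    log_file: str = "clipper.log"  # assumed file name
    width: int = 512
    height: int = 512
    cfg_scale: float = 7
    steps: int = 30
    delay: float = 2.0  # seconds to sleep between prompts


# Override only the settings you care about; the rest keep their defaults.
config = ClipperConfig(steps=20, output_dir="my_results")
```

Keeping every tunable value in one object means generate_image and run_batch never need extra arguments beyond the prompt itself.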

📜 Code Overview

The application sends a text prompt to WebUI Forge through its API, retrieves the generated image, and saves it to a configurable output directory. The filename is built from the prompt and a timestamp so that each one is unique. Alongside the image, metadata such as the original prompt, dimensions, CFG scale, and step count is appended to a .jsonl log file. This makes it easy to feed the results into downstream pipelines or to look them up later.
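
To give a feel for the downstream side, here is a small sketch that reads the log back in. It relies only on the prompt and filename keys that _log_prompt_metadata writes; read_prompt_log and filenames_for are hypothetical helper names, not part of Clipper.

```python
import json
from pathlib import Path


def read_prompt_log(path="prompt_log.jsonl"):
    """Yield one dict per logged image, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def filenames_for(path, needle):
    """Return filenames whose prompt contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [entry["filename"] for entry in read_prompt_log(path)
            if needle in entry["prompt"].lower()]
```

Because each entry sits on its own line, the log can be processed as a stream and appended to safely while a long batch is still running.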

✂️ Using Clipper

🖼 Generate from a Single Prompt

pip install clipper-ai

clipper --prompt "A futuristic robot walking in a neon city"

This will:

Call the configured web backend (http://127.0.0.1:7860)
Save the resulting image in generated_images/
Log the prompt in prompt_log.jsonl

In the console you will see:

2025-05-08 09:57:51,653 [INFO] 📦 Starting batch: 1 prompts
2025-05-08 09:57:51,653 [INFO] [1/1] A futuristic robot walking in a neon city
2025-05-08 09:58:02,964 [INFO] ✅ Saved: a-futuristic-robot-walking-in-a-neon-city_2025-05-08_09-57-51.png
2025-05-08 09:58:03,966 [INFO] 🎉 Batch complete
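
The filename in the Saved line above is built from the prompt and the timestamp. This standalone snippet repeats the two regular expressions from _sanitize_filename shown earlier, so you can check how a given prompt will be named. The name slugify is only used for this illustration.

```python
import re


def slugify(prompt: str) -> str:
    # Drop everything except word characters, whitespace, and hyphens.
    slug = re.sub(r"[^\w\s-]", "", prompt).strip().lower()
    # Collapse runs of whitespace/hyphens into one hyphen, cap at 50 chars.
    return re.sub(r"[-\s]+", "-", slug)[:50]


print(slugify("A futuristic robot walking in a neon city"))
# a-futuristic-robot-walking-in-a-neon-city
print(slugify("Hello, World!  (v2)"))
# hello-world-v2
```

Because the slug is cut at 50 characters, long prompts that share a prefix are told apart only by the timestamp suffix.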

This is what you will find in prompt_log.jsonl. Each entry is written as a single line; it is pretty-printed here for readability:

{
    "prompt": "A futuristic robot walking in a neon city", 
    "filename": "a-futuristic-robot-walking-in-a-neon-city_2025-05-08_09-57-51.png", 
    "timestamp": "2025-05-08_09-57-51", 
    "width": 512, 
    "height": 512, 
    "cfg_scale": 7, 
    "steps": 30
}

There is also a log file for debugging.

[Image: Robot]

📄 Generate from a Prompt File

clipper --prompts prompts.txt

Each line in prompts.txt should be a separate text prompt. The tool will:

Loop through the list
Generate one image per line
Sleep between prompts (default: 2 seconds)
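
Reading such a file takes only a few lines. This sketch shows one plausible way to do it and is not necessarily how the clipper CLI parses its input: load_prompts is a hypothetical helper, and skipping blank lines is an assumption.

```python
from pathlib import Path
from typing import List


def load_prompts(path: str) -> List[str]:
    """Return one prompt per non-empty line, with surrounding whitespace removed."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

The resulting list has the same shape as the prompts list passed to run_batch in the Using Clipper from Python section.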

⚙️ Using a Custom Config

clipper --prompts prompts.txt --config custom_config.json

The config file lets you control:

Output resolution
Inference steps
CFG scale
Delay between prompts

See the Configuration page for details.

🐍 Using Clipper from Python

You can use Clipper as a library inside any Python script:

from clipper.core import Clipper
from clipper.config import load_config

# Load default config
config = load_config()

# Create the Clipper engine
engine = Clipper(config=config)

# Run a single prompt
engine.generate_image("A serene sunset over a futuristic city")

# Run a batch
prompts = ["A glowing forest at night", "A dragon flying over the mountains"]
engine.run_batch(prompts)

[Image: Clipper Flow]

⚡ Advanced Usage

🔁 Setting a Random Seed

You can modify the config to set a deterministic seed per run:

{
  "seed": 42,
  "steps": 30,
  "width": 512,
  "height": 512
}

Setting a seed makes results repeatable across runs, as long as the model and the other generation settings stay the same.
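
On the API side, a seed is just one more field in the request body. The sketch below shows how a config dictionary like the one above could be turned into a txt2img payload. build_payload is a hypothetical helper, and it assumes the backend accepts a seed field the way the AUTOMATIC1111 txt2img API does, with -1 meaning "pick a random seed".

```python
def build_payload(prompt: str, config: dict) -> dict:
    """Merge generation settings from a config dict into a request body."""
    return {
        "prompt": prompt,
        "steps": config.get("steps", 30),
        "width": config.get("width", 512),
        "height": config.get("height", 512),
        # -1 asks the backend for a random seed (assumed A1111 convention).
        "seed": config.get("seed", -1),
    }


fixed = build_payload("A glowing forest at night", {"seed": 42, "steps": 30})
random_seed = build_payload("A glowing forest at night", {"steps": 30})
```

Here fixed carries seed 42 on every call, while random_seed leaves the choice to the backend.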

🌐 Changing the Backend URL

By default, Clipper sends prompts to http://127.0.0.1:7860. To override this, pass a custom backend_url in the config:

{
  "backend_url": "http://localhost:7860",
  "steps": 20
}

You can use this to point to:

A remote server
A Docker container
A modified SD backend
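
Whatever host you point at, the request ultimately goes to the backend's text-to-image route. As a sketch, assuming the config stores only the base URL and that the route is the AUTOMATIC1111-style /sdapi/v1/txt2img (this post does not show how Clipper joins the two), the full endpoint could be derived like this. txt2img_endpoint is a hypothetical name, and the IP address is a made-up example.

```python
from urllib.parse import urlsplit


def txt2img_endpoint(backend_url: str) -> str:
    """Build the full txt2img URL from a base URL, rejecting obviously bad input."""
    parts = urlsplit(backend_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"backend_url must be an http(s) URL, got {backend_url!r}")
    return backend_url.rstrip("/") + "/sdapi/v1/txt2img"


print(txt2img_endpoint("http://localhost:7860"))
# http://localhost:7860/sdapi/v1/txt2img
print(txt2img_endpoint("http://192.168.1.20:7860/"))
# http://192.168.1.20:7860/sdapi/v1/txt2img
```

Validating the URL up front turns a typo such as a missing http:// into an immediate error instead of a failed request halfway through a batch.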

📁 Custom Output Directory

To change where images are saved:

{
  "output_dir": "my_results/"
}

This is useful if you want to separate runs by topic or project.

📂 Output Files

File	Description
`generated_images/`	All output images go here (unless overridden)
`prompt_log.jsonl`	One JSON line per prompt/image
`clipper_config.json`	Optional config file with generation settings

🧠 Requirements

A working instance of WebUI Forge running locally at http://127.0.0.1:7860 with the API enabled
Python 3.10+
The clipper-ai package, installed via pip install clipper-ai

🧩 What Clipper Doesn’t Do (Yet)

Clipper hands off image generation to webui_forge. You can’t yet:

Control model, sampler, or fine-grained config
Switch output styles on the fly
Tag prompts by purpose (icon, scene, etc.)
