Semantic Image Disassembler (SID) - CivArchive (CivitAI Archive)

Update:
Replaced v. 261 with v. 2627.
Added multi thread support and some bug fixed.

SID (Semantic Image Disassembler) is a VLM-based tool for prompt extraction, semantic style transfer, and image re-composition (de-summarization).
It works with LM Studio (via local API) using Qwen3-VL-8B-Instruct or any similar vision-capable VLM (tested with Qwen3-VL and Gemma 3).

SID separates inputs into Content (structure, subject, composition) and Style (lighting, materials, visual physics) using a structured JSON analysis stage. Different modes operate on this analysis without re-interpreting the input.

Inputs

SID has two inputs: Style and Content.
Both inputs support images and text.

Multiple images are supported for batch processing.
Only one text file per input is supported (multiple TXT files are not supported).
Text files are treated like wildcards: 1 line / 1 paragraph = 1 prompt.
File type does not affect logic — only which input slot is populated.

Modes

Only "Styles" input used:
- Style DNA Extraction – extracts reusable visual physics (lighting, materials, energy behavior).
- Full Prompt Extraction – reconstructs a complete, generation-ready prompt describing how the image is rendered.
Only "Content" input used:
- De-summarization – the input is treated as a TL;DR / summary of a full scene.
  SID reasons about missing structure, environment, materials, and context to deduce a detailed “full picture” description.
Styles + Content, both inputs used:
- Semantic Style Transfer – preserves subject, pose, and composition from Content and renders it using only the visual physics of Style.

Smart pairing

When multiple images are provided, SID automatically selects an appropriate pairing strategy:

one content → multiple style variations
multiple contents → one unified style
one-to-one batch pairing

SID shows intermediate stages during execution, automatically logs all results.
SID can be useful for creating LoRA datasets, by extracting a consistent style from as little as one reference image and applying it across multiple contents.

Requirements

Python
LM Studio https://lmstudio.ai/
Gradio

How to run

Install LM Studio
Download (I recommend downloading model using LM Studio internal search) and load a vision-capable VLM (e.g. Qwen3-VL-8B-Instruct)
Start the LM Studio Local Server (In Developer tab, port 1234)
Launch SID

Inputs

Modes

Smart pairing

Requirements

How to run

Description

FAQ

Details

Files

semanticImage_sidV261.zip

Mirrors

semanticImage_sidV26.zip

Mirrors

semanticImage_sidV2624.zip

Mirrors

semanticImage_sidV2627.zip

Mirrors

Inputs

Modes

Smart pairing

Requirements

How to run

Description

FAQ

What is Semantic Image Disassembler (SID)?

What files are available and where can I download them?

Details

Files

semanticImage_sidV261.zip

Mirrors

semanticImage_sidV26.zip

Mirrors

semanticImage_sidV2624.zip

Mirrors

semanticImage_sidV2627.zip

Mirrors