this is the only use case for klein 4b base that i found; other loras, like text generation, always made the model collapse and produce body horror instead
and even in this case the result isn't really that good: you get a style similar to the artist's, but you still have to do a lot of manual segmentation and color curve / hue / saturation fixing if you want the exact colors. the output image is also low resolution (only 1M pixels) and sometimes shows pixel drift
also the lora rank was 64, so each artist requires a lora of ~200MB, which isn't very lean. i tried a rank 32 lora, but it seems that wasn't enough parameters to pick up a complex coloring style
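for a rough feel of the size / rank tradeoff above, here is a minimal sketch of how lora size scales with rank. the layer shapes are made-up placeholders, not the real klein 4b shapes, and `lora_size_mb` is just a helper name for this example; it also assumes 2 bytes per parameter (fp16 / bf16). the only point is that size grows linearly with rank, so a rank 32 lora on the same layers would be about half the size of the rank 64 one.

```python
def lora_param_count(layer_shapes, rank):
    """Count LoRA parameters for a list of adapted weights.

    Each adapted weight of shape (d_out, d_in) gets two low-rank
    matrices: A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    return sum(rank * (d_in + d_out) for d_out, d_in in layer_shapes)


def lora_size_mb(layer_shapes, rank, bytes_per_param=2):
    """Approximate file size in MB (assumes fp16/bf16 storage by default)."""
    return lora_param_count(layer_shapes, rank) * bytes_per_param / 1e6


# HYPOTHETICAL shapes for illustration only, not the real model layout.
example_shapes = [(3072, 3072)] * 100

size_r64 = lora_size_mb(example_shapes, rank=64)
size_r32 = lora_size_mb(example_shapes, rank=32)
print(f"rank 64: {size_r64:.1f} MB, rank 32: {size_r32:.1f} MB")
```

the absolute numbers depend entirely on which layers are adapted, so only the ratio between the two ranks is meaningful here.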
what this model can be used for:
coarse coloring: sketch some colors, then have the model fill in the rest
video2video: modify some frames and use ebsynth, or run it frame by frame with worse temporal consistency but better details (not every video is suitable though, and it only changes the coloring, not the line art itself; refer to the example)
style switch, manga coloring, ...; however, these will still require manual editing to get a high quality image
how the data was generated:
gather images; even as few as 10 is sufficient, just make sure they share the same coloring style
generate line art + coarse coloring; there are already ML models out there for this
generate line art on base color, line art on synthetic strokes, etc... ask any AI model to vibe code those
augment with random flip / crop / zoom / etc... so the model doesn't lose its editing capability (this is important if you only have a few images in the dataset)
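the augmentation step can be sketched like this. it is a minimal example, not the exact script used for the dataset: `paired_augment` and `min_crop` are names made up here, and it assumes each training sample is an (input, target) pair of same-sized HxWxC numpy arrays. the important part is that the input and the colored target get the exact same flip and crop, otherwise the pair no longer lines up. the "zoom" comes from resizing the crop back to the training resolution afterwards, which is left to whatever image library the training pipeline uses.

```python
import numpy as np


def paired_augment(src, tgt, rng, min_crop=0.7):
    """Apply the same random horizontal flip and crop to an image pair.

    src, tgt: HxWxC arrays with identical height and width.
    rng:      a numpy random Generator, e.g. np.random.default_rng(seed).
    min_crop: smallest crop size as a fraction of the original side length.
    """
    if src.shape[:2] != tgt.shape[:2]:
        raise ValueError("input and target must have the same height and width")
    h, w = src.shape[:2]

    # Flip both images together with 50% probability.
    if rng.random() < 0.5:
        src, tgt = src[:, ::-1], tgt[:, ::-1]

    # Pick one crop window and use it for both images.
    scale = rng.uniform(min_crop, 1.0)
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    src = src[top:top + ch, left:left + cw]
    tgt = tgt[top:top + ch, left:left + cw]

    return np.ascontiguousarray(src), np.ascontiguousarray(tgt)


# Usage: expand a tiny dataset by drawing several augmented copies per pair.
rng = np.random.default_rng(0)
line_art = np.zeros((64, 48, 3), dtype=np.uint8)   # placeholder input image
colored = np.ones((64, 48, 3), dtype=np.uint8)     # placeholder target image
augmented = [paired_augment(line_art, colored, rng) for _ in range(8)]
```

with only ~10 source images, drawing many augmented copies per pair like this is what keeps the model from just memorizing the exact framing of the originals.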