AxeCrafted Blog

Wandering thoughts from a wandering mind.


Paint By Numbers - Victory At Last?

Posted on December 12th, 2025

After a 10-month hiatus (inspired by Togashi-sama), we're finally back for the final part of the most anticipated series of 2025!

The break wasn't planned - I simply got tired of wrestling with this problem and shifted my attention to other projects. But after some well-deserved rest days from work, the energy to tackle this challenge came rushing back. Sometimes, stepping away is exactly what you need to return with fresh eyes and renewed enthusiasm.

Part 3 ended on a high note - we successfully created a paint-by-numbers effect that actually worked. However, I wasn't fully satisfied: the output was very pixelated and some of the color choices felt... off. So I dove back in to polish things up.

Spoiler alert: it was worth it.

If you missed the previous posts:

Let's begin with the final result, shall we?

Original Starry Night

Starry Night Paint by Numbers Preview

Final Canvas Result

Let's dive deeper into what changed and how we achieved this.

Clustering And Color Spaces

The first major change is actually a simple one - getting better and more consistent colors out of the K-Means clustering. The previous version clustered similar colors directly in RGB space - this sometimes produced weird colors, and the palette would change a lot each time we ran the clustering. Some variation is expected, of course, because K-Means depends on its random initialization, but the results were all over the place.

The reason for this is that, at the end of the day, K-Means uses Euclidean distance to measure how similar one color is to another:

$$ \text{Distance} = \sqrt{(x_{2} - x_{1})^{2} + (y_{2} - y_{1})^{2} + (z_{2} - z_{1})^{2}} $$

To illustrate this, let's use 3 colors in RGB as an example: dark red (100, 0, 0), dark yellow (100, 100, 0) and dark gray (100, 100, 100).

In RGB, the distance from dark red to dark yellow is exactly the same as the distance from dark yellow to dark gray - both come out to 100. In a hypothetical scenario where K-Means has to pick an approximation for dark yellow between these two, it would be just as likely to choose dark red as dark gray - even though, perceptually (at least for me), they are very different from each other.

In contrast, CIELab tries to represent colors in a way that is more tuned to human perception. There is an L channel (lightness) which tells how bright or dark a color is - ranging from pure black (0) to pure white (100). The a channel represents a red-green axis (negative values are green, positive values are red), and the b channel represents a blue-yellow axis (negative values are blue, positive values are yellow).

In the example above, when we convert to CIELab, dark yellow would be closer to gray than it is to dark red because they share similar lightness values and yellow-to-gray represents a smaller shift in the color axes than yellow-to-red (which is a complete hue change). This means K-Means will group perceptually similar colors together, rather than just mathematically close RGB values.

The practical result? Converting from RGB to CIELab before clustering significantly improves the quality of the final clustered colors and outputs more consistent results. The colors look more natural, and running the algorithm multiple times produces much more stable palettes - no more wild swings between different color schemes with each execution.

Small Region Removal

Our previous method, while functional, had some drawbacks. First, its reliance on Python loops - even though we used a ThreadPoolExecutor to parallelize operations, it was still somewhat slow. Second, each small region's color was determined by "vote" (the most common neighboring color), which works fine for small regions with a single neighboring color, but tends to create weird patches for small regions bordering multiple colors - a better strategy would be, if possible, to fill each pixel with the nearest color.

In Part 2, we worked with Morphological Transformations - one of them, dilation, had the effect of removing holes. Dilation has an interesting property: it "preserves colors". Unlike a convolution that averages pixel values (and could therefore introduce colors that weren't there before), a dilated image contains only colors already present in the input. This is because when dilating, if you have a pixel $P$ with neighborhood $N$, the new value $P^{\prime}$ is:

$$ P^{\prime} = \underset{(i, j) \in N}{\text{max}} P_{i,j} $$

It simply picks the local maximum in a neighborhood, so it can never invent a new value - what we can do then is:

  1. Transform the multi-channel image (in RGB, or CIELab) into a matrix where each color is a label - for example, (255, 0, 0) is mapped to 1 - this creates a single channel image with color labels where we can apply dilation;
  2. We still use the same approach of using connectedComponentsWithStats to find connected regions and their areas - we filter areas smaller than the removal threshold and use them to create a mask - these regions will be treated as holes;
  3. Then, we iteratively apply dilation until there are no holes left;
  4. Finally, we convert the label representation back to the color representation using the palette.

Here is a small example of what that iterative process looks like:

Dilation Iteration Example

Of course, this wasn't as straightforward as it sounds. My first attempts with dilation were too aggressive, merging regions that should have stayed separate. After some experimentation, I found that controlling the number of iterations and using a conservative kernel size produced the best results - enough to fill small holes without destroying the image's structure.

Thin Region Removal

This was probably the ugliest part of the previous algorithm - doing horizontal/vertical scans over the image to remove thin patches had the undesired side effect of "squaring" the final image, since the passes were rectangular - not to mention it was a very heavy operation that needed to run iteratively.

Here, we'll again make use of Morphological Transformations - this time the morphological opening, which is an erosion followed by a dilation. It has the effect of removing foreground objects smaller than the kernel - which means that if there are regions connected by thin strips, those strips will be removed - and we'll reuse the dilation approach from before to fill the removed area with the appropriate color. Here is an example of what this operation looks like:

Morphological Opening Example

This also has the side effect of sometimes "rounding borders", which is a much better and actually desirable side effect - the final image won't look pixelated and will have more organic shapes.

Smoothing Borders

In order to be 100% sure that the final result will have organic borders and not pixelated edges, there is one final step we can run (pixelated edges really bother me, right?). We already know what blurring looks like - we basically average the pixel colors in a region around a point. What if we did the same, but used the median instead of the average? Since the median is always a value that already exists in the neighborhood, this preserves colors while smoothing hard edges. Then, we can just run the small area removal again to make sure we don't introduce any artifacts into the final result.

This is an example of how a Median Blur acts on an image:

Median Blur Example

I'll admit, I was skeptical about this step at first. "Another blur? Really?" But the median blur turned out to be the perfect finishing touch - it smoothed those jagged edges without turning the image into mush. Sometimes the simplest solutions are the best ones.

Generating Edges

Previously, we were using OpenCV's findContours method on each color channel of the final result to draw the boundary regions. It turns out we can simplify this by using - once again - Morphological Transformations, this time the Morphological Gradient: you calculate the difference between a dilation and an erosion, and what remains corresponds to the image borders. This method is way lighter than finding the exact pixel contours for each region.

This is a visualization to help understand this process:

Morphological Gradient Example

Final Results

The final part of the script still looks the same - we use Pole of Inaccessibility to position the numbers inside the regions to be painted - the number size scales with region size, so smaller regions get a smaller label. The border of the image was expanded in order to avoid number clipping.

Let's compare the old approach from Part 3 with the new refined version:

Part 3 Result - Still Pixelated

Final Result - Smooth and Organic

The difference is night and day. The new version has organic, flowing borders instead of harsh pixelated edges. The colors are more consistent and natural-looking. The regions are well-defined without being artificial. This is what I was aiming for all along.

Closing Thoughts

Remember how this all started? I almost fell for a paint-by-numbers scam website back in Part 1. The plan was simple: avoid getting scammed and create something I could actually use to turn my own photos into paint-by-numbers projects.

Mission accomplished.

Looking back at this journey, I learned that sometimes the "clever" solution isn't the right one. The brush-stroke simulation from Part 1 seemed ingenious at the time but was painfully slow and produced mediocre results. The horizontal/vertical scans from Part 3 were a brute-force hack that created square artifacts. Meanwhile, morphological transformations - a fundamental technique I initially overlooked - turned out to be the elegant solution hiding in plain sight.

The biggest lesson? Take breaks. Seriously. I spent weeks banging my head against these problems, then walked away for 10 months. When I came back, the solutions seemed obvious. CIELab color space? Of course. Morphological transformations for everything? Why didn't I think of that before? Sometimes your brain needs time to process problems in the background.

Would I use this myself? Absolutely. In fact, I already have a few photos lined up for printing. Will I turn this into a business and compete with those scam websites? Maybe. The idea of providing a legitimate alternative to those scam sites is tempting - helping people create personalized paint-by-numbers from their own photos without worrying about fraud. We'll see where this goes.

The Code

If you want to try this yourself, the complete code is available on GitHub:

Paint-by-Numbers Generator Repository

The repository includes:

  - Complete Python implementation with all optimizations
  - Example images and results
  - Documentation on tweaking parameters for different image types

Feel free to experiment with it, improve it, or use it for your own projects. If you create something cool, I'd love to see it!

Victory, Indeed

So, is this victory "at last"? The question mark in the title suggests I'm still not entirely sure. But you know what? I'm calling it. After four parts, 10 months, countless failed attempts, and more morphological transformations than any person should reasonably encounter, I'm satisfied with this result.

It's not perfect - no image processing algorithm ever is. But it's good enough, and sometimes good enough is exactly what you need.

Thank you for following along on this journey. Now, if you'll excuse me, I have some painting to do.


Leonardo Machado

Some guy from Brazil. Loves his wife, cats, coffee, and data. Often found trying to make sense of numbers or cooking something questionable.