AxeCrafted Blog

Wandering thoughts from a wandering mind.


An Attempt at Paint-by-Numbers - Part 1

Posted on November 03, 2024

At some point when you were fiddling around on Instagram (or maybe some other platform - not judging), I bet you were served a "Paint-by-Numbers" ad. Those are painting kits where you are given a black-and-white contour image with labeled numbers. You have to paint each number with a color, and at the end you have a fully colored painting that you did yourself (well, mostly).

Media targeting is pretty good nowadays, and of course I'm the kind of person to fall for it. On the other hand, for my own good, I also like to let things "cook" on a tab for a couple of days or weeks before making a purchase decision, so I don't buy anything emotionally. This saved me this time because, after two weeks, the website I opened (which seemed pretty legit) simply didn't exist anymore. Another one of those scams where you buy something and probably don't receive anything (or a brick).

Then I began asking myself: What if I could do it myself? I certainly didn't like most of the paintings I saw on those websites - maybe I could use my own pictures - and if I had something that worked, I could maybe even sell it and make it available for people who also want to create their own paint-by-numbers, with the additional benefit of not needing to spend a tedious amount of time making sure the websites are legit.

The plan was pretty simple in my head.

The Plan

  1. Find a good image;
  2. Limit the color palette to a few colors;
  3. "Posterize" the image to create distinct color regions;
  4. Label each region and contour it;
  5. Profit.

As with every project where you don’t know much, you grossly underestimate the complexity — after all, we’re only human, and we like to believe we’re a little smarter than we are. At this point, I was at the top of the confidence curve:

Dunning-Kruger Effect

Enhancing Saturation Using HSV

With that said, let’s start. The idea is to use Python and libraries like OpenCV to process photos I took. The end goal is to transform this photo into something with a limited color space (a.k.a. only a couple colors) and clearly defined regions that I can paint afterward. For the image, I chose one I took in Hallstatt, Austria, simply because I liked it and thought it would work well:

Photo of Hallstatt Houses

My first attempts at this didn't even involve code - they were in Photoshop, experimenting with filters and colors. Unfortunately, I didn't find a silver bullet that solved all my problems, but I did learn some things. The first thing you'll probably notice is that this image is kind of flat — it lacks that "oomph". It's too dull, too unsaturated. Saturation is the key word here: since one of the steps is "reducing the color space", starting from an unsaturated image means all the final colors will look grayish, and you effectively end up with a grayscale image.

The solution here is to increase the saturation first. The simplest way to increase saturation in an image is to convert it from RGB to HSV. There’s a bunch of math involved in that, but for now, let's just be happy that OpenCV does it all under the hood for us.

HSV stands for Hue, Saturation, and Value: a color model that describes colors by their shade (hue), vibrancy (saturation), and brightness (value). Instead of spreading color information across Red, Green, and Blue channels, HSV has a dedicated Saturation channel. So, if we scale that channel by some factor and clip it to stay between 0 and 255 (8-bit), we can increase the saturation and give the image more vivid colors:

import cv2
import numpy as np

def saturateImage(image, SCALE_FACTOR=2.5):
    # Convert image from RGB to HSV color space
    hsvImage = cv2.cvtColor(image, cv2.COLOR_RGB2HSV).astype(np.float32)

    # Scale the Saturation channel
    hsvImage[:, :, 1] *= SCALE_FACTOR
    # Clip saturation values to be within the allowable range
    hsvImage[:, :, 1] = np.clip(hsvImage[:, :, 1], 0, 255)

    # Convert back from HSV to RGB
    saturatedImage = cv2.cvtColor(hsvImage.astype(np.uint8), cv2.COLOR_HSV2RGB)
    return saturatedImage

Photo After Applying Saturation Filter

Much better. This image has a lot of unique colors, and since I’m planning to paint it, it would be good to reduce it to a manageable number — like a nice power of 2, probably 16 since 8 is too few and 32 is too many. We could try sampling the image to get some pixels or manually defining a color palette, but manual approaches seem boring and are not scalable. The fastest way would be to extract the color palette from the image automatically.

Extracting a Color Palette with K-Means Clustering

The idea now is to identify which colors best represent the image, so when we decrease the color space, we still preserve the overall appearance. This sounds a lot like clustering, so we apply a K-Means algorithm to find 16 colors that best fit the image.

K-Means clustering is an algorithm that partitions data into K distinct clusters based on feature similarity. In our case, it groups similar colors together, allowing us to identify the most representative colors in the image. To make things run faster, we can scale the image down since this will mostly preserve colors, and we don't need the exact shapes for the K-Means algorithm.

from sklearn.cluster import KMeans

def getColorPalette(image, RESIZE_FACTOR=5, N_CLUSTERS=16):
    # Get image dimensions for the downscale
    height, width = image.shape[:2]

    # Resize image to speed up K-Means processing
    pixels = cv2.resize(
        image,
        (width // RESIZE_FACTOR, height // RESIZE_FACTOR)
    ).reshape(-1, 3)

    # Apply K-Means to find cluster centers representing the main colors
    kmeans = KMeans(n_clusters=N_CLUSTERS)
    kmeans.fit(pixels)
    colors = kmeans.cluster_centers_.astype(int)
    return colors

And this is what we get:

Color Palette Extracted With K-Means

This won't always result in the exact same colors, but it’s close enough that if we reconstruct the image using this palette, we won’t notice much of a difference. Now, the idea here is to reduce the color space of the image to match the palette. For each pixel, we find the closest color from the palette. Think of distance here as how similar one color is to another.

If you really want to think of the math, just think of colors as vectors in a 3D space where each axis represents one channel (for example, RGB - Red, Green, Blue). Any color, like steel blue, can be described by its coordinates along the three axes - in this case (70, 130, 180). To calculate the distance between this color and, let's say, another one at (180, 70, 94), we just compute the Euclidean distance between them with $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$, which is about 151.97 in this case.
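As a quick sanity check, that distance can be computed in a couple of lines (using the two example colors from above):

```python
import numpy as np

c1 = np.array([70, 130, 180])   # steel blue
c2 = np.array([180, 70, 94])    # the second example color

# Euclidean distance between the two colors in RGB space
d = float(np.sqrt(np.sum((c2 - c1) ** 2)))  # ≈ 151.97
```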

Reducing Image Colors for Paint-by-Numbers

from scipy.spatial.distance import cdist

def recolorImageWithPalette(image, palette):
    # Flatten the image array to a list of pixels
    pixels = image.reshape(-1, 3)

    # Compute the distance between each pixel and the palette colors
    distances = cdist(pixels, palette, 'euclidean')

    # Find the index of the closest palette color for each pixel
    closestPaletteColors = palette[np.argmin(distances, axis=1)]

    # Reshape back to the image dimensions and restore the 8-bit type
    recoloredImage = closestPaletteColors.reshape(image.shape).astype(np.uint8)
    return recoloredImage

And isn’t it amazing how only 16 colors can recreate the image so well? Look at it:

Image With 16 Color Palette Applied

A nice side effect of this process is that we already have well-defined "regions" in some areas, like the sky. However, there are still some erratic color changes on the left side of the image, particularly on the stone wall, and we'll need to address that next.

Smoothing Color Regions Using Brush Simulation

Things were running smoothly up to this point — I was riding the confidence wave at the top of the Dunning-Kruger curve. But then, reality hit: how do I turn it into a "brush-painted picture"? The main idea in my head was: I needed to "brush" through the picture by selecting a large enough area, averaging the colors within that area, and filling it with the palette color closest to that average. This should, in theory, eliminate single pixels of differing colors and create an effect akin to a brush-stroke painting.

However, things aren't always as simple as they seem. Starting with a square region made the final image look very pixelated — if I was aiming for pixel art, that would be perfect, but my goal was something else entirely. So, I decided to use circular regions, which are a bit more complex but achieve a more natural look.

Even with circular regions, passing through the image with a set radius resulted in a blurred effect. To achieve the effect I wanted, I needed to go pixel by pixel, which was far more computationally demanding — and I’m not the most patient person. After a lot of testing, I found that balancing the radius of the brush relative to the image size and its details could give me something that really started to feel like a true paint-by-numbers image.

Image With Simulated Brush Stroke Effect

As you can see, the effect is starting to work. By playing with the brush size, the final image began to look smoother, as if painted. But of course, this process was just the beginning of the real challenges.

def findClosestPaletteColor(color, palette):
    # Calculate the Euclidean distance between the given color and all colors in the palette
    distances = cdist([color], palette, 'euclidean')
    return palette[np.argmin(distances)]

def createCircularMask(radius):
    # Create a mask with a circular shape
    mask = np.zeros((2 * radius + 1, 2 * radius + 1), dtype=bool)
    center = radius
    for y in range(2 * radius + 1):
        for x in range(2 * radius + 1):
            if (x - center) ** 2 + (y - center) ** 2 <= radius ** 2:
                mask[y, x] = True
    return mask

def recolorImageWithRadius(image, palette, RADIUS=5):
    # Get image dimensions
    h, w, _ = image.shape
    # Copy the original image to avoid modifying it directly
    smoothedImage = np.copy(image)
    # Create a circular mask for applying the brush effect
    circularMask = createCircularMask(RADIUS)

    # Iterate over each pixel in the image
    for y in range(h):
        for x in range(w):
            # Define the region around the current pixel, with boundary checks
            y1, y2 = max(0, y - RADIUS), min(h, y + RADIUS)
            x1, x2 = max(0, x - RADIUS), min(w, x + RADIUS)

            # Extract the current region and apply the circular mask
            region = image[y1:y2, x1:x2]
            mask = circularMask[:y2 - y1, :x2 - x1]
            circularRegionPixels = region[mask]

            # Calculate the average color of the selected region
            meanColor = np.mean(circularRegionPixels.reshape(-1, 3), axis=0)
            # Find the closest color from the palette
            closestColor = findClosestPaletteColor(meanColor, palette)
            # Apply the closest color to the corresponding region
            smoothedImage[y1:y2, x1:x2][mask] = closestColor

    return smoothedImage

(Trying to) Find and Label Color Regions

At this point, I had a recolored and smoothed image. But now came the next major step: How do I take this image, identify the colored regions, and label each one with its corresponding palette color number? My confidence started to drop, and my journey down the infamous Dunning-Kruger slope was well underway.

My first idea was to use OpenCV’s Connected Components, which essentially finds groups of connected pixels of the same value. You can choose between diagonal connectivity (8-connected) or strictly horizontal and vertical (4-connected). The plan was to iterate over each of the 16 colors, mask the image to only that color, and then find every connected region to identify all the areas of the same color.

Naturally, I ran this and found that my final image had over 9000 regions. What?! Yes, there were still a LOT of small, nearly single-pixel regions even in the smoothed image, even though it looked decent at a glance. This was definitely not the outcome I was expecting.

Take a look at this:

Image With Spectral Color Palette Showing Defects

Can you see the problem? First, there are a lot of really small regions - stepping through the connected regions shows that some of them are only one or two pixels, which creates all sorts of problems when finding boundaries and labeling regions. The second problem is the borders: almost every region is surrounded by a thin border of a different color. These border regions are "big" (both in pixel count and in width and height), which makes a simple approach like "find every region smaller than X and fill it with the dominant adjacent color" infeasible, because those big border regions won't be caught.

That will be a problem for the next part of this series, though. See you there.

Photo of Leonardo Machado

Leonardo Machado

Some guy from Brazil. Loves his wife, cats, coffee, and data. Often found trying to make sense of numbers or cooking something questionable.