Color extraction and image conversion for country flags
Published: 2023-10-27

I’ve written about image formats and image compression in my wiki, but today I just wanted to summarize a few notes on converting between images and the interesting task of color extraction. This is not an explainer, but it’s probably a good starting point, with references to implementations and “semi”-literature on the topic.

So I was working on a small tool that is supposed to extract colors from country flags (I’ll link to it here when it’s done). For this I needed two pieces of info from each image (flag):

  • The colors used in the image, in hex
  • Percentage of each color in the image

Extract color

Static dataset

Now, we wouldn’t even need to do this if a dataset for it already existed, and in fact one does. But the issue is: what if we need to add new countries, change something, etc.? So I was somewhat reluctant to use a static dataset for this use case. I also couldn’t find a dataset that includes the percentage of each color.

Extraction from raster

Implementations

Extraction from SVG

This is sort of a hack, but there are many ways to do it. It’s simpler than extracting colors from raster images, but it requires the image to be in a vector format, and converting raster images to vectors is lossy and non-trivial.

Ways to extract colors from SVG

  • Can be as simple as matching for /#[0-9a-f]{3,6}/gi in an svg file
  • Listing the values of all “fill” and “stroke” attributes and corresponding CSS properties from the SVG code of each flag file, and removing duplicates. (picked from the dataset description above)
  • If you know any more, let me know!
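The first two approaches above can be sketched roughly like this in Python. This is a minimal sketch, not the tool’s actual code: the function names and SAMPLE_SVG are made up for illustration, the regex is a slightly stricter variant that only accepts 3- or 6-digit hex values, and the attribute version only looks at presentation attributes and inline style="..." declarations (it ignores <style> blocks and named colors are returned as-is).

```python
import re
import xml.etree.ElementTree as ET

# Stricter variant of the regex above: exactly 3 or 6 hex digits.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def colors_by_regex(svg_text):
    """Return the unique hex colors found anywhere in the SVG source, lowercased."""
    return sorted({m.group(0).lower() for m in HEX_COLOR.finditer(svg_text)})


def colors_by_attributes(svg_text):
    """Return unique fill/stroke values from attributes and inline styles."""
    found = set()
    for element in ET.fromstring(svg_text).iter():
        for name in ("fill", "stroke"):
            value = element.get(name)
            if value:
                found.add(value.strip().lower())
        # Inline CSS such as style="fill:#fff;stroke:none"
        for declaration in element.get("style", "").split(";"):
            prop, _, value = declaration.partition(":")
            if prop.strip() in ("fill", "stroke") and value.strip():
                found.add(value.strip().lower())
    found.discard("none")  # "none" means not painted, so it is not a color
    return sorted(found)


# Hypothetical three-stripe flag, only for illustration.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="3" height="2">
  <rect width="3" height="2" fill="#FFF"/>
  <rect width="1" height="2" fill="#0055a4"/>
  <rect x="2" width="1" height="2" style="fill:#EF4135;stroke:none"/>
</svg>"""

print(colors_by_regex(SAMPLE_SVG))
print(colors_by_attributes(SAMPLE_SVG))
```

Note that the regex version will also pick up things that merely look like colors, such as a reference to an element id like #fed, so the attribute version is probably the safer default.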

Implementations

Converting raster to vector

  • Converting from raster to vector is not simple. Essentially, you need to trace paths over the raster, and doing that by hand is manual work. Some online tools that claim to do raster-to-SVG conversion just wrap the raster as xml(base64(raster)) and call it an SVG; that’s probably not the SVG you want.
  • Luckily for us, there exist tools that do best-effort automatic tracing. But the result is not always perfect, and each of these techniques has its own tradeoffs.
  • Of these tools, a few popular ones are potrace (2 colors), autotrace (multi-color) and vtracer. Tools like ImageMagick and Inkscape use these underlying tools to carry out the conversions. Of all these, I got the best results from vtracer; it also has nice documentation on how it works.
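To spot the xml(base64(raster)) kind of “SVG” mentioned above, a rough heuristic is to check whether the file embeds a raster image as a data URI and draws no vector shapes at all. The sketch below assumes exactly that; is_wrapped_raster and the list of shape tags are my own choices for illustration, not something taken from any of the tools above.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
SHAPE_TAGS = {"path", "rect", "circle", "ellipse", "polygon", "polyline", "line"}


def is_wrapped_raster(svg_text):
    """Heuristic: True if the SVG embeds a raster and contains no vector shapes."""
    has_embedded_raster = False
    has_shapes = False
    for element in ET.fromstring(svg_text).iter():
        tag = element.tag.rsplit("}", 1)[-1]  # drop the XML namespace, if any
        if tag in SHAPE_TAGS:
            has_shapes = True
        elif tag == "image":
            href = element.get("href") or element.get(XLINK_HREF) or ""
            if href.startswith("data:image/"):
                has_embedded_raster = True
    return has_embedded_raster and not has_shapes
```

A file that mixes real paths with an embedded image would return False here, which seems like the right call for color extraction since the paths still carry usable colors.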

References

Conclusion

So the question remains, “what to use for our use case of extracting colors from flags?”

Well, the raw flag images that I am downloading from Wikipedia are in SVG anyway, and in case I can’t find an SVG for a certain country’s flag, we have excellent tools like vtracer to convert from raster to SVG. So it makes solid sense to use SVG and get the colors out of that.

But we have a small issue: we also wanted the percentage of each color in the image. With a raster image, we can go pixel by pixel, count the pixels of each color, and get the proportion by dividing by the total. But with SVG, we’d need to calculate the visible area of each element (let me know if you have any other idea in mind!).
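The pixel-by-pixel idea for raster images can be sketched as below. This is only a sketch under a few assumptions: color_percentages is a made-up name, and pixels is assumed to be an iterable of (r, g, b) tuples that some image decoder (Pillow, for example) has already produced. It counts exact colors only, so a real flag with anti-aliased edges would likely need some color quantization first.

```python
from collections import Counter


def color_percentages(pixels):
    """Map each hex color to its share of the image, in percent."""
    counts = Counter(pixels)  # number of pixels per exact (r, g, b) value
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        "#{:02x}{:02x}{:02x}".format(*rgb): 100 * count / total
        for rgb, count in counts.most_common()
    }


# Hypothetical 2x2 image: two white pixels, one blue, one red.
pixels = [(255, 255, 255), (255, 255, 255), (0, 85, 164), (239, 65, 53)]
print(color_percentages(pixels))
# {'#ffffff': 50.0, '#0055a4': 25.0, '#ef4135': 25.0}
```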

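SVG has other problems:

  • What if there’s no fill attribute? Then the fill is contextual: it may be inherited from a parent element or set by a stylesheet, and otherwise falls back to the default (black), so it can’t be read off the element itself.
  • We’d need to take care of the type of each element, whether it’s filled or not, colors added by CSS (if any), etc.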
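To give a feel for what resolving the fill from context involves, here is a rough sketch that walks the SVG tree and passes the fill down from parent to child. It is deliberately incomplete and the names (own_fill, effective_fills) are hypothetical: it assumes only fill attributes and inline style="..." declarations, and it ignores <style> blocks, classes, and keywords like inherit or currentColor.

```python
import xml.etree.ElementTree as ET

DEFAULT_FILL = "black"  # SVG's initial value for the fill property


def own_fill(element):
    """Return the fill set directly on an element (inline style beats attribute)."""
    for declaration in element.get("style", "").split(";"):
        prop, _, value = declaration.partition(":")
        if prop.strip() == "fill" and value.strip():
            return value.strip()
    return element.get("fill")


def effective_fills(element, inherited=DEFAULT_FILL):
    """Yield (local tag name, resolved fill) for an element and its descendants."""
    fill = own_fill(element) or inherited
    yield element.tag.rsplit("}", 1)[-1], fill
    for child in element:
        yield from effective_fills(child, fill)


# Hypothetical markup: the rect has no fill of its own and inherits from <g>.
svg = ('<svg xmlns="http://www.w3.org/2000/svg">'
       '<g fill="#fff"><rect/><circle fill="#ce1126"/></g><path/></svg>')
for tag, fill in effective_fills(ET.fromstring(svg)):
    print(tag, fill)
```

Even with this in place, we would still need the visible area of each shape to get percentages, which is the harder part.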

So when trying to extract colors plus the proportion of each color, extracting from a raster image seems to be the clear winner here.