Color is a product of how our brains interpret the light that enters our eyes; in short, color perception is part of sight rather than a separate sense.
When working on computer vision tasks, OpenCV is a powerful toolset that can help you unlock the visual world. It provides a comprehensive library of functions for image and video processing, object detection, and more. With this toolkit, you can analyze images and videos programmatically and extract valuable information from them.
In computer programs, images are typically represented as multi-dimensional matrices, where each element of the matrix corresponds to a pixel in the image. Common representations include:
Grayscale Images: represented as 2D matrices, where each element represents the intensity of the pixel (e.g., 0-255 for 8-bit grayscale).
Color Images: represented as 3D matrices, where the third dimension represents the color channels (e.g., RGB, HSV). Each element in the matrix stores the intensity of the corresponding pixel in each color channel.
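As a small illustration of these two representations, the sketch below builds a tiny grayscale image and a tiny color image by hand with NumPy. The 4x6 size and the pixel values are arbitrary choices made for this example.

```python
import numpy as np

# Grayscale: a 2D array holding one 8-bit intensity (0-255) per pixel.
gray = np.zeros((4, 6), dtype=np.uint8)   # 4 rows, 6 columns, all black
gray[1, 2] = 255                          # the pixel at row 1, column 2 becomes white

# Color: a 3D array holding three 8-bit channel values per pixel.
color = np.zeros((4, 6, 3), dtype=np.uint8)
color[1, 2] = (255, 128, 0)               # one value per channel

print(gray.shape)    # (4, 6)
print(color.shape)   # (4, 6, 3)
print(color[1, 2])   # the three channel values of a single pixel
```

What the three channel values mean depends on the channel order the program assumes, which is why the same array can look different under different conventions.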
The OpenCV library in Python is accessed through the cv2 module. The name dates from the API introduced in OpenCV 2.x and does not indicate the installed version; later releases, including 3.x and 4.x, are still imported as cv2. For this exercise, we will use its common functions to build a window-based image displayer with a grayscale feature.
The imread function reads an image file from a path and returns it as a NumPy array, or None if the file cannot be read. For a color image, the returned 3D array has the following structure:
1st Dimension: represents the height of the image (number of rows).
2nd Dimension: represents the width of the image (number of columns).
3rd Dimension: represents the color channels, stored in BGR order for color images. An image read as grayscale has a single channel, so its array is 2D and has no third dimension.
Up next is the imshow function. It displays an image in a window, taking two arguments: the window name and the image data as a NumPy array. On its own, imshow does not keep the window alive; the window is drawn and stays responsive only while waitKey is being called, for example inside a while True loop. Closing such a window without an exit routine can leave it unresponsive or hang the kernel, particularly in notebook environments.
This is where waitKey and destroyAllWindows come in.
waitKey waits for a key press event and returns the code of the pressed key (its ASCII value for ordinary keys). The function takes an optional argument, which is the time in milliseconds to wait for a key press event. If the argument is:
0: the function waits indefinitely until a key is pressed.
+ value: the function waits for the specified time and returns the code of the pressed key, or -1 if no key is pressed.
- value: the function behaves as it does for 0, waiting indefinitely until a key is pressed and returning its code.
Finally, destroyAllWindows simply closes all windows created by OpenCV and releases the system resources associated with them. In the example below, the function is called if the key code 27, representing the Esc key, is returned while the window displaying the image is in focus.
See for yourself.
In the early days of digital imaging, grayscale images were the norm. These images were represented using 8-bit values – ranging from 0 to 255 – to represent the intensity of each pixel, which limited color representation to shades of gray.
With advancements in technology, color displays became more common. This led to the development of color spaces to represent a wider range of colors. The most widely used color space in image processing is red, green, and blue (RGB). RGB colors can be visualized as a 3D cube, where each point in the cube represents a unique color. The x, y, and z axes correspond to the red, green, and blue components, respectively.
In image processing and machine learning, color spaces are different ways to represent colors mathematically. They differ from one another in two main respects:
Dimensionality: most color spaces use three components, but some use more. CMYK, for example, uses four (cyan, magenta, yellow, and black).
Numerical Meaning: the meaning of the numerical values in a color space can vary. For example, in RGB, the values represent the intensities of red, green, and blue. In other color spaces, the values might represent different attributes like hue, saturation, and value.
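The sketch below expresses one color, pure red, in three ways to show both differences: the number of components and what the numbers mean. It uses Python's standard colorsys module for HSV. The rgb_to_cmyk helper is a hypothetical function written for this example, based on the simple textbook formula rather than the profile-based conversions used in real printing workflows.

```python
import colorsys


def rgb_to_cmyk(r, g, b):
    """Textbook RGB -> CMYK conversion for components in the range 0.0-1.0."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:  # pure black: avoid dividing by zero
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)


red = (1.0, 0.0, 0.0)
print(red)                         # RGB: three intensities
print(colorsys.rgb_to_hsv(*red))   # HSV: (0.0, 1.0, 1.0) = hue, saturation, value
print(rgb_to_cmyk(*red))           # CMYK: (0.0, 1.0, 1.0, 0.0), four components
```

The three tuples describe the same color, yet none of the numbers can be compared across representations without knowing which color space they belong to.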
Another color space, LAB, is designed to be approximately perceptually uniform, meaning that equal changes in color values correspond to roughly equal perceived changes in color. It is composed of:
L (lightness): represents the overall luminance or brightness of the color, ranging from 0 (black) to 100 (white).
A (a axis): represents the red-green color axis, ranging from -128 to 127. A positive value indicates a more reddish color, while a negative value indicates a more greenish color.
B (b axis): represents the yellow-blue color axis, ranging from -128 to 127. A positive value indicates a more yellowish color, while a negative value indicates a more bluish color.
While RGB is a widely used color space, LAB is often preferred for tasks that require accurate color representation and manipulation, such as color grading and printing. LAB's perceptual uniformity and wider color gamut make it a more suitable choice for these applications.
While understanding the mathematical transformations between color spaces can be helpful, the key is to grasp the conceptual meaning of each color space. This knowledge allows you to work effectively with color information in various applications.
Moving on, color expressions are mathematical representations of colors used in computer graphics, image processing, and other fields. They represent a specific point within a color space, providing a way to define and manipulate colors precisely. In short, a color space is the overall framework, while a color expression is a specific point within that framework.
Three such color expressions are HSB, HSV, and HSL. HSB and HSV are two names for the same model: brightness (B) and value (V) refer to the same component. Hue (the dominant color) is shared by all three, and HSL differs from the other two in its third component:
Brightness or Value (HSB/HSV): represents the overall intensity of the color, ranging from 0% (black) to 100% (the brightest form of the color, which at full saturation is the pure color).
Lightness (HSL): also represents the overall intensity, but it is defined differently. In HSL, lightness ranges from 0% (black) to 100% (white), regardless of the hue and saturation.
As for HSL and HSV, they share the same hue component, but their saturation components are defined differently and are not interchangeable. In HSV, pure colors always have a value of 100%, and white is reached only by lowering the saturation to 0%. In HSL, pure colors sit at a lightness of 50%, and raising the lightness to 100% always gives white.
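The difference between value and lightness can be checked with Python's standard colorsys module, as in the sketch below. Note that colorsys names the HSL model HLS and returns its components in the order hue, lightness, saturation; the describe helper is a hypothetical function for this example that reorders them.

```python
import colorsys


def describe(rgb):
    """Return the HSV and HSL tuples for an RGB color with components in 0.0-1.0."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    hue, lightness, sat = colorsys.rgb_to_hls(*rgb)  # note the H, L, S order
    return {"hsv": (h, s, v), "hsl": (hue, sat, lightness)}


print(describe((1.0, 0.0, 0.0)))  # pure red: value is 1.0, lightness is 0.5
print(describe((1.0, 1.0, 1.0)))  # white: value is 1.0, lightness is 1.0
print(describe((0.5, 0.0, 0.0)))  # dark red: value is 0.5, lightness is 0.25
```

Pure red and white share the same value in HSV and are told apart only by saturation, whereas in HSL they already differ in lightness.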
HSL and HSV color models are widely used in various applications, such as:
Pseudo-Coloring
Medical Imaging: HSL and HSV are used to create pseudo-color images for medical visualization. By assigning specific colors to different intensity ranges, these models can help to highlight important features or anomalies in medical images.
Meteorological Maps: HSL and HSV are used to colorize meteorological maps, making it easier to visualize temperature, precipitation, wind patterns, and other weather data.
Image Segmentation
Image Matting: HSL and HSV can be used to isolate objects or regions within an image. By analyzing the hue, saturation, and lightness components, it's possible to extract specific areas of interest.
Information Extraction: HSL and HSV can be used to extract important information from images, such as identifying objects, recognizing patterns, or analyzing color distributions.
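Both kinds of application can be sketched with the standard colorsys module alone. The functions pseudo_color and hue_mask below are hypothetical helpers written for this illustration. The blue-to-red mapping and the saturation threshold are arbitrary choices, and a real pipeline would more likely operate on whole arrays with OpenCV or NumPy.

```python
import colorsys


def pseudo_color(intensity):
    """Map an 8-bit intensity (0-255) to an (R, G, B) tuple: blue for low, red for high."""
    fraction = intensity / 255.0
    hue = (1.0 - fraction) * (2.0 / 3.0)  # from 240 degrees (blue) down to 0 degrees (red)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


def hue_mask(pixels, low_deg, high_deg, min_saturation=0.5):
    """For each (R, G, B) pixel (0-255), report whether its hue lies in [low_deg, high_deg]."""
    mask = []
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Require some saturation so that grays, whose hue is meaningless, are excluded.
        mask.append(low_deg <= h * 360.0 <= high_deg and s >= min_saturation)
    return mask


print(pseudo_color(0), pseudo_color(255))  # (0, 0, 255) (255, 0, 0)
print(hue_mask([(0, 255, 0), (255, 0, 0), (128, 128, 128)], 90, 150))  # [True, False, False]
```

Working on hue keeps the selection rule simple: a single range picks out greenish pixels regardless of how bright they are, which would take three coupled conditions in RGB.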
This exercise explores how images loaded in BGR look after they are converted to other color spaces in OpenCV.
imread can already decode a file in a different format at load time (for example, as grayscale), but it only works on files. Today's new function, cvtColor, goes a step further: it takes image data that is already in memory and returns a new array converted to the requested color space, leaving the original untouched.
To arrange the windows more tidily on the screen, we will also use an additional function called namedWindow, which, when given the WINDOW_NORMAL flag, creates a resizable window instead of the default fixed-size one.
Discern the differences in color spaces for yourself.