Assumption is one of the great barriers to understanding.
OpenCV, a powerful computer vision library, offers a vast array of tools to enhance and manipulate images. By applying various transformations to images, OpenCV can help us create new synthetic samples that closely resemble the original data, which can be used to improve the generalization performance of our machine learning models, especially when dealing with limited datasets.
While color spaces provide different ways to represent colors, they can also be used for data augmentation purposes. By converting images between color spaces, we can modify their color properties such as brightness, saturation, or hue.
For example, to adjust the brightness of an image, you can:
Convert Image to HSV or HSL Color Space: these color spaces explicitly represent brightness or lightness as a separate component.
Adjust Brightness Value: increase or decrease the brightness component to make the image brighter or darker.
Convert Image Back to RGB: convert the modified image back to RGB for display or further processing.
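A minimal sketch of these three steps might look like the following, assuming an example file name ("photo.jpg") and an arbitrary brightness offset of 40:

```python
import cv2

# Load the image (OpenCV reads it as BGR) and convert it to HSV.
# "photo.jpg" is an example file name.
img = cv2.imread("photo.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Increase the value (brightness) channel; cv2.add saturates at 255,
# so the result stays within the valid uint8 range.
h, s, v = cv2.split(hsv)
v = cv2.add(v, 40)

# Convert back to RGB for display or further processing.
brighter = cv2.cvtColor(cv2.merge((h, s, v)), cv2.COLOR_HSV2RGB)
```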
When working with OpenCV, it is essential to be aware of the specific color space you are using and its characteristics:
Value Range: OpenCV typically represents color values as 8-bit unsigned integers (uint8), which have a range of 0 to 255. If your calculations result in values outside this range, you'll need to normalize or clip them accordingly.
Color Space Conversion: when converting between color spaces, ensure that you handle the value ranges correctly. Some color spaces like HSV might use a different range for certain components (e.g., hue might be expressed in degrees).
Data Types: be mindful of the data types used for color values. OpenCV might require specific data types for certain operations.
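As an illustration of the first point, one common pattern is to do the arithmetic in a wider data type and then clip the result back into the 0 to 255 range; the gain of 1.3 and the file name below are arbitrary examples:

```python
import cv2
import numpy as np

# Example file name; load as a single-channel grayscale image.
gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# Compute in float to avoid uint8 overflow, then clip and convert back.
scaled = np.clip(gray.astype(np.float32) * 1.3, 0, 255).astype(np.uint8)
```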
For example, to adjust the hue of an HSV image in OpenCV, you can:
Convert the image to HSV color space.
Access the hue channel.
Modify the hue values as needed (e.g., shift them by a certain amount).
Ensure that the modified hue values remain within the valid range (0 to 179 in OpenCV's default 8-bit HSV representation).
Convert the image back to RGB for display or further processing.
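A rough sketch of these steps, assuming an example file name and an arbitrary hue shift of 10, could look like this:

```python
import cv2
import numpy as np

# Example file name; OpenCV loads images as BGR.
img = cv2.imread("photo.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Shift the hue channel and wrap around OpenCV's 0-179 range.
hue = hsv[:, :, 0].astype(np.int16)
hsv[:, :, 0] = ((hue + 10) % 180).astype(np.uint8)

# Convert back to RGB for display or further processing.
shifted = cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)
```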
In actual photography, environmental factors often cause overexposure or underexposure, resulting in loss of detail and lower accuracy for models trained on the affected photos. Histogram equalization is a technique used in image processing to enhance the contrast of an image by stretching its histogram to cover the full range of possible pixel values, helping to reveal details that would otherwise be invisible in the original image.
As the technique's name suggests, it works with the image's histogram, which shows how many pixels have each intensity value and provides a visual understanding of the image's contrast and dynamic range. By analyzing the histogram, we can identify areas of the image that are too dark or too bright.
Histogram equalization aims to redistribute these pixel intensities to create a more uniform histogram, which typically results in a more balanced and visually appealing image.
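In OpenCV this is available as equalizeHist, which operates on a single-channel 8-bit image. A minimal sketch, assuming an example file name:

```python
import cv2

# Example file name; equalizeHist expects a single-channel 8-bit image.
gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# Redistribute intensities so the histogram covers the full 0-255 range.
equalized = cv2.equalizeHist(gray)
```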
Traditional histogram equalization, while effective in many use cases, has some limitations:
Global Contrast Enhancement: the traditional technique applies the same enhancement to the entire image, regardless of local variations in contrast or detail. This can lead to over-enhancement in some regions and under-enhancement in others.
Over-Enhancement: for images with extreme contrast, traditional histogram equalization can over-enhance high-intensity regions, resulting in noise or artifacts.
Loss of Details: in some cases, the equalization process can lead to the loss of fine details or subtle features in the image.
Sensitivity to Noise: the technique can be sensitive to noise in the image, as the noise can affect the shape of the histogram and the resulting equalization transformation.
To address the limitations of traditional histogram equalization listed above, we can use Contrast Limited Adaptive Histogram Equalization (CLAHE). Here is how this variant of histogram equalization works, step by step (a code sketch follows the steps):
The image is divided into smaller blocks or tiles.
A histogram is calculated for each block, capturing the intensity distribution within that region.
To prevent over-enhancement, the histogram is clipped or limited to a maximum value. This limits the amount of contrast that can be applied to each block.
Each block's histogram is equalized using the same process as traditional histogram equalization.
The borders between blocks are interpolated to ensure a smooth transition and avoid artifacts.
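OpenCV exposes this through createCLAHE; the clip limit and tile grid size below are commonly used example values rather than prescribed ones, and the file name is an example:

```python
import cv2

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# Divide the image into 8x8 tiles, clip each tile's histogram at the
# given limit, equalize per tile, and interpolate between tiles.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
result = clahe.apply(gray)
```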
Binarization is a fundamental image processing technique that involves converting a grayscale image into a binary image, consisting of only black and white pixels. This process is essential for many applications, such as object detection, image segmentation, and optical character recognition.
The key step in binarization is thresholding, which is why the technique is also known as binary thresholding. A threshold value is selected, and each pixel in the grayscale image is compared to this value. If the pixel's intensity is greater than or equal to the threshold, it is set to white (1, or 255 in an 8-bit image); otherwise, it is set to black (0).
Several types of binarization include:
Global Thresholding: a single fixed threshold value (commonly 127) is chosen, and all pixels above this threshold are set to white, while those below are set to black. This method is simple but may not be suitable for images with varying lighting conditions or complex backgrounds.
Adaptive Thresholding: the threshold value is calculated dynamically based on local image characteristics, such as the average intensity or variance within a neighborhood. This method can be more robust to variations in lighting and background.
Adaptive Mean Thresholding: the threshold value is calculated dynamically based on the mean intensity of a local neighborhood. This method is more robust to variations in lighting and background, as it adapts the threshold to the local characteristics of the image, and it typically produces a better result than global thresholding, with fewer errors in the binary image.
Adaptive Gaussian Thresholding: similar to adaptive mean thresholding, but uses a weighted average of the neighborhood intensities, with more weight given to pixels closer to the center. This method is even more robust to noise and variations in lighting, as it takes into account the spatial distribution of pixel intensities, and it often produces the best result of the three methods, with the most accurate binary image.
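A sketch comparing the three methods in OpenCV follows; the fixed threshold of 127, block size of 11, constant C of 2, and file name are example values only:

```python
import cv2

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# Global thresholding with a single fixed value.
_, global_bin = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# Adaptive mean thresholding: threshold = mean of the 11x11 neighborhood minus C.
mean_bin = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 11, 2)

# Adaptive Gaussian thresholding: threshold = Gaussian-weighted mean minus C.
gauss_bin = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                  cv2.THRESH_BINARY, 11, 2)
```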
This exercise will have us work on color channels, perform histogram equalization, and run two types of binarization on a grayscale image.
To display a specific channel of an HSV image with imshow, index the channel by its number, starting from 0: hue is 0, saturation is 1, and value is 2.
By combining the hue, saturation, and value for each pixel, a clear and vibrant image is created.
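For example, assuming OpenCV's own imshow is used for display and an example file name, each channel can be viewed like this:

```python
import cv2

img = cv2.imread("photo.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

cv2.imshow("hue", hsv[:, :, 0])         # index 0: hue
cv2.imshow("saturation", hsv[:, :, 1])  # index 1: saturation
cv2.imshow("value", hsv[:, :, 2])       # index 2: value
cv2.waitKey(0)
```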
The ravel function, which comes from NumPy rather than OpenCV, flattens a multi-dimensional array into a one-dimensional array. If the image is not flattened before being passed to Matplotlib's hist function, you will likely get unexpected output or an error.
In a standard OpenCV image histogram, the x-axis represents pixel intensity values ranging from 0 to 255, while the y-axis represents the frequency or number of pixels that have a particular intensity value.
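A minimal sketch of plotting such a histogram with Matplotlib, flattening the image with ravel first (the file name is an example):

```python
import cv2
from matplotlib import pyplot as plt

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# Flatten the 2-D image to 1-D before handing it to hist.
plt.hist(gray.ravel(), bins=256, range=(0, 256))
plt.xlabel("Pixel intensity")
plt.ylabel("Number of pixels")
plt.show()
```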
To calculate the histogram of an image, OpenCV provides the calcHist function, its counterpart to Matplotlib's hist. Compared to hist, calcHist differs in:
Library: hist is from Matplotlib, while calcHist is from OpenCV.
Functionality: calcHist offers more advanced features for image processing, such as handling multiple images, specifying channels, and applying masks.
Output: hist returns the bin counts (along with bin edges and plot patches) as NumPy arrays, while calcHist returns the histogram as a NumPy array in Python or an OpenCV Mat object in C++.
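A minimal calcHist sketch, using its ([images], [channels], mask, [histSize], [ranges]) argument order; the file name is an example:

```python
import cv2
from matplotlib import pyplot as plt

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# 256 bins covering intensities 0-255, channel 0, no mask.
hist = cv2.calcHist([gray], [0], None, [256], [0, 256])
plt.plot(hist)
plt.show()
```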
The photo below captures shadowy details more clearly than the earlier grayscale photo, as you can see in the shadow on the woman's back. In the histogram, this is reflected by intensity values that are more evenly distributed across the axis.
In this exercise, a new type of thresholding is introduced. Otsu thresholding, named after its developer Nobuyuki Otsu, who published it in 1979, is an automatic thresholding method that determines the optimal threshold value for an image by maximizing the variance between the two classes of pixels, foreground and background.
Using Otsu in conjunction with global thresholding can automate the process of selecting the best threshold value for your image, leading to more accurate and consistent results.
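A sketch of combining the two in OpenCV; the initial threshold of 0 is a placeholder, since THRESH_OTSU computes the optimal value itself and returns it, and the file name is an example:

```python
import cv2

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# Otsu picks the threshold automatically; the chosen value is returned first.
otsu_value, otsu_bin = cv2.threshold(gray, 0, 255,
                                     cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```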
While human vision can perceive grayscale images more naturally, binarization can provide valuable information that is not immediately apparent to the human eye.