What would I give for a better brain…
Data augmentation is pivotal in enhancing machine learning models' performance and generalization. By introducing controlled overlaps between augmented images, we can simulate real-world scenarios that are missing from manually curated training data, improving models' robustness.
OpenCV offers a versatile set of functions for drawing and painting on images. These tools can be used for various purposes, including image annotation and data visualization.
Starting off, cv2.rectangle() draws a rectangular frame onto an image (see the sketch after this list). It takes the following parameters:
img: the image on which to draw the rectangle.
pt1: the top-left corner of the rectangle.
pt2: the bottom-right corner of the rectangle.
color: the color of the rectangle, represented as a tuple of values in the order of B, G, and R.
thickness: the thickness of the rectangle's outline in pixels. If it is negative, the rectangle will be filled with the specified color.
lineType: the type of line used to draw the rectangle, e.g. cv2.LINE_AA for an anti-aliased edge.
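A minimal sketch of both the outlined and filled cases (the blank canvas and all coordinates are made up for illustration):

```python
import cv2
import numpy as np

# Hypothetical blank 512x512 BGR canvas; any image works
img = np.zeros((512, 512, 3), dtype=np.uint8)

# Green rectangle with a 2px anti-aliased outline
cv2.rectangle(img, (50, 50), (200, 150), (0, 255, 0), 2, cv2.LINE_AA)

# Negative thickness fills the rectangle with the color instead
cv2.rectangle(img, (250, 50), (400, 150), (255, 0, 0), -1)
```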
The next function, line(), draws a straight line between two endpoint coordinates. It takes the same parameters as rectangle(), with pt1 and pt2 marking the line's endpoints rather than opposite corners.
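For example (endpoints assumed):

```python
import cv2
import numpy as np

img = np.zeros((512, 512, 3), dtype=np.uint8)

# Red 3px anti-aliased diagonal from the top-left to the bottom-right corner
cv2.line(img, (0, 0), (511, 511), (0, 0, 255), 3, cv2.LINE_AA)
```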
Afterwards is the putText() function, which draws a text string over an image. You can pass in values for the font face (one of OpenCV's built-in Hershey fonts), font scale, font color, and thickness.
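A minimal sketch (the text, position, and font choices are placeholders):

```python
import cv2
import numpy as np

img = np.zeros((512, 512, 3), dtype=np.uint8)

# White label: font face, font scale, color, then thickness
cv2.putText(img, "augmented", (50, 300), cv2.FONT_HERSHEY_SIMPLEX,
            1.0, (255, 255, 255), 2, cv2.LINE_AA)
```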
To further elaborate on affine transformations: they are a class of geometric transformations that preserve parallelism and the ratios of distances between points along a line. This means that parallel lines remain parallel after an affine transformation, and the midpoint of a segment is still its midpoint after the transformation.
To represent affine transformations in a unified framework, we often use homogeneous coordinates. By adding an extra dimension to the coordinates of points, we can represent both affine and projective transformations using matrix multiplication.
In the context of affine transformations, the third component of a point's homogeneous coordinates is set to 1, and the bottom row of the 3×3 matrix is fixed at [0, 0, 1]. This ensures that the transformation preserves the affine properties of the image.
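To make this concrete, here is a small NumPy sketch (the matrix values are made up): a 2×3 affine matrix is extended with the fixed row [0, 0, 1], a point (x, y) becomes (x, y, 1), and the whole transform collapses into one matrix multiplication.

```python
import numpy as np

# A made-up affine matrix: roughly a 30° rotation plus a (10, 20) shift
M = np.array([[0.866, -0.5,   10.0],
              [0.5,    0.866, 20.0]])

# Append the fixed bottom row to work in homogeneous coordinates
M_h = np.vstack([M, [0.0, 0.0, 1.0]])

p = np.array([100.0, 50.0, 1.0])  # the point (100, 50) with a 1 appended
print(M_h @ p)                    # -> [71.6, 113.3, 1.0]; last entry stays 1
```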
This exercise explores how OpenCV's rotation and translation matrices operate, as well as how functions can be run sequentially on the same item.
Beginning with the rotation matrix, it is produced by the getRotationMatrix2D function, which takes a center to rotate around, a rotation angle, and a scale factor. When the resulting 2×3 matrix is printed, as in the sketch after this list, each group of values means:
[0.35355339, 0.35355339]: the first row's rotation-and-scale coefficients, which compute the new x coordinate.
[-0.35355339, 0.35355339]: the second row's rotation-and-scale coefficients, which compute the new y coordinate.
[74.98066402, 256.0]: the translation of the x and y coordinates, chosen so the image rotates about the requested center rather than the origin.
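Working backwards from those numbers, the quoted matrix can be reproduced by the following call, which presumably matches the original setup: a 512×512 image rotated 45° about its center at half scale (an assumption on my part).

```python
import cv2

# 45° rotation about (256, 256) at 0.5 scale: cos(45°) * 0.5 = 0.35355339
M = cv2.getRotationMatrix2D(center=(256, 256), angle=45, scale=0.5)
print(M)
# [[  0.35355339   0.35355339  74.98066402]
#  [ -0.35355339   0.35355339 256.        ]]
```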
Translation in geometry is essentially the same as moving an object. In mathematical terms, translation is represented by adding a vector to the coordinates of each point in the object. This vector specifies the direction and distance of the shift.
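In OpenCV, this vector goes into the last column of a 2×3 matrix passed to warpAffine(); a short sketch with arbitrary offsets:

```python
import cv2
import numpy as np

img = np.zeros((512, 512, 3), dtype=np.uint8)

# Shift every pixel 40px right and 25px down by adding the vector (40, 25)
T = np.float32([[1, 0, 40],
                [0, 1, 25]])
shifted = cv2.warpAffine(img, T, (img.shape[1], img.shape[0]))
```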
By storing the output of warpAffine() in a variable, you can fetch it later and run it through further OpenCV functions, chaining transformations together.
As stated before, warpAffine() creates a new image based on the one passed into it, leaving the original unaltered.
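Putting the last two points together, a sketch of chaining two transforms (all parameters assumed): the rotated result is stored, then fed into a second warpAffine() call, while img itself stays untouched.

```python
import cv2
import numpy as np

img = np.zeros((512, 512, 3), dtype=np.uint8)

# First pass: rotate about the center; img is left unaltered
M_rot = cv2.getRotationMatrix2D((256, 256), 45, 0.5)
rotated = cv2.warpAffine(img, M_rot, (512, 512))

# Second pass: run the stored output through another transformation
M_shift = np.float32([[1, 0, 40], [0, 1, 25]])
rotated_and_shifted = cv2.warpAffine(rotated, M_shift, (512, 512))
```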