Geometric Transformations
In this tutorial, we will explore geometric transformations in depth, understand the underlying theory, and learn how to perform them using OpenCV. So, let’s dive in!
Updated March 25, 2023
Welcome to this exciting tutorial on geometric transformations in OpenCV! As an aspiring computer vision expert, you’ll often need to perform transformations on images to manipulate their shape, size, and orientation.
What are Geometric Transformations?
Geometric transformations are operations that modify the geometric properties of images, such as size, orientation, and perspective. Some common geometric transformations include scaling, rotation, translation, general affine transformations, and perspective transformations.
Check out the geometric transformation Wikipedia page for a more detailed explanation of the concept.
Why are Geometric Transformations Important?
Geometric transformations are essential for many computer vision tasks, such as image stitching, object tracking, and image registration. By transforming images, you can align them, change their perspective, or resize them to fit specific requirements, which helps improve the performance of computer vision algorithms.
The Theory Behind Geometric Transformations
Geometric transformations can be represented as mathematical operations applied to the pixel coordinates of an image. The transformed pixel coordinates can be calculated using transformation matrices, which define how the input image is transformed into the output image.
In OpenCV, geometric transformations can be performed using functions like resize(), warpAffine(), and warpPerspective().
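To make the matrix idea concrete, here is a minimal NumPy sketch of how a 2x3 affine matrix (the shape that warpAffine() accepts) maps pixel coordinates. The helper name apply_affine_to_points and the sample values are illustrative assumptions, not part of OpenCV.

```python
import numpy as np

def apply_affine_to_points(M, points):
    """Map an (N, 2) array of (x, y) points through a 2x3 affine matrix M."""
    points = np.asarray(points, dtype=np.float64)
    # Append a 1 to every point so the translation column of M can be applied
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.hstack([points, ones])  # shape (N, 3)
    return homogeneous @ np.asarray(M, dtype=np.float64).T  # shape (N, 2)

# A pure translation: shift x by 50 and y by 20
M_translate = np.array([[1, 0, 50],
                        [0, 1, 20]])
corners = [[0, 0], [100, 0], [0, 100]]
moved = apply_affine_to_points(M_translate, corners)
print(moved)
```

Each output row is the new (x, y) position of the corresponding input point, which is the per-pixel mapping that the warping functions rely on.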
Geometric Transformations with OpenCV: A Step-by-Step Guide
Now that we have a basic understanding of geometric transformations, let’s see how to perform them using OpenCV. We’ll be using Python for our examples, but you can also use the OpenCV C++ API.
Step 1: Install OpenCV
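First, let’s install OpenCV. The opencv-python and opencv-python-headless packages are alternatives, so install only one of them per environment. This tutorial calls cv2.imshow(), which needs the GUI support included in opencv-python:
pip install opencv-python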
Step 2: Load and Display an Image
Let’s start by loading and displaying an image using OpenCV:
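import cv2

# imread() returns None instead of raising an error when the file cannot be read
img = cv2.imread('path/to/image.jpg')
if img is None:
    raise FileNotFoundError('Could not read path/to/image.jpg')

cv2.imshow('Original Image', img)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()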
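One detail worth knowing before the next steps: the image returned by imread() is a NumPy array indexed as (rows, columns, channels), that is (height, width, channels), with color channels in BGR order, while OpenCV size arguments are given as (width, height). The sketch below uses a synthetic black image as a stand-in for a loaded file, so the name demo_img and its dimensions are assumptions for illustration.

```python
import numpy as np

# Synthetic stand-in for a loaded color image: 480 rows (height), 640 columns (width)
demo_img = np.zeros((480, 640, 3), dtype=np.uint8)

height, width, channels = demo_img.shape

# OpenCV size arguments (such as the dsize parameter of cv2.resize) use (width, height)
dsize = (width, height)
print(height, width, channels)  # 480 640 3
print(dsize)                    # (640, 480)
```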
Step 3: Scaling (Resizing) an Image
Scaling is the process of resizing an image. You can scale an image using OpenCV’s resize() function:
def resize_image(img, scale_percent):
    # img.shape is (height, width, channels); cv2.resize() expects (width, height)
    width = int(img.shape[1] * scale_percent / 100)
    height = int(img.shape[0] * scale_percent / 100)
    dim = (width, height)
    resized = cv2.resize(img, dim, interpolation=cv2.INTER_LINEAR)
    return resized

scaled_image = resize_image(img, 50)
cv2.imshow('Scaled Image', scaled_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
In this example, we resize the image to 50% of its original size using linear interpolation.
Step 4: Rotating an Image
Rotating an image involves changing its orientation by a specified angle. You can rotate an image using OpenCV’s getRotationMatrix2D() and warpAffine() functions:
def rotate_image(img, angle, center=None, scale=1.0):
    (h, w) = img.shape[:2]
    if center is None:
        center = (w // 2, h // 2)
    # Positive angles rotate counterclockwise
    M = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(img, M, (w, h))
    return rotated

rotated_image = rotate_image(img, 45)
cv2.imshow('Rotated Image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
In this example, we rotate the image by 45 degrees counterclockwise around its center and maintain the original scale.
Step 5: Translating an Image
Translating an image involves shifting its position horizontally and vertically. You can translate an image using OpenCV’s warpAffine() function:
import numpy as np  # needed to build the transformation matrix

def translate_image(img, x_shift, y_shift):
    # 2x3 matrix: identity plus a shift of (x_shift, y_shift)
    M = np.float32([[1, 0, x_shift], [0, 1, y_shift]])
    translated = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    return translated

translated_image = translate_image(img, 50, 50)
cv2.imshow('Translated Image', translated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
In this example, we translate the image 50 pixels to the right and 50 pixels down.
Step 6: Applying Affine Transformations
Affine transformations combine scaling, rotation, translation, and shearing while keeping parallel lines parallel. You can apply an affine transformation to an image using OpenCV’s getAffineTransform() and warpAffine() functions:
def apply_affine_transform(img, src_points, dst_points):
    # Three point pairs define the 2x3 affine matrix
    M = cv2.getAffineTransform(src_points, dst_points)
    transformed = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    return transformed

src_pts = np.float32([[50, 50], [200, 50], [50, 200]])
dst_pts = np.float32([[10, 100], [200, 50], [100, 250]])
affine_transformed_image = apply_affine_transform(img, src_pts, dst_pts)
cv2.imshow('Affine Transformed Image', affine_transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
In this example, we transform the image by specifying three source points and their corresponding destination points.
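Three point pairs are enough because a 2x3 affine matrix has six unknowns and each pair contributes two equations. The NumPy sketch below solves that linear system directly, as an illustration of the idea rather than OpenCV’s actual implementation. The function name solve_affine is an assumption.

```python
import numpy as np

def solve_affine(src_points, dst_points):
    """Find the 2x3 matrix M with M @ [x, y, 1] = [x', y'] for three point pairs."""
    src = np.asarray(src_points, dtype=np.float64)
    dst = np.asarray(dst_points, dtype=np.float64)
    # Each row of A is a source point in homogeneous form [x, y, 1]
    A = np.hstack([src, np.ones((3, 1))])
    # Solve A @ X = dst for the 3x2 matrix X; the affine matrix is its transpose
    X = np.linalg.solve(A, dst)
    return X.T

src_pts = [[50, 50], [200, 50], [50, 200]]
dst_pts = [[10, 100], [200, 50], [100, 250]]
M = solve_affine(src_pts, dst_pts)

# Check: mapping the source points through M reproduces the destination points
src_h = np.hstack([np.array(src_pts, dtype=np.float64), np.ones((3, 1))])
mapped = src_h @ M.T
print(np.round(mapped, 6))
```

The three source points must not lie on one line, otherwise the system has no unique solution.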
Step 7: Applying Perspective Transformations
Perspective transformations change the perspective of an image, such as simulating a 3D effect or correcting distortion. You can apply a perspective transformation using OpenCV’s getPerspectiveTransform() and warpPerspective() functions:
def apply_perspective_transform(img, src_points, dst_points):
    # Four point pairs define the 3x3 perspective matrix
    M = cv2.getPerspectiveTransform(src_points, dst_points)
    transformed = cv2.warpPerspective(img, M, (img.shape[1], img.shape[0]))
    return transformed

src_pts = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
dst_pts = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])
perspective_transformed_image = apply_perspective_transform(img, src_pts, dst_pts)
cv2.imshow('Perspective Transformed Image', perspective_transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
In this example, we transform the image by specifying four source points and their corresponding destination points.
And that’s it! You now know how to perform various geometric transformations in OpenCV. These techniques can significantly enhance the quality and usefulness of your computer vision applications, unlocking new possibilities in your projects.
Remember to keep experimenting, learning, and having fun with computer vision. Happy coding!