Geometric Transformations

In this tutorial, we will explore geometric transformations in depth, understand the underlying theory, and learn how to perform them using OpenCV. So, let’s dive in!

Updated March 25, 2023



Welcome to this tutorial on geometric transformations in OpenCV! As an aspiring computer vision expert, you’ll often need to transform images to change their shape, size, and orientation. We’ll cover the theory behind these transformations and then apply each one in code.

What are Geometric Transformations?

Geometric transformations are operations that change where pixels sit in an image rather than what color they are, which alters properties such as size, orientation, and perspective. Common examples include scaling, rotation, and translation, along with the more general affine and perspective transformations.

Check out the geometric transformation Wikipedia page for a more detailed explanation of the concept.

Why are Geometric Transformations Important?

Geometric transformations are essential for many computer vision tasks, such as image stitching, object tracking, and image registration. By transforming images, you can align them, change their perspective, or resize them to fit specific requirements, which helps improve the performance of computer vision algorithms.

The Theory Behind Geometric Transformations

Geometric transformations can be represented as mathematical operations applied to the pixel coordinates of an image. Each transformation is described by a matrix that maps a coordinate in the input image to a coordinate in the output image. In OpenCV, affine transformations use a 2×3 matrix and perspective transformations use a 3×3 matrix.

In OpenCV, geometric transformations can be performed using functions like resize(), warpAffine(), and warpPerspective().
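To make the matrix idea concrete, here is a minimal sketch in plain Python (no OpenCV needed) that applies a 3×3 matrix to a pixel coordinate written in homogeneous form (x, y, 1). The helper names apply_matrix, translation, and scaling are made up for this illustration and are not OpenCV functions.

```python
def apply_matrix(M, point):
    """Apply a 3x3 transformation matrix to an (x, y) point."""
    x, y = point
    # Homogeneous coordinates: treat the point as (x, y, 1) so that
    # translation can be written as a matrix product
    xh = M[0][0] * x + M[0][1] * y + M[0][2]
    yh = M[1][0] * x + M[1][1] * y + M[1][2]
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    # For affine matrices w is always 1; for perspective matrices it is not
    return (xh / w, yh / w)


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


moved = apply_matrix(translation(50, 20), (10, 10))
scaled = apply_matrix(scaling(2, 0.5), (10, 10))
print(moved)   # (60.0, 30.0)
print(scaled)  # (20.0, 5.0)
```

The functions used in the rest of this tutorial do the same arithmetic for every pixel, plus interpolation to fill in values that fall between pixels.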

Geometric Transformations with OpenCV: A Step-by-Step Guide

Now that we have a basic understanding of geometric transformations, let’s see how to perform them using OpenCV. We’ll be using Python for our examples, but you can also use the OpenCV C++ API.

Step 1: Install OpenCV

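First, let’s install OpenCV. The opencv-python package includes the window functions (such as imshow()) used in this tutorial. The opencv-python-headless package leaves them out and is meant for servers without a display, so install only one of the two:

pip install opencv-python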

Step 2: Load and Display an Image

Let’s start by loading and displaying an image using OpenCV:

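import cv2

img = cv2.imread('path/to/image.jpg')
# imread() returns None instead of raising an error if the file can't be read
if img is None:
    raise FileNotFoundError('Could not read path/to/image.jpg')

cv2.imshow('Original Image', img)
cv2.waitKey(0)  # wait until a key is pressed
cv2.destroyAllWindows()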

Step 3: Scaling (Resizing) an Image

Scaling is the process of resizing an image. You can scale an image using OpenCV’s resize() function:

def resize_image(img, scale_percent):
    width = int(img.shape[1] * scale_percent / 100)
    height = int(img.shape[0] * scale_percent / 100)
    dim = (width, height)

    resized = cv2.resize(img, dim, interpolation=cv2.INTER_LINEAR)
    return resized

scaled_image = resize_image(img, 50)
cv2.imshow('Scaled Image', scaled_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, we shrink the image to 50% of its original width and height using linear interpolation, so the result has a quarter of the original pixels.

Step 4: Rotating an Image

Rotating an image involves changing its orientation by a specified angle. You can rotate an image using OpenCV’s getRotationMatrix2D() and warpAffine() functions:

def rotate_image(img, angle, center=None, scale=1.0):
    (h, w) = img.shape[:2]

    # Rotate around the image center unless another point is given
    if center is None:
        center = (w // 2, h // 2)

    # Build the 2x3 rotation matrix, then apply it to the image
    M = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(img, M, (w, h))
    return rotated

rotated_image = rotate_image(img, 45)
cv2.imshow('Rotated Image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

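In this example, we rotate the image by 45 degrees counterclockwise around its center and maintain the original scale. Because the output keeps the original width and height, the corners of the rotated image are cut off.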

Step 5: Translating an Image

Translating an image involves shifting its position horizontally and vertically. You can translate an image using OpenCV’s warpAffine() function:

import numpy as np

def translate_image(img, x_shift, y_shift):
    # 2x3 matrix: the identity plus a column holding the shift
    M = np.float32([[1, 0, x_shift], [0, 1, y_shift]])
    translated = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    return translated

translated_image = translate_image(img, 50, 50)
cv2.imshow('Translated Image', translated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, we translate the image 50 pixels to the right and 50 pixels down.

Step 6: Applying Affine Transformations

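Affine transformations combine linear operations (scaling, rotation, and shearing) with translation. They keep straight lines straight and parallel lines parallel, but they don’t necessarily preserve angles or lengths. Three pairs of corresponding points are enough to define one, so you can apply an affine transformation using OpenCV’s getAffineTransform() and warpAffine() functions: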

def apply_affine_transform(img, src_points, dst_points):
    M = cv2.getAffineTransform(src_points, dst_points)
    transformed = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    return transformed

src_pts = np.float32([[50, 50], [200, 50], [50, 200]])
dst_pts = np.float32([[10, 100], [200, 50], [100, 250]])

affine_transformed_image = apply_affine_transform(img, src_pts, dst_pts)
cv2.imshow('Affine Transformed Image', affine_transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

In this example, we transform the image by specifying three source points and their corresponding destination points.

Step 7: Applying Perspective Transformations

Perspective transformations change the perspective of an image, such as simulating a 3D effect or correcting distortion. You can apply a perspective transformation using OpenCV’s getPerspectiveTransform() and warpPerspective() functions:

def apply_perspective_transform(img, src_points, dst_points):
    M = cv2.getPerspectiveTransform(src_points, dst_points)
    transformed = cv2.warpPerspective(img, M, (img.shape[1], img.shape[0]))
    return transformed

src_pts = np.float32([[56, 65], [368, 52], [28, 387], [389, 390]])
dst_pts = np.float32([[0, 0], [300, 0], [0, 300], [300, 300]])

perspective_transformed_image = apply_perspective_transform(img, src_pts, dst_pts)
cv2.imshow('Perspective Transformed Image', perspective_transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

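In this example, we transform the image by specifying four source points and their corresponding destination points, which are the corners of a 300×300 square. The output keeps the original image size, so the selected quadrilateral lands in the top-left 300×300 pixels of the result.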

And that’s it! You now know how to scale, rotate, and translate images and how to apply affine and perspective transformations in OpenCV. These are the building blocks behind tasks like the image stitching and image registration mentioned earlier.

Remember to keep experimenting, learning, and having fun with computer vision. Happy coding!