OpenCV in Python: Part 1 — Working with Images and Videos

A practical and hands-on guide towards learning OpenCV

Published in Python in Plain English, May 18, 2021 (6 min read)

Welcome!

This is the first article in the OpenCV in Python series, which, as you might have guessed, is about learning OpenCV and getting comfortable working with it.

I am well aware that everyone has their own style of learning, but I would highly recommend that you follow along and code with me as we proceed through the lessons. All the code and data files used will be made available at the end of each article. Feel free to drop in any queries you have.

Now without any further ado, let’s get started.

1. Getting Started with OpenCV

Since we are starting from the very beginning, in this section we’ll learn the basics of handling images. To work with OpenCV, we first need to install it. To do so, enter this command in a terminal (the same command works on Windows, macOS, and Linux, provided Python and pip are installed).

pip install opencv-python

With the installation done, let’s start coding.

First, we need to import the necessary libraries.

The cv2 package is OpenCV’s Python module and will be used for image and video analysis. The argparse package handles arguments passed through the command line. If you’re not comfortable working with command-line arguments, you can go through this tutorial which explains it very well.

We’ll be working on this image.

Place this image (or any other image of your choice) inside your main directory, where your Python file resides.

We’ll be feeding this image through the terminal as an argument, so inside our code file, we need to create an argument parser.

After creating an ArgumentParser instance, we add an argument, which tells the parser to expect an image path on the command line. The parsed result is then passed to the vars() function, which returns the __dict__ attribute of that object, so the arguments can be looked up like dictionary entries.
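As a minimal sketch of this step: the `--image` option matches the command used later in this article, while the short `-i` alias and the help text are assumptions added for illustration. An explicit argument list is passed to `parse_args` so the snippet runs without a real command line.

```python
import argparse

# Build a parser that expects one required option: the path to an image.
# "--image" matches the command shown in the article; "-i" is an assumed alias.
parser = argparse.ArgumentParser(description="Load and display an image")
parser.add_argument("-i", "--image", required=True, help="path to the input image")

# parse_args() normally reads sys.argv; a list is passed here so the
# example is self-contained.
namespace = parser.parse_args(["--image", "photo_one.jpeg"])

# vars() returns the namespace's __dict__, so values can be read by key.
args = vars(namespace)
print(args["image"])
```

In a real script you would call `parser.parse_args()` with no arguments, so that the values come from the terminal.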

Using the imread method of cv2, we’ll read the image into the image variable.

The first argument passed to the function is args["image"], the parsed argument containing the path of the image. Since we stored the image inside the same folder, we’ll simply mention the name of the image.

The second argument is a flag that controls how the file is decoded; here it loads the image as grayscale. This step is very common when dealing with images and videos: a color image carries three channels instead of one, so it costs more memory and computation, and more often than not a grayscale image is entirely sufficient for the task. Hence, there is no need to use the harder-to-process color image.

To display the image, use the imshow method. The first argument is the title of the window in which the image is displayed, and the second is the image variable itself.

cv2.waitKey(0) waits indefinitely for the user to press any key. Once a key is pressed, the script continues, and the image window closes when the windows are destroyed or the program exits.

Besides displaying the image itself, this image variable can be used to display some basic information too, such as the height and width of the image.
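In Python, an image loaded by OpenCV is a NumPy array, so its dimensions are available through `shape`. The sketch below uses a blank array as a stand-in for a loaded grayscale image; the 300 by 400 size is an arbitrary assumption.

```python
import numpy as np

# Stand-in for an image loaded with cv2.imread in grayscale mode:
# a NumPy array with 300 rows and 400 columns (arbitrary values).
image = np.zeros((300, 400), dtype=np.uint8)

# shape is ordered (rows, columns), i.e. (height, width).
# Slicing with [:2] also works for colour images, which have a third entry.
height, width = image.shape[:2]
print(f"height: {height} px, width: {width} px")
```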

OpenCV also allows you to save this file and write it to your disk. Most interestingly, OpenCV also handles the type conversion of the file behind the scenes, i.e., you can save a .jpeg file as a .png file.

The imwrite method takes the path for the output image to be stored as the first argument and the image variable as the second. If you want to change the file type, simply alter it inside the pathname and OpenCV will take care of the rest.

Here is the complete code —

The command to execute this file —

python app.py --image photo_one.jpeg

Note: I named my Python file app.py and my image file photo_one.jpeg. Make sure to replace them with your own filenames.

After you execute this file, this grayscale version of the image will pop up —

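Inside the photos folder in your main directory, you will see the same image stored as newimage.png. Note that imwrite does not create missing folders (it simply returns False), so the photos folder has to exist before the image is written.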

2. Working with Videos in OpenCV

Now that we’re done with images, it is time to understand the basics of interacting with a webcam feed or a video file using OpenCV. Working with video files is not much different from working with images, since a video is just a sequence of frames, and each frame is an image.

First, we import the libraries.

Now, we will create a VideoCapture object which, as the name suggests, captures video.

The number that we are passing in signifies the source. 0 corresponds to the first webcam in your system, 1 corresponds to the second webcam, and so on.

Instead of capturing a live feed, if you want to load a video file, simply pass in the path to the video file instead of a number.

Then, we will create a while loop to capture the source frame by frame.

cap.read() returns a Boolean value (True/False) and the frame. The Boolean is True if the frame was read correctly.

cv2.imshow() is used to display the video. The title of the video will be the first argument.

Now since this while loop is an infinite loop, we need something to break it.

This code snippet might seem confusing at first but once you understand the details, it is quite easy.

The waitKey function returns -1 when no key is pressed within the given delay. In a video loop, a short delay such as waitKey(1) is used, because waitKey(0) would block on every frame until a key is pressed. As soon as a key is pressed, it returns an integer key code.
0xFF here represents binary 11111111, an 8-bit mask. Since 8 bits are enough to represent an ASCII character, we AND the return value of waitKey with 0xFF. As a result, we obtain an integer between 0 and 255.
ord(char) returns the character code of the character, which for an ASCII character is at most 127 (for example, ord('q') is 113).
Hence, by comparing the masked integer to the ord(char) value, we can check for a key-press event and break the loop.¹
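The masking can be tried out with plain integers, no camera needed. The raw values below are hypothetical stand-ins for what `cv2.waitKey` might return, including one with extra bits set above the lowest byte.

```python
# Hypothetical raw return values from cv2.waitKey, for illustration only.
no_key = -1                             # nothing was pressed within the delay
plain_q = ord("q")                      # the key code for 'q'
q_with_high_bits = 0x100000 | ord("q")  # same key, with an extra high bit set

for raw in (no_key, plain_q, q_with_high_bits):
    masked = raw & 0xFF  # keep only the lowest 8 bits
    print(raw, masked, masked == ord("q"))
```

Note that `&` binds more tightly than `==` in Python, so an expression of the form `cv2.waitKey(1) & 0xFF == ord('q')` is evaluated as `(cv2.waitKey(1) & 0xFF) == ord('q')`.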

After we’re finished using the camera, we need to release it. If we don’t, the camera may stay locked and be unavailable the next time we (or another program) try to use it.

It is time to run this file. Just in case you haven’t typed in the code alongside, here is the entire code for this section —

Executing this file, you’ll be able to see your webcam display on a window titled ‘video feed’. To close the window, press ‘q’.

Remember how we converted our color image to grayscale and saved it to the disk? Well, the same is possible for videos too. We just have to make a few minor changes.

Inside the while loop, insert these statements —

Upon saving and executing the file, you’ll get two windows —

Original version
Grayscale version

Now, we usually output two or more windows when we want to compare our alterations with the original file/feed. This makes it easier to spot the changes we’ve made.

We can also save the video using a VideoWriter object. Before that, we need to define the fourcc variable. FourCC is a 4-byte code used to specify the video codec. The full list of codes can be obtained at Video Codecs by FourCC.

The name of the output file is the first argument, followed by the FourCC code, the number of frames per second (fps), and the frame size as a (width, height) tuple.

In order to save the video, we need to record each frame inside the while loop.

Don’t forget to release the VideoWriter instance at the end.

The full code —

After executing this code, you’ll notice that a video named ‘output.avi’ will be present inside your main directory. That is the video that we just recorded and wrote onto the disk.

3. Final Thoughts

This article dealt with the basics: loading and writing multimedia content. OpenCV is a far more powerful tool and can be used for much higher-level work.

For the time being, I hope you are a lot more comfortable working with OpenCV. In fact, I would encourage you to read the official docs and fiddle around with the code, until I write the next article in this series, which will deal with more complex content.

The links to future articles will be updated here, so do keep a watch on this space.

GitHub Code

Code for OpenCV in Python : Part 1

Footnotes

Getting Started with Videos — OpenCV-Python Tutorials 1 documentation (opencv-python-tutroals.readthedocs.io)
