OPENCV FOR BEGINNERS

SWAP Inc.
6 min read · May 27, 2021

A cheat sheet to get you started with one of the best open source computer vision software libraries.

OpenCV is a cross-platform library used to develop real-time computer vision applications. Ever wondered how a filter outlines your face so precisely in a snapshot? How your car gives a collision alert while in reverse gear? How the basics of Google Lens work? How a document gets cropped perfectly when you scan it? How Facebook recognizes your friends in a group picture? Well, the most common library behind all these use cases is OpenCV.

OpenCV stands for "Open Source Computer Vision Library" and focuses on image processing, feature extraction, and object detection. Computer vision acts as a bridge between computer software and the visuals around us: it lets a computer learn from and interpret what it sees. Inferring the 3D world from 2D images is one of its fundamental tasks.

Let us now get into it.

So, how do we use it?

It is as easy as a single line of code!

  • Enter the following command in your command prompt; it will install the latest version of OpenCV.
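
For the standard pre-built package, that one line is:

pip install opencv-python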

Well, now a question arises: is this the only thing required to do a complete project?

Not quite. For face detection you also need libraries like NumPy and imutils, and for object detection you may need TensorFlow, the ImageAI library, and SciPy. So which libraries you add depends on your application.
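
For example, the extras used in the face detection walkthrough below install the same way:

pip install numpy imutils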

So where do I start as a Beginner?

I would suggest some image processing projects that you can work on as a beginner.

In this blog, we will be looking into the implementation of Face Detection.

Let us first understand some basic concepts.

How does OpenCV process images?

It converts each image into a multi-dimensional array of pixels, where each value represents the intensity of a color in the image. Images can be rendered in color with 3 channels (Blue, Green, and Red), in grayscale with pixel values ranging from 0 (black) to 255 (white), or in binary with only black or white values (0 or 1). NumPy is used to work with the image as an array of pixels.

np.asarray(image, dtype=np.uint8)

Here np is an alias for NumPy, and the asarray method converts image into an array of pixels. dtype=np.uint8 restricts the array to unsigned 8-bit integers (0 to 255).

What is a contour?

It is a curve joining all the continuous points along a boundary that share the same color or intensity. Contours give the outlines of the shapes in an image.
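
A minimal sketch of finding and drawing contours, assuming OpenCV 4.x (where findContours returns two values) and a file such as photo.jpg on disk; grayscaling and thresholding, used along the way, are explained just below:

import cv2

image = cv2.imread('photo.jpg')                                 # any image file
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)                  # grayscale copy
ret, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # binarize it
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(image, contours, -1, (0, 255, 0), 2)           # outline everything in green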

What is grayscaling?

Grayscaling is the process of converting an image from another color space (e.g., RGB, CMYK, HSV) to shades of gray. Pixel values then vary between complete black and complete white.
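
For example, with OpenCV:

import cv2

image = cv2.imread('photo.jpg')                 # a BGR color image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # the same image in shades of gray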

What is thresholding?

Thresholding is a type of image segmentation, where we change the pixels of an image to make the image easier to analyze. In thresholding, we convert an image from color or grayscale into a binary image, i.e., one that is simply black and white.
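
A minimal example, continuing from the gray image above; pixels brighter than the threshold become white, the rest become black:

ret, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # 127 is the threshold, 255 the "white" value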

Is there any way to resize the image?

resized = imutils.resize(image, width=500)

cv2_imshow(resized)

Using imutils we can resize the image.

Here the image variable contains the original image, which is resized to a width of 500 px (imutils scales the height to preserve the aspect ratio) and stored in the resized variable.

Using cv2_imshow (from google.colab.patches, Colab's replacement for cv2.imshow) we can display the image.

Cropping

Syntax:

image[y: y + h, x: x + w]

The first slice selects rows (the y-axis) and the second selects columns (the x-axis).

Example:

crop = image[10:300,10:300]

How to rotate or flip an image?

Rotating is done by:

imutils.rotate(image, degree)

A flip (mirror effect) is done by reversing the pixels horizontally or vertically.

cv2.flip(image, value)

The value can be 0 (flip vertically, around the x-axis), 1 (flip horizontally, around the y-axis), or -1 (flip both ways).

Now let us get into the implementation part

We will be using Google colab for doing this. (You can also do it in your local system)

Step 1: Go to Google and search for Google Colab

Step 2: Open Google Colab and create a new notebook

Step 3: Name the notebook

Step 4: Insert the code snippet for camera capture
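
Colab ships a ready-made camera-capture snippet (Insert → Code snippets → "Camera Capture"); a lightly condensed version of it looks like this. It opens the webcam in the notebook, waits for you to click Capture, and writes the frame to disk:

from base64 import b64decode
from IPython.display import display, Javascript
from google.colab.output import eval_js

def take_photo(filename='photo.jpg', quality=0.8):
    js = Javascript('''
        async function takePhoto(quality) {
          const div = document.createElement('div');
          const capture = document.createElement('button');
          capture.textContent = 'Capture';
          div.appendChild(capture);

          const video = document.createElement('video');
          video.style.display = 'block';
          const stream = await navigator.mediaDevices.getUserMedia({video: true});
          document.body.appendChild(div);
          div.appendChild(video);
          video.srcObject = stream;
          await video.play();

          // Wait for the Capture button to be clicked.
          await new Promise((resolve) => capture.onclick = resolve);

          const canvas = document.createElement('canvas');
          canvas.width = video.videoWidth;
          canvas.height = video.videoHeight;
          canvas.getContext('2d').drawImage(video, 0, 0);
          stream.getVideoTracks()[0].stop();
          div.remove();
          return canvas.toDataURL('image/jpeg', quality);
        }
    ''')
    display(js)
    data = eval_js('takePhoto({})'.format(quality))
    binary = b64decode(data.split(',')[1])
    with open(filename, 'wb') as f:
        f.write(binary)
    return filename

take_photo()  # opens the webcam, waits for a click, then saves photo.jpg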

Step 5: The captured image will be saved as photo.jpg in your folder.

Step 6: Install OpenCV

Step 7: Import the required libraries
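
A typical import cell for this walkthrough; cv2_imshow from google.colab.patches is Colab's replacement for cv2.imshow, which does not work inside notebooks, and imutils may need a quick !pip install imutils first:

import cv2                                   # OpenCV itself
import numpy as np                           # pixel arrays
import imutils                               # resize/rotate convenience helpers
from google.colab.patches import cv2_imshow  # notebook-friendly image display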

Step 8: Read the captured image using "imread", store it in a variable called "image", and display it using "cv2_imshow".
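
Continuing in the notebook:

image = cv2.imread('photo.jpg')  # the photo captured in step 4
cv2_imshow(image)                # display it in the notebook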

Step 9: Convert the image into an array of pixels.
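
This is the same call shown earlier; imread already returns a NumPy array, so this mainly pins down the dtype (the variable name pixels is just for illustration):

pixels = np.asarray(image, dtype=np.uint8)  # unsigned 8-bit values, 0 to 255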

Step 10: Find the contours using the grayscale image.
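
A sketch of this step (OpenCV 4.x return signature assumed), continuing from the image variable above; the variable names are illustrative:

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)                  # grayscale version
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)  # binary image
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)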

Step 11: Load the predefined Cascade Classifier to detect the face.

Upload the XML file to your Colab session.

This XML file contains the pre-trained parameters for face detection. There are many other cascade classifiers for detecting different objects too.
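
Assuming the uploaded file is named haarcascade_frontalface_default.xml (the variable name face_cascade is just for illustration):

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# The same cascade also ships with the pip package, so this works without uploading:
# face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')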

Step 12: Detect the face in the image (a sketch of the call follows the list below).

  1. The detectMultiScale function is a general function that detects objects.
  2. The first argument is the grayscale image.
  3. The second is the scaleFactor. Faces closer to the camera appear bigger than faces in the back; the scale factor compensates for this by controlling how much the image is scaled down at each pass of the search.
  4. The detection algorithm uses a moving window to detect objects. minNeighbors defines how many neighboring detections a candidate needs before it is accepted as a face, and minSize sets the smallest face size that will be considered.
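
Putting those parameters together, the call looks roughly like this (the exact values are tunable; the variable names follow the sketches above):

faces = face_cascade.detectMultiScale(
    gray,                # grayscale image from step 10
    scaleFactor=1.1,     # shrink the image by 10% at each scale of the search
    minNeighbors=5,      # neighbors required before a candidate counts as a face
    minSize=(30, 30)     # ignore detections smaller than 30x30 pixels
)
print('Found {} face(s)'.format(len(faces)))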

Step 13: Draw a rectangle around the detected face.
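
Each detection comes back as an (x, y, w, h) box, so drawing the rectangles is a short loop:

for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)  # green box, 2 px thick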

Step 14: Finally, display the result
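
And finally:

cv2_imshow(image)  # the captured photo with the detected face outlined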

Take a look at the whole code:

https://github.com/manish2509/Face-Detection

Happy Learning!

-Manishma Sundararajan
