Learning the Basics of Practical AI Development with Face Recognition Library 'InsightFace' (3)

Implementing ArcFaceONNX with insightface for Face Recognition

In our last course, we implemented face detection. This time, let's implement face recognition.

Click here for the previous course

Preparation for Face Detection

The code for face detection is as follows. It is the same as in the previous course. The model path is arbitrary, so please change it according to your environment.

face detection code
import cv2
import numpy as np
import insightface
from insightface.app.common import Face

# Load the detection model for face detection
detector = insightface.model_zoo.get_model("models/buffalo_l/det_10g.onnx")  # change to any path you like
detector.prepare(ctx_id=-1, input_size=(640, 640))

Preparation for Face Recognition

We use insightface to load the recognition model. The model path is arbitrary, so please change it according to your environment.

face recognition code
# Prepare the recognition model
ONNX_MODEL_PATH = "models/buffalo_l/w600k_r50.onnx"
# Use insightface to load the model
model = insightface.model_zoo.get_model(ONNX_MODEL_PATH)
model.prepare(ctx_id=-1)  # ctx_id is the computing device id (-1 = CPU)

Creating a Function to Infer Vectors for Face Detection and Recognition from Images

The image is passed as a file path argument, and inside the function, it is loaded with cv2.imread(). This function assumes, for convenience, that there is always one face in the image. If there is no face in the image or if you want to detect multiple faces, you need to modify the code accordingly.

face recognition code
def get_embedding_from_img_path(img_path):
    rgb_img = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGB)
    bboxes, kpss = detector.detect(rgb_img)
    if bboxes.shape[0] == 0:
        return None
    i = 0  # take only the first face
    bbox = bboxes[i, 0:4]
    det_score = bboxes[i, 4]
    kps = None
    if kpss is not None:
        kps = kpss[i]
    face = Face(bbox=bbox, kps=kps, det_score=det_score)
    embedding = model.get(rgb_img, face)
    return embedding

What is a Vector (embedding) for Face Recognition?

By the way, what exactly is the vector (embedding) that the face recognition model infers in the function above? To understand this, it is important to know the mechanism of face recognition. Face recognition is performed in the following steps.

  1. Detecting the face from the image
  2. Comparing the detected face with another face
  3. If the similarity exceeds a certain threshold, it is judged as the same person

Method of Comparing Faces

What becomes crucial here is the second step: comparing face images. Computers cannot compare face images directly, so the images must first be converted into numbers a computer can work with. This is the role of the high-dimensional vector that encodes facial features. Although 'high-dimensional vector' may sound complicated, it is essentially just an array of numbers, such as [5, 6, 8, 9, 1, ...]; a 512-dimensional vector contains 512 numbers. Comparing vectors with that many dimensions is hard to picture, so it helps to think in two dimensions, such as [2, 4]. This means [x, y] in the two-dimensional (2D) world, where x is horizontal and y is vertical. For example, given the coordinates A[2, 2], B[90, 71], and C[7, 5], it is easy to see that C is closer to A than B is. By calculating the distance between vectors in this way, closeness can be treated as similarity.
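The 2D example above can be checked with a few lines of NumPy. This is just a sketch; the points A, B, and C are the coordinates from the text:

```python
import numpy as np

# 2D coordinates from the example above
A = np.array([2, 2])
B = np.array([90, 71])
C = np.array([7, 5])

# Euclidean distance between vectors: the smaller, the closer
dist_ab = np.linalg.norm(A - B)
dist_ac = np.linalg.norm(A - C)

print(f"A-B: {dist_ab:.2f}")  # ≈ 111.83
print(f"A-C: {dist_ac:.2f}")  # ≈ 5.83 -> C is far closer to A than B is
```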

Calculating the Similarity of Faces

To calculate the similarity of faces, we will use cosine similarity this time. Cosine similarity is a metric based on the angle between two vectors; it takes values from -1 to 1, and a value closer to 1 indicates higher similarity. It is calculated as follows:

face recognition code
def calc_cos_sim(embedding1, embedding2):
    return np.dot(embedding1, embedding2) / (np.linalg.norm(embedding1) * np.linalg.norm(embedding2))

This function takes two vectors as arguments, calculates their dot product, and divides it by the product of their norms (lengths). This is the cosine similarity.

Cosine Similarity

Cosine similarity measures the cosine of the angle between vectors to quantify how similar they are. It does not depend on the length of the vectors, only on their direction: if two vectors point in exactly the same direction, the cosine similarity is 1; if they point in opposite directions, it is -1. When the magnitude of the vectors also matters, 'Euclidean distance' is often used instead, since it takes both the magnitude and the direction of the vectors into account.
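To make the direction-versus-length distinction concrete, here is a small sketch (the vectors are made up for illustration) comparing cosine similarity and Euclidean distance for two vectors that point the same way but differ in length:

```python
import numpy as np

def calc_cos_sim(embedding1, embedding2):
    return np.dot(embedding1, embedding2) / (np.linalg.norm(embedding1) * np.linalg.norm(embedding2))

v = np.array([1.0, 2.0, 3.0])
w = 10 * v  # same direction, ten times the length

print(round(calc_cos_sim(v, w), 6))   # 1.0: length is ignored
print(round(calc_cos_sim(v, -v), 6))  # -1.0: opposite direction
print(np.linalg.norm(v - w))          # ≈ 33.67: Euclidean distance does depend on length
```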

Though the explanation has become quite complex, you don't need to fully understand everything right now. What's important is to grasp that after extracting a face image, AI (using the ArcFaceONNX model) converts the face into a vector, and then uses cosine similarity for comparison. This is a basic method used in numerous AI applications.

Try Face Recognition with the Created Code

Let's recognize whether two images of faces are of the same person. As always, please adjust the path to the image files according to your environment.

face recognition code
# Input image 1
embedding1 = get_embedding_from_img_path("data/images/maya/maya_1.jpg")
# Input image 2
embedding2 = get_embedding_from_img_path("data/images/maya/maya_3.jpg")
# Calculate similarity
cos_sim = calc_cos_sim(embedding1, embedding2)
# Display the result
print(cos_sim)
# If the similarity is 0.6 or higher, it is judged to be the same person

With just this much code, face authentication can be implemented. The decision threshold for judging whether it is the same person is set at 0.6 here, but this may need to be adjusted depending on the actual application.
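The threshold decision can be wrapped in a small helper. This is a sketch only: the function name `is_same_person` and the toy 4-dimensional vectors are made up for illustration, and real embeddings from w600k_r50.onnx are 512-dimensional:

```python
import numpy as np

SAME_PERSON_THRESHOLD = 0.6  # the threshold from the text; tune it for your application

def is_same_person(embedding1, embedding2, threshold=SAME_PERSON_THRESHOLD):
    """Judge two face embeddings as the same person if their cosine similarity meets the threshold."""
    cos_sim = np.dot(embedding1, embedding2) / (
        np.linalg.norm(embedding1) * np.linalg.norm(embedding2)
    )
    return cos_sim >= threshold

# Toy vectors standing in for real 512-dimensional embeddings
a = np.array([0.9, 0.1, 0.2, 0.3])
b = np.array([0.8, 0.2, 0.3, 0.2])    # similar direction -> judged the same person
c = np.array([-0.9, 0.5, -0.1, 0.4])  # very different direction

print(is_same_person(a, b))  # True
print(is_same_person(a, c))  # False
```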

Summary of Session 3

In this session, we learned to implement face authentication by dividing it into three modules: 'detecting the face', 'vectorizing the face image', and 'comparing the vectors'. This is the basic pipeline of face authentication and a fundamental method used in many AI systems. This course used the insightface library, which provides pre-trained models such as det_10g.onnx for face detection and w600k_r50.onnx for image vectorization, allowing face authentication to be implemented quickly. In real-world problem solving, however, pre-trained models do not always exist, and it is often necessary to build and train your own. In future courses, we intend to learn how to train models ourselves.