Development of an Enhanced Human Face Recognition Model

1 Introduction

Despite the great advances in computer vision research, face identification still poses many challenges because of typical problems such as facial masks, pose variations, lighting changes, partial occlusions, aging, and changes in appearance [1, 2]. Many approaches have been devised to overcome these difficulties, but they still exhibit various limitations. As a result, the topic remains active for further research.

A face recognition system consists of three critical processes: detection, feature extraction, and classification [3]. Generally, a key issue in face recognition is finding an efficient representation of the facial image as a feature vector. Numerous approaches have been developed in recent years with useful schemes for detecting and recognizing the features of human faces [4,5,6]. Feature extraction is the key element in face recognition, and current recognition methods employ diverse extraction strategies.

Face recognition has been one of the most important applications of biometrics-based authentication systems in the last few decades. It is a pattern recognition task in which a face is categorized as either known or unknown after comparing it with the images of known persons stored in a database. Face recognition is challenging because of the variability in facial information: random variations across different people as well as systematic variations from factors such as lighting conditions and pose.

Face recognition has progressed significantly in recent years due to its many applications, primarily in forensic science, driver's licenses, passport verification, surveillance systems, social networks, access control, and locating missing persons.

Therefore, this paper aims to develop an enhanced face recognition model with a high degree of accuracy, reduced processing time, and a low memory consumption rate.

2 Related Works

In spite of the success rates achieved in recognizing human faces, effective and efficient analysis of facial features is still a difficult task that requires a lot of time and effort. The fundamental problems associated with successful face detection and recognition systems include illumination conditions, scale, occlusion, pose, background, and expression [7].

Various algorithms and methods have been proposed to address these challenges, and most achieve promising performance. Nevertheless, most existing techniques are computationally expensive due to the high dimensionality of facial images, which imposes a burden in terms of processing speed and memory consumption. Brief descriptions of the problems associated with existing face recognition models follow:

  • a) Pose: This concerns out-of-plane rotation or images that do not conform to a pre-defined specification of image properties, which can reduce recognition accuracy [8].

  • b) Occlusion: This is a situation where an image is obstructed or distorted by sunglasses, masks, scarves, or hats, or by shaky camera handling during image capture, which can lead to lost or blurred features. However, local region-based methods show some competence in mitigating the problems associated with partial occlusion [9].

  • c) Illumination: Variations due to excess or low light density can distort the quality of an image, so it is expedient to develop algorithms capable of handling the illumination variations that cause recognition difficulty. Some techniques address this problem [10], but illumination disparity remains a challenge in uncontrolled settings such as 24-hour surveillance [11].

  • d) Expression: Expressions distort the features of the face and affect recognition accuracy; opening and closing the eyes, smiling, and frowning are a few examples. The texture of a local region changes with the expression on the face. Local region-based or patch-based methods that use histograms of features have been applied successfully to expression-invariant face recognition [11].

  • e) Age: Facial features vary from infancy through adolescence and adulthood. This is an issue even in controlled face recognition because passport and visa facial images are not frequently updated. The issues associated with age progression in face recognition have been addressed [12]; however, the challenge remains when the age difference is significant.

In this research, an enhanced model is proposed to address the computational complexity caused by the high dimensionality of facial images, thereby increasing processing speed and reducing memory consumption. To achieve this, the feature extraction process must be efficient in terms of computing time and memory usage.

3 Method

A digital camera was used to capture six facial images from each of 60 persons, yielding 360 images in total. Of these, 240 were used for training, while the remaining 120 were used for testing. The colored images were converted to grayscale during preprocessing. To capture both linear and non-linear features, as seen in Algorithm A, the input space is transformed non-linearly into a high-dimensional feature space into which each input variable is mapped, and Principal Component Analysis (PCA) is then performed. Executing PCA in the high-dimensional feature space obtains high-order statistics of the input variables. Kernel tricks are then employed to reduce complexity by computing the dot products in the original low-dimensional input space by means of a kernel function, as seen in Line 4 of Algorithm A.
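
For illustration, the following sketch shows how the kernel trick referenced above replaces explicit mapping into the feature space with kernel evaluations in the original input space. It is a minimal sketch in Python (the paper's own simulation used MATLAB), and the radial basis function (RBF) kernel and all function names are assumptions; the paper does not name the kernel function it uses.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1e-4):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] - 2.0 * X @ X.T + sq_norms[None, :]
    return np.exp(-gamma * sq_dists)

def kernel_pca(X, k, gamma=1e-4):
    """Project the M samples (rows of X) onto the top-k kernel principal components.

    The mapping into the high-dimensional feature space is never computed
    explicitly; only dot products via the kernel function are needed."""
    M = X.shape[0]
    K = rbf_kernel_matrix(X, gamma)
    # Center the kernel matrix in feature space: K_c = K - 1M.K - K.1M + 1M.K.1M
    one_m = np.full((M, M), 1.0 / M)
    K_c = K - one_m @ K - K @ one_m + one_m @ K @ one_m
    # Eigendecomposition of the symmetric centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(K_c)
    idx = np.argsort(eigvals)[::-1][:k]                     # largest-k eigenpairs
    alphas = eigvecs[:, idx] / np.sqrt(np.maximum(eigvals[idx], 1e-12))
    return K_c @ alphas                                     # M x k projected features
```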

Images were normalized by calculating the average face vector and subtracting it from each face vector. This process removed unwanted attributes or properties from the images.

Algorithm A: Reducing Dimensionality

  • Create a training set consisting of M images in total, each of size N×N

  • Map each input variable into a high-dimensional feature space F, and then perform PCA:

    Φ : ℝ^N → F, x_i ↦ Φ(x_i), i = 1, 2, …, M

  • Convert the face images in the training set to face vectors, denoted X_i

  • Convert each face vector into a lower-dimensional representation with K eigenvectors: U_i = A V_i, where U_i is the i-th vector in the higher-dimensional space and V_i is the i-th vector in the lower-dimensional space

  • Normalize the face vectors: first calculate the average face vector ψ, then subtract it from each face vector, giving the normalized face vector Φ_i = X_i − ψ

  • Find the eigenfaces with the help of the covariance matrix C = AA^T, where A = {Φ_1, Φ_2, …, Φ_M}

  • Calculate the eigenvectors of this matrix with reduced dimensionality

  • Select the K best eigenfaces, where K < M, such that they can represent the whole training set

  • Represent each face image as a linear combination of the K eigenvectors: each face can be represented as a weighted sum of the K eigenfaces plus the mean (average) face
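
As a concrete illustration of Algorithm A's eigenface steps, the sketch below stacks the face vectors, subtracts the average face ψ, and obtains the K best eigenfaces by eigendecomposing the small M×M matrix ΦΦ^T rather than the huge N²×N² covariance, which is the dimensionality-reduction trick at work. This is a minimal Python sketch, not the authors' implementation; function names are illustrative.

```python
import numpy as np

def train_eigenfaces(images, K):
    """images: list of M grayscale arrays of size N x N; returns the K eigenfaces,
    the mean face, and the K-dimensional weight vectors of the training set."""
    X = np.stack([img.astype(np.float64).ravel() for img in images])  # M x N^2 face vectors X_i
    psi = X.mean(axis=0)                       # average face vector (psi in Algorithm A)
    Phi = X - psi                              # normalized face vectors Phi_i = X_i - psi
    # Eigendecompose the small M x M surrogate of C = A A^T
    L = Phi @ Phi.T
    eigvals, V = np.linalg.eigh(L)
    order = np.argsort(eigvals)[::-1][:K]      # K best eigenfaces, K < M
    U = Phi.T @ V[:, order]                    # map back to the higher-dim space: U_i = A V_i
    U /= np.linalg.norm(U, axis=0)             # unit-length eigenfaces, N^2 x K
    weights = Phi @ U                          # each face as K weights over the eigenfaces
    return U, psi, weights

def reconstruct(weight_row, U, psi):
    """A face is a weighted sum of the K eigenfaces plus the mean face."""
    return psi + U @ weight_row
```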

After reducing the dimensionality of the images, features are extracted by dividing and labeling the images into within-class and between-class groups. Within-class captures the variations among images of the same individual, while between-class captures the variations across the classes of different individuals, as seen in Line 2 of Algorithm B, where S_m^n is the m-th sample of class n and u_n is the mean of class n. Furthermore, C is the number of classes, L_n is the number of samples in class n, and u is the mean of all the classes.

Algorithm B: Feature Extraction

  • Compute the d-dimensional mean vectors for the different classes from the dataset

  • Compute the scatter matrices (within-class and between-class scatter matrices).

    The within-class scatter matrix S_W is: S_W = \sum_{n=1}^{C} \sum_{m=1}^{L_n} (S_m^n - u_n)(S_m^n - u_n)^T

  • The between-class scatter matrix S_B is given by: S_B = \sum_{n=1}^{C} L_n (u_n - u)(u_n - u)^T

  • Compute the eigenvectors (e_1, e_2, …, e_d) and corresponding eigenvalues (λ_1, λ_2, …, λ_d) of the scatter matrices

  • Sort the eigenvectors by decreasing eigenvalue and choose the k eigenvectors with the largest eigenvalues to form a d×k matrix W (where every column represents an eigenvector)

  • Use the d×k eigenvector matrix W to transform the samples into the new subspace: y = W^T x (where x is a d×1 vector representing one sample and y is the transformed k×1 sample in the new subspace)
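
The scatter-matrix computation and projection in Algorithm B can be sketched as follows, assuming the standard Fisher discriminant formulation; the use of a pseudo-inverse for S_W (to tolerate singularity) and the function name are illustrative details the paper does not specify.

```python
import numpy as np

def lda_transform(X, labels, k):
    """X: M x d sample matrix, labels: length-M class ids; returns the
    d x k projection matrix W and the transformed k-dimensional samples."""
    labels = np.asarray(labels)
    d = X.shape[1]
    u = X.mean(axis=0)                           # mean of all classes
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for n in np.unique(labels):
        X_n = X[labels == n]                     # samples S_m^n of class n
        u_n = X_n.mean(axis=0)                   # class mean u_n
        D = X_n - u_n
        S_W += D.T @ D                           # within-class scatter
        diff = (u_n - u)[:, None]
        S_B += len(X_n) * (diff @ diff.T)        # between-class scatter, L_n-weighted
    # Eigenvectors of S_W^{-1} S_B, sorted by decreasing eigenvalue
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1][:k]
    W = eigvecs[:, order].real                   # d x k matrix W
    return W, X @ W                              # y = W^T x for every sample
```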

The enhanced model's performance was evaluated based on sensitivity (i.e., true positive rate), specificity, false positive rate (FPR), and recognition accuracy, as depicted in Equations 1–4:

(1) Sensitivity = TP / (TP + FN)

(2) Specificity = TN / (TN + FP)

(3) FPR = FP / (TN + FP)

(4) Accuracy = (TP + TN) / (TP + TN + FP + FN)
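
For completeness, a minimal sketch showing how Equations 1–4 follow from raw confusion-matrix counts:

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Compute the four metrics of Equations 1-4 from TP/TN/FP/FN counts."""
    sensitivity = tp / (tp + fn)                  # Eq. 1: true positive rate
    specificity = tn / (tn + fp)                  # Eq. 2
    fpr = fp / (tn + fp)                          # Eq. 3: equals 1 - specificity
    accuracy = (tp + tn) / (tp + tn + fp + fn)    # Eq. 4
    return sensitivity, specificity, fpr, accuracy
```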

The simulation was executed in MATLAB (R2018a), installed on a 64-bit Windows operating system with sufficient processing speed, memory, and disk space. Figure 1 shows the datasets and their preprocessing operations.

Figure 1. Acquired images and preprocessing operations

4 Results

The results obtained from the enhanced model are evaluated and presented in this section. The time spent training the dataset is shown in Table 1. Training time increases with the dimension size of the images, which implies that the time consumed depends on the features in the training set. After three trials of Algorithm A, the average training time was 97.66 s for images at 100 by 100-pixel resolution and 139.54 s at 200 by 200-pixel resolution, as presented in Table 1. These results show that the enhanced model used less time to train the images than existing models.

Table 1. Time Spent Training the Datasets

Dimension Size | Time 1 (s) | Time 2 (s) | Time 3 (s) | Ave. Time (s)
100 by 100     | 100.50     | 93.21      | 99.28      | 97.66
200 by 200     | 132.12     | 144.31     | 143.20     | 139.54

Table 2 presents the results obtained by the enhanced model at threshold values of 0.1, 0.2, 0.3, 0.4, and 0.5 with respect to the performance metrics. The table reveals that performance varies with the threshold value: accuracy and specificity increased as the threshold increased, while the false positive rate and sensitivity decreased. The optimum performance was achieved at a threshold of 0.50. The table also shows that recognition time remained within the range of 21.01 to 24.06 seconds across the threshold values.

Table 2. Results of the Enhanced Model at Different Threshold Values

Threshold Value | Accuracy (%) | Sensitivity (%) | Specificity (%) | FPR (%) | Recognition Time (s)
0.1             | 95.28        | 99.10           | 81.10           | 19.00   | 22.00
0.2             | 95.27        | 99.10           | 81.10           | 19.00   | 24.06
0.3             | 96.94        | 99.00           | 87.78           | 12.22   | 22.02
0.4             | 97.78        | 98.88           | 94.44           | 5.56    | 23.03
0.5             | 98.61        | 98.89           | 97.78           | 2.22    | 21.01

The proposed model is less computationally expensive because it applies a reduced-dimensionality model for improved performance. The training-time results show that additional features in the training set lead to longer training times.

5 Conclusion and Future Work

The results revealed that the recognition accuracy, false positive rate, sensitivity, specificity, and computation time exhibited by the enhanced model are greatly improved. Therefore, a face recognition system based on the enhanced model would produce more reliable results than existing ones, and it should be considered when building a truly robust face recognition system in which high recognition accuracy and computational efficiency must not be compromised. An enhanced recognition model with high accuracy, reduced complexity, and reduced time consumption has been developed. Future work will focus on the implementation of the proposed model.

DOI: https://doi.org/10.2478/ias-2024-0004

© 2024 Rume Elizabeth Yoro, Oluwatolani Achimugu, Philip Achimugu, Olalekan Sunday Damilare, Monday Abutu Idakwo, published by Cerebration Science Publishing Co., Limited
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.