Gender Classification Of Facial Images Using CNN In Python.

10 min read · Feb 22, 2021

I was inspired to build this gender classification project because I often find it very hard to tell a baby's gender from their face alone, and that really interests me. It is not only babies: with some adults too, for example some Asian men in my experience, it can be difficult to tell from facial features alone whether the person is male or female. So I wondered whether I could build a model that learns from thousands of images and predicts gender efficiently, perhaps approaching the accuracy we humans achieve in real life. Let's get straight to the project, starting with a brief discussion of convolutional neural networks applied to gender classification.

An approach using a convolutional neural network (CNN) is proposed for real-time gender classification based on facial images. The proposed CNN architecture is much less complex in its design than other CNN solutions applied to pattern recognition. The number of processing layers in this CNN is reduced to just four by combining the convolution and sub-sampling layers. Unlike conventional CNNs, we replace the convolution operation with cross-correlation, thereby reducing the computational load.
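To make the difference between the two operations concrete, here is a minimal NumPy sketch (not the code used in this project). Cross-correlation slides the kernel over the image as it is, while true convolution flips the kernel on both axes first. The function names, the 4 by 4 image and the 2 by 2 kernel are made up for illustration. As far as I know, the "convolution" layers in most deep learning libraries, Keras included, actually compute cross-correlation.

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """Valid-mode 2-D cross-correlation: slide the kernel as-is over the image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window and the kernel, summed
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d(image, kernel):
    """Valid-mode 2-D convolution: cross-correlation with the kernel flipped on both axes."""
    return cross_correlate2d(image, kernel[::-1, ::-1])

# Made-up 4x4 "image" and 2x2 kernel, for illustration only
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

xcorr = cross_correlate2d(image, kernel)
conv = convolve2d(image, kernel)
print(xcorr)
print(conv)
```

Because the kernel weights are learned during training anyway, skipping the flip does not change what the network can represent.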

A convolutional neural network is a neural network variant that consists of fused convolutional and sub-sampling layers, ending with one or more fully connected layers as in a standard multilayer perceptron (MLP). It can extract features, reduce data dimensionality, and classify, all within one network structure.

Humans are a great example of a neural machine; in fact, we use the human brain as the simplest layman's example when explaining how neural networks work. The fun part of human nature is our ability to look at multiple images at once and process them without knowing how the processing was done ("God is the greatest"), but it is different with machines. To a machine, every image is an arrangement of dots (pixels) in a particular order, so changing the order of the pixels changes the image.

There are three basic components of a convolutional neural network:
1. Convolution layer
2. Pooling layer
3. Fully connected layer.
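The convolution step is a sliding window over the image; the other two components can be sketched in a few lines of NumPy. This is only an illustration under made-up numbers: the 4 by 4 feature map, the 2 by 2 pooling window and the constant dense weights are all assumptions, not values from this project.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; assumes height and width are divisible by `size`."""
    h, w = feature_map.shape
    # Split the map into (size x size) blocks, then take the maximum of each block
    blocks = feature_map.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Made-up 4x4 feature map, as if produced by a convolution layer
feature_map = np.array([
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [2.0, 6.0, 3.0, 4.0],
])

pooled = max_pool2d(feature_map)   # pooling layer: 4x4 -> 2x2
flat = pooled.reshape(-1)          # flatten: 2x2 -> 4 values

# Fully connected layer: every input value feeds every output node.
# The weights here are made-up constants; in a real network they are learned.
weights = np.full((4, 2), 0.5)
bias = np.zeros(2)
outputs = flat @ weights + bias
print(pooled)
print(outputs)
```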

So that's a brief introduction to gender classification using a convolutional neural network. Let's get straight to our project.

IMPORTING LIBRARIES

Keras Tuner is a hyperparameter tuning library that helps determine the right number of filters, filter size, learning rate, dropout rate, etc. for the model, which saves us from tuning hyperparameters manually. Keras Tuner is not limited to CNNs; it can also be used with ANNs and RNNs.

These are all the libraries required to run the code we'll be using in this project.

IMPORTING DATASET

We have successfully imported our dataset into our Colab notebook. It was initially uploaded to Google Drive, so we mounted the drive in Colab so that Colab could locate and read the dataset. The dataset can be obtained from Kaggle. It is the result of preprocessing and cleaning the UTK dataset and many other image datasets. It consists of 23,705 grayscale facial images of humans, labelled by age, gender and ethnicity, with each image having a resolution of 32 by 32 pixels.
Note: although they are images, they are loaded as NumPy arrays and not as binary image objects.

Now that we can see our dataset, we're mainly interested in the third and the last columns for this project, as they'll serve as the label and image columns respectively.

EXPLORING THE DATA

The pixels column in the dataset will serve as our main image column. It should contain NumPy arrays of pixels, but each row in this column is a string, so we need to convert each row to a NumPy array and reshape the whole column into a four-dimensional array. I created a function that does this, producing an array of shape (number of rows, 32, 32, 1): the number of rows is the total number of images, which here is 23,705; 32 by 32 is the height and width in pixels; and 1 indicates a grayscale image with only one channel. If it were an RGB image, it would have three channels.
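A conversion function of the kind described above could look like the sketch below. It is not the exact function from my notebook: the name pixels_to_array and its side parameter are hypothetical, and it assumes the pixel values in each string are separated by spaces. The demo uses two tiny fake 2 by 2 "images" so the result is easy to check by eye; for the real data you would pass the pixels column and leave side at 32.

```python
import numpy as np

def pixels_to_array(pixel_rows, side=32):
    """Convert an iterable of space-separated pixel strings into an array
    of shape (n_images, side, side, 1)."""
    # Parse every string into a list of floats, one list per image
    images = np.array(
        [[float(value) for value in row.split()] for row in pixel_rows],
        dtype="float32",
    )
    # -1 lets NumPy infer the number of images; the trailing 1 is the gray channel
    return images.reshape(-1, side, side, 1)

# Two fake 2x2 images, for illustration only
rows = ["0 255 128 64", "10 20 30 40"]
images = pixels_to_array(rows, side=2)
print(images.shape)
```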

We further explored the data by visualizing the frequency of the two genders in the data, and we could infer that there are more males than females in the data.

The output is below:
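The frequency count behind such a plot can be sketched with NumPy alone. The labels below are made up for illustration, and which integer stands for which gender depends on how the dataset encodes it, so treat the 0 and 1 here as placeholders.

```python
import numpy as np

# Made-up gender labels; the real ones come from the dataset's gender column
labels = np.array([0, 1, 0, 0, 1, 0])

# Count how many times each distinct label occurs
values, counts = np.unique(labels, return_counts=True)
frequency = dict(zip(values.tolist(), counts.tolist()))
print(frequency)
```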

Now let's split the dataset into X and Y, which are our image data and label data respectively.

We proceed by splitting our data into training and testing sets. We then split the testing data again, using the same cut size, to get our validation set. Both splits use the train_test_split function from the sklearn library with a constant cut size (test size).
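The two-stage split described above can be sketched as follows. The cut size of 0.25, the random_state and the fake data are assumptions made for this example, and so is the choice of which part of the second split becomes the validation set.

```python
import numpy as np
from sklearn.model_selection import train_test_split

CUT_SIZE = 0.25  # hypothetical value, used for both splits

# Fake data: 160 samples with one feature each, and alternating 0/1 labels
X = np.arange(160).reshape(160, 1)
y = np.arange(160) % 2

# First split: training set versus everything else
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=CUT_SIZE, random_state=42
)

# Second split: cut the remainder again with the same size.
# Assumption: the smaller part becomes the validation set.
X_test, X_val, y_test, y_val = train_test_split(
    X_rest, y_rest, test_size=CUT_SIZE, random_state=42
)

print(len(X_train), len(X_test), len(X_val))
```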

In neural networks, class labels generally need to be in vectorized (one-hot encoded) form so that they match the shape of the network's output. That is why we use the to_categorical function from keras.utils to convert the label data to categorical form.
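To show what the categorical form looks like, here is a NumPy stand-in that produces the same kind of one-hot array that to_categorical returns for integer labels. The helper name one_hot and the sample labels are made up for this sketch.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows, e.g. 1 -> [0, 1]."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype="float32")[labels]

encoded = one_hot([0, 1, 1, 0], num_classes=2)
print(encoded)
```

Each row now has one entry per class, which lines up with an output layer that has one node per class.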

We continue our data preprocessing by passing the training, testing and validation image sets into the function I created previously. It converts each row of each of the three sets into an array of pixel data.

Since the pixel values in train_images, test_images and val_images range from 0 to 255, we need to rescale them to the range 0 to 1. To do this, we divide all three sets of images by 255. It's important that the training, validation and testing sets are preprocessed in the same way:
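A minimal sketch of this rescaling step, using tiny made-up arrays in place of the real image sets. Putting the division in one helper (the name rescale is hypothetical) is one way to make sure all three sets get exactly the same treatment.

```python
import numpy as np

def rescale(images):
    """Map pixel values from the range 0..255 to the range 0..1."""
    return images.astype("float32") / 255.0

# Tiny made-up stand-ins for the three image sets
train_images = np.array([[0, 51, 255]])
val_images = np.array([[102, 153, 204]])
test_images = np.array([[255, 0, 51]])

# Apply the same preprocessing to every set
train_images, val_images, test_images = (
    rescale(images) for images in (train_images, val_images, test_images)
)
print(train_images)
```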

Let's visualize a few samples of the image data so that we can clearly see the actual images we are talking about. The images are in grayscale (black and white) form. This is also a way to confirm that the data is ready for training.

Here are facial images wrangled from different data sources, featuring some celebrities that we're familiar with, e.g. Will Smith, Kevin Hart, etc.

DEFINE THE MODEL

As you can observe from above, the images are in black and white. Now that we have completed the preprocessing section, we'll move on to training our model, a CNN, with the help of Keras Tuner. As I mentioned initially, it will save us the time of tuning our hyperparameters manually. Here's the link to the official documentation for further understanding of how it works: https://keras-team.github.io/keras-tuner/ . Here's how to perform hyperparameter tuning for our network using random search:

First, we define a model-building function. It takes an argument hp from which you can sample hyperparameters, such as hp.Int('units', min_value=32, max_value=512, step=32) (an integer from a certain range). This function returns a compiled model.

This helps in selecting the best number of filters from the range of values passed. The same method applies to the kernel size (filter size). We added another convolution layer to the network, followed by a keras.layers.Flatten layer, which transforms the multi-dimensional output of the last convolution layer into a 1-D array. For comparison, flattening a single 32 by 32 pixel image would give 32 * 32 = 1024 values. The layer only unstacks the rows of values and lines them up as one single array. After flattening, the network consists of a sequence of two keras.layers.Dense layers. These are densely connected, or fully connected, neural layers. Again, the number of nodes in the first dense layer is a hyperparameter chosen by the RandomSearch tuner from Keras Tuner, and the ReLU activation function is used. The last dense layer is the output layer. Since we're predicting two classes, I used two nodes and the sigmoid activation function, which is used for binary classification.
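A model-building function with the architecture described above might look like the following sketch. It is not the exact code from my notebook: the hyperparameter names, their ranges, the Adam optimizer and the learning-rate choices are assumptions. hp.Int and hp.Choice are real Keras Tuner methods, but so that the sketch runs without a tuner, it is called here with a small hypothetical stand-in object (FirstValueHP) that always returns the first allowed value.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    """Build and compile a small CNN whose sizes are sampled from `hp`."""
    model = keras.Sequential([
        keras.Input(shape=(32, 32, 1)),  # 32x32 grayscale images
        layers.Conv2D(
            filters=hp.Int("conv_1_filters", min_value=32, max_value=128, step=16),
            kernel_size=hp.Choice("conv_1_kernel", values=[3, 5]),
            activation="relu",
        ),
        layers.Conv2D(
            filters=hp.Int("conv_2_filters", min_value=32, max_value=64, step=16),
            kernel_size=hp.Choice("conv_2_kernel", values=[3, 5]),
            activation="relu",
        ),
        layers.Flatten(),  # feature maps -> one long vector
        layers.Dense(
            units=hp.Int("dense_units", min_value=32, max_value=128, step=16),
            activation="relu",
        ),
        layers.Dense(2, activation="sigmoid"),  # two output nodes, one per class
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(
            learning_rate=hp.Choice("learning_rate", values=[1e-2, 1e-3])
        ),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


class FirstValueHP:
    """Hypothetical stand-in for the tuner's `hp` object: always picks the
    first allowed value, so the function can be tried without a tuner."""

    def Int(self, name, min_value, max_value, step=1):
        return min_value

    def Choice(self, name, values):
        return values[0]


model = build_model(FirstValueHP())
model.summary()
```

In a real search, the tuner passes its own hp object to build_model and records which values it sampled for each trial.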

COMPILE THE MODEL

Before the model is ready for training, it needs a few more settings. These are added during the model’s compile step:

Loss function: this measures how accurate the model is during training. You want to minimize this function to "steer" the model in the right direction. Here we will use binary cross-entropy ("binary_crossentropy" in Keras).
Optimizer: this is how the model is updated based on the data it sees and its loss function.
Metrics: used to monitor the training and testing steps. The following example uses accuracy, the fraction of the images that are correctly classified.
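To give a feel for what the loss measures, here is a small NumPy sketch of binary cross-entropy, averaged over all outputs. The function and the sample predictions are made up for illustration; in the project the loss is computed by Keras. Confident correct predictions give a small loss and confident wrong ones give a large loss.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log(0) can never occur
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    per_output = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(per_output))

# Same targets, two made-up sets of predictions
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```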

Next, we instantiate a tuner. We specify the model-building function, the name of the objective to optimize (whether to minimize or maximize is automatically inferred for built-in metrics), the total number of trials (max_trials) to test, and the number of models that should be built and fit for each trial (executions_per_trial).

Available tuners include RandomSearch and Hyperband.

Note: the purpose of having multiple executions per trial is to reduce the variance of the results and therefore to assess the performance of a model more accurately. If you want to get results faster, you could set executions_per_trial=1 (a single round of training for each model configuration). In our case it was left at the default (executions_per_trial=1).

Then we start the search for the best hyperparameter configuration. The call to search has the same signature as model.fit. Here's what happens in search: models are built iteratively by calling the model-building function, which populates the hyperparameter space (search space) tracked by the hp object. The tuner progressively explores the space, recording metrics for each configuration. When the search is over, we retrieve the best model(s):
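The search loop described above can be illustrated in plain Python. This is a toy sketch of the random search idea, not the Keras Tuner implementation: the function names, the search space and the scoring function are all made up, and toy_score simply stands in for "train the model and return its validation accuracy".

```python
import random

def random_search(build_and_score, search_space, max_trials,
                  executions_per_trial=1, seed=0):
    """Try `max_trials` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        # Sample one value for every hyperparameter in the search space
        config = {name: rng.choice(options) for name, options in search_space.items()}
        # Repeat each configuration and average the scores to reduce variance
        scores = [build_and_score(config) for _ in range(executions_per_trial)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_config, best_score = config, mean_score
    return best_config, best_score

def toy_score(config):
    """Made-up stand-in for training a model and measuring validation accuracy."""
    return 1.0 - abs(config["filters"] - 64) / 100 - abs(config["kernel_size"] - 3) / 10

search_space = {"filters": [32, 48, 64, 80, 96], "kernel_size": [3, 5]}
best_config, best_score = random_search(toy_score, search_space, max_trials=20)
print(best_config, best_score)
```

With a real tuner, each call to the scoring step is a full training run, which is why max_trials and executions_per_trial have such a large effect on how long the search takes.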

We now have our best model, and it will be trained on our images and labels. We can also concatenate the validation data with the training images and labels in order to train the model on a larger dataset, which should yield better performance.
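Joining the validation data to the training data is a single NumPy call per array. The sketch below uses small made-up arrays with the same layout as the real ones; the variable names final_images and final_labels are assumptions.

```python
import numpy as np

# Made-up stand-ins with the same layout as the real arrays
train_images = np.zeros((6, 32, 32, 1), dtype="float32")
val_images = np.ones((2, 32, 32, 1), dtype="float32")
train_labels = np.zeros((6, 2), dtype="float32")
val_labels = np.ones((2, 2), dtype="float32")

# Stack along the first axis (the sample axis) so the images and labels stay aligned
final_images = np.concatenate([train_images, val_images], axis=0)
final_labels = np.concatenate([train_labels, val_labels], axis=0)
print(final_images.shape, final_labels.shape)
```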

Now we train our model on the final images and labels, specifying the initial and final epochs. The initial epoch picks up from the epochs used during our search for the best hyperparameter configuration.

Now I'd like to show you the training process I got, so that you can confirm that the model is acceptable and that there is no underfitting or overfitting during training:

The model has finished training, and the final metrics are acceptable: there is no sign of overfitting or underfitting, because the gap within each pair of metrics (training and validation accuracy, training and validation loss) is not large. We proceed to make predictions, compute the accuracy and classification report, and save the model as an h5 file.
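One simple way to put a number on that gap is to compare the last-epoch values in the history dictionary that Keras returns from fit. The helper name final_gaps is hypothetical, and the numbers below are placeholders made up for this sketch, not the metrics from my training run.

```python
def final_gaps(history):
    """Differences between training and validation metrics at the last epoch."""
    return {
        # Positive when training accuracy is higher than validation accuracy
        "accuracy_gap": history["accuracy"][-1] - history["val_accuracy"][-1],
        # Positive when validation loss is higher than training loss
        "loss_gap": history["val_loss"][-1] - history["loss"][-1],
    }

# Placeholder numbers in the layout of `history.history`
history = {
    "accuracy": [0.80, 0.90],
    "val_accuracy": [0.78, 0.88],
    "loss": [0.45, 0.25],
    "val_loss": [0.48, 0.30],
}

gaps = final_gaps(history)
print(gaps)
```

Large positive gaps would suggest overfitting; small gaps, as in this made-up example, suggest the model generalizes about as well as it fits.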

The classification report below shows that I got pretty good accuracy on the testing set, about 97%, so our model should perform well when returning responses during deployment with TensorFlow Serving.

ERROR ANALYSIS

Analyzing errors visually helps in tuning image augmentation (although we didn't make use of image augmentation). It gives insight into how the model may perform in the future and lets us know whether the model matches human-level performance.

The code above compresses the predictions and the test labels we got earlier into 1-D arrays by picking the index of the maximum value in each row. This lets us compare the predictions with the labels on the test images, so we can see the errors our model made below:
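The compression step described above can be sketched with np.argmax. The predicted probabilities and one-hot labels below are made up for illustration.

```python
import numpy as np

# Made-up model outputs (one row per image, one column per class)
probabilities = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
    [0.3, 0.7],
])
# Made-up one-hot test labels for the same four images
test_labels = np.array([
    [1, 0],
    [0, 1],
    [0, 1],
    [0, 1],
])

# Collapse each row to the index of its largest value
predicted = np.argmax(probabilities, axis=1)
actual = np.argmax(test_labels, axis=1)

# Positions where the prediction disagrees with the label
wrong = np.flatnonzero(predicted != actual)
error_rate = wrong.size / actual.size
print(wrong, error_rate)
```

The indices in wrong can then be used to pull out the misclassified images for plotting.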

We have done the error analysis visually and clearly saw the errors the model made during prediction, with each image labelled with its actual class and predicted class. I then plotted a countplot of the two genders to see which gender was predicted wrongly more often.
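The counts behind such a plot can be computed directly. In this sketch the class indices are made up, and which index stands for which gender depends on the dataset's encoding.

```python
import numpy as np

# Made-up class indices for six test images
actual = np.array([0, 0, 1, 1, 1, 0])
predicted = np.array([0, 1, 1, 0, 0, 0])

# Boolean mask of the misclassified images
wrong = actual != predicted

# Count the misclassified images by their true class
errors_per_class = np.bincount(actual[wrong], minlength=2)
print(errors_per_class)
```

Here class 1 is misclassified twice and class 0 once, so in this made-up example class 1 is the one predicted wrongly more often.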

We can always update our model by adding features that should help it, for example data augmentation, batch normalization and dropout layers, although these might not have a great effect on this model. We can also add model checkpoints from the keras.callbacks library to save the model at designated points during training, which is useful when working with a large amount of data. In this case we're dealing with thousands of images, so we can certainly add it.

We have successfully trained the model in this project, and we will deploy it to make it available for production and real-life use. I will be running TensorFlow Serving natively, and this deployment will be published in my next article.

CONCLUSION

This is my first article, on my latest project. All the code and files for this project are available on my GitHub repo: Github. The second part (deployment) will be available in my next article. I hope you enjoyed this write-up. If you found this article helpful, please share it with your friends and leave a clap :-). If you have any queries, feedback or suggestions, do let me know in the comments. You can also connect with me on Twitter, LinkedIn and at peteradekolu@gmail.com. There are also resources available online that you can read, watch and learn from to better understand concepts like these. There is so much to share with all of you and I'm just getting started. Stay tuned for more!

Written by Adekolu Peter