This algorithm is mainly used for data cleansing. It determines the gender of the person in an image by analyzing the face. If no face can be found, the image is deleted. The algorithm can be adapted to suit different requirements.
from deepface import DeepFace  # Library that provides the pretrained model.
from tqdm import tqdm  # Used to create a bar that represents process progress.
import cv2
import matplotlib.pyplot as plt
import time
import os

start = time.time()
# plt.imshow(img[:,:,::-1])
# plt.show()  # To display the image if required.
dire = r"Location of folder in which all the files are present"
for name in tqdm(os.listdir(dire)):
    path = os.path.join(dire, name)
    try:
        # print(path)
        img = cv2.imread(path)
        result = DeepFace.analyze(img, actions=['gender'])
        # print("Gender: ", result['gender'])
        # Example condition taken from the first use case below; change it for custom use.
        if result['gender'] != "Man" and result['gender'] != "Woman":
            os.remove(path)
    except ValueError:  # Raised when no face can be located.
        os.remove(path)
print("All is done.")  # Signals that the whole process has finished.
time.sleep(1)
end = time.time()
print(f"Runtime of the program is {end-start}")  # Prints the final execution time.
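Because the script deletes files permanently, it can help to preview its decisions first. The following is a minimal sketch and not part of the original script: `plan_deletions` and its `analyze` and `should_delete` parameters are hypothetical names, and the analyzer is passed in as a callable so the loop can be tried without loading a model. It assumes, as in the script above, that the analyzer returns a dict with a 'gender' key and raises ValueError when no face is found.

```python
import os
from typing import Callable, List


def plan_deletions(
    folder: str,
    analyze: Callable[[str], dict],
    should_delete: Callable[[dict], bool],
) -> List[str]:
    """Return the paths the cleaning loop would delete, without deleting them."""
    doomed = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # Skip sub-folders.
        try:
            result = analyze(path)
        except ValueError:
            # Assumed to mean "no face detected", as in the script above.
            doomed.append(path)
            continue
        if should_delete(result):
            doomed.append(path)
    return doomed
```

With DeepFace installed, `analyze` could be a small wrapper around the `cv2.imread` and `DeepFace.analyze` calls used above. Once the returned list looks right, each path can be passed to `os.remove`.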
The folder combines photographs collected from the internet. It contains photos of people of different genders, along with some corrupt or noisy images. Our script removes these noisy images.
The noisy images shown here are not arbitrary snapshots. They are regions that a face detector model using MTCNN cropped out, so each of them resembles a face in some way.
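Some corrupt files can be caught cheaply before any model runs. As a rough sketch (a technique swapped in here, not part of the original script), the hypothetical helper `looks_like_image` checks only the leading bytes of a file against the JPEG and PNG signatures. It cannot detect files that are damaged after the header, so it complements the face check rather than replacing it.

```python
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def looks_like_image(path: str) -> bool:
    """Return True if the file starts with a JPEG or PNG signature."""
    try:
        with open(path, "rb") as fh:
            header = fh.read(8)
    except OSError:
        return False  # Unreadable files are treated as not being images.
    return header.startswith(JPEG_MAGIC) or header.startswith(PNG_MAGIC)
```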
We only need to change one line of the general script, as follows:
        if result['gender'] != "Man" and result['gender'] != "Woman":  # Use this line in the general script.
            os.remove(path)
After running the script, we obtain the results shown below.
The only photographs left are those with human faces.
A progress bar shows the cleaning status. When the run finishes, the text "All is done." is printed, followed by the total execution time.
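The comparisons above assume that result['gender'] is the string "Man" or "Woman". Newer DeepFace releases appear to return a list with one dict per detected face, with the label under a 'dominant_gender' key and 'gender' holding per-label scores; treat that as an assumption to verify against the installed version. The hypothetical helper `extract_gender` below is a sketch that accepts either shape.

```python
from typing import Optional, Union


def extract_gender(result: Union[dict, list]) -> Optional[str]:
    """Return "Man" or "Woman" from either assumed result shape, else None."""
    # Assumed newer shape: a list with one dict per detected face.
    if isinstance(result, list):
        if not result:
            return None
        result = result[0]  # Use the first detected face only.
    # Assumed newer key that holds the label as a string.
    label = result.get("dominant_gender")
    if label is None:
        gender = result.get("gender")
        if isinstance(gender, str):
            label = gender  # Shape used by the scripts in this README.
        elif isinstance(gender, dict) and gender:
            label = max(gender, key=gender.get)  # Highest-scoring label.
    return label if label in ("Man", "Woman") else None
```

The condition in the general script could then compare `extract_gender(result)` with "Man" or "Woman" instead of reading result['gender'] directly.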
This uses the same directory as above. To determine the number of photos that contain human faces, we add a count variable and adjust the loop accordingly.
from deepface import DeepFace
from tqdm import tqdm
import cv2
import os

dire = r"Location of folder in which all the files are present"
count = 0  # Initialise the counter.
for name in tqdm(os.listdir(dire)):
    path = os.path.join(dire, name)
    try:
        img = cv2.imread(path)
        result = DeepFace.analyze(img, actions=['gender'])
        if result['gender'] == "Man" or result['gender'] == "Woman":
            count += 1  # Incremented when a face is found.
    except ValueError:
        pass  # No face found, so nothing is counted.
print("No of human faces =", count)
The output is:
No of human faces = 9
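The same loop can report more than a single number. As a minimal sketch, assuming the analyzer behaves as in the script above (a dict with a 'gender' key, ValueError when no face is found), the hypothetical function `tally_labels` takes the analyzer as a parameter and counts each outcome with `collections.Counter`.

```python
from collections import Counter
from typing import Callable, Iterable


def tally_labels(paths: Iterable[str], analyze: Callable[[str], dict]) -> Counter:
    """Count how many paths are labelled "Man", "Woman", or have no face."""
    counts = Counter()
    for path in paths:
        try:
            counts[analyze(path)["gender"]] += 1
        except ValueError:
            counts["no face"] += 1  # Assumed to mean no face was detected.
    return counts
```

The number of human faces is then `counts["Man"] + counts["Woman"]`.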
We only need to change one line of the general script, as follows:
        if result['gender'] != "Man":  # Use this line in the general script.
            os.remove(path)
After executing the script, the folder contains only photographs of men; all other images have been deleted.
We only need to change one line of the general script, as follows:
        if result['gender'] != "Woman":  # Use this line in the general script.
            os.remove(path)
After executing the script, the folder contains only photographs of women; all other images have been deleted.
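Instead of editing the condition by hand for each case, the three variants above can be expressed through one parameter. This is only a sketch: `should_delete` and `HUMAN_LABELS` are hypothetical names, and `keep` is the collection of labels to retain.

```python
from typing import Iterable

HUMAN_LABELS = ("Man", "Woman")


def should_delete(gender: str, keep: Iterable[str] = HUMAN_LABELS) -> bool:
    """Return True when an image with this label should be removed."""
    return gender not in set(keep)
```

In the general script, the condition would become `if should_delete(result['gender'], keep=("Man",)):` to keep only men, or `keep=("Woman",)` to keep only women. Pass `keep` as a tuple or list, not a bare string.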
The required libraries can be installed from PyPI. Installing each library also installs its dependencies.
pip install deepface
-DeepFace is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for Python. It is a hybrid face recognition framework wrapping state-of-the-art models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, ArcFace and Dlib. The library is mainly powered by TensorFlow and Keras. Experiments show that human beings have 97.53% accuracy on facial recognition tasks, whereas those models have already reached and passed that accuracy level.
pip install tqdm
-tqdm instantly makes your loops show a smart progress meter: just wrap any iterable with tqdm(iterable), and you're done!
pip install opencv-python
-OpenCV (Open Source Computer Vision Library: http://opencv.org) is an open-source library that includes several hundreds of computer vision algorithms.
pip install matplotlib
-Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.
-The time and os modules are part of Python's standard library, so there is no need to install them.
You will then be able to import the libraries and use their functionality.
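To confirm the installation before a long run, the import names can be looked up without importing the packages themselves. This is a small sketch using `importlib.util.find_spec` from the standard library; `missing_modules` and `REQUIRED` are hypothetical names. Note that opencv-python is imported as cv2.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Import names of the libraries listed above (opencv-python is imported as cv2).
REQUIRED = ("deepface", "tqdm", "cv2", "matplotlib")


def missing_modules(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```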
Pull requests are welcome.