
Université Mohammed V de Rabat

Faculté des Sciences

Département d’Informatique
Master in Intelligent Processing Systems
(Traitement Intelligent des Systèmes)

Title:

REAL-TIME SYSTEM FOR PEOPLE COUNTING
Presented by:
EL HAJARI BADER
Defended on xx October 2021 before the jury:

Mr. First Name Last Name, Professor at the Faculty of Sciences, Rabat - President

Mr. First Name Last Name, Professor at the Faculty of Sciences, Rabat - Supervisor
Mrs. First Name Last Name, Professor at the Faculty of Sciences, Rabat - Examiner
Mrs. First Name Last Name, Professor at the Faculty of Sciences, Rabat - Examiner

Academic year 2020-2021


Acknowledgements

At the end of this work, we would like to express our deep gratitude and sincere
appreciation to our supervisor, Professor IDRISSI Abdellah, for all the time he devoted to us,
his invaluable advice, and the quality of his follow-up throughout the period of our project.

I also want to thank the ICOZ company for giving me the opportunity to be part of their
team these last months, and especially my tutor TAOUAF FAYSSAL, who always came up with
creative ideas to make the project goals achievable.

Résumé

This internship report presents the different stages of the design and realization of a
real-time people counting system.

The purpose of this system is to help businesses, especially stores and marketplaces, know
how many people are on site and/or monitor their visitor traffic.

The system can send real-time alerts if the total number of people in a room or store
exceeds a given threshold.

I built the system using artificial intelligence techniques, in particular Deep Learning and
Computer Vision, with functions from the OpenCV library. The system is able to count people
entering and leaving, and can thus provide analytics to track customer engagement in a store's
area of interest and better understand user behavior.

The internship runs from 26 April to 31 October 2021 and is not yet finished at the time
this report is submitted.

Keywords: real-time people counting system, Deep Learning, Computer Vision, artificial
intelligence, OpenCV

Abstract

This internship report presents the different phases of the design and realization of a
real-time people counting system.

The system helps companies, especially stores and marketplaces, understand how many
people are inside the building and/or track and count incoming visitors.

The system can send an alert in real time if the limit on the number of people in a
store or building is exceeded.

I built the system using artificial intelligence techniques, especially Deep Learning and
Computer Vision, along with libraries such as OpenCV.

The system is able to count the number of people (in/out) and visualize conversion rate
data, in order to evaluate the performance of stores and other venues.

The internship runs from 26 April to 31 October 2021.

Keywords: real-time people counting, Deep Learning, Computer Vision, artificial intelligence,
OpenCV

Table of Contents

Acknowledgements
Résumé
Abstract
List of Abbreviations
General context of the project
    1.1 Introduction
    1.2 Presentation of the organization
        1.2.1 ICOZ Company
        1.2.2 Technical Sheet of the Company
        1.2.3 Company Diagram
    1.3 Issue / Problematic
    1.4 Project goals and objectives
    1.5 Agile methodology
    1.6 Scrum methodology
    1.7 Communication within the ICOZ team
    1.8 Resources provided and/or to be used
Conception of the project
    2.1 Introduction
    2.2 Artificial intelligence
    2.3 Deep learning
    2.4 Computer vision
        2.4.1 Image classification
        2.4.2 Object Detection
            2.4.2.1 SSD MobileNet Architecture
            2.4.2.2 SSD architecture
        2.4.3 Object Tracking
        2.4.4 Difference between object detection and object tracking
        2.4.5 Combining object detection and object tracking
    3.1 Development Tools and technologies
        3.1.1 Tools used
        3.1.2 Work environment
        3.1.3 Libraries
Realization of the project
    4.1.1 Introduction
    4.1.2 Conception of the project
    4.1.3 Concept of Centroid tracking
    4.2 Realization
Conclusion
References

List of Figures

Figure 1: ICOZ logo
Figure 2: Technical Sheet of ICOZ
Figure 3: Company Diagram
Figure 4: Agile methodology diagram
Figure 5: Scrum methodology diagram
Figure 6: Core elements of Artificial Intelligence
Figure 7: A typical neural network
Figure 8: Output from the SSD MobileNet object detection model
Figure 9: Connection of MobileNet and SSD
Figure 10: SSD architecture
Figure 11: Accept bounding box coordinates and compute centroids
Figure 12: Compute Euclidean distances between new bounding boxes and existing objects
Figure 13: Updating coordinates
Figure 14: Register new objects
Figure 15: Importing classes and libraries
Figure 16: Constructing the argument parser and parsing the arguments
Figure 17: Initializing the list of class labels
Figure 18: Initialization of the video stream
Figure 19: Finding objects belonging to the “person” class
Figure 20: Adding the bounding box for the objects
Figure 21: Counting whether the person has moved up or down through the frame
Figure 22: The case where the person is moving down
Figure 23: The case where the person is moving up
Figure 24: Counting the number of people
Figure 25: Script for the email alert
Figure 26: Example of an email alert
Figure 27: Graph of the number of visitors in a week

List of Abbreviations

Abbreviation    Description

AI              Artificial Intelligence
SSD             Single Shot MultiBox Detector
CNN             Convolutional Neural Network
DNN             Deep Neural Network
SVM             Support Vector Machine
R-CNN           Region-Based Convolutional Neural Networks
HOG             Histogram of Oriented Gradients
FPS             Frames Per Second
SMTP            Simple Mail Transfer Protocol

Chapter 1
General context of the project

1.1 Introduction
As part of the final year of my Master's degree in Intelligent Processing Systems at
Mohammed V University, Faculty of Sciences of Rabat (FS Rabat), I have to complete a
six-month internship. This internship is the final part of my course: it allows me to be trained
within a company, to gain knowledge of a sector of activity, and to put into practice the
theoretical knowledge I acquired during my studies.

In this report, I present my work environment as well as the main mission I carried out
within the ICOZ company, namely the realization of a real-time people counting system.

1.2 Presentation of the organization


1.2.1 ICOZ Company
ICOZ is a Moroccan start-up specializing in software development and IT consulting. The
company, based in Casablanca, was created by TAOUAF FYÇAL in December 2012 and quickly
established itself as an innovation company in Morocco. Its activity has gradually expanded to
many trades around the Web.
Since the launch of ICOZ, many solutions with strong potential have appeared on the
Moroccan market.

Figure 1: ICOZ logo

1.2.2 Technical Sheet of the Company

Company name            ICOZ
Foundation year         2012
Legal form              SARL
Capital                 15,000 MAD
Staff                   10
Email                   contact@icoz.ma
Address                 Rue Al Farouk Al Rahali N° 102, Etage 2, Sidi Maarouf 4,
                        Casablanca 20520, Maroc
Sector of activity      Web and digital communications

Figure 2: Technical Sheet of ICOZ

1.2.3 Company Diagram

Figure 3: Company Diagram


1.3 Issue / Problematic
Due to the COVID-19 pandemic, businesses have to comply with restrictions on the number of
visitors, based on the new government social distancing regulations.

Having to control the number of visitors poses a problem for essential public service providers,
such as pharmacies, supermarkets, hospitals, clinics and vet shops, banks, government offices,
and others.

These businesses cannot provide their services online and require physical visits, which, in
turn, may lead to overcrowding.

To ensure that people are as spaced out as needed, businesses turn to occupancy monitoring
by manually counting visitors, which also provides data for statistical analysis.
Yet this rudimentary technique is limited in both scope and reliability, running the risk of
inaccuracies due to human error.

1.4 Project goals and objectives

 The project idea is to create a system that can detect and track visitors in real time;
our system will count the number of visitors.

 The system can use the visitor counting data to visualize and analyze conversion rate
data, in order to evaluate the performance of stores and other venues.

 If the total number of people inside exceeds a given limit, the system can send an
email alert in real time.

1.5 Agile methodology

Agile is a project management approach developed as a more flexible and efficient way to get
products to market. The word 'agile' refers to the ability to move quickly and easily.
Therefore, an Agile approach enables project teams to adapt faster and more easily than other
project methodologies.

Agile allows the team to plan continuously throughout the project, which makes adjustments
and changes easier when required. An Agile team is highly recommended for firms that work
in a dynamic environment and want to meet tight project deadlines. Especially in the tech
industry, Agile methods and teams are preferred for their innovative and adaptable nature.

Figure 4: Agile methodology diagram

1.6 Scrum methodology

Scrum, the most popular Agile framework in software development, is an iterative approach
that has at its core the sprint, the Scrum term for an iteration. Scrum teams use inspection
throughout an Agile project to ensure that the team meets the goals of each part of the process.

The Scrum approach includes assembling the project's requirements and using them to define
the project. The team can then plan the necessary sprints and divide each sprint into its own
list of requirements. Daily Scrum meetings help keep the project on target, as do regular
inspections and reviews. At the end of every sprint, the team holds a sprint retrospective to
look for ways to improve the next sprint.

Figure 5: Scrum methodology diagram

1.7 Communication within the ICOZ team

The members of the project team were very available, and it was easy to get quick answers to
specific questions and to arrange meetings. Otherwise, I was able to request information from
the entire project team by e-mail for simple questions that needed a quick answer.

1.8 Resources provided and/or to be used

Several resources were provided to me as part of my internship. Firstly, a PC running the
Windows operating system, connected to the ICOZ network, was made available to me.
I also had RDP access to the server on which I was going to develop the system.

Chapter 2

Conception of the project

2.1 Introduction
The conception phase of a system is the most important part of the project. It consists of
finding out the needs of the agents, modelling the system, and preparing its development so
that it can be correctly integrated into the company's IT system. At the end of this conception
phase, we should have a functional specification.

We will also need to explain the criteria for choosing one technology over another. In this
phase we present the technologies used during our project: artificial intelligence, deep
learning, and computer vision.

2.2 Artificial intelligence

Artificial intelligence is the simulation of human intelligence processes by machines,
especially computer systems, to solve complex problems.
Specific applications of AI include expert systems, Natural Language Processing, Speech
Recognition, Deep Learning, Machine Learning, and Computer Vision.
In the rest of the report, we focus on Computer Vision and Deep Learning.

Figure 6: Core elements of Artificial Intelligence


2.3 Deep learning
Deep learning is a subset of machine learning; it is essentially a neural network with three
or more layers. These neural networks attempt to simulate the behavior of the human brain,
albeit far from matching its ability, allowing it to learn from large amounts of data. While a
neural network with a single layer can still make approximate predictions, additional hidden
layers can help to optimize and refine for accuracy.
Deep learning drives many artificial intelligence (AI) applications and services that improve
automation, performing analytical and physical tasks without human intervention. Deep learning
technology lies behind everyday products and services (such as digital assistants, voice-enabled
TV remotes, and smart motion detection technology) as well as emerging technologies (such as
self-driving cars).
Deep learning neural networks, or artificial neural networks, attempt to mimic the human
brain through a combination of data inputs, weights, and biases. These elements work together
to accurately recognize, classify, and describe objects within the data.

Figure 7: A typical neural network

2.4 Computer vision


Computer vision is a field of artificial intelligence (AI) that enables computers and systems to
derive meaningful information from digital images, videos and other visual inputs and take
actions or make recommendations based on that information.
If AI enables computers to think, computer vision enables them to see, observe and
understand.
Computer vision works much the same as human vision, except humans have a head start.
Human sight has the advantage of lifetimes of context to train how to tell objects apart, how far
away they are, whether they are moving and whether there is something wrong in an image.
Computer vision trains machines to perform these functions, but it has to do it in much less
time with cameras, data and algorithms rather than retinas, optic nerves and a visual cortex.
Because a system trained to inspect products or watch a production asset can analyze thousands
of products or processes a minute, noticing imperceptible defects or issues, it can quickly surpass
human capabilities. Computer vision applications have become one of the most rapidly
developing areas in automation and robotics, as well as in some other similar areas of science
and technology, mechatronics, intelligent transport and logistics, biomedical engineering, and
even in the food industry.
Nevertheless, computer vision seems to be one of the leading areas of practical applications
for recently developed artificial intelligence solutions, particularly computer and machine vision
algorithms.

Computer vision has many models, and these models help us answer questions about an image:
What objects are in the image? Where are those objects in the image? Where are the key points
on an object? What pixels belong to each object? We can answer these questions by building
different types of DNNs. These DNNs can then be used in applications to solve problems like
determining how many cars are in an image, whether a person is sitting or standing, or whether
an animal in a picture is a cat or a dog.
Among these models are image classification, object detection, image segmentation, and
object tracking.
Since the project is about tracking and counting people in real time, we will focus on the
image classification, object detection, and object tracking models.

2.4.1 Image classification

Image classification (or image recognition) attempts to identify the most significant object
class in an image. It is a subdomain of computer vision in which an algorithm looks at an image
and assigns it a tag from a collection of predefined tags or categories that it has been trained on.

Vision is responsible for 80-85 percent of our perception of the world, and we, as human
beings, trivially perform classification daily on whatever data we come across.

Therefore, emulating a classification task with the help of neural networks was one of the first
uses of computer vision that researchers thought about.

2.4.2 Object Detection

Object detection is a computer technology related to computer vision and image processing that
deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or
cars) in digital images and videos.

Well-researched domains of object detection include face detection and pedestrian detection.
Object detection has applications in many areas of computer vision, including image retrieval and
video surveillance.

Object detection, as the term suggests, is the procedure of detecting objects in the real world,
for example dogs, cars, humans, or birds. In this process we can detect the presence of any still
object with ease, and multiple objects can be detected in a single frame. For example, in the
image below the SSD (Single Shot Detector) model has detected a mobile phone, a laptop,
coffee, and glasses in a single shot.

Figure 8: Output from the SSD MobileNet object detection model

We have chosen to work with this model since it is one of the most representative detection
methods with respect to the speed/accuracy trade-off. Compared to other detectors such as
R-CNN, SSD is faster with the MobileNet architecture.

2.4.2.1 SSD MobileNet Architecture

SSD MobileNet is an object detection framework trained to detect and classify objects in a
captured image. Here, the MobileNet network is used to extract high-level features from the
images for classification and detection, and SSD is a detection model which uses the MobileNet
feature map outputs and convolution layers of different sizes to classify and detect bounding
boxes through regression.

Figure 9: Connection of MobileNet and SSD

2.4.2.2 SSD architecture

The SSD architecture is a single convolution network that learns to predict bounding box
locations and classify these locations in one pass. Hence, SSD can be trained end-to-end. The
SSD network consists of a base architecture (MobileNet in this case) followed by several
convolution layers:

Figure 10: SSD architecture

2.4.3 Object Tracking

Object tracking refers to the process of following a specific object of interest, or multiple
objects, in a given scene. It traditionally has applications in video and real-world interactions
where observations are made following an initial object detection. It is now crucial to
autonomous driving systems such as the self-driving vehicles from companies like Uber and Tesla.

Object tracking methods can be divided into two categories according to the observation
model: the generative method and the discriminative method. The generative method uses a
generative model to describe the apparent characteristics and minimizes the reconstruction
error to search for the object.

The discriminative method can be used to distinguish between the object and the background;
its performance is more robust, and it has gradually become the main method in tracking. The
discriminative method is also referred to as Tracking-by-Detection, and deep learning belongs
to this category.

2.4.4 Difference between object detection and object tracking

When we use object detection, we determine where an object is located in an image/frame. An
object detector is also generally more computationally expensive, and therefore slower, than an
object tracking algorithm.

An object tracker will take the input (x, y)-coordinates of where an object is located in an
image and then will:

- Give a unique identifier to that particular object

- Track the object as it moves through a video stream

- Predict the location of the object in the next frame based on various image attributes
(gradient, optical flow, etc.)

2.4.5 Combining object detection and object tracking

We combine the concepts of object detection and object tracking into a single algorithm,
normally divided into two phases:

Phase 1 - Detection:

In the detection phase, we run our more computationally expensive object detector to check
whether new objects have entered our view, and to see if we can find objects that were "lost"
during the tracking phase.

For each detected object, we create or update an object tracker with the new bounding box
coordinates. Since our object detector is more computationally expensive, we only run this
phase once every N frames.

Phase 2 - Tracking:

When we are not in the "detection" phase, we are in the "tracking" phase. For each of our
detected objects, we create an object tracker to follow the object as it moves through the frame.
Our object tracker should be faster and more efficient than the object detector.

We keep tracking until we reach the Nth frame, then run our object detector again. The
entire process is then repeated.

Chapter 3
3.1 Development Tools and technologies
3.1.1 Tools used
In this part, we present the tools that we used in our project. We will talk about:

— The Anaconda work environment and its tools, and PyCharm

— The programming language and libraries used.

3.1.2 Work environment

Anaconda

Anaconda is a distribution of the Python and R programming languages for scientific
computing (data science, machine learning applications, large-scale data processing, predictive
analytics, etc.) that aims to simplify package management and deployment. The distribution
includes data-science packages suitable for Windows, Linux, and macOS.

Jupyter Notebook

The Jupyter Notebook is an open-source web application that allows you to create and share
documents that contain live code, equations, visualizations, and narrative text. Its uses include:
— Data cleaning and transformation
— Numerical simulation
— Statistical modeling
— Data visualization
— Machine learning

PyCharm

PyCharm is an integrated development environment (IDE) used in computer programming,
specifically for the Python language. It provides code analysis, a graphical debugger, an
integrated unit tester, integration with version control systems (VCSes), and supports web
development with Django as well as data science with Anaconda.
PyCharm is cross-platform, with Windows, macOS and Linux versions.

Python

Python is a computer programming language often used to build websites and software,
automate tasks, and conduct data analysis. Python is a general-purpose language, meaning it
can be used to create a variety of different programs and isn't specialized for any specific
problems.
This versatility, along with its beginner-friendliness, has made it one of the most-used
programming languages today. A survey conducted by industry analyst firm RedMonk found
that it was the most popular programming language among developers in 2021. Python has
become a staple in data science, allowing data analysts and other professionals to use the
language to conduct complex statistical calculations, create data visualizations, build machine
learning algorithms, manipulate and analyze data, and complete other data-related tasks.
Python can build a wide range of different data visualizations, like line and bar graphs, pie
charts, histograms, and 3D plots. Python also has a number of libraries that enable coders to
write programs for data analysis and machine learning more quickly and efficiently, like
TensorFlow and Keras.

3.1.3 Libraries

Numpy

Numpy is one of the most commonly used packages for scientific computing in Python. It
provides a multidimensional array object, as well as variations such as masks and matrices,
which can be used for various math operations. Numpy is compatible with, and used by, many
other popular Python packages.
Numpy makes many mathematical operations used widely in scientific computing fast and
easy to use, such as:
— Vector-vector multiplication
— Matrix-matrix and matrix-vector multiplication
— Element-wise operations on vectors and matrices (i.e., adding, subtracting,
multiplying, and dividing by a number)
— Element-wise or array-wise comparisons
— Applying functions element-wise to a vector/matrix (like pow, log, and exp)
— A whole lot of linear algebra operations, found in numpy.linalg
— Reduction, statistics, and much more.

Pandas

Pandas is an open source Python package that is most widely used for data science/data
analysis and machine learning tasks. It is built on top of another package named Numpy,
which provides support for multi-dimensional arrays. As one of the most popular data
wrangling packages, Pandas works well with many other data science modules inside the
Python ecosystem, and is typically included in every Python distribution, from those that come
with your operating system to commercial vendor distributions like ActiveState's ActivePython.
Pandas makes it simple to do many of the time-consuming, repetitive tasks associated with
working with data, including:
— Data cleansing
— Data fill
— Data normalization
— Merges and joins
— Data visualization
— Statistical analysis
— Data inspection
— Loading and saving data

OpenCV

OpenCV is a huge open-source library for computer vision, machine learning, and image
processing, and it now plays a major role in real-time operation, which is very important in
today's systems. By using it, one can process images and videos to identify objects, faces, or
even the handwriting of a human. When it is integrated with various libraries, such as NumPy,
Python is capable of processing the OpenCV array structure for analysis. To identify image
patterns and their various features, we use vector space and perform mathematical operations
on these features.
The first OpenCV version was 1.0. OpenCV is released under a BSD license and hence is
free for both academic and commercial use. It has C++, C, Python and Java interfaces and
supports Windows, Linux, Mac OS, iOS and Android. When OpenCV was designed, the main
focus was real-time applications for computational efficiency. Everything is written in
optimized C/C++ to take advantage of multi-core processing.

There are lots of applications which are solved using OpenCV; some of them are listed below:

— Face recognition
— Automated inspection and surveillance
— People counting (foot traffic in a mall, etc.)
— Vehicle counting on highways along with their speeds
— Interactive art installations
— Anomaly (defect) detection in the manufacturing process (the odd defective products)
— Street view image stitching
— Video/image search and retrieval
— Robot and driverless car navigation and control
— Object recognition
— Medical image analysis
— Movies: 3D structure from motion
— TV channel advertisement recognition

SciPy

SciPy is an open-source Python library which is used to solve scientific and mathematical
problems. It is built on the NumPy extension and allows the user to manipulate and visualize
data with a wide range of high-level commands. SciPy provides a number of special functions
that are used in mathematical physics, such as elliptic functions, convenience functions,
gamma, beta, etc.

Imutils

Imutils is a series of convenience functions to make basic image processing functions such as
translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with
OpenCV on both Python 2.7 and Python 3.

Dlib

Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating
complex software in C++ to solve real-world problems. It is used in both industry and
academia in a wide range of domains including robotics, embedded devices, mobile phones,
and large high-performance computing environments. Dlib's open source licensing allows you
to use it in any application, free of charge.
While the library was originally written in C++, it has good, easy-to-use Python bindings.

Chapter 4
Realization of the project

4.1.1 Introduction
In this chapter, we present the practical part of our project: we explain the general concept
of the project and also cover the implementation.

4.1.2 Conception of the project

The idea of the project is to count the number of people in real time, so the concept is based
on the three fundamental steps discussed in Chapter 2:

Detecting → Tracking → Counting

To implement our system, the people counter, we used OpenCV and dlib.
We used OpenCV for the standard computer vision functions, along with the deep learning
object detector for counting people.

We use dlib for its implementation of correlation filters. We also use centroid tracking.

4.1.3 Concept of Centroid tracking

Centroid tracking is an object tracking algorithm that relies on the Euclidean distance between
(1) existing object centroids and (2) new object centroids in subsequent frames of a video.

The centroid tracking algorithm is a multi-step process:

Step 1: Accept bounding box coordinates and compute centroids

Figure 11: Accept bounding box coordinates and compute centroids

To create an object tracking algorithm using centroid tracking, the first step is to accept
bounding box coordinates from an object detector and use them to compute centroids.

The centroid tracking algorithm assumes that we are passing in a set of bounding box (x, y)-
coordinates for each detected object in every single frame.

These bounding boxes can be produced by any type of object detector (color thresholding +
contour extraction, Haar cascades, HOG + Linear SVM, SSDs, Faster R-CNNs, etc.), provided
that they are computed for every frame in the video.

Once we have the bounding box coordinates, we compute the centroid, or more simply, the
center (x, y)-coordinates of the bounding box.

We then assign a unique ID to each bounding box presented to our algorithm.
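As an illustration, here is a minimal sketch of this step, assuming boxes arrive as
(startX, startY, endX, endY) tuples from the detector:

# Minimal sketch: compute the centroid of each bounding box.
import numpy as np

def compute_centroids(rects):
    centroids = np.zeros((len(rects), 2), dtype="int")
    for i, (start_x, start_y, end_x, end_y) in enumerate(rects):
        # the centroid is simply the midpoint of the box in x and y
        centroids[i] = (int((start_x + end_x) / 2.0),
                        int((start_y + end_y) / 2.0))
    return centroids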

Step 2: Compute the Euclidean distance between new bounding boxes and existing objects

Figure 12: Compute Euclidean distances between new bounding boxes and existing objects

In the image there are three objects to track. We must compute the Euclidean distances
between each pair of existing centroids (purple) and new centroids (yellow).

For each subsequent frame in our video stream, we apply Step 1 and compute the object
centroids.
Instead of assigning a new unique ID to each detected object (which would defeat the purpose
of object tracking), we first need to determine if we can associate the new object centroids
(yellow) with the old object centroids (purple).

To accomplish this, we compute the Euclidean distance (highlighted with green arrows)
between each pair of existing object centroids and input object centroids.
From the figure we can see that we have this time detected three objects in our image.
The two pairs that are close together are two existing objects.

We then calculate the Euclidean distances between each pair of existing centroids (purple)
and new centroids (yellow).
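A minimal sketch of this computation, using hypothetical centroid values and the cdist
function from SciPy (a library the project already uses):

# Sketch: pairwise Euclidean distances between existing object
# centroids and the centroids detected in the new frame.
import numpy as np
from scipy.spatial import distance as dist

existing = np.array([[120, 200], [300, 150]])             # hypothetical old centroids
incoming = np.array([[125, 210], [310, 148], [50, 400]])  # hypothetical new centroids

# D[i, j] is the distance between existing centroid i and new centroid j
D = dist.cdist(existing, incoming)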

Step 3: Update the (x, y)-coordinates of existing objects

Figure 13: Updating coordinates

The primary assumption of the centroid tracking algorithm is that a given object may move
between subsequent frames, but the distance between its centroids in frames F_t and F_{t+1}
will be smaller than all other distances between objects.

Therefore, if we choose to associate centroids with minimum distances between subsequent
frames, we can build our object tracker.

In the figure we can see how our centroid tracking algorithm chooses to associate centroids
that minimize their respective Euclidean distances.
The lonely point in the bottom-left did not get associated with anything, so it will become a
new object.
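A sketch of this minimum-distance association, continuing from the distance matrix D above
(this mirrors the common greedy strategy; the exact bookkeeping in our CentroidTracker class
may differ slightly):

# Greedy association: each existing object claims its nearest new
# centroid, processing the smallest distances first.
rows = D.min(axis=1).argsort()    # existing objects, nearest match first
cols = D.argmin(axis=1)[rows]     # index of each object's closest new centroid

used_rows, used_cols = set(), set()
for row, col in zip(rows, cols):
    if row in used_rows or col in used_cols:
        continue                  # each centroid may only be matched once
    # existing object `row` keeps its ID and moves to incoming[col]
    used_rows.add(row)
    used_cols.add(col)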

Step 4: Creation of new objects

Figure 14: Register new objects

We have a new object that wasn't matched with an existing object, so it is registered as
object ID 3.

In the case where there are more input detections than existing objects being tracked, we
have to register each new object.
Registering simply means adding the new object to our list of tracked objects by:

 Assigning it a new object ID

 Storing the centroid of the bounding box coordinates for that object

After that, we can go back to Step 2 and repeat the pipeline of steps for every frame in our
video stream.

Step 5: Deregister old objects

Any reasonable object tracking algorithm needs to be able to handle the case when an object
has been lost, has disappeared, or has left the field of view.

Exactly how we handle these situations depends on where the object tracker is meant to be
deployed, but for this implementation, we deregister old objects when they cannot be matched
to any existing objects for a total of N subsequent frames.
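A minimal sketch of the registration and deregistration bookkeeping described in Steps 4 and 5
(the names and the value of N are illustrative, not the exact code of our CentroidTracker class):

# `objects` maps an object ID to its latest centroid; `disappeared`
# counts the consecutive frames an ID has gone unmatched.
MAX_DISAPPEARED = 50  # assumed N: frames an object may go missing

objects, disappeared, next_object_id = {}, {}, 0

def register(centroid):
    global next_object_id
    objects[next_object_id] = centroid
    disappeared[next_object_id] = 0
    next_object_id += 1

def deregister(object_id):
    del objects[object_id]
    del disappeared[object_id]

def mark_unmatched(object_id):
    disappeared[object_id] += 1
    if disappeared[object_id] > MAX_DISAPPEARED:
        deregister(object_id)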
4.2 Realization

The first step is to install the required packages via pip:

pip install opencv-python
pip install dlib

OpenCV will be used for deep neural network inference, opening video files, writing video files
and displaying output images on our screen.

The dlib library is used for its correlation tracker implementation.

Then we import our custom classes CentroidTracker and TrackableObject

Figure 15: Importing classes and libraries

We also import the VideoStream and FPS modules from imutils.video, which allow us to work
with a webcam and to calculate the estimated Frames Per Second (FPS) throughput rate.
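Figure 15 is a screenshot; the imports likely look roughly as follows (the module paths of
the custom classes are an assumption based on the common pyimagesearch project layout):

# Probable shape of the imports shown in Figure 15.
from pyimagesearch.centroidtracker import CentroidTracker  # path assumed
from pyimagesearch.trackableobject import TrackableObject  # path assumed
from imutils.video import VideoStream, FPS
import numpy as np
import argparse
import imutils
import time
import dlib
import cv2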

Figure 16: Constructing the argument parser and parsing the arguments

These arguments allow us to pass information to our people counter script from the terminal
at run time:

prototxt: Path to the Caffe "deploy" prototxt file.
model: Path to the Caffe pre-trained model.
input: Optional input video file path. If no path is specified, the webcam will be used.
output: Optional output video path. If no path is specified, no video will be recorded.
confidence: With a default value of 0.4, this is the minimum probability threshold used to
filter out weak detections.
skip-frames: The number of frames to skip before running our DNN detector on the tracked
objects again. By default, we skip 30 frames between object detections with OpenCV's DNN
module and our single-shot detection model.
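A sketch of the parser behind Figure 16, reconstructed from the argument list above (the
short flags are assumptions):

# Reconstruction of the argument parser described above.
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
                help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
                help="path to Caffe pre-trained model")
ap.add_argument("-i", "--input", type=str,
                help="optional path to input video file")
ap.add_argument("-o", "--output", type=str,
                help="optional path to output video file")
ap.add_argument("-c", "--confidence", type=float, default=0.4,
                help="minimum probability to filter weak detections")
ap.add_argument("-s", "--skip-frames", type=int, default=30,
                help="number of frames to skip between detections")
args = vars(ap.parse_args())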

Figure 17: Initializing the list of class labels

Then we initialize CLASSES, the list of classes that our SSD supports. We could count other
moving objects as well (such as "car", "bus", or "bicycle"), but we are only interested in the
"person" class.

We also load the pretrained MobileNet SSD used to detect objects and track people.
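Continuing the sketches above, this initialization looks roughly like the following (the class
list below is the one commonly shipped with the Caffe MobileNet SSD):

# The 21 class labels the Caffe MobileNet SSD was trained on; only
# "person" is used for counting.
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]

# load the serialized Caffe model from disk
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])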

Figure 18: Initialization of the video stream

In the first case, if a video path was not supplied, we grab a reference to the IP camera.
Otherwise, we capture frames from the video file.
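A sketch of this branching, assuming the camera case uses the VideoStream class imported
earlier:

# Grab a reference to the camera if no --input was given, otherwise
# open the video file.
if not args.get("input", False):
    vs = VideoStream(src=0).start()  # src could also be an IP camera URL
    time.sleep(2.0)                  # give the sensor time to warm up
else:
    vs = cv2.VideoCapture(args["input"])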

Figure 19: Finding objects belonging to the “person” class

Looping over the detections, we capture the confidence and filter out weak results as well as
results that do not belong to the "person" class.
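A sketch of this loop, where `detections` is assumed to be the output of net.forward() on a
blob built from the current frame:

# Filter the SSD output: keep confident detections of the "person" class.
for i in np.arange(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > args["confidence"]:
        idx = int(detections[0, 0, i, 1])   # index of the predicted label
        if CLASSES[idx] != "person":
            continue                        # ignore everything but people
        # ... the surviving detection is handed to the tracker below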

Figure 20: Adding the bounding box for the objects

Then we compute the (x, y)-coordinates of the bounding box for the objects.

We instantiate our dlib correlation tracker, then pass the object's bounding box coordinates
to dlib.rectangle, storing the result as rect.

Subsequently, we start tracking and append the tracker to the trackers list.
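Inside the detection loop, this step likely looks as follows (rgb is assumed to be the RGB
frame and W, H its dimensions):

# Scale the SSD's normalized box back to pixel coordinates and start
# a dlib correlation tracker on it.
box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
(startX, startY, endX, endY) = box.astype("int")

tracker = dlib.correlation_tracker()
rect = dlib.rectangle(startX, startY, endX, endY)
tracker.start_track(rgb, rect)
trackers.append(tracker)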

That's a wrap for all the operations we perform every N skip-frames!

Tracking then takes place in the else block:

We update the status to "Tracking", catch the object's position, and from there extract the
position coordinates.

Figure 21: Counting whether the person has moved up or down through the frame

If there is an existing TrackableObject, we have to see whether the "person" object is moving
up or down: the difference between the y-coordinate of the current centroid and the mean of
the previous centroids tells us in which direction the object is moving (negative for 'up' and
positive for 'down').

We take the mean to ensure that our direction tracking is more stable.
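Sketched in code, with `to` an existing TrackableObject and `centroid` the current (x, y)
position:

# Direction: current y minus the mean of the object's previous
# y-coordinates (negative means the person is moving up the frame).
y = [c[1] for c in to.centroids]
direction = centroid[1] - np.mean(y)
to.centroids.append(centroid)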

Figure 22: The case where the person is moving down

After checking that the direction is positive and the centroid is below the centerline, we
increment totalDown.

Figure 23: The case where the person is moving up

In the second case, if the direction is negative and the centroid is above the centerline, we
increment totalUp.
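A sketch of this counting rule (H is the frame height, so H // 2 is the horizontal centerline;
in image coordinates, y grows downward):

# Count the person once, when they cross the centerline in a
# consistent direction.
if not to.counted:
    if direction < 0 and centroid[1] < H // 2:
        totalUp += 1          # moving up and now above the line
        to.counted = True
    elif direction > 0 and centroid[1] > H // 2:
        totalDown += 1        # moving down and now below the line
        to.counted = True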

Figure 24: Counting the number of people

This time there have been 7 people who entered the building and 7 people who left.

Real-Time alert

If the total number of people inside is exceeded, the system can send an email alert in real
time to the staff using the SMTP protocol, which handles sending and routing e-mail between
mail servers.
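Figure 25 shows our alert script; a minimal standalone sketch of the same idea, with
placeholder server, credentials, and addresses, might look like this:

# Minimal SMTP alert sketch (all addresses and credentials are
# placeholders; smtplib ships with the Python standard library).
import smtplib
from email.message import EmailMessage

def send_alert(count, limit):
    msg = EmailMessage()
    msg["Subject"] = "Alert: people limit exceeded"
    msg["From"] = "alert@example.com"
    msg["To"] = "staff@example.com"
    msg.set_content(f"{count} people are inside; the limit is {limit}.")
    with smtplib.SMTP("smtp.example.com", 587) as server:
        server.starttls()
        server.login("alert@example.com", "app-password")
        server.send_message(msg)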

Figure 25: Script for the email alert

Figure 26: Example of an email alert

Figure 27: Graph of the number of visitors in a week

In addition, we can use the people counting data to visualize and analyze the conversion rate
data, in order to evaluate the performance of stores and other venues.

Conclusion

This report presents explicitly the course of our internship period, which is part of our
Master's degree final project in Intelligent Processing Systems. The internship took place in a
professional environment.

In this project we developed a system that can count people in real time using computer
vision techniques such as image classification, object detection, and object tracking.

On the professional side, we learned to work in a team with constraints and instructions to
respect, especially Agile working.

We would like to point out that the internship continues after our graduation because we
wish to achieve the objectives mentioned above. We aspire to:

- Strengthening the accuracy of our system

- Building a facial recognition system

References

[1] A. G. Howard, M. Zhu, B. Chen et al., "MobileNets: Efficient Convolutional Neural
Networks for Mobile Vision Applications," http://arxiv.org/abs/1704.04861

[2] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD:
Single Shot MultiBox Detector," http://arxiv.org/abs/1512.02325

[3] "Review: SSD — Single Shot Detector (Object Detection),"
https://towardsdatascience.com/review-ssd-single-shot-detector-object-detection-851a94607d11

[4] Z.-Q. Zhao, P. Zheng, S.-T. Xu, and X. Wu, "Object Detection with Deep Learning,"
https://arxiv.org/pdf/1807.05511.pdf

[5] smtplib — SMTP protocol client, https://docs.python.org/3/library/smtplib.html

[6] Anaconda, https://www.anaconda.com/

[7] Jupyter, https://jupyter.org/

[8] NumPy, http://www.numpy.org/

[9] OpenCV, https://opencv.org/
