Counting people and vehicle traffic using AI and camera footage

Measuring traffic and pedestrian flow using deep learning and tracking algorithms

Advances in deep learning research have made object detection from camera footage relatively easy. Counting and measurement, however, still require some ingenuity. For this experiment, we analyzed footage from an existing low-resolution 480p security camera and measured the traffic and pedestrian flow in front of Legasis Takamatsu Lab and on a certain street.

Method Used in This Experiment

We use ResNet-50 for the neural network and MSCOCO (Microsoft Common Objects in Context) as the training dataset to detect trucks, cars, and pedestrians. For this experiment, we excluded bicycles from the measurements because they are difficult to detect at night with a low-resolution security camera that has no infrared mode. To facilitate tracking, we run the calculations on a GPU equivalent to an NVIDIA RTX 3070. The tracking mainly uses a Kalman filter. The tracking algorithm is quite complex, so we omit the description of its parameters, but thanks to the GPU we can run detection at 30 FPS, which allows reliable tracking even with coarsely tuned tracking settings.
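Since the parameters of the tracker are omitted above, the following is only a minimal sketch of the general idea behind Kalman-filter tracking, not the tracker used in the experiment. It tracks a single coordinate (for example, the x center of a bounding box) with a constant-velocity model. The class name, the noise values, and the 2 pixels-per-frame motion are all assumptions chosen for illustration.

```python
class ConstantVelocityKalman1D:
    """Kalman filter over one coordinate with state [position, velocity]."""

    def __init__(self, initial_position, process_var=0.01, measurement_var=1.0):
        # Hypothetical noise settings; a real tracker would tune these.
        self.position = float(initial_position)
        self.velocity = 0.0
        # Covariance P (2x2). Large values mean the state is very uncertain.
        self.p00, self.p01, self.p10, self.p11 = 100.0, 0.0, 0.0, 100.0
        self.q = process_var
        self.r = measurement_var

    def predict(self, dt=1.0):
        """Advance the state by dt frames: x = F x, P = F P F^T + Q."""
        self.position += self.velocity * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, measured_position):
        """Correct the prediction with a detected position (H = [1, 0])."""
        residual = measured_position - self.position
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # Kalman gain for position
        k1 = self.p10 / s              # Kalman gain for velocity
        self.position += k0 * residual
        self.velocity += k1 * residual
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


# Simulated detections of an object moving 2 pixels per frame.
tracker = ConstantVelocityKalman1D(initial_position=0.0)
for frame in range(1, 31):
    tracker.predict()
    tracker.update(2.0 * frame)

# If the detector misses a frame, predict() alone carries the track forward.
tracker.predict()
```

The predict step is what lets a track survive a missed detection for a few frames. At a high detection rate the object moves only a short distance between updates, which is one reason coarse settings can still work.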

AI Counting Experiment Results

Measurement accuracy during the day was close to 100%. After sunset, accuracy decreased because the camera is not a night-vision camera and the footage is dark, but it remained sufficient to grasp the trend. For cars, accurate measurement was possible even in the dark nighttime footage. At the camera location used for this experiment, many cars stop in view, so it was important to track them accurately to avoid counting them more than once.

We have created a daily chart of pedestrian, car, and truck traffic volumes from December 2021 to October 2022. The ability to easily collect such data is an advantage of AI cameras.
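As a rough illustration of how per-object count events could be rolled up into daily totals for such a chart, here is a minimal sketch. The event format (an ISO timestamp plus a class label) and the function name are assumptions, and the sample events are made-up values, not measurements from the experiment.

```python
from collections import Counter, defaultdict
from datetime import date, datetime


def daily_counts(events):
    """Group (ISO timestamp, class label) events into per-day totals."""
    totals = defaultdict(Counter)
    for timestamp, label in events:
        day = datetime.fromisoformat(timestamp).date()
        totals[day][label] += 1
    return dict(totals)


# Made-up sample events for illustration only.
sample_events = [
    ("2021-12-01T08:15:00", "car"),
    ("2021-12-01T08:16:30", "person"),
    ("2021-12-01T17:42:10", "car"),
    ("2021-12-02T09:05:00", "truck"),
]
sample_totals = daily_counts(sample_events)
cars_on_first_day = sample_totals[date(2021, 12, 1)]["car"]
```

Each day's `Counter` can then be passed to any plotting tool to draw one series per class.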

Key Points for Implementing an AI Camera Counter

The camera angle is a crucial factor in object detection. This project is an experiment, so we used an existing camera, but if you plan to use existing camera footage in a production setting, make sure the angle and resolution are suitable for the application.
Decide whether to spend your budget on hardware or on developing the tracking algorithm. On embedded systems, object detection takes time, so the tracking algorithm needs extra ingenuity to compensate.
Higher camera resolution may seem better, but it also reduces detection speed, so plan according to the specifications of the CPU or GPU you are using.
Simply tracking and counting objects produces many duplicates, because the same object repeatedly enters and leaves the frame. Duplicate removal is therefore essential. If the detection speed is slower than 8 FPS, accurate measurement becomes very difficult, so be cautious when using edge computers.
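One common way to remove duplicates is to count a tracked object only when it crosses a virtual line, and at most once per track ID. The sketch below illustrates that approach. It is an assumption, not necessarily the method used in this experiment, and the class name, line position, and sample coordinates are hypothetical.

```python
class LineCrossingCounter:
    """Counts each track ID at most once when it crosses a horizontal line."""

    def __init__(self, line_y):
        self.line_y = line_y
        self.counts = {}        # class label -> number of counted objects
        self._last_y = {}       # track ID -> previous centroid y
        self._counted = set()   # track IDs that were already counted

    def observe(self, track_id, label, centroid_y):
        """Feed one tracked detection; returns True if it was just counted."""
        previous_y = self._last_y.get(track_id)
        self._last_y[track_id] = centroid_y
        if previous_y is None or track_id in self._counted:
            return False
        # A crossing happens when consecutive positions fall on opposite
        # sides of the counting line.
        crossed = (previous_y < self.line_y) != (centroid_y < self.line_y)
        if crossed:
            self._counted.add(track_id)
            self.counts[label] = self.counts.get(label, 0) + 1
        return crossed


counter = LineCrossingCounter(line_y=100)

# Track 1: a car crosses the line, then jitters around it while stopped.
for y in [80, 95, 105, 98, 103]:
    counter.observe(1, "car", y)

# Track 2: a pedestrian who never reaches the line.
for y in [10, 20, 30]:
    counter.observe(2, "person", y)

final_counts = counter.counts
```

This only works as long as the tracker keeps the same ID for the same object. If a track is lost and reassigned a new ID, the object can be counted again, so tracking quality still matters. As a purely illustrative figure, an object moving 400 pixels per second shifts about 13 pixels between frames at 30 FPS but 50 pixels at 8 FPS, which makes it harder to match detections across frames.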