
AI Techniques for Safe Cell Tower Inspection with Drones


Aerial inspection of cell towers has become essential for several reasons. According to the Federal Communications Commission, cell towers require regular inspections, and with the rise of 5G, even more antennas and cell sites must be inspected. These inspections cover not only mechanical, electronic, and structural defects, but also asset verification, inventory management, and location mapping for the installation of new hardware.

Traditional manual inspection of towers ranging from 30 to 300 meters in height is fraught with risk and has led to numerous accidents, including serious injuries and fatalities. Manual inspections are also costly ($900 to $5,000 per tower), time-consuming, and often result in incomplete data [1].

Drones can collect this data from different sites, but to fly autonomously around a tower, they must understand their environment. For this purpose, we propose a low-cost object localizer that guides the drone's autonomous movements around the tower during aerial inspections, offering a safer, more cost-effective, and more efficient alternative to manual inspection of cell towers. It provides quality data and analysis that help ground staff ensure the integrity and functionality of an ever-growing and increasingly complex global cell tower network. Aerial inspection is therefore not just a trend, but an essential development in maintaining and expanding modern communication infrastructure.

Training such a localizer normally requires human-annotated data (boxes around the tower and other relevant objects). To avoid this cost, we propose training localizers with pseudo-labels efficiently harvested from a self-supervised vision model [2]. However, since these self-supervised transformers decompose the scene into multiple maps containing various object parts and do not rely on any explicit supervisory signal, they cannot distinguish the object of interest from other objects, as required by weakly supervised object localization (WSOL). To address this, we propose leveraging the multiple maps generated by the different transformer heads to acquire pseudo-labels for deep WSOL model training.

Annotation Problem in Object Localizers

Deep neural networks have achieved impressive performance in various computer vision tasks, such as object detection and localization [3]. However, training deep neural networks requires bounding box annotations, which indicate the location of an object of interest. Annotating these objects with bounding boxes is both costly and time-consuming, as it requires the expertise of human annotators [3]. To address these challenges, we propose a novel approach that can learn with weak supervision: using image-level labels to describe the objects of interest and learning to predict bounding boxes for these objects within images, as illustrated in the figure below. More specifically, we introduce a method for harvesting pseudo-labels from self-supervised deep neural networks, which are then used to train our object localizer.

Figure 1: Weakly supervised learning paradigm
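
To make the difference between the two supervision levels concrete, here is a hypothetical sketch of the annotation records involved; the file names, labels, and coordinates are illustrative only.

```python
# Full supervision: every image needs hand-drawn bounding boxes
# (hypothetical record for illustration).
full_annotation = {
    "image": "site_001.jpg",
    "boxes": [{"label": "antenna", "xyxy": [312, 88, 405, 240]}],
}

# Weak supervision (our setting): only an image-level tag is required.
weak_annotation = {"image": "site_001.jpg", "label": "tower"}
```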

How We Generated Pseudo-Labels

To obtain pseudo-ground-truth masks (pseudo-GTs), we first extracted attention/activation maps from transformers trained in a self-supervised manner, without the use of image-level labels. These maps allowed us to identify object proposals for an object of interest. Since the attention maps consist of continuous values, we converted them into binary regions to delineate the object proposals. We then used each object proposal to generate a new augmented (perturbed) image by blurring the areas outside the identified region.
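
In PyTorch, this step can be sketched roughly as follows. This is a minimal illustration rather than our exact implementation: it assumes a DINO ViT-S/16 backbone (whose get_last_selfattention helper comes from the facebookresearch/dino repository), a fixed binarization threshold, and a single blur level.

```python
import torch
import torch.nn.functional as F
from torchvision.transforms import GaussianBlur

def harvest_proposals(model, img, patch=16, thresh=0.5):
    """Extract per-head CLS attention maps and binarize them into proposals.

    img: (1, 3, H, W) tensor with H and W divisible by `patch`.
    Returns a list of (binary_mask, perturbed_image) pairs, one per head.
    """
    _, _, H, W = img.shape
    h, w = H // patch, W // patch
    with torch.no_grad():
        # (1, n_heads, 1+h*w, 1+h*w): self-attention of the last block
        attn = model.get_last_selfattention(img)
    # CLS-token attention to the image patches, one map per head
    maps = attn[0, :, 0, 1:].reshape(-1, h, w)

    blurred = GaussianBlur(kernel_size=21, sigma=8.0)(img)
    proposals = []
    for m in maps:
        m = (m - m.min()) / (m.max() - m.min() + 1e-8)  # normalize to [0, 1]
        mask = (m > thresh).float()                      # binarize the map
        # Upsample the patch-level mask to pixel resolution
        mask = F.interpolate(mask[None, None], size=(H, W), mode="nearest")
        # Keep the proposal region sharp, blur everything outside it
        perturbed = img * mask + blurred * (1 - mask)
        proposals.append((mask, perturbed))
    return proposals
```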

Following this, all perturbed images were fed to a pre-trained classifier, and the maps yielding the highest classifier scores were selected as pseudo-labels. One of the top-N selected maps was then used to sample a few foreground and background pixels based on their activation values. These pseudo-pixels were used to train our localizer network, which helps the drone identify objects of interest. Moreover, training with only a few pseudo-pixels encourages the network to explore different parts of the object [3, 4]. This training methodology eliminates the need for human experts to annotate objects of interest in individual images.
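
The selection and sampling step can be sketched the same way. Here the classifier, the target class index, and the sample counts are placeholders, and pixels are drawn uniformly for brevity, whereas our method samples them according to the map's activation values.

```python
import torch

def select_and_sample(classifier, proposals, target_class, top_n=5, k=10):
    """Score each perturbed image, keep the top-N maps, then sample
    a few foreground/background pseudo-pixels from one retained map."""
    with torch.no_grad():
        # Classifier confidence for the class of interest on each perturbed image
        scores = torch.stack([
            classifier(perturbed).softmax(dim=-1)[0, target_class]
            for _, perturbed in proposals
        ])
    keep = scores.topk(top_n).indices            # best-scoring proposals
    mask = proposals[keep[0]][0].squeeze()       # pick one retained binary map

    fg = mask.nonzero()                          # candidate foreground pixels
    bg = (1 - mask).nonzero()                    # candidate background pixels
    fg_pix = fg[torch.randperm(len(fg))[:k]]     # a few foreground seeds
    bg_pix = bg[torch.randperm(len(bg))[:k]]     # a few background seeds
    return fg_pix, bg_pix
```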

Figure 2: System diagram

Results and Conclusion

Our method achieved state-of-the-art performance in tower localization tasks when compared to baseline methods, as demonstrated in the figure below. Furthermore, our model can effectively reduce noise and is capable of generating maps with sharper boundaries compared to the attention maps used to harvest pseudo-labels in our localizer training. In addition to these results, we also conducted experiments with the CUB-200-2011 dataset and demonstrated that our model adapts very well to other domains for localizing an object of interest.

Figure 3: Results of our model on TelDrone dataset

Additional Information

Extended results, analysis and experimental setting can be found in our paper: Discriminative Sampling of Proposals in Self-Supervised Transformers for Weakly Supervised Object Localization. The code to reproduce our results is available online and free of charge at: https://github.com/shakeebmurtaza/dips.

For more information, please refer to the original article.

About the authors
Shakeeb Murtaza is a PhD student at the ÉTS LIVIA laboratory, working on object localization with limited supervision. His primary research focus is the development of weakly supervised methods for object localization.
Soufiane Belharbi is a post-doctoral fellow at the ÉTS LIVIA Laboratory, in collaboration with the McCaffrey Laboratory/GCRC McGill. He is working on neural network training with weak supervision.
Marco Pedersoli is a professor in the Systems Engineering Department at ÉTS. His research focuses on visual object detection and pose estimation, weakly and semi-supervised learning, and convolutional and recurrent neural networks.
Aydin Sarraf is a senior data scientist at Ericsson Canada, GAIA Montreal. He received his PhD in pure mathematics from the University of New Brunswick in 2014. His research interests include machine learning, computer vision, and telecommunications.
Eric Granger is a professor in the Systems Engineering Department at ÉTS. His research focuses on machine learning, pattern recognition, computer vision, information fusion, and adaptive and intelligent systems, with applications in biometrics, affective computing, medical imaging, and video surveillance.