A study led by Shailja of the Geographic Information System (GIS) Cell at Motilal Nehru National Institute of Technology Allahabad, Prayagraj, India, offers a new approach to analyzing and categorizing rooftop structures for urban planning and smart city development. Published in the ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS stands for the International Society for Photogrammetry and Remote Sensing), the research applies a transformer-based deep learning model, the Swin Transformer, to the problem of rooftop classification.
Rooftop type classification is a critical process that involves identifying and categorizing the structural geometry of building roofs using geospatial data. This information is invaluable for urban analysis, aiding applications such as 3D city modeling, solar potential estimation, infrastructure planning, and post-disaster damage assessment. The challenge, however, lies in handling the intricacies of complex roof shapes, small or similar-looking structures, and variations in roof types, especially in areas with unplanned construction.
Shailja’s research proposes the Swin Transformer model to address these issues. The model was trained on orthophoto-derived GeoTIFF images, categorizing roofs into four types: flat, gable, complex, and bug. Images were resized to 256×256 pixels and processed in batches of 128. The dataset was split into 2528 training images, 544 testing images, and 545 validation images.
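To make the reported setup concrete, the short Python sketch below works out what the quoted numbers imply: the share of images in each split and the number of batches in one pass over the training set. Only the image counts, batch size, and image size come from the article. The helper names are illustrative, and keeping the final partial batch (rather than dropping it) is an assumption, since the article does not say how the last batch was handled.

```python
import math

# Figures quoted in the article.
SPLIT = {"train": 2528, "test": 544, "val": 545}
BATCH_SIZE = 128
IMAGE_SIZE = (256, 256)  # width x height in pixels after resizing


def split_fractions(split):
    """Return each subset's share of the whole dataset."""
    total = sum(split.values())
    return {name: count / total for name, count in split.items()}


def batches_per_epoch(num_images, batch_size):
    """Number of batches in one epoch, keeping the final partial batch
    (an assumption; the article does not state this)."""
    return math.ceil(num_images / batch_size)


total_images = sum(SPLIT.values())
fractions = split_fractions(SPLIT)
train_batches = batches_per_epoch(SPLIT["train"], BATCH_SIZE)
# Size of the final, smaller batch under the keep-last-batch assumption.
last_batch_size = SPLIT["train"] - (train_batches - 1) * BATCH_SIZE

if __name__ == "__main__":
    print(f"total images: {total_images}")
    for name, frac in fractions.items():
        print(f"{name}: {frac:.1%}")
    print(f"training batches per epoch: {train_batches}")
    print(f"last training batch size: {last_batch_size}")
```

Under these assumptions the split works out to roughly 70/15/15, a common convention for image classification datasets of this size.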
The results are promising. The Swin Transformer achieved an overall test accuracy of about 75%. It performed best on the gable class, with an F1-score of 85.23%, followed by the complex class with an F1-score of 70.75%. The flat and bug classes showed moderate performance because of lower recall, and the authors report that adding early stopping and a learning rate scheduler shifted precision for these classes while leaving overall accuracy nearly unchanged. “The Swin Transformer showed improved precision for bugs from 60.94% to 66.67% and flat from 78.41% to 66.93%, while maintaining a comparable overall accuracy of 74.63% and enhancing class balance in predictions,” Shailja explained. As quoted, precision rises for the bug class but falls for the flat class, so the clear gain is for the bug class and for the balance of predictions across classes.
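The relationship between the metrics mentioned above can be illustrated with a minimal Python sketch that derives per-class precision, recall, and F1-score, plus overall accuracy, from a confusion matrix. The class names follow the article, but the counts in the matrix are invented purely for illustration and are not the study's results. The function name is likewise hypothetical.

```python
CLASSES = ["flat", "gable", "complex", "bug"]

# HYPOTHETICAL confusion matrix: rows are true classes, columns are
# predicted classes. These counts are made up for illustration only.
CONFUSION = [
    [6, 2, 2, 0],  # true flat
    [1, 8, 1, 0],  # true gable
    [1, 1, 7, 1],  # true complex
    [0, 0, 2, 3],  # true bug
]


def per_class_metrics(matrix, classes):
    """Compute precision, recall, and F1 for each class."""
    results = {}
    for i, name in enumerate(classes):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[name] = {"precision": precision, "recall": recall, "f1": f1}
    return results


metrics = per_class_metrics(CONFUSION, CLASSES)
correct = sum(CONFUSION[i][i] for i in range(len(CLASSES)))
accuracy = correct / sum(sum(row) for row in CONFUSION)

if __name__ == "__main__":
    for name, m in metrics.items():
        print(f"{name}: P={m['precision']:.2%} R={m['recall']:.2%} F1={m['f1']:.2%}")
    print(f"overall accuracy: {accuracy:.2%}")
```

Because F1 is a harmonic mean, a class with reasonable precision but low recall still ends up with a modest F1-score, which is consistent with the article's description of the flat and bug classes.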
The research has practical implications, particularly for the energy sector. Accurate rooftop classification can improve solar potential estimation, because roof geometry affects how much usable area is available for panels. As cities continue to grow, the need for precise urban analysis becomes more pressing. The same technology can also support post-disaster damage assessment by providing data for recovery and reconstruction efforts.
Shailja’s work not only introduces a robust model for rooftop classification but also sets the stage for future advancements in geospatial analysis and smart city development. By distinguishing roofs from other urban features, this technology can contribute to more accurate and comprehensive urban planning, ultimately shaping the cities of tomorrow.
Looking ahead, the study points to applications ranging from energy planning to disaster response. It also illustrates how deep learning can be applied to complex urban problems, in support of smarter, more resilient cities.

