Progressive Resolution Training with SWA and Test-Time Augmentation for Robust Road Segmentation in Remote Sensing Imagery

    DOI: https://doie.org/10.10399/JBSE.2026538738

    Nidhi Singh, Aditi Sharma, Divyanshu Chauhan



    Abstract:

    Accurate extraction of road networks from high-resolution satellite imagery is difficult, primarily because of the roads' complex topography and the extreme class imbalance associated with road networks, which typically leads to fragmented road segments. To address these structural challenges, we propose a multi-stage U-Net encoder-decoder architecture, evaluated within a multi-encoder benchmark of several architectures, that is trained with the Focal Tversky Loss to counter the class imbalance and to penalize false negatives more heavily. The methodology employs progressive resolution training, which has been established to improve generalization, and combines it with Stochastic Weight Averaging (SWA) during optimization, which significantly improves late-stage convergence. During inference, the predictions for all geometrically transformed inputs are aggregated through full Test-Time Augmentation (TTA), and a post-processing step applies morphological operations to each prediction to preserve structural continuity and remove noise artifacts. The experimental results show that the proposed framework outperforms all other benchmark systems, and an ablation study confirms that each incrementally added component improves performance over the baseline model. The most effective configuration, EfficientNet-B7, achieves an Intersection over Union (IoU) of 82.99%, a Dice coefficient of 90.68%, and an F2 score of 90.30%, providing highly accurate automated road network extraction with strong geometric consistency.
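
The loss and the three reported metrics can all be written in terms of true-positive, false-positive, and false-negative counts. The sketch below is a minimal NumPy illustration of that relationship, not the authors' implementation. The function names, the weights `alpha=0.7` and `beta=0.3`, and the exponent `gamma=0.75` are assumptions chosen for illustration; the abstract does not state the values used, and some formulations of the Focal Tversky Loss write the exponent as `1/gamma` instead.

```python
import numpy as np


def soft_counts(prob, target):
    """Return (soft) true-positive, false-positive and false-negative counts."""
    prob = np.asarray(prob, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    tp = float(np.sum(prob * target))
    fp = float(np.sum(prob * (1.0 - target)))
    fn = float(np.sum((1.0 - prob) * target))
    return tp, fp, fn


def focal_tversky_loss(prob, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    """Focal Tversky loss, assuming the form (1 - TI) ** gamma.

    alpha weights false negatives and beta weights false positives, so
    alpha > beta penalizes missed road pixels more heavily.
    All default values here are illustrative assumptions.
    """
    tp, fp, fn = soft_counts(prob, target)
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return max(0.0, 1.0 - tversky) ** gamma


def segmentation_metrics(pred_mask, target, eps=1e-7):
    """IoU, Dice and F2 for a binary prediction mask."""
    tp, fp, fn = soft_counts(pred_mask, target)
    iou = (tp + eps) / (tp + fp + fn + eps)
    dice = (2.0 * tp + eps) / (2.0 * tp + fp + fn + eps)
    # F-beta with beta = 2: recall counts four times as much as precision.
    f2 = (5.0 * tp + eps) / (5.0 * tp + 4.0 * fn + fp + eps)
    return {"iou": iou, "dice": dice, "f2": f2}


if __name__ == "__main__":
    target = np.array([[1, 1, 0, 0],
                       [1, 1, 0, 0]])
    pred = np.array([[1, 0, 0, 0],
                     [1, 1, 1, 0]])
    print(segmentation_metrics(pred, target))
    print(focal_tversky_loss(pred, target))
```

With `alpha > beta`, a prediction that misses a road pixel receives a larger loss than one that adds a spurious pixel, which is the bias toward recall that the F2 score also rewards.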


