DOI: https://doie.org/10.10399/JBSE.2025132962
Indira, Dr. Sampath A K
GoogLeNet, VGG16, Transformer model, Few-Shot Learning, Sparsemax Attention, Adaptive FOSSA Optimization, SHAP.
The widespread adoption of Internet of Things (IoT) devices in critical domains calls for device type identification systems that are efficient, accurate, and interpretable. To address the limitations of conventional models in resource-constrained environments, we propose a Lightweight and Explainable Ensemble Deep Learning Framework for IoT device type identification. The framework integrates three distinct models: GoogLeNet for spatial feature extraction, VGG16 for deep visual representation, and a Transformer for capturing long-range dependencies. Their outputs are combined through a dynamic confidence-based fusion strategy that adaptively weights each prediction according to model confidence. Model pruning and quantization are applied to reduce complexity while maintaining performance. A Few-Shot Learning Module is triggered when prediction confidence is low, enabling classification of rare or unseen device types from minimal data. For enhanced interpretability, the framework incorporates Sparsemax Attention and Attribution Loss to generate focused, label-aligned attention maps, and employs SHAP (SHapley Additive exPlanations) to provide both global and local feature attributions, improving transparency and trust in predictions. The ensemble’s hyperparameters are tuned using Adaptive FOSSA Optimization to accelerate convergence and enhance predictive performance. Implemented in Python, the proposed model outperforms existing methods, achieving an accuracy of 0.9899, a precision of 0.9903, a false negative rate (FNR) of 0.0062, and a false positive rate (FPR) of 0.0071, which makes it well suited to IoT deployments.
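As an illustration of the confidence-based fusion and the low-confidence trigger described above, the following minimal Python sketch weights each model's class probabilities by that model's top probability and flags a sample for the few-shot module when the fused confidence falls below a threshold. The function name `fuse_predictions`, the use of the maximum class probability as the confidence score, and the threshold value 0.6 are assumptions made for illustration only; they are not taken from the framework itself, whose exact fusion rule may differ.

```python
from typing import List, Sequence, Tuple


def fuse_predictions(
    model_probs: Sequence[Sequence[float]],
    low_confidence_threshold: float = 0.6,  # hypothetical value, not from the paper
) -> Tuple[int, List[float], bool]:
    """Fuse per-model class probabilities, weighting each model by its confidence.

    Returns the fused label, the fused probability vector, and a flag that
    indicates whether the sample should be routed to a few-shot fallback.
    """
    # Assumption: a model's confidence is its highest class probability.
    confidences = [max(p) for p in model_probs]
    total = sum(confidences)
    weights = [c / total for c in confidences]

    # Weighted average of the probability vectors, one entry per class.
    num_classes = len(model_probs[0])
    fused = [
        sum(w * p[k] for w, p in zip(weights, model_probs))
        for k in range(num_classes)
    ]

    label = max(range(num_classes), key=fused.__getitem__)
    needs_few_shot = fused[label] < low_confidence_threshold
    return label, fused, needs_few_shot


# Example with three hypothetical model outputs over three device classes.
example_probs = [
    [0.90, 0.05, 0.05],  # e.g. GoogLeNet branch
    [0.60, 0.30, 0.10],  # e.g. VGG16 branch
    [0.20, 0.50, 0.30],  # e.g. Transformer branch
]
example_label, example_fused, example_flag = fuse_predictions(example_probs)
```

In this sketch the most confident branch contributes the largest weight, so a single uncertain model cannot dominate the fused decision, while a sample on which all branches are uncertain is flagged for the few-shot path.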