A Novel ViT Model with Wavelet Convolution and SLAttention Modules for Underwater Acoustic Target Recognition
2025
Haoran Guo | Biao Wang | Tao Fang | Biao Liu
Underwater acoustic target recognition (UATR) technology plays a significant role in marine exploration, resource development, and national defense security. To address the limitations of existing methods in computational efficiency and recognition performance, this paper proposes WS-ViT, an improved model based on the Vision Transformer (ViT). By introducing a Wavelet Transform Convolution (WTConv) module and a Simplified Linear Attention (SLAttention) module, WS-ViT extracts complex spatiotemporal features effectively, improves classification accuracy, and significantly reduces computational cost. The model is validated on the ShipsEar dataset, where WS-ViT outperforms ResNet18, VGG16, and the classical ViT model in classification accuracy by 7.3%, 4.9%, and 2.1%, respectively. Its training efficiency is also 28.4% higher than that of ViT. These results show that WS-ViT improves UATR performance while maintaining computational efficiency, providing an efficient and accurate approach to underwater acoustic signal processing.
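The abstract does not spell out how SLAttention is built, so the following is only a generic illustration of why linear attention can be cheaper than standard attention, not the paper's module. It is a minimal NumPy sketch that assumes the common `elu(x) + 1` feature map; the names `feature_map`, `linear_attention`, and `quadratic_reference` are hypothetical. The idea is that computing `K^T V` first avoids ever forming the `n x n` attention matrix, so the cost grows linearly rather than quadratically with the number of tokens.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a positive feature map often used in linear attention
    # (an assumption here; the paper's SLAttention may use something else).
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    # q, k: (n, d); v: (n, d_v). Cost is O(n * d * d_v), with no n x n matrix.
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v                 # (d, d_v) summary of keys and values
    k_sum = k_f.sum(axis=0)        # (d,) normaliser term
    return (q_f @ kv) / (q_f @ k_sum)[:, None]

def quadratic_reference(q, k, v):
    # Same computation done the expensive way, via an explicit (n, n) matrix.
    q_f, k_f = feature_map(q), feature_map(k)
    weights = q_f @ k_f.T
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d = 64, 16                      # illustrative sizes, not from the paper
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = linear_attention(q, k, v)
```

Both functions return the same result up to floating-point error; only the order of the matrix products differs, which is where the saving comes from when `n` is much larger than `d`.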
This bibliographic record was provided by the Multidisciplinary Digital Publishing Institute