U-FusionNet-ResNet50+SENet: a fish behavior recognition model based on multi-level fusion of sound and vision
Abstract: Single-modal fish behavior recognition suffers from low accuracy and recall under complex conditions such as dim light and acoustic or visual noise. To address this problem, a fish behavior recognition model based on multi-level fusion of sound and visual features, U-FusionNet-ResNet50+SENet, is proposed. A ResNet50 model extracts the visual modal features, and an MFCC+ResNet50 model extracts the sound modal features. On this basis, a U-shaped fusion architecture is designed so that visual and sound features of different dimensions interact fully and are fused at every stage of feature extraction. Finally, SENet is introduced to form a feature fusion network that attends to channel information. The effectiveness of the algorithm was verified in comparative experiments on synthetic, noise-added multimodal fish behavior data.
The results showed that U-FusionNet-ResNet50+SENet reached a recognition accuracy of 93.71%, an F1 score of 93.43% and a recall of 92.56%. Compared with the Intermediate-feature-level deep model, an existing model with relatively good performance, recall, F1 score and accuracy increased by 2.35%, 3.45% and 3.48%, respectively. These results indicate that the proposed U-FusionNet-ResNet50+SENet method can effectively solve the problem of low accuracy in single-modal fish behavior recognition and improve overall recognition performance. It can recognize behaviors such as swimming and feeding under complex conditions, and offers new ideas and methods for research on fish behavior recognition under real production conditions.
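To make the channel-attention fusion step more concrete, the following is a minimal NumPy sketch of one fusion stage: visual and sound feature maps are concatenated along the channel axis and then reweighted by a squeeze-and-excitation gate. It is an illustration only, not the authors' implementation. The function names (`se_gate`, `fuse_stage`), the feature-map shapes, the reduction ratio of 4 and the random weights are all assumptions made for this example; the abstract does not specify how features are combined at each stage.

```python
import numpy as np


def se_gate(features, w1, w2):
    """Squeeze-and-excitation reweighting of a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r), where r is an
    assumed reduction ratio. Returns the reweighted map and the gate.
    """
    squeezed = features.mean(axis=(1, 2))         # squeeze: global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)       # excitation, first layer + ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # second layer + sigmoid -> (C,)
    return features * gate[:, None, None], gate   # scale every channel by its gate


def fuse_stage(visual, audio, w1, w2):
    """Hypothetical single fusion stage: channel concatenation, then SE gating."""
    fused = np.concatenate([visual, audio], axis=0)
    return se_gate(fused, w1, w2)


# Toy inputs standing in for same-stage feature maps of the two branches.
rng = np.random.default_rng(0)
visual = rng.standard_normal((8, 4, 4))   # assumed visual features (C=8)
audio = rng.standard_normal((8, 4, 4))    # assumed MFCC-derived features (C=8)

channels, reduction = 16, 4               # assumed sizes, not from the paper
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1

fused, gate = fuse_stage(visual, audio, w1, w2)
print(fused.shape, gate.shape)
```

In a trained network the two weight matrices would be learned, so the gate could down-weight channels from whichever modality is noisier for a given sample; here they are random and only the shapes and data flow are meaningful.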
Keywords:
- behavior recognition
- deep learning
- multimodal fusion
- U-FusionNet
- ResNet50
- SENet