2023, 38(2): 348-356. doi: 10.16535/j.cnki.dlhyxb.2022-307
Keywords: behavior recognition, deep learning, multimodal fusion, U-FusionNet, ResNet50, SENet
To address the low accuracy and recall of single-modality fish behavior recognition under complex conditions such as dim lighting and acoustic and visual noise, a fish behavior recognition model based on multi-level fusion of sound and visual features, U-FusionNet-ResNet50+SENet, was proposed. The method uses a ResNet50 model to extract visual-modality features and an MFCC+ResNet50 model to extract sound-modality features. On this basis, a U-shaped fusion architecture was designed so that visual and sound features of different dimensions interact fully and are fused at every stage of feature extraction. Finally, SENet was introduced to form a feature fusion network that attends to channel information, and the effectiveness of the algorithm was verified in comparative experiments on synthetic, noise-added multimodal fish behavior data. The results showed that U-FusionNet-ResNet50+SENet achieved an accuracy of 93.71%, an F1 score of 93.43% and a recall of 92.56% for fish behavior recognition. Compared with the Intermediate-feature-level deep model, the better-performing of the existing models, recall, F1 score and accuracy improved by 2.35%, 3.45% and 3.48%, respectively. These findings indicate that the proposed U-FusionNet-ResNet50+SENet method effectively addresses the low accuracy of single-modality fish behavior recognition, improves overall recognition performance, can identify behaviors such as swimming and feeding under complex conditions, and offers a new approach for fish behavior recognition under real production conditions.
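The channel-attention step can be illustrated with a minimal sketch of a generic squeeze-and-excitation gate applied to concatenated visual and sound channels. This is not the authors' implementation: the abstract does not give layer sizes, fusion points, or weights, so the channel counts, the reduction to 2 hidden units, the random weights, and the function names `se_gate` and `fuse_and_gate` are all assumptions made for illustration, and biases are omitted.

```python
import math
import random


def se_gate(feature_maps, w_reduce, w_expand):
    """Generic squeeze-and-excitation over C channel maps (each a flat list of floats)."""
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    # Excitation, step 2: expand back to C channels, then a sigmoid maps to (0, 1).
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: reweight every channel by its gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]
    return scaled, gates


def fuse_and_gate(visual, audio, w_reduce, w_expand):
    """Hypothetical fusion step: concatenate the two modalities channel-wise, then gate."""
    fused = visual + audio  # list concatenation = channel-wise concatenation
    return se_gate(fused, w_reduce, w_expand)


# Toy inputs (illustrative only, not data from the paper):
# 4 visual channels and 4 sound channels, each with 16 spatial positions.
rng = random.Random(0)
visual = [[rng.random() for _ in range(16)] for _ in range(4)]
audio = [[rng.random() for _ in range(16)] for _ in range(4)]

channels, reduced = 8, 2  # assumed sizes: 8 fused channels, bottleneck of 2
w_reduce = [[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(reduced)]
w_expand = [[rng.uniform(-1, 1) for _ in range(reduced)] for _ in range(channels)]

scaled, gates = fuse_and_gate(visual, audio, w_reduce, w_expand)
print([round(g, 3) for g in gates])  # one attention weight per fused channel
```

Each gate is a single weight per channel, so channels from either modality that carry mostly noise can in principle be suppressed after training while informative ones are kept. In a trained network the two weight matrices would be learned rather than sampled at random.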