2023, 38(2): 348-356. doi: 10.16535/j.cnki.dlhyxb.2022-307
Keywords: behavior recognition, deep learning, multimodal fusion, U-FusionNet, ResNet50, SENet
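To address the low accuracy and recall of single-modality fish behavior recognition under complex conditions such as dim lighting and acoustic and visual noise, a fish behavior recognition model based on multi-level fusion of acoustic and visual features, U-FusionNet-ResNet50+SENet, is proposed. The method uses a ResNet50 model to extract visual-modality features and an MFCC+ResNet50 model to extract acoustic-modality features. On this basis, a U-shaped fusion architecture is designed so that fish visual and acoustic features of different dimensions interact fully, with feature fusion performed at every stage of feature extraction. Finally, SENet is introduced to form a feature fusion network that attends to channel information. The effectiveness of the algorithm is verified through comparative experiments on synthetic, noise-added multimodal fish behavior data. The results show that U-FusionNet-ResNet50+SENet achieves an accuracy of 93.71%, an F1 score of 93.43%, and a recall of 92.56% for fish behavior recognition; compared with the better-performing existing model, the Intermediate-feature-level deep model, recall, F1 score, and accuracy improve by 2.35%, 3.45%, and 3.48%, respectively. The study indicates that the proposed U-FusionNet-ResNet50+SENet method effectively addresses the low accuracy of single-modality fish behavior recognition, improves overall recognition performance, can recognize behaviors such as swimming and feeding under complex conditions, and offers a new approach for fish behavior recognition research under real production conditions.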
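The channel-attention step named in the abstract can be illustrated with a minimal sketch of a squeeze-and-excitation (SE) gate applied to fused audio-visual features. This is not the authors' U-FusionNet implementation: the abstract does not say how the two modalities are combined at each stage, so plain channel concatenation is assumed here, and the function names (`se_gate`, `fuse_and_gate`), the reduction ratio of 4, the tensor sizes, and the random untrained weights are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def se_gate(channels, w_reduce, w_expand):
    """Squeeze-and-excitation over a list of channels.

    channels: list of C channels, each a flat list of spatial activations.
    w_reduce: (C // r) x C weights of the bottleneck layer.
    w_expand: C x (C // r) weights of the expansion layer.
    Returns the rescaled channels and the per-channel gate values.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]
    gates = [sigmoid(z) for z in matvec(w_expand, hidden)]
    # Scale: each channel is multiplied by its gate in (0, 1).
    scaled = [[a * g for a in ch] for ch, g in zip(channels, gates)]
    return scaled, gates


def fuse_and_gate(visual, audio, w_reduce, w_expand):
    """Assumed fusion: concatenate modalities along channels, then SE-gate."""
    return se_gate(visual + audio, w_reduce, w_expand)


# Toy data: 4 visual and 4 audio channels with 6 spatial positions each.
rng = random.Random(0)
visual = [[rng.random() for _ in range(6)] for _ in range(4)]
audio = [[rng.random() for _ in range(6)] for _ in range(4)]

# Untrained random weights; reduction ratio r = 4, so 8 channels -> 2 hidden.
w_reduce = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(2)]
w_expand = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(8)]

gated, weights = fuse_and_gate(visual, audio, w_reduce, w_expand)
print("per-channel gates:", [round(w, 3) for w in weights])
```

In a trained network the gates would be learned, so that channels carrying noisy information from either modality are suppressed and informative ones are kept; here they only show the data flow of squeeze, excitation, and scaling.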