Journal of Dalian Ocean University, 2023, 38(2): 348-356. doi: 10.16535/j.cnki.dlhyxb.2022-307

A fish behavior recognition model based on multi-level fusion of sound and visual features: U-FusionNet-ResNet50+SENet

1. Key Laboratory of Marine Information Technology of Liaoning Province, College of Information Engineering, Dalian Ocean University, Dalian 116023, China;

2. Key Laboratory of Environment Controlled Aquaculture (Dalian Ocean University), Ministry of Education, Dalian 116023, China;

3. College of Fisheries and Life Science, Dalian Ocean University, Dalian 116023, China

Received Date: 2022-10-13

Funding: Key Scientific Research Project of the Education Department of Liaoning Province (LJKZ0729); National Natural Science Foundation of China (31972846)

Keywords: behavior recognition, deep learning, multimodal fusion, U-FusionNet, ResNet50, SENet

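Abstract

To address the low accuracy and recall of single-modality fish behavior recognition under complex conditions such as dim lighting and acoustic and visual noise, this study proposes U-FusionNet-ResNet50+SENet, a fish behavior recognition model based on multi-level fusion of sound and visual features. The method extracts visual features with a ResNet50 model and acoustic features with an MFCC+ResNet50 model. On this basis, a U-shaped fusion architecture is designed so that fish visual and acoustic features of different dimensions interact fully and are fused at every stage of feature extraction. Finally, SENet is introduced to form a feature fusion network that attends to channel information. The effectiveness of the algorithm was verified in comparative experiments on synthetic, noise-added multimodal fish behavior data. The results show that U-FusionNet-ResNet50+SENet achieved an accuracy of 93.71%, an F1 score of 93.43%, and a recall of 92.56% for fish behavior recognition. Compared with the Intermediate-feature-level deep model, an existing model with relatively good performance, recall, F1 score, and accuracy improved by 2.35%, 3.45%, and 3.48%, respectively. These findings indicate that the proposed U-FusionNet-ResNet50+SENet method effectively addresses the low accuracy of single-modality fish behavior recognition, improves overall recognition performance, and can identify behaviors such as swimming and feeding under complex conditions, offering a new approach for fish behavior recognition under real production conditions.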
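The channel-attention step mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it shows only a generic squeeze-and-excitation (SE) gate applied to visual and acoustic feature maps concatenated along the channel axis, at a single stage. The function names, the tiny tensor sizes, the reduction to 4 hidden units, and the random weights are all assumptions made for illustration, and the U-shaped multi-stage fusion itself is not reproduced here because the abstract does not specify its layers.

```python
import math
import random


def squeeze(feature_maps):
    """Global average pooling: one scalar per channel (channel-first layout)."""
    return [sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
            for fm in feature_maps]


def dense(vector, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def se_gate(channel_means, w1, b1, w2, b2):
    """Excitation: FC -> ReLU -> FC -> sigmoid, giving one gate per channel."""
    hidden = [max(0.0, h) for h in dense(channel_means, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]


def se_fuse(visual, audio, w1, b1, w2, b2):
    """Concatenate both modalities along channels, then rescale each channel."""
    fused = visual + audio  # assumes both modalities share the same H x W
    gates = se_gate(squeeze(fused), w1, b1, w2, b2)
    reweighted = [[[g * v for v in row] for row in fm]
                  for g, fm in zip(gates, fused)]
    return reweighted, gates


def rand_map(height, width):
    """Hypothetical stand-in for a real feature map."""
    return [[random.random() for _ in range(width)] for _ in range(height)]


random.seed(0)
visual = [rand_map(2, 2) for _ in range(4)]  # 4 visual channels (assumed)
audio = [rand_map(2, 2) for _ in range(4)]   # 4 acoustic channels (assumed)
channels, hidden_units = 8, 4                # reduction ratio 2 (assumed)
w1 = [[random.uniform(-0.5, 0.5) for _ in range(channels)]
      for _ in range(hidden_units)]
b1 = [0.0] * hidden_units
w2 = [[random.uniform(-0.5, 0.5) for _ in range(hidden_units)]
      for _ in range(channels)]
b2 = [0.0] * channels

reweighted, gates = se_fuse(visual, audio, w1, b1, w2, b2)
print("per-channel gates:", [round(g, 3) for g in gates])
```

Each gate lies strictly between 0 and 1, so the network can damp channels from a noisy modality while keeping informative ones. In a trained model the weights would be learned rather than drawn at random, and a multi-level design such as the one the abstract describes would presumably apply fusion at several feature-extraction stages rather than once.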


