Pig state audio recognition based on deep neural network and hidden Markov model

PENG Shuo; LIU Dongyang; SHI Guolong; LI Guangbo; MU Jingsheng; GU Lichuan; JIAO Jun

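PENG Shuo¹, LIU Dongyang¹, SHI Guolong¹, LI Guangbo¹, MU Jingsheng², GU Lichuan¹, JIAO Jun¹*
1. College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China; 2. Mengcheng Jinghui-Meng Agricultural Science and Technology Development Co., Ltd., Bozhou 233524, Anhui, China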

Abstract:

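Traditional audio recognition methods achieve low recognition rates on pig audio signals. To address this, deep neural network and hidden Markov model theory were used as the basis for recognizing pig audio signals. The recognition targets were the eating, estrus, howling and grunting sounds of Landrace pigs and the panting sounds of sick Landrace pigs. The signals were preprocessed with Kalman filtering and an improved EMD-TEO cepstral distance endpoint detection algorithm, and the extracted 39-dimensional Mel-frequency cepstral coefficients (MFCC) served as the data set for network training and recognition, from which a pig state audio recognition model based on a deep neural network and hidden Markov model was built. The experimental results showed that: 1) a deep neural network-hidden Markov model (DNN-HMM) with 5 HMM hidden states and 3 DNN hidden layers of 128 nodes each achieved recognition rates of 70%, 95%, 75%, 80% and 95% for the five pig state sounds (eating, howling, grunting, estrus and sick-pig panting, respectively), with an overall recognition rate of 83%; 2) compared with the traditional Gaussian mixture model-hidden Markov model (GMM-HMM), DNN-HMM raised the recognition rates for the corresponding sounds by 5%, 5%, 15%, 30% and 30%, respectively, and the overall recognition rate by 17%; 3) the DNN-HMM model recognized all 5 types of pig audio signal well. The DNN-HMM pig audio recognition model recognizes pig sounds in different states with higher accuracy and greater reliability.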
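The abstract gives the feature size (39 dimensions) but not how it is composed. A common convention, assumed here and not confirmed by the source, is 13 static MFCCs plus their first-order (delta) and second-order (delta-delta) differences. The sketch below shows only that stacking step on toy data; the function names `deltas` and `stack_39` and the regression window of 2 frames are illustrative choices, and the actual MFCC extraction is omitted.

```python
def deltas(frames, width=2):
    """Regression-based delta features; edge frames are padded by repetition."""
    n_frames = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for d in range(dim):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_39(static):
    """Concatenate 13 static coefficients with their deltas and delta-deltas."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Toy input (not real MFCCs): 6 frames of 13 coefficients rising by 1.0 per frame.
static = [[float(t)] * 13 for t in range(6)]
features = stack_39(static)
```

Each entry of `features` is one 39-dimensional frame vector of the kind that would be fed to the network; for the linear ramp used here, the delta of an interior frame equals the slope.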

Keywords: pig; MFCC; Kalman filter; DNN-HMM; audio signal recognition

DOI: 10.11841/j.issn.1007-4333.2022.06.16

Classification number:

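Funding: Anhui Province Major Science and Technology Research Project (16030701092); Anhui Province 2019 Major Science and Technology Special Project (201903a06020009)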

Pig state audio recognition based on deep neural network and hidden Markov model

PENG Shuo¹, LIU Dongyang¹, SHI Guolong¹, LI Guangbo¹, MU Jingsheng², GU Lichuan¹, JIAO Jun¹*

1. College of Information and Computer Science, Anhui Agricultural University, Hefei 230036, China; 2. Mengcheng Jinghui-Meng Agricultural Science and Technology Development Co., Ltd., Bozhou 233524, China

Abstract:

To address the low recognition rate of traditional audio recognition methods on pig audio signals, deep neural network and hidden Markov model theory were used as the basis for pig audio signal recognition. The eating, estrus, howling and grunting sounds of Landrace pigs and the panting sound of sick Landrace pigs were used as recognition targets. Kalman filtering and an improved EMD-TEO cepstral distance endpoint detection algorithm were adopted to preprocess the pig audio signals, and 39-dimensional Mel-frequency cepstral coefficients (MFCC) were extracted as the data set for network training and recognition. A pig state audio recognition model based on a deep neural network and hidden Markov model was constructed. The experimental results showed that: 1) With five HMM hidden states and three DNN hidden layers of 128 nodes each, the deep neural network-hidden Markov model (DNN-HMM) achieved recognition rates of 70%, 95%, 75%, 80% and 95% for the eating, howling, grunting, estrus and sick-pig panting sounds, respectively, and an overall recognition rate of 83%; 2) Compared with the traditional Gaussian mixture model-hidden Markov model (GMM-HMM), DNN-HMM improved the recognition rates of the corresponding sounds by 5%, 5%, 15%, 30% and 30%, respectively, and the overall recognition rate by 17%; 3) The DNN-HMM model recognized all 5 types of pig audio signal well. The DNN-HMM pig audio recognition model recognizes pig audio in different states with higher accuracy and greater reliability.
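The reported figures can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the abstract does not state: each class contributes the same number of test clips, so the overall rate is the unweighted mean of the per-class rates, and the reported improvements are percentage-point differences. Under those assumptions it also derives the implied GMM-HMM baseline, which the abstract does not report directly.

```python
# Per-class DNN-HMM recognition rates (%) as reported in the abstract.
dnn_hmm = {"eating": 70, "howling": 95, "grunting": 75, "estrus": 80, "panting": 95}

# Reported gains over GMM-HMM, read here as percentage points (assumption).
gain = {"eating": 5, "howling": 5, "grunting": 15, "estrus": 30, "panting": 30}

# Implied GMM-HMM per-class rates (derived, not reported in the abstract).
gmm_hmm = {name: dnn_hmm[name] - gain[name] for name in dnn_hmm}


def overall(rates):
    """Unweighted mean, valid only if every class has the same number of clips."""
    return sum(rates.values()) / len(rates)


dnn_overall = overall(dnn_hmm)
gmm_overall = overall(gmm_hmm)
overall_gain = dnn_overall - gmm_overall
```

Under these assumptions the mean of the five DNN-HMM rates is 83 and the mean gain is 17 points, which matches the overall figures in the abstract and implies a GMM-HMM overall rate of 66%.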

Key words: pig; MFCC; Kalman filter; DNN-HMM; audio signal recognition

Cite this article: