A new intelligent diagnosis method of depression based on audio signals

Authors: 辛逸男, 吴鹏飞, 刘欣阳, 刘志宽
Affiliation: College of Biomedical Engineering, South-Central Minzu University, Wuhan 430074, China
Corresponding author: 张莉. E-mail: zhangli1996@163.com
Keywords: intelligent diagnosis of depression; short-term features; feature combination; long-term features; random forest algorithm
CLC number: R318.04
Year·Volume·Issue (Pages): 2023·42·1 (38-44)
Abstract:

Objective To propose a new machine learning diagnosis method based on speech features, so as to realize intelligent clinical diagnosis of depression. Methods Speech recordings of depressed patients and healthy controls were used as the signal source. Short-term and long-term speech features were combined: the short-term features were first discretized and then turned into combined features by independent combination and by co-occurrence, and the random forest and extreme gradient boosting algorithms were used for classification and evaluation. Results Compared with short-term features, long-term features, and a deep learning approach, the combined features achieved absolute improvements of 21%, 14%, and 14% in F1 score, and of 36%, 29%, and 7% in sensitivity for the non-depressed class. Conclusions The feature combination method can classify the degree of depression well from speech segments.
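To make the pipeline in the Methods more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation: librosa MFCCs stand in for the paper's short-term feature set, equal-width binning stands in for its discretization, and per-recording histograms of single bins and of bin pairs are one possible reading of the "independent combination" and "co-occurrence" features. File paths and labels are placeholders; a random forest stands in for the classification stage, and xgboost.XGBClassifier could be substituted for the extreme gradient boosting variant.

```python
# Illustrative sketch only: MFCCs, equal-width binning and bin histograms are
# assumptions standing in for the paper's short-term features, discretization,
# and "independent combination" / "co-occurrence" feature construction.
from itertools import combinations

import numpy as np
import librosa  # assumed available for frame-level (short-term) feature extraction
from sklearn.ensemble import RandomForestClassifier


def short_term_features(wav_path, n_mfcc=13):
    """Frame-level MFCCs as a stand-in for the short-term feature set."""
    y, sr = librosa.load(wav_path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T  # (n_frames, n_mfcc)


def combined_long_term_features(frames, n_bins=4):
    """Discretize each short-term feature and pool over the whole recording."""
    # Equal-width binning of each feature column into n_bins levels.
    lo, hi = frames.min(axis=0), frames.max(axis=0)
    bins = np.clip(((frames - lo) / (hi - lo + 1e-9) * n_bins).astype(int),
                   0, n_bins - 1)

    parts = []
    # "Independent combination": histogram of each feature's bins over the recording.
    for j in range(bins.shape[1]):
        parts.append(np.bincount(bins[:, j], minlength=n_bins) / len(bins))
    # "Co-occurrence": joint histogram of bin pairs for every pair of features.
    for i, j in combinations(range(bins.shape[1]), 2):
        joint = bins[:, i] * n_bins + bins[:, j]
        parts.append(np.bincount(joint, minlength=n_bins * n_bins) / len(bins))
    return np.concatenate(parts)


if __name__ == "__main__":
    # Placeholder corpus: paths to labeled recordings (0 = non-depressed, 1 = depressed).
    wav_paths, labels = ["subject01.wav", "subject02.wav"], [0, 1]
    X = np.stack([combined_long_term_features(short_term_features(p)) for p in wav_paths])
    y = np.asarray(labels)
    # Random forest classifier; xgboost.XGBClassifier could be used analogously.
    clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
    print(clf.predict(X))
```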

