A comparison of high precision F0 extraction algorithms for sustained vowels
Perturbation analysis of sustained vowel waveforms is used routinely in the clinical evaluation of pathological voices and in monitoring patient progress during treatment. Accurate estimation of voice fundamental frequency (F0) is essential for accurate perturbation analysis. Several algorithms have...
| Published in: | Journal of speech, language, and hearing research Vol. 42; no. 1; pp. 112 - 126 |
|---|---|
| Main Authors: | Parsa, V; Jamieson, D G |
| Format: | Journal Article |
| Language: | English |
| Published: | United States, February 1, 1999 |
| ISSN: | 1092-4388 |
| Abstract | Perturbation analysis of sustained vowel waveforms is used routinely in the clinical evaluation of pathological voices and in monitoring patient progress during treatment. Accurate estimation of voice fundamental frequency (F0) is essential for accurate perturbation analysis. Several algorithms have been proposed for fundamental frequency extraction. To be appropriate for clinical use, a key consideration is that an F0 extraction algorithm be robust to such extraneous factors as the presence of noise and modulations in voice frequency and amplitude that are commonly associated with the voice pathologies under study. This work examines the performance of seven F0 algorithms, based on the average magnitude difference function (AMDF), the input autocorrelation function (AC), the autocorrelation function of the center-clipped signal (ACC), the autocorrelation function of the inverse filtered signal (IFAC), the signal cepstrum (CEP), the Harmonic Product Spectrum (HPS) of the signal, and the waveform matching function (WM) respectively. These algorithms were evaluated using sustained vowel samples collected from normal and pathological subjects. The effect of background noise and of frequency and amplitude modulations on these algorithms was also investigated, using synthetic vowel waveforms. |
|---|---|
| AbstractList | The performance of seven short-term average methods for fundamental frequency estimation was evaluated using synthetic vowels and normal and pathological voice samples obtained from the Massachusetts Eye and Ear Infirmary voice database. Algorithms included the average magnitude difference function, the autocorrelation function, the center-clipped autocorrelation function, the inverse-filtered autocorrelation function, waveform matching, cepstrum, and harmonic product spectrum. A brief description of each algorithm was provided. Synthetic vowel waveform conditions were as follows: (1) fixed fundamental frequency, no shimmer, no background noise; (2) fixed fundamental frequency, no shimmer, varying levels of background noise; (3) fixed fundamental frequency, variable shimmer, no background noise; and (4) variable fundamental frequency, no shimmer, no background noise. Time domain methods were found to produce better fundamental frequency estimates of synthetic vowels than frequency domain techniques, but frequency domain methods responded better to amplitude changes in the input waveform. Best performance for synthetic, normal, and pathological vowels was found with the waveform matching algorithm. 2 Tables, 6 Figures, 15 References. D. Taylor |
| Author | Parsa, V (Hearing Health Care Research Unit, The University of Western Ontario, London, Canada); Jamieson, D G |
| ContentType | Journal Article |
| DOI | 10.1044/jslhr.4201.112 |
| Discipline | Medicine; Languages & Literatures; Social Welfare & Social Work |
| EndPage | 126 |
| ExternalDocumentID | 10025548 |
| Genre | Research Support, Non-U.S. Gov't; Journal Article; Comparative Study |
| ISICitedReferencesCount | 34 |
| ISSN | 1092-4388 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| PMID | 10025548 |
| PageCount | 15 |
| PublicationDate | 1999-02-01 |
| PublicationPlace | United States |
| PublicationTitle | Journal of speech, language, and hearing research |
| PublicationTitleAlternate | J Speech Lang Hear Res |
| PublicationYear | 1999 |
| StartPage | 112 |
| SubjectTerms | Algorithms; Fundamental Frequency; Humans; Models, Biological; Noise; Noise - adverse effects; Phonetics; Speech Acoustics; Speech Pathology; Speech Production Measurement; Speech Synthesis; Voice Disorders - diagnosis; Voice Quality; Vowels |
| Title | A comparison of high precision F0 extraction algorithms for sustained vowels |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/10025548 https://www.proquest.com/docview/69582047 https://www.proquest.com/docview/85527000 |
| Volume | 42 |
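
Two of the time-domain methods named in the abstract, the input autocorrelation function (AC) and the average magnitude difference function (AMDF), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 80 to 400 Hz search range, the 8 kHz sampling rate, and the three-harmonic test tone are all assumptions chosen for illustration, and the sketch resolves F0 only to the nearest integer lag, without the interpolation a high-precision estimator would presumably need.

```python
import math


def synth_vowel(f0, fs, n):
    """Crude periodic test tone: harmonics 1-3 of f0 with 1/k amplitudes.

    An illustrative stand-in, not the synthesizer used in the study.
    """
    return [
        sum(math.sin(2 * math.pi * k * f0 * t / fs) / k for k in (1, 2, 3))
        for t in range(n)
    ]


def lag_range(fs, fmin, fmax):
    """Candidate pitch periods, in samples, for the assumed F0 search range."""
    return int(fs / fmax), int(fs / fmin)


def autocorr_f0(x, fs, fmin=80.0, fmax=400.0):
    """AC estimate: the lag that maximizes the raw autocorrelation sum."""
    lo, hi = lag_range(fs, fmin, fmax)
    n = len(x) - hi  # same number of terms for every candidate lag
    best = max(
        range(lo, hi + 1),
        key=lambda lag: sum(x[i] * x[i + lag] for i in range(n)),
    )
    return fs / best


def amdf_f0(x, fs, fmin=80.0, fmax=400.0):
    """AMDF estimate: the lag that minimizes the summed absolute difference."""
    lo, hi = lag_range(fs, fmin, fmax)
    n = len(x) - hi
    best = min(
        range(lo, hi + 1),
        key=lambda lag: sum(abs(x[i] - x[i + lag]) for i in range(n)),
    )
    return fs / best


fs = 8000                        # assumed sampling rate in Hz
x = synth_vowel(100.0, fs, 900)  # clean 100 Hz tone, no jitter, shimmer, or noise
print(autocorr_f0(x, fs), amdf_f0(x, fs))
```

On this clean, strictly periodic input both estimators should recover the 100 Hz fundamental. The conditions the abstract describes (background noise, shimmer, and variable fundamental frequency) are where such methods would be expected to diverge, and the sketch does not attempt to reproduce them.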