Automatic Speech Recognition on Non-Pathological Dataset of Urdu Language

Anoshia Imtiaz; Munaf Rashid; Sidra Abid Syed; Hira Zahid; Muzzaffar Iqbal; Akhtar Ali Khan

doi:10.51153/kjcis.v5i2.87

Vol. 5 No. 2 (2022), Articles

Published 2022-07-07

Keywords

Voice dataset, Urdu language, SVM, MFCC, Pitch

How to Cite

Imtiaz, A., Rashid, M., Abid Syed, S., Zahid, H., Iqbal, M., & Khan, A. A. (2022). Automatic Speech Recognition on Non-Pathological Dataset of Urdu Language. KIET Journal of Computing and Information Sciences, 5(2). https://doi.org/10.51153/kjcis.v5i2.87

Abstract

Speech, one of the subsystems of human communication, has strong underlying characteristics and gives each speaker a distinct voice. Voice disorders are abnormal conditions that affect the quality of the voice. Several protocols, including acoustic analysis, can detect clinical voice pathology. Machine learning algorithms and non-invasive systems based on computerized acoustic analysis may play a vital part in the initial detection and tracking of voice disorders, and in the development of effective pathological speech analysis. The aim of this paper is to collect a non-pathological dataset, i.e. a dataset of healthy voices. Two key features, (1) Mel-frequency cepstral coefficients (MFCC) and (2) pitch, are used to generate a final audio clip. A support vector machine (SVM) is used as the classifier to train and test the model, which exhibited reasonably high training and testing accuracies, i.e. 85.886%, a milestone for an Urdu-language dataset.
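The abstract names pitch as one of the two features but does not say how it was extracted. As a minimal sketch, assuming a simple autocorrelation method (a common textbook approach, not necessarily the authors'), pitch for one voiced frame could be estimated as follows. The function names `synth_tone` and `estimate_pitch`, the 8 kHz sample rate, and the 75 to 400 Hz search range are all illustrative assumptions.

```python
import math


def synth_tone(freq_hz, sample_rate=8000, duration_s=0.05):
    """Generate a pure sine tone as a stand-in for a voiced speech frame."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    Only lags that correspond to frequencies between fmin and fmax are
    searched (assumed bounds for adult speech). Returns 0.0 when no
    positive autocorrelation peak is found, e.g. for silence.
    """
    # Remove the DC offset so it does not dominate the correlation.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]

    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(x) - 1)

    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag == 0:
        return 0.0
    return sample_rate / best_lag


if __name__ == "__main__":
    frame = synth_tone(200.0)
    print(estimate_pitch(frame, 8000))
```

In a pipeline like the one the abstract describes, a per-frame pitch value of this kind would typically be combined with MFCC vectors into a feature vector and passed to the SVM classifier; real recordings would also need voicing detection and smoothing, which this sketch omits.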
