Article title

Estimation of the Fundamental Frequency of the Speech Signal Compressed by MP3 Algorithm

Publication languages
EN
Abstracts
EN
The paper analyzes the estimation of the fundamental frequency of a real speech signal, obtained by recording a speaker in a real acoustic environment and modeled by the MP3 method. The estimation was performed by the Picking-Peaks algorithm with implemented parametric cubic convolution (PCC) interpolation. The efficiency of PCC was tested for the Catmull-Rom, Greville, and Greville two-parametric kernels. The window that gives the optimal results was chosen on the basis of the mean squared error (MSE).
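The idea of peak picking refined by cubic convolution interpolation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the one-parameter Keys kernel (where alpha = -0.5 gives the Catmull-Rom case), a Hann window, a plain DFT, a grid search for the interpolated maximum, and a synthetic 140 Hz sine in place of MP3-coded speech. All function names (`keys_kernel`, `magnitude_spectrum`, `interpolate`, `estimate_f0`) are hypothetical.

```python
import cmath
import math


def keys_kernel(x, alpha=-0.5):
    """One-parameter cubic convolution kernel; alpha = -0.5 is Catmull-Rom."""
    x = abs(x)
    if x <= 1.0:
        return (alpha + 2.0) * x**3 - (alpha + 3.0) * x**2 + 1.0
    if x < 2.0:
        return alpha * (x**3 - 5.0 * x**2 + 8.0 * x - 4.0)
    return 0.0


def magnitude_spectrum(signal):
    """Magnitude of the DFT (bins 0..N/2) of the Hann-windowed signal."""
    n = len(signal)
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(signal)]
    return [abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def interpolate(mag, x, alpha=-0.5):
    """Spectrum value at fractional bin x from the four nearest bins."""
    base = math.floor(x)
    total = 0.0
    for i in range(base - 1, base + 3):
        if 0 <= i < len(mag):
            total += mag[i] * keys_kernel(x - i, alpha)
    return total


def estimate_f0(signal, fs, alpha=-0.5, steps=200):
    """Return (refined, coarse) frequency estimates in Hz.

    Assumption: the strongest spectral peak is the fundamental, which
    holds for the test sine below but not for speech in general.
    """
    mag = magnitude_spectrum(signal)
    # Coarse step: pick the largest interior bin.
    k = max(range(1, len(mag) - 1), key=lambda i: mag[i])
    # Fine step: grid search on the interpolated spectrum in [k-1, k+1].
    grid = [k - 1.0 + 2.0 * j / steps for j in range(steps + 1)]
    best = max(grid, key=lambda x: interpolate(mag, x, alpha))
    bin_hz = fs / len(signal)
    return best * bin_hz, k * bin_hz


fs, n, f_true = 8000, 256, 140.0
tone = [math.sin(2.0 * math.pi * f_true * i / fs) for i in range(n)]
refined, coarse = estimate_f0(tone, fs)
print(f"coarse: {coarse:.2f} Hz, refined: {refined:.2f} Hz")
```

With a 256-point frame at 8 kHz the bin spacing is 31.25 Hz, so the coarse estimate can miss by up to half a bin; the interpolation step is what narrows that error. Other kernels can be tried by changing `alpha` or by swapping `keys_kernel` for a different kernel function.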
Year
Pages
363-373
Physical description
Bibliography: 41 items, tables, charts.
Authors
  • Technical College, Aleksandra Medvedeva 20, 18000 Nis, Serbia
author
  • Technical Faculty in Bor, University of Belgrade, Vojske Jugoslavije 12, 19210 Bor, Serbia
Bibliography
  • 1. Atal B. (1972), Automatic speaker recognition based on pitch contours, Journal of the Acoustical Society of America, 52, 6, 1687-1697.
  • 2. Avila F., Biscainho L. (2012), Bayesian Restoration of Audio Signals Degraded by Impulsive Noise Modeled as Individual Pulses, IEEE Transactions On Audio, Speech, And Language Processing, 20, 9, 2470-2481.
  • 3. Ayadi M., Kamel M., Karray F. (2011), Survey on speech emotion recognition: Features, classification schemes, and databases, Pattern Recognition, 44, 572-587.
  • 4. Barbancho I., Tardon L., Sammartino S., Barbancho A. (2012), Inharmonicity-Based Method for the Automatic Generation of Guitar Tablature, IEEE Transactions On Audio, Speech, And Language Processing, 20, 6, 1857-1868.
  • 5. Brandenburg K., Stoll G., Dehery Y.F., Johnston J.D., Kerkhof L.V., Schroeder E.F. (1992), The ISO/MPEG Audio Codec: A Generic Standard for Coding of High Quality Digital Audio, 92nd AES Convention, preprint 3336, Vienna.
  • 6. Britanak V. (2011), A survey of efficient MDCT implementations in MP3 audio coding standard: Retrospective and state-of-the-art, Signal Processing, 91, 624-672.
  • 7. Dhar P.K., Echizen I. (2011), Robust FFT Based Watermarking Scheme for Copyright Protection of Digital Audio Data, Seventh International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), 181-184.
  • 8. Fragoulis D., Papaodysseus C., Exarhos M., Roussopoulos G., Panagopoulos T., Kamarotos D. (2006), Automated classification of piano-guitar notes, IEEE Transactions On Audio, Speech, And Language Processing, 14, 3, 1040-1050.
  • 9. Griffin D., Lim J. (1988), Multiband excitation vocoder, IEEE Transactions On Audio, Speech, And Language Processing, 36, 8, 1223-1235.
  • 10. Hacker S. (2000), MP3: The Definitive Guide, O’Reilly & Associates, Sebastopol, CA 95472.
  • 11. Hussain Z.M., Boashash B. (2002), Adaptive instantaneous frequency estimation of multicomponent signals using quadratic time-frequency distributions, IEEE Transactions on Signal Processing, 50, 8, 1866-1876.
  • 12. ISO/IEC (1992), Information Technology – Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s, Part 3: Audio, ISO/IEC JTC1/SC29/WG11 MPEG, International Standard 11172-3 (MPEG-1).
  • 13. ISO/IEC 13818-3 (1994), Information Technology. Generic Coding of Moving Pictures and Associated Audio: Audio, ISO/IEC JTC1/SC29/WG11 MPEG, International Standard 13818-3 (MPEG-2).
  • 14. Joen B., Kang S., Baek S.J., Sung K.M. (2003), Filtering of a Dissonant Frequency Based on Improved Fundamental Frequency Estimation for Speech Enhancement, IEICE Trans. Fundamentals, E86-A, 8, 2063-2064.
  • 15. Kacha F., Benmahammed G.K. (2005), Time-frequency analysis and instantaneous frequency estimation using two-sided linear prediction, IEEE Signal Processing, 85, 491-503.
  • 16. Kang S. (2004), Dissonant frequency filtering technique for improving perceptual quality of noisy speech and husky voice, IEEE Signal Processing, 84, 431-433.
  • 17. Kang S., Kim Y. (2006), A Dissonant Frequency Filtering for Enhanced Clarity of Husky Voice Signals, Lecture Notes Comp Science, 4188, Berlin Springer, 517-522.
  • 18. Kawahara H., Katsuse I., Cheveigne A. (1999), Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds, Speech Communication, 27, 3-4, 187-207.
  • 19. de Cheveigne A., Kawahara H. (2002), YIN, a fundamental frequency estimator for speech and music, Journal of the Acoustical Society of America, 111, 4, 1917-1930.
  • 20. Keys R.G. (1981), Cubic convolution interpolation for digital image processing, IEEE Transactions on Acoustics, Speech, and Signal Processing, 29, 6, 1153-1160.
  • 21. Klapuri A. (2003), Multiple fundamental frequency estimation based on harmonicity and spectral smoothness, IEEE Transactions On Audio, Speech, And Language Processing, 11, 6, 804-816.
  • 22. McCandless M. (1999), The MP3 revolution, IEEE Intelligent Systems and their Applications, 14, 3, 8-9.
  • 23. Meijering E., Unser M. (2003), A Note on Cubic Convolution Interpolation, IEEE Transactions on Image Processing, 12, 4, 447-479.
  • 24. Milivojevic Z., Mirkovic D. (2009), Estimation of the fundamental frequency of the speech signal modeled by the SYMPES method, International Journal of Electronics and Communications (AEU), 63, 200-208.
  • 25. Milivojevic Z., Mirkovic M., Rajkovic P. (2004), Estimating of the fundamental frequency by the using of the parametric cubic convolution interpolation, Proceedings of International Scientific Conference UNITECH ’04, 138-141, Gabrovo, Bulgaria.
  • 26. Milivojevic Z., Brodic D. (2011), Estimation of the Fundamental Frequency of the Speech Signal Compressed by G.723.1 Algorithm Applying PCC Interpolation, Journal of Electrical Engineering, 62, 4, 181-189.
  • 27. Milivojevic Z., Mirkovic M., Milivojevic S. (2006), An Estimate of Fundamental Frequency Using PCC Interpolation - Comparative Analysis, Information Technology and Control, 35, 2, 131-136.
  • 28. Milivojevic Z., Milivojevic M., Brodic D. (2012), The Effects of the Acute Hypoxia to the Fundamental Frequency of the Speech Signal, Advances in Electrical and Computer Engineering, 12, 2, 57-60.
  • 29. Mirkovic M., Milivojevic Z., Rajkovic P. (2004), Performances of the system with the implemented PCC algorithm for the fundamental frequency estimation, Proceedings of XII Telecommunications Forum TELFOR ’04, Section 7, Signal processing, Beograd.
  • 30. Moon H. (2012), A Low-Complexity Design for an MP3 Multi-Channel Audio Decoding System, IEEE Transactions On Audio, Speech, and Language Processing, 20, 1, 314-321.
  • 31. Pang H.S., Baek S.J., Sung K.M. (2000), Improved Fundamental Frequency Estimation Using Parametric Cubic Convolution, IEICE Trans. Fund., E83-A, 12, 2747-2750.
  • 32. Park K.S., Schowengerdt R.A. (1983), Image reconstruction by parametric cubic convolution, Computer Vision, Graphics, and Image Processing, 23, 258-272.
  • 33. Rabiner L. (1977), On the use of autocorrelation analysis for pitch detection, IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-25, 1, 24-33.
  • 34. Resch B., Nilsson M., Ekman A., Kleijn W. (2007), Estimation of the instantaneous pitch of speech, IEEE Transactions On Audio, Speech, And Language Processing, 15, 3, 813-822.
  • 35. Ross M., Schafer H., Cohen A., Freudberg R., Manley H. (1974), Average magnitude difference function pitch extractor, IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-22, 5, 353-362.
  • 36. Sekhar S.C., Sreenivas T.V. (2004), Effect of interpolation on PWVD computation and instantaneous frequency estimation, IEEE Signal Processing, 84, 107-116.
  • 37. Shahnaz C., Zhu W., Ahmad M. (2012), Pitch Estimation Based on a Harmonic Sinusoidal Autocorrelation Model and a Time-Domain Matching Scheme, IEEE Transactions On Audio, Speech, And Language Processing, 20, 1, 310-323.
  • 38. Veprek P., Scordilis M. (2002), Analysis, enhancement and evaluation of five pitch determination techniques, Speech Communication, 37, 3-4, 249-270.
  • 39. Wang X., Hong H. (2006), A Novel Synchronization Invariant Audio Watermarking Scheme Based on DWT and DCT, IEEE Transactions on Signal Processing, 54, 12, 4835-4840.
  • 40. Yarman B., Guz U., Gurkan H. (2006), On the comparative results of SYMPES: A new method of speech modeling, International Journal of Electronics and Communications (AEU), 60, 421-427.
  • 41. Yeo I., Kim H.J. (2003), Modified patchwork algorithm: a novel audio watermarking scheme, IEEE Transactions on Speech and Audio Processing, 11, 4, 381-386.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-f9638727-2285-4db3-9832-5274076b72dd