Speech and Audio Signal Processing


This lecture covers the basics of speech, audio, and music signal processing.

You will learn the fundamentals of this field of signal processing as well as recent research in it.

The topics of the lecture are:

  • 1. Introduction to the properties of speech and audio signals
  • 2. Basic methods of audio signal processing
    • a. Frequency analysis methods
    • b. Estimation of autocorrelation functions (ACF) and power spectral density functions (PSD)
    • c. Tracking and detection methods, including computationally efficient calculation methods
    • d. Measurements of room impulse responses and reverberation times
    • e. Commonly used filter methods and their design methods
  • 3. Methods for codebook processing
    • a. Principle
    • b. Training of codebooks
    • c. Efficient codebook search
  • 4. Audio coding: predictive coding, line spectral frequencies (LSF), code-excited linear prediction (CELP)

  • 5. Noise reduction and beamforming

  • 6. Cepstral processing as an important tool in speech processing; example: cepstral smoothing for noise reduction to avoid the “musical tones” problem
  • 7. Methods for pitch frequency calculation and applications

  • 8. Mel-frequency cepstral coefficients (MFCC) as an important feature analysis method in speech processing
  • 9. Speaker detection based on MFCCs combined with LDA (linear discriminant analysis) and PCA (principal component analysis)
  • 10. Hidden Markov Models (HMM)
  • 11. Speech recognition
  • 12. Acoustic classification methods: Bayes methods, Gaussian mixture models (GMM), etc.
  • 13. Music signal processing, e.g. beat detection