
Keywords

Gaussian mixture models (GMM)
Automatic speech recognition (ASR)
Hidden Markov models (HMM)
Machine learning (ML)
Multilayer perceptron (MLP)
Radial basis probabilistic neural network (RBPNN)
Support Vector Machine (SVM)
Learning vector quantization (LVQ)
Mel frequency cepstral coefficients (MFCCs)
Dynamic time warping (DTW)
Mixture Invariant Training (MixIT)

Abstract

  Voice is a behavioral biometric that can reveal a person's age, gender, ethnicity, and emotional state. Speaker recognition is the task of identifying individuals by their voices. Although researchers have studied speaker identification for the past eight decades, technological advances such as the Internet of Things (IoT), smart homes, voice assistants, smart gadgets, and humanoids have made it widely used in modern society. This study offers a thorough review of the speaker identification literature, covering recent developments as well as open problems in the field. It examines the structure of speaker recognition systems, feature extraction, and classifiers, and also covers the applications of speaker recognition. The objective is to improve researchers' understanding of machine-learning-based speaker identification, since recent research has shown that machine learning models can easily be deceived into producing incorrect predictions.
https://doi.org/10.33899/csmj.2023.179466