Automatic Speech Recognition (ASR) systems transform a speech signal into a sequence of words that can be used both for text-based communication and for device control. ASR systems are analyzed in order to simulate human judgment of their performance, to gauge their usefulness, and to assess other issues that arise when systems are compared. The present paper presents techniques for applying speech recognition effectively to operate a web portal by voice, so that data can be accessed from any place and at any time. Speech-to-text (STT) conversion, which produces a text version of speech, is a major technological advance; it is extremely useful for deaf individuals and significant in other domains as well. Data mining has shown remarkable growth in analyzing the acoustic features of speech and sound. This research proposes ASR by means of a Hybrid Semi Neural Network (HSNN) based on the acoustic features of the voice. With the proposed HSNN technique, the speakers' speech is recognized according to its type, and the speech/voice is thereby categorized into text. The HSNN combines a Deep Neural Network (DNN) and a Convolutional Neural Network (CNN). The DNN relies on a stacked sparse auto-encoder to carry out voice classification; by applying unsupervised pre-training followed by supervised fine-tuning, valid information can be extracted from the data. The CNN, on the other hand, relies on logistic regression to classify the acoustic identity of a given voice. The mechanism comprises the following processing stages: preprocessing, filtering, feature extraction, classification, and identification.
Using the speech recognition system, deaf individuals can access data from any place and at any time. The system yields improved results within the given time span.
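The processing stages named in the abstract (preprocessing, filtering, feature extraction, classification) can be illustrated with a minimal Python sketch. This is not the HSNN itself: the stacked sparse auto-encoder and the CNN are replaced by a plain logistic-regression classifier on two hand-crafted features (log-energy and zero-crossing rate), and synthetic sine tones stand in for voice recordings. All function names, parameters, and the toy data are assumptions made for illustration only.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """Preprocessing/filtering: first-order high-pass filter (alpha is an assumed value)."""
    return [signal[0]] + [signal[i] - alpha * signal[i - 1]
                          for i in range(1, len(signal))]

def frame_features(signal, frame_len=160):
    """Feature extraction: mean log-energy and mean zero-crossing rate over frames."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        energies.append(math.log(energy + 1e-10))
        zcrs.append(crossings / (frame_len - 1))
    n = len(energies)
    return [sum(energies) / n, sum(zcrs) / n]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=300):
    """Classification: logistic regression fitted by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Return class 1 if the predicted probability is at least 0.5, else class 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def tone(freq, n=1600, rate=8000):
    """Hypothetical stand-in for a recording: a pure sine tone."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

def extract(signal):
    """Full front end: pre-emphasis followed by feature extraction."""
    return frame_features(pre_emphasis(signal))

# Toy training set: low-pitched tones are class 0, high-pitched tones are class 1.
train_freqs = [150, 200, 250, 1800, 2000, 2200]
labels = [0, 0, 0, 1, 1, 1]
X = [extract(tone(f)) for f in train_freqs]
w, b = train_logistic(X, labels)

print(predict(w, b, extract(tone(220))))
print(predict(w, b, extract(tone(2100))))
```

In a real system the two-feature front end would typically be replaced by richer acoustic features, and the classifier by the learned networks described above; the sketch only shows how the stages connect.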
Volume 11 | 06-Special Issue
Pages: 683-694