Machine learning is a significant and developing area in almost all fields of research today. Acquiring knowledge from empirical data is the backbone of machine learning: the knowledge is derived by changing the structure of a model, its parameters, or both, in order to improve its expected performance on future data. In this paper we describe pre-processing of audio data that helped improve accuracy after classification. There are several methods of extracting features from sound data; one technique commonly used in speech recognition is the extraction of Mel-Frequency Cepstral Coefficients (MFCC). Other commonly used feature extraction methods include the Mel-scaled spectrogram, chromagram, spectral contrast, and tonal centroid features. We perform comprehensive experiments on audio pre-processing using different time-frequency representations, logarithmic magnitude compression, frequency weighting, and scaling. We show that many commonly used input pre-processing techniques are redundant, with the exception of magnitude compression. We discuss classifying data from the Urban Sound dataset by means of effective pre-processing, designing various models, and comparing them, with a brief look at how recognition accuracy varies across the models. We go on to detail the effectiveness of different models on each method, including tests of Random Forests, Naïve Bayes, J48, and SVM architectures.
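As an illustration of the logarithmic magnitude compression mentioned above, the following is a minimal sketch, not the configuration used in the experiments. The frame length, hop size, `gain` parameter, the `log(1 + gain * |X|)` form, and the synthetic test signal are all assumptions chosen for the example; the helper names `magnitude_spectrogram` and `log_compress` are hypothetical.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=1024, hop=512):
    """Return the STFT magnitude with shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative values, not settings from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real-input FFT of each frame; keep only the magnitude.
    return np.abs(np.fft.rfft(frames, axis=1))


def log_compress(mag, gain=1.0):
    """Logarithmic magnitude compression: log(1 + gain * mag).

    This is one common form; the exact variant is an assumption here.
    """
    return np.log1p(gain * mag)


# Synthetic one-second signal (assumed): a loud 440 Hz tone plus a
# much quieter 3 kHz tone, sampled at 22050 Hz.
sr = 22050
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) + 0.01 * np.sin(2 * np.pi * 3000 * t)

mag = magnitude_spectrogram(signal)
compressed = log_compress(mag)
```

Compression narrows the gap between loud and quiet components, so the weak 3 kHz tone occupies a larger share of the input range seen by a classifier, while the ordering of the frequency bins by energy is preserved.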
Volume 11 | 05-Special Issue
Pages: 2293-2304