ABSTRACT
Research on speech recognition began in the 1950s. Over nearly 60 years of
continuous development, significant progress has been made in many key technology
areas. The technology is now widely used, and many related products have entered
the market. Speech recognition changes the traditional mode of human-computer
interaction and brings great convenience to people’s work and daily life. At present,
isolated-word speech recognition is the most mature and widely used form. Although
it achieves a high recognition rate under common conditions, its robustness is still
poor and it is easily affected by random factors. Further research on isolated-word
speech recognition is therefore necessary.
This paper implements a recognition system for four isolated words after analyzing
current speech recognition algorithms. It proposes an improved recognition method
based on the DTW algorithm, analyzes the recognition rate of the VQ algorithm for
different codebook sizes, and compares the computation time of the different
algorithms in Matlab. Finally, the paper analyzes the Matlab test results and ports
the algorithms to an ARM platform.
The improved DTW recognition algorithm is based on statistics. It first computes
the distortions between the test template and each word’s multiple reference
templates, and then computes the mean distortion for each word. Any distortion that
is much larger than its word’s mean is replaced with that mean, and the mean of the
resulting data set is recomputed. Finally, the new means are sorted, and the word
with the smallest mean is the recognition result. To some extent this method
overcomes the impact of chance factors and thereby improves the recognition rate.
For the four English words, the recognition rates of the traditional DTW algorithm
are 93%, 82%, 93%, and 69%, respectively, whereas those of the improved DTW
algorithm are 95%, 90%, 94%, and 91%.
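
The decision rule described above can be illustrated with a minimal sketch. It
assumes the DTW distortions between the test template and each word's reference
templates have already been computed. The abstract does not state how "much larger
than the mean" is decided, so the threshold `factor` below is an assumption, and the
word labels and distortion values are hypothetical.

```python
from statistics import mean


def robust_word_score(distortions, factor=1.5):
    """Mean distortion after replacing outliers with the original mean.

    ASSUMPTION: a distortion is "much larger" than the mean when it exceeds
    factor * mean. The source does not specify the actual threshold.
    """
    m = mean(distortions)
    # Replace every outlier with the original mean, keep the rest unchanged.
    cleaned = [m if d > factor * m else d for d in distortions]
    return mean(cleaned)


def recognize(distortions_by_word, factor=1.5):
    """Return the word with the smallest outlier-corrected mean distortion."""
    scores = {
        word: robust_word_score(d, factor)
        for word, d in distortions_by_word.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical distortions: "left" has one reference template that matched badly.
example = {
    "left": [10.0, 12.0, 11.0, 40.0],
    "right": [15.0, 16.0, 14.0, 15.0],
}
best, scores = recognize(example)
plain_best = min(example, key=lambda w: mean(example[w]))
```

In this toy case the plain mean would favor "right" because a single poor match
inflates the mean for "left", while the corrected mean selects "left". This is the
kind of chance effect the improved method is meant to suppress.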
For VQ as applied to speech, the paper first introduces the basic theory of VQ,
optimal codebook design, and the recognition process, and then analyzes the
influence of codebook size on the recognition rate. It compares the recognition
performance on the same four words for codebook sizes of 4, 8, and 16. The
conclusion is that, within a certain range, a larger codebook gives a higher
recognition rate; when the codebook is too large, however, the rate no longer
increases and may even decrease, while the computational cost increases.
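
As a rough illustration of VQ-based recognition, the sketch below trains one
codebook per word and assigns a test utterance to the word whose codebook gives the
smallest average distortion. The abstract does not say which codebook design
procedure is used, so a plain Lloyd-style iteration with an evenly spaced
initialization stands in for it here. The function names, the two-dimensional
feature vectors, and the word labels are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def train_codebook(vectors, size, iterations=20):
    """Lloyd-style codebook training (ASSUMED stand-in for the paper's method)."""
    # ASSUMPTION: initialize with evenly spaced training vectors.
    step = len(vectors) / size
    codebook = [vectors[int(i * step)] for i in range(size)]
    for _ in range(iterations):
        cells = [[] for _ in range(size)]
        for v in vectors:
            cells[nearest(v, codebook)].append(v)
        # Move each codeword to the centroid of its cell; keep empty cells as-is.
        codebook = [
            tuple(sum(c) / len(cell) for c in zip(*cell)) if cell else codebook[i]
            for i, cell in enumerate(cells)
        ]
    return codebook


def avg_distortion(vectors, codebook):
    """Average quantization distortion of vectors against a codebook."""
    total = sum(sq_dist(v, codebook[nearest(v, codebook)]) for v in vectors)
    return total / len(vectors)


def recognize(test_vectors, codebooks):
    """Pick the word whose codebook quantizes the test vectors best."""
    return min(codebooks, key=lambda w: avg_distortion(test_vectors, codebooks[w]))


# Hypothetical toy features for two words.
word_a = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9)]
word_b = [(5.0, 5.0), (5.1, 4.9), (6.0, 6.0), (6.2, 6.1)]
codebooks = {"a": train_codebook(word_a, 2), "b": train_codebook(word_b, 2)}
result = recognize([(0.1, 0.0), (0.9, 1.0)], codebooks)
```

Changing the `size` argument corresponds to the codebook sizes of 4, 8, and 16
compared in the paper. Each additional codeword adds one distance computation per
input frame, which is why the computational cost grows with codebook size.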
Finally, the paper presents an engineering implementation. The graphical user
interface is first designed with Qt, and the algorithms are then ported to the
mini2440 platform. The system achieves the expected goal and has some practical value.
Key Words: Speech Recognition, Feature Extraction, DTW, VQ, ARM