Development of speech recognition system for indexing and searching in a big collection of mediafiles

Федосеев Георгий Александрович; Fedoseev Georgii


Files

Diploma22May18-2.pdf (2.09 MB)

reviewSV_stt08684_SHHegoleva_Nadezhda_Lvovna_(reviewer)(Ru).txt (3.82 KB)

reviewSV_st007810_Degtyarev_Aleksandr_Borisovich_(supervisor)(Ru).txt (3.54 KB)

Date

2018

Authors

Федосеев Георгий Александрович

Fedoseev Georgii

Abstract

Commercial Russian speech recognition systems now reach 90-95% accuracy, which is comparable to human performance. At the same time, there are practically no open-source solutions for Russian based on modern architectures. The main reason is the lack of sufficiently large public corpora of transcribed Russian speech. This work proposes a method for automatically building corpora containing several hundred hours of speech and describes the creation of a speech recognition system based on an open-source implementation of the DeepSpeech architecture. In addition, it considers applying the resulting model to build a speech search system over a collection of media files.

Keywords

speech recognition, speech corpus, speech dataset, deep neural network (DNN), recurrent neural network (RNN), search system, speech search

URI

http://hdl.handle.net/11701/13420

Collections

BACHELOR STUDIES
