Commit 248fe20a authored by pbethge

edit README

parent da8a56c8
# Voice Activity Detection for Spoken Language Identification
In this example we will use two artificial neural networks. After gathering a chunk of audio, we check whether it contains human speech. This process is called Voice Activity Detection (VAD). If a voice is detected, we start accumulating audio in order to feed the Spoken Language Identifier (LID). The identifier may be exchanged with any other AI model that takes speech as input.
# Voice Activity Detection for Keyword Spotting
In this example we will use two artificial neural networks. After gathering a chunk of audio, we check whether it contains human speech. This process is called Voice Activity Detection (VAD). If a voice is detected, we start accumulating audio in order to feed the Keyword Spotting System (KWS). The KWS may be exchanged with any other AI model that takes speech as input.
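Both variants describe the same processing loop: read audio in small chunks, run the VAD on each chunk, buffer audio while speech is present, and pass the buffered utterance to the downstream classifier (LID or KWS) once the speech ends. Below is a minimal sketch of that loop; `run_pipeline`, `is_speech`, `classify` and all constants are hypothetical stand-ins introduced for illustration, not part of this repository or of the Silero-VAD / SpeechBrain APIs.
```python
from typing import Callable, Iterable

import numpy as np

SAMPLE_RATE = 16_000        # both networks expect 16 kHz audio
CHUNK_SAMPLES = 512         # samples per chunk handed to the VAD (assumed value)
MIN_SPEECH_SECONDS = 1.0    # buffer at least this much speech before classifying (assumed)


def run_pipeline(
    chunks: Iterable[np.ndarray],
    is_speech: Callable[[np.ndarray], bool],   # stand-in for the VAD model
    classify: Callable[[np.ndarray], str],     # stand-in for the LID or KWS model
) -> None:
    """Buffer chunks while the VAD reports speech; classify the buffer when speech ends."""
    buffer: list = []
    for chunk in chunks:
        if is_speech(chunk):
            buffer.append(chunk)
        elif buffer:
            speech = np.concatenate(buffer)
            buffer.clear()
            if len(speech) >= MIN_SPEECH_SECONDS * SAMPLE_RATE:
                print("prediction:", classify(speech))


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any models or a microphone:
    # "speech" is simply any chunk whose mean absolute amplitude exceeds a threshold.
    rng = np.random.default_rng(0)
    fake_chunks = [
        rng.normal(0.0, 1.0 if 10 < i < 60 else 0.01, CHUNK_SAMPLES) for i in range(80)
    ]
    run_pipeline(
        fake_chunks,
        is_speech=lambda c: float(np.abs(c).mean()) > 0.1,
        classify=lambda s: f"{len(s) / SAMPLE_RATE:.1f}s of buffered speech",
    )
```
In the actual example, `is_speech` would wrap the Silero VAD model and `classify` would wrap the SpeechBrain language identifier or the speech-commands keyword spotter.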
Parts of this code are heavily borrowed from:
- [silero-vad](https://github.com/snakers4/silero-vad)
- [speechbrain lid](https://github.com/speechbrain/speechbrain)
- [speech-commands](https://github.com/douglas125/SpeechCmdRecognition)
Please check out [this fork of speech-commands](https://github.com/bytosaur/SpeechCmdRecognition) to train on specific words.
### Installing Python Requirements
__Note__: We suggest using a virtual environment when working with Python code.
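For instance, assuming the Python requirements are listed in a `requirements.txt` file (the exact file name is an assumption here), a virtual environment could be set up like this:
```shell
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```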
@@ -27,4 +28,4 @@ The sample rate should be kept at 16kHz for both neural networks. If higher samp
Simply run the main script:
```shell
python main.py
```