@@ -2,7 +2,7 @@ High-Level Audio Processing in Python
============
This project contains code for live audio recording and processing in Python.
Besides basic usage of Pyaudio for simple recording, you will find asynchronous
Besides basic usage of PyAudio for simple recording, you will find asynchronous
processing methods and the use of artificial intelligence for high-level applications.
This code has been developed by [ZKM | Hertz-Lab](https://zkm.de/en/about-the-zkm/organization/hertz-lab) as part of the project [»The Intelligent Museum«](#the-intelligent-museum).
...
...
@@ -23,12 +23,23 @@ We may use class abstraction at some point in time.
You will find a lot of borrowed code and trained neural nets from other repositories, instead of submodules, as the purpose of this project is to maintain working code rather than bleeding-edge technology.
We want to thank the following repositories for providing open-source solutions:
Please find a README in each of the example subfolders for clarification of the requirements and usage as each example is very different in that regard.
We will be using PyAudio to access the microphone. PyAudio depends on PortAudio, so please take a look at the [installation guide](http://files.portaudio.com/docs/v19-doxydocs/tutorial_start.html). For some platforms, however, you can find prebuilt binaries.
##### Linux (APT)
```shell
sudo apt install libasound-dev portaudio19-dev
```
##### macOS (Brew)
```shell
brew install portaudio
```
Please find a README in each of the example subfolders for clarification of additional requirements and usage, as each example is very different in that regard.
In this example we will use two artificial neural networks. After gathering a chunk of audio, we check whether it contains human speech. This process is called Voice Activity Detection (VAD). If a voice is detected, we start accumulating audio in order to feed the Keyword Spotting System (KWS). The KWS may be exchanged with any other AI that feeds on speech.
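
The control flow described above (check each chunk for speech, accumulate while a voice is present, then hand the buffer to the KWS) can be sketched roughly as follows. This is only an illustration and not the code used in the example: `energy_vad` is a simple energy threshold standing in for the neural VAD, and `speech_segments`, `run_pipeline`, and the `max_silence` parameter are hypothetical names introduced here. The `spot_keyword` callable represents the KWS, or any other model that consumes speech.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Chunk = Sequence[float]  # one block of mono audio samples in [-1.0, 1.0]


def energy_vad(chunk: Chunk, threshold: float = 0.01) -> bool:
    """Stand-in for a neural VAD: True if the mean squared amplitude
    of the chunk exceeds the threshold (assumption, not the real model)."""
    if not chunk:
        return False
    return sum(s * s for s in chunk) / len(chunk) > threshold


def speech_segments(
    chunks: Iterable[Chunk],
    is_speech: Callable[[Chunk], bool],
    max_silence: int = 2,
) -> Iterator[List[float]]:
    """Accumulate chunks flagged as speech and yield the buffer once
    `max_silence` consecutive non-speech chunks have been seen."""
    buffer: List[float] = []
    silent = 0
    for chunk in chunks:
        if is_speech(chunk):
            buffer.extend(chunk)
            silent = 0
        elif buffer:
            silent += 1
            if silent >= max_silence:
                yield buffer
                buffer = []
                silent = 0
    if buffer:
        # Flush whatever speech is left when the stream ends.
        yield buffer


def run_pipeline(
    chunks: Iterable[Chunk],
    is_speech: Callable[[Chunk], bool],
    spot_keyword: Callable[[List[float]], str],
) -> List[str]:
    """Feed every accumulated speech segment to the keyword spotter."""
    return [spot_keyword(segment) for segment in speech_segments(chunks, is_speech)]


# Synthetic input: silence, two loud chunks, two silent chunks, one loud chunk.
demo_chunks = [[0.0] * 4, [0.5] * 4, [0.5] * 4, [0.0] * 4, [0.0] * 4, [0.5] * 4]
# Placeholder KWS that only reports the segment length.
demo_results = run_pipeline(demo_chunks, energy_vad, lambda seg: f"{len(seg)} samples")
```

Because the VAD and the KWS are passed in as plain callables, either one can be swapped for a real model without touching the accumulation logic. In a live setting, `chunks` would be an iterator over blocks read from the microphone rather than a list.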