vosk-server/mqtt at master · truongnguyen101195/vosk-server


Vosk ASR via MQTT

A simple MQTT-based ASR server for accepting voice data from devices such as Matrix Voice.

Usage

Prerequisites

Install any MQTT broker on your RPi, e.g. Mosquitto.
Install Git.
Install Python >= 3.4 and pip >= 19.0.
Install Vosk: pip3 install vosk.

Project Setup

# Download vosk-server
git clone https://github.com/alphacep/vosk-server.git
cd ./vosk-server/mqtt

# Prepare any lightweight model: https://github.com/alphacep/vosk-api/blob/master/doc/models.md
wget http://alphacephei.com/kaldi/alphacep-model-android-ru-0.3.tar.gz
tar xf alphacep-model-android-ru-0.3.tar.gz
mv alphacep-model-android-ru-0.3 ./model

# Configure environment
nano .env

Add the following environment variables (adjust the values to your needs):

PID=any_project_id
VOSK_LANG=ru
VOSK_SAMPLE_RATE=16000.0
MQTT_ADDRESS=broker_ip
MQTT_USERNAME=username
MQTT_PASSWORD=password
TEST_FILE_NAME=test.wav
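
How the scripts read these values is not shown here, so the following is only a minimal sketch of one way to parse such a file with the standard library. The helper names load_env and typed_config are hypothetical and are not taken from asr_server_mqtt.py.

```python
def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into a dict of strings."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values


def typed_config(values):
    """Convert raw strings to the types a recognizer setup would likely need.

    The dictionary keys on the left are hypothetical names chosen for this sketch.
    """
    return {
        "pid": values["PID"],
        "lang": values["VOSK_LANG"],
        "sample_rate": float(values["VOSK_SAMPLE_RATE"]),
        "mqtt_address": values["MQTT_ADDRESS"],
        "mqtt_username": values.get("MQTT_USERNAME"),
        "mqtt_password": values.get("MQTT_PASSWORD"),
        "test_file_name": values.get("TEST_FILE_NAME", "test.wav"),
    }
```

Converting VOSK_SAMPLE_RATE to a float up front means a malformed value fails at startup rather than when the first audio chunk arrives.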

Execution

Start the ASR server: ./asr_server_mqtt.py
Run the test script: ./test_mqtt.py

If you can't run these scripts, grant them execute permissions:

sudo chmod +x ./asr_server_mqtt.py
sudo chmod +x ./test_mqtt.py

Note that test.wav is recorded in Russian. To test another model, record a 16 kHz, 16-bit mono WAV file, put it into the ./mqtt root directory, and change the TEST_FILE_NAME environment variable. You also need to download the corresponding language model as described above.
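
Before pointing TEST_FILE_NAME at a new recording, it can help to confirm that the file really has the required format. The sketch below uses only Python's standard wave module; check_wav is a hypothetical helper and is not part of this project.

```python
import wave


def check_wav(path, expected_rate=16000):
    """Return a list of format problems; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1:
            problems.append(f"expected mono, got {wf.getnchannels()} channels")
        # getsampwidth() returns bytes per sample, so 2 means 16-bit audio.
        if wf.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, got {8 * wf.getsampwidth()}-bit")
        if wf.getframerate() != expected_rate:
            problems.append(f"expected {expected_rate} Hz, got {wf.getframerate()} Hz")
    return problems
```

Calling check_wav("test.wav") on a suitable recording should return an empty list; any returned strings describe what needs to be re-recorded or converted.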

Logs

When you start the ASR server, you'll see the following output:

vosk --min-active=200 --max-active=3000 --beam=10.0 --lattice-beam=2.0 --acoustic-scale=1.0 --frame-subsampling-factor=3 --endpoint.silence-phones=1:2:3:4:5:6:7:8:9:10 --endpoint.rule2.min-trailing-silence=0.5 --endpoint.rule3.min-trailing-silence=1.0 --endpoint.rule4.min-trailing-silence=2.0
LOG (vosk[5.5.641~2-79319]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (vosk[5.5.641~2-79319]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (vosk[5.5.641~2-79319]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (vosk[5.5.641~2-79319]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (vosk[5.5.641~2-79319]:Collapse():nnet-utils.cc:1472) Added 1 components, removed 2
LOG (vosk[5.5.641~2-79319]:CompileLooped():nnet-compile-looped.cc:345) Spent 0.3428 seconds in looped compilation.
Connected to mqtt server

This means that KaldiRecognizer and the MQTT connection are configured and the server is ready to receive an audio stream.

The test script simply reads the test recording and sends it in chunks to the ASR server via MQTT. It closes the connection automatically once the file has been fully read and sent:

Connected to mqtt server
Disconnecting...
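
The read-and-send loop described above can be sketched as follows. This is an illustration under assumptions, not the code of test_mqtt.py: the function names, the chunk size of 4000 frames, and the publish callable are all hypothetical, and the topic the real server listens on is not stated in this README.

```python
import wave


def iter_chunks(path, frames_per_chunk=4000):
    """Yield raw PCM byte chunks from a WAV file until it is exhausted."""
    with wave.open(path, "rb") as wf:
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            yield data


def send_file(publish, topic, path, frames_per_chunk=4000):
    """Pass every chunk to publish(topic, payload) and return the chunk count."""
    count = 0
    for chunk in iter_chunks(path, frames_per_chunk):
        publish(topic, chunk)
        count += 1
    return count
```

Taking publish as a parameter keeps the sketch independent of any MQTT library. With paho-mqtt, for example, a connected client's publish method accepts a topic and a bytes payload and could be passed in directly.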

After a short delay, the server side should also print a set of final transcripts:

{'result': [{'conf': 1.0, 'end': 2.97, 'start': 2.16, 'word': 'включи'}, {'conf': 1.0, 'end': 3.69, 'start': 3.09, 'word': 'свет'}], 'text': 'включи свет'}
{'result': [{'conf': 0.999962, 'end': 5.97, 'start': 5.31, 'word': 'выключил'}, {'conf': 1.0, 'end': 6.48, 'start': 6.000001, 'word': 'свет'}], 'text': 'выключил свет'}
{'result': [{'conf': 0.948885, 'end': 8.58, 'start': 8.01, 'word': 'включи'}, {'conf': 1.0, 'end': 9.39, 'start': 8.61, 'word': 'телевизор'}], 'text': 'включи телевизор'}
{'result': [{'conf': 0.719489, 'end': 11.4, 'start': 10.77, 'word': 'выключи'}, {'conf': 1.0, 'end': 12.09, 'start': 11.408416, 'word': 'телевизор'}], 'text': 'выключи телевизор'}
{'result': [{'conf': 0.991423, 'end': 14.19, 'start': 13.62, 'word': 'включил'}, {'conf': 1.0, 'end': 15.09, 'start': 14.22, 'word': 'кондиционер'}], 'text': 'включил кондиционер'}
{'result': [{'conf': 0.999582, 'end': 17.219995, 'start': 16.53, 'word': 'выключая'}, {'conf': 1.0, 'end': 17.97, 'start': 17.219995, 'word': 'кондиционер'}], 'text': 'выключая кондиционер'}
{'result': [{'conf': 1.0, 'end': 19.83, 'start': 19.41, 'word': 'какая'}, {'conf': 1.0, 'end': 20.34, 'start': 19.83, 'word': 'сегодня'}, {'conf': 1.0, 'end': 20.94, 'start': 20.34, 'word': 'погода'}], 'text': 'какая сегодня погода'}
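
Each line above is printed as a Python dict (single quotes), not as strict JSON, so json.loads would reject it. A minimal sketch for post-processing captured log lines, using hypothetical helper names, could look like this:

```python
import ast


def parse_result(line):
    """Parse one logged result line (a Python dict repr) into a dict."""
    return ast.literal_eval(line)


def low_confidence_words(result, threshold=0.9):
    """Return (word, conf) pairs whose confidence is below the threshold."""
    return [
        (item["word"], item["conf"])
        for item in result.get("result", [])
        if item["conf"] < threshold
    ]
```

Applied to the fourth transcript above, low_confidence_words flags only the first word, whose confidence is 0.719489, while the 'text' field still holds the full phrase.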

Model Switching

To switch from Russian to another model, you don't need to restart the server. Just publish the following message to your MQTT broker:

mosquitto_pub -h [MQTT_ADDRESS] -u [MQTT_USERNAME] -P [MQTT_PASSWORD] -t [PID]/lang -m [VOSK_LANG]

Note that you first have to download the required model, unpack it, and put it into ./mqtt/model-[VOSK_LANG].
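
The two steps above (check that the model directory exists, then publish the language code) can be combined in a small wrapper around the same mosquitto_pub command. This is a sketch only: the function names and the run parameter are hypothetical, and root is assumed to be the ./mqtt directory.

```python
import os


def model_dir(lang, root="."):
    """Directory where the model for a language is expected, per the note above."""
    return os.path.join(root, f"model-{lang}")


def lang_switch_command(address, username, password, pid, lang):
    """Build the mosquitto_pub argument list shown above."""
    return [
        "mosquitto_pub",
        "-h", address,
        "-u", username,
        "-P", password,
        "-t", f"{pid}/lang",
        "-m", lang,
    ]


def switch_language(address, username, password, pid, lang, root=".", run=None):
    """Refuse to switch if the model directory is missing, else build and run the command."""
    path = model_dir(lang, root)
    if not os.path.isdir(path):
        raise FileNotFoundError(f"model directory not found: {path}")
    cmd = lang_switch_command(address, username, password, pid, lang)
    if run is not None:
        # For example, subprocess.run could be passed as run to execute the command.
        run(cmd)
    return cmd
```

Checking for the directory before publishing avoids asking the server to load a model that has not been unpacked yet.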
