Steves Computer Vision Blog: Voice Command v3.0 for the Raspberry Pi

Voicecommand v3.0 Changes

I've made some big changes, and there are even bigger things in the works. Here is a short list. I've had help from a couple of committed users who came up with some good new ideas, which is awesome.

There is now a ~ option that finds the word anywhere in a command. For instance ~music==pianobar will work if you say: let's hear some music, play music, or music.
!filler is now a string, so you can set it manually. If you set it to 0, it will be empty, and if you set it to 1, it will be FILLER FILL for compatibility.
Example scripts have been added in the Misc folder for you to play with. These can send and receive emails and text messages as well as post to Facebook, all using only your voice.
Flags can now override the config options. A flag is reversed if you follow it with a 0 and enforced if you follow it with a 1. Ex. if !continuous==1 is in your config file, you can force it to run only once with voicecommand -c0.
The commands and keywords are now case-insensitive, so no tricky case matching.
Multiple language support has been added. This is based on your country code which I think you can find here (plus en_uk and en_us). Look up your country code and use that. Ex. For US: !language==en_us, for Spain !language==es, for Germany !language==de.
You can set a Wolfram Alpha API key and maxResponse (the number of branches) like !api==XXXXXX-XXXXXXXXXX and !maxResponse==3. This will give you better answers. You can sign up for a Wolfram Alpha API key on their website for free.
Logging is now enabled and goes to /dev/shm/voice.log. Output is written there instead of to /dev/null.
The need for tts-nofill has been removed!! Now tts doesn't use any filler unless you send it yourself.
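
To make the ~ option and the case-insensitive matching concrete, here is a minimal Python sketch of how such a check could behave. It is not taken from the voicecommand source: the function name matches is hypothetical, and the rule that plain entries match from the start of the phrase is an assumption made for illustration.

```python
def matches(voice_pattern, spoken):
    """Return True if the spoken phrase would trigger the config entry.

    Hypothetical helper for illustration; not the voicecommand implementation.
    """
    # Lower-case both sides so the comparison is case-insensitive.
    pattern = voice_pattern.strip().lower()
    phrase = spoken.strip().lower()
    if pattern.startswith("~"):
        # "~" entries match when the keyword appears anywhere in the phrase.
        return pattern[1:] in phrase
    # Assumption: plain entries must match from the start of the phrase.
    return phrase.startswith(pattern)


for phrase in ["Let's hear some music", "play MUSIC", "music"]:
    print(phrase, "->", matches("~music", phrase))
```

With an entry such as ~music==pianobar, all three phrases in the loop would trigger the command under these assumptions, while a plain music entry would only match the last one.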

New Install and Update videos have been added. They can be found here:
http://stevenhickson.blogspot.com/2013/06/installing-and-updating-piauisuite-and.html

Consider donating to further my tinkering since I do all this and help people out for free.

As always, the updated man page can be found below:

voicecommand

Section: voicecommand man page (8)
Updated: 13 May 2013

NAME

voicecommand - Listen to user defined strings and run the corresponding command

SYNOPSIS

voicecommand [OPTIONS]...

DESCRIPTION

voicecommand was developed for the Raspberry Pi but will work on any Linux system with a microphone attached. It is a crude program that uses basic comparisons to determine if your voice command fits a format specified in a config file; if it does, it runs the corresponding Linux command. It supports auto-completion and variables as well as command verification, a continuous mode, and other options. For help/comments/questions, feel free to e-mail me at help@stevenhickson.com. I answer sporadically but do eventually respond.

Note: All of the flags that turn something on or off can be reversed and overridden by following them with a 1 or 0. So for instance, if you have !continuous==1 in your config file, you can run voicecommand -c0 to turn continuous off.
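
The precedence described in this note can be sketched as follows. This is only an illustration under stated assumptions: resolve_switch is a hypothetical name, and the sketch models a flag that turns something on (such as -c), not the actual argument parsing in voicecommand.

```python
from typing import Optional


def resolve_switch(config_value: bool, cli_flag: Optional[str]) -> bool:
    """Combine a config-file setting with an optional command-line flag.

    cli_flag is the raw argument, e.g. "-c", "-c0" or "-c1", or None when
    the flag was not passed. Hypothetical helper for illustration only.
    """
    if cli_flag is None:
        return config_value  # no flag given: the config file value stands
    suffix = cli_flag[2:]    # whatever follows the dash and the flag letter
    if suffix == "0":
        return False         # "-c0" forces the option off
    return True              # "-c" or "-c1" forces the option on


# !continuous==1 in the config file, but the user runs: voicecommand -c0
print(resolve_switch(True, "-c0"))
```

The design point is simply that the command line is consulted last, so it wins over whatever the config file says.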

OPTIONS

?: Same as -h
-b: Turns off the FILL audio. The FILL audio exists because the Raspberry Pi (or mine at least) cuts off the first few seconds of audio. This flag turns that feature off. You should only be concerned with this if you hear FILL before everything it says.
-c: Makes voicecommand run in continuous mode, where it will keep listening over and over again.
-d: Sets the duration for listening to the audio for voice commands.
-D: Sets the audio hardware. The default is plughw:1,0
-e: Edits the voicecommand config file.
The format is voice==command
You can use any character except for newlines or ==
If the voice starts with ~, the program looks for the keyword anywhere. Ex: ~weather would pick up on weather or what's the weather
You can use ... at the end of the command to specify that everything after the given keyword should be options to the command.
Ex: play==playvideo ...
This means that if you say "play Futurama", it will run the command playvideo Futurama
You can use $# (where # is any number 1 to 9) to represent a variable. These should go in order from 1 to 9
Ex: $1 season $2 episode $3==playvideo -s $2 -e $3 $1
This means if you say game of thrones season 1 episode 2, it will run playvideo with the -s flag as 1, the -e flag as 2, and the main argument as game of thrones, i.e. playvideo -s 1 -e 2 game of thrones
Because of these options, it is important that the entries in the config file range from most strict to least strict.
This means that ~ entries should probably be at the end.
You can also put comments if the line starts with # and special options if the line starts with a !
Default options are shown as follows:
!keyword==pi
!verify==1
!continuous==1
!quiet==0
!ignore==0
!thresh==0.7
!maxResponse==-1
!api==BLANK
!filler==FILLER FILL
!response==Yes Sir?
!duration==3
!com_dur==2
!hardware==plughw:1,0
!language==en_us
keyword, filler, and response accept strings. verify, continuous, quiet, and ignore accept 1 or 0 (true or false respectively). thresh accepts a floating point number. These allow you to set some of the flags as permanent options (if these are set, you can override them with the flag options).
You can set a Wolfram Alpha API key and maxResponse (the number of branches) like !api==XXXXXX-XXXXXXXXXX and !maxResponse==3
You can now customize the language support for speech recognition and some text to speech with the language flag. Look up your country code and use that. Ex. For US: !language==en_us, for Spain !language==es, for Germany !language==de.
-f /my-location/config-file: This allows you to load a different config file located in a different spot. The default one is in your home directory and is ~/.commands.conf
The config file must be formatted the same way.
-h: Shows this man page.
-i: Sets the ignore mode. When this flag is activated, if a command is not in the config file, nothing happens. The default behavior is to try to find an answer or response to that question and then speak it. This turns off that behavior.
-I string: Sets the forced input mode. This allows you to test it without the microphone or get it to parse typed information. It will not run in continuous mode with this.
-k word: Sets the keyword. The default is pi. If this flag is set, the verify and continuous flags are also set since this is only checked during those two modes.
Ex. voicecommand -c -v -k Jarvis
-l: Sets the duration for listening to the audio for the command keyword. This is different than the -d flag that listens for the voice commands.
-s: Runs a setup operation that attempts to set all of the config options in the config file so that voicecommand works properly
-r word: Sets the response. The default is "Yes Sir?" (for version 1.0, it was Ready?). If this response is more than one word, it should be put in quotes; otherwise it doesn't need to be.
Ex. voicecommand -r Ready?
-t #: Sets the threshold for volume to determine if the keyword was spoken. This should be a floating point number. The default value is 0.7 which works well with the Logitech C310 camera/mic from about 6 feet away.
Ex. voicecommand -t 1.2
-p: Sets passthrough mode on so that instead of running the commands, it just prints them. This is going to be used for the XBMC plugin and Android app.
-q: Sets quiet mode on so that voicecommand never speaks through the audio output. It still prints everything but doesn't ever respond. This includes the keyword response.
-v: Makes voicecommand verify the keyword. This only happens in continuous mode so if this flag is set, the continuous flag will be set as well. The default mode is to not verify. When voicecommand hears any sound above the threshold, it says the response then listens for a command. The default keyword is pi. When the verify flag is set, after the threshold is met, voicecommand verifies that the keyword was spoken.
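
The config format described under -e (comments, ! options, $1 to $9 variables, a trailing ..., and ~ keywords) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration rather than the way voicecommand is actually implemented: the names SAMPLE_CONFIG, parse_config, pattern_to_regex, expand, and dispatch are hypothetical, as is the weather-report command in the sample.

```python
import re

# Sample config text; weather-report is a made-up command for this example.
SAMPLE_CONFIG = """\
# lines starting with # are comments
!keyword==pi
!continuous==1
$1 season $2 episode $3==playvideo -s $2 -e $3 $1
play==playvideo ...
~weather==weather-report
"""


def parse_config(text):
    """Split config text into ! options and ordered voice==command rules."""
    options, rules = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        voice, sep, command = line.partition("==")
        if not sep:
            continue  # not a voice==command line
        if voice.startswith("!"):
            options[voice[1:]] = command
        else:
            rules.append((voice, command))
    return options, rules


def pattern_to_regex(voice):
    """Turn '$1 season $2' into a regex with one named group per variable."""
    pieces = []
    for part in re.split(r"(\$[1-9])", voice):
        if re.fullmatch(r"\$[1-9]", part):
            pieces.append("(?P<v%s>.+?)" % part[1])
        else:
            pieces.append(re.escape(part))
    return "".join(pieces)


def expand(voice, command, spoken):
    """Return the command line for spoken, or None if voice does not match."""
    spoken = spoken.strip()
    if voice.startswith("~"):
        # Keyword may appear anywhere in the phrase.
        return command if voice[1:].lower() in spoken.lower() else None
    if command.endswith("..."):
        # Everything after the keyword becomes arguments to the command.
        if not spoken.lower().startswith(voice.lower()):
            return None
        return (command[:-3] + spoken[len(voice):].strip()).strip()
    match = re.fullmatch(pattern_to_regex(voice), spoken, flags=re.IGNORECASE)
    if match is None:
        return None
    # Substitute each $N in the command with the text captured for it.
    return re.sub(r"\$([1-9])", lambda g: match.group("v" + g.group(1)), command)


def dispatch(rules, spoken):
    """First matching rule wins, so rules should go from strict to loose."""
    for voice, command in rules:
        result = expand(voice, command, spoken)
        if result is not None:
            return result
    return None


options, rules = parse_config(SAMPLE_CONFIG)
print(dispatch(rules, "game of thrones season 1 episode 2"))  # playvideo -s 1 -e 2 game of thrones
print(dispatch(rules, "play Futurama"))                       # playvideo Futurama
print(dispatch(rules, "what's the weather"))                  # weather-report
```

Because dispatch returns on the first match, the sketch also shows why the ~ entry sits last in the sample: placed first, it would swallow any phrase containing its keyword before stricter entries were tried.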

AUTHOR

Steven Hickson (help@stevenhickson.com)

BUGS

No known bugs. To report bugs, send a clear description to help@stevenhickson.com. Since this program is fairly crude, user typos could cause crashes or failed responses. Please read the man page thoroughly before submitting a bug.

COPYRIGHT

Copyright © 2013 Steven Hickson. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>. This is free software: you are free to change and redistribute it as long as you give credit to the author and include this license. There is NO WARRANTY, to the extent permitted by law.

HISTORY

This is the second major version of this program

Saturday, June 29, 2013