Google AIY Voice Kit
Way, way back in the mists of time, long before the mornings darkened, the days got colder and the evenings longer, a mysterious and unexpected package arrived through my letterbox. Admittedly I had been expecting the magazine to which the package was attached, but the package itself was a complete surprise, as I’m sure it was to all the other MagPi subscribers who were fortunate enough to receive their very own Google AIY Voice kit with the May 2017 issue of the magazine.
What is it?
The Google AIY Voice kit is an amazing collection of bits and pieces that allows you to create and customise your very own Google Assistant. The kit consists of:
- Voice HAT accessory board
- Voice HAT microphone board
- 2 plastic standoffs
- 3-inch speaker with wires attached
- Arcade style illuminated push button
- Cables for connecting button and microphone board to the accessory board
- Cardboard box and frame for construction of completed Voice kit
In order to construct the full Google Assistant, the only other items required are a Raspberry Pi, a pair of needle-nose pliers, a small Phillips screwdriver, and (in true Blue Peter form) a small amount of double-sided sticky tape. The kit is compatible with the Raspberry Pi 2, 3 and Zero W, though the general recommendation is to use the Pi 3 Model B, as the cardboard box and frame are designed specifically to fit the Pi 3.
Note that although the kit itself is compatible with various versions of the Raspberry Pi (including the Pi Zero W), some of the demonstration applications provided by the Google Assistant SDK will only run on the Pi 2 or Pi 3.
Sounds Brilliant, Where Can I Get One?
As already mentioned, subscribers to the MagPi magazine will have received these kits with their May 2017 issue. However, once word had got out about this incredibly generous giveaway, shop copies of the magazine disappeared off the shelves at an alarming rate and, rather disappointingly, reappeared on eBay just as quickly.
Since then, however, Google has commissioned a limited production run of the AIY Voice kit so that it can be purchased, and these kits are currently available from CPC.
Assembling the Google AIY Voice kit is fairly straightforward, and is fully described at the Google AIY Voice kit website. Purchased versions of the kit also include a lovely little MagPi Essentials – AIY Projects book with full details of how to assemble and use the kit.
So, What Does it do?
The basic premise of the Google AIY Voice kit is that you can use it to construct a fully functional Google Assistant. In order to do this, the kit uses Google AI services under the guise of the Google Assistant SDK (Software Development Kit). This SDK provides the user with an API (Application Programming Interface) to the Google AI services, and in order to use the kit, it is necessary to first enable this API through the Google Cloud Platform. As with the assembly of the kit, the process for doing this is described at the Google AIY Voice kit website and in the MagPi Essentials book.
Once fully set up, running any of the demonstration programs provided by the Google Assistant SDK will show off the capabilities of the Google Assistant services.
For example, try running the Python program ‘assistant_grpc_demo.py’ and asking the question ‘What is the weather forecast?’. With luck, you should get an audible response telling you your local weather forecast. By default, any temperatures will most likely be given in degrees Fahrenheit. However, if you then follow up with the question ‘What is that in Celsius?’, the Google Assistant should helpfully provide the same weather information, but this time giving temperatures in degrees Celsius.
How Does it Work?
Basically, when you issue the Assistant a voice request, the audio data of the request is analysed by the Google Assistant services. Having analysed the request, the services send back a reply made up of a string of text and a packet of audio data. The string of text is simply the transcript of the voice request given to the Assistant and, to avoid any misunderstandings, the demonstration program writes this text out to the screen for you to see. The packet of audio data is the Assistant’s spoken answer to the request. By default, this audio is automatically played through the speaker of the voice kit (so make sure that the volume is set to a respectable level before asking any embarrassing questions).
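To make the shape of that exchange a little more concrete, here is a minimal sketch of handling a single request. `FakeAssistant` and `handle_request` are names made up for this example so that it runs without the kit; on the real hardware, the listing later in this post obtains the assistant from `aiy.assistant.grpc.get_assistant()` and plays the reply with `aiy.audio.play_audio()`.

```python
class FakeAssistant:
    """Hypothetical stand-in for the real assistant, so this runs without the kit."""

    def recognize(self):
        # The real call records from the microphone and contacts Google;
        # here we simply return a canned (transcript, audio bytes) pair.
        return "what is the value of pi", b"\x00\x01\x02\x03"


def handle_request(assistant, play_audio):
    """Run one request: show the transcript, then play the spoken reply."""
    text, audio = assistant.recognize()
    if text is not None:
        print('You said "%s"' % text)
    if audio is not None:
        play_audio(audio)
    return text


# Collect the "played" audio in a list instead of sending it to a speaker.
played = []
transcript = handle_request(FakeAssistant(), played.append)
```

Passing the playback function in as an argument is just a convenience for trying the idea out on a desktop machine; it is not how the demonstration programs are structured.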
Why not try out some other requests? We had a bit of fun by asking the following questions:
- What is the meaning of life?
- What is the airspeed velocity of an unladen swallow?
- What is the weather in the Canaries in February?
- What is the value of Pi?
- Are you Skynet?
What Else Can it do?
One of the great things about this kit is that once you understand a little about how it works you can come up with your own ideas for things you might like it to do. Furthermore, because this is a Raspberry Pi project, in all likelihood there will be other people in the Pi community who will have done something similar, so it shouldn’t be too difficult to find guidance on how to do these things. We have already come across quite a few different projects that have made use of the Google AIY Voice kit.
- Marvellous retrofitted home assistants
- Google Pi Intercom with the AIY Projects kit
- I turned a Furby into an Amazon Echo
- List Of Mods For Raspberry Pi AIY Project
- Hackster.io AIY Projects
- AIY Projects: Do-it-yourself AI for Makers
- A Magic Mirror Powered by AIY Projects and the Raspberry Pi
Attaching Other Hardware
Something that can easily go unnoticed is that the Voice HAT provided with the Google AIY Voice kit supports additional hardware. Firstly, if you want stereo sound, it is possible to attach a second speaker to the Voice HAT. Additionally, you can use the Voice HAT to control 6 servo motors (up to 25 mA) and 4 DC motors (up to 500 mA). Other general GPIO pins also allow support for I2C, SPI, and UART. However, in order to connect extra hardware, it is necessary to solder additional connectors to the Voice HAT. Depending on what additional hardware you might wish to attach to your Voice HAT, you may require some or all of the following items:
- 3.5mm 2 way terminal block for connection of 2nd speaker
- DC socket to connect external power supply
- 3 way PCB headers for connecting servos and motors
- 24 way PCB header for other GPIO connections
The image below shows the Voice HAT with added connectors for 2nd speaker, DC power supply, motors, servos, I2C and SPI.
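If you do solder on the servo headers, the usual way to position a hobby servo is to vary the width of a pulse that is repeated about 50 times a second. The sketch below is only the arithmetic, not Voice HAT code: the 1 ms to 2 ms pulse range and the 180 degrees of travel are typical hobby-servo figures that I am assuming here, and `angle_to_pulse_ms` and `pulse_to_duty_cycle` are names made up for illustration. Check your servo’s datasheet before relying on them.

```python
MIN_PULSE_MS = 1.0   # assumed pulse width at 0 degrees
MAX_PULSE_MS = 2.0   # assumed pulse width at 180 degrees
FRAME_MS = 20.0      # 50 Hz refresh, i.e. one pulse every 20 ms


def angle_to_pulse_ms(angle):
    """Map an angle in degrees (clamped to 0..180) to a pulse width in ms."""
    angle = max(0.0, min(180.0, float(angle)))
    return MIN_PULSE_MS + (MAX_PULSE_MS - MIN_PULSE_MS) * angle / 180.0


def pulse_to_duty_cycle(pulse_ms):
    """Express a pulse width as a percentage of the 20 ms frame."""
    return 100.0 * pulse_ms / FRAME_MS


for angle in (0, 90, 180):
    pulse = angle_to_pulse_ms(angle)
    print(angle, pulse, pulse_to_duty_cycle(pulse))
```

The duty-cycle figure is what a PWM library would typically ask for, so keeping the conversion in one small function makes it easy to adjust if your servo turns out to want a different pulse range.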
Modifying the Software
When the Google AIY kit was first released, the Google Voice kit API was still evolving. As such, many of the early projects developed using the AIY Voice kit make reference to files that are not available in the most recent release of the Google AIY Voice kit operating system image. For example, many of these projects make reference to a file called ‘action.py’. This file is not included in later versions of the AIY Voice kit system image. Although this code is still available in the ‘master’ branch of the AIY projects GitHub repository, this code is officially deprecated.
However, we have discovered that it is relatively easy to adapt some of the projects that have made use of this deprecated software. One of my personal favourite AIY projects is Mike Redrobe’s excellent update to support the playing of tracks from YouTube. We have taken this update and integrated it into our demonstrator, which makes use of the Google API GRPC libraries. I have included a copy of our code below.
#!/usr/bin/env python3
# Copyright 2017 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""A demo of the Google Assistant GRPC recognizer."""

import logging
import subprocess

import aiy.assistant.grpc
import aiy.audio
import aiy.voicehat

logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] %(levelname)s:%(name)s:%(message)s")

LANGUAGE = "en-GB"
QUIETVOLUME = 20  # volume (%) used while listening over music

GET_VOLUME = r'amixer get Master | grep "Front Left:" | sed "s/.*\[\([0-9]\+\)%\].*/\1/"'
SET_VOLUME = 'amixer -q set Master %d%%'

# Handles to the background player processes (None when nothing is playing).
playshell = None
radioshell = None


def setVolume(vol):
    subprocess.call(SET_VOLUME % vol, shell=True)


def getVolume():
    res = subprocess.check_output(GET_VOLUME, shell=True).strip()
    return int(res)


# Volume to restore on 'volume unmute'.
savedVolume = getVolume()


def volumeCommand(voice_command):
    """Handle 'volume up/down/max/mute/unmute' and 'volume <number>'."""
    global savedVolume
    command = voice_command.lower()
    vol = 0
    change = 0  # default, so 'change' is always defined below
    if command == 'volume up':
        change = 10
    elif command == 'volume down':
        change = -10
    elif command == 'volume max':
        change = 100
    elif command in ['volume mute', 'volume zero']:
        savedVolume = getVolume()
        change = -100
    elif command in ['volume unmute', 'volume on mute']:
        vol = savedVolume
        if vol == 0:
            vol = QUIETVOLUME
    else:
        # Anything else is treated as an absolute level, e.g. 'volume 40'.
        volume = voice_command.replace('volume', '', 1)
        try:
            vol = int(volume)
        except ValueError:
            change = 0

    try:
        res = subprocess.check_output(GET_VOLUME, shell=True).strip()
        logging.info("volume: %s", res)
        if vol == 0:
            vol = int(res) + change
        vol = max(0, min(100, vol))
        setVolume(vol)
        aiy.audio.say('Volume at %d %%.' % vol, lang=LANGUAGE)
    except (ValueError, subprocess.CalledProcessError):
        logging.exception("Error using amixer to adjust volume.")


def playerOff():
    """Stop YouTube playback by killing the vlc process."""
    if playshell is not None:
        subprocess.Popen(["/usr/bin/pkill", "vlc"], stdin=subprocess.PIPE)


def radioOff():
    global radioshell
    radioshell.kill()
    radioshell = None


def play(voice_command):
    """Search YouTube for the requested track via mpsyt and play the first hit."""
    global playshell
    track = voice_command.replace('play', '', 1)
    logging.info("playing: %s", track)
    if playshell is None:
        playshell = subprocess.Popen(["/usr/local/bin/mpsyt", ""],
                                     stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE)
    playshell.stdin.write(bytes('/' + track + '\n1\n', 'utf-8'))
    playshell.stdin.flush()


def radio(voice_command):
    """Start a BBC radio stream, or stop it on 'radio off'."""
    global radioshell
    command = voice_command.lower()
    logging.info(voice_command)
    if radioshell is None:
        if command != 'radio off':
            playerOff()
            if command in ['radio 1', 'radio one']:
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio1_mf_p"
            elif command == 'radio 2':
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio2_mf_p"
            elif command == 'radio 3':
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio3_mf_p"
            elif command == 'radio 4':
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio4fm_mf_p"
            elif command == 'radio 5':
                stationurl = "http://open.live.bbc.co.uk/mediaselector/5/redir/version/2.0/mediaset/http-icy-mp3-a-stream/proto/http/vpid/bbc_radio_five_live"
            elif command == 'radio 6':
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_6music_mf_p"
            elif command == 'radio 1xtra':
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio1xtra_mf_p"
            elif command == 'radio 4 extra':
                # Compared in lower case, since the command has been lower-cased.
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio4extra_mf_q"
            else:
                # Unrecognised station: fall back to Radio 4.
                stationurl = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio4fm_mf_p"
            radioshell = subprocess.Popen(["/usr/bin/cvlc", stationurl],
                                          stdin=subprocess.PIPE,
                                          stdout=subprocess.PIPE)
            radioshell.poll()
    elif command == 'radio off':
        radioOff()


def main():
    status_ui = aiy.voicehat.get_status_ui()
    status_ui.status('starting')
    assistant = aiy.assistant.grpc.get_assistant()
    button = aiy.voicehat.get_button()
    with aiy.audio.get_recorder():
        while True:
            status_ui.status('ready')
            print('Press the button and speak')
            button.wait_for_press()

            # Turn the music down while listening, then put it back.
            music_playing = radioshell is not None or playshell is not None
            if music_playing:
                volume = getVolume()
                setVolume(QUIETVOLUME)
            status_ui.status('listening')
            print('Listening...')
            text, audio = assistant.recognize()
            if music_playing:
                setVolume(volume)

            if text is not None:
                if text == 'goodbye':
                    status_ui.status('stopping')
                    print('Bye!')
                    break
                elif 'play' in text.lower()[0:5]:
                    play(text)
                    audio = None
                elif 'stop' in text.lower()[0:5]:
                    playerOff()
                    audio = None
                elif 'radio' in text.lower()[0:6]:
                    radio(text)
                    audio = None
                elif 'volume' in text.lower()[0:7]:
                    volumeCommand(text)
                    audio = None
                print('You said "', text, '"')
            # For our own commands the Assistant's spoken reply is suppressed.
            if audio is not None:
                aiy.audio.play_audio(audio)


if __name__ == '__main__':
    main()
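The long `if`/`elif` chains in the listing work, but they get harder to maintain as you add commands. One possible alternative, sketched here rather than taken from our demonstrator, is to keep the station URLs in a dictionary and route commands by their first word. The handlers below are placeholders that only return strings, and `STATIONS`, `station_url`, `route` and `HANDLERS` are names invented for this example; only two of the station URLs from the listing are repeated to keep it short.

```python
# Two of the stream URLs from the listing; extend the table as needed.
RADIO_4 = "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio4fm_mf_p"
STATIONS = {
    "radio 1": "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio1_mf_p",
    "radio one": "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_radio1_mf_p",
    "radio 4": RADIO_4,
}


def station_url(voice_command):
    """Look up a station, falling back to Radio 4 as the listing does."""
    return STATIONS.get(voice_command.lower().strip(), RADIO_4)


def route(text, handlers):
    """Call the handler named by the first word of the command.

    Returns None when no handler matches, meaning the Assistant's own
    audio reply should be played instead.
    """
    words = text.lower().split()
    if not words:
        return None
    handler = handlers.get(words[0])
    return handler(text) if handler else None


# Placeholder handlers (hypothetical): the real ones would start players.
HANDLERS = {
    "radio": lambda text: "tune to " + station_url(text),
    "play": lambda text: "search for " + text[len("play"):].strip(),
}

print(route("Radio 1", HANDLERS))
print(route("play some jazz", HANDLERS))
print(route("what is the value of pi", HANDLERS))
```

Note that matching on the whole first word is slightly stricter than the slice tests in the listing (for instance, `'play' in text.lower()[0:5]` also matches "played"), so this is a design choice to weigh up rather than a drop-in replacement.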