January 2017

Adding Notification Capability to Your Amazon Echo

One of the capabilities that is missing from the Amazon Echo is the ability to initiate spoken notifications even when the user hasn't addressed the Echo. Although Amazon will likely address this at some point, they obviously want to be careful in how they make it available so it doesn't cause the device to become an annoyance. In the meantime, those of us who have specific use cases -- for instance an alternate channel for notification of a high-priority issue for a production workload on the cloud -- can use workarounds. One such workaround is to use an inexpensive Raspberry Pi (possibly a Pi Zero with a WiFi/Bluetooth shield or USB dongle) and a Bluetooth audio link to play the alert through the Echo. (Another is to make your own Echo on a Pi, which is somewhat more complex.) With the introduction of Amazon's new Polly synthesis API this can be an almost seamless extension of the Echo.

I will describe the two main parts of the challenge -- getting audio streaming to work from the Pi to the Echo and exercising various AWS APIs to perform the synthesis and to get commands from the cloud. I'll cover the highlights but generally will assume the reader has some knowledge of Amazon Web Services.

I used both a Raspberry Pi 3 (with no extra hardware) and a Raspberry Pi Zero with the RedBear IoT pHAT to achieve this. The first issue is pairing with the Echo to produce audio output from the Pi. I used the latest Raspbian Jessie build on the Pi. The BlueZ Bluetooth stack and PulseAudio are both already installed (and needed for this). It can be tricky to get the audio connected (and also to achieve this automatically after a reboot). With one of my Echos (the original model) and the Zero/IoT pHAT combo, I had no problem automatically establishing the connection even after a reboot. In another test with a recent-vintage Echo Dot, I'm still working on perfecting an automatic way to reliably connect -- I can do it manually, but usually only after a few failed attempts. Keep in mind I'm trying to do this through the console. In either case, once the connection is up, it tends to stay up. The manual steps from the console are roughly:

  1. Make sure you run as a non-root user (like pi)
  2. Start PulseAudio if it's not running with "pulseaudio --start"
  3. Ask the Echo to pair by saying "Alexa Pair"
  4. Run "bluetoothctl"
  5. Use "scan on" and then when you see the Echo use "pair" followed by its Bluetooth address
  6. Use "connect" followed by its Bluetooth address to connect
  7. Use "quit" to exit bluetoothctl
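
The manual steps above could be scripted along these lines. This is only a sketch and not the logic in audio.py: the function names and the fixed delay between commands are my own assumptions, and in practice scanning and pairing can take longer than any fixed wait, so expect to tune it or retry.

```python
import re
import subprocess
import time

# Matches addresses of the form AA:BB:CC:DD:EE:FF.
ADDRESS_RE = re.compile(r"^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def pairing_commands(address):
    """Return the bluetoothctl commands from steps 5-7 for one device."""
    if not ADDRESS_RE.match(address):
        raise ValueError("not a Bluetooth address: %r" % address)
    return ["scan on", "pair " + address, "connect " + address, "quit"]


def run_bluetoothctl(commands, delay=5.0):
    """Feed commands to bluetoothctl one at a time, pausing between them.

    The delay is a guess (hypothetical value); bluetoothctl gives no
    simple signal that a step has finished.
    """
    proc = subprocess.Popen(["bluetoothctl"], stdin=subprocess.PIPE,
                            universal_newlines=True)
    for command in commands:
        proc.stdin.write(command + "\n")
        proc.stdin.flush()
        time.sleep(delay)
    proc.stdin.close()
    return proc.wait()
```

You would still say "Alexa Pair" first, then call run_bluetoothctl(pairing_commands(address)) with the address of your own Echo.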

Beyond the pair step, some of my automated logic is shown in audio.py, whose setup routine attempts to establish the connection. One problematic case is where the connect succeeds and then fails a few seconds later. I'm working on adding strategies to the code to recover from that. Hopefully, if you're trying this, you connected successfully. The next steps involve getting PulseAudio configured:

  1. Use "pactl list cards" -- if you connected above you should see a bluez_card listed whose description says "Echo..." -- note its number
  2. Use "pactl list sinks" -- if you are already linked to play audio you should see a bluez_sink with description "Echo...", note its number and skip to step 5.
  3. If you don't see the Echo as a sink you need to run "pactl set-card-profile <CARD_NUMBER> a2dp"
  4. Now re-run "pactl list sinks" to confirm you see the Echo as a sink and note its number
  5. Use "pactl set-default-sink <SINK_NUMBER>" to set audio to play on the Echo
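
The PulseAudio steps above can be sketched in Python by calling pactl and reading its "short" listing format (tab-separated, with the index in the first column and the name in the second). This is an illustration under assumptions, not the code in audio.py: the helper names are mine, and it picks the first bluez_card or bluez_sink it finds rather than checking that the description says "Echo".

```python
import subprocess


def find_by_prefix(short_listing, prefix):
    """Return the index of the first entry whose name starts with prefix."""
    for line in short_listing.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1].startswith(prefix):
            return fields[0]
    return None


def pactl(*args):
    """Run pactl with the given arguments and return its text output."""
    return subprocess.check_output(("pactl",) + args, universal_newlines=True)


def route_audio_to_echo():
    """Make the Bluetooth sink the default, switching to a2dp if needed."""
    sink = find_by_prefix(pactl("list", "short", "sinks"), "bluez_sink")
    if sink is None:
        card = find_by_prefix(pactl("list", "short", "cards"), "bluez_card")
        if card is None:
            raise RuntimeError("no bluez_card found; is the Echo connected?")
        pactl("set-card-profile", card, "a2dp")
        sink = find_by_prefix(pactl("list", "short", "sinks"), "bluez_sink")
    if sink is None:
        raise RuntimeError("no bluez_sink appeared after setting the profile")
    pactl("set-default-sink", sink)
    return sink
```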

I will likely keep refining audio.py as I test various scenarios.

To install the software, you'll need to run:

sudo apt-get install mpg123
sudo pip install boto3 pulsectl
git clone https://github.com/swb1701/AlexaPi.git

The notifier.py routine first attempts to connect to the Echo (whose Bluetooth address should be in secrets.py). It then establishes a connection with AWS for synthesis and uses Amazon's SQS (Simple Queue Service) to get commands to execute. SQS allows up to a million free API calls per month. It can be set to long poll when retrieving things from the queue (for up to 20 seconds per call). That means if you operate the device 24/7 with an idle queue, you will use about 130,000 of your free API calls each month (one call every 20 seconds). In return you get a nearly immediate spoken response on the Echo any time you post an entry to the SQS queue in the cloud. There are numerous other ways to feed commands to your Pi from the cloud, including the AWS IoT APIs -- I'm just offering SQS as one of the simplest. For the notifier you can use JSON to send complex commands to the Pi (although if JSON isn't detected, the notifier defaults to speaking the text of queue entries). A complex JSON command might include information on an audible alert to be played, a repeat count, etc. Soon I'll discuss a more complex example of how to do spoken interaction with a Pi using a nearby Echo (see the repo for a preview).
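
To make the long-polling arithmetic and the JSON fallback concrete, here is a minimal sketch. It is not the code in notifier.py: the function names, the "text" key used for plain messages, and the default region are assumptions chosen for illustration. The boto3 calls (receive_message with WaitTimeSeconds, delete_message) are the standard SQS client operations.

```python
import json


def polls_per_month(wait_seconds=20, days=30):
    """Number of long-poll calls made by an idle notifier running 24/7."""
    return days * 24 * 60 * 60 // wait_seconds


def parse_command(body):
    """Return a command dict; anything that isn't a JSON object is spoken as-is.

    The "text" key is a hypothetical field name for this sketch.
    """
    try:
        command = json.loads(body)
    except ValueError:
        return {"text": body}
    if not isinstance(command, dict):
        return {"text": body}
    return command


def poll_forever(queue_url, handle, region="us-east-1"):
    """Long poll the queue and pass each parsed command to handle()."""
    import boto3  # imported here so the helpers above work without it

    sqs = boto3.client("sqs", region_name=region)
    while True:
        response = sqs.receive_message(QueueUrl=queue_url,
                                       WaitTimeSeconds=20,
                                       MaxNumberOfMessages=1)
        for message in response.get("Messages", []):
            handle(parse_command(message["Body"]))
            sqs.delete_message(QueueUrl=queue_url,
                               ReceiptHandle=message["ReceiptHandle"])
```

With the defaults, polls_per_month() gives 129,600, which is where the figure of about 130,000 comes from. A busy queue makes more calls, since each poll returns as soon as a message arrives and each message also costs a delete call.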

You'll need to fill in some AWS keys, queue names, and your Bluetooth device address in secrets.py (see secrets.py.example). Then you run the notifier with "python notifier.py". If notifier.py works for you in automatically establishing the Bluetooth connection, you can then run it on boot by creating an executable script (saved here as /home/pi/notifier) like:

#!/bin/bash
/bin/sleep 30
/usr/bin/sudo -u pi /bin/bash -c 'cd /home/pi/AlexaPi;/usr/bin/python notifier.py'

And then link it into /etc/init.d and schedule it to run at boot by executing these commands as root:

cd /etc/init.d
ln -s /home/pi/notifier notifier
update-rc.d notifier defaults

The notifier.py code needs AWS access keys defined in secrets.py. The AWS keys should be for an IAM user with appropriate permissions for the operations the notifier will perform. These include Polly synthesis access:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "polly:*"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

as well as access to the SQS queue that will be used to push commands to the notifier:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sqs:*",
            "Resource": "arn:aws:sqs:*:<YOUR-ACCOUNT>:alexa-queue-*"
        }
    ]
}
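
To show what the Polly permission is used for, here is a minimal sketch of synthesizing a phrase to an MP3 file and playing it with mpg123. It is not the code from the repo: the function names, the output path, and the choice of the "Joanna" voice are assumptions, and it expects boto3 to find credentials and a region through its usual configuration rather than through secrets.py.

```python
import subprocess


def synthesize_to_file(text, path="/tmp/speech.mp3", voice="Joanna",
                       client=None):
    """Ask Polly for MP3 audio of text and save it to path."""
    if client is None:
        import boto3  # imported lazily so a stand-in client can be passed

        client = boto3.client("polly")
    response = client.synthesize_speech(Text=text, OutputFormat="mp3",
                                        VoiceId=voice)
    with open(path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
    return path


def speak(text):
    """Synthesize text and play it on the default sink (the Echo)."""
    path = synthesize_to_file(text)
    subprocess.check_call(["mpg123", "-q", path])
```

Because playback goes to whatever PulseAudio has as its default sink, the audio comes out of the Echo once the sink has been set as described earlier.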

The queue name should correspond to an SQS queue you create using the AWS console. I used multiple queues whose names all start with alexa-queue- and permitted them all with the wildcard in the example above. My current SQS scheme associates one SQS queue with each Echo. To send a broadcast, you'd need to send a message to all queues, or set up a broadcast queue and a Lambda that spreads it across a set of Echo devices. Of course, at this point a pub/sub model like AWS IoT would be better for handling these complex scenarios -- maybe later I'll offer up that variant.
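
As a rough illustration of the one-queue-per-Echo scheme, a broadcast is just the same message sent to every queue. The sketch below assumes you already have the queue URLs to hand; the function name and the example URLs are hypothetical. It sends plain text, which the notifier speaks as-is.

```python
def broadcast(sqs, queue_urls, text):
    """Send the same plain-text message to every queue; return the count."""
    sent = 0
    for url in queue_urls:
        sqs.send_message(QueueUrl=url, MessageBody=text)
        sent += 1
    return sent


if __name__ == "__main__":
    import boto3

    # Hypothetical queue URLs; substitute the ones shown in your AWS console.
    urls = [
        "https://sqs.us-east-1.amazonaws.com/123456789012/alexa-queue-kitchen",
        "https://sqs.us-east-1.amazonaws.com/123456789012/alexa-queue-office",
    ]
    broadcast(boto3.client("sqs", region_name="us-east-1"), urls,
              "The nightly build has failed")
```

The same send_message call with a single queue URL is all a Lambda function or any other sender needs to make one Echo speak.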

My use case was primarily to get AWS cloud notifications back to my Echo devices. For me, most of the logic that performs the notifications is already running on AWS Lambda, where submission to an SQS queue is straightforward. But you can use the same SQS scheme from almost any language or place where you have access to an AWS API. Lambda is also convenient because it's a likely location for deploying Alexa Skills, which handle voice interactions from Alexa itself. I'll be discussing that when we get into more complex interactions between the Pi and the Echo. There are also other interesting scenarios where the Pi can get you beyond Echo limitations. For instance, through spoken interaction you could set alarms beyond the time horizon allowable on the Echo itself. You could also set up complex spoken notification criteria.