Ever since Amazon introduced the Amazon Echo (2014 in the USA, 2016 in Germany and the UK), there have been concerns about whether the voice-controlled, cloud-connected gadget spies on its owners and uploads every conversation within reach of its seven microphones. We took another close look at the Amazon Echo and the Amazon Echo Dot to confirm or dispel some of these concerns.

The Devices

Both the Amazon Echo and the Amazon Echo Dot use a highly modified version of Unix as their operating system. Although the Echo Dot has a USB connector, no wired communication (e.g. via ADB) is possible. The fastboot tool from the Android SDK did show us a connected device, but we were not able to unlock the bootloader, flash any files, or root the operating system.

Communication checked

Almost every communication channel of the Amazon devices uses TLS 1.2 encryption with certificate validation/pinning. This technique prevented us from using a man-in-the-middle proxy to read the encrypted traffic.

Certificate pinning
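
To illustrate why pinning defeats a man-in-the-middle proxy, here is a minimal sketch of the general idea, not Amazon's actual implementation. It assumes the client ships with a set of known SHA-256 fingerprints and rejects any certificate whose fingerprint is not in that set. The function names and the placeholder certificate bytes are hypothetical. Real clients often pin the public key rather than the whole certificate.

```python
import hashlib
import hmac
from typing import Iterable


def fingerprint(cert_der: bytes) -> str:
    """Return the SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()


def is_pinned(cert_der: bytes, pinned: Iterable[str]) -> bool:
    """Accept the certificate only if its fingerprint matches a stored pin."""
    presented = fingerprint(cert_der)
    # compare_digest avoids leaking information through comparison timing
    return any(hmac.compare_digest(presented, pin) for pin in pinned)


# Placeholder byte strings standing in for real DER certificates (assumption).
genuine_cert = b"genuine-server-certificate"
proxy_cert = b"certificate-issued-by-mitm-proxy"

# The pin list would be baked into the device firmware.
pins = [fingerprint(genuine_cert)]

print(is_pinned(genuine_cert, pins))  # the genuine server is accepted
print(is_pinned(proxy_cert, pins))    # the proxy's certificate is rejected
```

Even if the proxy's certificate were signed by a CA the device trusts, the fingerprint comparison would still fail, which matches what we observed.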

One of the few unencrypted requests was an ordinary connectivity check that requests an Amazon page. Because all sensitive information is transported encrypted, we can only make assumptions about what is transmitted to Amazon; we cannot prove them.

We created a small scenario to evaluate what data might be sent over the internet, and captured the traffic with Wireshark. The scenario was as follows:

  1. ~8 seconds of a silent room
  2. Asking Alexa “What time is it?” and awaiting the answer (~6 seconds)
  3. ~8 seconds of a silent room
  4. Saying “Alexa – tea, earl grey, hot” and awaiting the answer (~5 seconds)
  5. ~23 seconds of a normal conversation between two people (not including any Alexa keywords)
Transmitted bytes from the Amazon Echo

The graph shows the transmitted bytes with a resolution of 100 ms. The traffic rises significantly as soon as the “Alexa” keyword is spoken. We assume that during these phases the Echo uploads the voice recording to the Alexa Voice Service (AVS, https://developer.amazon.com/alexa-voice-service). As soon as Alexa has finished answering our question, the traffic drops back to the previous level. Data transmitted in the idle phases might be device metrics, keep-alive and push messages. Judging by the low idle traffic, it seems safe to say that the Echo devices do not continuously upload recordings of their surroundings to Amazon.

What we cannot rule out is that the Echo records all the time and saves the audio to its internal storage. If that were the case, Amazon might be able to access or upload these files via remote commands, e.g. in the case of a crime (https://www.forbes.com/sites/ianmorris/2016/12/28/amazon-echo-now-an-expert-murder-witness/ and http://edition.cnn.com/2017/03/07/tech/amazon-echo-alexa-bentonville-arkansas-murder-case/). Amazon, however, has assured that the devices only store audio once the wake-up keyword has been spoken.
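
A graph like this can be reproduced from any packet capture by summing packet sizes per 100 ms interval. The following is a minimal sketch, assuming the capture has been exported as (timestamp in seconds, packet length in bytes) pairs. The sample values and the 1000-byte threshold are made up for illustration and are not taken from our measurement.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bytes_per_bin(packets: Iterable[Tuple[float, int]],
                  bin_ms: int = 100) -> Dict[int, int]:
    """Sum packet lengths per time bin; keys are bin indices (0 = first bin)."""
    totals: Dict[int, int] = defaultdict(int)
    for timestamp_s, length in packets:
        totals[int(timestamp_s * 1000) // bin_ms] += length
    return dict(totals)


def active_bins(totals: Dict[int, int], threshold: int) -> List[int]:
    """Return the bins whose byte count exceeds the threshold, in time order."""
    return sorted(b for b, n in totals.items() if n > threshold)


# Hypothetical sample: small idle packets, then two large upload packets.
sample = [(0.0, 100), (0.0625, 60), (0.125, 1400), (0.25, 1400), (0.3125, 80)]

totals = bytes_per_bin(sample)
print(totals)
print(active_bins(totals, threshold=1000))
```

Bins that stand out against the idle baseline mark the phases in which the device is likely uploading audio.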

We also tried to analyse differences in the contacted web services over a longer time span in two scenarios, idle mode and active usage. Unfortunately, the inspection tools were of little use for the incoming and outgoing TLS streams. Both scenarios only showed connections to Amazon’s device-metrics server, although we had expected more for the latter scenario.

As stated above, we were not able to perform a man-in-the-middle attack due to certificate pinning, but not all of the captured SSL traffic was safe from our tools: at least the metadata could be evaluated. The following screenshot shows an evaluated SSL capture during a firmware update.

Analysis of the captured SSL traffic
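
One piece of metadata that typically remains readable in TLS 1.2 traffic is the host name in the Server Name Indication (SNI) extension of the ClientHello. The sketch below shows how such a name can be read from a raw handshake record. It is an illustration under assumptions, not the tool we used: the function names are hypothetical, the ClientHello is built synthetically, `example.com` is a placeholder host, and the parser expects the whole ClientHello in a single record.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello record, or None."""
    try:
        if record[0] != 0x16:        # record content type 0x16 = handshake
            return None
        pos = 5                      # skip record header (type, version, length)
        if record[pos] != 0x01:      # handshake type 0x01 = ClientHello
            return None
        pos += 4                     # handshake type (1) + length (3)
        pos += 2 + 32                # client version + random
        pos += 1 + record[pos]       # session id (length-prefixed)
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len        # cipher suites
        pos += 1 + record[pos]       # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:        # extension 0 = server_name
                # layout: list length (2), name type (1), name length (2), name
                if record[pos + 2] != 0:
                    return None
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                name = record[pos + 5:pos + 5 + name_len]
                if len(name) != name_len:
                    return None      # truncated record
                return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


def build_client_hello(host: str) -> bytes:
    """Build a minimal synthetic ClientHello carrying only an SNI extension."""
    name = host.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni = struct.pack("!H", len(entry)) + entry
    ext = struct.pack("!HH", 0, len(sni)) + sni
    body = (b"\x03\x03" + bytes(32) + b"\x00"   # version, random, empty session id
            + struct.pack("!H", 2) + b"\x00\x2f"  # a single cipher suite
            + b"\x01\x00"                         # null compression only
            + struct.pack("!H", len(ext)) + ext)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


print(extract_sni(build_client_hello("example.com")))
```

Together with IP addresses, timing and transfer sizes, host names like this are enough to tell, for example, a metrics upload from a firmware download without decrypting anything.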

In contrast to other researchers and enthusiasts trying to hack the Amazon Echo Dot, we could not observe a firmware update being transmitted over an unprotected HTTP connection, as described e.g. in https://medium.com/@micaksica/exploring-the-amazon-echo-dot-part-1-intercepting-firmware-updates-c7e0f9408b59#7525 or https://blog.padil.la/2017/01/20/amazon-echo-dot-system-image/. While they downloaded a firmware image with version 564196920, our Echo Dot already had version 571207720 installed. Shortly after booting the Amazon Echo we observed a peak in data throughput. After about a minute the Echo downloaded 131 MB from a CDN, most probably the data for a new firmware, because the Amazon Echo App reported version 578223820 afterwards.

Data throughput during firmware update

Because we were not able to catch any unencrypted requests retrieving the firmware update, it seems reasonable to assume that Amazon changed the OTA update procedure in favour of a more secure one. Nonetheless, we analysed the firmware 564196920 from the mentioned blog post and found a highly modified version of Android as the operating system, most likely Amazon’s FireOS. There were also several APKs packed in the firmware image, but we did not find any abnormalities.

The bigger Amazon Echo was also of interest to smart home enthusiasts. Unfortunately, no one has found a downloadable OTA update file, and neither did we. However, some hackers were able to use the debug connection pads on the Echo motherboard to open a terminal and extract the filesystem (https://github.com/echohacking/wiki/wiki/Echo). The system image shows that the Echo is running a Unix-based operating system, but unlike the Echo Dot it does not seem to be based on Android/Fire OS.

Conclusion

The Amazon Echo and Echo Dot leave us with mixed feelings. Although we are happy about every bit of encryption we find in our network traffic captures, there is always an underlying bad feeling about these new connected smart home devices. The multiple built-in microphones are always able to listen to their surroundings and, because of the secure encryption, we cannot tell what data is transmitted to Amazon. So in the end everybody has to decide for themselves whether trendy technology and gimmicks are worth the risk of losing a part of their personal privacy.