Inaudible ultrasound commands can be used to secretly control Siri, Alexa, and Google Now

The Verge – by James Vincent

Is your digital assistant taking orders behind your back? Scientists from China’s Zhejiang University have proved it’s possible, publishing new research that demonstrates how Siri, Alexa, and other voice-activated programs can be controlled using inaudible ultrasound commands. This provides a new method of attack for hackers targeting devices like phones, tablets, and even cars. But don’t get too worried: the technique has a number of key limitations that mean it’s unlikely to cause chaos.

Using ultrasound as a discreet form of digital communication is actually pretty common. As pointed out in a FastCompany report on the topic, Google’s Chromecast and Amazon’s Dash Buttons both use inaudible sounds to pair to your phone. And advertisers take advantage of these secret audio freeways too, broadcasting ultrasonic codes in TV commercials that work like cookies in a web browser, tracking a user’s activity across devices.

Deploying these high-pitched frequencies to hack voice assistants has also been suggested before, but this new work from Zhejiang provides the most comprehensive test of the concept to date. And really, it’s impressive just how susceptible modern technology is.

To carry out their attacks, the researchers first created a program to translate normal voice commands into frequencies too high for humans to hear using harmonics. (In this case, that means frequencies higher than 20 kHz.) Then, they tested whether those commands would be obeyed by 16 voice control systems, including Siri, Google Now, Samsung S Voice, Cortana, Alexa, and a number of in-car interfaces. The researchers dubbed their method “DolphinAttack” because dolphins, like bats, use high-pitch noises bounced off their surroundings as a form of echolocation.
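To make the idea of "translating" a sound into inaudible frequencies concrete, here is a minimal sketch of one textbook way to do it: amplitude modulation, where the original signal rides on a high-frequency carrier. This is an illustration only. The article does not say how the researchers' program works, and the 96 kHz sample rate, the 25 kHz carrier, the 1 kHz stand-in tone, and the function names are all assumptions made for this example.

```python
import math

SAMPLE_RATE = 96_000  # Hz; assumed, high enough to represent a 25 kHz carrier
CARRIER_HZ = 25_000   # Hz; assumed carrier above the ~20 kHz limit of hearing


def amplitude_modulate(baseband, carrier_hz=CARRIER_HZ,
                       sample_rate=SAMPLE_RATE, depth=1.0):
    """Shift a baseband signal (values in [-1, 1]) up around carrier_hz."""
    out = []
    for n, x in enumerate(baseband):
        carrier = math.cos(2 * math.pi * carrier_hz * n / sample_rate)
        # Dividing by (1 + depth) keeps the result within [-1, 1].
        out.append((1.0 + depth * x) * carrier / (1.0 + depth))
    return out


def tone_power(signal, freq_hz, sample_rate=SAMPLE_RATE):
    """Power of one frequency component (a single-bin DFT)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / sample_rate)
             for n, s in enumerate(signal))
    return (re * re + im * im) / len(signal) ** 2


# A 1 kHz tone stands in for an audible voice command (0.1 s of audio).
tone = [math.sin(2 * math.pi * 1_000 * n / SAMPLE_RATE) for n in range(9_600)]
shifted = amplitude_modulate(tone)

for freq in (1_000, 24_000, 25_000, 26_000):
    print(freq, "Hz power:", tone_power(shifted, freq))
```

After modulation, the energy that sat at 1 kHz shows up at the carrier and at the two sidebands 1 kHz either side of it, all above 20 kHz, while the audible 1 kHz component is essentially gone.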

DolphinAttack was successful across the board, and the researchers were able to issue a number of commands, including “activating Siri to initiate a FaceTime call on iPhone, activating Google Now to switch the phone to the airplane mode, and even manipulating the navigation system in an Audi automobile.” They suggest the method could be used for a number of malicious attacks, including instructing a device to visit a website that would download a virus or exploit, or initiating outgoing phone calls to spy on a victim.

In a neat bit of extra-credit work, they even thought through how to compromise a voice command system trained to respond to only one person’s voice. (Siri has offered this feature for a while, but it’s hardly foolproof.) They theorized that if you could get a potential target to say a particular sentence — for example, “he carries cake to the city” — you could slice up the syllables and rearrange them to form the words “Hey Siri.” Then, hey presto, you can issue your nefarious commands to the target device.

[Image: test results for controlling Siri. DolphinAttack proved to be consistently able to issue commands to a number of devices in different languages.]

As with the rest of the research, this method is satisfyingly clever, but a little too impractical to be a widespread danger.

For a start, for a device to pick up an ultrasonic voice command, the attacker needs to be nearby: as in, no more than a few feet away. The attack also needs to take place in a fairly quiet environment. A DolphinAttack that asks Siri to turn on airplane mode was 100 percent successful in an office, 80 percent successful in a cafe, and just 30 percent successful in the street. The researchers also had to buy a special speaker (albeit a very cheap one) to broadcast the commands, and noted that the attacks sometimes had to be tuned to their target. That’s because the frequency responses of microphones differ from manufacturer to manufacturer. For the Nexus 7, for example, they found that the best performance came from commands issued at 24.1 kHz (although the device also responded to other frequencies).

In addition to these environmental constraints, it’s worth remembering that pretty much all digital assistant systems respond audibly to any voice commands. So the chances of a hacker controlling your phone without you noticing are slim. Plus, to carry out more impactful commands (like telling a device to visit a certain website, or sending money to someone) you usually have to unlock your device or confirm the instruction. The researchers also noted that it would be pretty easy to implement a fix: you can just tweak the hardware or software to ignore commands outside a certain frequency range.
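As a rough illustration of what a software version of that fix might look like, the sketch below measures how much of a recording's energy sits at or above 20 kHz and rejects the input if that share is too high. It is not the researchers' proposal in any detail: the function names, the 500 Hz probing step, and the 0.5 threshold are hypothetical, and it assumes the software can see raw audio captured at a sample rate high enough to preserve ultrasonic content.

```python
import math


def band_energy_ratio(samples, sample_rate, cutoff_hz=20_000, step_hz=500):
    """Fraction of probed spectral energy at or above cutoff_hz."""
    low = high = 0.0
    freq = step_hz
    # Probe the spectrum at evenly spaced frequencies below Nyquist.
    while freq < sample_rate / 2:
        re = sum(s * math.cos(2 * math.pi * freq * i / sample_rate)
                 for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * freq * i / sample_rate)
                 for i, s in enumerate(samples))
        power = re * re + im * im
        if freq >= cutoff_hz:
            high += power
        else:
            low += power
        freq += step_hz
    total = low + high
    return high / total if total else 0.0


def should_ignore(samples, sample_rate, threshold=0.5):
    """Hypothetical policy: drop input dominated by ultrasonic energy."""
    return band_energy_ratio(samples, sample_rate) > threshold


# Usage: a 1 kHz tone is accepted, a 25 kHz tone is rejected.
rate = 96_000
audible = [math.sin(2 * math.pi * 1_000 * i / rate) for i in range(960)]
ultrasonic = [math.sin(2 * math.pi * 25_000 * i / rate) for i in range(960)]
print(should_ignore(audible, rate), should_ignore(ultrasonic, rate))
```

A real assistant would need a faster spectral estimate and a threshold tuned on actual recordings; the point here is only that the check itself is simple.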

All these caveats aside, DolphinAttack shows how new ways of interacting with technology invariably introduce new vulnerabilities. The advent of ‘conversational computing’ is no exception, and manufacturers may want to look into this sort of hack before the inaudible whispering campaign against them gets started in earnest.
