Talk about straight-out-of-science-fiction.
The promise? “Make the memories last.” The plan to get there? “Ambient Intelligence.”
At its annual re:MARS conference, Rohit Prasad, Amazon’s Senior Vice President and Head Scientist for Alexa, announced a new voice assistant feature that’s raising eyebrows and inviting questions about (potentially harmful) applications outside of its intended purpose.
The company is developing a feature that will allow Alexa, Amazon’s digital assistant, to mimic the voice of anyone it hears — using less than a minute of provided audio.
“The way we made it happen is by framing the problem as a voice conversion task and not a speech generation path,” Prasad explained. “We are unquestionably living in the golden era of AI, where our dreams and science fictions are becoming a reality.”
To demonstrate the feature, Amazon showed a child asking, “Alexa, can grandma finish reading me the Wizard of Oz?” Once Alexa acknowledged the request, the assistant’s voice changed to mimic the grandmother’s. As mentioned above, Amazon wants to “make the memories last,” especially after the COVID-19 pandemic, during which “so many of us have lost someone we love,” Prasad added.
While these uses, alongside applications like helping people with speech impediments, are Amazon’s primary goals for the feature, it’s not hard to see the potential for abuse and the security concerns it raises. For starters, it could be used to create politically motivated deepfakes. There’s also, as CNET pointed out, an ethical question about the rights to a deceased person’s voice: specifically, how long it can be kept on a device or company server.
What is ambient intelligence?
AI, the underlying technology used here, has been around for a while. Prasad explains ambient intelligence this way:
“Ambient intelligence is artificial intelligence [AI] that is embedded everywhere in our environment. It is both reactive, responding to explicit customer requests, and proactive, anticipating customer needs. It uses a broad range of sensing technologies, like sound, vision, ultrasound, atmospheric sensing like temperature and humidity, depth sensors, and mechanical sensors, and it takes actions, playing your favorite tune, looking up information, buying products you need, or controlling thermostats, lights, or blinds in your smart home.”
He elaborates that Alexa is made up of over 30 machine learning systems that can each process different sensory signals. Ambient intelligence is the path that leads to generalizable intelligence (GI). While this may sound like something out of a movie where robots take over the world, Prasad describes GI in terms of three key attributes, the ability to:
- Accomplish multiple tasks
- Rapidly evolve to ever-changing environments
- Learn new concepts and actions with minimal external human input
