Voice Commands Speak to Future Benefits

Truly conversational devices are becoming increasingly prevalent in our daily lives. Through smart home devices such as Amazon Echo or Google Home, interactive “assistants” allow consumers to activate shortcuts that save time and help automate routine tasks, all using the sound of their own voice.

This sea change in home automation is starting to take place in the living room as well. Voice is no longer just for adjusting the thermostat or turning on music: people are using it to search for, and discover, content on TV. The industry is seeing an increase not only in the number of households that use voice to discover content, but also in the complexity of that discovery. The prevalence of home assistants is teaching people not only that they can speak to a device, but also how to speak naturally with it.

Act Naturally
A transformation from robotic dictation to natural conversation is underway and will play out over the coming months and years. What started out as simple searches for titles (“Find me Game of Thrones”) or channels (“Tune to HBO”) has now broadened to much subtler, more complex searches such as “What’s on TV?” or “Find me some good comedy movies,” followed by “only the ones rated five stars.” A voice-supported interface that isn’t smart enough to handle these complex searches, or to maintain context for follow-up questions, is like the first iteration of Siri: a fun toy, but for many of us not one that does much to improve our day-to-day lives.

The implications of this shift are significant. A voice interface is like a wormhole through a user interface. Endless menu paths that very few viewers succeed at mastering, and the cluttered screens that go with them, can be replaced by a clean and intuitive user interface that allows voice interaction. The challenge then becomes not only to provide a set of results for a given search, but also to interpret the intent of the viewer and act upon that intent in a meaningful way, so that only the most salient results are shown, personalized to the viewer's interests.

Adding the extra dimension of personalization to voice functionality offers viewers not only the most cutting-edge user experience, but also a more accurate, higher-quality set of results. The data is also available to marketers, so it is important to deliver the right promotion at the right time to the right viewer.

Talk, Then Target
For example, when a viewer expresses a well-defined intent, like “Find me my game,” the system knows from previous viewing habits to tune to the New York Yankees game, and it also has the opportunity to deliver targeted marketing and upsell paid content such as MLB Extra Innings. For networks and studios, the opportunity to deliver a targeted promotion is priceless. If a viewer asks, “Find me a comedy movie,” and a broadcast network or studio has a new satirical comedy, that program can be promoted only to the viewers who have previously enjoyed such shows.

The ability to dynamically generate valuable screen real estate, as opposed to relying on hard-coded areas of the screen, will deliver a much greater monetization opportunity. Combining this with the ability to target certain portions of a viewing audience makes it possible not only to deliver the right content “in the moment,” but also to deliver it to the right viewer. Tracking the performance of these efforts will also allow marketers to continuously optimize not only the content, but also the ad message.

The living room is on a journey of transformation. Gone are the days of endless sifting through channels, or debating over which content is best. Voice functionality provides not only a clean, simple and easy-to-use interface, but also the next generation of monetization platforms.

Jon Heim is director of product management, conversation services, and Chris Ambrozic is senior director of analytics and product at TiVo.