Voice has many advantages that make it the next big disruptor on the horizon in user experience, but it can only deliver on them once enough users adopt it to reach a critical mass. As intuitive as the interface is, some challenges stand in the way of widespread adoption, from a lack of awareness of what the interface can and cannot do, to the absence of a common language, to more practical problems like the lack of privacy in public spaces.
What stands in the way of adoption?
1. The knowledge black box
In a touch-screen interface, the user’s options are laid out visually and are therefore limited. The analogy is a closed-ended questionnaire that lets you choose only “Yes,” “No” or “Maybe.” In a VUI, the possibilities are more open. Imagine the same questionnaire made open-ended, so that responders can answer in their own words and at length. While voice interfaces offer greater flexibility, they also give the user little guidance on what can be said. Let’s say a shopper on an e-commerce site wants to see close-up pictures of a laptop they wish to purchase. How does the shopper know whether they can ask the app to zoom in on an image? To avoid this problem, the interface has to make clear what can and cannot be said, for example through a reference list of supported phrases that users can access on each page.
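One way to picture the reference list idea is a small per-page table of supported phrases that the interface can read back on request. The sketch below is only illustrative: the page names, phrases, help triggers and the `respond` function are all assumptions made for this example, not part of any real voice platform.

```python
# Hypothetical per-page command reference; page names and phrases are illustrative.
PAGE_COMMANDS = {
    "product": {
        "zoom in": "Enlarge the current product image",
        "next image": "Show the next product photo",
        "add to cart": "Add this item to your cart",
    },
    "cart": {
        "checkout": "Start the purchase flow",
        "remove item": "Remove an item from your cart",
    },
}

# Assumed phrases that trigger the spoken reference list.
HELP_PHRASES = {"help", "what can i say"}


def respond(page: str, utterance: str) -> str:
    """Return a spoken reply for an utterance on the given page."""
    text = utterance.strip().lower().rstrip("?")
    commands = PAGE_COMMANDS.get(page, {})
    if text in HELP_PHRASES:
        # Read back every phrase supported on this page.
        options = ", ".join(f"'{phrase}'" for phrase in commands)
        return f"On this page you can say: {options}."
    if text in commands:
        return f"OK: {commands[text]}."
    # Unknown phrase: point the user at the reference list instead of failing silently.
    return "Sorry, I can't do that here. Say 'help' to hear what I can do."
```

With this table, a shopper on the product page who asks “What can I say?” hears the three supported phrases, including “zoom in”, while the same phrase on the cart page is met with a pointer to the help command.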
2. Being in the dark while you wait
In a touch-screen interface, the same shopper can recognize the hourglass icon that tells them a command is underway. A VUI has no obvious equivalent, so the shopper may not know whether a request is still being processed. Furthermore, if the shopper is about to make a purchase but wants to backtrack and review their actions so far, how do they do it? A simple way out could be to announce a message like: “Please wait while your request is processed. If you would like to cancel the operation, please say ‘Cancel’.” Other, more elegant solutions could also be devised.
3. Misreading words that may have more than one meaning
VUIs are notorious for misinterpreting requests. If, say, a user intends to buy books and simply says “Show me books”, there is a high possibility that the VUI will lead them to a Wikipedia explanation of what “books” are instead of taking them to Amazon. This can frustrate users and put them off the VUI experience. Combining a VUI with touch can help: an on-screen prompt can show what was actually heard, so the user can confirm what they meant or backtrack if the system got it wrong.
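The confirm-or-backtrack prompt could look something like the sketch below. It assumes a hand-written table of phrases with more than one plausible reading; the intent labels, the `AMBIGUOUS` table and both function names are hypothetical and stand in for whatever a real speech pipeline would provide.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    intent: str        # hypothetical intent label used by the app
    description: str   # text shown on screen for this reading


# Hypothetical table of phrases with more than one plausible reading.
AMBIGUOUS = {
    "show me books": [
        Interpretation("shop_books", "Shop for books"),
        Interpretation("define_books", "Look up what 'books' means"),
    ],
}


def build_prompt(heard: str) -> dict:
    """Build an on-screen prompt that echoes what was heard and offers tappable choices."""
    readings = AMBIGUOUS.get(heard.strip().lower(), [])
    return {
        "heard": heard,                                  # show the user what the VUI heard
        "choices": [r.description for r in readings],    # one tappable option per reading
        "actions": ["Try again"],                        # lets the user backtrack
    }


def resolve(heard: str, choice_index: int) -> str:
    """Return the intent for the option the user tapped."""
    return AMBIGUOUS[heard.strip().lower()][choice_index].intent
```

Here “Show me books” produces a prompt with two tappable readings, and tapping the first resolves to the shopping intent rather than the encyclopedia lookup.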
4. No common minimum language yet
Perhaps the most pressing challenge in scaling up VUI lies in the fact that there is no standard vocabulary or design pattern. In touch-screen interfaces, universally accepted words like “Browse”, “Submit” and “Expand” exist. No such common minimum language exists in VUI yet. If a shopper reaches the last stage of buying their laptop and then says “Complete the purchase using my MasterCard”, the sale could close very efficiently. But this can only happen when there is a common vocabulary for as many commonly faced scenarios as possible and, ironically, such a vocabulary can only emerge once enough users adopt voice. Graphical user interface (GUI) or touch-screen interface design can help hasten this learning.
5. Failing to decode accents
Another potential problem with VUI is the inability to decode a range of accents. Even within the same state, two people may pronounce the word “opportunity” differently, causing the VUI to balk or generate a wrong response. Considerable UX research would be required to accommodate a broad spectrum of accents, backed by timely clarifying questions to the user when commands are not understood.
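The clarifying-question part can be sketched in a few lines. The example below does not tackle accent recognition itself, which happens in the speech model; it only assumes that a mis-heard command arrives as a slightly wrong transcript and uses fuzzy text matching from Python’s standard `difflib` module as a stand-in. The command list, the cutoff value and the `interpret` function are assumptions for illustration.

```python
import difflib

# Hypothetical commands the shopping app understands.
KNOWN_COMMANDS = ["add to cart", "checkout", "zoom in", "show reviews"]


def interpret(transcript: str, cutoff: float = 0.6) -> str:
    """Run an exact match, ask a clarifying question on a near miss, or ask for a repeat."""
    text = transcript.strip().lower()
    if text in KNOWN_COMMANDS:
        return f"Running '{text}'."
    # Find the single closest known command above the similarity cutoff.
    close = difflib.get_close_matches(text, KNOWN_COMMANDS, n=1, cutoff=cutoff)
    if close:
        return f"Did you mean '{close[0]}'?"
    return "Sorry, I didn't catch that. Could you say it again?"
```

A transcript such as “chek out” is close enough to “checkout” to trigger a “Did you mean” question instead of a wrong action, while something unrecognizable simply prompts the user to repeat the command.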
6. Public situations
Last but not least, users would find it hard to use a VUI in a meeting with a client or at a family gathering. In these cases, it’s best to acclimatize users to touch-screen interfaces for certain operations, especially where a VUI might not be feasible or ideal.
Even if some of the above challenges are fixed, we don’t see voice phasing out touch. If we’ve learnt something from past tech trends, it is that each type of interface tends to have unique benefits, behavioural or technical, that appeal to a certain type of user or situation and are hence not easily replaced. Voice and touch each have their pros and cons and will appeal to different users in different situations. So while many technophiles and marketers are sounding the clarion call for voice to phase out touch the way the computer phased out the typewriter, we envision a future where voice and touch co-exist seamlessly, much as computers and tablets do. This will allow the unique strengths of touch and voice to complement each other, overcome some of the above VUI challenges and create a more natural interaction for the user. At Redd, we’re excited about exploring ways to create this hybrid interface.