In rare moments of greatness, when a person or company does the job well enough, the result is a change in expectations. Consumers become accustomed to a heightened level of service. They expect it by default. When heightened expectations are briefly unmet, users sometimes lash out in disillusionment, forgetting how far things have come. For example, you suffer through reading an imperfect entry on Wikipedia or Seeking Alpha, and ask why the site can't do better, forgetting that the site has made incredible improvements to the nature of free information on the Internet.
Expectations for voice recognition have been low for several years. We've all had underwhelming first dates with voice recognition on our dumbphones and GPS devices. Those experiences formed a set of low expectations that reinforced themselves through what people chose to build. If developers and innovators don't see technology moving toward voice recognition, they're not going to try to push it in that direction. But with momentum building, they're going to hop on board, and expectations create reality.
There is a new era of expectations for voice recognition, and a new reality is soon to follow. Frank M. Fazio of Brooklyn is suing Apple (AAPL). It is a class action based on unmet high expectations for his iPhone's voice recognition provider, Siri. It appears he is disappointed that Siri cannot rival a human's accuracy in making appointments, finding restaurants, learning guitar chords, or mastering the tying of a tie.
Siri has already proven to be a step forward in voice recognition. Whereas first-generation implementations were largely superficial, Siri offers fundamental improvements in the way you navigate your mobile lifestyle. But Siri is young, and she is not perfect: Mr. Fazio's lawsuit marks the shifting expectation that voice recognition should be perfect. You might think it's just one man's opinion, but watch the media make a big deal about it, and understand that the lawsuit punctuates the early realization of a fantasy.
I have been building an investment thesis exploring opportunities for gaining exposure to voice recognition. The most direct method is purchasing the leader in this space, Nuance (NUAN). However, Nuance buyers have to pay a high price for expected growth. I have instead looked for companies that could be hurt by voice recognition. So far my most compelling target is Sirius XM (SIRI), followed by Groupon (GRPN).
My thesis begins with the idea that technology in general is under-utilized. I have arrived at this idea from observing the use of technology by older generations and younger generations. We interact with technology primarily through computers, which are controlled through a visual, textual, literal interface. I have noticed a tendency for activity to be resisted whenever it involves a literal interface. For example, many of the courses I took in college consisted of professors reading from a textbook outline. I wondered why we needed the teachers when we had the textbook. I suspect it's because of the literal interface.
Something in human nature resists learning new skills when such skills involve textual thinking. For example, conversational Spanish is easier to learn than programming because programming is taught via textual thinking. The cognitive principle at stake here is ultimately one of abstraction and working memory. Procedural skills like navigating a web browser form barriers to use that are analogous to grammar and vocabulary in learning a language. These barriers tax the working memory. The simpler a medium of technology, the easier it is on our lazy brains, and the less likely we are to ignore its potential.
My friend works in a hospital. When orders are filled from the laboratory, a notification prints out on the nursing floors, along with a duplicate copy. The duplicate has no purpose whatsoever, and the hospital wastes ink and time throwing it away for every order. This has apparently gone on for years. Nobody has bothered to fix it, because fixing it requires more than a verbal consensus that it should be fixed. It requires technological competency.
The cloud transforms the processing space available for deployments of artificial intelligence. This extra space allows devices to stream prepackaged solutions, which previously would have taken up too much room. That means you don't have to learn skills. The computer has already memorized the solution; you just need to tell it very simply what to do.
Westerners are talkative people, so we have little trouble picking up verbal communication skills for our devices. As we communicate with computers, they grow smarter, updating solutions so that others can benefit. This cloud dynamic of artificial intelligence exists already; however, it will be newly leveraged when voice recognition diminishes the cognitive barrier to entry of technological literacy.
Sirius XM and Groupon have appeared on my short radar because they currently differentiate themselves through a "simplifier" service. Their value relies on the fact that other services require consumers to have higher levels of technological literacy. If interfaces switch from literal to auditory through speech recognition, there will be a sweeping secular destruction of value for all simplifier services. To survive and thrive, a company must maintain additional offerings beyond simplification.
Groupon has lost 17% since my critique of its scale on February 21, while Sirius XM is being carried by takeover rumors. I do not see a compelling short of either at this time. However, it is crucial to pay attention to voice recognition as it alters the landscape of competition for these and other simplifying companies. I hope this article inspires further discussion and additional ideas for identifying simplifiers.
The smart money in technology understands that voice recognition is not quite here yet. But the smartest money understands that when it gets here, it is going to arrive with a vengeance. The shift in expectations, as formalized by this lawsuit's play in the media, marks the second coming of voice recognition. Five years from now, you will be angry at your computer for making you type anything.