With the development of voice control devices, AMZN seems to have developed yet another platform with substantial leverage.

The main reason for this is that management stays relentlessly focused on the long term, building technologies and platforms that can be leveraged.

While voice-controlled digital assistants were already a feature of smartphones and tablets, Amazon (NASDAQ:AMZN) took this a whole lot further by putting a digital assistant (Alexa) in a separate device, the voice-controlled Echo, in 2015. This became a surprise hit.

These digital assistants combine a host of technologies like:

Automatic speech recognition.

Natural language understanding engines that enable a system to instantly recognize and respond to complex voice requests. This builds on recent breakthroughs in machine learning and deep neural networks.

Far-field voice recognition enables the system to function when the speaker is much farther from the microphones and the environment is much noisier. At the core is Amazon's seven-microphone array, which uses beamforming to identify the microphone closest to the voice, amplify that one, and suppress the others. The system even works while music is playing.

Text to speech is also driven by machine learning.
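To give a feel for why a microphone array helps in noisy rooms, here is a minimal sketch of delay-and-sum beamforming, the textbook form of the technique. It is not Amazon's implementation, and it differs from the "pick the closest microphone" description above in that it aligns and averages all seven channels. The tone standing in for a voice, the one-sample delay per microphone, the noise level and every name in the code are assumptions made purely for illustration.

```python
import math
import random

NUM_MICS = 7          # matches the seven-microphone array described above
PERIOD = 16           # samples per cycle of the stand-in "voice" tone (assumed)
NUM_SAMPLES = 512     # length of each simulated recording (assumed)


def voice(t):
    """Stand-in for speech: a pure tone, evaluated at sample index t."""
    return math.sin(2 * math.pi * t / PERIOD)


def simulate_array(rng):
    """Mic k hears the voice k samples late, plus its own independent noise."""
    return [
        [voice(i - k) + rng.uniform(-1.0, 1.0) for i in range(NUM_SAMPLES)]
        for k in range(NUM_MICS)
    ]


def delay_and_sum(channels, delays):
    """Shift each channel by its delay so the voice lines up, then average."""
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(usable)
    ]


def mean_squared_error(signal):
    """Average squared difference between a signal and the clean voice."""
    return sum((x - voice(i)) ** 2 for i, x in enumerate(signal)) / len(signal)


rng = random.Random(0)                      # fixed seed for repeatability
mics = simulate_array(rng)
steered = delay_and_sum(mics, list(range(NUM_MICS)))

err_single = mean_squared_error(mics[0])    # one microphone on its own
err_steered = mean_squared_error(steered)   # all seven, aligned and averaged
print(f"single mic error: {err_single:.3f}, beamformed error: {err_steered:.3f}")
```

Because the voice adds up coherently once the channels are aligned while the independent noise partly cancels, the averaged signal should sit closer to the clean tone than any single microphone does. A real device would also have to estimate the delays rather than being handed them.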

Together, this yields intelligent voice control systems. The first incarnation of these products was in the mobile phone, with Apple's (NASDAQ:AAPL) Siri and Google's (NASDAQ:GOOG) (NASDAQ:GOOGL) Assistant. Microsoft (NASDAQ:MSFT) came with Cortana, and Samsung (OTC:SSNLF) couldn't be left out, as it has just recently introduced Bixby on the Galaxy S8.

This isn't all, as there are independent players like Hound, which offers a digital assistant for iOS and Android, competing head on with Apple and Google on their home turf. Good luck with that, we're inclined to say.

Especially since it can't do certain things native assistants can, because it isn't connected to the phone's software and doesn't have access to your search history.

Also, one has to open the application first to be able to ask it questions (to get around the native assistants). But its developer has worked on it for nine years and claims to have certain advantages. Here is the Hound CEO Keyvan Mohajer (per The Verge):

He says the underlying technology behind Hound is built around a unique approach to natural language processing. When combined with advances in machine learning and other artificial technology techniques, Hound is able to do what Mohajer calls "speech-to-meaning." While other digital assistant software translates what you speak into text and tries to figure out what you said, Hound supposedly skips that step and deciphers your speech as it hears it.

The reviewer from The Verge claims it's faster and better than anything he has ever seen, and another publication put it at number two in terms of speech recognition accuracy, with 95%, behind Baidu's (NASDAQ:BIDU) 96% as the leader.

So perhaps there is hope for this system yet. If the claims are true, it could be quite an interesting takeover target. There were others, like the independent cross-platform assistant Viv, from the makers of Siri. It has already been acquired by Samsung, which now uses it in Bixby.

The mobile phone platform has handicapped Amazon, as its Fire phones and tablets never really caught on and didn't establish a mass market large enough for a platform. Putting Alexa as an app on other mobile platforms can only achieve so much, as it will always be at a handicap versus the native assistants.

So Amazon had to improvise, and it did brilliantly with Echo, taking the digital assistant out of the mobile phone and into the living room (or any room of the house).

Which one is the best?

There are several articles comparing the capabilities of the assistants. For instance, here is the NYT:

Apple was the strongest at productivity tasks like calendar appointments and email; Google was the best at travel and commute-related tasks. Alexa excelled at music, and Cortana was mediocre across the board

The speech recognition accuracy is assessed at Inc., and there are head-to-head comparisons for instance between Google's Home and Amazon's Alexa at Forbes and at CNET, or Siri versus Alexa at Tom's Guide. Other comparisons are at T3, Macworld and CNN.

The Tom's Guide review from October 2016 is notable as there seems to be a big improvement in the voice recognition of the Echo, beating Apple's Siri by a big margin. An earlier review on voice recognition from Inc.com in June 2016 had placed it last in this department.

Amazon and Microsoft are hampered by a lack of a home mobile phone platform (both have apps for the other platforms, but needless to say they operate at a disadvantage on these, as we described above with Hound).

Google might seem to have an advantage in terms of search engine integration, but others have access to the same engine just like you and me. But Google does have an advantage in knowing much more about you through your search history.

Amazon, of course, knows your shopping history and the stuff you like (both via your wish list and its predictive algorithms based on people with similar interests).

Here is a nice overview of what Google's Assistant can do, but one has to realize that this is the Assistant on the Pixel (and a few other phones). The one in Google Home has fewer capabilities.

The phone is just the beginning

Or in Alexa's case, the Echo is just the beginning. There is a simple reason for this. Voice is a more natural, faster interface compared to text. Most people can only type about 40 words a minute, but they can speak about 160 words a minute.

It's also fiddly to have to type into small keyboards like those on mobile phones (even if we continue to be amazed by the dexterity some people develop in this field). While voice has its own disadvantages, the advantages are obvious.
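To put that speed gap in concrete terms, here is a small back-of-the-envelope calculation using the two rates quoted above. The 20-word request length is a hypothetical figure chosen only for illustration.

```python
TYPING_WPM = 40      # typing speed quoted above
SPEAKING_WPM = 160   # speaking speed quoted above


def seconds_needed(words, words_per_minute):
    """Time in seconds to get a message of the given length across."""
    return words / words_per_minute * 60


message_words = 20   # hypothetical request length, not from the article
typed = seconds_needed(message_words, TYPING_WPM)
spoken = seconds_needed(message_words, SPEAKING_WPM)
speedup = typed / spoken

print(f"typed: {typed:.1f}s, spoken: {spoken:.1f}s, speedup: {speedup:.0f}x")
```

At these rates, speaking is four times as fast as typing regardless of message length, before counting any time lost to recognition errors or repeated commands.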

Amazon has multiple paths for expansion of Alexa to establish a dominant platform:

Skills

Devices

Open source

Skills are basically a combination of a website and an app; they add functionality for a specific domain (ordering pizza, streaming music from a particular service, etc.). Skills are being added at a dizzying pace. When the Echo came out (late 2014), it only had 20 skills. By the end of December 2016, these had grown to 5,200, and to 10,000 by the following February; today, more than 100 are added on a daily basis.

The open platform clearly has the potential to reinforce Amazon's first-mover advantage. Here is Rajeev Kaul, managing director and Accenture technology lead for travel in North America (from TechRepublic):

Because of the openness of the Amazon Alexa platform, and the ease of being able to create skills to do whatever you need to do, it makes Alexa that much more powerful. I need to have very advanced engineers and specialists that know how these other platforms work to get the same kind of value that I can get with a lower skill set and a lower cost with Alexa.

Smart home

Basically the Echo (along with the Echo Dot and Amazon Tap) is becoming a smart home and smart office IoT hub, and as such, it has a roughly two-year advantage on Google, its main rival in this space (although we expect others to follow). The Echo is considerably ahead on skills and partners.

Amazon is already partnering with smart home solutions from Nest, Ecobee, Honeywell, SmartThings, Wink, Insteon, Belkin WeMo, Philips Hue, LIFX, Big Ass Fans, IFTTT, Control4, Crestron and other devices via skills, whilst Google Home has Philips Hue, Nest, SmartThings, Belkin WeMo, Honeywell, and IFTTT.

But it doesn't stop with homes.

Smart hotels

Wynn (NASDAQ:WYNN) is already putting an Echo into 4,748 hotel rooms in Vegas, and there is tremendous interest in the industry. Per TechRepublic:

Having the Echo used by the hotel industry will result in two distinct benefits to Amazon. The first is that it will lead developers to create more travel-related skills for Alexa, which is an area currently lacking. The second benefit is that it will likely increase Echo sales for home use, once hotel guests experience Alexa and decide to buy it for their own household, Kaul said.

However, per the same source, other hotels are experimenting with Apple iPads. At first sight, that seems a fairly expensive solution with considerable theft risk.

Smart cars

Amazon is a follower here, beaten by Apple (CarPlay) and Google (Android Auto). However, it's catching up with deals to integrate Alexa with Ford (NYSE:F), Volkswagen (OTCPK:VLKAY), Hyundai (OTC:HYMLF) (which also has a deal with Google), and Volvo (OTCPK:VOLVY). The first two are really big league car manufacturers, which have left the competition in the dust, at least for now. What can it do? Here is the Motley Fool:

Inside the car, Alexa can perform navigation duties, read an audiobook, or create shopping lists on Amazon. Additionally, the driver can control smart-home functions, such as setting the thermostat or adjusting the lighting inside the house, from the car.

Here is one journalist's experience with Alexa in a Ford Fusion Energi (per Business Insider):

You can control the locks, check the vehicle's range (both electric and gas), remotely start and stop the vehicle, check the odometer, check the battery charge, and check the tire pressure. You can do all of this via a simple voice command to your at-home Amazon Echo. There were several times I parked the vehicle and forget whether or not I had locked the doors, but instead of leaving my apartment and running down the street to check the vehicle, I simply asked Alexa to handle it. I also just enjoyed the novelty of asking Alexa to check the range of my car from bed before heading out.

Alexa can also be retrofitted into older cars with the help of applications like Logitech's (NASDAQ:LOGI) ZeroTouch, which lets you hook your smartphone to Alexa in your car.

Smart office

While the office seems less of a conducive environment for voice-activated assistants, there are still lots of interesting opportunities, per The Guardian:

A firm called HipChat has already launched a "skill" that harnesses Echo as an always-on alert system that "shouts" when one of its sites goes down. By teaching Echo to notice a certain set of conditions that will result in a problem, they are now capable of reducing service issues more quickly than ever before. In other applications, Alexa could instantly alert brand owners to negative social media sentiment, or give retailers the ability to take advantage of a fluctuation in the market to make truly dynamic pricing strategies possible.

One might retort that these kinds of warnings can also appear on your computer screen, but that requires that somebody is watching all the time.

Enlisting others

Alexa's functionality can be built into other devices. In fact, Amazon is encouraging that. You can even build your own Alexa with a $30 Raspberry Pi; Amazon offers the code. Things might shift into a higher gear with the opening up of Amazon Web Services AI (AWS AI), consisting of (from Forbes):

Amazon Rekognition, a rich image analysis service that can identify various attributes of an image.

Amazon Polly, a service that accepts text or a string and returns an MP3 audio file containing the speech. With support for 47 different voices in 23 different languages, the service exposes rich cognitive speech capabilities.

Amazon Lex is the new service for natural language processing and automatic speech recognition. It is the same service that powers Alexa and Amazon Echo. The service converts text or voice to a set of actions that developers can parse to perform a set of actions.

This opens up a whole new part of AWS, potentially repeating its success. It's a land grab at first with little of the huge profitability of AWS, but once the platform is built out, monetization is likely to follow one way or another.

Apart from enlarging the platform, one other advantage should be immediately obvious, since many of these systems are driven by deep-learning AI capabilities. The more they are used, the better they become.

It is also opening up its microphone technology for third parties, per The Verge.

Limitations

Voice control isn't suited for every situation; it's good for the home or your car, less so for the office, especially the open offices of today.

It is difficult to make money from the skills, although that doesn't seem to have slowed their rather spectacular proliferation.

The skills aren't widely used, and this isn't so surprising as invoking a skill is complicated. Users have to remember the specific names and keywords necessary to activate each app. So it's not surprising that there is evidence that people don't use many skills. This is a problem that needs sorting out.

Privacy concerns

Identity verification

Privacy concerns are not easy to dismiss. The Echo works by being always on, that is, it's always listening. In fact, the Echo keeps some 60 seconds of audio in memory for pre-processing so it can react faster and more appropriately.

According to Amazon, that recording is stored locally and continuously overwritten, so the device basically always holds a 60-second record of what's going on in your house or your office. This is data that can be subpoenaed, although perhaps not with guaranteed success, according to TIME:

Amazon recently refused to turn over voice records from an Echo user to Arkansas police investigating a murder which took place last November, saying that such recordings should be protected under the First Amendment.

It doesn't look like the risks are substantial here as that data is constantly erased, but how secure are these devices to hacking?
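The "always listening, constantly erased" behavior described above is what a rolling (ring) buffer does. The sketch below is only an illustration of that general idea, not Amazon's firmware: the class name, the 16 kHz default sampling rate and the tiny 10-samples-per-second rate used in the demo are all assumptions.

```python
from collections import deque

SAMPLE_RATE_HZ = 16_000   # assumed sampling rate, not from the article
WINDOW_SECONDS = 60       # the roughly 60 seconds mentioned above


class RollingAudioBuffer:
    """Keeps only the most recent window of audio samples in memory."""

    def __init__(self, sample_rate=SAMPLE_RATE_HZ, seconds=WINDOW_SECONDS):
        self.sample_rate = sample_rate
        # A deque with maxlen silently drops the oldest item on overflow.
        self.samples = deque(maxlen=sample_rate * seconds)

    def write(self, chunk):
        """Append new samples; anything older than the window falls off."""
        self.samples.extend(chunk)

    def seconds_held(self):
        """How much audio, in seconds, is currently retained."""
        return len(self.samples) / self.sample_rate


# Demo with a tiny, hypothetical sample rate so it runs instantly:
# feed in 90 "seconds" of numbered samples, one second at a time.
buffer = RollingAudioBuffer(sample_rate=10, seconds=60)
for second in range(90):
    buffer.write(range(second * 10, (second + 1) * 10))

print(buffer.seconds_held(), buffer.samples[0], buffer.samples[-1])
```

After 90 seconds of input, only the last 60 seconds remain and the first 30 seconds are gone for good, which is why such a buffer on its own leaves little to subpoena. What is sent to the cloud after a wake word is a separate matter.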

Identity verification

And then there is the problem of voice recognition. How does the Echo know to whom it is talking, given that there are no passwords in the voice-driven devices (although one can set a four-digit verification code for sensitive stuff like credit card use in the Echo)?

This problem will become more serious the more powerful these devices become. Needless to say, you don't want just anyone placing stock orders or making bank wires via your Echo, or your kid inadvertently ordering stuff.

Apple seems to be covered here through the acquisition of a couple of Israeli face recognition AI companies (PrimeSense, LinX). Here is Forbes:

Apple is just about cornering the market with far-field and near-field [2] facial recognition AI technology... However I think the surprising use of this rather strong facial recognition technology in Voice First Echo-ish type of device will be the ultimate way for Apple to enter into the far-field Voice First category.

And it already has a near-field voice device in the form of the AirPods for more personal interactions and information, stuff you might not want to share with your environment.

Amazon has been working on a solution that can identify your voice, which is profile based (a "voice print"), according to TIME. Since this is a serious problem, we do expect others to follow (they probably already are).

Adding automatic voice print recognition has another benefit; it allows the Echo to be used near seamlessly by multiple people. Here Amazon could have another leg up on the competition.

Value

RBC analyst Mark Mahaney argues that by 2020, Alexa could rake in $10B in revenues for Amazon. This is split between device sales ($5B), more shopping ($5B) and an unspecified amount from having a platform with a 100M+ installed base.

Amazon could also make money from skills search placement and premium content skills revenue sharing in a way similar to the iOS platform.

Research firm Tractica estimates about 40 million homes will use a voice-activated digital assistant by 2021. While $10B is not that much of a figure for the likes of Amazon, we have to stress that it's very early days yet.

The underlying AI technologies are relatively young, and they are improving all the time. In fact, they are improving themselves, as much of it depends on deep-learning capabilities, and they are fed ever more data.

So it's difficult to imagine how big a new platform this will become, and what it will be capable of, but the fact that all the major tech companies (Microsoft, Apple, Google, Samsung, Facebook (NASDAQ:FB), Amazon, etc.) have embarked on this suggests they think it's worth their while.

It's also way too early to call a winner, but one thing seems fairly certain. Amazon has a first-mover advantage and that could become pretty relevant over time.

