Date: 2018-04-18

If Tesla can do self-driving without lidar, it will have a long head start in a potentially multi-trillion dollar industry.

The number one criticism of Tesla’s (TSLA) self-driving car strategy is that the sensor suite equipped on all its production cars does not include lidar. Some critics believe that full self-driving requires the high-precision spatial data that lidar provides. In this article, I’ll discuss exciting new computer vision research that lends support to Tesla’s strategy.

Tesla’s gambit

Some quick context first. Tesla can’t equip its production cars with lidar because the hardware remains prohibitively expensive. Low-range, low-resolution lidar might be affordable enough, but what I’ll call “autonomy-grade” lidar, the high-range, high-resolution units used by Waymo (GOOG, GOOGL) and others, can cost as much as $75,000. The price of lidar will quite plausibly descend along a Moore’s law-like trajectory until it’s affordable enough. However, that could take years.
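To get a feel for what a Moore’s law-like price decline implies, here is a rough back-of-the-envelope sketch in Python. Only the $75,000 starting price comes from the figures above. The $1,000 "affordable" target and the two-year halving period are purely illustrative assumptions, and the function name is hypothetical.

```python
import math


def years_until_affordable(start_price, target_price, halving_years):
    """Years for a price that halves every `halving_years` to reach `target_price`."""
    if target_price >= start_price:
        return 0.0
    # Number of halvings needed is log2(start / target).
    return halving_years * math.log2(start_price / target_price)


# $75,000 is the figure quoted in the article.
# ASSUMPTIONS: the $1,000 target and the 2-year halving period are invented
# here for illustration and are not estimates from the article.
years = years_until_affordable(75_000, 1_000, 2.0)
print(f"{years:.1f} years")
```

Under these particular assumptions the answer comes out at roughly 12 years. A shorter halving period or a higher price target shortens that considerably, so the output is only as good as the inputs.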

If Tesla can develop full self-driving software before affordable lidar is available, it can beat everyone to the market with self-driving cars. This is an exciting possibility for Tesla shareholders, given that the global self-driving car industry could eventually generate trillions in revenue. Lidar is the anchor holding other companies back. If it turns out lidar isn’t necessary, Tesla will have a long head start.

Lidar vs. cameras

If Tesla can use cameras and radar to see the world at a level of spatial precision comparable to lidar, then lidar isn’t necessary. There is no need to wait around for affordable lidar, and Tesla will have its head start. So, how do these sensors stack up?

Previously, I wrote about a paper that found a car equipped with cameras can localize objects in its environment to within 10 centimetres (3.9 inches) of accuracy — that’s about the length of a credit card, plus the width of a finger. (Lidar, by comparison, is accurate to within 1.5 centimetres or 0.6 inches, the width of a finger.) Intuitively, it seems like that is enough accuracy for the purposes of the driving task, given a buffer zone of at least 20 or 30 centimetres between the car and other vehicles and pedestrians. In that article, I also looked at evidence to suggest humans struggle to achieve that level of accuracy even when they try.
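The intuition about buffer zones can be checked with simple arithmetic. The sketch below, in Python, treats each accuracy figure as a hard worst-case error bound, which is a simplifying assumption (real sensor error is statistical), and asks how much true clearance is guaranteed when the car believes it has a 20 or 30 centimetre buffer. The function name is hypothetical.

```python
def worst_case_gap_cm(buffer_cm, error_cm):
    """Smallest true gap consistent with a measured buffer, given +/- error_cm."""
    return buffer_cm - error_cm


# Accuracy figures from the article: cameras ~10 cm, lidar ~1.5 cm.
# ASSUMPTION: these are treated as hard bounds, not statistical averages.
results = {}
for sensor, error_cm in [("camera", 10.0), ("lidar", 1.5)]:
    for buffer_cm in (20.0, 30.0):
        gap = worst_case_gap_cm(buffer_cm, error_cm)
        results[(sensor, buffer_cm)] = gap
        print(f"{sensor}: buffer {buffer_cm:.0f} cm -> worst-case gap {gap:.1f} cm")
```

With a 20 centimetre buffer, the camera’s worst-case clearance is 10 centimetres and lidar’s is 18.5 centimetres. In both cases the gap stays positive, which is the point of the buffer-zone argument.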

Now, a new paper from researchers at Cambridge and Oxford shows the rich, accurate representations that can be extracted from camera input using a deep neural network. The researchers used images from a single camera to do three tasks simultaneously: 1) estimate depth, 2) detect and classify individual objects like cars and pedestrians, and 3) semantically segment the world into different regions, such as road, sidewalk, median, cars, people, trees, sky, and so on.

First, here’s a visualization of what a car can see using six lidar units:



Now, here’s what the deep neural network was able to perceive using input from a single camera:

The deep neural network has a robust and detailed awareness of the car’s environment, including depth (bottom left), obstacle detection (bottom right), and semantic segmentation (top left).
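To make the segmentation output concrete, it can be pictured as a grid with one class label per pixel. The Python sketch below is not the researchers’ code. The tiny label grid, the class names, and the `driveable_mask` helper are all invented for illustration. It shows one thing downstream software might do with such a grid: mark which cells are road.

```python
# Hypothetical class labels; a real network would define its own label set.
ROAD, SIDEWALK, CAR, PERSON = "road", "sidewalk", "car", "person"

# A tiny 3x4 "image" of per-pixel labels, invented for illustration.
segmentation = [
    [SIDEWALK, ROAD, ROAD, SIDEWALK],
    [SIDEWALK, ROAD, CAR, SIDEWALK],
    [PERSON, ROAD, ROAD, SIDEWALK],
]


def driveable_mask(labels, driveable=frozenset({ROAD})):
    """Return a grid of booleans: True where the pixel's class is driveable."""
    return [[label in driveable for label in row] for row in labels]


mask = driveable_mask(segmentation)

# Fraction of cells classified as driveable (booleans sum as 0/1).
driveable_fraction = sum(map(sum, mask)) / (len(mask) * len(mask[0]))
print(mask[1], round(driveable_fraction, 2))
```

Nothing in this mask depends on painted lane lines. Only the class assigned to each cell matters, and the cell occupied by another car is excluded from the driveable area.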

Semantic segmentation, for instance, lets a car know what’s a driveable roadway and what’s not. In my understanding, this is how Tesla’s Enhanced Autopilot is able (since the 2018.10.4 update) to navigate improvised lanes at construction zones, without the use of lane lines:



Watching the video from the Cambridge and Oxford researchers, it seems to me that camera images provide plenty of rich information, including depth information. It’s just a matter of extracting it. The principal virtue of lidar appears to be that it doesn’t require depth information to be extracted. But if software can extract depth information from camera images, what is lidar needed for?

Conclusion

As I write in my article “Tesla’s Computer Vision Master Plan”, Tesla is supplementing cameras with radar and probably with HD maps. It is also leveraging its fleet of production cars to train its deep neural networks with an unprecedented, unparalleled amount of driving data. It’s designing a custom microprocessor that could allow it to run deeper neural networks. Putting these pieces together forms a cohesive computer vision strategy.

I believe that lidar is over-emphasized and that these other pieces are under-emphasized. The breakthrough that is enabling the development of self-driving cars isn’t lidar; it’s deep neural networks. Not hardware, but software. Tesla is therefore shipping its cars with the best affordable hardware available, and using software to do the rest.

Tesla demonstrates full self-driving using just cameras, no lidar.

Disclosure: I am/we are long TSLA.

I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.