Date: 2018-08-04

Big data is changing our lives in a big way. Use of the Internet, smartphones and many other technologies is generating 2.5 quintillion bytes of data every day. There are ever-growing applications for this data, including those that could benefit your investment portfolio.

In this episode of The Bid, we speak to Rich Mathieson, portfolio manager for global equity strategies and a member of the Systematic Active Equity division, about how big data is transforming the way we think about investing.

Liz Koehler: The world is awash in data. With 2.5 quintillion bytes generated every day, IBM estimates that 90% of the data in the world today has actually been created in just the past two years. It's no surprise, then, that every day generates some new promise of how we can use this big data to change the world. In this episode of The Bid, we speak to an expert on how big data is transforming the way we think about investing. Rich Mathieson is a portfolio manager for global equity strategies, and a member of the Systematic Active Equity division within BlackRock's Active Equities Group.

Rich, thanks so much for joining us today. Take us back just a few years ago: what was investing like before the worldwide commercialization of the internet, before smart phones, before social media? What opportunities do you think might have been missed because they maybe lacked some of today's technology?

Rich Mathieson: Hi Liz. I will take it back further than a few years. The Systematic Active Equity Team at BlackRock has been combining data and technology with traditional investment insight for more than 30 years. And I think really through a lot of that period, the big constraint we had on generating alpha in client portfolios was data. Today that is no longer the case. Think of something you would want to measure electronically and the chances are somebody in the world is already doing that. So today the constraint on alpha isn't necessarily data availability, it's the ideas that you have and how to use that data to forecast outcomes in financial markets.

Liz Koehler: That's great. It's fascinating, I was just reading an article this morning that was talking about how farmers themselves in Australia are even using big data to change their industry and to improve their own crops. So it's amazing how prevalent it is today. So as we talk about today, what do we mean - we hear it all over the news - what do we really mean by "big data" and how new is it really?

Rich Mathieson: Yes. I think we first started thinking about the idea back in 2010 (at the time we called it unstructured data or alternative data), and that was really when we first started to realize that there was this growing amount of information available to investors that didn't come in traditional, prepackaged databases of rows and columns of numeric information. It required a lot of work to turn it into useful information, but we felt that if we could do that we would have a real edge in terms of investing.

Liz Koehler: That's great. And speaking of all of that data, last year in 2017, your team trialed over 70 new datasets. That's pretty impressive. What kinds of data does the Systematic Active Equity Team really look at, and then how does your team analyze those massive amounts of data to really gain those investment insights that you referenced?

Rich Mathieson: We have a general philosophy that there is no such thing as bad data. We want to look at as much data and information as possible to ultimately try and answer the traditional questions that any investor would ask of the securities or stocks that they invest in. Examples where we've had a lot of success would be interpreting textual information that used to simply be words on a hard copy of an analyst report or a broker note or a company earnings call transcript. Today that information is electronic. We can capture it, and we can use it to measure how well securities and companies are doing. Then there is information from social media and information from internet search. Think about the way we all use the internet today: if you're going to spend a lot of money on something, chances are you start doing your research online before you engage in the transaction. There is also geolocation data that helps us analyze aggregate consumer behavior in terms of actual physical location, in and out of retail locations such as shops and stores. These are just some of the many examples of how we're using this alternative data to, again, answer very traditional investment questions.

Liz Koehler: There has been a lot of publicity around tech firms recently that have access to personal data of their customers. How does your team think about that in the data that you touch?

Rich Mathieson: Yeah. It's a great question and a very, very topical one. I think the big differentiation to highlight is what we are interested in compared to what a lot of technology firms might be interested in. We aren't interested in any way, shape or form in anyone's personal information; it doesn't help us make better forecasts for the companies and securities that we invest in. What we are really interested in is aggregate consumer behavior, or aggregate behavior of individuals, and how that behavior maps on to companies' prospects. So we're not interested in individuals, we are interested in how individuals as a group are behaving and what that means for companies. When we're looking to bring some of this alternative data into the building, we make sure very clearly from a legal and compliance standpoint that at no point in time will that include any personally identifiable information.

Liz Koehler: Makes sense. It's a whole lot more about the trends and the aggregated data that the team is looking to glean insights from. So Rich, the asset management industry as a whole is embracing Big Data, that's clear, to make investment decisions. But it seems that some of these managers could easily buy some of this data or the technologies in order to analyze it off the shelf. Is that really all that's required?

Rich Mathieson: No, absolutely not. And I think you hit the nail on the head earlier: SAE researchers last year trialed around 70 different datasets. Many of those will be publicly available; anybody could get hold of them, anybody could buy them. But it's very, very difficult for a lot of asset managers to look at 70 datasets. One of the things we've learned over that ten-year period in which we've been looking at this type of information is that bringing as much data as possible to bear on the same investment question is very, very important. An example would be whether next quarter sales are likely to be better or worse than expected for a given company. And only firms with an ability to leverage a technology platform, and that can accelerate as much data as possible over that platform, will be able to answer those types of questions successfully. So size and scale are important in this game. The second point I would make is that this data, even when you acquire it, often from a third-party vendor, is messy, noisy and unstructured. A lot of work is required to refine and curate that data and ultimately map it onto an easily tradeable security for you to actually build that information into a client portfolio. It's not easy. And we have many years of skill and experience, and the talent required, to make that transformation.

Liz Koehler: Rich, the Systematic Active Equity Team has been analyzing Big Data to enhance its investment outcomes with technologies like machine learning for almost a decade. What are some of the necessary ingredients that you think asset managers need to ensure they are actually using this Big Data most effectively?

Rich Mathieson: The first, which I highlighted earlier, is that size and scale are important, along with an ability to bring as much differentiated data to bear on the same investment question as possible. Take the example of forecasting whether next quarter sales for a company are going to surprise to the upside or downside. There is a tradeoff between the forecasting horizon of the information and the accuracy that you're likely to get. So for example, I talked earlier about internet search giving you a very nice early warning of an intention across large groups of consumers towards particular brands and products. The problem there is that it gives you just an intention rather than getting you close to hard transaction activity. If you go to the other end and look at things like geolocation, or aggregated transaction data from bank statements and credit card statements, you're getting closer and closer to hard transaction activity and ultimately book sales. Any one of these datasets might not necessarily give you the right answer, but when you bring them all together and they corroborate one another, that's when you can get something clearly powerful in terms of forecasting ability and an ability to accurately get ahead of improving company fundamentals. The second point I would note is that when you bring in a new dataset and build an initial model, it's very rarely the best possible model that you can build. What we've found is that the best results come from years of innovation, of layering incremental innovation on top of the same insight or same idea as new data, information and techniques for looking at the data become available. One of the best examples I think we have of that is the way that abilities in natural language processing have evolved over the last eight to ten years. The original algorithms that we ran to build models that essentially enabled us to read text relied on very rigid, preprogrammed dictionaries of words.
For example, words like growth, exciting, opportunity or threat, deterioration, competition: these would be words that the investment team would select, and then the program or algorithm would look for those words within the text of a broker report or a company earnings call or regulatory filing. The second iteration of that insight would then start to look at different features of the text. So, is the company using lots of numerical data? We tend to find that good companies talk a lot about numbers. We would also compare the sentiment in the text across different sources, for example across different parts of the call, such as the Q&A section, when management teams are less likely to be reading from prepared remarks. We found that particularly useful. And then, bringing it up to date, the most recent innovation in that insight actually brings in the concept of machine learning and combines that with natural language processing, where we've built an algorithm that essentially learns what the important words are by analyzing the relationship of words versus stock returns. Rather than individuals preprogramming the words to look for, the algorithm is actually learning for itself, and it's doing that at an individual security level and in a very adaptive and dynamic way. So those are just two examples of lessons that we've learned: continual innovation, and bringing as much data as possible to answer a traditional investment question.
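
As a rough illustration of the first two iterations described above, a dictionary-based text scorer might look something like the sketch below. The word lists reuse the examples from the conversation; the function names, the scoring formula and the digit-counting heuristic are assumptions made for illustration, not a description of the team's actual models.

```python
import re

# Hypothetical word lists, seeded with the example words from the interview.
POSITIVE = {"growth", "exciting", "opportunity"}
NEGATIVE = {"threat", "deterioration", "competition"}


def dictionary_sentiment(text):
    """Score text as (positive - negative) / matched words, in [-1, 1].

    Returns 0.0 when no dictionary word appears in the text.
    """
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


def numeric_density(text):
    """Fraction of whitespace-separated tokens that contain a digit.

    A crude stand-in for "does the company talk a lot about numbers?".
    """
    tokens = text.split()
    if not tokens:
        return 0.0
    with_digits = sum(1 for t in tokens if any(c.isdigit() for c in t))
    return with_digits / len(tokens)
```

Both functions can be applied separately to, say, the prepared remarks and the Q&A section of a call transcript, so the two scores can be compared. The machine learning iteration mentioned above would replace the fixed word lists with weights learned from data, which this sketch does not attempt.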

Liz Koehler: Wow, that's fascinating work. On top of it, I get to tell my husband tonight that all of my online shopping is not a bad thing, I just am contributing to the important Big Data cause here. But no, thank you, those examples are really great. Broadening this out to our listeners, how might investors really see this come to life in their own investments?

Rich Mathieson: Yeah. So I think whilst the ideas and the data we've been able to analyze to model those ideas have changed a lot over the last decade, the way that we build those ideas into client portfolios has remained very, very consistent throughout our 30-year history. The process starts with a traditional investment question: how am I going to forecast whether the stock will beat expectations in terms of next quarter earnings or sales, or whether this stock is going to see an improvement in expectations for future earnings over the next six months, or a change in profitability over the next 12 to 18 months? These are the same types of investment questions that any investor would be interested in knowing the answer to for a given security. But what we then do is try and bring as much data as possible to bear to answer those questions, and we want to measure the exposure of pretty much every stock in an underlying investable universe to that data, to that information, with a view to maximizing the breadth of the opportunity set we can get exposure to in portfolios. And then, as we go from that measurement of exposure to the idea, we build as many of those views as possible into very, very diversified portfolios, where essentially we're going to be long or overweight all of the stocks that we think have positive exposure to the idea, and short or underweight all the stocks that we think have negative exposure to the idea, as measured by the stock's exposure to the underlying data that we have identified as enabling us to model that idea. So what you end up with is a very broad portfolio of assets. We tend to hold large numbers of securities, and we control risk very, very tightly, so you are diversifying away the element of that stock's risk that isn't explained by its exposure to your investment idea and getting a nice, clean, pure exposure to that information set, to that investment idea, in the portfolio.
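
To make the long/short idea above concrete, here is a minimal sketch of turning per-stock exposure scores into portfolio weights. The function name, the tickers and the particular scaling rule (demean the scores, then normalize so absolute weights sum to 1) are assumptions chosen for simplicity; the interview does not describe the actual construction or risk-control method.

```python
def long_short_weights(exposures):
    """Map {ticker: exposure score} to dollar-neutral weights.

    Stocks scoring above the cross-sectional average get positive (long)
    weights, those below get negative (short) weights. Weights sum to zero
    and their absolute values sum to 1.
    """
    if len(exposures) < 2:
        raise ValueError("need at least two securities")
    mean = sum(exposures.values()) / len(exposures)
    demeaned = {ticker: score - mean for ticker, score in exposures.items()}
    gross = sum(abs(v) for v in demeaned.values())
    if gross == 0:
        # All scores identical: no view, so hold nothing.
        return {ticker: 0.0 for ticker in exposures}
    return {ticker: v / gross for ticker, v in demeaned.items()}


# Hypothetical scores for three made-up tickers.
weights = long_short_weights({"AAA": 2.0, "BBB": 0.0, "CCC": -2.0})
```

With many securities, each weight becomes small, which is one simple way to see how holding a broad set of names dilutes any single stock's idiosyncratic risk.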

Liz Koehler: It seems that all over the media today, you hear about machines being poised to take over the world, and in this particular case, even investing. Is the human touch still instrumental in all of this?

Rich Mathieson: Yeah. Very much so, and I think there are a couple of key elements there. First is that certainly the robots aren't in complete control yet. For any algorithm we deploy, for any piece of data that we bring in and analyze, there is still a very, very large interaction between the investment team, human beings, and the algorithm and underlying data. A lot of the algorithms that we have used, for example in the machine learning space, weren't originally designed for analyzing time series of financial information with a view to building an investment model. And quite often, we have to bring a lot of investment insight that we have built up over 30 years of primary research into what matters for stock returns to bear on defining and refining the model in order to enable it to think and behave like an investor. The second point I would note is that culture is very important: the idea of building a very open, collaborative culture where experts in the field of data science, individuals who have talent in knowing how to extract useful information from these very large, messy, unstructured datasets, are working in a very, very collaborative way with individuals who might not necessarily be data scientists but have years of experience in understanding what really matters for stock returns. And I think if an investment manager doesn't have that open, collaborative culture and isn't able to fully integrate these two elements into the investment process, then a lot of what we've been discussing today will struggle to become a reality.

Liz Koehler: Rich, thank you so much for joining us today; it really was a pleasure having you.

Rich Mathieson: Thank you. My pleasure to be here.

This post originally appeared on BlackRock