Artificial intelligence (AI) has gained unprecedented attention within the hedge fund community in recent years. However, AI is not some new kid on the block. In fact, its roots go as far back as the 1940s, when Warren McCulloch and Walter Pitts first introduced the neural network. Today, it finds widespread use in applications ranging from image and speech recognition and natural language processing to robotics and more. Similarly, the use of AI techniques for trading or investment is not a new idea either, but earlier attempts were not successful in any big way. So why is everyone so excited about using AI for investments again? From my own lens, I attribute this to a confluence of technology advances and changing market dynamics.
Rapid Technology Advances
Faster Processing Speed
Our technology has improved by leaps and bounds over the years. My first encounter with a PC was an 8-bit Apple machine with a monochrome CRT monitor running on MS-DOS. Then came machines with more powerful Intel processors: the 16-bit 286 series, quickly followed by the 32-bit 386 and 486 series. And when I was in university, I was using a machine running on a 32-bit Intel Pentium processor, which was hundreds of times faster than the 8-bit Apple.
I remember coding my first neural network from scratch in the C language for my dissertation. Back then, I wanted the computer to identify the person who was speaking. In academic terms, we call that automatic speaker identification. And how is that done? Without getting into convoluted details, I literally fed the neural network thousands of processed speech samples from each speaker repeatedly to train it. You can draw a parallel with a school kid studying for his spelling test. To commit the words to memory, he keeps writing them over and over again until he gets them right.
So how long did it take to train a neural network then? Awfully long. But I guess that was partly due to my inept programming skills at the time. I would set the machine to train in the morning and the results would only be out around dinner. In contrast, we can now use an open source AI Python package to train a neural network on our PC with a few lines of code in a matter of minutes.
CPU Processing Speed in Million Instructions Per Second (MIPS)
Larger Data Storage Capabilities
Improvements in data storage capabilities have been immense. From the 5.25-inch floppy disks prevalent in the 80s, with a capacity of only 360 kilobytes, to the virtually unlimited cloud storage of today, we have indeed come a long way. Even personal desktops today are equipped with state-of-the-art yet affordable hard drives ranging from a few hundred gigabytes to a few terabytes. A single terabyte is enough to store more than 30 years of 1-minute OHLC (Open High Low Close) price data across 5,000 stocks. And if we move up to the institutional level, where finer resolution data is captured, we can be talking in terms of petabytes. In fact, it should not take long before exabytes, zettabytes and yottabytes become the new norm for ordinary users.
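For readers who want to check the terabyte claim, here is a rough back-of-envelope sketch. The trading-day length, the number of trading days per year and the 8-byte price format are my own assumptions for illustration, and timestamps, volume and compression are ignored.

```python
# Back-of-envelope storage estimate for 1-minute OHLC bars.
# All constants are illustrative assumptions, not figures from the article.
MINUTES_PER_SESSION = 390   # assumed 6.5-hour trading day
SESSIONS_PER_YEAR = 252     # assumed trading days per year
BYTES_PER_BAR = 4 * 8       # four prices stored as 8-byte floats

def ohlc_storage_bytes(years: int, stocks: int) -> int:
    """Raw bytes needed for uncompressed OHLC bars (no timestamps or volume)."""
    bars_per_stock = MINUTES_PER_SESSION * SESSIONS_PER_YEAR * years
    return bars_per_stock * stocks * BYTES_PER_BAR

total = ohlc_storage_bytes(years=30, stocks=5_000)
print(f"{total / 1e12:.2f} TB")  # prints "0.47 TB"
```

Under these assumptions the raw prices take up roughly half a terabyte, which leaves room for timestamps and volume within the single terabyte mentioned above.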
Exponential Growth of Data
The digitization of information and the proliferation of the internet brought about an explosive growth of data. Your mobile phones, PCs, sensors and many other digital devices have become critical points for data collection. According to IDC’s 2018 report, the world held 33 zettabytes of data in 2018, and that is projected to grow rapidly to 175 zettabytes by 2025. This includes data crawled from the web, transactions on commercial activities, survey results, social media postings and much more. And it can come in unstructured formats such as HTML, video clips, sound files and images. These came to be known as alternative data.
As companies look to monetize this alternative data, more will become readily available and some might be useful for investment purposes. Thus, we are no longer constrained to traditional structured market data such as prices, order flows, and economic and balance sheet numbers. There is now a plethora of choices on the plate, as long as you can pay.
A Changing Market
Deteriorating performance and alpha
Besides technology enablers, there is also an increasing urgency for the asset management industry to look into harnessing AI’s potential. In particular, hedge fund managers face their most challenging period yet. Traditional investment alphas are diminishing as competition intensifies. In 1997, there were about 4,500 hedge funds managing a total of less than USD 300 billion worth of assets [source: The Globalist]. Today, there are more than 10,000 hedge funds holding aggregate assets of about USD 3 trillion [source: Barclays Hedge].
Central banks worldwide added to hedge funds’ long-term woes when they intervened in the 2008 financial crisis. They launched massive stimulus programs unprecedented in scale. That knocked the market off its orbit, rewrote rules and changed behaviors. For a good 10 years since, volatility has moved dramatically lower and stock markets have seemed invincible. Between 2009 and 2018, MSCI World delivered an impressive annualized return of 12%. The HFRI Fund Weighted Index (a hedge fund proxy), on the other hand, paled in comparison with a measly 5% annualized return. Today, there is no lack of articles lambasting hedge funds for their dismal performance given the fees they charge.
Hedge fund and market correlation is increasing
Low correlation with conventional markets, along with the critical diversification benefits it provides, is a key selling point for hedge funds. After all, hedge funds aren’t called alternative investments for nothing. But these features are fading away as well. Since 1990, the correlation between hedge fund and stock market returns has been creeping up steadily. The 24-month correlation went from below 0.5 in the 90s to above 0.9 today.
Unlike earlier downturns, where hedge funds provided significant outperformance, they have provided less value in recent times. Below is a table that shows how the various hedge fund indices fared during periods where MSCI World lost more than 20%. Hedge funds in aggregate profited or managed to come in flat during the earlier downturns of the 90s and early 2000s. But they lost a significant amount, albeit still less than the markets, during the 2008 bear market.
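For anyone who wants to reproduce a 24-month correlation on their own data, here is a minimal sketch of one way to compute it. The return series are randomly generated stand-ins rather than real HFRI or MSCI data, and the function name and the "hypothetical fund" construction are my own.

```python
import numpy as np

def rolling_correlation(a, b, window=24):
    """Pearson correlation of a and b over each trailing window of observations."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("series must have the same length")
    return np.array([
        np.corrcoef(a[i - window:i], b[i - window:i])[0, 1]
        for i in range(window, len(a) + 1)
    ])

# Synthetic monthly returns, for illustration only (not real index data).
rng = np.random.default_rng(0)
market = rng.normal(0.005, 0.04, size=120)
# Hypothetical fund: part market exposure, part independent noise.
fund = 0.5 * market + rng.normal(0.0, 0.01, size=120)

corr = rolling_correlation(fund, market)
print(len(corr), round(float(corr.min()), 2), round(float(corr.max()), 2))
```

With ten years of monthly returns and a 24-month window, this yields 97 correlation readings, one for each month from the 24th onward.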
HFRI FWI (Fund Weighted Index), HFRI FOF (Fund of Funds), HFRI EHI (Equity Hedge Index), MSCI WL (World) [Source: Which Is A Better Investment – Hedge Funds Vs Equity Market]
Hedge funds are cutting old strategies
Many investment strategies simply do not work as well as they used to. Trend following, a long-time staple strategy for established institutions such as Winton Capital and a top performer in 2008, has since fallen out of favor. In recent years, trend followers have been whipsawed by sudden bouts of drastic market sell-offs followed by equally ferocious recoveries. With rapid information flow and a fast-reacting market, slower trend strategies seem unable to keep up. In 2018, Winton announced its decision to cut its exposure to trend following strategies to 25% [Source: Winton Capital]. And just recently, Renaissance Technologies followed suit, reducing its allocation to trend strategies in the Renaissance Institutional Diversified Alpha (RIDA) Fund.
Are these strategies permanently obsolete? Will they come back with a vengeance in the future? That is anybody’s guess. I do believe the decisions are, in part, business motivated rather than purely investment driven. In any case, only time will tell if the decisions are right. But in the meantime, there is a fire to fight and hedge funds are turning to AI and alternative data for new sources of alpha.
There are many different methods to build machine intelligence. But the one most widely talked about, and perhaps the closest to modeling our brain, is the neural network. And within neural networks, there are different variants. The one I am looking at is the Multi-Layer Perceptron (MLP). It is what one might call a form of Deep Learning, a buzzword that did not exist in the 90s even though MLPs were already in use. In this framework, our brain is modeled as layers of interconnected neurons. See a generic diagram below.
The input layer comprises the observable data the user feeds into the network. The hidden layer is where the magic takes place, and the output layer delivers the results. This is a very versatile setup that can be used in different ways. For instance, you can feed market data into the network and get it to predict the price of a stock index at the next time step. Alternatively, you may only be interested in the direction the stock index is headed. Then an output with a binary value of 0 or 1, where 1 indicates upward and 0 otherwise, suffices. In both cases, all you need is a single output neuron.
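To make the three layers concrete, here is a minimal sketch of a single pass through such a network, assuming NumPy is available. The layer sizes, the tanh and sigmoid activations and the random, untrained weights are all my own choices for illustration, and the input numbers are made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One pass through input -> hidden -> single output neuron."""
    hidden = np.tanh(x @ W1 + b1)       # hidden layer activations
    return sigmoid(hidden @ W2 + b2)    # value between 0 and 1, read as P(index goes up)

rng = np.random.default_rng(42)
n_inputs, n_hidden = 5, 8               # sizes chosen arbitrarily for illustration
W1 = rng.normal(size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, 1))
b2 = np.zeros(1)

x = rng.normal(size=(3, n_inputs))      # three made-up observations of five market features
prob_up = forward(x, W1, b1, W2, b2)
direction = (prob_up > 0.5).astype(int) # 1 = up, 0 = otherwise
print(prob_up.shape, direction.ravel())
```

For the price-prediction variant, one would typically drop the sigmoid on the output neuron so that it can return any real number. Either way, the weights here are random, so the outputs mean nothing until the network has been trained.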
Neural Network Supervised Learning
A neural network starts off like a fresh student who knows nothing. Before it is of any use, it has to learn. And when it comes to machine learning, there are various approaches. We can leave it to figure out groups of patterns and associations on its own (unsupervised) or we can teach it (supervised). For predictive purposes, we tend to rely on the supervised approach.
How do we teach a neural network? The concept is quite simple without delving into the math. We prepare a set of training data where we show the system the desired output for each input. We can feed the data one by one, batch by batch, or all in one go. The system will not get it right on the first try. But it will learn by adjusting the connecting weights of the neurons based on an error function. This error function measures collectively how far off its predictions are from the desired outcomes. The end goal is to reduce this error function to a minimum. Once the entire training set has been fed in, we have completed a single cycle of training, often called an epoch by AI practitioners. This learning process is then repeated for hundreds or thousands of epochs until the error function stabilizes at a minimum.
After training the neural network, it is time to put into practice what it has learned. To do that, we test it on data it has never seen before but for which we know the outputs. That gives us a proxy for its real-world performance. And if it meets the grade, we run the model forward with live data and subsequently deploy it into production.
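The training cycle described above can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions: one hidden layer, mean squared error as the error function, plain full-batch gradient descent, and a made-up data set where the "desired output" is 1 whenever the two inputs sum to more than zero.

```python
import numpy as np

def train_mlp(X, y, n_hidden=4, epochs=500, lr=0.5, seed=1):
    """Full-batch gradient descent on a one-hidden-layer network; returns loss per epoch."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):                         # one epoch = one pass over the training set
        h = np.tanh(X @ W1 + b1)                    # forward pass
        p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        losses.append(float(np.mean((p - y) ** 2))) # error function: mean squared error
        # Backward pass: gradient of the error with respect to each weight.
        d_out = 2 * (p - y) * p * (1 - p) / len(X)
        d_hid = (d_out @ W2.T) * (1 - h ** 2)
        W2 -= lr * h.T @ d_out                      # adjust the connecting weights
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_hid
        b1 -= lr * d_hid.sum(axis=0)
    return losses, (W1, b1, W2, b2)

# Made-up training data: desired output is 1 when the two inputs sum to more than zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, :1] + X[:, 1:] > 0).astype(float)

losses, weights = train_mlp(X, y)
print(losses[0], losses[-1])
```

Printing the first and last entries of the loss history shows how far the error has fallen over the 500 epochs. Real packages add many refinements (mini-batches, adaptive learning rates, early stopping) that are left out here.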
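As a small sketch of the testing step, the helper functions below hold back the most recent observations and score predictions against known outcomes. Splitting by time order rather than at random is my own assumption (it is a common choice for market data), and the function names and sample numbers are hypothetical.

```python
def chronological_split(samples, test_fraction=0.2):
    """Split time-ordered samples so the test set is strictly later than the training set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = int(len(samples) * test_fraction)
    cut = len(samples) - n_test
    return samples[:cut], samples[cut:]

def hit_rate(predicted, actual):
    """Fraction of predictions that match the known outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("length mismatch")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

days = list(range(10))                       # stand-in for ten time-ordered observations
train, test = chronological_split(days)
print(train, test)                           # [0, 1, 2, 3, 4, 5, 6, 7] [8, 9]
print(hit_rate([1, 0, 1, 1], [1, 1, 1, 0]))  # 0.5
```

The model is fitted on the first block only, and the hit rate on the held-back block serves as the proxy for real-world performance.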
How is a neural network different from conventional models?
Alright, a predictive neural network learns. But ultimately, what it solves is still a fitting problem. What makes it so different from, say, a multi-factor linear regression model? After all, hedge funds have been using such models to score and rank stocks. There are a few notable differences.
Neural networks derive and prescribe their own function
A key difference is that in conventional models, we prescribe the function used. So we limit at the outset the scope of how the inputs can be related to the outputs, if at all. For example, a linear multi-factor regression model assumes that the factors and the outputs have a linear association. This allows us to easily understand the model and explain the results. But if the relationship is non-linear, then such a model will not be adequate.
When it comes to markets and investments, we often can’t pin down a clear relation. Most of the time, we just make assumptions and use whatever is mathematically tractable for our purpose. But why do we need to box ourselves in to trying out only things that we can readily explain? I think everyone agrees that there is much that we do not know. So if you do not have a solution, why not let someone else take over?
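A quick way to see the inadequacy is to fit a straight line to a relationship that is not linear. In the sketch below, the data is a made-up quadratic with no noise, and the quadratic fit stands in for "a model with the right functional form". Neither comes from any real market data.

```python
import numpy as np

def r_squared(y, y_hat):
    """Share of the variance in y explained by the fitted values y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

x = np.linspace(-1.0, 1.0, 41)
y = x ** 2                                   # made-up non-linear relationship

linear_fit = np.polyval(np.polyfit(x, y, 1), x)
quadratic_fit = np.polyval(np.polyfit(x, y, 2), x)

r2_linear = r_squared(y, linear_fit)
r2_quadratic = r_squared(y, quadratic_fit)
print(round(r2_linear, 3), round(r2_quadratic, 3))
```

The straight line explains essentially none of the variation, even though the output is fully determined by the input, while the quadratic explains all of it. The catch in practice is that we rarely know the right form in advance.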
A neural network learns and finds on its own the associations between the inputs and outputs. It derives its own function embedded in the hidden layers through the training process. By setting appropriate activation functions for the neurons, the neural net can in aggregate solve complex non-linear problems. But it is often impractical to observe and make sense of this hidden layer. This is also why neural networks are deemed to be black boxes.
Neural networks map the inputs into a separate feature space
Data-driven predictive analysis involves generalizing a fit to the past. Mathematically, this is easier in a higher dimensional space. We know we can easily fit 2 points in a 2-dimensional space with a line. However, we can’t fit 3 points exactly unless they run along the same line or we use a nonlinear function. But if we introduce a 3rd dimension, then we can define a plane which captures all 3 points. And if there are more points, then a hyperplane will do the job. In short, the more dimensions or degrees of freedom you have, the easier it is to fit. This is what can be done in theory, but taking it to the extreme is not advisable, as it results in overfitting and an unusable model.
Traditional models work with a fixed dimension imposed by their function. For instance, a 2-factor linear regression model works in a 3-dimensional setting. A neural network has no such constraints, because it can map (linearly or non-linearly) the original inputs into a separate feature space in the hidden layers. For example, by setting up a hidden layer with more neurons than inputs, the neural network can expand the dimensions into a hyperspace to perform its task. But an excessive number of inputs, as is often the case in the real world, is not practical either. Besides the many dimension reduction techniques out there, a neural net can also be used to extract lower dimensional features that represent the original larger data set.
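The plane-through-three-points idea can be checked directly. In this sketch the three points are made up, and the plane is written as z = a + b*x + c*y, which is my own choice of parameterization.

```python
import numpy as np

# Three made-up points (x, y, z) that do not lie on one straight line.
points = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 3.0],
                   [0.0, 1.0, 2.0]])

# Plane z = a + b*x + c*y: three unknowns, three equations, so the fit is exact.
A = np.column_stack([np.ones(3), points[:, 0], points[:, 1]])
a, b, c = np.linalg.solve(A, points[:, 2])
residuals = A @ np.array([a, b, c]) - points[:, 2]
print(a, b, c, residuals)
```

With as many free parameters as data points, the residuals are zero whatever the z values happen to be, noise included. That is exactly why a perfect fit on its own says nothing about predictive value.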
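To illustrate why a richer feature space helps, here is a small sketch using the classic XOR pattern. Note the simplification: I add one extra feature by hand (the product of the two inputs), whereas a neural network would learn its own features in the hidden layer during training.

```python
import numpy as np

def least_squares_predictions(features, y):
    """Fit y = intercept + weights . features by least squares; return fitted values."""
    A = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])           # XOR: no straight line separates the classes

# Original 2-dimensional inputs: the best linear fit predicts 0.5 everywhere.
flat = least_squares_predictions(X, y)

# Add one hand-picked extra feature (x1 * x2) to lift the data into 3 dimensions.
lifted = np.column_stack([X, X[:, 0] * X[:, 1]])
exact = least_squares_predictions(lifted, y)
print(np.round(flat, 3), np.round(exact, 3))
```

In the original two dimensions the linear model can do no better than a constant guess, while in the lifted space the same linear method reproduces every target.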
Is Artificial Intelligence our holy grail answer to investing?
With such powerful techniques and a huge pool of untapped data, does it mean we can finally pin down that holy grail approach to investing? While I agree there is significant progress in AI, I don’t think we are anywhere near it, nor do I believe we can ever find it. But aren’t machines already besting the top humans in intellectual games like chess? Yes, but many things in life are more than just a game of chess.
The financial markets are far more complex than a game of chess
In 1997, IBM’s Deep Blue shook the world when it beat the world champion Garry Kasparov in a chess match. Then in 2016, Google’s AlphaGo decisively trounced Lee Sedol, a world-renowned professional player, in a more complicated game called Go. These are indeed important milestones worth celebrating. But the magnitude of a chess problem compared with the financial markets might be likened to a single star in the universe. Both games are well-structured problems within a finite space. Everything is about you and your opponent, and everything is laid bare except his thoughts. Every move is relevant and critical. And at any point in time, you have complete information about what has happened. You know the current position of the pieces and the sequence of your own and your opponent’s moves. On top of that, it is a turn-based game. Your turn, my turn. In between, no surprises.
The financial markets, on the other hand, are an enormous beast of chaos. There are possibly hundreds of millions of participants day in, day out. No one waits for you. New information comes in every instant. No one has all the pieces. No one fully understands how each piece shapes the markets. And everyone has their own interpretations and is capable of acting differently. Unlike a game of chess, which has stayed more or less the same over centuries, the financial markets are constantly evolving, expanding and rewriting their rules.
Short data history and unknown value
Your end product is only going to be as good as what you feed it. Artificial intelligence implementations such as deep learning require a huge amount of good quality data. Personally, I have yet to test out alternative data sets. But there are some foreseeable challenges. Alternative data, being new, may not have a long history. In addition, for such data to be deployed in actual systems, it needs to be complete, structured, accurate and delivered in a timely manner. Even if it fulfills these criteria, the question of its predictive value remains. Until it is tried and tested extensively, no one can lay claim to its efficacy. In fact, it is highly likely that both buyers and sellers of new alternative data sets have no idea what the data can really do for them.
Overfitting and Garbage
Put a powerful tool in the wrong hands and it will be subject to abuse. With more data, power and sophisticated algorithms at our fingertips, it can be tempting to just throw as much as we can into a neural net. We can practically whack something out with brute force without giving it much thought. After that, we can marvel at an awesome solution fitted to almost every single output in the past. And just when we think we hold the world in our hands, the perfect model fails miserably in the real world. Why? Because we just fitted everything, including noise. And imagine if more than 90% of the data is noise to start with. This can largely be avoided by adopting good practices in the design of the neural network and the preparation of the data, including the use of validation and test sets.
Another temptation is to feed the neural network all kinds of data permutations and combinations in the hope of finding something useful. In general, I have no issue with this approach. After all, what is wrong with making use of machines to assist us in thinking out of the box? But when no human post-analysis is involved, and we rely solely on empirical statistics, we can end up with models based on coincidental relations that are destined to flop. Any results may be transient at best. To be fair, I must confess I can’t rule out a relation at the outset. Maybe I am just biased or ignorant. But I am always wary and skeptical, for instance, when people start linking their market forecasts to movements of the celestial bodies.
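Here is a minimal sketch of fitting noise, using polynomials as a stand-in for an over-sized neural net. The data is made up (a weak linear signal plus heavy random noise), and alternating points between a training set and a hold-out set is a simplification of my own. With real time series, a chronological split is the more usual choice.

```python
import numpy as np

def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 40)
y = 0.5 * x + rng.normal(0.0, 0.5, size=x.size)   # weak made-up signal buried in noise

# Alternate points go to the training set and the hold-out set.
x_train, y_train = x[::2], y[::2]
x_hold, y_hold = x[1::2], y[1::2]

results = {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(y_train, np.polyval(coeffs, x_train)),   # in-sample error
                       mse(y_hold, np.polyval(coeffs, x_hold)))     # hold-out error
    print(degree, results[degree])
```

The more flexible model always matches the training data at least as well as the simple one. Whether it does worse on the hold-out points depends on the particular sample, though that is the typical outcome when most of the variation is noise. Comparing the two error columns is what a validation set is for.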
More inputs do not necessarily translate to better performance
To build good models, we want our inputs to have a high correlation with the outputs. However, we do not want our inputs to be highly correlated with one another. While neural networks are capable of handling such correlations, they can’t increase the informational value of the data set beyond what it intrinsically has. So there is little point in having lots of different input data when they are all highly correlated. That will not help improve the model. For example, we can make a guess at a person’s age given his height. But telling us further how much he weighs does not provide much more than what height already told us. So it is unlikely to make our guesses any more accurate. And in our case, it remains unclear whether alternative data provides any significant added value over traditional market data.
So can machines take over from investment managers today? Let's not delude ourselves. Artificial intelligence is not the same as human cognition. We can’t replicate something we do not fully understand. AI is a field of study built on the foundations of multiple disciplines such as computer science, neuroscience, math, statistics and more. And it relies on pure computational power to make machines mimic specific human actions or decision making. But to make everything work, a human architect lies behind every machine and model.
I would agree, though, that advances in machine intelligence and the emergence of alternative data open up new opportunities for hedge funds. But let's not jump the gun and get overly excited. Much of this is overhyped by business marketing and the media. Given the secrecy around hedge funds, we are not going to know exactly what they are doing with it anyway. Maybe behind that facade of AI, what some are doing is still old school linear regression with adaptive capabilities that are nothing more than moving time windows.
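The height and weight example can be put into numbers with a small simulation. Everything below is invented for illustration: the ages, the way height depends on age, and the assumption that weight tracks height plus some noise unrelated to age.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(7)
n = 500
age = rng.uniform(5, 18, size=n)                       # made-up children's ages
height = 80 + 6 * age + rng.normal(0, 8, size=n)       # height depends on age
weight = 0.5 * height - 30 + rng.normal(0, 3, size=n)  # weight mostly tracks height

r2_height = r_squared(height, age)
r2_both = r_squared(np.column_stack([height, weight]), age)
print(round(r2_height, 3), round(r2_both, 3))
```

Adding weight as a second input can never lower the in-sample fit, but here it barely raises it, because weight carries almost no information about age that height has not already supplied.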
At this stage, all we can say is the potential is there. In the meantime, let's stay open and observe, or we can join the fray and try it for ourselves.
Disclaimer: Any views or opinions represented in are personal and belong solely to me. It does not represent any other people, institutions or organizations that may or may not be associated with in professional or personal capacity. The views or opinions are not intended to offend or malign any religion, ethnic group, club, organization, country, company or individuals. The content is provided for informational purposes only. It is not intended to be, nor shall it be construed, as an offer, or a solicitation of an offer, to buy or sell an interest in any fund or security. I make no representations as to accuracy, reliability, completeness, suitability or validity of any information.
Disclosure: I/we have no positions in any stocks mentioned, and no plans to initiate any positions within the next 72 hours. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it. I have no business relationship with any company whose stock is mentioned in this article.