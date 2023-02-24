

Modern Machine Learning at Euclidean

Euclidean Fund I ended 2022 with a full-year return [1] of -10.32%. By comparison [2], the S&P 500 returned -18.1%; the NASDAQ Composite, -32.54%; and the Russell 2000, -20.44%. Through February 24, 2023, Euclidean Fund I is up approximately 8%, whereas the S&P 500 is up approximately 4%. While we are disappointed that we did not provide investors with a positive return in 2022, we are happy to have provided some protection from the large drawdowns in market indices during the year. At the start of 2022, we believed there had been an excessive level of irrational exuberance in the market for many years and that, eventually, a more sober environment that favors fundamental approaches to investing, such as Euclidean’s, would prevail. We are hopeful that the events of 2022 are an indication that such a shift has begun.

Over a decade ago, Euclidean was founded with the goal of applying advancements in machine learning to long-term systematic equity investing, and that remains our core mission today. Since then, machine learning has evolved in significant ways and, as a result, so have the machine learning-based models that Euclidean uses to drive its investment process. We have written extensively in the past four years about this evolution through investor letters and a series of posts on our website. However, with the recent increased visibility of AI and machine learning applications (such as ChatGPT) in the popular press, I want to take this opportunity to explain how the evolution of machine learning has influenced our work at Euclidean and how it has led us to where we are today.

Machine Learning and Long-Term Equity Investing

Investors have been using sophisticated computational and mathematical techniques in equity investing for decades, and this has given rise to an entire industry of quantitative hedge funds, mutual funds, and ETFs. So it is natural to ask: What does machine learning bring to the table that has not been uncovered by decades of research on quantitative methods of equity investing? To answer this, it is important to understand that the foundational workhorse tool of quantitative investing is linear regression. Machine learning, by contrast, is the process by which models are “trained” to discover complex non-linear relationships in large amounts of data.

Linear regression is a powerful tool that has been used for centuries because it is effective, mathematically rigorous, computationally simple, and generates models that are easy to interpret and understand. However, it is limited in that it cannot discover complex non-linear relationships in high-dimensional datasets. This limitation is what necessitated the development of machine learning techniques that can perform increasingly complex tasks, exemplified by the impressive feats of AI that we see today. Modern voice recognition, self-driving cars, generative AI, computer vision, and high-quality language translation could never have been achieved through the application of simple linear models.
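To make the limitation concrete, here is a minimal sketch in Python. It is an illustration on synthetic data, not anything drawn from Euclidean's models: the helper names fit_line and r_squared are hypothetical, and the hand-built squared feature is only a stand-in for the kind of transformation that a machine learning model would be expected to learn from data on its own.

```python
# Illustrative only: synthetic data, not taken from the letter.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def r_squared(ys, preds):
    """Share of the variance in ys that preds explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# A purely non-linear relationship: y depends on x only through x squared.
xs = [x / 10 for x in range(-20, 21)]
ys = [x * x for x in xs]

# A straight line in x: the slope is close to zero and little is explained.
a, b = fit_line(xs, ys)
r2_linear = r_squared(ys, [a + b * x for x in xs])

# The same regression on a hand-built non-linear feature z = x squared.
zs = [x * x for x in xs]
a2, b2 = fit_line(zs, ys)
r2_feature = r_squared(ys, [a2 + b2 * z for z in zs])

print(f"R^2 using x:         {r2_linear:.3f}")
print(f"R^2 using x squared: {r2_feature:.3f}")
```

The straight line explains essentially none of the variation, even though y is fully determined by x. The second fit succeeds only because someone supplied the right transformation by hand, which becomes impractical when the relationships are numerous, interacting, and unknown in advance.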

Equity investing involves making judgments about companies, how they will perform financially, and how their value and price will evolve over time. Companies are complex evolving entities, and the amount and dimensionality of the information on them is enormous. Presumably, therefore, to make the most informed judgments on companies as equity investments, the complexities of these relationships should be evaluated and understood. This is the foundational thesis that originally motivated Euclidean to explore how machine learning can be used to improve quantitative equity investing. Since then, the field of machine learning has advanced significantly, introducing a range of potential benefits to its application in investing. We will explore these benefits in more detail below.

The Evolution of Machine Learning and Euclidean Technologies

During the last 10 years, the combination of ever-increasing computing power available at lower costs, the volume of data available online, and a set of key innovations has led to rapid progress in the field of “deep learning” – the training of deep artificial neural networks to learn complex non-linear relationships in data. One area of deep learning that has received significant attention is natural language processing, which includes applications such as language translation, sentence completion, and generative models like ChatGPT. These applications share a common technique called sequence-to-sequence learning, which has proven to be highly successful. In natural language processing, sequence-to-sequence learning involves training a deep neural network to map one sequence of words (or characters) to another – for example, given the input sequence “the cat ate all the,” the neural network is trained to output “food as it was hungry.” This process is repeated, with the sequence advancing by one word at a time as it moves through an entire corpus of text. This can be visually depicted by making the outputs of the deep learning model the same as the inputs but shifted forward by one word, as shown here:

The deep learning models are then trained to generate the output sequence from the input sequence. In the case of large language models, such as the one behind ChatGPT, the models are trained on a corpus of text that includes almost the entire internet. The result is the ability to generate very sophisticated (although not without flaws) dialogues between a person and a machine.
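The one-word shift between inputs and outputs can be sketched in a few lines of Python. This is only an illustration of how the training pairs line up, using the example sentence above as a toy corpus. The function name shifted_pairs and the window length of five words are assumptions made for the example, and splitting on spaces is a simplification. No neural network is trained here.

```python
def shifted_pairs(text, window):
    """Return (input_words, target_words) pairs in which each target is
    the input window shifted forward by one word."""
    words = text.split()
    pairs = []
    for start in range(len(words) - window):
        inputs = words[start:start + window]
        targets = words[start + 1:start + window + 1]
        pairs.append((inputs, targets))
    return pairs

# Toy corpus built from the example sentence in the text.
corpus = "the cat ate all the food as it was hungry"
pairs = shifted_pairs(corpus, window=5)

for inputs, targets in pairs:
    print(" ".join(inputs), "->", " ".join(targets))
```

Each target sequence repeats the input sequence from its second word onward and adds one new word at the end, so the only genuinely new thing the model must produce at each position is the next word.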

In 2017, long before large language models were making headlines, we saw the potential for applying sequence-to-sequence learning to quantitative investing. Specifically, company financial data, such as income statements and balance sheets, also form a sequence of information, but one ordered in time. It seemed evident to us that there would be great value in successfully predicting the next step in these company financial sequences. Indeed, this is exactly what an entire industry of well-paid financial analysts is attempting to do every day: poring over financial statements, earnings transcripts, and company filings to forecast future company financials.

Using sequence-to-sequence natural language models as an analogy, we set up a new deep learning model in which a sequence of annual financial data (including most information that can be found in an income statement, a balance sheet, and a cash flow statement) is aligned in time as the input. The outputs are the same financial data but shifted one year into the future, as shown below:

By using deep learning techniques to train a model to generate the output sequence given the input sequence, we were, in effect, teaching the model to forecast the next year’s financial information for a company from its own historical financial information.
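The same alignment can be sketched for annual financial data. The example below is a hedged illustration of the input and output layout only, not Euclidean's actual model or data: the company figures are invented, the two fields stand in for the much larger set of statement items mentioned above, and the function name shifted_year_pairs and the three-year lookback are assumptions. It also assumes one record per consecutive fiscal year.

```python
# Hypothetical annual figures for one made-up company (arbitrary units).
history = [
    {"year": 2015, "revenue": 100.0, "net_income": 8.0},
    {"year": 2016, "revenue": 110.0, "net_income": 9.5},
    {"year": 2017, "revenue": 121.0, "net_income": 11.0},
    {"year": 2018, "revenue": 118.0, "net_income": 7.0},
    {"year": 2019, "revenue": 130.0, "net_income": 12.5},
]

# Stand-ins for the full set of financial statement items.
FIELDS = ("revenue", "net_income")

def shifted_year_pairs(records, lookback):
    """Pair each run of `lookback` fiscal years with the same fields
    shifted one year into the future."""
    ordered = sorted(records, key=lambda r: r["year"])
    years = [r["year"] for r in ordered]
    rows = [[r[f] for f in FIELDS] for r in ordered]
    pairs = []
    for start in range(len(rows) - lookback):
        input_years = years[start:start + lookback]
        inputs = rows[start:start + lookback]
        targets = rows[start + 1:start + lookback + 1]
        pairs.append((input_years, inputs, targets))
    return pairs

pairs = shifted_year_pairs(history, lookback=3)
for input_years, inputs, targets in pairs:
    print(input_years, inputs, "->", targets)
```

The last row of each target is the one year the inputs have not seen, which is the quantity a trained model would be asked to forecast.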

This basic idea led to three years of intensive research in the application of deep learning to long-term fundamental investing. Over those years, the idea expanded and evolved in many ways, most importantly by adding the quantification of uncertainty in the forecasts as one of the key objectives of the new models. The research led to two published peer-reviewed papers that were presented at both the NeurIPS and ICML machine learning conferences. The papers can be found here and here. In March 2020, we deployed the new models to drive the investment process of our core investment fund: Euclidean Fund I. Since then, Euclidean Fund I’s returns compared to the S&P 500’s total returns are as follows:

Returns are presented net of all fees and expenses, and include the reinvestment of all income. All returns are purely historical, are no indication of future performance, and are subject to adjustment. All returns are annualized except for periods of less than one year. Changes in markets, interest and exchange rates, economic and/or political conditions and other factors may influence the future performance of the Fund. The Fund may also from time to time change its investment strategies and objectives and allocate Fund assets differently than in prior periods.

We are pleased with the fund’s performance under these new models, and we are very optimistic about its future in an environment where investors have become more focused on evaluating companies based on their fundamentals – an environment that we believe favors Euclidean’s machine learning-based approach.

Our commitment to our investors is to constantly evaluate newly developed approaches and techniques in the rapidly evolving field of machine learning to further enhance our models. One area where we believe there is significant potential is natural language processing. Large language models, such as the one powering ChatGPT, have achieved unprecedented feats by training on almost the entire corpus of text available on the internet. It is important to note that financial statements and structured numerical data about companies make up only a small portion of the available data, while most of the information about companies is contained in the form of written or spoken language – such as the text of SEC filings, earnings transcripts, news articles, analyst reports, and investor presentations. Given the success of machine learning-based language models, it seems reasonable to explore their application to such data, as this has the potential to further improve Euclidean's approach. Therefore, most of our research efforts are currently focused on this area.

We believe our approach is a winning one and look forward to demonstrating this with strong performance in the decades ahead. Please reach out anytime with questions or feedback on this letter or Euclidean's investment philosophy.

Best regards, John

Historical results represented herein are for illustrative purposes only and are not based on actual performance results. The hypothetical portfolio and the associated returns do not reflect the effects of transaction costs, bid/ask spreads, slippage, or management fees. Historical results are not indicative of future performance. The opinions expressed here are those of Euclidean Technologies Management. The opinions referenced are as of the date of publication and are subject to change due to changes in the market or economic conditions and may not necessarily come to pass. Forward-looking statements cannot be guaranteed. Euclidean Technologies Management is an investment adviser registered with the U.S. Securities and Exchange Commission. Registration does not imply a certain level of skill or training. More information about Euclidean’s investment advisory services and fees can be found in its Form ADV Part 2, which is available upon request.

Footnotes

[1] Returns are reported net of management fees and expenses. All returns are purely historical, are no indication of future performance, and are subject to adjustment. Changes in markets, interest and exchange rates, economic and/or political conditions and other factors may influence the future performance of the Fund. The Fund may also from time to time change its investment strategies and objectives and allocate Fund assets differently than in prior periods.

[2] All index returns are “total returns”, meaning price changes plus the reinvestment of dividends. The index total returns data is provided by YCharts.


Editor's Note: The summary bullets for this article were chosen by Seeking Alpha editors.