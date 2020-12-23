Appen’s current valuation is modest given the company’s track record of profitable growth and its exposure to the attractive AI market.

Appen’s (OTC:APPEF) stock has declined significantly in recent months as revenue growth decelerated due to COVID-19 headwinds. This appears to be a broad problem across most of the analytics market, which presents investors with the opportunity to acquire the stock at a discounted price. Revenue growth is expected to accelerate in 2021, which should cause the market to revalue the stock.

Market

The artificial intelligence (AI) market is growing rapidly, driven by declining hardware costs, larger training data volumes and improvements in algorithms. As the value of data continues to increase, so does the demand for data collection and data labelling. An estimated 80% of a typical AI project’s time is spent on gathering, organizing and labelling data, and data labelling alone is estimated to account for 25% of the time and cost of a machine learning project. Ensuring this process is done correctly and efficiently is crucial to the success of any AI project.

Figure 1: Breakdown of the Estimated Time and Cost of a Machine Learning Project

(source: CloudFactory)

There are a number of factors driving the demand for data collection and labelling, not least of which is the continued rapid growth in data being generated. This growth is dominated by unstructured data, which is currently growing at a rate of 26.8% annually, compared to structured data, which is growing at an annual rate of 19.6%. Unstructured data refers to any data which, despite possibly having internal structure, is not organized by pre-defined data models or schema. Unstructured data includes formats like audio, video and social media postings.

Figure 2: Growth in Stored Data Globally

(source: m-files)

Data labelling is also required for maintaining models throughout their working life, not just during the initial training period. This is because data has a tendency to become stale, causing the model to drift and performance to degrade over time. It is estimated that approximately 34% of models need to be refreshed on a monthly basis, and a significant portion of Appen’s revenue comes from labelling data to maintain model accuracy.

Figure 3: Model Degradation Over Time

(source: Appen)
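The link between model drift and recurring labelling demand can be illustrated with a minimal sketch. It assumes a model's accuracy is re-measured each month on a freshly labelled sample and that a refresh is triggered once accuracy falls a set tolerance below its launch level. The function name, the tolerance and the accuracy readings are all hypothetical and are not taken from Appen.

```python
from typing import List

def months_needing_refresh(monthly_accuracy: List[float],
                           baseline: float,
                           tolerance: float = 0.05) -> List[int]:
    """Return the 1-based months in which accuracy on freshly labelled
    data has fallen more than `tolerance` below the baseline."""
    return [month
            for month, accuracy in enumerate(monthly_accuracy, start=1)
            if baseline - accuracy > tolerance]

# Hypothetical accuracy readings for six months after deployment.
observed = [0.92, 0.91, 0.89, 0.86, 0.84, 0.81]

# Months where a retraining run (and therefore new labelled data) is needed.
flagged = months_needing_refresh(observed, baseline=0.92)
print(flagged)
```

Each flagged month requires both a fresh labelled sample to measure accuracy and additional labelled data to retrain, which is why model maintenance can be a recurring source of labelling demand.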

Neural networks are a type of machine learning algorithm that have been behind some of the dramatic improvements in model performance in recent years. In comparison to many other algorithms, neural networks generally continue to exhibit improving performance as the volume of training data increases, even with massive volumes of data.

Figure 4: Model Performance Improvement with Data Volume

(source: Appen)

As a result, the training compute requirements of state-of-the-art models in areas like image recognition, translation and text generation are growing exponentially. This is generally accompanied by a corresponding increase in the volume of labelled data required for training purposes.

Figure 5: Training Compute Requirements for State of the Art Machine Learning Models

(source: openai)

While dramatic improvements in model performance have spurred the adoption of AI by many businesses, others have either not attempted AI projects or have encountered difficulties implementing them. One of the core reasons for this is a lack of data or a lack of quality data, problems which are addressed by data labelling companies.

Figure 6: Factors Holding Back AI Adoption

(source: Appen)

The quantity and variety of data handled during larger AI projects have now grown to the point that specialist service providers are essential for many companies. The data labelling market is relatively small but growing rapidly due to strong growth in end markets.

Figure 7: Growth Proxies for Training Data Demand

(source: Appen)

Cognilytica estimates the market for third party data labelling will grow from 1.7 billion USD in 2019 to 4.1 billion USD in 2024. According to IDC, AI spend is growing at approximately 28% annually and will reach approximately 97.9 billion USD in 2023. Using Appen’s estimate of 25% of AI project cost being dedicated to data labelling would imply data labelling spend of approximately 24.5 billion USD. There is likely a significant opportunity to expand the market for third party labelling by capturing internal spending. Known government AI budgets include 5 billion USD in the US and 2.3 billion GBP in the UK. Assuming 25% of budget is allocated to data labelling, these two governments could represent an opportunity of roughly 2 billion USD (converting GBP at an assumed rate of approximately 1.35 USD).
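The market sizing arithmetic above can be reproduced with a short sketch. The market figures are those quoted in the text; the exchange rate used to convert the UK budget is an assumption rather than a number from the article, and the government result is sensitive to it.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

LABELLING_SHARE = 0.25   # share of AI project cost spent on data labelling
GBP_TO_USD = 1.35        # ASSUMED exchange rate, not from the article

# Cognilytica: third party labelling market, 2019 to 2024 (billion USD).
third_party_cagr = cagr(1.7, 4.1, 2024 - 2019)

# IDC: total AI spend in 2023 (billion USD), scaled by the labelling share.
implied_labelling_spend = 97.9 * LABELLING_SHARE

# Government budgets: US in USD, UK converted from GBP at the assumed rate.
gov_budget_usd = 5.0 + 2.3 * GBP_TO_USD
gov_labelling_usd = gov_budget_usd * LABELLING_SHARE

print(f"Third party labelling CAGR: {third_party_cagr:.1%}")
print(f"Implied labelling spend:    {implied_labelling_spend:.1f}B USD")
print(f"Government opportunity:     {gov_labelling_usd:.2f}B USD")
```

Cognilytica's figures imply growth of roughly 19% per year for third party labelling, somewhat below the 28% growth IDC projects for AI spend overall.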

Appen

Appen provides clients with a range of services, primarily related to extracting insights from data (images, text, speech, etc.). Most of Appen’s services fall into the data collection and data labelling categories. Appen is primarily focused on relevance, language and image use cases, which commonly include:

• Speech to text

• Natural language understanding

• Computer vision

• Search relevance

• Product recommendations

Figure 8: Appen Use Cases

(source: Appen)

Appen provides these services utilizing over 1 million freelance workers while ensuring that labelling is accurate and consistent. This is achieved by providing training, utilizing automated resume screening for onboarding and using more tenured workers where possible to reduce rework requirements.

Appen has a number of focus areas including expanding their addressable market by supporting a wider variety of use cases. As part of this effort, they are enhancing their security products to meet the data privacy requirements of potential clients who have sensitive data. Government clients are one of the primary targets of this endeavor. Appen is also introducing a low touch service model which will make Appen’s services viable for smaller clients.

Appen is introducing features to support growth markets, including 3D point cloud / LiDAR annotation for autonomous vehicles, robotics, manufacturing and retail, and pixel-level image and video annotation for high-quality computer vision data. With the exception of Tesla, LiDAR appears to be the technology of choice for autonomous vehicles. Pointerra is a provider of 3D point cloud database solutions and, while the company is still small, it is currently growing at triple digit rates. This may be indicative of the growing importance of point cloud data.

Figure 9: Pointerra Annualized Contract Value

(source: Created by author using data from Pointerra)

Appen is trying to expand its China business and currently has a team of over 160 employees there who are focused on building a customer base, particularly in the technology sector. Appen has also developed an air-gapped technology stack and operations for privacy and IP protection. China revenue increased sharply in the first half of 2020, although it is still insignificant. The escalation of tension between China and Australia in recent months and China’s moves to restrict the import of a number of Australian goods make this a risky source of revenue though.

Figure 10: Appen China Revenue

(source: Appen)

Appen recently established a dedicated entity to focus on the government sector and are currently trying to expand their customer base. This entity has air-gapped technology and operations and prime contractor eligibility. A Washington DC office has been opened and business development is ongoing.

Appen is engaged in a number of efficiency initiatives:

• Matching workers with tasks that match their competencies.

• Automation tools to enhance worker productivity. Pilot projects in transcription achieved over 100% productivity gains with improved quality.

• Improved user interface for freelance workers to reduce friction in labelling.

• Automated quality management tools which utilize AI.

• Improving model maintenance capabilities through better integration of services with clients’ model pipelines.

Figure 11: Appen Focus Areas

(source: Appen)

Appen recently lowered earnings guidance, after which the stock declined approximately 17%; it is now approximately 38% below its all-time high. While lowered guidance is always negative, particularly for growth stocks, it appears that this is a result of lower spending on analytics and AI broadly and not an Appen-specific issue. Revenue growth for analytics software providers like Alteryx (AYX) and C3.ai (AI) has also decelerated significantly in recent quarters. Additionally, Appen’s management has stated that crowd hiring trends at Lionbridge AI indicate that this is an industry-wide slowdown.

The revenue growth of large tech companies like Google (GOOG) and Facebook (FB) is probably a reasonable leading indicator of spending on data labelling services. These companies saw lower growth in the first half of 2020 but Q3 indicated that they may have already passed the trough. If revenue growth for these companies continues to increase in coming quarters it likely points to a rapid turnaround in growth for Appen in 2021.

Figure 12: Revenue Growth of Customers and Comparable Companies

(source: Created by author using data from company reports)

Technology

One of the primary risks for Appen is how the AI technology landscape evolves in coming years. Improvements in AI algorithms and how they are trained may reduce data requirements. Research in the following areas could potentially reduce the demand for data labelling services:

Unsupervised Learning – Models are trained using unlabeled data with the algorithm looking for patterns in the data. This often involves clustering items with similar attributes, like determining which items are commonly bought together. The applications for unsupervised learning are currently limited due to the lack of labels restricting the ability to make predictions, but improvements in algorithms are opening use cases in areas like natural language processing and computer vision for autonomous vehicles.

Semi-Supervised Learning – Models are trained using a small number of labelled data points and a large number of unlabeled data points. This process assumes that the labelled data can be used to infer the labels of the unlabeled data. Semi-supervised learning still requires data labelling but can dramatically reduce the volume required. This is potentially an area where Appen can develop expertise though, to increase the productivity of their crowd workers.

Transfer Learning – Uses a model developed for a similar problem as a starting point for a new problem. This potentially transfers knowledge from an existing problem to a new problem and can result in improved model performance with reduced volumes of training data. Transfer learning is likely to become more common as machine learning technology matures and model templates become common for certain types of problems like image recognition.

Few Shot Learning – Refers to approaches which try to achieve acceptable model performance from a small amount of training data. These can include regularization techniques and meta-learning.

Synthetic Data – Artificial data can be generated with statistical properties that match the sample data and used as a replacement for sample data in some cases. A common example is using synthetic data to train self-driving vehicles, as sample data can be prohibitively expensive to acquire. This process still requires labelled data as a starting point for generating synthetic data but is again an approach which has the potential to reduce the demand for data labelling services.
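To make the effect on labelling volumes concrete, the following is a minimal sketch of self-training, one simple form of semi-supervised learning. It assumes one-dimensional data, two classes and a nearest-centroid rule; the function name, the `margin` parameter and the data points are hypothetical and chosen only for illustration. Real systems use far richer models, but the pattern is the same: a few human labels are propagated to confident cases, and only the ambiguous remainder needs human review.

```python
from statistics import mean

def self_train(labelled, unlabelled, rounds=5, margin=1.0):
    """Pseudo-label unlabelled 1-D points with a nearest-centroid rule.

    labelled:   list of (value, label) pairs, labels are 0 or 1
    unlabelled: list of values without labels
    margin:     minimum gap between the distances to the two centroids
                before a pseudo-label is accepted as confident
    """
    labelled = list(labelled)
    remaining = list(unlabelled)
    for _ in range(rounds):
        # Recompute class centroids from everything labelled so far.
        c0 = mean(v for v, y in labelled if y == 0)
        c1 = mean(v for v, y in labelled if y == 1)
        confident, uncertain = [], []
        for v in remaining:
            d0, d1 = abs(v - c0), abs(v - c1)
            if abs(d0 - d1) >= margin:
                confident.append((v, 0 if d0 < d1 else 1))
            else:
                uncertain.append(v)
        if not confident:
            break  # nothing new can be labelled automatically
        labelled.extend(confident)
        remaining = uncertain
    return labelled, remaining

# Two human-labelled seed points and ten unlabelled points (hypothetical).
seed = [(1.0, 0), (9.0, 1)]
pool = [0.5, 1.5, 2.0, 3.0, 4.8, 5.2, 7.0, 8.0, 8.5, 9.5]

labelled, still_unlabelled = self_train(seed, pool)
print(len(labelled), still_unlabelled)
```

In this toy case two human labels are enough to label eight further points automatically, leaving two ambiguous points near the class boundary for a human labeller.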

Competitive Advantage

Data labelling companies compete on the ability to provide quality data at low cost and on the ability to handle large projects. Outsourcing makes the business more scalable but it can also introduce quality problems if it is not handled properly. Training and software are typically used to ensure the quality and productivity of freelance workers. Companies also need to be able to recruit workers with the necessary skills to label datasets, some of which may be specialized. Appen does not control freelance workers though, so it is difficult to consider them a source of competitive advantage.

Procedural Knowledge – Appen was founded as a language technology provider in 1996 and has a long history in the market. Over this time, it is likely to have built institutional knowledge of how to manage crowd workers and deliver quality services at low cost. New entrants into the market must learn how to build themselves into customer workflows, how to attract and retain quality crowd workers and how to ensure service quality.

Technology – Related to procedural knowledge is the development of specialized internal tools to assist with managing customers and freelance employees as well as aiding worker productivity. This has become a focus area for Appen and software tools are likely to become more difficult to replicate as data volumes increase and data becomes more specialized (e.g., 3D point cloud).

Two-Sided Marketplace – Appen has successfully developed a liquid marketplace for freelance workers and companies seeking data labelling services. Although it can be difficult for a marketplace to reach critical mass, I generally consider this to be a weak source of competitive advantage.

Economies of Scale – A large and competent freelance workforce with access to leading productivity tools allows Appen to manage labelling projects that are too large for others and at a lower price point. Clients may be reluctant to utilize multiple vendors due to concerns over labelling quality and consistency between vendors. Scale also allows Appen to deploy significant resources on tools which smaller competitors may not be able to justify.

Customer Insight – Appen often develops long-term relationships with customers and has exposure to their workflows. This gives Appen insights that competitors may not have and allows Appen to customize their services to client requirements.

Switching Costs – Appen may end up embedded in its customers’ data pipelines, particularly for on-going services related to model maintenance. Depending on the cost of Appen’s services relative to the budget of an AI project, it may be difficult for customers to justify the burden of building a new vendor into the pipeline.

Customers

Appen’s client base consists mainly of government agencies and large technology companies like Facebook and Microsoft (MSFT). Appen has an extremely concentrated customer group, which leaves it vulnerable to changes in its customers’ macroeconomic environment and spending decisions. During the financial half-year ended 30 June 2020, approximately 90.2% of the Group's external revenue was derived from sales to five major customers. This is something Appen is actively trying to change with growth initiatives focusing on China, government clients and smaller clients, but given the size of the larger tech companies it is somewhat inevitable.

Competitors

The market for data labelling services is relatively fragmented with Appen and Lionbridge AI being the only two major competitors globally.

Lionbridge AI

Lionbridge AI provides labelling services across image, video, audio, text and geospatial data and has over 20 years of experience. Their crowd workforce consists of more than 1 million contributors, with between 30,000 and 50,000 members of this community deployed at any one point in time. Lionbridge AI offers services across Data Collection, Data Annotation, Data Validation and Linguistics, with their specialty being the provision of multilingual training data services.

Lionbridge AI counts a number of high-profile companies as customers, including: Expedia, Facebook, Apple, HP, Merck, Microsoft, CISCO, Johnson & Johnson, Siemens, Credit Suisse, Nike, eBay, Singapore Airlines, JPMorgan Chase & Co, Coca-Cola and General Motors. Lionbridge AI appears adept at building on-going relationships with clients, as its relationships with its top five customers average 15 years in length.

Revenue in 2019 was approximately 260 million CAD, which was approximately half that of Appen. Revenue growth was 29% and the EBITDA margin was in the range of 20-25%. These figures are similar to Appen’s, although with slightly lower growth and higher margins.

In November 2020, TELUS International entered into an agreement to acquire Lionbridge AI in a deal valued at 935 million USD, a revenue multiple similar to the one at which Appen currently trades. TELUS (TU) is a communications and information technology company with $15 billion in annual revenue and 15.4 million customer connections spanning wireless, data, IP, voice, television, entertainment, video and security. TELUS Health is Canada’s largest healthcare IT provider, and TELUS International provides business process solutions to a number of established global brands.

TELUS is acquiring Lionbridge AI to help provide outstanding customer experiences and accelerate digital transformation. It is not clear that TELUS can add any strategic value for Lionbridge AI’s customers though, and the acquisition appears to be largely focused on internal initiatives.

Cloud Vendors

The large cloud vendors like Amazon (AMZN), Google and Microsoft have built a large suite of services around artificial intelligence and could likely successfully launch competing crowdsourced data labelling platforms if they desired. Amazon already has Mechanical Turk, which is a freelance worker platform that can be used for data labelling. These companies could offer an integrated suite of AI services which would be a compelling value proposition for many. Data labelling is likely viewed as a relatively low value opportunity for these companies though. If nothing else this ability to integrate forward will limit the bargaining power of Appen and cap the potential for higher margins.

Other

There are a number of smaller competitors offering data labelling services, some of which use freelance workers and others who rely on their own employees. Companies utilizing their own employees for data labelling have more control over their operations and can potentially offer higher quality services. It is potentially difficult to have the necessary in-house expertise to be able to offer services in niche areas though (e.g., translation for a language which is not widely spoken). It also introduces operating leverage into the business and creates downside risk if employees are not fully utilized.

Financial Analysis

Appen has a track record of strong revenue growth in recent years, although it is now guiding for relatively flat revenue in the second half of 2020. This slowdown has been blamed on a combination of a reduction in digital ad spending, a reduction in IT/digital spending, a reduction or cancellation of services from Appen’s smallest customers, interruptions to global hardware supply chains and suspension of face-to-face projects such as audio data collection. Larger clients have also shifted spending to new product areas and this has impacted spending on some large mature projects.

Appen has faced headwinds throughout the pandemic as they do not benefit from subscription revenue and machine learning expenses have been treated as non-essential by clients. Appen are expecting a rapid return to growth next year, with revenue growth expected to be broadly in line with growth of the AI market. Using Cognilytica’s estimate of market size, Appen has a market share of approximately 24%.

Figure 13: Appen Revenue

(source: Created by author using data from Appen)

Appen is focused on increasing committed revenue and annual contract value has grown 405% in the past 12 months to $103 million. This was underpinned by an enterprise-wide platform agreement with a major customer which includes an 80 million USD annual commitment. A higher proportion of revenue coming from contracts should increase revenue stability and give higher visibility into future revenue which may result in valuation multiple expansion.

Figure 14: Appen Contract Revenue

(source: Appen)
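The contract value figures above can be unpacked with some simple arithmetic. This sketch assumes the $103 million annual contract value and the 80 million USD commitment are stated in the same currency, which the text does not confirm.

```python
acv_now = 103.0            # annual contract value today, millions (as stated)
growth = 4.05              # 405% growth over the past 12 months
major_commitment = 80.0    # annual commitment from one major customer, millions

# Back out the annual contract value 12 months earlier.
acv_prior = acv_now / (1 + growth)

# Share of today's contract value tied to the single platform agreement
# (ASSUMES both figures are in the same currency).
share_from_major = major_commitment / acv_now

print(f"ACV 12 months ago:         {acv_prior:.1f}M")
print(f"Share from major customer: {share_from_major:.0%}")
```

Under that assumption, contract value a year earlier was roughly 20 million, and close to four fifths of current contract value comes from the one platform agreement, so committed revenue is still highly concentrated.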

Speech and Image projects are cyclical in nature, heavily dependent on customer timing, investment and product life cycles and require less ongoing data refresh than relevance projects.

Table 1: Appen Segments

(source: Created by author using data from Appen)

Figure 15: Appen Segment Revenue

(source: Created by author using data from Appen)

Appen’s gross margins have declined in recent years, probably driven by strong growth in the content relevance business. Despite this, operating margins have been relatively stable due to operating leverage as the business has scaled. Investors should look for improvements in profitability as a result of Appen’s efficiency initiatives, along with continued growth.

Figure 16: Appen Profit Margins

(source: Created by author using data from Appen)

Content relevance margins are relatively weak in comparison to language resources, although they are improving as the business grows, which should help to drive Appen’s operating profitability going forward.

Figure 17: Appen Segment Margins

(source: Created by author using data from Appen)

Valuation

Despite being exposed to the large potential of the AI market, Appen’s business model has failed to capture the imagination of investors until relatively recently and this was on the back of a period of extremely high growth. Based on a discounted cash flow analysis I estimate that the intrinsic value of Appen’s stock is approximately 42 AUD per share.

Figure 18: Appen Historical Enterprise Value to Gross Profit Ratio

(source: Created by author using data from Yahoo Finance)

Figure 19: Enterprise Value to Gross Profit Ratio for Comparable Companies

(source: Created by author using data from Yahoo Finance)
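The article does not disclose the inputs behind the discounted cash flow estimate mentioned in this section. For readers unfamiliar with the method, the following is a generic sketch of a multi-stage DCF with a Gordon growth terminal value. The function name and every input shown are placeholders chosen for easy arithmetic; they are not Appen's figures and do not reproduce the author's model.

```python
def dcf_value_per_share(fcf, growth_rates, discount_rate,
                        terminal_growth, net_cash, shares):
    """Per-share value from explicit-period cash flows plus a terminal value.

    fcf:             most recent annual free cash flow
    growth_rates:    growth rate for each explicit forecast year
    discount_rate:   required rate of return
    terminal_growth: perpetual growth rate after the forecast period
    net_cash:        cash minus debt, added to the value of operations
    shares:          shares outstanding
    """
    if discount_rate <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    value = 0.0
    cash_flow = fcf
    for year, g in enumerate(growth_rates, start=1):
        cash_flow *= 1 + g
        value += cash_flow / (1 + discount_rate) ** year
    # Gordon growth terminal value, discounted back from the final year.
    terminal = cash_flow * (1 + terminal_growth) / (discount_rate - terminal_growth)
    value += terminal / (1 + discount_rate) ** len(growth_rates)
    return (value + net_cash) / shares

# Placeholder inputs (NOT Appen's): two years of 10% growth, 10% discount rate.
example = dcf_value_per_share(fcf=100.0, growth_rates=[0.10, 0.10],
                              discount_rate=0.10, terminal_growth=0.0,
                              net_cash=0.0, shares=10.0)
print(round(example, 2))
```

In models of this kind the terminal value usually dominates the result, so the estimate is highly sensitive to the discount rate and long-run growth assumptions.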

Disclosure: I am/we are long APPEF. I wrote this article myself, and it expresses my own opinions. I am not receiving compensation for it (other than from Seeking Alpha). I have no business relationship with any company whose stock is mentioned in this article.