UBmatrix's XBRL Search Engine: On the Tarmac and Revving Up

About: EDGAR Online (EDGR)

Background: EDGAR Online (NASDAQ:EDGR) merged with UBmatrix on Nov. 23, 2010, with UBmatrix becoming a wholly owned subsidiary of EDGAR Online, Inc. UBmatrix designs advanced server-side software for government regulators worldwide including the SEC, FDIC, Banque de France, National Bank of Belgium and others. Its technology partners include Oracle (NASDAQ:ORCL), SAP AG (NYSE:SAP), IBM (NYSE:IBM), Information Builders, SQL Power and cundus; and its implementation partners include leading international business organizations. UBmatrix has also introduced its front-end Microsoft Office "Report Builder" for:

1) the U.K.-mandated inline XBRL tax reporting market:

[Figure: Inline XBRL]

2) the basic XBRL conversion solutions, including detailed tagging of footnotes:

[Figure: Detailed Tagging of Footnotes Example]

3) "xBReeze," a special bank application satisfying Basel II requirements.

Considered a founder and pioneer of XBRL (eXtensible Business Reporting Language), UBmatrix was granted the seminal U.S. patent #6947947, entitled "Method for Adding Metadata to Data" (provisional application dated Aug. 17, 2001; patent dated Sept. 20, 2005). The patent has 214 claims. The combined companies now have both the full perspective of the regulators' server-side validation requirements and the front-end solutions that filing companies need. EDGAR Online's proprietary filing solution is offered via its Xcelerate product.

Patent Application Undisclosed by EDGAR Online: It's often the case that companies don't disclose, discuss, or even report patent applications since there is no assurance that the patent will be granted. Or, even if the patent is granted, there is often no completed product based on the patent.

While reviewing EDGAR Online's recent data-rendering patent grants (cf. #7877678 and #7917841), I discovered that one of the cited references was UBmatrix's patent application entitled "Method For Searching Data Elements on the Web Using a Conceptual Metadata and Contextual Metadata Search Engine" (provisional application filed Sept. 27, 2004; patent application filed Sept. 27, 2005). This surprised me, since I had been focusing on EDGAR Online's data-rendering patents and UBmatrix's basic XBRL patent. Until I researched EDGAR Online's recent patent awards, I had no idea that UBmatrix had also filed the basic patent application for an XBRL search method.

For perspective and a quick look-back, it was on October 18, 2004 that UBmatrix announced the "First Web-Based Solution for XBRL-Enabled Business Reporting." Interestingly, this press release was just three weeks after UBmatrix had filed its XBRL search method provisional patent application.

What strikes me about the UBmatrix XBRL Search patent application is its extreme brevity, clarity of method description, and its limited number of claims. The patent describes an incredibly basic yet elegant method of using XBRL to get precisely the search result(s) desired. Here is the abstract of the UBmatrix search method:

"An exemplary method for searching data includes receiving a search query comprising a conceptual metadatum parameter and contextual metadata parameters, locating a first set of instance documents containing a first contextual metadatum of the contextual metadata, filtering each instance document in the first set to identify a data element in the instance document that indicates each parameter in the search query, based on definitions internal to the instance document and taxonomies or extensions associated with the instance document, and displaying the filtering results."

The patent application presents the following detailed example of the XBRL search method:

"An example search definition includes the following text elements:

[0014] Company: Microsoft

[0015] Data Concept: assets

[0016] Period: 2002-12-31

[0017] Currency: US$ (In Million: Checked)

[0018] Note that "Assets" is an XBRL Conceptual Metadata Element, while the date "2002-12-31", company name "Microsoft", and currency parameters "US$, and in Million" are XBRL Contextual Metadata Elements. FIG. 1 illustrates an exemplary method for processing this search definition to obtain search results."
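The two steps in the abstract (locate candidate instance documents by a contextual metadatum, then filter each one for the data element that matches every parameter) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not UBmatrix's implementation: the `Fact` and `Query` classes, their field names, and the sample values are all hypothetical, and a real engine would parse XBRL instance documents and resolve concepts against their taxonomies rather than compare strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One reported value plus its metadata (hypothetical, simplified)."""
    concept: str   # conceptual metadatum, e.g. "Assets"
    entity: str    # contextual metadatum: reporting company
    period: str    # contextual metadatum: period end date (ISO format)
    unit: str      # contextual metadatum: currency
    value: float   # reported amount in whole currency units


@dataclass(frozen=True)
class Query:
    """The search definition from the patent example (hypothetical layout)."""
    concept: str
    entity: str
    period: str
    unit: str
    in_millions: bool = False


def search(documents, query):
    """Return the values matching every parameter of the query."""
    # Step 1: locate instance documents by a first contextual metadatum
    # (here, the company name).
    candidates = [
        doc for doc in documents
        if any(fact.entity == query.entity for fact in doc)
    ]
    # Step 2: filter each candidate for facts matching all parameters.
    results = []
    for doc in candidates:
        for fact in doc:
            if (fact.concept.lower() == query.concept.lower()
                    and fact.entity == query.entity
                    and fact.period == query.period
                    and fact.unit == query.unit):
                scale = 1_000_000 if query.in_millions else 1
                results.append(fact.value / scale)
    return results


# Each inner list stands in for one instance document.
# The amounts are made-up placeholders, not reported figures.
documents = [
    [Fact("Assets", "Microsoft", "2002-12-31", "USD", 5_000_000.0),
     Fact("Liabilities", "Microsoft", "2002-12-31", "USD", 2_000_000.0)],
    [Fact("Assets", "ExampleCo", "2002-12-31", "USD", 7_000_000.0)],
]

query = Query("assets", "Microsoft", "2002-12-31", "USD", in_millions=True)
print(search(documents, query))
```

The point of the sketch is that the query returns a typed number tied to a concept, company, period, and unit, rather than a list of pages that happen to contain the search words.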

[Note: I searched Google for ("Microsoft" "assets" "12/31/02") and got 4.5 million search results. Even if the exact number were displayed in the results, I would have to re-key that data or copy and paste it into my system.] As the UBmatrix patent application states, "The search feature on web search engines is based on text and the presence of text elements in HTML/XML pages." Of passing interest (although not of any great import): if you enter "XBRL search" in Google's patent search engine, the only result returned is UBmatrix's patent application.

Unlike various semantic search methods that use algorithms and other complex means of determining what the user wants, the XBRL taxonomy -- by definition -- limits and precisely defines the search terms to be entered. After specifying your search query, the XBRL engine searches the web for all relevant XBRL instance documents and captures data to display the desired search results. Using various drop-down windows and lists of available search choices, the XBRL search method described by UBmatrix would be time-efficient and the results precise. What's more, the patent application notes that the XBRL engine could generate various statistical formats specified by the user and then directly transfer the search results to the user's system.

I expect that UBmatrix's data-centric XBRL search method will become a keystone of financial research worldwide, business enterprise management systems, and business intelligence applications. As noted below, UBmatrix's XBRL Engine is now in use but somewhat limited by the availability of financial data. However, starting with the SEC reports for the second quarter of 2011, all companies reporting to the SEC must file in the XBRL format and will thereby fully populate XBRL databases. No doubt some creative XBRL search applications will be developed once around 9,700 SEC-reporting companies submit their full financial statements in XBRL format, so that reported data is comparable, searchable, machine-readable, and programmable. As companies file in XBRL, UBmatrix will gradually build a massive storage database so that historical data can be extracted by simple query.

If UBmatrix's XBRL Engine becomes widely used, that search market could become an important part of EDGAR Online's data and analytics business. For the year ended Dec. 31, 2010, EDGAR Online had $19.5 million in revenue, and its current market cap is only about $40 million. I suspect the world search market for financial data may greatly expand as XBRL becomes the international standard for financial statements.
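For scale, the two figures above imply a price-to-sales multiple of roughly two. A quick check in Python, using only the article's numbers (the market cap is approximate, so the result is too):

```python
# Figures from the article, in millions of U.S. dollars.
revenue_2010 = 19.5   # revenue for the year ended 2010
market_cap = 40.0     # approximate current market capitalization

price_to_sales = market_cap / revenue_2010
print(f"Price-to-sales: {price_to_sales:.2f}x")
```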

UBmatrix's XBRL Engine: Now, 6.5 years after UBmatrix made its provisional patent filing, the UBmatrix XBRL Engine is already developed and in use, although the UBmatrix patent application has not been granted. Here are four presentations discussing UBmatrix's XBRL Engine:

"Developing and Deploying XBRL-Based Applications in an Enterprise-Class Environment" by UBmatrix (Copyright 2008) -- Note pages 3-5 for Storage, Query and Retrieval.

"Unlocking XBRL Content", An Oracle and UBmatrix Whitepaper, Sept. 2009

"In addition to XML centric access, the XBRL Storage Model provides a set of relational views. These views define a logical data model that exposes the XBRL instance documents and taxonomy in a manner that enables traditional SQL operations and analytics to be performed directly on the XBRL content. For example, the relational views provided by the XBRL Storage Model can be used to answer queries of the form “find the value for 2009 Q1 Total Revenue in Oracle’s 10-k statement”. Specialized indexing mechanisms included with the XBRL Storage Model ensure that this type of query performs in an optimal manner." [page 7 of this whitepaper]
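To make the idea of a relational view over XBRL facts concrete, here is a minimal sketch of that kind of query. It uses Python's built-in `sqlite3` module as a stand-in for Oracle Database, and the `xbrl_facts` table, its columns, and the sample values are hypothetical; they are not the actual views exposed by the XBRL Storage Model.

```python
import sqlite3

# In-memory SQLite database standing in for the XBRL repository.
conn = sqlite3.connect(":memory:")

# Hypothetical flattened view of XBRL facts: one row per reported value.
conn.execute(
    """
    CREATE TABLE xbrl_facts (
        entity        TEXT,
        concept       TEXT,
        fiscal_year   INTEGER,
        fiscal_period TEXT,
        value         REAL
    )
    """
)

# Placeholder rows; the amounts are made up, not reported figures.
conn.executemany(
    "INSERT INTO xbrl_facts VALUES (?, ?, ?, ?, ?)",
    [
        ("Oracle", "TotalRevenue", 2009, "Q1", 100.0),
        ("Oracle", "TotalRevenue", 2009, "Q2", 110.0),
        ("ExampleCo", "TotalRevenue", 2009, "Q1", 50.0),
    ],
)


def find_value(conn, entity, concept, year, period):
    """Return the value of one concept for one company and period, or None."""
    row = conn.execute(
        """
        SELECT value
        FROM xbrl_facts
        WHERE entity = ? AND concept = ?
          AND fiscal_year = ? AND fiscal_period = ?
        """,
        (entity, concept, year, period),
    ).fetchone()
    return row[0] if row else None


print(find_value(conn, "Oracle", "TotalRevenue", 2009, "Q1"))
```

Because every condition in the `WHERE` clause maps to a piece of XBRL metadata, the query returns a single value instead of a ranked list of documents.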

"XBRL Extension to Oracle Database 11g Release 2 XML DB" (Nov. 30, 2010)

"Oracle XBRL Extension leverages Oracle Database 11g Release 2 XML DB to provide an Enterprise class solution for collecting, validating, storing and analyzing XBRL content. It adds support for XBRL storage in the Oracle Database, allowing organizations to reliably and securely store, query and manage large volumes of XBRL content. When it is integrated with a 3rd party XBRL Processing Engine, Oracle XBRL Extension supports end-to-end processing and validation of XBRL content on the way into or out of the Oracle Database." [from Release]

"The XBRL Extension to Oracle Database 11g Release 2 XML DB", an Oracle Whitepaper, Jan. 2011 -- Note page 5 under "XBRL Repository Query Processing" where they discuss "ad-hoc queryability over the XBRL content."

"In addition, the XBRL repository provides a relational representational view over the content by exposing a third normal form (3NF) logical data model with a set of base entities, which are physically implemented as relational views over the XBRL documents. Specialized indexing mechanisms are used to accelerate query processing of these views comparable to a physical relational implementation. The 3NF logical data model effectively provides ad-hoc queryability over the XBRL content, for example for queries of the form “find 2009 Q1 Total Revenue in Oracle’s 10-k statement” which queries the instance document only."

Enterprise Search Engines: The Search Engine List lists 9 enterprise search engines:

  • Google (NASDAQ:GOOG)
  • Microsoft Office SharePoint (NASDAQ:MSFT) [Note: MSFT bought Powerset in 2008 for -- reportedly -- $100+ million.]
  • Northern Light (private)
  • Open Text (NASDAQ:OTEX)
  • Oracle Secure Enterprise Search (ORCL)
  • SAP AG (SAP)
  • SAIC TeraText (NYSE:SAIC)
  • Vivisimo (private)
  • ZyLAB (private)

Since UBmatrix is already embedded in Oracle and SAP AG, the U.S. patent grant for their XBRL Search Method would only strengthen their position in the XBRL search field. If the XBRL search method becomes available to non-enterprise users in web applications, the potential value of this patent application might become of special interest. In the above discussion I have referred to UBmatrix's XBRL Search Engine, not to current methods used by semantic search tools. However, Prof. Abraham Bernstein of the University of Zurich gave an intriguing presentation, "Making the Semantic Web Accessible to the Casual User" on June 25, 2008. Fortunately, XBRL web search does not have to handle the infinite technical challenges of text searches. UBmatrix first introduced XBRL in their seminal patent, and now they have this new XBRL Search Engine patent application where they access the standards-defined taxonomy to retrieve the precise search result that is desired -- a much less daunting task than that of the natural language search described by Prof. Bernstein.

XBRL and the SEC Mandate: The SEC mandated that all SEC-reporting companies file financial reports (10-Ks and 10-Qs) in the XBRL interactive data format. The SEC's phase-in schedule started with 500 of the largest companies in 2009, added 1,200 large accelerated filers in 2010, and covers the remaining 8,000+ companies in 2011, in each case starting with second-quarter reports. [See page 10 here.] For obvious reasons, the demand for XBRL search engines has not yet taken off, since the database of as-reported company data is still only partially populated. After June 2011, I would expect a sharp pick-up in interest in the UBmatrix XBRL search capability.

EDGAR Online's Recent Results: In 2010, EDGAR Online absorbed large merger expenses and had weak data and analytics sales. XBRL filing revenue increased 33% in 2010. With gross margin of 60%, the company needs top-line growth in revenue to achieve profitability.

There are compelling reasons to expect significantly higher revenue in 2011 and beyond -- tied to the full implementation of the SEC's XBRL mandate. EDGAR Online has strategic partnerships with Merrill Corp. (private), Vintage Filings, and Issuer Direct (ISDR.ob) regarding SEC-mandated mutual fund XBRL filings. And, they have corporate XBRL filing partnerships with RR Donnelley (NASDAQ:RRD), Vintage Filings, CNW, and Issuer Direct. Vintage Filings and CNW are both indirect subsidiaries of United Business Media (London). EDGAR Online has also been a longstanding proponent of having XBRL mandated for asset-backed securities and for corporate actions.

Besides the XBRL filings market, EDGAR Online has recently introduced its I-Metrix Money Market Holdings Dataset and its I-Metrix As Reported XBRL Dataset. These products will add to the company's traditional data and analytic products. EDGAR Online's flagship analytic product is I-Metrix Professional that uses XBRL data within an Excel add-in. See below for a quick I-Metrix page-view:

[Figure: I-Metrix Professional]

The management of EDGAR Online has indicated that it expects 25% annual growth in revenue in each of the next three years. Bain Capital is now the largest investor in EDGAR Online, having provided $11.2 million in net equity financing in early 2010.
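To see what that guidance would imply, here is a rough projection in Python. It assumes the 25% growth compounds from the $19.5 million of 2010 revenue cited earlier and holds in each of the three years; both the compounding and the base year are my reading of the guidance, and the `project_revenue` helper is purely illustrative.

```python
def project_revenue(base, growth_rate, years):
    """Return projected revenue for each year under constant compound growth."""
    return [base * (1 + growth_rate) ** n for n in range(1, years + 1)]


# Base: 2010 revenue in millions of U.S. dollars, from the article.
projection = project_revenue(19.5, 0.25, 3)

for offset, revenue in enumerate(projection, start=1):
    print(f"{2010 + offset}: ${revenue:.1f} million")
```

On those assumptions, revenue would roughly double, reaching about $38 million in the third year.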

On May 28, 2011 Robert J. Farrell became the company's new President & CEO, having recently been Chairman & CEO of Metastorm (recently sold to Open Text). Farrell's background is highly relevant to EDGAR Online's enterprise, data, and analytics business. He is also an expert in the field of business process management, an attribute that should prove helpful in developing EDGAR Online's new business process outsourcing deal with SunGard for XBRL filings. And, his Metastorm background should provide excellent credentials for understanding the potential uses of XBRL within the areas of enterprise performance management and both internal and external financial reporting.

EDGAR Online common stock is a pure play on XBRL, and the company may be a major beneficiary of new XBRL reporting companies in 2011 and beyond. Although Issuer Direct Corporation (ISDR.ob 0.26) is a partner of EDGAR Online and also has XBRL filing activities, its market cap is under $5 million and its shares are closely held.

Disclosure: I am long both EDGR and ISDR.ob.