A Metadata-Driven Approach to Computing Financial Analytics in a Relational Database


David Rozenshtein and Sandip K. Mehta
Long Island University and Reuters, USA
Reuters, USA

  1. 1. A Metadata-Driven Approach to Computing Financial Analytics in a Relational Database, by David Rozenshtein, PhD, and Sandip K. Mehta. Reuters, October 2006.
  2. 2. What are Financial Analytics? <ul><li>Algebraic formulas: </li></ul><ul><ul><li>Calculate financial results for companies, instruments, indices, industries, etc. </li></ul></ul><ul><li>Inputs: </li></ul><ul><ul><li>Financial metrics: items from financial statements, estimates, prices, interest rates, etc. </li></ul></ul><ul><ul><li>Values computed for other analytics. </li></ul></ul><ul><ul><li>Constants. </li></ul></ul><ul><li>Operations: </li></ul><ul><ul><li>Mathematical and logical operators and functions. </li></ul></ul>
  3. 3. Approaches to Financial Analysis Systems <ul><li>Typical approach: </li></ul><ul><ul><li>“Hardcode” analytic formulas into the system’s source code. </li></ul></ul><ul><ul><li>Important disadvantage: the source code must change whenever analytics are added, deleted or modified. </li></ul></ul><ul><li>Better approach: </li></ul><ul><ul><li>Represent analytic formulas as “metadata.” </li></ul></ul><ul><ul><li>Build the analytic system as a formula interpreter. </li></ul></ul><ul><ul><li>Analytics can now be added, deleted or modified without any changes to the analytic system’s source code. </li></ul></ul>
  4. 4. Where and How to Build the Interpreter? <ul><li>Possibility 1: </li></ul><ul><ul><li>Build the interpreter in a standard 3GL: C++, C#, Java, etc. </li></ul></ul><ul><ul><li>Advantage: </li></ul></ul><ul><ul><ul><li>Well-known algorithms exist. </li></ul></ul></ul><ul><ul><li>Problem: </li></ul></ul><ul><ul><ul><li>Financial data is in the database. </li></ul></ul></ul><ul><ul><ul><li>Analytic results should be placed in the database as well. </li></ul></ul></ul><ul><ul><ul><li>The interpreter itself would be outside the database layer. </li></ul></ul></ul><ul><ul><ul><li>Too much data movement between the database and the interpreter. </li></ul></ul></ul><ul><ul><ul><li>Too slow! </li></ul></ul></ul>
  5. 5. Where and How to Build the Interpreter? <ul><li>Possibility 2: </li></ul><ul><ul><li>Build the interpreter in SQL. </li></ul></ul><ul><ul><li>Advantage: </li></ul></ul><ul><ul><ul><li>Both data and computation are now completely within the database layer. </li></ul></ul></ul><ul><ul><li>Issues: </li></ul></ul><ul><ul><ul><li>How to represent formulas as data? </li></ul></ul></ul><ul><ul><ul><li>How to actually code the interpreter? </li></ul></ul></ul><ul><ul><ul><li>How to make all of this very efficient? </li></ul></ul></ul>
  6. 6. Building the Formula Interpreter in SQL <ul><li>Representing formulas: </li></ul><ul><ul><li>The set of analytic formulas is represented as an annotated directed acyclic graph (ADAG). </li></ul></ul><ul><ul><li>The nodes of the ADAG represent inputs and outputs of the formulas. </li></ul></ul><ul><ul><li>The edges represent operations and functions. </li></ul></ul><ul><ul><li>The ADAG itself is stored as a table in the database. </li></ul></ul><ul><li>Implementing the interpreter: </li></ul><ul><ul><li>The interpreter is implemented as a special kind of graph traversal algorithm. </li></ul></ul>
  7. 7. Standard SQL Graph Traversal Algorithms <ul><li>Standard graph traversal algorithms are depth-first. </li></ul><ul><li>Not suitable for SQL: </li></ul><ul><ul><li>They lead to “one/few rows at a time” computation. </li></ul></ul><ul><ul><li>Too slow! </li></ul></ul>
  8. 8. A Novel SQL Graph Traversal Algorithm <ul><li>We have developed a novel breadth-first graph traversal algorithm. </li></ul><ul><li>Very good for SQL: </li></ul><ul><ul><li>Our system computes multiple analytic formulas for all business entity/fiscal period combinations at once. </li></ul></ul><ul><ul><li>The number of SQL statement executions is completely independent of the number of business entities or fiscal periods involved; it is proportional only to the depth of the analytic graph, which is usually logarithmic in the number of distinct financial metrics and analytic formulas. </li></ul></ul><ul><ul><li>Very efficient! </li></ul></ul>
  9. 9. The System and the Paper <ul><li>This paper describes an actual system built at Reuters. </li></ul><ul><ul><li>Computes at the rate of approx. 12 thousand analytic formulas per second. </li></ul></ul><ul><ul><li>Runs on an approx. $10,000 computer. </li></ul></ul><ul><li>The system supports arithmetic and logical formulas over numerics. </li></ul><ul><ul><li>For arithmetic formulas, it supports the arithmetic operators +, -, * and /, and a limited number of function symbols. </li></ul></ul><ul><ul><li>For logical formulas, it supports the binary comparators =, !=, <, <=, > and >=, and the logical operators NOT, AND and OR. </li></ul></ul><ul><ul><li>Currently, the system is limited to formulas over numeric values only; however, it is trivial to extend it to other data types. </li></ul></ul><ul><li>Due to space limitations, the paper presents only the arithmetic component of the interpreter. </li></ul>
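
The ideas on slides 6 to 8 (formulas stored as a table, evaluated breadth-first with set-based SQL) can be illustrated with a small sketch. This is not the Reuters system and not necessarily the paper's ADAG encoding: the table names `fact` and `adag`, the one-row-per-binary-formula layout, the `level` column, the sample metrics, and the use of Python's `sqlite3` module are all assumptions made for illustration. The point it tries to show is that one `INSERT ... SELECT` per graph level computes every formula at that level for every entity and period, so the statement count depends only on the graph depth.

```python
import sqlite3

# Hypothetical formulas, each of the form: target = lhs op rhs.
FORMULAS = [
    ("gross_profit", "-", "revenue", "cogs"),
    ("gross_margin", "/", "gross_profit", "revenue"),
    ("eps", "/", "net_income", "shares"),
]

# Hypothetical input metrics: (entity, period, node, value).
INPUTS = [
    ("ACME", "FY2005", "revenue", 200.0),
    ("ACME", "FY2005", "cogs", 150.0),
    ("ACME", "FY2005", "net_income", 20.0),
    ("ACME", "FY2005", "shares", 10.0),
    ("GLOBEX", "FY2005", "revenue", 400.0),
    ("GLOBEX", "FY2005", "cogs", 100.0),
    ("GLOBEX", "FY2005", "net_income", 60.0),
    ("GLOBEX", "FY2005", "shares", 30.0),
]

# One set-based statement evaluates every formula at a given level,
# for all entity/period combinations at once.
LEVEL_SQL = """
INSERT INTO fact (entity, period, node, val)
SELECT l.entity, l.period, g.target,
       CASE g.op
            WHEN '+' THEN l.val + r.val
            WHEN '-' THEN l.val - r.val
            WHEN '*' THEN l.val * r.val
            WHEN '/' THEN l.val / r.val
       END
FROM adag AS g
JOIN fact AS l ON l.node = g.lhs
JOIN fact AS r ON r.node = g.rhs
              AND r.entity = l.entity
              AND r.period = l.period
WHERE g.level = ?
"""


def assign_levels(formulas):
    """Return {target: level}, where raw inputs count as level 0."""
    targets = {target for target, _, _, _ in formulas}
    levels = {}
    pending = list(formulas)
    while pending:
        remaining = []
        for target, op, lhs, rhs in pending:
            deps = (lhs, rhs)
            # Ready when every operand is a raw input or already levelled.
            if all(d in levels or d not in targets for d in deps):
                levels[target] = 1 + max(levels.get(d, 0) for d in deps)
            else:
                remaining.append((target, op, lhs, rhs))
        if len(remaining) == len(pending):
            raise ValueError("formula graph contains a cycle")
        pending = remaining
    return levels


def build(formulas, inputs):
    """Create an in-memory database holding the inputs and the formula graph."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fact (entity TEXT, period TEXT, node TEXT, val REAL)")
    conn.execute(
        "CREATE TABLE adag (target TEXT, op TEXT, lhs TEXT, rhs TEXT, level INTEGER)"
    )
    conn.executemany("INSERT INTO fact VALUES (?, ?, ?, ?)", inputs)
    levels = assign_levels(formulas)
    conn.executemany(
        "INSERT INTO adag VALUES (?, ?, ?, ?, ?)",
        [(t, op, lhs, rhs, levels[t]) for t, op, lhs, rhs in formulas],
    )
    return conn


def evaluate(conn):
    """Evaluate level by level; return the number of computing statements run."""
    max_level = conn.execute("SELECT COALESCE(MAX(level), 0) FROM adag").fetchone()[0]
    for level in range(1, max_level + 1):
        conn.execute(LEVEL_SQL, (level,))
    return max_level


conn = build(FORMULAS, INPUTS)
statements = evaluate(conn)
results = {
    (entity, node): val
    for entity, node, val in conn.execute("SELECT entity, node, val FROM fact")
}
print(statements, results[("ACME", "gross_margin")])
```

In this sketch, adding more entities or periods to `INPUTS` adds rows to `fact` but leaves the number of evaluation statements unchanged; only a deeper formula graph increases it. Adding or changing an analytic means editing rows in `adag`, not the evaluation code.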