A New Approach to Resolve
Object-Relational Impedance
Mismatch
Henrietta Dombrovskaya
Enova
Chicago IL
hdombrovskaya@enova...
What is Object-Relational
Impedance Mismatch and why is it
bad?
Why should we care?
We should care because we care
about.....
Database Application Performance
Everybody wants their application to function efficiently,
right?
After all – why choose ...
THE DATABASE IS
SLOW
WHY???
2013 Henrietta Dombrovskaya – Enova
Let’s see…
Where do we usually start looking when we want
to see why database performance is
bad?...
Our plan of action
Look at the longest-running queries and try to
optimize them
Look at the queries, which are executed...
Helpful reports:
Our performance monitoring tool (pgBadger) produces
several classes...
Based on these reports - is the database
really slow?
The top 50 most offensive queries, i.e. the queries
which eat up the m...
You may say – that’s how many times the
screen is refreshed, but…
UK: account home controller was accessed about 50,000
tim...
How could this possibly be happening?!
Let’s take a step back…
First came a program…
•  Once upon a time there was a program…
•  Until mid-60’s – only sequential access storage was avai...
Since Then We Have…
- Imperative programming languages, which tell how to do
things:
for (i:=1, i++, n) do ...
Database provides:
•  Independent data storage
•  Complex data structures
•  Data integrity
•  Concurrent access and data ...
So…
Now everybody is happy, right?
…not really…
Why? Because…
Both imperative programming languages and
declarative query languages work perfectly to
accomplish the tasks...
What is Impedance Mismatch?
These tools were designed to operate with objects of different size
(granularity): variables v...
When We Have an OO Application…
•  During one screen rendering: objects may belong to
different classes with different acc...
ORIM Definition
“A set of conceptual and technical difficulties that is
encountered when a relational database management
...
Development Standards & Performance
Internal company standards and/or industry standards may
prevent from creating efficie...
Translation:
•  As a database developer you would assume, that to get
all payments for all loans for a given customer, you...
What Do We Have at Enova?
Enova is a Ruby on Rails shop that uses the
ActiveRecord Object Relational Mapping (ORM) library
to...
How ActiveRecord Works
What This Means for Application/
Database Interaction
Due to the lack of awareness of the underlying
database interaction ...
Account Presenter
def initialize(customer)
@customer = customer
@customer_extra = customer.customer_extra
@person = custom...
Corresponding Application Log
SELECT * FROM customers WHERE (customers.id =...
But Wait, There’s More!
SELECT * FROM approvals WHERE (customer_id = 12470535) ORDER BY proc...
What We Should See Instead…
SELECT * FROM customers WHERE (cu...
Why Does This Happen?
… because the app is written in an imperative way:
“First we need to select all loans,
then for each ...
There Are Some Ways To Improve
Eager loading generates almost correct SQL:
SELECT customers."id"
,customers."created_by"
<...
But Then We Apply the Next Method
has_many :bank_accounts, :foreign_key => 'person_id' do
def default_for_customer(custome...
If We Continue With Existing Frameworks
… this problem (ORIM) will never be solved, and
we will continue to lose money on...
New Approach to Resolve Object-Relational
Impedance Mismatch
Our new approach allows us to drastically reduce the
number of call...
The old methods are unaware of the data
which is already available – make them
data-aware!
Isn’t this contrary to the stan...
Logic Split Methodology
Disassemble the method into atomic steps
Identify the ones that require data retrieval
Using the knowle...
Example: Amount_Outstanding
Definitions
AccountsOutstanding=
AccountsUncollected +
FeesOutstanding +
InterestOutstanding +
PrincipalAccounts +
Account...
Under the Hood: Database Calls
SELECT vl.value AS account ...
Drawbacks
The method itself would allow retrieving all the information
related to one loan “in one shot”.
However, because...
pgBadger Log
Modified Method
Under the Hood: Database Calls
SELECT loan_id ,...
Execution Statistics From Dark Testing: Old
… and New
So… now everybody is happy?...
…not really…
What do application developers say at
this point?....
Wait! What about the busine...
What About the Business Logic?
We need some business logic to know how
to execute joins and selects – just enough
to minim...
One More Time
ü Disassemble method into atomic steps,
ü Identify ones which require data retrieval
ü Using knowledge ab...
Let’s review another example – account
balance calculation for OEC loans.
Account_Balance Method for OEC
Ø Obtain account principal balance
Ø Obtain outstanding fees and interest as of
next paym...
Traditional Object-Oriented Approach
Create Account_Balance method which would call the
following methods:
Ø Principal_Ba...
Drilling Down Into Each Of the Steps
1.  Principal balance as described for account_outstanding requires a
single database...
Combining Steps With Data Retrieval
Those steps are: 1,2,3, 3.1, 3.2, 3.3.1, 4,
which means that the database accessing
me...
SELECT statement
SELECT l.id AS loan_id
,sum( CASE WHEN debit_account_cd = 'principal'
AND t.acct_date <= v_current_date
TH...
Ruby Method: account_balance
def account_balance
amount_payable = amount_outstanding + amount_charged_off
amt = amount_pay...
Ruby Method: unearned_interest
def unearned_interest
amt = -1 * (principal_amount + [(customer_balance -
interest_provisio...
How It Works
This method is used as a callback in a more complex method,
which returns Loan_Summary where most of the val...
LOAN_ID
FUNDING_DATE
FUNDING_AMOUNT
FEE
STATE
APR
LOAN_STATUS
AMOUNT_DUE
ACCOUNT_...
On a Larger Scale
In our new application a single Postgres
function retrieves summaries for all
customer loans.
Here is so...
Number of Database Calls/Customer
[Chart: categories 1-9; Old App avg_calls...]
Execution Time/Customer
[Chart: vertical axis 0-10, categories 1-9; New ...]
Now
•  We save both time
•  And money!
Were we the first to notice the problem?
Definitely NOT!
Related Work
The problem of object-relational impedance
mismatch is a constant discussion topic,
when it comes to developi...
First Came HORA
HORA (Hybrid Object-Relational Architecture)
approach was first introduced in 1993 and became
a foundation...
Hibernate
Hibernate is a high-performance Object/Relational
persistence and query service, considered one of the
most flex...
Hibernate Developers Claim
•  In contrast to other solutions it does not hide
“the power of SQL” from developers.
•  This ...
The Caveat
•  Creation of complex queries using Hibernate is
not an easy task
•  Similar to other ORM systems, Hibernate
p...
Newer Version of Active Record
•  Allows eager loading and some custom
queries, but has the same limitations as
Hibernat...
Agile Technology
•  acknowledges the existence of ORIM ;
•  not only acknowledges the technical impedance
mismatch, but al...
What’s next?
Future Work
Rewriting a whole application using our new
methodology of splitting logic between an application and
a databa...
Moving In Two Directions:
ü Continue to rewrite the larger parts of our
applications, shooting for having only a
couple o...
Acknowledgements
Many thanks to:
•  Enova CIO/CTO Fred Lee
•  Team Raven, and everybody who worked on the Aperture
project ...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open


  1. A New Approach to Resolve Object-Relational Impedance Mismatch. Henrietta Dombrovskaya, Enova, Chicago IL, hdombrovskaya@enova.com
  2. What is Object-Relational Impedance Mismatch and why is it bad? Why should we care? We should care because we care about...
  3. Database Application Performance. Everybody wants their application to function efficiently, right? After all, why choose to use databases in the first place? Because the DBMS is specialized software designed to manage data in the most efficient way. Nevertheless, the most common complaint of application developers is…
  4. THE DATABASE IS SLOW. WHY???
  5. Let’s see… Where do we usually start looking when we want to see why database performance is bad?
  6. Our plan of action:
     • Look at the longest-running queries and try to optimize them.
     • Look at the queries that are executed most often, and make sure all of them are tuned and/or running fast enough.
     Everything looks good, so why is the app still slow?
  7. Helpful reports: our performance monitoring tool (pgBadger) produces several classes of reports, among them:
     • Longest-running queries (slowest queries)
     • Most frequently running queries
     • Queries that take up the most time (top offenders)
  8. Based on these reports, is the database really slow? The top 50 most offensive queries, i.e. the queries that eat up most of the execution time, are almost exclusively very short (fast) queries that are executed too many times. For example, SELECT * FROM loans WHERE id=? is executed 8,500,000 times during the day, each time taking several milliseconds, with a total execution time of about 2.5 hours.
  9. You may say that’s how many times the screen is refreshed, but… UK: the account home controller was accessed about 50,000 times during the day, and the customer support tool even less. However, the pgBadger log shows a total of almost 1.5 million executions of different versions of the SELECT sum_accounts statement and a similar number of SELECT * FROM loans. Statistics show that some application controllers produce over 1,000 database calls for each screen refresh.
  10. How could this possibly be happening?! Let’s take a step back…
  11. First came a program…
      • Once upon a time there was a program…
      • Until the mid-60s only sequential access storage was available; a program was considered a stable object, which may process different input data and produce different output.
      • Direct access storage appeared in the early 70s, allowing concurrent data access and selective updates. At the same time developers started to face the rising complexity and costs of software development, concurrency control and data integrity problems, and the need for specialized data-management software became apparent.
      • DBMSs emerged as specialized programs for centralized data management.
      [Diagram labels: Input, Program, Output, DBMS]
  12. Since Then We Have…
      • Imperative programming languages, which tell how to do things: for (i:=1, i++, n) do … end;
      • Declarative data manipulation languages, which define what to do: SELECT first_name, last_name FROM people WHERE id=101
  13. Database provides:
      • Independent data storage
      • Complex data structures
      • Data integrity
      • Concurrent access and data consistency
      Application provides:
      • User interface
      • Application control flow
  14. So… Now everybody is happy, right? …not really…
  15. Why? Because… Both imperative programming languages and declarative query languages work perfectly to accomplish the tasks they were designed to accomplish. The problems start when we try to make them work together.
  16. What is Impedance Mismatch? These tools were designed to operate with objects of different size (granularity): variables vs. tables, and with different means of identification: address vs. set of attribute values. More precisely:
      • Object incompatibility: database objects are “too big” for the imperative programming language.
      • This means that classes and methods for tables and sets should be integrated with the database to work effectively (methods should be executed by the DB engine).
      • But most architectures do not allow such integration, which leads to reimplementation of database operations at the application layer.
  17. When We Have an OO Application…
      • During one screen rendering, objects may belong to different classes with different access methods.
      • Object-Relational Mapping (ORM) maps a database object to the in-memory application object:
        – it solves the problem of abstraction from the details of data storage
        – it does not provide effective means of manipulating data sets
      This particular case of impedance mismatch is called ORIM: object-relational impedance mismatch.
  18. ORIM Definition: “A set of conceptual and technical difficulties that is encountered when a relational database management system (RDBMS) is being used by a program written in an object-oriented programming language or style, particularly apparent when objects or class definitions are mapped in a straightforward way to database tables or relational schema” (Ambler, S., Agile Database Techniques: Effective Strategies for the Agile Software Developer)
  19. Development Standards & Performance. Internal company standards and/or industry standards may prevent us from creating efficient database applications. Typical OO design recommendations clearly separate the following layers of classes:
      • End-user interface
      • Business logic
      • Relationships
      • Interaction with the database
      The standard allows limited interaction between adjacent levels; therefore equivalents of database operations may be implemented within several different processes, sometimes on different machines.
  20. Translation:
      • As a database developer you would assume that to get all payments for all loans for a given customer, you need to join the loans table with the payments table.
      • But for an application developer loans and payments belong to different classes, therefore they are accessed by different sets of methods.
      • Which means that the join operation is replicated “manually”.
      Let’s look at some examples…
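To make the "manually replicated join" concrete, here is a minimal sketch in plain Ruby. It does not use ActiveRecord or a real database: `FakeDb`, its methods, and the sample rows are all hypothetical, and each method call stands in for one round trip to the database.

```ruby
# Hypothetical in-memory "database": each public method call counts as one round trip.
class FakeDb
  attr_reader :calls

  def initialize(loans, payments)
    @loans = loans
    @payments = payments
    @calls = 0
  end

  # One round trip: all loans of a single customer.
  def loans_for(customer_id)
    @calls += 1
    @loans.select { |l| l[:customer_id] == customer_id }
  end

  # One round trip: all payments of a single loan.
  def payments_for(loan_id)
    @calls += 1
    @payments.select { |p| p[:loan_id] == loan_id }
  end

  # One round trip: the join happens "inside the database".
  def payments_for_customer(customer_id)
    @calls += 1
    loan_ids = @loans.select { |l| l[:customer_id] == customer_id }.map { |l| l[:id] }
    @payments.select { |p| loan_ids.include?(p[:loan_id]) }
  end
end

LOANS = [
  { id: 1, customer_id: 10 }, { id: 2, customer_id: 10 }, { id: 3, customer_id: 11 }
].freeze
PAYMENTS = [
  { loan_id: 1, amount: 50 }, { loan_id: 2, amount: 75 }, { loan_id: 3, amount: 20 }
].freeze

# Object-by-object access: the application walks the loans and asks for payments one loan at a time.
manual_db = FakeDb.new(LOANS, PAYMENTS)
manual = manual_db.loans_for(10).flat_map { |loan| manual_db.payments_for(loan[:id]) }

# Set-based access: a single call returns the same rows.
joined_db = FakeDb.new(LOANS, PAYMENTS)
joined = joined_db.payments_for_customer(10)

puts "manual: #{manual_db.calls} calls, joined: #{joined_db.calls} call"
```

With the object-by-object style the number of calls grows with the number of loans (one call for the loans plus one per loan), while the set-based call stays at one regardless of how many loans the customer has.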
  21. What Do We Have at Enova? Enova is a Ruby on Rails shop that uses the ActiveRecord Object-Relational Mapping (ORM) library to communicate with a Postgres database. With ActiveRecord a database table or view is mapped into a class and an object instance is tied to a single row in the table. ActiveRecord is the default “model” component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications.
  22. How ActiveRecord Works
  23. What This Means for Application/Database Interaction. Due to the lack of awareness of the underlying database interaction on the part of the object methods, one controller performs multiple trips to the database. For example…
  24. Account Presenter
      def initialize(customer)
        @customer = customer
        @customer_extra = customer.customer_extra
        @person = customer.person
        @address = customer.person.try(:address)
        @company = customer.person.try(:company)
        @bank_account = customer.bank_account(true)
        @debit_card = customer.debit_card
        @customer_paydate = customer.customer_paydate(true)
        @paydate_schedule = customer.customer_paydate.try(:paydate_schedule)
        @customer_source = customer.customer_source
      end
  25. Corresponding Application Log
      SELECT * FROM customers WHERE (customers.id = 12470535)
      SELECT * FROM people WHERE (people.id = 61657007 AND (type = 'CustomerPerson')) AND ( (people.type = 'CustomerPerson' ) );
      SELECT addresses.*, people_addresses.serial_number FROM addresses INNER JOIN people_addresses ON addresses.id = people_addresses.address_id WHERE (people_addresses.person_id = 61657007 AND (eff_end_date is NULL));
  26. But Wait, There’s More!
      SELECT * FROM approvals WHERE (customer_id = 12470535) ORDER BY processed_on desc LIMIT 1
      SELECT * FROM customers WHERE (customers.id = 12470535)
      SELECT * FROM loans WHERE (loans.id = 25563928)
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold')) ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved',E'on_hold')) ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('issued','issued_pmt_proc')) ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved',E'on_hold')) ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM customers WHERE (customers.person_id = 61657007)
      SELECT count(*) AS count_all FROM loans WHERE (loans.customer_id = 12470535 AND (status_cd in ('applied','approved','on_hold','issued','issued_pmt_proc') and loan_type_cd = 'installment'))
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold')) ORDER BY funding_date DESC LIMIT 1
      SELECT * FROM loans WHERE (customer_id = 12470535 and status_cd in ('applied','approved','on_hold')) ORDER BY funding_date DESC LIMIT 1
  27. What We Should See Instead…
      SELECT * FROM customers WHERE (customers."id" = 12470535)
      SELECT * FROM people WHERE (people."id" = 61657007 AND (type = 'CustomerPerson')) AND ( (people."type" = 'CustomerPerson' ) );
      SELECT addresses.*, people_addresses.serial_number FROM addresses INNER JOIN people_addresses ON addresses.id = people_addresses.address_id WHERE (people_addresses.person_id = 61657007 AND (eff_end_date is NULL));
  28. Why Does This Happen? … because the app is written in an imperative way: “First we need to select all loans, then for each loan we need to select…”
  29. There Are Some Ways To Improve. Eager loading generates almost correct SQL:
      SELECT customers."id"
            ,customers."created_by" <..>
            ,customers."person_id" <..>
            ,customers_extra."estimated_monthly_living_expense" <..>
            ,people."type" <...>
            ,addresses."id" <...>
            ,companies."id" <...>
      FROM customers
      LEFT OUTER JOIN customers_extra ON customers_extra.customer_id = customers.id
      LEFT OUTER JOIN people ON people.id = customers.person_id AND people."type" = 'CustomerPerson'
      LEFT OUTER JOIN people_addresses ON people_addresses.person_id = people.id
      LEFT OUTER JOIN addresses ON addresses.id = <…>
      WHERE (customers."id" = 17674188)
  30. But Then We Apply the Next Method
      has_many :bank_accounts, :foreign_key => 'person_id' do
        def default_for_customer(customer, reload = false)
          @bank_account_cache = {} if @bank_account_cache.nil? || reload
          @bank_account_cache[customer.brand.code] ||= find(:first,
            :joins => "INNER JOIN payment_instruments.defaults pid ON pid.payment_instrument_id = payment_instruments.bank_accounts.payment_instrument_id",
            :conditions => ["pid.customer_id = ?", customer],
            :readonly => false)
        end
      end
      And we execute SELECT again!
  31. If We Continue With Existing Frameworks… this problem (ORIM) will never be solved, and we will continue to lose money on timeouts!
  32. New Approach to Resolve Object-Relational Impedance Mismatch. Our new approach allows us to drastically reduce the number of calls to the database and get exactly the data we need. An application view can be rendered using one or two calls to the database instead of the 500-900 calls in the old app. In addition, it allows us to optimize query execution without any changes to the application and/or models.
  33. The old methods are unaware of the data that is already available: make them data-aware! Isn’t this contrary to the standard OO approach? Yes, but… this is the only way to improve the App/DB interaction.
  34. Logic Split Methodology
      • Disassemble the method into atomic steps
      • Identify the ones that require data retrieval
      • Using the knowledge about database object relationships, construct a single query
      • Execute
      • Use the retrieved data in the other steps
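The steps above can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: `FETCH_LOAN_FIGURES` is a hypothetical stub standing in for the single constructed query, `LoanCalculator` and its formula are invented for the example, and the numbers are made up.

```ruby
# Hypothetical stand-in for the single query built from the data-retrieval steps.
# In a real application this would execute one SQL statement; here it returns fixed values.
FETCH_LOAN_FIGURES = lambda do |loan_id|
  { loan_id: loan_id, principal: 400.0, fees: 15.0, interest: 12.5, customer_balance: 20.0 }
end

class LoanCalculator
  attr_reader :db_calls

  def initialize(fetcher)
    @fetcher = fetcher
    @db_calls = 0
    @figures = {}
  end

  # The only step that touches the database; the result is kept per loan
  # so that later steps are "data-aware" and never ask again.
  def figures(loan_id)
    @figures[loan_id] ||= begin
      @db_calls += 1
      @fetcher.call(loan_id)
    end
  end

  # Steps that need no retrieval work on the data already in memory.
  def charges(loan_id)
    f = figures(loan_id)
    f[:fees] + f[:interest]
  end

  # Illustrative formula only, not a business rule from the slides.
  def balance(loan_id)
    f = figures(loan_id)
    f[:principal] + charges(loan_id) - f[:customer_balance]
  end
end

calc = LoanCalculator.new(FETCH_LOAN_FIGURES)
puts calc.balance(42)
puts calc.db_calls
```

The design choice is that retrieval is isolated in one method and everything else is pure calculation, so adding more derived values does not add database calls.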
  35. Example: Amount_Outstanding
  36. Definitions: AccountsOutstanding = AccountsUncollected + FeesOutstanding + InterestOutstanding + PrincipalAccounts + AccountsDue. In turn:
      • AccountsUncollected = uncollected_principal + uncollected_installment_principal
  37. Under the Hood: Database Calls
      SELECT vl.value AS account
            ,SUM(CASE vl.value WHEN pt.debit_account_cd THEN pt.amount ELSE 0 END)
             - SUM(CASE vl.value WHEN pt.credit_account_cd THEN pt.amount ELSE 0 END) AS sum
      FROM payment_transactions pt
      JOIN valuelists vl ON vl.type_cd = 'transaction_account'
           AND vl.value IN (pt.debit_account_cd, pt.credit_account_cd)
           AND loan_id=?
      … and then the value for a specific account is selected.
  38. Drawbacks. The method itself would allow retrieving all the information related to one loan “in one shot”. However, because the application developers are unaware of the underlying layers, there appears to be no difference:
      – whether we obtain the values of all account balances one by one, following normal object-oriented method logic in an imperative way,
      – or whether we obtain them all simultaneously.
      Now for some results:
  39. pgBadger Log
  40. Modified Method
  41. Under the Hood: Database Calls
      SELECT loan_id
            ,sum(CASE WHEN debit_account_cd = 'uncollected_principal' THEN pt.amount ELSE 0 END
               - CASE WHEN credit_account_cd = 'uncollected_principal' THEN pt.amount ELSE 0 END) AS uncollected_principal
            <...>
            ,sum(CASE WHEN debit_account_cd = 'uncollected_nsf_fees' THEN pt.amount ELSE 0 END
               - CASE WHEN credit_account_cd = 'uncollected_nsf_fees' THEN pt.amount ELSE 0 END) AS uncollected_nsf_fees
            ,sum(CASE WHEN debit_account_cd = 'installment_principal' THEN pt.amount ELSE 0 END
               - CASE WHEN credit_account_cd = 'installment_principal' THEN pt.amount ELSE 0 END) AS installment_principal
      FROM payment_transactions_committed pt
      INNER JOIN loans l ON l.id = pt.loan_id
      WHERE customer_id = {?}
      GROUP BY loan_id
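For readers less familiar with conditional aggregation, the query above computes, per loan, "debits minus credits" for every account in a single pass. A rough in-memory Ruby equivalent is sketched below; the sample rows and the helper name `balances_by_loan` are hypothetical, and only the column names are taken from the slide.

```ruby
# Hypothetical rows shaped like payment_transactions_committed:
# each transaction debits one account and credits another for a given loan.
TRANSACTIONS = [
  { loan_id: 1, debit_account_cd: 'uncollected_principal', credit_account_cd: 'installment_principal', amount: 100.0 },
  { loan_id: 1, debit_account_cd: 'uncollected_nsf_fees',  credit_account_cd: 'uncollected_principal', amount: 30.0 },
  { loan_id: 2, debit_account_cd: 'installment_principal', credit_account_cd: 'uncollected_nsf_fees',  amount: 10.0 }
].freeze

ACCOUNTS = %w[uncollected_principal uncollected_nsf_fees installment_principal].freeze

# Groups rows by loan and, for every account, sums (debit amount - credit amount),
# mirroring sum(CASE WHEN debit ... END - CASE WHEN credit ... END) ... GROUP BY loan_id.
def balances_by_loan(transactions, accounts)
  transactions.group_by { |t| t[:loan_id] }.transform_values do |rows|
    accounts.to_h do |account|
      net = rows.sum do |t|
        debit  = t[:debit_account_cd]  == account ? t[:amount] : 0
        credit = t[:credit_account_cd] == account ? t[:amount] : 0
        debit - credit
      end
      [account, net]
    end
  end
end

result = balances_by_loan(TRANSACTIONS, ACCOUNTS)
result.each { |loan_id, balances| puts "#{loan_id}: #{balances}" }
```

Each loan ends up as one row with one column per account, which is why a single call can replace one call per account per loan.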
  42. Execution Statistics From Dark Testing: Old
  43. … and New
  44. So… now everybody is happy? …not really… What do application developers say at this point? Wait! What about the business logic?!
  45. What About the Business Logic? We need some business logic to know how to execute joins and selects: just enough to minimize the number of db calls. Transformations and manipulations of the selected results do not have to be executed on the database side.
  46. One More Time
      ✓ Disassemble the method into atomic steps
      ✓ Identify the ones that require data retrieval
      ✓ Using knowledge about database object relationships, construct a single query
      ✓ Execute
      ✓ Use the retrieved data in the other steps
  47. Let’s review another example: account balance calculation for OEC loans.
  48. Account_Balance Method for OEC
      1. Obtain the account principal balance
      2. Obtain outstanding fees and interest as of the next payment due date
      3. Calculate the interest credit (unearned interest) for the number of days left before the payment due date
      4. Obtain the existing customer balance
      5. Calculate the total account balance using the values obtained in steps 1-4
  49. Traditional Object-Oriented Approach. Create an Account_Balance method which would call the following methods:
      ➢ Principal_Balance
      ➢ Interest_Amount
      ➢ Fees_Amount
      ➢ Customer_Balance
      ➢ Interest_Credit
      Each of these methods would interact with the database independently.
  50. Drilling Down Into Each Of the Steps
      1. Principal balance, as described for account_outstanding, requires a single database call.
      2. Outstanding interest and fees require one database call each.
      3. Interest credit calculation:
         3.1. Obtain the daily interest rate for this customer
         3.2. Obtain the base amount, which is used to calculate the total interest
         3.3. Obtain the number of days for which the interest should be credited:
              3.3.1. Obtain the next payment due date
              3.3.2. Calculate the number of days based on the obtained date and today’s date
         3.4. Calculate the amount of credit, based on the results of the previous three steps
      4. Customer balance can be obtained using one database call, same as steps 1-3.
  51. Combining Steps With Data Retrieval. Those steps are 1, 2, 3, 3.1, 3.2, 3.3.1 and 4, which means that the database-accessing method should execute the following task: for a given loan, retrieve the payment transactions that show principal balance, current interest, fees and customer balance; also retrieve the loan’s daily interest rate and next payment due date.
  52. SELECT statement
      SELECT l.id AS loan_id
            ,sum(CASE WHEN debit_account_cd = 'principal' AND t.acct_date <= v_current_date THEN t.amount ELSE 0 END
               - CASE WHEN credit_account_cd = 'principal' AND t.acct_date <= v_current_date THEN t.amount ELSE 0 END) AS amount_payable
            ,sum(CASE WHEN t.debit_account_cd = 'fees_provisional' THEN t.amount ELSE 0 END
               - CASE WHEN t.credit_account_cd = 'fees_provisional' THEN t.amount ELSE 0 END) AS fees_provisional
            ,sum(CASE WHEN t.debit_account_cd = 'interest_provisional' THEN t.amount ELSE 0 END
               - CASE WHEN t.credit_account_cd = 'interest_provisional' THEN t.amount ELSE 0 END) AS interest_provisional
            ,st.end_date AS next_closing_date
            ,l.daily_rate AS interest_rate
            ,sum(CASE WHEN t.debit_account_cd = 'customer_balance' THEN -t.amount ELSE 0 END
               - CASE WHEN t.credit_account_cd = 'customer_balance' THEN -t.amount ELSE 0 END) AS customer_balance
      FROM loans l
      LEFT OUTER JOIN payment_transactions_committed t ON l.id = t.loan_id
      LEFT OUTER JOIN statements st ON l.id = st.loan_id
      WHERE l.id = {?}
      GROUP BY l.id, l.daily_rate, st.end_date
  53. Ruby Method: account_balance
      def account_balance
        amount_payable = amount_outstanding + amount_charged_off
        amt = amount_payable - customer_balance
        provisional_interest = interest_provisional
        provisional_fees = fees_provisional
        if provisional_interest > 0.0
          amt = amt + provisional_fees + [ provisional_interest - [unearned_interest, 0].max, 0 ].max
        else
          amt = amt + provisional_fees
        end
        amt = amt.round_near   # round_near: application-specific helper, not core Ruby
        return [0, amt].max
      end
  54. Ruby Method: unearned_interest
      def unearned_interest
        amt = -1 * (principal_amount + [(customer_balance - interest_provisional - fees_provisional), 0].max)
        # Guard before parsing: Date.parse(nil) raises TypeError, so the check must come first
        return 0 unless self.next_closing_date
        next_closing_date = Date.parse(self.next_closing_date)
        interest_period = [next_closing_date - Date.today + 1, 0].max
        interest_rate = oec_daily_rate.to_f
        interest = amt * interest_period * interest_rate
        interest = interest.round_down   # round_down: application-specific helper, not core Ruby
        interest < 0.0 ? 0 : interest
      end
  55. How It Works. This method is used as a callback in a more complex method, which returns Loan_Summary, where most of the values can be selected directly from the database. For compliance reasons all calculations are executed in the Ruby model. We need a “virtual column” as one of the fields of the data output; the Ruby callback can be inserted into the data set here. The callback produces the required value, and the whole data set is passed to the upper-level method for further processing and then to the web application.
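The "virtual column" idea can be sketched in plain Ruby as below. Everything here is an assumption made for illustration: the row values are invented, `with_virtual_columns` is a hypothetical helper, and the lambda is a stand-in calculation rather than the account_balance rule shown in the slides.

```ruby
# Hypothetical row as it might come back from the single SELECT (values are made up).
row = {
  loan_id: 7,
  amount_payable: 500.0,
  fees_provisional: 10.0,
  interest_provisional: 5.0,
  customer_balance: 15.0
}

# Stand-in callback: NOT the business rule from the slides, just a calculation
# that runs in Ruby instead of SQL and never goes back to the database.
account_balance = lambda do |r|
  [r[:amount_payable] + r[:fees_provisional] + r[:interest_provisional] - r[:customer_balance], 0].max
end

# Evaluates each callback against the retrieved row and merges the result
# into the data set under the callback's name (the "virtual column").
def with_virtual_columns(rows, callbacks)
  rows.map do |r|
    r.merge(callbacks.transform_values { |cb| cb.call(r) })
  end
end

summary = with_virtual_columns([row], { account_balance: account_balance })
puts summary.first[:account_balance]
```

The original row is left unchanged; the enriched copy is what would be handed to the upper-level method.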
  56. 56. 2013 Henrietta Dombrovskaya – Enova 56 LOAN_ID, FUNDING_DATE< FUNDING_AMOUNT FEE STATE APR LOAN_STATUS AMOUNT_DUE ACCOUNT_BALANCE PRINCIPAL_AMOUNT CUSTOMER_BALANCE INTEREST_PROVISIONAL <…> LANGUAGE_CD ACCOUNT_BALANCE METHOD Note, that all data elements can still be retrieved with a single select statement, which won’t be possible within the standard ORM framework.
57. On a Larger Scale
In our new application, a single Postgres function retrieves summaries for all customer loans. Here are some execution statistics:
58. Number of Database Calls/Customer
                     1    2    3    4    5    6    7    8     9
Old App avg_calls  121  196  373  449  632  765  819  923  1129
New App avg_calls    4    4    4    4    4    4    4    4     4
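To put the chart values in proportion, the following short Ruby sketch divides the old application's average call count by the new one for each category. The numbers are copied from the chart above; the constant names are illustrative.

```ruby
# Average database calls per customer, copied from the chart above
# (categories 1 through 9).
OLD_APP_AVG_CALLS = [121, 196, 373, 449, 632, 765, 819, 923, 1129].freeze
NEW_APP_AVG_CALLS = Array.new(9, 4).freeze

# How many old-app calls correspond to one new-app call, per category.
REDUCTION_FACTORS = OLD_APP_AVG_CALLS.zip(NEW_APP_AVG_CALLS).map { |old_calls, new_calls|
  (old_calls.to_f / new_calls).round(2)
}.freeze

REDUCTION_FACTORS.each_with_index do |factor, i|
  puts "category #{i + 1}: #{factor}x fewer calls"
end
```

The factor grows from about 30 in the first category to about 282 in the last, because the new application's call count stays constant while the old one's rises.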
59. Execution Time/Customer
[Chart: New App avg_time and Old App avg_time, categories 1 through 9]
60. Now
•  We save both time
•  And money!
61. Were we the first to notice the problem? Definitely NOT!
62. Related Work
Object-relational impedance mismatch comes up constantly in discussions of how to build an efficient application. In recent years there have been multiple attempts to resolve the issue, with no significant outcome.
63. First Came HORA
The HORA (Hybrid Object-Relational Architecture) approach was first introduced in 1993 and became a foundation of multiple ORM-based systems, such as:
•  Java Persistence
•  ActiveJDBC
•  ADO.NET
•  Ruby on Rails ActiveRecord
•  Many others.
All implementations have the same ORIM issues.
64. Hibernate
Hibernate is a high-performance Object/Relational persistence and query service, considered one of the most flexible and powerful Object/Relational solutions on the market.
•  Maps Java classes to database tables and Java data types to SQL data types.
•  Significantly reduces development time.
•  Allows application developers to concentrate on the business side of the project.
65. Hibernate Developers Claim
•  In contrast to other solutions, it does not hide “the power of SQL” from developers.
•  This claim is true in some sense, since the solution does allow us to write queries similar to SQL queries.
66. The Caveat
•  Creating complex queries using Hibernate is not an easy task.
•  Similar to other ORM systems, Hibernate prompts for solutions that seem more natural to application developers.
67. Newer Version of Active Record
•  Allows eager loading and some custom queries, but has the same limitations as Hibernate.
•  Eager loading may cause excessive application memory usage.
68. Agile Technology
•  Acknowledges the existence of ORIM.
•  Acknowledges not only the technical impedance mismatch but also a cultural impedance mismatch: “The object-oriented paradigm is based on proven software engineering principles. The relational paradigm, however, is based on proven mathematical principles”.
•  While raising awareness of the problem, the solutions proposed by Agile data technology refer mostly to database schema changes and/or more careful design and data refactoring; they offer little in terms of dealing with inefficient queries.
69. What’s next?
70. Future Work
Rewriting a whole application using our new methodology of splitting logic between an application and a database; we are in the middle of the process.
Problems:
–  A large amount of legacy code, where most of the business logic is embedded in Ruby classes.
–  Virtually no business specifications, which means that we have to extract the business logic from the existing code.
–  The legacy application is evolving, and existing models are being modified all the time.
–  The human factor.
Because of this….
71. Moving In Two Directions:
•  Continue to rewrite the larger parts of our applications, aiming for only a couple of database queries per screen rendering;
•  While doing this, clarify our technology, making it more transparent and easier for application developers to use.
72. Acknowledgements
Many thanks to:
•  Enova CIO/CTO Fred Lee
•  Team Raven, and everybody who worked on the Aperture project during its lifetime:
•  Sheldon Strauch and the entire DB Dev team
•  Chad Slaughter and Kurt Stephens
•  Ben Heilman and Donny Jekels
–  Cheetana Yogeesh
–  Allan Zoppa
–  Ana Lebon
–  Kevin Glowacz
–  Neha Bhardwaj
–  Jef Jonjevic
–  Marc Groulx
–  Richard Lee
–  Preeti Dhiman
–  Luda Baklanova
