Accelerating OpenERP accounting: Precalculated period sums

Borja López Soilán
http://www.kami.es
Index
Current approach (sum of entries)
●   Current approach explained.
●   Performance analysis.
Proposal: “Precalculated period sums”
●   Alternative 1: Accumulated values using triggers
    –   Proposed by Ferdinand Gassauer (Chricar)
●   Alternative 2: Period totals using the ORM
    –   Proposed by Borja L.S. (@NeoPolus)
●   Current approach vs Precalculated period sums
Current approach: Sum of entries

Currently, each time you read the credit/debit/balance of an account,
OpenERP has to recalculate it from the account entries (move lines).

The magic is done by the “_query_get()” method of account.move.line,
which selects the lines to consider, and the “__compute()” method of
account.account, which does the sums.
Inside the current approach
_query_get() filters: builds the “WHERE” part
of the SQL query that selects all the account
move lines involving a set of accounts.
●   Allows complex filters, but they usually look like
    “include non-draft entries from these periods for these
    accounts”.
__compute() sums: uses the filter to query for
the sums of debit/credit/balance for the current
account and its children.
●   Does just one SQL query for all the accounts. (nice!)
●   Has to aggregate the children values in Python.
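The aggregation step can be pictured with a small sketch. This is not the actual OpenERP code: the `children` mapping, the `own_sums` dictionary (standing in for the rows returned by the single SQL query) and the `aggregate` function are hypothetical names used only for illustration.

```python
# Hypothetical sketch: roll per-account sums up into their parent accounts.
# `own_sums` stands in for the result of the single GROUP BY query;
# `children` maps each account id to the ids of its direct children.

def aggregate(account_id, children, own_sums):
    """Return (debit, credit, balance) for an account plus all its descendants."""
    debit, credit = own_sums.get(account_id, (0.0, 0.0))
    for child_id in children.get(account_id, []):
        child_debit, child_credit, _ = aggregate(child_id, children, own_sums)
        debit += child_debit
        credit += child_credit
    return debit, credit, debit - credit


# Made-up chart: account 1 has children 2 and 3, account 2 has child 4.
children = {1: [2, 3], 2: [4]}
# Only accounts 3 and 4 have move lines of their own: (debit, credit).
own_sums = {3: (100.0, 0.0), 4: (50.0, 20.0)}

totals = aggregate(1, children, own_sums)
print(totals)  # (150.0, 20.0, 130.0)
```

Every read of a parent account has to walk all of its descendants, which is why the number of children matters for the cost of a read.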
Sample query done by __compute
SELECT l.account_id AS id,
       COALESCE(SUM(l.debit), 0) AS debit,
       COALESCE(SUM(l.credit), 0) AS credit,
       COALESCE(SUM(l.debit), 0) - COALESCE(SUM(l.credit), 0) AS balance
FROM account_move_line l
-- Account + children = lots of ids!
WHERE l.account_id IN (2, 3, 4, 5, 6, ..., 1648, 1649, 1650, 1651)
  AND l.state <> 'draft'
  AND l.period_id IN (SELECT id FROM account_period
                      WHERE fiscalyear_id IN (1))
  AND l.move_id IN (SELECT id FROM account_move
                    WHERE account_move.state = 'posted')
GROUP BY l.account_id
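To try the shape of this query without an OpenERP database, here is a minimal sketch using Python's sqlite3 module and an in-memory database. The three tables are heavily simplified, hypothetical versions of the real ones (which have many more columns), the rows are made up, and the ORDER BY clause is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_period (id INTEGER PRIMARY KEY, fiscalyear_id INTEGER);
CREATE TABLE account_move (id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE account_move_line (
    id INTEGER PRIMARY KEY, account_id INTEGER, move_id INTEGER,
    period_id INTEGER, state TEXT, debit REAL, credit REAL);
INSERT INTO account_period VALUES (1, 1), (2, 2);
INSERT INTO account_move VALUES (1, 'posted'), (2, 'draft');
""")

# Lines 4 and 5 must be ignored: line 4 belongs to a move that is not
# posted, line 5 belongs to a period of another fiscal year.
conn.executemany(
    "INSERT INTO account_move_line VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 2, 1, 1, "valid", 100.0, 0.0),
        (2, 2, 1, 1, "valid", 0.0, 30.0),
        (3, 3, 1, 1, "valid", 0.0, 70.0),
        (4, 3, 2, 1, "valid", 500.0, 0.0),
        (5, 2, 1, 2, "valid", 999.0, 0.0),
    ],
)

account_ids = [2, 3]  # the account and its children
placeholders = ", ".join("?" for _ in account_ids)
query = f"""
SELECT l.account_id AS id,
       COALESCE(SUM(l.debit), 0) AS debit,
       COALESCE(SUM(l.credit), 0) AS credit,
       COALESCE(SUM(l.debit), 0) - COALESCE(SUM(l.credit), 0) AS balance
FROM account_move_line l
WHERE l.account_id IN ({placeholders})
  AND l.state <> 'draft'
  AND l.period_id IN (SELECT id FROM account_period WHERE fiscalyear_id IN (1))
  AND l.move_id IN (SELECT id FROM account_move WHERE account_move.state = 'posted')
GROUP BY l.account_id
ORDER BY l.account_id
"""
rows = conn.execute(query, account_ids).fetchall()
print(rows)  # [(2, 100.0, 30.0, 70.0), (3, 0.0, 70.0, -70.0)]
```

The query returns one row per account that has matching lines; the sums for parent accounts still have to be built from these rows afterwards.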
Sample query plan
                                QUERY PLAN
---------------------------------------------------------------------------
 HashAggregate  (cost=57.83..57.85 rows=1 width=18)
   ->  Nested Loop Semi Join  (cost=45.00..57.82 rows=1 width=18)
         Join Filter: (l.period_id = account_period.id)
         ->  Nested Loop  (cost=45.00..57.52 rows=1 width=22)
               ->  HashAggregate  (cost=45.00..45.01 rows=1 width=4)
                     ->  Seq Scan on account_move  (cost=0.00..45.00 rows=1 width=4)
                           Filter: ((state)::text = 'posted'::text)
               ->  Index Scan using account_move_line_move_id_index on account_move_line l  (cost=0.00..12.49 rows=1 width=26)
                     Index Cond: (l.move_id = account_move.id)
                     Filter: (((l.state)::text <> 'draft'::text) AND (l.account_id = ANY ('{2,3,4,5, ..., 1649,1650,1651}'::integer[])))
         ->  Index Scan using account_period_fiscalyear_id_index on account_period  (cost=0.00..0.29 rows=1 width=4)
               Index Cond: (account_period.fiscalyear_id = 1)

Ugh! The "Seq Scan on account_move" is a sequential scan on a table with
(potentially) lots of records... :(
Performance Analysis
Current approach big O 1/2
“Selects all the account move lines”
The query complexity depends on l, the
number of move lines for that account and its
(recursive) children:
   O(query) = O(f(l))
“Has to aggregate the children values”
The complexity depends on c, the number of
children:
   O(aggregate) = O(g(c))
Current approach big O 2/2
O(__compute) = O(query) + O(aggregate)


O(__compute) = O(f(l)) + O(g(c))


What kind of functions are f and g?

Let's do some empirical testing (more fun than
maths, isn't it?)...
Let's test this chart... 1/2
The official Spanish
chart of accounts, when
empty:
  Has about 1600
  accounts.
  Has 5 levels.


(to test this chart of
accounts install the
l10n_es module)
Let's test this chart... 2/2
How many accounts below each level?

Account code                       Number of children (recursive)
Level 5 - 430000 (leaf account)    0
Level 4 - 4300                     1
Level 3 - 430                      6
Level 2 - 43                       43
Level 1 - 4                        192
Level 0 - 0 (root account)         1678

To get the balance of account “4” we need to sum the balances of 192 accounts!
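Counts like these can be obtained by walking the account tree. The sketch below uses a tiny made-up chart (the codes "9", "90", "91" and so on are not taken from l10n_es) and a hypothetical `count_descendants` helper, just to show how the recursive number of children is computed.

```python
# Hypothetical toy chart: each account code maps to its direct children.
children = {
    "9": ["90", "91"],
    "90": ["900"],
    "91": ["910", "911"],
}


def count_descendants(code, children):
    """Number of accounts below `code`, at any depth."""
    return sum(
        1 + count_descendants(child, children)
        for child in children.get(code, [])
    )


for code in ["900", "90", "91", "9"]:
    print(code, count_descendants(code, children))
# 900 0
# 90 1
# 91 2
# 9 5
```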
Ok, it looks like the number of children c has a
lot of influence, and the number of moves l
has little or zero influence: g(c) >> f(l)
Let's split them...
Now it is clear that g(c) is linear!
(note: the number of children grows exponentially with each level up the chart)
O(g(c)) = O(c)
So, the influence of l was small, but linear too!
O(f(l)) = O(l)
Big O - Conclusion
O(__compute) = O(l) + O(c)

c has an unexpectedly big influence on the
results
=> Bad performance on complex charts of
accounts!
c does not grow with time, but l does...
=> OpenERP accounting becomes slower and
slower with time! (though it's not as bad as expected)
Proposal: Precalculated sums
OpenERP recalculates the debit/credit/balance
from move lines each time.
Most accounting programs store the totals per
period (or the cumulative values) for each
account. Why?
●   Reading the debit/credit/balance becomes much
    faster.
●   ...and reading is much more data intensive than
    writing:
    –   Accounting reports read lots of accounts, lots of times.
    –   Accountants only update a few accounts at a time.
Is it really faster?
Precalculated sums per period means:
●   An O(p) query (get the debit/credit/balance of each
    period for that account) instead of an O(l) query, with
    p being the number of periods, p << l.
    Using opening entries, or cumulative totals, p
    becomes constant => O(1)
●   If aggregated sums (with children values) are also
    precalculated, we don't have to do one
    O(c) aggregation per read.
It's O(1) for reading!!
    (but creating/editing entries is a bit slower)
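The three read costs can be compared with a small sketch. The data and function names below are made up for illustration; in a real system the precalculated lists would be stored and kept up to date on every write instead of being rebuilt as they are here.

```python
from itertools import accumulate

# Hypothetical data: move line amounts of one account, grouped by period.
lines_by_period = [[400.0], [200.0, 50.0], [25.0], [-400.0], [25.0, 200.0]]

# Precalculated values: one total per period, and the running balance.
period_totals = [sum(lines) for lines in lines_by_period]
cumulative_totals = list(accumulate(period_totals))


def balance_from_lines(last_period):
    """Current approach: touches every move line, O(l)."""
    return sum(
        amount
        for lines in lines_by_period[: last_period + 1]
        for amount in lines
    )


def balance_from_period_totals(last_period):
    """Period totals: touches one value per period, O(p)."""
    return sum(period_totals[: last_period + 1])


def balance_from_cumulative(last_period):
    """Cumulative totals: a single lookup, O(1)."""
    return cumulative_totals[last_period]


print(balance_from_lines(4))          # 500.0
print(balance_from_period_totals(4))  # 500.0
print(balance_from_cumulative(4))     # 500.0
```

All three give the same balance; only the amount of data touched per read changes.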
Alternative 1: Accumulated values using triggers (I)
Proposed by Ferdinand Gassauer.
How does it work?
●   New object to store the accumulated
    debit/credit/balance per account and period (let's
    call it account.period.sum).

                                  Opening   1st         2nd   3rd    4th
     Move line values in period   400       +200, +50   +25   -400   +25, +200
     Value in table               400       650         675   275    500

●   Triggers on Postgres (PL/pgSQL) update the
    account_period_sum table each time an account
    move line is created/updated/deleted.
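What such a trigger has to do on each write can be mimicked in a few lines. The sketch below is plain Python rather than PL/pgSQL and is not taken from Ferdinand's prototype; `apply_line` is a hypothetical helper and the periods are simply indexed 0 (opening) to 4.

```python
def apply_line(cumulative, period_index, amount):
    """Mimic the trigger: a new line changes the accumulated value of
    its own period and of every later period."""
    for i in range(period_index, len(cumulative)):
        cumulative[i] += amount


# Accumulated value per period, starting from an empty account.
cumulative = [0.0] * 5

# (period index, amount) pairs matching the table above.
new_lines = [
    (0, 400.0),
    (1, 200.0), (1, 50.0),
    (2, 25.0),
    (3, -400.0),
    (4, 25.0), (4, 200.0),
]
for period_index, amount in new_lines:
    apply_line(cumulative, period_index, amount)

print(cumulative)  # [400.0, 650.0, 675.0, 275.0, 500.0]
```

Deleting a line would be the same call with the amount negated, and an update is the difference between the new and the old amount. Each write costs up to one update per period, which is the "a bit slower" side of the trade-off.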
Alternative 1: Accumulated values using triggers (II)
How does it work? (cont.)
●   The data is calculated by accumulating the values from
    previous periods. (Ferdinand's prototype requires a special naming of
    periods for this).
●   Creates SQL views based on the
    account_period_sum table.
●   For reports that show data aggregated by period:
     –   New reports can be created that either directly use the
         SQL views, or use the account.period.sum object.
●   The account.account.__compute() method could be
    extended in the future to optimize queries (modified
    to make use of the account_period_sum table when
    possible).
Alternative 1: Accumulated values using triggers (III)
Good points
●   Triggers guarantee that the data is always in sync.
    (even if somebody writes directly to the database!)
●   Triggers are fast.
●   Prototype available and working! - “used this method
    already in very big installations - some 100
    accountants some millions moves without any problems”
    (Ferdinand)
Bad points
●   Database dependent triggers.
●   Triggers are harder to maintain than Python code.
●   Makes some assumptions on period names.
    (as OpenERP currently does not flag opening periods apart
    from closing ones)
Alternative 2: Period totals using the ORM (I)
Proposed by Borja L.S. (@NeoPolus).
How does it work?
●   New object to store the debit/credit/balance sums
    per account and period (and state):

                                  Opening   1st         2nd   3rd    4th
     Move line values in period   400       +200, +50   +25   -400   +25, +200
     Value in table               400       250         25    -400   225

●   Extends the account.move.line object to
    update the account.sum objects each time a line is
    created/updated/deleted.
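Since there is no prototype for this alternative, the following is only a sketch of the idea, written in plain Python without the OpenERP ORM. `MoveLineStore` is a hypothetical stand-in for account.move.line, its `sums` dictionary plays the role of the account.sum objects, and the state dimension is left out to keep it short.

```python
from collections import defaultdict


class MoveLineStore:
    """Hypothetical stand-in for account.move.line whose create/write/unlink
    methods keep the per-period sums up to date."""

    def __init__(self):
        self.lines = {}                 # line id -> (account, period, amount)
        self.sums = defaultdict(float)  # (account, period) -> total
        self._next_id = 1

    def create(self, account_id, period_id, amount):
        line_id = self._next_id
        self._next_id += 1
        self.lines[line_id] = (account_id, period_id, amount)
        self.sums[(account_id, period_id)] += amount
        return line_id

    def write(self, line_id, amount):
        account_id, period_id, old_amount = self.lines[line_id]
        self.lines[line_id] = (account_id, period_id, amount)
        # Only the difference has to be applied to the stored total.
        self.sums[(account_id, period_id)] += amount - old_amount

    def unlink(self, line_id):
        account_id, period_id, amount = self.lines.pop(line_id)
        self.sums[(account_id, period_id)] -= amount


store = MoveLineStore()
entries = [(0, 400.0), (1, 200.0), (1, 50.0), (2, 25.0),
           (3, -400.0), (4, 25.0), (4, 200.0)]
ids = [store.create(430, period, amount) for period, amount in entries]

period_values = [store.sums[(430, period)] for period in range(5)]
print(period_values)  # [400.0, 250.0, 25.0, -400.0, 225.0]
```

Unlike the accumulated values of alternative 1, a write here touches exactly one stored total, but a read over several periods has to add up one value per period.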
Alternative 2: Period totals using the ORM (II)
How does it work? (cont.)
●   Extends the account.account.__compute() method to
    optimize queries:
    –   If the query filters only by period/fiscal year/state, the
        data is retrieved from the account.sum objects.
    –   If the query filters by dates, and one or more fiscal
        periods are fully included in that range, the data is
        retrieved from the account.sum objects (for the range
        covered by the periods) plus the account.move.lines (for the
        range not covered by periods).
    –   Filtering by any other field (for example partner_id)
        causes a fallback to the normal __compute method.
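The three cases above could be dispatched roughly as follows. This is a sketch under assumptions, not a design from the proposal: the filter field names (`date_from`, `date_to`), the strategy labels and both helper functions are hypothetical, and the periods are assumed to be sorted, contiguous and non-overlapping.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)
SUM_FIELDS = {"period_id", "fiscalyear_id", "state"}
DATE_FIELDS = {"date_from", "date_to"}


def choose_strategy(filter_fields):
    """Pick how to answer a query from the names of the fields it filters on."""
    fields = set(filter_fields)
    if fields <= SUM_FIELDS:
        return "sums"             # precalculated sums only
    if fields <= SUM_FIELDS | DATE_FIELDS:
        return "sums_plus_lines"  # sums for whole periods, lines for the rest
    return "fallback"             # normal __compute over the move lines


def split_date_range(date_from, date_to, periods):
    """Split a date range into fully covered period ids and leftover ranges.

    `periods` is a list of (period_id, start, stop) tuples.
    """
    covered = [p for p in periods if p[1] >= date_from and p[2] <= date_to]
    if not covered:
        return [], [(date_from, date_to)]
    leftovers = []
    first_start, last_stop = covered[0][1], covered[-1][2]
    if date_from < first_start:
        leftovers.append((date_from, first_start - ONE_DAY))
    if date_to > last_stop:
        leftovers.append((last_stop + ONE_DAY, date_to))
    return [p[0] for p in covered], leftovers


periods = [
    (1, date(2010, 1, 1), date(2010, 1, 31)),
    (2, date(2010, 2, 1), date(2010, 2, 28)),
    (3, date(2010, 3, 1), date(2010, 3, 31)),
]
covered_ids, leftovers = split_date_range(
    date(2010, 1, 15), date(2010, 3, 10), periods
)
print(covered_ids)  # [2]
print(leftovers)    # 15-31 January and 1-10 March
```

In this example only February is read from the precalculated sums; the two leftover date ranges still have to be summed from the move lines.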
Alternative 2: Period totals using the ORM (III)
Good points
●   Database independent.
●   Optimizes all the accounting.
●   Flexible.
●   No PL/pgSQL triggers required, just Python
    => Easier to maintain.
Bad points
●   Does not guarantee that the sums are in sync with the move lines.
    (but nobody should directly alter the database in the first place...)
●   Python is slower than using triggers.
●   No prototype yet! :)
    (But take a look at Tryton stock quantity computation)
Current approach VS Period sums
Current approach
Pros
●   No redundant data.
●   Simpler queries.
Cons
●   Slow.
    –   Reports and dashboard charts/tables are
        performance hungry.
●   Becomes even slower with time.
Precalculated sums
Pros
●   Fast, always.
●   Drill-down navigation.
Cons
●   Need to keep sums in sync with move lines.
●   Needs a more complex __compute(), or specific queries,
    to make use of the precalculated sums.
Precalculated sums – Drill down navigation (Chricar prototype), slides 1/3 to 3/3
(screenshots, not reproduced in this transcript)
And one last remark...

...all this is applicable to the stock quantities computation too!