Implementation requirements
● Be correct
● Be secure
● Be efficient
○ Use scalable algorithms
○ Use few and efficient SQL queries
Key data structures
● Registry
● Record cache
● Fields to write
● Fields to compute
● Field triggers
Registry
● Goal: map model_name to model_class
● model_class should reflect model definitions
● browse() returns an instance of model_class
● Also holds metadata
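The mapping described above can be sketched in a few lines of plain Python. The names here (Registry, add, browse) are illustrative stand-ins, not Odoo's actual API:

```python
# Minimal sketch of the registry idea: map model_name to model_class,
# where browse() returns an instance of that class.
# Registry, add and browse are illustrative names, not Odoo's real API.

class Base:
    """Stand-in for the common base of all model classes."""
    def __init__(self, ids=()):
        self.ids = list(ids)

class Registry(dict):
    def add(self, model_class):
        # Map the model's _name to its assembled class.
        self[model_class._name] = model_class

    def browse(self, model_name, ids):
        # browse() returns an instance of the registered model class.
        return self[model_name](ids)

class foo(Base):
    _name = 'foo'

registry = Registry()
registry.add(foo)
records = registry.browse('foo', [1, 2, 3])
```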
Model definitions
class Foo0(models.Model):
    _name = 'foo'
    ...
class Foo1(models.Model):
    _inherit = 'foo'
    ...
class Foo2(models.Model):
    _inherit = 'foo'
    ...
Model classes
class foo(Foo2, Foo1, Foo0, Base):
    _name = 'foo'
Model definitions
class Foo0(models.Model):
    _name = 'foo'
    ...
class Bar0(models.AbstractModel):
    _name = 'bar'
    ...
class Foo1(models.Model):
    _name = 'foo'
    _inherit = ['bar', 'foo']
    ...
Model classes
class bar(Bar0, Base):
    _name = 'bar'
class foo(Foo1, bar, Foo0, Base):
    _name = 'foo'
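The class assembly above can be reproduced with plain Python to verify that the resulting MRO stays consistent with the definition classes. `Base` here is a stand-in for Odoo's real base class:

```python
# Reproduce the model-class assembly with type() and inspect the MRO.
# Base is a stand-in; the definition classes carry no behavior here.

class Base: pass
class Foo0(Base): pass   # _name = 'foo'
class Bar0(Base): pass   # _name = 'bar' (abstract)
class Foo1(Base): pass   # _name = 'foo', _inherit = ['bar', 'foo']

# Model class for 'bar': its definition classes plus Base.
bar = type('bar', (Bar0, Base), {'_name': 'bar'})

# Model class for 'foo': the model class of 'bar' sits among the bases,
# so any later extension of 'bar' automatically impacts 'foo' via its MRO.
foo = type('foo', (Foo1, bar, Foo0, Base), {'_name': 'foo'})

mro_names = [c.__name__ for c in foo.__mro__]
```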
Registry setup
● Determine fields by introspection
● Add custom fields
● Add _inherits fields
● Setup fields (parameters, depends, ...)
Registry
● Challenges:
○ Getting the model classes right
○ Performance of setup
Record cache
● Goal: augmented database cache
● Stores field values
● Accessible on environment
Record cache prefetching
First iteration:
● Look up the cache → nothing
● Fetch field on record:
○ Look up record._prefetch_ids for missing values → all ids
○ If field is a column, fetch all columns
○ Invoke records._read(fields)
● Look up the cache → value
Next iterations:
● Look up the cache → value
records = model.browse(ids)
for record in records:
    print(record.name)
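The two iterations above can be sketched as a toy prefetching cache. The helper names (`fetch_from_db`, `Model`) are illustrative, not Odoo's internals; the point is that the first cache miss fetches the field for every id in `_prefetch_ids`, so the loop issues one batched query instead of N:

```python
# Sketch of prefetching: a cache miss on one record triggers a batched
# fetch for all records in _prefetch_ids; later accesses hit the cache.
# fetch_from_db and Model are illustrative names, not Odoo's API.

cache = {}          # {field_name: {record_id: value}}
queries = []        # record the batched fetches, for inspection

def fetch_from_db(field, ids):
    queries.append((field, tuple(ids)))
    return {id_: f'{field}-{id_}' for id_ in ids}

class Model:
    def __init__(self, ids, prefetch_ids):
        self.ids = ids
        self._prefetch_ids = prefetch_ids

    def __iter__(self):
        for id_ in self.ids:
            yield Model([id_], self._prefetch_ids)

    @property
    def name(self):
        values = cache.setdefault('name', {})
        (id_,) = self.ids
        if id_ not in values:
            # Cache miss: fetch all prefetch ids still missing, in one go.
            missing = [i for i in self._prefetch_ids if i not in values]
            values.update(fetch_from_db('name', missing))
        return values[id_]

records = Model([1, 2, 3], prefetch_ids=[1, 2, 3])
names = [record.name for record in records]
```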
Record cache
● Access N records: 1 + N dict lookups
● Update N records: 1 + N dict updates
● Invalidate N records: 1 + N dict deletions
{field: {record_id: value}}
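The operation counts above follow directly from the two-level dict layout; a small sketch makes the "1 + N" shape visible:

```python
# Sketch of the cache layout {field: {record_id: value}}: each bulk
# operation costs one lookup for the field's dict plus N per-record
# dict operations.

cache = {}
field = 'name'
ids = [1, 2, 3]

# Update N records: 1 lookup (the field dict) + N dict updates.
values = cache.setdefault(field, {})
for id_ in ids:
    values[id_] = f'record {id_}'

# Access N records: 1 lookup + N dict lookups.
values = cache[field]
read = [values[id_] for id_ in ids]

# Invalidate N records: 1 lookup + N dict deletions.
values = cache[field]
for id_ in ids:
    del values[id_]
```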
Record cache (context-dependent fields)
● Access N records: 2 + N dict lookups
● Update N records: 2 + N dict updates
● Invalidate N records: 1 + K * N dict deletions
{field: {ctx_key: {record_id: value}}}
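The extra context level explains the counts above: access and update pay one more lookup for the context key, while invalidation must visit every one of the K context keys. A sketch, with hypothetical context keys:

```python
# Sketch of the context-dependent layout {field: {ctx_key: {id: value}}}.
# The ('en',) / ('fr',) keys are illustrative context keys (e.g. the
# language for a translated field).

cache = {'name': {
    ('en',): {1: 'Hello', 2: 'Bye'},
    ('fr',): {1: 'Bonjour', 2: 'Au revoir'},
}}

# Access N records: 2 lookups (field, then ctx_key) + N dict lookups.
values = cache['name'][('en',)]
read = [values[id_] for id_ in (1, 2)]

# Invalidate N records: 1 lookup + K * N deletions (K context keys),
# because the values under every context key must go.
for ctx_values in cache['name'].values():
    for id_ in (1, 2):
        ctx_values.pop(id_, None)
```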
Record cache
● Challenge: cache consistency
● When things change:
○ invalidate
○ update
Fields to write
● Goal: minimize UPDATE queries
● Stores column values
● Accessible on the environment
Fields to write
● write() collects updates until...
● flush() makes actual database updates
○ At most 1 UPDATE per record
○ Group records with the same UPDATE
{model_name: {record_id: {field_name: value}}}
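The grouping step above can be sketched as follows. The `collect`/`flush` helpers and the string-built statements are illustrative only (real code would use parameterized queries through the cursor), but they show how records receiving identical column values share one UPDATE:

```python
# Sketch of flush() grouping: records that receive the exact same
# column values are updated with a single UPDATE statement.
# collect and flush are illustrative names, not Odoo's API.

from collections import defaultdict

towrite = {}   # {model_name: {record_id: {field_name: value}}}

def collect(model, record_id, values):
    towrite.setdefault(model, {}).setdefault(record_id, {}).update(values)

def flush():
    statements = []
    for model, recs in towrite.items():
        # Group record ids by the exact set of column values they get.
        groups = defaultdict(list)
        for record_id, values in recs.items():
            groups[tuple(sorted(values.items()))].append(record_id)
        for values, ids in groups.items():
            sets = ', '.join(f'{k} = %s' for k, _ in values)
            statements.append(
                f'UPDATE {model} SET {sets} WHERE id IN {tuple(ids)}')
    towrite.clear()
    return statements

collect('foo', 1, {'state': 'done'})
collect('foo', 2, {'state': 'done'})
collect('foo', 3, {'state': 'draft'})
stmts = flush()   # two UPDATEs for three records
```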
Fields to write
● Challenge: database consistency
● flush() is automatically called by search(), read(), read_group()
● Manual SQL queries require an explicit flush()
Fields to compute
● Goal: delay computations
● For computed stored fields only
● Accessible on environment
{field: set(ids)}
Fields to compute (stored)
def __get__(self, record, owner):
    if record.id in tocompute.get(self, ()):
        self.compute_value(record.browse(tocompute[self]))
    try:
        return record.env.cache.get(record, self)
    except CacheMiss:
        if self.store:
            record._in_cache_without(self)._fetch(self)
        elif self.compute:
            self.compute_value(record._in_cache_without(self))
        else:
            ...
        return record.env.cache.get(record, self)
Fields to compute (non-stored)
def __get__(self, record, owner):
    if record.id in tocompute.get(self, ()):
        self.compute_value(record.browse(tocompute[self]))
    try:
        return record.env.cache.get(record, self)
    except CacheMiss:
        if self.store:
            record._in_cache_without(self)._fetch(self)
        elif self.compute:
            self.compute_value(record._in_cache_without(self))
        else:
            ...
        return record.env.cache.get(record, self)
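The mark-then-compute-lazily behavior in the accessor can be reduced to a runnable toy. Every name here is illustrative, not Odoo's internals; the point is that repeated writes never trigger a computation, only the eventual access does:

```python
# Runnable sketch of delayed computation: a write marks the dependent
# record to compute; the value is computed once, on first access.
# All names here are illustrative stand-ins.

cache = {}          # {record_id: computed value}
tocompute = set()   # record ids marked for recomputation
prices = {}

def write_price(record_id, price):
    prices[record_id] = price
    tocompute.add(record_id)      # mark, do not compute yet
    cache.pop(record_id, None)    # stale value must not be served

def get_total(record_id):
    if record_id in tocompute:    # compute marked records lazily
        cache[record_id] = prices[record_id] * 2
        tocompute.discard(record_id)
    return cache[record_id]

write_price(1, 10)
write_price(1, 20)        # two writes, still zero computations
total = get_total(1)      # computed once, on access
```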
Fields to compute
● flush() first accesses fields to compute
○ Compute and assign values (to write)
○ Perform UPDATE queries
● Challenge: managing cache with fields to compute
Field triggers
● Goal: when modifying records, determine what to:
○ mark to compute, or
○ invalidate from cache
● Based on field dependencies
Field triggers
class Order(models.Model):
    @api.depends('discount', 'line_ids.subtotal')
    def _compute_total(self):
        ...

class OrderLine(models.Model):
    @api.depends('price', 'quantity')
    def _compute_subtotal(self):
        ...
[Diagram: trigger trees derived from the dependencies — price and quantity trigger subtotal on the line; discount, line_ids and line_ids.subtotal trigger total on the order, reached through the inverse of line_ids (order_id)]
Field triggers
def modified(self, fnames):
    tree = merge(
        field_triggers[self._fields[fname]]
        for fname in fnames
    )
    self._modified_triggers(tree)

def _modified_triggers(self, tree):
    if self:
        for field in tree.root:
            mark_to_compute_or_invalidate(field, self)
        for field, subtree in tree.children:
            records = inverse(field, self)
            records._modified_triggers(subtree)
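The tree walk above can be made runnable on the Order/OrderLine example. The tree shape and helper names are illustrative: changing price on some lines marks subtotal on those lines, then follows order_id (the inverse of line_ids) up to the orders and marks total:

```python
# Runnable sketch of _modified_triggers: mark the fields at each node,
# then inverse the relation field on each edge and recurse.
# Tree shape, links and helper names are illustrative.

marked = []

def mark(field, ids):
    marked.append((field, tuple(ids)))

# order_id maps each line to its order; "inversing" the dependency
# line_ids.subtotal means following order_id from lines to orders.
order_id = {1: 100, 2: 100}

def inverse(field, ids):
    assert field == 'order_id'
    return sorted({order_id[i] for i in ids if i in order_id})

def modified_triggers(ids, tree):
    if not ids:
        return
    for field in tree.get('root', ()):
        mark(field, ids)
    for field, subtree in tree.get('children', {}).items():
        modified_triggers(inverse(field, ids), subtree)

# Trigger tree for a change to line.price: mark subtotal on the lines,
# then walk order_id to the orders and mark total there.
tree = {'root': ['subtotal'], 'children': {'order_id': {'root': ['total']}}}
modified_triggers([1, 2], tree)
```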
Field triggers
● Challenges:
○ Building the trigger trees
○ Performance of inversing dependencies
Thank You

An in Depth Journey into Odoo's ORM


Editor's Notes

  • #2 Welcome developers! This talk gives details about the ORM implementation. I am Raphael Collet, and I work as a developer. My job is to maintain and improve the Odoo server-side framework.
  • #3 Be correct w.r.t. the specification. Be secure w.r.t. user access rights. Be efficient: use scalable algorithms, use few and efficient SQL queries.
  • #4 Like many systems, the ORM is implemented around a few key data structures. The basic model and field operations are implemented on top of those data structures. We present those data structures, and how they fulfill the ORM’s implementation requirements.
  • #5 The purpose is to map each model name to a corresponding Python class. The class is the one of record objects.
  • #6 Model classes are in the registry. The inheritance order makes the model class’ MRO consistent with definition classes.
  • #7 The model class of ‘foo’ inherits from the model class of ‘bar’. Any extension of ‘bar’ automatically impacts ‘foo’ in its MRO.
  • #8 Registry setup is done once at loading. Inherited fields are related fields with a few special cases.
  • #10 “Augmented” means to store more than the cache.
  • #11 Point out the cache operations: access 1 field on 1 record, access 1 field on N records, update K fields on N records.
  • #13 This is the structure of the cache for fields that depend on the context (translated, company-dependent)
  • #14 In Odoo 13.0, an ORM refactoring put the focus on updating instead of invalidating. The purpose is performance (saving SQL queries).
  • #15 Updating 5 fields on a record should issue 1 UPDATE query.
  • #16 This is an optimization for updating columns in the model’s table. Introduced in Odoo 13.0. flush() without argument is model-independent.
  • #18 The idea is to avoid recomputing fields too early. Since Odoo 13.0, computations are delayed by default. Some compute methods take advantage of being called on batches.
  • #19 This is the field accessor method (much simplified). Highlighted code is what is reachable when the field is stored. Stored computed fields are computed when marked.
  • #20 Highlighted code is what is reachable when the field is non-stored. Non-stored computed fields are computed upon cache miss.
  • #22 Remember: stored fields are computed when marked; non-stored fields are computed upon cache miss.
  • #23 The fields inside the nodes are the ones to compute or invalidate. The fields on the edges are to be inversed. The trees are based on the transitive closure of field dependencies.
  • #24 This is how modified fields trigger the recomputation of dependent fields.