Ajax World2008 Eric Farrar


Published in: Technology

  1. It’s 11 p.m., Do you know where your queries are?
     Eric Farrar, Sybase iAnywhere
  2. Outline
     • What are ORMs and Active Records?
     • Tradeoffs
     • Playing Nice with your Database
     • Managing Indexes
     • Eager Loading and Client-Side Joins
     • Lazy Loading
     • Conclusion
  3. Object-Relational Mapper
     • Systems to bridge the gap between object-oriented languages and relational databases

         class Employee < ActiveRecord::Base
           belongs_to :office
         end

         class Office < ActiveRecord::Base
           has_one :employee
         end

     • Inherently difficult:
       • Normalization (splitting data across tables)
       • Databases can only store scalar values
     • Adds an extra layer of abstraction
  4. Active Record Pattern
     • The ‘meat’ of an ORM that handles the CRUD work
     • Allows regular objects to be treated as persistent objects
     • Ideally, totally abstracts all database interaction

         my_office = Office.new
         my_office.number = 123

         me = Employee.new
         me.name = 'Eric Farrar'
         me.office = my_office
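To make the "CRUD work" on this slide concrete, here is a minimal sketch of the pattern in plain Ruby. `TinyRecord`, `TABLES`, and `SEQUENCES` are hypothetical names invented for this illustration, and an in-memory hash stands in for the database; this is not how Rails' ActiveRecord is implemented.

```ruby
# A toy Active Record: each object wraps one row and knows how to persist itself.
# TinyRecord, TABLES and SEQUENCES are hypothetical; a Hash plays the database.
class TinyRecord
  TABLES = Hash.new { |hash, table| hash[table] = {} }
  SEQUENCES = Hash.new(0)

  attr_reader :id, :attributes

  def initialize(attributes = {})
    @attributes = attributes.dup
    @id = nil
  end

  # One "table" per subclass, keyed by the class name.
  def self.table
    TABLES[name]
  end

  # Create when the object is new (INSERT), otherwise update (UPDATE).
  def save
    @id ||= (SEQUENCES[self.class.name] += 1)
    self.class.table[@id] = @attributes.dup
    self
  end

  # Read (SELECT ... WHERE id = ?).
  def self.find(id)
    record = new(table.fetch(id))
    record.instance_variable_set(:@id, id)
    record
  end

  # Delete (DELETE ... WHERE id = ?).
  def destroy
    self.class.table.delete(@id)
  end
end

class Office < TinyRecord; end

office = Office.new(number: 123).save
found = Office.find(office.id)
puts found.attributes[:number]
```

The calling code never mentions storage at all, which is the abstraction the slide describes; a real ORM replaces the hash operations with generated SQL.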
  5. Examples of ORMs/Active Records
     • LINQ (Language Integrated Query)
     • Hibernate / NHibernate
     • Django
     • Ruby on Rails (ActiveRecord)
     • Many more…
     • For our purposes, we will use Rails’ ActiveRecord for the examples
  6. Trade-offs
     • Advantages
       • Easy to learn
       • Simplifies database creation and management
       • No context switching between languages
       • You don’t need to know about the database
     • Disadvantages
       • Performance suffers (up to 50% slower)
       • Often uses a lowest-common-denominator solution
       • Concurrency semantics are often very difficult
       • You don’t need to know about the database
  7. Managing Indexes
     • Indexes are used to make things quick to look up
       • phone book vs. reverse look-up
     • Indexes should be present on anything you will search for
       • Searching for non-indexed properties will result in a full table scan
     • By default, indexes are usually only put on primary keys
     • A lack of indexes often will not show up during development
       • The result will be a gradual slowdown (as data volume increases) as opposed to an avalanche failure
     • Why not put an index on everything?
     • Multi-column indexes vs. single-column indexes
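The phone-book analogy on this slide can be sketched in plain Ruby. The `Row` struct, the `scan_lookup` helper, and the hash used as an "index" are assumptions made for illustration; a real database would typically use a B-tree rather than a hash.

```ruby
# Hypothetical in-memory "table": an array of rows with no index on name.
Row = Struct.new(:id, :name)
rows = (1..10_000).map { |i| Row.new(i, "employee#{i}") }

# Full table scan: inspect rows one by one until the name matches.
def scan_lookup(rows, name)
  comparisons = 0
  match = rows.find do |row|
    comparisons += 1
    row.name == name
  end
  [match, comparisons]
end

# An "index" on name: built once, then answers lookups directly.
name_index = rows.each_with_object({}) { |row, index| index[row.name] = row }

match, comparisons = scan_lookup(rows, "employee10000")
puts "scan:  id=#{match.id} after #{comparisons} comparisons"
puts "index: id=#{name_index.fetch('employee10000').id} with one hash lookup"
```

The scan cost grows with the number of rows, which is why the slowdown is gradual and easy to miss on a small development data set. The index is not free either: it takes memory and has to be updated on every write, which is one answer to "why not put an index on everything?".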
  8. Client-Side Join
     • Objects are usually ‘related’ to each other
       • belongs_to
       • has_one
       • has_many
       • has_and_belongs_to_many
     • ORMs use these relationships to allow object traversal
       • ex. me.office
     • Assuming 10,000 employees, how many queries will this code produce?

         Employee.find(:all).each do |e|
           puts e.office.number
         end
  9. “Man, this is heavy!”
     • Answer: 10,001

         Employee.find(:all).each do |e|   # <-- 1 query here
           puts e.office.number            # <-- 10,000 queries here
         end

     • Why? The application is doing the work of joining the data, not the database. This is called a ‘client-side’ join
     • This is solved by giving a hint to the ORM and the database that you intend to use the ‘office’ property

         Employee.find(:all, :include => :office).each do |e|
           puts e.office.number
         end

     • This pattern is called eager loading
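The query counts on this slide can be reproduced without a database. In the sketch below, `FakeDb` is a hypothetical stand-in that only counts the queries it receives, and the eager version fetches the offices with one extra batched query; an ORM may instead use a single JOIN, depending on the library and version.

```ruby
# FakeDb is a hypothetical stand-in for a database connection that counts
# queries; it is not part of any real ORM.
class FakeDb
  attr_reader :query_count

  def initialize(employee_count)
    @query_count = 0
    # Each employee row carries the foreign key of its office.
    @employees = (1..employee_count).map { |i| { id: i, office_id: i } }
    @offices = {}
    @employees.each { |e| @offices[e[:office_id]] = { id: e[:office_id], number: 100 + e[:id] } }
  end

  # SELECT * FROM employees
  def all_employees
    @query_count += 1
    @employees
  end

  # SELECT * FROM offices WHERE id = ?
  def office(id)
    @query_count += 1
    @offices.fetch(id)
  end

  # SELECT * FROM offices WHERE id IN (...)
  def offices(ids)
    @query_count += 1
    ids.map { |id| @offices.fetch(id) }
  end
end

# Client-side join: one query for the list, then one more per employee.
lazy_db = FakeDb.new(10_000)
lazy_db.all_employees.each { |e| lazy_db.office(e[:office_id])[:number] }

# Eager loading: fetch every needed office in a single batched query.
eager_db = FakeDb.new(10_000)
employees = eager_db.all_employees
offices_by_id = {}
eager_db.offices(employees.map { |e| e[:office_id] }).each { |o| offices_by_id[o[:id]] = o }
employees.each { |e| offices_by_id.fetch(e[:office_id])[:number] }

puts "client-side join: #{lazy_db.query_count} queries"
puts "eager loading:    #{eager_db.query_count} queries"
```

The work done per employee is identical in both versions; only the number of round trips to the database changes.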
  10. Inviting the Database to the Party
     • Eager loading solves the N+1 problem, but it is still only halfway there
     • In ORMs, the relations are defined inside the object models
     • The ORM may know that Employees and Offices are related, but the database doesn’t know that
     • The database will obediently execute the query, but don’t expect it to do anything clever
     • Modern query optimizers will use every statistic available when determining query paths
     • Keeping them ignorant will result in bare-bones optimization
  11. Lazy Loading
     • Eager loading deals with the case where you want more than your class includes
     • What if you want less?
     • Suppose your Employee class includes a picture field that is a high-resolution bitmap (~3 MB)
     • The following query will actually return the picture in order to fully populate the object

         Employee.find(:all).each do |e|
           puts e.name
         end

     • This innocent code will naively return > 30 GB of data
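The 30 GB figure follows directly from the numbers used in the talk, assuming the same 10,000 employees as in the client-side join example:

```ruby
employees = 10_000   # row count from the earlier client-side join example
picture_mb = 3       # approximate size of one picture, in megabytes

# Every row drags its picture along, even though only the name is printed.
total_gb = employees * picture_mb / 1000.0
puts "about #{total_gb} GB transferred just to print the names"
```

The other columns of each row add a little on top, hence "more than" 30 GB.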
  12. Be Lazy
     • Instead, lazily load your object properties

         Employee.find(:all, :select => "name").each do |e|
           puts e.name
         end

     • Accessing e.picture will work by issuing another database query
     • This simple example ignores potential problems with concurrency
       • Use locking
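The load-on-first-access behaviour described on this slide can be sketched in plain Ruby. `LazyEmployee` and `PICTURE_STORE` are hypothetical names for this illustration, with a hash standing in for the picture column; this is not a description of how any particular ORM implements lazy attributes.

```ruby
# PICTURE_STORE is a hypothetical stand-in for the picture column in the
# database; LazyEmployee is an illustration only.
PICTURE_STORE = { 1 => "x" * 3_000_000 }  # roughly 3 MB of fake bitmap data

class LazyEmployee
  attr_reader :name, :picture_queries

  def initialize(id, name)
    @id = id
    @name = name
    @picture = nil
    @picture_queries = 0
  end

  # The picture is fetched on first access and cached afterwards.
  def picture
    return @picture if @picture
    @picture_queries += 1
    @picture = PICTURE_STORE.fetch(@id)
  end
end

me = LazyEmployee.new(1, "Eric Farrar")
puts me.name              # no picture data has been loaded at this point
puts me.picture_queries   # still zero extra queries
puts me.picture.bytesize  # first access triggers the one extra fetch
puts me.picture_queries
```

The cached copy is where the concurrency caveat comes from: if another client changes the picture after the name was read, the two values come from different points in time unless some form of locking is used.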
  13. Conclusions
     • ORMs and Active Records can provide large productivity advantages, typically at the expense of performance
     • ORMs should never be seen as an alternative to learning about databases (although they can be a good introduction)
     • At times, you will likely need to drop down to the database level (profiling, etc.) to diagnose problems
     • Ideally, a programmer using an ORM will always consider how their code will actually look once it hits the database
       • Similarities to a C compiler
     • You should be able to answer “Yes!” to the question, “Do you know where your queries are?”