A short history of how we got stuck with the notion that web applications require an ORM on top of an RDBMS, and an examination of the pros and cons of such a tight coupling.
Perhaps ORM isn't as natural a fit for your application as a key-value store?
This document discusses the limitations of relational databases for modern applications and real-time architectures. It describes how NoSQL databases like Aerospike can provide better performance and scalability. Specific examples are given of how Aerospike has been used to power applications in domains like advertising technology, social media, travel portals, and financial services that require high throughput, low latency access to large datasets.
The document provides an overview of Redis, including:
- Redis is an in-memory database that supports data structures like strings, lists, sets, and hashes. It is often used for caching, messaging, and building real-time applications.
- Major companies like Twitter, GitHub, and Pinterest use Redis for its speed and support for complex data types.
- Redis can be deployed in standalone, master-slave, or cluster topologies to provide redundancy, scaling, and automatic failover. Persistence to disk can be configured using snapshots or append-only files.
- Redis offers advantages over other databases and caching solutions in terms of performance, data types, scalability, and availability.
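The persistence options mentioned above map to a handful of redis.conf directives; a minimal sketch (the values shown are illustrative, not recommendations):

```conf
# RDB snapshots: dump to disk if at least 1 key changed in 900 s,
# 10 keys in 300 s, or 10000 keys in 60 s
save 900 1
save 300 10
save 60 10000

# Append-only file: log every write, fsync once per second
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
```

The two mechanisms can be combined: snapshots give compact restarts, while the AOF bounds data loss to roughly one second of writes.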
The document discusses SQL versus NoSQL databases. It provides background on SQL databases and their advantages, then explains why some large tech companies have adopted NoSQL databases instead. Specifically, it describes how companies like Amazon, Facebook, and Google have such massive amounts of data that traditional SQL databases cannot adequately meet their scale, performance, and flexibility needs. It then summarizes some popular NoSQL technologies, like Cassandra, Hadoop, and MongoDB, that were developed to solve the challenges of scaling to big data workloads.
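The schema-flexibility argument summarized above can be made concrete with stdlib tools alone; a sketch contrasting a fixed relational row with a schema-less document (the table and field names are invented for illustration):

```python
import json
import sqlite3

# Relational: columns are fixed up front; adding a field means ALTER TABLE.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")

# Document-style: each record is self-describing JSON; fields may vary per record.
docs = {}
docs["user:1"] = json.dumps({"name": "alice"})
docs["user:2"] = json.dumps({"name": "bob", "tags": ["admin"]})  # extra field, no migration

row = db.execute("SELECT name FROM users WHERE id = 1").fetchone()
print(row[0])                              # alice
print(json.loads(docs["user:2"])["tags"])  # ['admin']
```

The trade-off the talks in this list keep returning to is visible even here: the relational side gets enforced structure and queryability, the document side gets evolution without migrations.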
Lessons Learned From Running Spark On Docker (Spark Summit)
Running Spark on Docker containers provides flexibility for data scientists and control for IT. Some key lessons learned include optimizing CPU and memory resources to avoid noisy neighbor problems, managing Docker images efficiently, using network plugins for multi-host connectivity, and addressing storage and security considerations. Performance testing showed Spark on Docker containers can achieve comparable performance to bare metal deployments for large-scale data processing workloads.
This document discusses various considerations and steps for migrating from an Oracle database to PostgreSQL. It begins by explaining some key differences between the two databases regarding transactions, schemas, views, and other concepts. It then outlines the main steps of the migration process: migrating the database schema, migrating the data, migrating stored code like PL/SQL, migrating SQL statements, and migrating the application itself. Specific challenges for each step are explored, such as data type translations, handling PL/SQL, and translating Oracle-specific SQL. Finally, several migration tools are briefly described.
The document discusses Ozone, which is designed to address HDFS scalability limitations and enable trillions of file system objects. It was created as HDFS struggles with hundreds of millions of files. Ozone uses a microservices architecture of Ozone Manager, Storage Container Managers, and Recon Server to divide responsibilities and scale independently. It provides seamless transition for applications like YARN, MapReduce, Hive and Spark, and supports Kubernetes deployments. The document outlines Ozone's architecture, deployment options, write and read paths, usage similarities to HDFS/S3, enterprise-grade features around security, high availability and roadmap.
Large Scale Data Analytics with Spark and Cassandra on the DSE Platform (DataStax Academy)
In this talk we will show how large-scale data analytics can be done with Spark and Cassandra on the DataStax Enterprise Platform. First we will give an overview of the Spark Cassandra Connector and how it enables working with large data sets. Then we will use the Spark Notebook to show live, in-browser examples of interacting with the data. The example loads a large movies database from Cassandra into Spark and then shows how that data can be transformed and analyzed using Spark.
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera (Cloudera, Inc.)
While a simple key/value solution on HBase usually requires an equally simple schema, it is far less trivial to operate an application that must insert thousands of records per second.
This talk addresses the architectural challenges HBase imposes when designing for either read or write performance. It includes examples of real-world use cases and how they can be implemented on top of HBase, using schemas that optimize for the given access patterns.
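One widely used write-performance technique from this space is salting row keys, so that monotonically increasing keys do not all land in a single hot region; a simplified sketch (the salt width and key format are illustrative, not HBase API calls):

```python
import hashlib

NUM_BUCKETS = 8  # illustrative; often sized to the number of region servers

def salted_row_key(key: str) -> str:
    """Prefix the key with a stable, hash-derived salt so monotonically
    increasing keys (e.g. timestamps) are spread across regions."""
    salt = int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_BUCKETS
    return f"{salt:02d}-{key}"

# Sequential timestamp keys are distributed across salt buckets.
for s in range(3):
    print(salted_row_key(f"2011-10-08T12:00:{s:02d}"))
```

The cost, as the talk's read-vs-write framing suggests, is on the read side: a scan over a time range now has to fan out across all buckets.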
Lessons Learned from Dockerizing Spark Workloads (BlueData, Inc.)
Many initiatives for running applications inside containers have been scoped to run on a single host. Using Docker containers for large-scale production environments poses interesting challenges, especially when deploying distributed Big Data applications like Apache Spark.
Some of these challenges include container lifecycle management, smart scheduling for optimal resource utilization, network configuration and security, and performance. BlueData is “all in” on Docker containers – with a specific focus on Spark applications. They’ve learned first-hand how to address these challenges for Fortune 500 enterprises and government organizations that want to deploy Big Data workloads using Docker.
This session at Spark Summit in February 2017 (by Thomas Phelan, co-founder and chief architect at BlueData) described lessons learned as well as some tips and tricks on how to Dockerize your Big Data applications in a reliable, scalable, and high-performance environment.
In this session, Tom described how to network Docker containers across multiple hosts securely. He discussed ways to achieve high availability across distributed Big Data applications and hosts in your data center. And since we’re talking about very large volumes of data, performance is a key factor. So Tom discussed some of the storage options that BlueData explored and implemented to achieve near bare-metal I/O performance for Spark using Docker.
https://spark-summit.org/east-2017/events/lessons-learned-from-dockerizing-spark-workloads
This document discusses WANdisco's Non-Stop Hadoop solution, which provides continuous availability of Hadoop across local and wide area networks using an active-active replication technique. It addresses key problems with multi-cluster Hadoop deployments, like the lack of 100% uptime and the challenges of sharing data globally. The solution utilizes WANdisco's patented distributed coordination engine to achieve consensus across data centers for metadata operations and absolute consistency. Use cases highlighted include eliminating single points of failure, enabling parallel data ingest across locations, optimizing resource utilization through cluster zoning, and achieving near-zero-RTO disaster recovery.
The rise of NoSQL is characterized by confusion and ambiguity, much like any fast-emerging organic movement in the absence of well-defined standards and adequate software solutions. Whether you are a developer or an architect, many questions come to mind when faced with the decision of where your data should be stored and how it should be managed. Among them: What does the rise of all these NoSQL technologies mean to my enterprise? What is NoSQL to begin with? Does it mean "No SQL"? Could this be just another fad? Is it a good idea to bet the future of my enterprise on these new exotic technologies and simply abandon proven, mature Relational DataBase Management Systems (RDBMS)? How scalable is scalable? Assuming that I am sold, how do I choose the one that fits my needs best? Is there a middle ground somewhere? What is this Polyglot Persistence I hear about? The answers to these questions and many more are the subject of this talk, along with a survey of the most popular NoSQL technologies. Be there or be square.
This document provides an overview of a NoSQL Night event presented by Clarence J M Tauro from Couchbase. The presentation introduces NoSQL databases and discusses some of their advantages over relational databases, including scalability, availability, and partition tolerance. It covers key concepts like the CAP theorem and BASE properties. The document also provides details about Couchbase, a popular document-oriented NoSQL database, including its architecture, data model using JSON documents, and basic operations. Finally, it advertises Couchbase training courses for getting started and administration.
HDFS Tiered Storage: Mounting Object Stores in HDFS (DataWorks Summit)
Most users know HDFS as the reliable store of record for big data analytics. HDFS is also used to store transient and operational data when working with cloud object stores, such as Azure HDInsight and Amazon EMR. In these settings, but also in more traditional, on-premises deployments, applications often manage data stored in multiple storage systems or clusters, requiring a complex workflow for synchronizing data between filesystems to achieve goals for durability, performance, and coordination.
Building on existing heterogeneous storage support, we add a storage tier to HDFS to work with external stores, allowing remote namespaces to be "mounted" in HDFS. This capability not only supports transparent caching of remote data as HDFS blocks, it also supports synchronous writes to remote clusters for business continuity planning (BCP) and supports hybrid cloud architectures.
This idea was presented at last year’s Summit in San Jose. Lots of progress has been made since then and the feature is in active development at the Apache Software Foundation on branch HDFS-9806, driven by Microsoft and Western Digital. We will discuss the refined design & implementation and present how end-users and admins will be able to use this powerful functionality.
Polyglot Persistence - Two Great Tastes That Taste Great Together (John Wood)
The days of the relational database being a one-stop-shop for all of your persistence needs are over. Although NoSQL databases address some issues that can’t be addressed by relational databases, the opposite is true as well. The relational database offers an unparalleled feature set and rock solid stability. One cannot underestimate the importance of using the right tool for the job, and for some jobs, one tool is not enough. This talk focuses on the strength and weaknesses of both relational and NoSQL databases, the benefits and challenges of polyglot persistence, and examples of polyglot persistence in the wild.
These slides were presented at WindyCityDB 2010.
This presentation, delivered by Howard Marks at Interop in Las Vegas in May 2013, explores how system administrators can provide high-performance storage for VDI implementations.
This document provides an overview of patterns for scalability, availability, and stability in distributed systems. It discusses general recommendations like immutability and referential transparency. It covers scalability trade-offs around performance vs scalability, latency vs throughput, and availability vs consistency. It then describes various patterns for scalability including managing state through partitioning, caching, sharding databases, and using distributed caching. It also covers patterns for managing behavior through event-driven architecture, compute grids, load balancing, and parallel computing. Availability patterns like fail-over, replication, and fault tolerance are discussed. The document provides examples of popular technologies that implement many of these patterns.
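Of the state-management patterns listed in that overview, caching is the easiest to sketch in a few lines; a minimal LRU cache (the capacity and eviction policy are chosen for illustration):

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used cache: bounded memory, O(1) get/put."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now most recently used
cache.put("c", 3)      # capacity exceeded: evicts "b"
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```

This is the single-process core of the pattern; the distributed-caching variants the document surveys add partitioning and invalidation on top of the same idea.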
My talk on NoSQL at OGF29, updated with the OSCON '10 presentation. Updates do not work reliably on SlideShare, so the latest version is also available on my blog.
Have you ever heard the buzzword "big data"? Briefly described, big data means collecting massive amounts of data, extracting both the small details and the larger trends available in it, and summarizing the output to generate important insights about customers and competitors.
Enterprises seem to have sensed that something is in the air and have started to shop for technology. So what does the world have to offer enterprises with an unknown number of petabytes flowing through their systems on a daily basis? There are a few options, but few that can match the popularity of Hadoop. Hadoop can store and process large amounts of data. It has a large and diverse toolset for integration, operations, and processing, and it is open source!
The document discusses running Hadoop clusters in the cloud and the challenges that presents. It introduces CloudFarmer, a tool that allows defining roles for VMs and dynamically allocating VMs to roles. This allows building agile Hadoop clusters in the cloud that can adapt as needs change without static configurations. CloudFarmer provides a web UI to manage roles and hosts.
This document summarizes a talk about Facebook's use of HBase for messaging data. It discusses how Facebook migrated data from MySQL to HBase to store metadata, search indexes, and small messages in HBase for improved scalability. It also outlines performance improvements made to HBase, such as for compactions and reads, and future plans such as cross-datacenter replication and running HBase in a multi-tenant environment.
Hadoop Operations - Best Practices from the Field (DataWorks Summit)
This document discusses best practices for Hadoop operations based on analysis of support cases. Key learnings include using HDFS ACLs and snapshots to prevent accidental data deletion and improve recoverability. HDFS improvements like pausing block deletion and adding diagnostics help address incidents around namespace mismatches and upgrade failures. Proper configuration of hardware, JVM settings, and monitoring is also emphasized.
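The ACL and snapshot practices mentioned correspond to standard HDFS command-line operations; a sketch of the workflow (the paths and user names are illustrative):

```shell
# Enable snapshots on a directory, then take one before risky operations
hdfs dfsadmin -allowSnapshot /data/warehouse
hdfs dfs -createSnapshot /data/warehouse before-cleanup

# Accidentally deleted files remain readable under the snapshot
hdfs dfs -ls /data/warehouse/.snapshot/before-cleanup

# Grant fine-grained access with an ACL instead of loosening directory modes
hdfs dfs -setfacl -m user:etl:rwx /data/warehouse
```

Snapshots are cheap to create because they only record metadata; blocks are retained as long as any snapshot references them.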
This document presents an introduction to NoSQL databases. It begins with an overview comparing SQL and NoSQL databases, describing the architecture of NoSQL databases. Examples of different types of NoSQL databases are provided, including key-value stores, column family stores, document databases and graph databases. MapReduce programming is also introduced. Popular NoSQL databases like Cassandra, MongoDB, HBase, and CouchDB are described. The document concludes that NoSQL is well-suited for large, highly distributed data problems.
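The MapReduce model introduced in that overview can be sketched in plain Python, with map and reduce as ordinary functions (the driver loop below stands in for the distributed framework):

```python
from collections import defaultdict
from itertools import chain

def map_phase(document: str):
    """Mapper: emit a (word, 1) pair for every word in the document."""
    for word in document.lower().split():
        yield (word, 1)

def reduce_phase(word, counts):
    """Reducer: sum all the counts emitted for one word."""
    return (word, sum(counts))

documents = ["the quick brown fox", "the lazy dog", "the fox"]

# Shuffle step: group intermediate pairs by key, as the framework would.
grouped = defaultdict(list)
for word, one in chain.from_iterable(map_phase(d) for d in documents):
    grouped[word].append(one)

result = dict(reduce_phase(w, c) for w, c in grouped.items())
print(result["the"])  # 3
print(result["fox"])  # 2
```

In a real deployment the mappers and reducers run on many machines and the shuffle moves data over the network; the program structure, however, is exactly this.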
Dave Shuttleworth - Platform performance comparisons, bare metal and cloud ho... (huguk)
Choosing the right database technology and deployment platform can have a major impact on performance and total cost of ownership in production environments. Using the industry TPC-DS benchmark, Dave will present findings from a performance and TCO comparison of EXASOL on dedicated servers, Bigstep bare metal cloud and AWS. The presentation will be a tutorial on performance benchmarking.
Ruby on Rails (RoR) as a back-end processor for Apex (Espen Brækken)
This document discusses using Ruby and Ruby on Rails (RoR) as a supplement to Oracle Application Express (Apex). It provides an overview of why a supplement may be needed, why Ruby and Rails were chosen, and how ActiveRecord in Rails simplifies database access through object mapping. Key points covered include conventions over configuration in Rails, the anatomy of Rails including ActiveRecord, and examples of ActiveRecord usage with database configuration through YAML files rather than direct connection hashes.
Why Traditional Databases Fail so Miserably to Scale with E-Commerce Site Growth (Clustrix)
Traditional SQL database scaling in e-commerce is a difficult, tedious, labor-intensive, and ultimately unsustainable process. Many DBAs and IT organizations have come to the conclusion that traditional SQL databases, like MySQL, fundamentally cannot keep up and scale with the explosive growth of e-commerce. They say it's just too unwieldy and costly because SQL databases were not designed to truly scale, especially to e-commerce cloud scale. Yet there are plenty of database professionals who hold the contrarian view that anyone who believes traditional databases don't scale simply lacks the knowledge, experience, and expertise to actually make them do so.
So who's right? Do traditional SQL databases have an e-commerce cloud scale issue or not?
During this webinar Marc Staimer, President and CDS of Dragon Slayer Consulting and Tony Barbagallo, Chief Marketing Officer for Clustrix, will examine this issue in detail, how traditional SQL databases scale, common workarounds to known e-commerce cloud scale problems, e-commerce scaling requirements, and organizational tolerance for manually labor-intensive sweat equity.
Please watch the recording of this lively, entertaining, and educational discussion: https://www.brighttalk.com/webcast/7485/128253
This document provides an introduction to assembler programming on IBM System z mainframes. It discusses why programmers may choose to use assembler, prerequisites for assembler programming, an overview of the z/Architecture including registers and memory organization. It also covers binary and hexadecimal number representation, and base-displacement addressing which allows specifying a memory address using a register and offset value. The intended audience is those with basic computer programming and z/OS knowledge.
The document provides an overview of the Aerospike architecture, including the client, cluster, storage, primary and secondary indexes, RAM, flash storage, and cross datacenter replication (XDR). The Aerospike architecture aims to handle extremely high read/write rates over persistent data at low latency while ensuring consistency and scalability across datacenters with no downtime.
This document discusses topics related to NoSQL data management and distribution models in big data analytics. It covers key-value and document data models, as well as graph databases and schema-less databases. It then describes several distribution models including single server, sharding, master-slave replication, peer-to-peer replication, and combining sharding and replication. Specific examples of these models in MongoDB and Cassandra are provided. The next session will cover Cassandra's data model.
What enterprises can learn from Real Time BiddingAerospike
Brian Bulkowski, CTO of Aerospike, the NoSQL database, discusses the software architecture pioneered in cutting edge advertising optimizations companies in 2008, made popular between 2009 and 2013, and now becoming more widely used in Financial Services, Retail, Social Media, Travel companies, and others. This new technology architecture focuses on multiple big data analytics sources - HDFS based batch engines, using Hadoop, Hive, Hbase, Vertica, Spark, and others depending on analysis and query patterns - with an operational and application layer. The operational application level consists of new internet application stacks, such as Node.js, Nginx, Jetty, Scala, and Go, and in-memory NoSQL databases such as MongoDB, Cassandra, and Aerospike.
Specific recommendations regarding building a high-performance operational layer are presented. In particular, focusing on primary-key access at the operational layer, using Flash for the random in-memory nosql layer, and the benefits of Open Source were presented.
This presentation was given at the Big Data Gurus meetup in Santa Clara, CA, on July 29, 2014. http://www.meetup.com/BigDataGurus/
This document discusses how real-time bidding platforms achieve operational big data capabilities. It describes how real-time bidding platforms process billions of events per second to conduct auctions in under 100 milliseconds. It also outlines several lessons learned from real-time bidding that can be applied to other systems requiring operational big data capabilities, such as using an in-memory NoSQL database and optimizing for fast reads and writes.
A brave new world in mutable big data relational storage (Strata NYC 2017)Todd Lipcon
The ever-increasing interest in running fast analytic scans on constantly updating data is stretching the capabilities of HDFS and NoSQL storage. Users want the fast online updates and serving of real-time data that NoSQL offers, as well as the fast scans, analytics, and processing of HDFS. Additionally, users are demanding that big data storage systems integrate natively with their existing BI and analytic technology investments, which typically use SQL as the standard query language of choice. This demand has led big data back to a familiar friend: relationally structured data storage systems.
Todd Lipcon explores the advantages of relational storage and reviews new developments, including Google Cloud Spanner and Apache Kudu, which provide a scalable relational solution for users who have too much data for a legacy high-performance analytic system. Todd explains how to address use cases that fall between HDFS and NoSQL with technologies like Apache Kudu or Google Cloud Spanner and how the combination of relational data models, SQL query support, and native API-based access enables the next generation of big data applications. Along the way, he also covers suggested architectures, the performance characteristics of Kudu and Spanner, and the deployment flexibility each option provides.
“NoSQL” is more than just a popular industry buzzword. It’s been 17 years since Carlo Strozzi coined the neologism. The concept though, is older than that. In fact, the world had only NoSQL databases long before SQL was invented in the 1970s, thus proving the adage everything old is new again. In recent years a new wave of enterprise-class NoSQL databases have come to the fore to challenge the supremacy of Relational Database Management Systems (RDBMS). One such example is Aerospike, a Flash/SSD-optimized key-value store. In his presentation, Aerospike’s Director of Application Engineering, Peter Milne, will take you through a deep dive comparison. What are the conceptual differences between NoSQL and RDBMS? Why consider one versus the other for your use case? What data modeling, architectural best practices and practical migration steps should you apply to smoothly transition your business to the new world of NoSQL?
This document provides an overview of Dynamo, Amazon's highly available key-value store. It discusses that Dynamo sacrifices strong consistency for availability and uses an eventually consistent model. Dynamo uses consistent hashing to partition and replicate data across nodes, and employs data versioning to resolve conflicts from concurrent updates. The system architecture of Dynamo includes components for data partitioning, replication, versioning, APIs, and membership management.
This document contains slides from a presentation on writing better MySQL queries for beginners. The presentation covers SQL history and syntax, data storage and types, table design, indexes, query monitoring and optimization. It emphasizes selecting only necessary columns, using appropriate data types, normalizing tables, indexing columns used in WHERE clauses, and monitoring queries to optimize performance. Resources for learning more about MySQL are provided at the end.
Presentation given during a tour of Australia, in May 2009. The targeted audience are people who are already familiar with the fundamentals of Semantic Web, and this presentation gives an overview of what is happening at W3C
[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim StarkeyInsight Technology, Inc.
I’ve championed or developed four distinct disruptive technologies in database management. I started working on databases for the ARPAnet - the precursor of the Internet which had 47 nodes and was the largest network on earth. I advocated relational technology when it was considered an academic curiosity and introduced a new concurrency control technology that made consistency practical. More recently I created a radically new architecture for distributed ACID SQL databases. Now, my project is a critical re-evaluation of where we are, how we got here, and where we should be going. It’s going to be a wild ride.
Boston meetup : MySQL Innodb Cluster - May 1st 2017Frederic Descamps
MySQL InnoDB Cluster provides an easy-to-use high availability solution for MySQL databases. It utilizes Group Replication, which replicates data across multiple database nodes to provide redundancy and automatic failover. This allows the database to continue operating even if individual nodes fail. MySQL InnoDB Cluster handles replication, provisioning, and failover automatically without complex configuration needed for traditional asynchronous replication topologies.
Recommendation engine using Aerospike and/OR MongoDBPeter Milne
Recommendations are used through out the online world to recommend to a user other products or service they may be interested in. For example, an e commerce site may recommend other products for sale, or an online entertainment site, like Hulu or NetFlicks, may offer entertainment recommendations. This presentation discusses a very elementary recommendation engine implemented in Aerospike and MongoDB
The document discusses some of the shortcomings of object-relational mapping (ORM) frameworks. It argues that ORMs can lead developers to go fast initially but end up going slow, as ORMs don't allow for efficient database use and can result in poor code quality. The document also explains that relational data is about subsets of data, not individual objects, and that developers should take control of SQL and understand database concepts rather than relying solely on ORM frameworks.
This document discusses CouchDB, a database that stores data in JSON documents without a predefined schema. It notes that while relational databases require upfront schema design, CouchDB stores isolated document records without a schema, which allows for flexibility. The document also introduces Jan Lehnardt as a project member and web developer at Apache who can answer questions about CouchDB.
Microservice Teams - How the cloud changes the way we workSven Peters
A lot of technical challenges and complexity come with building a cloud-native and distributed architecture. The way we develop backend software has fundamentally changed in the last ten years. Managing a microservices architecture demands a lot of us to ensure observability and operational resiliency. But did you also change the way you run your development teams?
Sven will talk about Atlassian’s journey from a monolith to a multi-tenanted architecture and how it affected the way the engineering teams work. You will learn how we shifted to service ownership, moved to more autonomous teams (and its challenges), and established platform and enablement teams.
Takashi Kobayashi and Hironori Washizaki, "SWEBOK Guide and Future of SE Education," First International Symposium on the Future of Software Engineering (FUSE), June 3-6, 2024, Okinawa, Japan
Hi, my name is Ronen Botzer, and if my shirt didn't give it away already, I'm an engineer at Aerospike. I help bring the clients for dynamic languages to a fully featured state. I wanted to thank Ed and Diana from SouthBay PHP and Claire from Scale Warriors of Silicon Valley for organizing this meetup.
I've worked in a stream of startups since 1999. First as an engineer in several web teams. Next as a data engineer at an ad-network, a social network, and a mobile SDK vendor (Appcelerator), where we were handling larger volumes of data each time. Most recently I was the architect for a mobile parking startup called QuickPay.
I have a hard time staying serious. You probably noticed that the abstract for this talk invoked a pseudo-relationship metaphor. Objects and relationships, get it? Yeah, that gets tired as a joke fairly quickly.
I am guilty of doing pretty much everything I'm going to mention in this talk.
We can expand on it over beers later, but here goes
You fill in girlfriend/boyfriend as you see fit
Ok, fine, there are also names like Hibernate for the subdued "let's watch a period piece movie on the couch" metaphor. But in general, your ORM slows down your data access with its code abstractions. It produces suboptimal performance.
Can someone decipher the acronym?
How did we get stuck with applications "needing" an ORM layer above an RDBMS?
An interesting guy who earned degrees in math and chemistry, served as an RAF pilot in WW2, and then worked for IBM as an actual ‘programmer’. He gets his PhD at UMich, and moves to IBM’s San Jose Research Center.
Later renamed SQL due to trademark issues.
A bad decision in hindsight: during the early 80s Oracle’s SQL proves more popular, and is one reason Ingres slips in share of deployments.
A post-Ingres database that includes his insights about system architecture.
Who knows what's so special about Mosaic? Yes! a GUI!
Can someone please translate WORA? Write Once Run Anywhere.
Some software architects quit their day jobs and settle on the profitable business of spreading the Gospel of Patterns. Writing voluminous books, they continue to unearth new ways for us all to get more enterprise-y.
It was like Invasion of The Body Snatchers. Everybody you knew in the late 90s had at least one book about design patterns. If you acted like you found those cumbersome or a waste of time, they’d point and go <hhhhhhh>
Seriously, if a language is lucky and has one gorilla show up early (like ActiveRecord along with Ruby on Rails) it will have ONLY 4-5 ORMs. Java, PHP? You’re in the dozens of known ORMs. Likely one ORM per startup that never got published. There are well over 100 published ones.
Laravel has Eloquent ORM, and I’ve just seen a Twitter shitstorm of people critiquing it and threatening Taylor Otwell with statements like "You haven't seen what I've been working on the last four years… Be scared, be very scared". Well, I personally am scared.
Which is one tell-tale sign that such a database is not designed to scale.
An ORM user who is 'non-SQL' is a unicorn. If you see one, grab him by the bald head, rub it, and demand three wishes. A leaky abstraction refers to the way implementation details become visible through the abstraction when the operation is more complex (Joel Spolsky).
Schemas are rigid, so developers need to have an understanding of SQL, DML, and DDL. You can’t change a schema as easily as an object - you need migrations. Therefore you learn SQL anyway.
For example, ‘the best’ ORMs support a fluent style, which is basically pseudo-SQL.
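As an illustration, here is a toy fluent builder - entirely hypothetical, not any particular ORM's API - showing that a fluent call chain is really just SQL being assembled piecewise:

```python
# A toy "fluent" query builder. Each method returns self, which is what
# makes chaining possible; the end product is plain SQL text anyway.

class Query:
    def __init__(self, table):
        self.table = table
        self.columns = ["*"]
        self.conditions = []

    def select(self, *cols):
        self.columns = list(cols)
        return self  # returning self enables the fluent chain

    def where(self, condition):
        self.conditions.append(condition)
        return self

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql

q = Query("users").select("id", "name").where("age > 21").where("active = 1")
print(q.to_sql())
# SELECT id, name FROM users WHERE age > 21 AND active = 1
```

The fluent chain reads almost word-for-word like the SQL it emits, which is the point: the abstraction isn't hiding SQL, it's restating it.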
Something that has to run against some unknown data store. The web applications with the highest loads (Facebook, Twitter, MediaWiki) are not portable.
Shocking fact: RDBMSs do not implement SQL the same way.
Sometimes really good ones like hints for the query optimizer.
When an ORM hides direct access to a database feature, then re-implements it on the application side. Auto-increment is a classic example.
Your app is not just dependent on your database, but also on the DBAL. Now you can have data access and persistence bugs in two places, and you have to worry whether two technologies you depend on will become obsolete.
Show of hands - does anybody think that switching a web application from one database to the next is trivial if you use a DBAL?
ORMs are overhead, see the previous SELECT wrapper example.
Allegedly you don’t need a specialist such as a DBA if you use an ORM. But if you want to figure out why your ORM is generating slow queries you’ll need to get one, or become an expert. At which point you need to hope your ORM makes it easy to switch to a raw query.
Queries fetch individual relationships in a loop when we have a many-to-one relationship, instead of properly using a join. This is the classic N+1 query problem.
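The pattern can be sketched with sqlite3 and an invented two-table schema: the loop issues one query per row, while a join gets the same result in a single round trip.

```python
import sqlite3

# Sketch of the N+1 pattern an ORM often generates for a many-to-one
# relationship, versus a single JOIN. The schema is invented.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT,
                        author_id INTEGER REFERENCES authors(id));
    INSERT INTO authors VALUES (1, 'ada'), (2, 'alan');
    INSERT INTO posts VALUES (1, 'On Engines', 1), (2, 'On Machines', 2);
""")

# N+1: one query for the posts, then one query per post for its author.
posts = db.execute("SELECT id, title, author_id FROM posts").fetchall()
n_plus_one = []
for _, title, author_id in posts:
    (name,) = db.execute(
        "SELECT name FROM authors WHERE id = ?", (author_id,)
    ).fetchone()
    n_plus_one.append((title, name))   # 1 + N round trips total

# Proper JOIN: one round trip for the same result.
joined = db.execute("""
    SELECT p.title, a.name FROM posts p JOIN authors a ON a.id = p.author_id
""").fetchall()

assert n_plus_one == joined
```

With two posts this is three queries versus one; with ten thousand posts it's ten thousand and one, which is how "the ORM made it easy" turns into "the database is slow".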
The philosophy and behavior of objects and RDBMSs doesn’t match up well. This is a big topic.
Some ORMs like ActiveRecord do address class hierarchies with a ‘type’ column.
Data structures on the application side are not flat - lists, maps, sets, etc. Complex data types are poorly expressed in RDBMSs.
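A small sketch of the mismatch, using sqlite3 and an invented schema: a plain list on the application side needs a child table on the relational side, while a key-value record (a plain dict here, standing in for something like an Aerospike record with a list bin) holds it natively.

```python
import sqlite3

# Relational modelling of a user's tag list needs a child table
# and a query to reassemble the list...
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_tags (user_id INTEGER, tag TEXT);
    INSERT INTO users VALUES (1, 'ronen');
    INSERT INTO user_tags VALUES (1, 'nosql'), (1, 'php'), (1, 'flash');
""")
tags = [t for (t,) in db.execute(
    "SELECT tag FROM user_tags WHERE user_id = 1")]

# ...while a key-value record holds the list as a first-class value.
# (A plain dict is shown here as a stand-in for a NoSQL record.)
record = {"name": "ronen", "tags": ["nosql", "php", "flash"]}

assert sorted(tags) == sorted(record["tags"])
```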
If you don't know what first, second, third, and Boyce-Codd normal forms are, I won't get into them in detail now.
Unless it wants to melt down early and often. Fail whales abound.
MediaWiki example - direct access to MySQL and PostgreSQL
If you’ve been viewing SQL as something ugly you need to abstract away from, hello! Use NoSQL. Your persistence operations become much closer to how your application side objects behave
Stop making up for the inadequacies of your database by manually clustering. Remove sharding logic from your application. Again, faster and leaner application on top of a faster, leaner stack.
* Doesn't hurt to have founders who have been at a big internet company and saw first hand how massive loads bring traditional databases to their knees.
* What design works:
With new hardware, new designs can perform better. Always develop with the current and upcoming hardware in mind.
Optimize for those hardware advances - utilize all cores, make optimal use of memory, use the best disk access patterns (low-IOPS bulk reads, for example).
VoltDB built an in-memory database with a SQL dialect, but its durability design still assumes rotational disks for storage. Redis runs fully in-memory with weak durability (the default AOF policy fsyncs once per second).
Aerospike utilizes flash as storage and additional memory in a hybrid system with DRAM, because it is the most economical way to scale, while keeping speed advantage and predictable low latency as a feature.
Aerospike is a real distributed database. Clustering was built in from the very beginning and is core to the operation and performance of the database. It is not an after-the-fact, bolted-on feature.
masterless with replication
Smart client connects and learns about the cluster topology. It only needs a single IP address.
Records are identified by a RIPEMD-160 digest of the primary key, so index entries are always 20 bytes.
The client knows the partition map and will seek to write to the master and replica partitions synchronously.
The client knows which partition to read a record from, and knows where the replica is for failover.
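The routing just described can be sketched like this. The real client uses RIPEMD-160 and 4096 partitions; SHA-256 stands in here because ripemd160 is missing from some Python hashlib builds, and the partition map itself is invented for illustration.

```python
import hashlib

# Sketch of client-side partition routing: hash the primary key to a
# digest, take a fixed number of bits as the partition ID, then look up
# the owning nodes in the partition map learned from the cluster.
# (SHA-256 substitutes for RIPEMD-160, which some hashlib builds lack.)

N_PARTITIONS = 4096

def partition_id(set_name: str, key: str) -> int:
    digest = hashlib.sha256(f"{set_name}:{key}".encode()).digest()
    # 12 bits of the digest are enough to address 4096 partitions.
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS

# A hypothetical partition map: partition ID -> (master node, replica).
partition_map = {p: (f"node{p % 3}", f"node{(p + 1) % 3}")
                 for p in range(N_PARTITIONS)}

pid = partition_id("users", "ronen")
master, replica = partition_map[pid]
assert 0 <= pid < N_PARTITIONS
assert master != replica  # reads can fail over to the replica
```

Because the client computes the partition itself, every read or write goes straight to the right node in one hop - there is no query router, and no sharding logic in the application.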
Stop writing sharding logic
Scaling in DRAM only is simply not economical.
SSDs are formatted, no filesystem, memory mapped I/O.
Raw device, direct access pattern.
Indexes are kept in DRAM to save an extra IOP. Enterprise feature: fast restart (shared memory).
Records ('rows') in sets ('tables') are kept contiguous for efficient bulk reads.
Each time a write happens the record is written to a new block, with the previous one marked for GC. This ensures even wear on the flash drive.
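A minimal sketch of that copy-on-write scheme - a toy model for illustration, not Aerospike's actual defragmentation logic:

```python
# Every update goes into a fresh block; the old copy is only marked as
# garbage, so writes spread evenly across the device instead of
# rewriting the same cells in place.

class FlashStore:
    def __init__(self):
        self.blocks = []     # append-only list of (key, value) blocks
        self.index = {}      # key -> block number (kept "in DRAM")
        self.garbage = set() # block numbers awaiting defragmentation

    def write(self, key, value):
        if key in self.index:
            self.garbage.add(self.index[key])  # old copy marked for GC
        self.blocks.append((key, value))       # always a fresh block
        self.index[key] = len(self.blocks) - 1

    def read(self, key):
        return self.blocks[self.index[key]][1]

store = FlashStore()
store.write("k", "v1")
store.write("k", "v2")
assert store.read("k") == "v2"
assert store.garbage == {0}  # the first block is now reclaimable
```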