Lessons from a Dying CMS
Sandy Smith
sfsmith.com
@SandyS1
huh?
lesson 1: history matters
why it was created
• clients: text-heavy policy shops
• needed topic-based lists (no silos)
• dates to 1998 (MySQL 3), no VC money
solution: the records table
advantages
SELECT * FROM records WHERE topic = 'health'
• Gets everything in the health topic
SELECT * FROM records WHERE topic = 'health'
 AND datatype = 5
• Gets every document in the health
  topic
disadvantages
• HUGE table
• Self-joins are expensive
• Q: why don’t you fix it?
 A: Upgrades. <shudder>
     Also, did I mention no VCs?
configuration in the database

• Metadata about content types
  stored in 5 tables
• Coupled to content metadata like
  taxonomies
• Deploying changes was hard
what I’d do now
• Normalize data, write converter
• Write manager to create
  normalized tables & columns
• Ensure everything has a title
  (Drupal does this...but still has a
  god table)
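A minimal sketch of what "normalize data" could look like, assuming the pdo_sqlite extension is available. Every table and column name here is hypothetical and not taken from the original CMS. It gives each content type its own table with a required title and keeps topic-based lists through join tables, so "everything in the health topic" is still one query.

```php
<?php
// Assumes pdo_sqlite; all table and column names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// One table per content type, each with a required title, plus topic join tables.
$db->exec('CREATE TABLE topics (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)');
$db->exec('CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT)');
$db->exec('CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT NOT NULL, starts_on TEXT)');
$db->exec('CREATE TABLE document_topics (document_id INTEGER NOT NULL, topic_id INTEGER NOT NULL)');
$db->exec('CREATE TABLE event_topics (event_id INTEGER NOT NULL, topic_id INTEGER NOT NULL)');

// Made-up sample rows.
$db->exec("INSERT INTO topics (id, name) VALUES (1, 'health'), (2, 'energy')");
$db->exec("INSERT INTO documents (id, title) VALUES (1, 'Clinic funding report'), (2, 'Grid upgrade memo')");
$db->exec("INSERT INTO events (id, title) VALUES (1, 'Health policy forum')");
$db->exec('INSERT INTO document_topics VALUES (1, 1), (2, 2)');
$db->exec('INSERT INTO event_topics VALUES (1, 1)');

/** Lists every titled item in a topic, across content types, ordered by title. */
function titlesInTopic(PDO $db, string $topic): array
{
    $sql = "
        SELECT 'document' AS type, d.title
        FROM documents d
        JOIN document_topics dt ON dt.document_id = d.id
        JOIN topics t ON t.id = dt.topic_id
        WHERE t.name = :doc_topic
        UNION ALL
        SELECT 'event' AS type, e.title
        FROM events e
        JOIN event_topics et ON et.event_id = e.id
        JOIN topics t ON t.id = et.topic_id
        WHERE t.name = :event_topic
        ORDER BY title
    ";
    $stmt = $db->prepare($sql);
    $stmt->execute([':doc_topic' => $topic, ':event_topic' => $topic]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

foreach (titlesInTopic($db, 'health') as $row) {
    echo $row['type'], ': ', $row['title'], "\n";
}
```

The cross-type listing only works this simply because every type has a title column, which is the point of the "ensure everything has a title" bullet.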
Lesson 2: Decide what business you’re in
lesson 2: What business are you in?
your own CMS: Awesome!

• you have the features you need*
• not at mercy of “that idiot”*
• you gain experience
why your CMS sucks

• you have the features you need*
  – *that you can afford

• care and feeding
• nobody’s going to help you*
• *“that idiot” is you
“what am I payin’ you for?”

1. website?
2. CMS?
what I’d do now

• Use an open source CMS
  (switched to Drupal in 2008)
• ...or at least use a framework
Lesson 3: Beware Lone Wolves
lesson 3: beware lone wolves

l’enfant terrible: fait accompli
• Team had been discussing a common
  approach
• Started to use & critique data layer
  written by one programmer
• On one project, he wrote a CMS
  after hours in 4 weeks
• Exec killed group project in favor of
  the ready-made CMS
2+ minds are better than 1
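[Diagram: layers of the Syntax CMS Web Platform, listed top to bottom: Custom Module, Pre-built Module, Site Services, Content Services, Developer Services (SDK), Unified Content Model. A side axis labeled "Discussion" and "Quality" runs from "-" at the top to "+" at the bottom.]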
what I’d do now

• Use an open source CMS
• ...or at least use a framework
• Tight control of junior programmers
• Remind execs of lost revenue
Lesson 4: Highly Coupled == Highly Crappy
lesson 4: highly coupled == highly crappy

uploading a file
 class pxdb_input extends pxdb_confront
 {
     function import($source = null)
     {
         parent::import($source);

         // uploaded files present somewhat of a special exception
         // that must be handled separately.
         $this->_import_uploaded_files();
     }
 }
uploading a file
 class pxdb_input extends pxdb_confront
 {
     function import($source = null)
     {
         parent::import($source); // <- only exists in parent; not called anywhere else

         // uploaded files present somewhat of a special exception
         // that must be handled separately.
         $this->_import_uploaded_files();
     }
 }
Are these really similar?
Inheritance

• Are generating forms, generating
  widgets, validating input, and
  writing data to the DB all the same
  type of action?

• They all use data, but they aren’t
  data.

• WTF is confront anyway?
uploading a file
 class pxdb_input extends pxdb_confront
 {
     function import($source = null)
     {
         // parent::import() calls pxdb::import()
         parent::import($source);

         // uploaded files present somewhat of a special exception
         // that must be handled separately.
         $this->_import_uploaded_files();
     }
 }
where is pxdb::import()?
static, singleton, & new
• “Couple” a method to other
  classes
• What if you want to do something
  different in one case?
• Increases complexity, makes
  debugging harder
• Increases rigidity
what I’d do now
• Composition
  – pass data in as needed
  – pass widgets to form generator
  – pass validated data to model
• Separation of responsibility
  – controller imports data, hands to form
  – separate data model and data store
    (data mapper)
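The bullets above could be sketched roughly as follows. All class and method names are hypothetical and are not the original CMS's API. The controller takes the request data, hands it to a validator, passes only validated data to a mapper, and gives widgets to a form generator. None of the collaborators inherit from each other.

```php
<?php
// Hypothetical classes for illustration; none of these names come from the original CMS.

final class Validator
{
    private array $rules;   // field name => callable(string): bool

    public function __construct(array $rules)
    {
        $this->rules = $rules;
    }

    /** Returns only the validated fields, or null if any field is missing or invalid. */
    public function clean(array $input): ?array
    {
        $clean = [];
        foreach ($this->rules as $field => $rule) {
            if (!isset($input[$field]) || !$rule((string) $input[$field])) {
                return null;
            }
            $clean[$field] = (string) $input[$field];
        }
        return $clean;
    }
}

final class FormGenerator
{
    private array $widgets;   // field name => callable(string $name, string $value): string

    public function __construct(array $widgets)
    {
        $this->widgets = $widgets;
    }

    public function render(array $values): string
    {
        $html = '';
        foreach ($this->widgets as $name => $widget) {
            $html .= $widget($name, (string) ($values[$name] ?? ''));
        }
        return "<form>$html</form>";
    }
}

final class ArticleMapper   // data mapper: the only class that knows about storage
{
    public array $rows = [];   // stands in for a database table

    public function save(array $data): int
    {
        $this->rows[] = $data;
        return count($this->rows);   // pretend auto-increment id
    }
}

final class ArticleController
{
    private Validator $validator;
    private FormGenerator $form;
    private ArticleMapper $mapper;

    public function __construct(Validator $validator, FormGenerator $form, ArticleMapper $mapper)
    {
        $this->validator = $validator;
        $this->form = $form;
        $this->mapper = $mapper;
    }

    public function handle(array $request): string
    {
        $clean = $this->validator->clean($request);
        if ($clean !== null) {
            return 'saved #' . $this->mapper->save($clean);   // only validated data reaches the model
        }
        return $this->form->render($request);   // redisplay the form with the submitted values
    }
}

$textInput = fn(string $name, string $value): string =>
    '<input name="' . $name . '" value="' . htmlspecialchars($value) . '">';
$mapper = new ArticleMapper();
$controller = new ArticleController(
    new Validator(['title' => fn(string $v): bool => trim($v) !== '']),
    new FormGenerator(['title' => $textInput]),
    $mapper
);
echo $controller->handle(['title' => 'Hello']), "\n";
```

Handling uploaded files differently in one case then means passing a different collaborator to the controller, not overriding a method several levels up an inheritance chain.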
what I’d do now

• Use configuration
  – enables changes based on environment
• Use a registry
  – configurable way to inject classes
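One possible shape for a configuration-driven registry, as a sketch only. The Registry class, the FileStore interface, and the config keys are all assumptions made for illustration. The idea is that configuration picks the class for a role, so code asks the registry for "the file store" and never calls new or a static method on a concrete class.

```php
<?php
// Hypothetical registry: maps a role name to a factory chosen by configuration.
final class Registry
{
    private array $factories = [];
    private array $instances = [];

    public function register(string $role, callable $factory): void
    {
        $this->factories[$role] = $factory;
        unset($this->instances[$role]);   // drop any instance built by an older factory
    }

    public function get(string $role): object
    {
        if (!isset($this->factories[$role])) {
            throw new InvalidArgumentException("Nothing registered for '$role'");
        }
        // Build on first use, then reuse the same instance for the rest of the request.
        return $this->instances[$role] ??= ($this->factories[$role])($this);
    }
}

interface FileStore
{
    public function put(string $name, string $bytes): string;
}

final class DiskStore implements FileStore
{
    private string $dir;

    public function __construct(string $dir)
    {
        $this->dir = $dir;
    }

    public function put(string $name, string $bytes): string
    {
        file_put_contents($this->dir . '/' . basename($name), $bytes);
        return "disk:$name";
    }
}

final class MemoryStore implements FileStore
{
    public array $files = [];

    public function put(string $name, string $bytes): string
    {
        $this->files[$name] = $bytes;
        return "memory:$name";
    }
}

// Configuration decides which class fills the role in each environment.
$config = [
    'env' => 'test',
    'file_store' => [
        'production' => fn() => new DiskStore(sys_get_temp_dir()),
        'test' => fn() => new MemoryStore(),
    ],
];

$registry = new Registry();
$registry->register('file_store', $config['file_store'][$config['env']]);

echo $registry->get('file_store')->put('a.txt', 'hello'), "\n";
```

Switching environments changes one config value; the calling code stays the same.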
lesson 5: cheap & easy hierarchies

the problem

• You want to organize things in
  hierarchical categories, e.g.:
   Asia
   Asia/South Asia
   Asia/Central Asia
   Asia/East Asia
   Asia/South Asia/India
first solution: adjacency list

  id   name           parent weight
   1   Asia
   2   South Asia       1       1
   3   East Asia        1       2
   4   Central Asia     1       3
   5   India            2
pluses

• Reordering is easy
• Insertion is easy
• Getting immediate children of a
  parent is easy
• Conceptually simple*
  – *if you know databases
minuses

• Some common operations require
  recursive functions
 – Reading an entire branch of the tree
 – Reading all ancestors of an entry
• Those operations are pretty
  common in breadcrumbs and
  menus
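To make the recursion concrete, here is a small sketch over the rows from the adjacency-list slide, held in a PHP array. The function names are hypothetical, and the blank weights for Asia and India are assumed to be 0. Against a real table, each level of recursion would typically be another query, which is where the cost comes from.

```php
<?php
// Rows from the adjacency-list slide, keyed by id; blank weights assumed to be 0.
$rows = [
    1 => ['name' => 'Asia', 'parent' => null, 'weight' => 0],
    2 => ['name' => 'South Asia', 'parent' => 1, 'weight' => 1],
    3 => ['name' => 'East Asia', 'parent' => 1, 'weight' => 2],
    4 => ['name' => 'Central Asia', 'parent' => 1, 'weight' => 3],
    5 => ['name' => 'India', 'parent' => 2, 'weight' => 0],
];

/** Names from the root down to $id, as a breadcrumb would need them. */
function breadcrumb(array $rows, int $id): array
{
    $parent = $rows[$id]['parent'];
    $trail = $parent === null ? [] : breadcrumb($rows, $parent);   // climb one level per call
    $trail[] = $rows[$id]['name'];
    return $trail;
}

/** Names of every descendant of $id, depth-first, siblings ordered by weight. */
function branch(array $rows, int $id): array
{
    $children = array_filter($rows, fn($row) => $row['parent'] === $id);
    uasort($children, fn($a, $b) => $a['weight'] <=> $b['weight']);   // keeps ids as keys

    $names = [];
    foreach ($children as $childId => $child) {
        $names[] = $child['name'];
        $names = array_merge($names, branch($rows, $childId));   // descend one level per call
    }
    return $names;
}

echo implode(' > ', breadcrumb($rows, 5)), "\n";
echo implode(', ', branch($rows, 1)), "\n";
```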
DBA answer: Nested Sets
 id title          lft   rgt
  1 Asia             1    10
  2 South Asia       2     5
  3 India            3     4
  4 East Asia        6     7
  5 Central Asia     8     9
pluses

• Most reads can be done with a
  single query
• Relies on fast numerical lookups
• Widely understood among DBA
  types
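A sketch of why reads are cheap with nested sets, using the rows from the slide above. The function names and the table name "tree" in the comments are hypothetical. The filtering is done on a PHP array here so the example runs without a database; the comments show the single SQL query each function stands in for.

```php
<?php
// Rows copied from the nested-sets slide.
$rows = [
    ['id' => 1, 'title' => 'Asia', 'lft' => 1, 'rgt' => 10],
    ['id' => 2, 'title' => 'South Asia', 'lft' => 2, 'rgt' => 5],
    ['id' => 3, 'title' => 'India', 'lft' => 3, 'rgt' => 4],
    ['id' => 4, 'title' => 'East Asia', 'lft' => 6, 'rgt' => 7],
    ['id' => 5, 'title' => 'Central Asia', 'lft' => 8, 'rgt' => 9],
];

/** Titles of the rows matching $where, ordered by lft (tree order). */
function titlesWhere(array $rows, callable $where): array
{
    $matches = array_filter($rows, $where);
    usort($matches, fn($a, $b) => $a['lft'] <=> $b['lft']);
    return array_column($matches, 'title');
}

// SQL equivalent: SELECT title FROM tree WHERE lft > :lft AND rgt < :rgt ORDER BY lft
function descendants(array $rows, array $node): array
{
    return titlesWhere($rows, fn($r) => $r['lft'] > $node['lft'] && $r['rgt'] < $node['rgt']);
}

// SQL equivalent: SELECT title FROM tree WHERE lft < :lft AND rgt > :rgt ORDER BY lft
function ancestors(array $rows, array $node): array
{
    return titlesWhere($rows, fn($r) => $r['lft'] < $node['lft'] && $r['rgt'] > $node['rgt']);
}

/** Counting descendants needs no query: each descendant uses up two numbers. */
function descendantCount(array $node): int
{
    return intdiv($node['rgt'] - $node['lft'] - 1, 2);
}

echo implode(', ', descendants($rows, $rows[0])), "\n";   // everything under Asia
echo implode(' > ', ancestors($rows, $rows[2])), "\n";    // path above India
```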
minuses
• Inserts require stored procedure or
  recursive function
• Deletes require stored procedure
  or recursive function
• Reordering requires stored
  procedure or recursive function
• Math is hard; let’s go shopping!
I reinvent Materialized Path

• URIs already have hierarchy
• MySQL is pretty fast at text
  comparison
• Why not use URLs to map things,
  with a traditional weight field?
denormalization FTW

url                   name         weight
asia                  Asia
asia/south_asia       South Asia     1
asia/east_asia        East Asia      2
asia/central_asia     Central Asia   3
asia/south_asia/india India
pluses

• Selects are easy & familiar* with
  regexes:
   // get immediate children
   $sql = "
   SELECT *
   FROM
        records r
   WHERE
        r.url REGEXP '^$url/[^/]+$'
   ";
other pluses

• Updates, deletes, inserts, and
  moving branches are easy
• Reordering still fairly easy
• Conceptually easy*
  – *if you have a background in Perl and
   regular expressions like me
minuses
• Requires processing to generate
  URLs, ensure no collisions
• REGEXP is slow
• Storage requirements much
  greater
• Selecting ordered trees requires
  trickery (consider code solution)
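One way the "code solution" for ordered trees might look, as a sketch under assumptions: rows are fetched unordered, blank weights are treated as 0, and the function names are hypothetical. Each row gets a sort key built from the weight of every segment along its path, so siblings sort by weight and each parent lands directly before its own subtree.

```php
<?php
// Rows from the denormalization slide, deliberately shuffled; blank weights assumed to be 0.
$rows = [
    ['url' => 'asia/south_asia/india', 'name' => 'India', 'weight' => 0],
    ['url' => 'asia/central_asia', 'name' => 'Central Asia', 'weight' => 3],
    ['url' => 'asia', 'name' => 'Asia', 'weight' => 0],
    ['url' => 'asia/east_asia', 'name' => 'East Asia', 'weight' => 2],
    ['url' => 'asia/south_asia', 'name' => 'South Asia', 'weight' => 1],
];

/** One [weight, segment] pair per level of the path, root first. */
function sortKey(array $byUrl, string $url): array
{
    $key = [];
    $prefix = '';
    foreach (explode('/', $url) as $segment) {
        $prefix = $prefix === '' ? $segment : "$prefix/$segment";
        $key[] = [$byUrl[$prefix]['weight'] ?? 0, $segment];   // missing ancestor rows count as weight 0
    }
    return $key;
}

/** Compares level by level: weight first, then segment name as a tie-breaker. */
function compareKeys(array $a, array $b): int
{
    $shared = min(count($a), count($b));
    for ($i = 0; $i < $shared; $i++) {
        $cmp = ($a[$i][0] <=> $b[$i][0]) ?: strcmp($a[$i][1], $b[$i][1]);
        if ($cmp !== 0) {
            return $cmp;
        }
    }
    return count($a) <=> count($b);   // a parent sorts before its descendants
}

/** Returns the rows in depth-first order with siblings ordered by weight. */
function orderedTree(array $rows): array
{
    $byUrl = array_column($rows, null, 'url');   // index rows by url for weight lookups
    usort($rows, fn($a, $b) => compareKeys(sortKey($byUrl, $a['url']), sortKey($byUrl, $b['url'])));
    return $rows;
}

foreach (orderedTree($rows) as $row) {
    echo str_repeat('  ', substr_count($row['url'], '/')), $row['name'], "\n";
}
```

Comparing per segment, not as one concatenated string, avoids misordering when a sibling's name contains characters that sort before "/".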
Lesson 6: Measure Everything
lesson 6: measure twice, cut once

why is this so SLOW?
• Default home page took 10
  seconds to load
• Complicated pages took longer
tools matter

• Tried various timing schemes;
  nothing gave much insight
• Convinced sysadmin to install
  XDebug 1.x
• Eureka!
xdebug 2.0 | maccallgrind
sample discoveries

• ~1000 queries to generate page
  – Metadata calls killing us
• require_once is really expensive
• foreach() slower than while()*
  – * in PHP 4 ONLY!!!
solutions
• Queries:
  – Write cache object for metadata calls
  – Rewrite metadata classes to load all at
    beginning of request; store in memory
• Improve autoloader to put classes
  in array, include() if not present
• Foreach():
  – replace with while() when fixing/bored
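The "load all at beginning of request; store in memory" idea for metadata might be sketched like this. MetadataCache and the shape of the metadata array are hypothetical; the loader callable stands in for whatever single query would fetch all content-type metadata at once.

```php
<?php
// Hypothetical cache object: one bulk load per request, then in-memory lookups.
final class MetadataCache
{
    /** @var callable returns all metadata as [type name => metadata array] in one go */
    private $loadAll;
    private ?array $byType = null;

    public function __construct(callable $loadAll)
    {
        $this->loadAll = $loadAll;
    }

    public function get(string $type): ?array
    {
        if ($this->byType === null) {
            $this->byType = ($this->loadAll)();   // first call in this request: load everything
        }
        return $this->byType[$type] ?? null;      // every later call is an array lookup
    }
}

// Stand-in loader that counts how often it would have hit the database.
$queries = 0;
$cache = new MetadataCache(function () use (&$queries): array {
    $queries++;
    return [
        'article' => ['fields' => ['title', 'body']],
        'event' => ['fields' => ['title', 'starts_on']],
    ];
});

for ($i = 0; $i < 1000; $i++) {
    $cache->get($i % 2 === 0 ? 'article' : 'event');
}
echo "metadata lookups: 1000, loader calls: $queries\n";
```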
results

• 1000 queries to 60-70
• 10 seconds load to 0.5 seconds
• Complex pages in 1.x seconds
lesson review: what have we learned?

lessons:
• Test. Don’t guess.
• Research your problem.
  Somebody’s done it better than
  you.
• Collaborate. Together we’re
  smarter than individually.
• Don’t reinvent the wheel. Are you a
  wheelmaker or a driver?
Thank you
Sandy Smith
@SandyS1
