On the Persistence of Memory… In Database Systems

By Neil Raden
Hired Brains, Inc.
December, 2012

© 2012 Hired Brains Inc. All Rights Reserved
	
  
  
	
  
Table of Contents

Executive Summary
The Basics
Database Memory and Processing Models
In-Memory Database
Why is in-memory, a fairly old concept, interesting again?
Limitations of iMDB
  Cost
  Persistence
  Volume
  Dual-Purpose OLTP and Analytics
  Not so “green”
The Hybrid DBMS
Compare and Contrast
Conclusion
ABOUT THE AUTHOR
  
  
	
  
Executive Summary

The recent drop in computer memory prices and the introduction of early implementations of in-memory database solutions have raised the level of interest in in-memory databases, but the topic is not new. In fact, there are dozens of in-memory database products, some in production for decades, but because of the prohibitive cost differential between memory-based and disk-based systems, none has found a place beyond certain niche markets. Now the drastic and remarkable drop in the cost of memory (there is hardly a word to describe it), combined with an equally remarkable growth in density and capacity, is driving the discussion into the mainstream of computing architectures.
  	
  
	
  
For the purposes of discussion, we refer to in-memory database systems as iMDB and to current relational database systems that incorporate large memory models with attached storage (including traditional magnetic disk and solid-state devices) as hybrid-DBMS. Though the discussion is occasionally technical, our conclusions are that:
• iMDB leverage lower-cost RAM for storage but still lack persistence and data scalability, which limits the types of solutions the iMDB architecture can support.
• Hybrid-DBMS is a proven technology that provides high performance and a flexible architecture to support a variety of analytics applications.
  
	
  
	
  
The Basics

All database management systems (DBMS), and in fact virtually all programs in conventional computing environments, behave in exactly the same way at the lowest level. A central processing unit (CPU) performs a single very low-level instruction on a single piece of data. While complex application programs like DBMS have many layers of functionality and can be described logically as a set of higher-level interworking pieces, the CPU has utterly no insight into this; it just chugs along one instruction at a time. If you were to sit inside a CPU and watch its stream of sequential operations, you would be unable to determine what the controlling program was doing. So database software, or really any software, is just a logical structure that encapsulates all of the smaller steps. When things get calculated, the individual steps bear no resemblance to the whole. A CPU doesn’t know what a join or an index is.
  	
  
	
  
How those bits of work are presented to the CPU is the heart of the application design. In other words, though there is no difference in how CPUs execute from one application to another, the order of those instructions is the key to performance.
  
	
  
Each step in execution is composed of a single instruction and a single piece of data (though today’s CPUs are composed of multiple “cores,” essentially multiple CPUs on a single chip). The instruction and the data have to be presented to the CPU through memory, either system RAM or a memory cache on the CPU itself. It makes no difference whether the application is “in-memory” or disk-based: the CPU has to be presented with the instruction (actually, the “instruction set” is burned into the CPU; what is presented to it is a code indicating which instruction to execute). For this reason, an in-memory architecture, where all instructions and data are in RAM, should, in theory, provide superior performance compared to a DBMS that must fetch data from remote mechanical disk drives.
  	
  
	
  
Solid-state drives (SSD), mentioned above, use solid-state memory chips, typically NAND flash memory, instead of spinning magnetic disks. Flash memory is less expensive and slower than RAM/SRAM, but it is non-volatile, meaning it retains data without power and does not lose data when the system shuts down. RAM is volatile: it must be powered continuously and requires backup, typically on conventional disk drives, for reliability.
  	
  
	
  
One could say that a DBMS with SSD instead of traditional disks is an in-memory device, but there are two fundamental differences. First, the “memory” chips of an SSD are part of a disk drive “card” or assembly that uses the same block addressing as the disks it replaces. In other words, even though finding data on an SSD is at least an order of magnitude faster than on a spinning magnetic disk (this is a generalization), there is still a call for external data, handled by the disk controller and passed to RAM. An interesting arrangement, typically used for add-on accelerators rather than primary database operations, is an SSD constructed from SRAM, which further improves seek time on the drive. This is a special-purpose and very expensive architecture and is not considered further here.
  
	
  
Database Memory and Processing Models

To clear up confusion between the various models for memory in databases, it’s useful to describe the predominant versions, which differ in how memory is used for processing. The two predominant memory models for the most common database systems are shared memory and shared nothing. In both, memory is used only for processing, not for persistent storage. This is the essential difference between today’s iMDBs and more conventional on-disk or hybrid systems.
  	
  
	
  
In the shared memory model, all database operations use the same single pool of memory, and the system allocates memory and processing tasks. All memory is available to every processor. In a shared nothing system, each separate node of processors and memory does its own work in parallel and is typically controlled by a master node (which can be physical or virtual). In reality, nodes in a shared nothing environment may themselves operate as independent shared memory nodes. But in neither case is data stored in memory until it is called for. The exception is when data is cached (frequently used data is “pinned” in memory), but it is still volatile and the data can be flushed at any time.
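
The shared nothing model described above can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: the function names, the modulo partitioning rule and the sample rows are hypothetical and do not come from any particular DBMS. Each "node" sums only the rows it owns, and a master step combines the partial results.

```python
from collections import defaultdict


def partition_rows(rows, node_count):
    """Assign each (key, amount) row to exactly one node (hypothetical rule: key modulo node_count)."""
    nodes = defaultdict(list)
    for key, amount in rows:
        nodes[key % node_count].append((key, amount))
    return nodes


def node_partial_sum(local_rows):
    """Work done independently on one node, using only the rows held by that node."""
    return sum(amount for _, amount in local_rows)


def master_total(rows, node_count):
    """The master step: collect one partial result per node and combine them."""
    nodes = partition_rows(rows, node_count)
    partials = [node_partial_sum(nodes[n]) for n in range(node_count)]
    return sum(partials), partials


# Illustrative data only: five rows spread over two nodes.
rows = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]
total, partials = master_total(rows, node_count=2)
print(total, partials)
```

In a shared memory system the same sum would simply run over one pool of rows visible to every processor; the partitioning and combining steps are what the shared nothing design adds.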
  	
  
	
  
iMDB operate more or less like shared memory systems, but everything, including the operating system, software programs (executables), workspace, indexes and data, is stored in RAM. When these systems are scaled out with multiple nodes connected by a network, they operate more like a grid or distributed network than like a true MPP-engineered system. However, the concepts of shared memory versus shared disk (shared nothing) are a little obsolete now that CPUs themselves are multi-core, meaning the processors themselves are capable of parallel processing (provided the software program, the DBMS, has been designed to take advantage of it).
  
	
  
This description is a simplification and there are many exceptions, but in general, no database management system stores data persistently in memory except, of course, iMDB. The difference between the various memory models described above is how memory is used for processing data.
  
	
  
In-Memory Database

It is an unassailable truth that processing data from memory is orders of magnitude faster than retrieving it from a disk drive, but that is only a small part of the story. Historically, CPUs have been “I/O bound,” meaning they spent a significant amount of time waiting for the requested data to arrive, requiring extreme countermeasures in software design to minimize the latency. With data streaming to processors at the speed of random-access memory (RAM), just the opposite situation can occur: the CPUs may become flooded with data and unable to process it as quickly as it is presented. The point cannot be stressed enough: merely boosting the available RAM does not guarantee smooth, faster execution of existing programs. This turn of events calls for careful engineering and balance. In other words, performance of complex applications is rarely resolved by changing one thing; it usually requires rethinking the whole approach. The result is that migrating software to in-memory usually requires a great deal of rework; it is not just move and drop.
  
	
  
	
Even the notion of iMDB is a bit of a misnomer, as there is still the requirement for separate conventional storage devices for mirroring everything for persistence and for keeping the iMDB refreshed and reliable. Systems can fail, which means in-memory systems still have to maintain multiple copies of the data and perform a complete reload if the system fails. Adding all of these factors together can make the effort quite expensive despite the seemingly reasonable price of memory today (though at multiple terabytes, you will feel the pinch).

In addition, to make maximum use of RAM, all database systems use compression of data to one degree or another. iMDBs typically employ aggressive compression algorithms to maximize the amount of data that can be put in working memory. The back-up of an iMDB is usually lightly compressed or uncompressed so it can be read by other processes, among other reasons. Assuming a realistic 3.5x compression for an iMDB (not all RAM is available for the data), the back-up drives will need to be 5x the size of RAM, and there may be multiple archives, and the backups themselves will likely be mirrored. With even average-sized analytical data warehouses today running about 50 terabytes (there are, of course, much larger ones), an iMDB to accommodate those would need 75-100 TB of separate disk drives to handle back-ups, snapshots, logs and staging areas.
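
The sizing figures above can be checked with a short Python sketch. The 3.5x compression and the 50 TB warehouse come from the text; the 70% usable-RAM fraction and the 1.5x to 2x back-up multipliers are our own illustrative assumptions, chosen only because they reproduce the 75-100 TB range and the roughly 5x ratio of back-up disk to RAM.

```python
def imdb_storage_estimate(raw_tb, compression=3.5, usable_ram_fraction=0.7,
                          backup_multipliers=(1.5, 2.0)):
    """Rough sizing for an iMDB holding raw_tb terabytes of uncompressed data.

    usable_ram_fraction and backup_multipliers are assumptions for illustration,
    not figures taken from any vendor.
    """
    ram_for_data_tb = raw_tb / compression               # compressed data held in RAM
    total_ram_tb = ram_for_data_tb / usable_ram_fraction  # RAM incl. OS, workspace, etc.
    backup_low_tb, backup_high_tb = (raw_tb * m for m in backup_multipliers)
    return {
        "ram_for_data_tb": ram_for_data_tb,
        "total_ram_tb": total_ram_tb,
        "backup_disk_tb": (backup_low_tb, backup_high_tb),
    }


estimate = imdb_storage_estimate(50)
print(estimate)
```

Under these assumptions the back-up disk comes to between about 3.7 and 4.9 times the installed RAM, which is in line with the 5x rule of thumb quoted above.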
  	
  
Another thing to consider is that a database still has to perform all of the database functions, from loading data to presenting it as the result of a query. Conventional relational database technology, including those platforms that are designed specifically for data warehousing and analytical work as opposed to transactional processing, must employ a host of services to be useful to an enterprise, including:
• Workload management for efficient management of the resources
• Security
• Reliability
• High availability
• Use of performance statistics for query optimization
  	
  
	
  
They must also support, in addition to the traditional row-based schema, columnar organization of the data. Columnar organization is particularly effective for wide tables with many attributes, but it is less effective with more normalized schemas and has some serious drawbacks in the ability to update the database in real time. Columnar orientation is not a feature limited to iMDBs, however: most analytical database systems incorporate it or even operate solely in columnar mode.
  	
  
	
  
Why is in-memory, a fairly old concept, interesting again?

iMDBs have been used for quite some time, but they have always been limited primarily by three factors: the cost of memory, the size of the database, and the persistence of data. Today, a dollar will buy 500 to 1000 times as much memory as it did in 1995, and the capacity per square inch of the chips has increased in inverse proportion to the price. Memory speeds have increased as well, though not as dramatically. If the amount of data that could be stored in early in-memory systems was too small for most applications, 1000 times more memory might be enough for in-memory to be feasible.
  	
  
	
  
	
  
This extremely simplified diagram depicts the essential (but certainly not all) differences between an iMDB and a hybrid-DBMS. An iMDB maximizes the use of RAM but uses essentially the same hardware architecture: two CPUs with levels of on-board cache, and RAM for holding the entire database, the database software, working space, caches and embedded functionality. The only difference in the hybrid-DBMS is less reliance on RAM and the ability to address vastly greater amounts of data from the storage subsystem. Hybrid-DBMS have documented databases of greater than a petabyte. iMDB typically scale out to 16 servers with up to 1 terabyte of RAM each, but with a significant amount of that RAM taken up by the operating system, working memory, etc. Therefore, even with 5x compression, the maximum amount of uncompressed data per system is no more than 40 TB. Given the expense of these large iMDB systems, scaling out to the sizes that are needed today is difficult.
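
The capacity ceiling can be reproduced with simple arithmetic, sketched here in Python. The 16 servers, 1 TB of RAM per server and 5x compression come from the text. Reading the 40 TB figure as a total for the whole 16-server configuration, and assuming that about half of the RAM is left for data, are our assumptions; the text says only that a "significant amount" of RAM goes to the operating system and working memory.

```python
def imdb_cluster_capacity_tb(servers=16, ram_per_server_tb=1.0,
                             data_fraction=0.5, compression=5.0):
    """Uncompressed data (TB) a scaled-out iMDB can hold.

    data_fraction (share of RAM left for data) is an illustrative assumption.
    """
    ram_for_data_tb = servers * ram_per_server_tb * data_fraction
    return ram_for_data_tb * compression


capacity = imdb_cluster_capacity_tb()
print(capacity)
```

Changing data_fraction shows how sensitive the ceiling is to memory overhead: if only a third of RAM were available for data, the same cluster would hold well under 30 TB.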
  
	
  
Limitations of iMDB

In-memory databases are constrained by several overwhelming limitations:

• No matter how inexpensive RAM is today compared to its historical cost, it is still considerably more expensive than its alternatives, which limits its usefulness for enterprise-level systems.
• Data cannot persist in memory indefinitely. It is inevitable that something will fail, which requires mechanisms to protect the data, and those mechanisms can erode the value proposition.
• With today’s data volumes, it is still not practical to use an in-memory approach for a data warehouse.
• iMDB rely on the system being up 24/7.
  
	
  
Cost	
  
Though RAM is 10,000 times faster to read than a mechanical disk drive, data volumes today are enormous and growing. A petabyte-sized in-memory database would cost more than $5 million, perhaps twice that. SSD for that capacity would cost 1/5 to 1/10 the price, and a hybrid-DBMS with a hot/warm/cold hierarchical storage architecture would cost far less than that.
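
As a quick check on these figures, the Python sketch below turns the quoted ratios into dollar ranges. The $5 million to $10 million range for a petabyte of RAM and the 1/5 to 1/10 ratio for SSD are taken from the paragraph above; no price is assumed for the hybrid configuration because the text gives none.

```python
# Cost of a petabyte-sized in-memory database, per the text: $5M, "perhaps twice that".
PETABYTE_RAM_COST_USD = (5_000_000, 10_000_000)
# SSD is quoted at 1/5 to 1/10 of that price, expressed here as divisors.
SSD_PRICE_DIVISORS = (5, 10)


def ssd_cost_range(ram_cost_range, divisors):
    """Return the (lowest, highest) SSD cost implied by the quoted ratios."""
    low = min(ram_cost_range) / max(divisors)
    high = max(ram_cost_range) / min(divisors)
    return low, high


ssd_low, ssd_high = ssd_cost_range(PETABYTE_RAM_COST_USD, SSD_PRICE_DIVISORS)
print(ssd_low, ssd_high)
```

On these numbers a petabyte of SSD falls somewhere between half a million and two million dollars, against five to ten million for RAM.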
  
Persistence	
  
In-memory architecture still requires conventional storage. RAM is volatile, and if something fails, or even just hiccups, there can be data loss. Therefore, everything in memory has to have a copy on less volatile storage devices. Updating the memory requires log files, “snapshots” and “checkpoints” (which can slow down processing).
  	
  
  
	
  
Volume	
  
In-memory cannot economically, or even practically, scale to the volumes of today’s data warehouses. Ten years ago, a terabyte-size data warehouse was remarkable, but today there are dozens, perhaps even more than a hundred, that are greater than a petabyte, one thousand times larger. Projections are that this growth rate is not diminishing.
  
Dual-Purpose OLTP and Analytics
Some iMDB products promise the ability to perform OLTP and analytical processing on the same platform, with the same data. This would be a real advantage, as it would alleviate the need to extract and transform data from operational systems and would provide analytical support without an additional platform. Unfortunately, this is currently impossible.
  	
  
	
  
iMDB platforms generally cannot support OLTP because they have to wait for a transaction to complete on disk to be ACID compliant. When data is updated in memory, the changes are held in log files, usually stored on SSD. iMDB platforms use this disk-based “persistent” layer to “weather” a node failure, which, in a narrow sense, suggests they have ACID properties. When the iMDB node comes back up (after the failed part is replaced or the cold standby node takes over), the data that is resident on the disk “persistent” layer is reloaded into memory. This can be done in one of two ways: “Lazy,” where the data is reloaded as queries enter the system and request a specific table (which doesn’t really make sense, since the iMDB appears in memory as one dimensional table), or “Full,” where queries must wait until all the data is reloaded. In both cases, the log files stored on disk or flash have to be read and applied.
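
The two reload strategies can be sketched with a toy Python model. This is not how any particular iMDB product implements recovery: the class, its methods and the sample tables are hypothetical. It shows only the idea that the persistent copy is read back and the log entries are applied on top, either for every table up front ("full") or per table on first access ("lazy").

```python
class ToyInMemoryStore:
    """Toy model of node recovery from a persistent layer plus a log (hypothetical design)."""

    def __init__(self, persistent_tables, log_entries, mode="full"):
        if mode not in ("lazy", "full"):
            raise ValueError("mode must be 'lazy' or 'full'")
        self.persistent_tables = persistent_tables  # table name -> {key: value} on disk
        self.log_entries = log_entries              # ordered (table, key, value) updates
        self.memory = {}                            # what is currently loaded in RAM
        if mode == "full":
            # Full reload: nothing can be queried until every table is back in memory.
            names = set(persistent_tables) | {table for table, _, _ in log_entries}
            for name in names:
                self._reload(name)

    def _reload(self, name):
        """Read the persistent copy of one table, then apply its log entries in order."""
        rows = dict(self.persistent_tables.get(name, {}))
        for table, key, value in self.log_entries:
            if table == name:
                rows[key] = value
        self.memory[name] = rows

    def query(self, name, key):
        """Lazy reload: a table is brought into memory the first time a query touches it."""
        if name not in self.memory:
            self._reload(name)
        return self.memory[name].get(key)


# Illustrative data only.
snapshot = {"orders": {1: "new"}, "customers": {7: "Ann"}}
log = [("orders", 1, "shipped"), ("orders", 2, "new")]
lazy = ToyInMemoryStore(snapshot, log, mode="lazy")
full = ToyInMemoryStore(snapshot, log, mode="full")
```

In the lazy case the first query against each table pays the reload cost; in the full case the cost is paid once, before any query runs. Either way the log is read and applied.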
  	
  	
  
	
  
There are features to handle different kinds of failure, though. Both the SSD area and the disk persistent layer have RAID capability to cover for a disk failure. So, if a node has a problem but keeps power, then all “may be” OK; it is an “error dependent” issue. If there is a problem with a memory chip, it is unlikely the data will survive, requiring a total reload. If a node loses power, then a total reload of all the data that was on that node is required.
  
Not so “green”
At a time when most vendors are formulating a “green” message, it turns out that iMDBs require a lot of power, considerably more than spinning drives and significantly more than solid-state drives (SSD; more on this below). RAM is volatile and needs to be powered 24/7 if the data is to persist.
  
	
  
The Hybrid DBMS
iMDB vendors often portray disk-based systems as dinosaurs that have outlived their usefulness, but in fact, they are the result of 30 years of research and development by some of the most brilliant minds in the technology industry and have hardly been standing still. In the same way that relational database technology gradually gained new hardware capabilities and evolved to become hybrid-DBMS, it seems likely that the major database vendors will continue to evolve to leverage the advantages of more memory over disk drives. The dramatic cost reductions of memory have benefits that accrue to hybrid-DBMS too: solid-state drives are replacing traditional magnetic drives, with improvements in I/O speed. Teradata Virtual Storage, for example, automatically manages the movement of hot and cold data. Large memory models are common, too, even if the persistent data remains on attached storage instead of completely in memory.
  	
  
Another consideration is that for most database applications, there is a clear difference between hot and cold data; in other words, between data that is in use at the moment and data that is used less frequently. This tilts the decision between disk-only and in-memory toward an in-between alternative: a hybrid scheme with large memory, SSD, and less expensive, slower HDD for warm or cold data. Hybrid-DBMS leverage the speed of SSD to reduce query response times by cutting the painful delays introduced by lengthy I/O queues in HDD storage. A query requires many I/O operations to complete, so the time I/O requests spend in storage queues has a direct impact. Not only do the speed and parallel channel capability of SSD result in 40x faster I/O completions, but the queues at the HDD are also shortened by aiming 80% of I/O at the SSD. This can result in up to a 60x improvement in average response times.
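
Why shortening the HDD queue can push the overall gain past the raw 40x device speed can be illustrated with a simple queueing sketch in Python. This is our own illustration, not the method behind the 60x figure: it assumes each device behaves as an M/M/1 queue (average response time 1 / (service rate - arrival rate)), and the arrival and service rates are made-up numbers.

```python
def mm1_response_time(arrival_rate, service_rate):
    """Average time in an M/M/1 queue, waiting plus service (an assumed model)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrivals must be below the service rate")
    return 1.0 / (service_rate - arrival_rate)


def hybrid_speedup(arrival_rate, hdd_service_rate, ssd_share=0.8, ssd_speed_factor=40.0):
    """Ratio of HDD-only average response time to the hybrid SSD+HDD average."""
    hdd_only = mm1_response_time(arrival_rate, hdd_service_rate)
    # 80% of I/O goes to an SSD that completes requests 40x faster (figures from the text).
    ssd_time = mm1_response_time(ssd_share * arrival_rate,
                                 ssd_speed_factor * hdd_service_rate)
    # The remaining 20% still goes to HDD, but now faces a much shorter queue.
    hdd_time = mm1_response_time((1.0 - ssd_share) * arrival_rate, hdd_service_rate)
    hybrid = ssd_share * ssd_time + (1.0 - ssd_share) * hdd_time
    return hdd_only / hybrid


# Hypothetical workloads: an HDD tier that can serve 100 I/Os per second.
moderate = hybrid_speedup(arrival_rate=90.0, hdd_service_rate=100.0)
heavy = hybrid_speedup(arrival_rate=95.0, hdd_service_rate=100.0)
print(round(moderate, 1), round(heavy, 1))
```

In this model the improvement depends heavily on how congested the HDD tier was to begin with: the closer the original queue is to saturation, the larger the gain from diverting most of the I/O to SSD.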
  
	
  
A hybrid scheme requires not only a physical assemblage of devices, but also an intelligent data manager that continually and transparently optimizes the architecture by moving data to its best location. The figure below represents Teradata’s version of such a system. [2]
	
  
	
  
	
  
Notice that in this scheme, each node is balanced with a combination of CPUs and their characteristics, the amount of RAM and the storage devices. This provides an optimum balance between processing, memory and addressable storage, which leads to optimal performance. It does, however, somewhat limit configuration flexibility, as the drives and CPUs are fixed.
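
The idea of a data manager that moves data to its best location can be sketched in a few lines of Python. This is a deliberately naive illustration, not Teradata Virtual Storage's algorithm: the function, the table names, the access counts and the capacities are all hypothetical. The hottest data is placed on SSD until the SSD is full, and everything else goes to HDD.

```python
def place_by_temperature(access_counts, sizes_tb, ssd_capacity_tb):
    """Greedy placement: hottest items first onto SSD while they fit, the rest onto HDD."""
    placement = {}
    remaining_tb = ssd_capacity_tb
    # Visit items from most to least frequently accessed.
    for name in sorted(access_counts, key=access_counts.get, reverse=True):
        if sizes_tb[name] <= remaining_tb:
            placement[name] = "SSD"
            remaining_tb -= sizes_tb[name]
        else:
            placement[name] = "HDD"
    return placement


# Illustrative figures only.
access_counts = {"sales_current_quarter": 900, "sales_last_year": 120, "sales_archive": 5}
sizes_tb = {"sales_current_quarter": 2, "sales_last_year": 6, "sales_archive": 40}
placement = place_by_temperature(access_counts, sizes_tb, ssd_capacity_tb=4)
print(placement)
```

A real data manager would re-evaluate placement continually as access patterns change and would work at a much finer granularity than whole tables; the sketch captures only the hot-versus-cold decision.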
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  
[2] Teradata are working on extending the data management to the memory layer.
  	
  
	
  
  
Compare and Contrast
Today there are two ways to store data electronically: on arrays of solid-state memory chips (on either a memory bus or on SSD) or on magnetic disk drives. Solid-state chips are obviously faster than magnetic drives (although in some cases, the differential can be overcome with good platform design and workload management). Solid-state chips are considerably more expensive than magnetic drives, and volatile RAM chips are considerably more expensive (and faster) than non-volatile RAM. We can’t see the future with perfect clarity, but it is likely that for the foreseeable future this stratification of memory and storage will not change, even as the price/performance of each continues to improve. The faster RAM chips will remain volatile, making full in-memory databases impractical for most uses.
  	
  
An iMDB’s lack of balance between CPU and storage could lead to flooding of the CPUs. An iMDB trades the potential for I/O latency for the very real possibility of RAM outperforming the processors. Without an I/O bottleneck, processors can become saturated. This is something that software developers should be aware of, and design for, but given the relative recency of certain iMDBs, these features may not be well developed. Client applications may need to be rewritten not only to take advantage of the memory resources but also to keep the processors from bogging down.
  
iMDBs rely on large banks of very fast, expensive RAM, but also on the other types of memory and storage for high availability and for backup. A hybrid DBMS relies on the same collection of memory and storage types, but in different proportions. A hybrid system uses solid-state memory judiciously and attempts to keep as much data pinned in memory as possible for active work, but relies on only one mechanism for persistent storage.
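
The dependence of an in-memory store on slower persistent media, described above, can be illustrated with a toy key-value store that serves reads from RAM but appends every write to a log file before acknowledging it. This is a minimal sketch of the general write-ahead logging idea, not the design of any particular iMDB; the class name and the JSON log format are assumptions made for the example.

```python
import json
import os


class LoggedStore:
    """Toy in-memory store made durable by an append-only log on disk."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild the in-memory state after a restart
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        """Re-apply every logged write, oldest first."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        """Persist the write to disk first, then update memory."""
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # force the record onto the device
        self.data[key] = value

    def get(self, key):
        """Reads never touch the disk."""
        return self.data.get(key)

    def close(self):
        self._log.close()
```

Every `put` pays for a synchronous write to the slower medium, so the speed advantage of RAM applies to reads while durable writes are still bounded by the persistent layer.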
  
	
  
Conclusion	
  
	
  
  
	
  
iMDB vendors claim that in-memory will replace the traditional hybrid DBMS. Unless there are new laws of physics, however, holding persistent data for months or years simply isn’t feasible without resorting to a hybrid in-memory and disk-based system. In a way, one can think of an iMDB as merely an accelerator for a conventional database, because it cannot meet the requirements of durability on its own.
  	
  
On the other hand, hybrid DBMSs are based on proven data warehousing technologies; they offer flexible architectures and deliver high performance with automatic storage management.
  
It would be easy to predict that iMDBs, and that includes DBMSs with all-SSD drives, will eventually overtake disk-based systems. However, the cost of memory, whatever it turns out to be, will still be greater than that of disk drives, and although it is impossible to predict with certainty, the amount of data captured and analyzed will likely continue to grow faster than the price per GB of memory falls.
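
The argument above comes down to simple compounding. If data volume grows by a fraction g per year while the price per GB of memory falls by a fraction d per year, the cost of holding everything in memory changes by a factor of (1 + g)(1 - d) each year. The sketch below works this out; the growth and decline rates are hypothetical placeholders, not forecasts.

```python
def relative_memory_cost(data_growth, price_decline, years):
    """Cost of holding all data in memory after `years`, relative to today (1.0).

    data_growth   -- yearly fractional growth in data volume (e.g. 0.5 for 50%)
    price_decline -- yearly fractional drop in memory price per GB (e.g. 0.3)
    """
    yearly_factor = (1.0 + data_growth) * (1.0 - price_decline)
    return yearly_factor ** years


if __name__ == "__main__":
    # Hypothetical rates chosen only to show both outcomes.
    print(relative_memory_cost(0.5, 0.3, 5))  # data outpaces price declines
    print(relative_memory_cost(0.2, 0.3, 5))  # price declines outpace data
```

Whenever the yearly factor is above 1, an all-in-memory design costs more every year even though memory itself keeps getting cheaper.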
  	
  
	
  
	
  
  
	
  
ABOUT THE AUTHOR
  
	
  
	
  
	
  
Neil Raden, based in Santa Fe, NM, is an industry analyst and active consultant, widely published author and speaker, and the founder of Hired Brains, Inc., http://www.hiredbrains.com. Hired Brains provides consulting, systems integration and implementation services in Data Warehousing, Business Intelligence, “big data”, Decision Automation and Advanced Analytics for clients worldwide. Hired Brains Research provides consulting, market research, product marketing and advisory services to the software industry.
  
	
  
Neil was a contributing author to one of the first (1995) books on designing data warehouses, and he is more recently the co-author of Smart (Enough) Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions, Prentice-Hall, 2007. He welcomes your comments at nraden@hiredbrains.com or at his blog at Competing on Decisions.
  
	
  
	
  
	
  

Persistence of memory: In-memory Is Not Often the Answer

  • 1. On the Persistence of Memory (in Database Systems). By Neil Raden, Hired Brains, Inc., December 2012. Picture credit: Creative Commons. © 2012 Hired Brains Inc. All Rights Reserved.
  • 2. Table of Contents
      • Executive Summary
      • The Basics
      • Database Memory and Processing Models
      • In-Memory Database
      • Why is in-memory, a fairly old concept, interesting again?
      • Limitations of iMDB: Cost; Persistence; Volume; Dual-Purpose OLTP and Analytics; Not so “green”
      • The Hybrid DBMS
      • Compare and Contrast
      • Conclusion
      • About the Author
  • 3. Executive Summary
    Falling computer memory prices and the introduction of early implementations of in-memory database solutions have recently raised the level of interest in in-memory databases, but the topic is not new. In fact, there are literally dozens of in-memory database products, some in production for decades, but due to the prohibitive cost differential between memory-based systems and disk-based systems, none has found a place beyond certain niche markets. Now the drastic and remarkable drop in the cost of memory (there is hardly a word to describe it), combined with an equally remarkable growth in density and capacity, is driving the discussion into the mainstream of computing architectures.
    For the purposes of discussion, we refer to in-memory database systems as iMDB and to current relational database systems incorporating large memory models with attached storage (including traditional magnetic disk and solid-state devices) as hybrid-DBMS. Though the discussion is occasionally technical, our conclusions are that:
      • iMDB leverage lower-cost RAM for storage but still lack persistence and data scalability, and the iMDB architecture limits the types of solutions that can be supported.
      • Hybrid-DBMS is a proven technology and provides high performance and a flexible architecture to support a variety of analytics applications.
    The Basics
  • 4. All database management systems (DBMS), and in fact virtually all programs in conventional computing environments, behave the same way at the lowest level. A central processing unit (CPU) performs a single very low-level instruction on a single piece of data. While complex application programs like a DBMS have many layers of functionality and can be described logically as a set of higher-level interworking pieces, the CPU has no insight into this; it just chugs along one instruction at a time. If you were to sit inside a CPU and watch its stream of sequential operations, you would be unable to determine what the controlling program was doing. So database software, or really any software, is just a logical structure that encapsulates all of the smaller steps. The individual calculations bear no resemblance to the whole. A CPU doesn’t know what a join or an index is.
    How those bits of work are presented to the CPU is the heart of the application design. In other words, though there is no difference in how CPUs execute from one application to another, the order of those instructions is the key to performance.
    Each step in execution is composed of a single instruction and a single piece of data (though today’s CPUs are composed of multiple “cores,” essentially multiple CPUs on a single chip). The instruction and the data have to be presented to the CPU through memory, either system RAM or a memory cache on the CPU itself.
    It makes no difference whether the application is “in-memory” or disk-based: the CPU has to be presented with the instruction (actually, the “instruction set” is burned into the CPU; what is presented to it is a code that selects which instruction to execute). Because every instruction and every piece of data must pass through memory, an in-memory architecture, where all instructions and data are already in RAM, should in theory provide superior performance compared to a DBMS that must fetch data from remote mechanical disk drives.
    Solid-state drives (SSD), mentioned above, use solid-state memory chips, typically NAND flash memory, instead of spinning magnetic disks. Flash memory is less expensive and slower than RAM/SRAM, but it is non-volatile, meaning it retains data without power.
  • 5. Flash does not lose data in the case of a system shutdown. RAM is volatile: it must be powered continuously and requires backup, typically on conventional disk drives, for reliability.
    One could say that a DBMS with SSD instead of traditional disks could be an in-memory device, but there are two fundamental differences. First, the “memory” chips of an SSD are part of a disk drive “card” or assembly that uses the same block addressing as the disks it replaces. In other words, even though finding data on an SSD is at least an order of magnitude faster than on a spinning magnetic disk (this is a generalization), there is still a call for external data, handled by the disk controller and passed to RAM. An interesting arrangement, typically used for add-on accelerators rather than primary database operations, is an SSD constructed from SRAM, which further improves seek time on the drive. This is a special-purpose and very expensive architecture and is not considered further here.
    Database Memory and Processing Models
    To clear up confusion between the various models for memory in databases, it’s useful to describe the predominant versions. These models differ in how memory is used for processing. The two predominant memory models for the most common database systems are shared memory and shared nothing. In both, memory is used only for processing, not for persistent storage. This is the essential difference between today’s iMDBs and more conventional on-disk or hybrid systems.
    In the shared memory model, all database operations use the same single pool of memory, and the system allocates memory and processing tasks. All memory is available to every processor. In a shared nothing system, each separate node of processors and memory does its own work in parallel and is typically controlled by a master node (which can be physical or virtual).
  • 6. In reality, nodes in a shared nothing environment may themselves operate as independent shared memory nodes. But in neither case is data held in memory until it is called for. The exception is when data is cached (frequently used data is “pinned” in memory), but it is still volatile and the data can be flushed at any time.
    iMDB operate more or less like shared memory systems, but everything, including the operating system, software programs (executables), workspace, indexes and data, is stored in RAM. When these systems are scaled out with multiple nodes connected by a network, they operate more like a grid or distributed network than like a true MPP-engineered system. (However, the concepts of shared memory versus shared disk (shared nothing) are a little obsolete now that CPUs themselves are multi-core, meaning the processors themselves are capable of parallel processing, provided the DBMS software has been designed to take advantage of it.)
    This description is a simplification and there are many exceptions, but in general, no database management system stores data persistently in memory except, of course, iMDB. The difference between the various memory models described above is how memory is used for processing data.
    In-Memory Database
    It is an unassailable truth that processing data from memory is orders of magnitude faster than retrieving it from a disk drive, but that is only a small part of the story.
    Historically, CPUs have been “I/O bound,” meaning they spent a significant amount of time waiting for the requested data to arrive, requiring extreme countermeasures in software design to minimize the latency. With data streaming to processors at the speed of random-access memory (RAM), just the opposite situation can occur: the CPUs may become flooded with data and unable to process it as quickly as it is presented.
  • 7. The point cannot be stressed enough: merely boosting the available RAM does not guarantee smooth, faster execution of existing programs. This turn of events calls for careful engineering and balance. In other words, performance of complex applications is rarely resolved by changing one thing; it usually requires rethinking the whole approach. The result is that software migration to in-memory usually requires a great deal of rework; it is not just move and drop.
    Even the notion of iMDB is a bit of a misnomer, as there is still the requirement for separate conventional storage devices for mirroring everything for persistence and for keeping the iMDB refreshed and reliable. Systems can fail, which means in-memory systems still have to maintain multiple copies of the data and perform a complete reload if the system fails. Adding all of these factors together can make the effort quite expensive despite the seemingly reasonable price of memory today (though at multiple terabytes, you will feel the pinch). In addition, to make maximum use of RAM, all database systems use compression of data to one degree or another. iMDBs typically employ aggressive compression algorithms to maximize the amount of data that can be put in working memory. Back-up of an iMDB is usually lightly compressed or uncompressed so it can be read by other processes, among other reasons.
    Assuming a realistic 3.5x compression for an iMDB (not all RAM is available for the data), the back-up drives will need to be 5X the size of RAM, there may be multiple archives, and the backups themselves will likely be mirrored. With even average-sized analytical data warehouses today running about 50 terabytes (there are, of course, much larger ones), an iMDB to accommodate those would need 75-100 TB of separate disk drives to handle back-ups, snapshots, logs and staging areas.
  • 8. Another thing to consider is that a database still has to perform all of the database functions, from loading data to presenting it as the result of a query. Conventional relational database technology, including those platforms that are designed specifically for data warehousing and analytical work as opposed to transactional processing, must employ a host of services to be useful to an enterprise, including:
      • Workload management for efficient management of the resources
      • Security
      • Reliability
      • High availability
      • Use of performance statistics for query optimization
    They must also support, in addition to traditional row-based schema, columnar organization of the data, which is particularly effective for wide tables with many attributes but less effective with more normalized schema, and which has some serious drawbacks in the ability to update the database in real time. But columnar orientation is not a feature limited to iMDBs: most analytical database systems incorporate it or even operate solely in columnar mode.
    Why is in-memory, a fairly old concept, interesting again?
    iMDBs have been used for quite some time, but they have always been limited primarily by three factors: the cost of memory, the size of the database, and the persistence of data. Today, a dollar will buy 500 to 1000 times as much memory as it did in 1995, and the capacity per square inch of the chips has increased in inverse proportion to the price.
    Memory speeds increased as well, though not as dramatically. If the amount of data that could be stored in early in-memory systems was too small for most applications, 1000 times more memory might be enough for in-memory to be feasible.
  • 9. This extremely simplified diagram depicts the essential (but certainly not all) differences between an iMDB and a hybrid-DBMS. iMDB maximizes the use of RAM but uses essentially the same hardware architecture of two CPUs with levels of on-board cache, and RAM for holding the entire database, the database software, working space, caches and embedded functionality. The only difference in the hybrid-DBMS is less reliance on RAM and the ability to address vastly greater amounts of data from the storage subsystem. Hybrid-DBMS installations have documented databases of greater than a petabyte. iMDB typically scale out to 16 servers with up to 1 terabyte of RAM each, but with a significant amount of that RAM taken up by the operating system, working memory, etc. Therefore, even with 5x compression, the maximum amount of uncompressed data such a configuration can hold is no more than 40 TB.
  • 10. Given the expense of these large iMDB systems, scaling out to the sizes that are needed today is difficult.
    Limitations of iMDB
    In-memory databases are constrained by several key limitations:
      • No matter how inexpensive RAM is today compared to its historical cost, it is still considerably more expensive than its alternatives, limiting its usefulness for enterprise-level systems.
      • Data cannot persist in memory indefinitely. It is inevitable that something will fail, which requires mechanisms to protect the data that can erode the value proposition.
      • With today’s data volumes, it is still not practical to use an in-memory approach for a data warehouse.
      • iMDB rely on the system being up 24/7.
    Cost
    Though RAM is 10,000 times faster to read than a mechanical disk drive, data volumes today are enormous and growing. A petabyte-sized in-memory database would cost more than $5 million, perhaps twice that. SSD for that capacity would cost 1/5 to 1/10 the price. And a hybrid-DBMS with a hot/warm/cold hierarchical storage architecture would cost far less than that.
    Persistence
    In-memory architecture still requires conventional storage. RAM is volatile, and if something fails, or even just hiccups, there can be data loss. Therefore, everything in memory has to have a copy on less volatile storage devices. Updating the memory requires log files, “snapshots” and “checkpoints,” which can slow down processing.
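As a rough check on the sizing figures quoted in this paper, the sketch below reproduces two of them in Python. It is only an illustration under stated assumptions: the 50% usable-RAM fraction and the 15 to 20 TB total-RAM range are guesses chosen to match the text, not numbers given by the author, and the function names are hypothetical.

```python
# Rough sizing arithmetic for figures quoted in the text.
# usable_fraction and the 15-20 TB RAM range are assumptions, not source figures.

def cluster_capacity_tb(servers, ram_tb_per_server, usable_fraction, compression):
    """Uncompressed data (TB) that fits in the RAM of the whole configuration."""
    usable_ram_tb = servers * ram_tb_per_server * usable_fraction
    return usable_ram_tb * compression

def backup_disk_tb(total_ram_tb, backup_multiple=5):
    """Backup disk (TB) when backups are sized as a multiple of RAM."""
    return total_ram_tb * backup_multiple

# 16 servers x 1 TB each, half of RAM assumed free for data, 5x compression.
capacity = cluster_capacity_tb(16, 1.0, 0.5, 5.0)

# A 50 TB warehouse at 3.5x compression needs about 14.3 TB of RAM for data alone.
ram_for_data = 50 / 3.5

# Assuming 15 to 20 TB of total RAM once overhead is added,
# backups at 5x RAM fall in the 75-100 TB range quoted in the text.
backup_low = backup_disk_tb(15)
backup_high = backup_disk_tb(20)

print(capacity, round(ram_for_data, 1), backup_low, backup_high)
```

Under these assumptions, the 40 TB ceiling corresponds to a full 16-server configuration rather than a single server, and the 75-100 TB backup estimate follows directly from the 5X rule applied to the RAM needed for a 50 TB warehouse.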
  • 11. Volume
    In-memory cannot economically, or even practically, scale to the volumes of today’s data warehouses. Ten years ago, a terabyte-size data warehouse was remarkable, but today there are dozens, perhaps even more than a hundred, that are greater than a petabyte, one thousand times larger. Projections are that this growth rate is not diminishing.
    Dual-Purpose OLTP and Analytics
    Some iMDB products promise the ability to perform OLTP and analytical processing on the same platform, with the same data. This would be a real advantage, as it would alleviate the need to extract and transform data from operational systems and would provide analytical support without an additional platform. Unfortunately, this is currently impossible.
    iMDB platforms generally cannot support OLTP because they have to wait for a transaction to complete on disk to be ACID compliant. When data is updated in memory, the change is recorded in log files, usually stored on SSD drives. iMDB platforms use this disk-based “persistent” layer to “weather” a node failure, which, in a narrow sense, suggests they have ACID properties. When the iMDB node comes back up (after the failed part is replaced or the cold standby node takes over), the data that is resident on the disk “persistent” layer is reloaded back into memory. This can be done in one of two ways: “Lazy,” where the data is reloaded as queries enter the system and request a specific table (which doesn’t really make sense, since the iMDB appears in memory as a one-dimensional table), or “Full,” where queries must wait until all the data is reloaded.
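The logging and reload mechanism described above can be sketched in a few lines of Python. This is a toy illustration, not how any particular iMDB product works: the class name `TinyInMemoryStore`, the file names and the JSON log format are all assumptions made for the example, and a production system would also write snapshots atomically.

```python
import json
import os
import tempfile

class TinyInMemoryStore:
    """Toy key-value store: data lives in a dict, durability comes from disk files."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "redo.log")
        self.data = {}

    def put(self, key, value):
        # Append the change to the log and force it to disk before applying it
        # in memory; this wait on storage is the cost of durability.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value

    def checkpoint(self):
        # Persist a full snapshot, then truncate the log it supersedes.
        with open(self.snapshot_path, "w", encoding="utf-8") as snap:
            json.dump(self.data, snap)
            snap.flush()
            os.fsync(snap.fileno())
        open(self.log_path, "w").close()

    def recover(self):
        # "Full" reload: read the snapshot, then replay the log on top of it.
        self.data = {}
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as snap:
                self.data = json.load(snap)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as log:
                for line in log:
                    entry = json.loads(line)
                    self.data[entry["key"]] = entry["value"]


def demo():
    with tempfile.TemporaryDirectory() as directory:
        store = TinyInMemoryStore(directory)
        store.put("a", 1)
        store.checkpoint()
        store.put("b", 2)  # recorded only in the log, not in the snapshot
        restarted = TinyInMemoryStore(directory)  # simulates losing RAM contents
        restarted.recover()
        return restarted.data

print(demo())
```

Every `put` waits on a disk write, which is the point the text makes about ACID compliance: the in-memory copy is fast, but a durable update is only as fast as the log device.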
    In both cases, the log files stored on disk or flash have to be read and applied.
    There are features to handle different kinds of failure, though. Both the SSD area and the disk persistent layer have RAID capability to cover for a disk failure. So, if a node has a problem but keeps power, then all “may be” OK; it is an “error dependent” issue. If there is a problem with a memory chip, it is unlikely the data will survive, requiring a total reload.
  • 12. If a node loses power, then a total reload of all the data that was on that node is required.
    Not so “green”
    At a time when most vendors are formulating a “green” message, it turns out that iMDBs require a lot of power, considerably more than spinning drives and significantly more than solid-state drives (SSD; more on this below). RAM is volatile and needs to be powered 24/7 if the data is to persist.
    The Hybrid DBMS
    iMDB vendors often portray disk-based systems as dinosaurs that have outlived their usefulness, but in fact they are the result of 30 years of research and development by some of the most brilliant minds in the technology industry, and they have hardly been standing still. In the same way that relational database technology gradually gained new hardware capabilities and evolved to become hybrid-DBMS, it seems likely that the major database vendors will continue to evolve to leverage the advantages of more memory over disk drives. The dramatic cost reductions of memory have benefits that accrue to hybrid-DBMSs too: solid-state drives are replacing traditional magnetic drives, with improvements in I/O speed. Teradata Virtual Storage, for example, automatically manages the movement of hot and cold data. Large memory models are common, too, even if the persistent data remains on attached storage instead of completely in memory.
    Another consideration is that for most database applications, there is a clear difference between hot and cold data.
    In other words, there is data that is in use at the moment, as opposed to data that is used less frequently. This tilts the decision between disk-only and in-memory toward an in-between alternative: a hybrid scheme with large memory, SSD drives, and less expensive, slower HDD for warm or cold data. Hybrid-DBMS leverage the speed of SSD to reduce query response times by cutting the painful delays introduced by lengthy I/O queues in HDD storage. A query requires many I/O operations to complete, so the time spent with I/O requests in storage queues has a direct impact.
  • 13. Not only do the speed and parallel channel capability of SSD result in 40X faster I/O completions, but the queues in the HDD are also shortened by aiming 80% of I/O at the SSD. This can result in up to a 60X improvement in average response times.
    A hybrid scheme requires not only a physical assemblage of devices, but also an intelligent data manager that continually and transparently optimizes the architecture by moving data to its best location. The figure below represents Teradata’s version of such a system.[2]
    Notice that in this scheme, each node is balanced with a combination of CPUs and their characteristics, the amount of RAM and the storage devices. This provides for optimum balance between processing, memory and addressable storage, which leads to optimal performance. It does, however, somewhat limit configuration flexibility, as the drives and CPUs are fixed.
    [2] Teradata are working on extending the data management to the memory layer.
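To make the idea of a data manager that places hot data on faster storage more concrete, here is a minimal Python sketch. It is a hypothetical illustration only and does not describe Teradata Virtual Storage or any vendor's algorithm: the function names, the block names and the access counts are invented, and the policy (rank by recent access count) is the simplest one that demonstrates the principle.

```python
from collections import Counter

def place_blocks(access_counts, ssd_slots):
    """Assign the most frequently accessed blocks to SSD, the rest to HDD.

    access_counts: mapping of block id -> number of recent accesses.
    ssd_slots: how many blocks fit on the (hypothetical) SSD tier.
    """
    # Sort by descending access count; ties are broken by block id.
    ranked = sorted(access_counts, key=lambda b: (-access_counts[b], b))
    hot = set(ranked[:ssd_slots])
    return {block: ("SSD" if block in hot else "HDD") for block in access_counts}

def ssd_hit_ratio(access_counts, placement):
    """Fraction of all accesses that would be served from the SSD tier."""
    total = sum(access_counts.values())
    on_ssd = sum(n for b, n in access_counts.items() if placement[b] == "SSD")
    return on_ssd / total if total else 0.0

# Invented workload: recent data is accessed far more often than old data.
accesses = Counter({"orders_2012_q4": 500, "orders_2012_q3": 300,
                    "orders_2011": 120, "orders_2010": 50, "orders_2009": 30})
placement = place_blocks(accesses, ssd_slots=2)
print(placement)
print(ssd_hit_ratio(accesses, placement))
```

With this skewed workload, two of the five blocks on SSD absorb 80% of the accesses, which is the kind of ratio the text refers to when it speaks of aiming 80% of I/O at the SSD. A real data manager would re-evaluate placement continuously as access patterns change.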
  • 14. Compare and Contrast
    Today there are two ways to store data electronically: on arrays of solid-state memory chips (on either a memory bus or on SSD) or on magnetic disk drives. Solid-state chips are obviously faster than magnetic drives (although in some cases, the differential can be overcome with good platform design and workload management). Solid-state chips are considerably more expensive than magnetic drives, and volatile RAM chips are considerably more expensive (and faster) than non-volatile RAM. We can’t see the future with perfect clarity, but it is likely that, for the foreseeable future, this stratification of memory and storage will not change, even as the price/performance of each continues to improve. The faster RAM chips will remain volatile, making full in-memory databases impractical for most uses.
    iMDB lack a balance of CPU and storage, which could lead to flooding of the CPUs. iMDB trades the potential for I/O latency for the very real possibility of RAM out-performing the processors. Without an I/O bottleneck, processors can become saturated. This is something that software developers should be aware of, and design for, but given the relative recency of certain iMDBs, these features may not be well developed. Client applications may need to be rewritten not only to take advantage of the memory resources but also to keep them from bogging down.
    iMDB rely on large banks of very fast, expensive RAM, but also on the other types of memory and storage to operate for high availability and for backup.
    Hybrid-DBMS relies on the same collection of memory and storage types, but in different proportions. A hybrid system uses solid-state memory judiciously and attempts to keep as much data pinned in memory as possible for active work, but relies on only one mechanism for persistent storage.
    Conclusion
  • 15. iMDB vendors claim that in-memory will replace traditional hybrid-DBMS, but unless there are new laws of physics, holding persistent data for months or years simply isn’t feasible without resorting to a hybrid in-memory and disk-based system. In a way, one can think of an iMDB as merely an accelerator for a conventional database, because it cannot meet the requirements of durability on its own.
    On the other hand, hybrid-DBMS are based on proven data warehousing technologies, offer flexible architectures and deliver high performance with automatic storage management.
    It would be easy to predict that iMDBs, and that includes DBMS with all SSD drives, will eventually overtake disk-based systems. However, the cost of memory, whatever it is, will still be greater than that of disk drives, and though it is impossible to predict with certainty, the amount of data captured and analyzed will likely continue to grow at a rate faster than the price per GB of memory falls.
  • 16. ABOUT THE AUTHOR
    Neil Raden, based in Santa Fe, NM, is an industry analyst and active consultant, widely published author and speaker, and the founder of Hired Brains, Inc., http://www.hiredbrains.com. Hired Brains provides consulting, systems integration and implementation services in Data Warehousing, Business Intelligence, “big data,” Decision Automation and Advanced Analytics for clients worldwide. Hired Brains Research provides consulting, market research, product marketing and advisory services to the software industry.
    Neil was a contributing author to one of the first (1995) books on designing data warehouses, and he is more recently the co-author of Smart (Enough) Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions, Prentice-Hall, 2007. He welcomes your comments at nraden@hiredbrains.com or at his blog, Competing on Decisions.