
Thinking Outside the Cube: How In-Memory Bolsters Analytics

The Briefing Room with Mark Madsen and IBM
Live Webcast on Aug. 27, 2013
Visit: www.insideanalysis.com

What's old is often new again, especially in the world of information management. The innovation of OLAP cubes years ago transformed business intelligence by empowering analysts with significantly faster number-crunching capabilities. Today, with data volumes exploding, a new kind of cube is offering similar value, thanks in large part to in-memory analytics.

Register for this episode of The Briefing Room to learn from veteran analyst and practitioner Mark Madsen of Third Nature, who will explain how this new wave of in-memory technology can give analysts a needed boost for dealing with the rising tide of data volumes and types. He'll be briefed by Chris McPherson of IBM Business Analytics, who will tout IBM Cognos Dynamic Cubes, which were specifically designed to let business users maintain the speed and agility they need for their analytical solutions.


Transcript

  • 1. The Briefing Room Thinking Outside the Cube: How In-Memory Bolsters Analytics
  • 2. Twitter Tag: #briefr The Briefing Room Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com
  • 3. Mission: Reveal the essential characteristics of enterprise software, good and bad. Provide a forum for detailed analysis of today's innovative technologies. Give vendors a chance to explain their product to savvy analysts. Allow audience members to pose serious questions... and get answers!
  • 4. Topics. This Month: ANALYTIC PLATFORMS. September: ANALYTICS. October: DATA PROCESSING.
  • 5. Analytic Platforms. "If you always do what you always did, you will always get what you always got." ~Albert Einstein
  • 6. Analyst: Mark Madsen. Mark Madsen is president of Third Nature, Inc.
  • 7. IBM. IBM Cognos Business Intelligence is an enterprise BI platform with an open-data access strategy. The platform includes IBM Cognos Dynamic Cubes, an in-memory relational OLAP component that complements the existing query engine. Dynamic Cubes can enable users to perform interactive analysis and reporting over terabytes of data.
  • 8. Guest: Chris McPherson. Chris McPherson is a Senior Product Manager on the IBM Business Analytics Platform team in the IBM Canada Ottawa Lab. His current area of responsibility is IBM Cognos Dynamic Cubes; before that, he was product owner for Modelling, Metadata and EII for the Cognos suite of tools. He has more than nine years of experience within the IBM Business Analytics organization.
  • 9. © 2012 IBM Corporation IBM Cognos Dynamic Cubes Chris McPherson – Senior Product Manager IBM Business Analytics
  • 10. Dynamic Cubes: feature mission. High-performance analytics over growing data volumes: aggregate awareness; aggregate optimization; in-memory caching of members, data, expressions, results, and aggregates.
  • 11. Extensive caching: shared caches for maximum reuse; all caches are security aware. [Diagram labels: Data Cache, In-Memory Aggregate Cache, Expression Cache, Result Set Cache, Member Cache, MDX Engine, Data Warehouse, Security (three times).]
  • 12. [Diagram labels: Query Processor, DQM, Dynamic Cube, Result Set Cache, MDX Engine, Expression Cache, Member Cache, Data Cache, In-memory Aggregates, Security (three times).]
  • 13. [Same diagram, adding: SQL queries to obtain member information.]
  • 14. [Same diagram, adding: SQL queries to obtain aggregate data.]
  • 15. [Same diagram, adding: Initial Query; SQL queries to obtain fact and summary data; Search aggregate cache for data.]
  • 16. [Same diagram and labels as slide 15.]
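The caching diagrams on slides 11 to 16 suggest a layered lookup: a query is answered from an in-memory cache when possible, and SQL goes to the warehouse only on a miss. The following is a minimal sketch of that general pattern, not IBM's implementation. The class name `TieredCache`, the tier names, the lookup order, and the choice to store fetched values in the last tier are all assumptions made for illustration.

```python
from typing import Callable, Dict, Hashable, List, Tuple


class TieredCache:
    """Hypothetical tiered lookup: check each in-memory tier in order,
    then fall back to the warehouse on a miss."""

    def __init__(
        self,
        tier_names: List[str],
        fetch_from_warehouse: Callable[[Hashable], object],
    ) -> None:
        if not tier_names:
            raise ValueError("at least one cache tier is required")
        # Dicts preserve insertion order, so tiers are searched in the
        # order given by the caller (an assumed order, not a documented one).
        self.tiers: Dict[str, Dict[Hashable, object]] = {
            name: {} for name in tier_names
        }
        self.last_tier = tier_names[-1]
        self.fetch_from_warehouse = fetch_from_warehouse

    def get(self, key: Hashable) -> Tuple[object, str]:
        """Return (value, source), where source names the tier that
        answered the query, or "warehouse" if every tier missed."""
        for name, store in self.tiers.items():
            if key in store:
                return store[key], name
        value = self.fetch_from_warehouse(key)
        # Keep the fetched value in memory so a repeat query avoids SQL.
        self.tiers[self.last_tier][key] = value
        return value, "warehouse"
```

With tiers such as `["result_set", "aggregate", "data"]`, the first request for a key reaches the warehouse callback and later requests for the same key are served from memory, which is the load-shifting effect the slides describe.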
  • 17. Dynamic Cube Lifecycle: 1. Model & publish. 2. Deploy, manage. 3. Reporting & analytics. 4. Optimize. [Diagram labels: Dynamic Cube Server, Dynamic Cube Logs, CM, Warehouse.]
  • 18. Aggregate Advisor for in-memory aggregates: easy performance improvements. 1. Launch the Aggregate Advisor wizard. 2. Run with or without workload; optimize per report, package, user, or time. 3. Advisor returns with in-memory and/or in-database recommendations. 4. Save recommendations: in-memory aggregates are created on restart, with no re-modeling or re-authoring required; the DBA creates in-database aggregate tables, and the modeler updates the model and redeploys.
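The slide does not say how the Aggregate Advisor turns a workload into recommendations. As a rough illustration of what "run with workload" could mean, the sketch below uses a simple frequency heuristic: count how often each combination of dimension levels is queried and propose the most common combinations as aggregate candidates. The function name `recommend_aggregates`, the workload format, and the heuristic itself are assumptions, not the product's algorithm.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple


def recommend_aggregates(
    workload: Iterable[Iterable[str]], top_n: int = 2
) -> List[Tuple[FrozenSet[str], int]]:
    """Return the top_n most frequently queried combinations of
    dimension levels, each paired with its query count.

    Each workload entry is the set of levels one query grouped by,
    for example ["year", "region"]. Order within an entry is ignored.
    """
    counts = Counter(frozenset(levels) for levels in workload)
    return counts.most_common(top_n)
```

A real advisor would also have to weigh aggregate size against available memory; this sketch ranks by frequency only.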
  • 19. Virtual Cubes. Virtual cubes combine two cubes. Combines cubes with common Time dimension: Inventory + Sales = SalesInventory. Combines cubes with nearly identical dimensions: Store Sales, Web Sales. A virtual cube can be used as the source for another virtual cube.
  • 20. Virtual Cubes: low latency and faster cube refresh. [Diagram labels: Time, Current Month, All Sales cube, All Sales, Current Month Sales, Historic Sales.]
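The idea on slides 19 and 20 can be sketched in a few lines: a virtual cube holds no data of its own and answers queries by combining its source cubes, so only the small current-month cube needs refreshing. This is a toy model under stated assumptions, not Cognos behavior. The classes `Cube` and `VirtualCube`, the two-dimension `(month, product)` cell key, and the use of summation as the merge rule are all hypothetical.

```python
from typing import Dict, Tuple

Cell = Tuple[str, str]  # (month, product): a hypothetical two-dimension key


class Cube:
    """Toy cube: a mapping from (month, product) to a sales measure."""

    def __init__(self, cells: Dict[Cell, float]) -> None:
        self.cells = dict(cells)

    def total(self, product: str) -> float:
        """Sum the measure for one product across all months."""
        return sum(v for (_, p), v in self.cells.items() if p == product)


class VirtualCube:
    """Stores nothing itself; delegates each query to its sources.

    A source may be a Cube or another VirtualCube, since both
    expose the same total() method.
    """

    def __init__(self, *sources) -> None:
        self.sources = sources

    def total(self, product: str) -> float:
        return sum(source.total(product) for source in self.sources)
```

Because the virtual cube reads its sources at query time, updating a cell in the current-month cube is visible immediately without rebuilding the historic cube.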
  • 21. Cognos Dynamic Cubes: Summary. High Performance: 80x improvement with aggregates; 80% of queries under 3 seconds; over 50% of queries sub-second. Growing Data Volumes: scalable to terabytes of fact data. Flexible and Optimized: you choose where to take advantage of in-memory capabilities; Aggregate Advisor to easily create optimized aggregates. Maximize Value of Data Warehouse: aggregate awareness to balance load across app and DB tiers; reduce load on database through use of application tier caching.
  • 23. Perceptions & Questions. Analyst: Mark Madsen
  • 24. Commentary on analysis and performance, IBM Business Analytics Briefing Room. August 27, 2013. Mark Madsen, www.ThirdNature.net, @markmadsen
  • 25. Terminology Disambigufication. Analysis: a. The separation of an intellectual or material whole into its constituent parts for individual study. b. The study of such constituent parts and their interrelationships in making up a whole. Analytics: the mathy stuff, like statistics, machine learning, numerical methods, data mining* (so I won't use the term as a synonym for OLAP). In-memory: a vague term mainly implying not using disks for immediate data access.
  • 26. BI is using broken metaphors. We think of BI as publishing, which is only one part.
  • 27. Most BI is built on an outdated interaction model.
  • 28. Result of a poor interaction model: delayed interaction disrupts work. "...each second of system response degradation leads to a similar degradation added to the user's time for the following [command]. This phenomenon seems to be related to an individual's attention span. The traditional model of a person thinking after each system response appears to be inaccurate. Instead, people seem to have a sequence of actions in mind, contained in a short-term mental memory buffer. Increases in SRT [system response time] seem to disrupt the thought processes, and this may result in having to rethink the sequence of actions to be continued." Note nonlinearity in graph, an indication that something important is happening. "The Economic Value of Rapid Response Time", IBM 1982
  • 29. Traditional BI fails to put users into the flow zone. Flow (Csíkszentmihályi): ▪ Concept of engagement and immersion in a task. ▪ The appropriate application of tools and knowledge to analytical problems enables productivity. ▪ The stilted interaction of BI disrupts flow.
  • 30. Interaction timescale for analysis problems. Until you resolve this task performance gap, real analysis work is a challenge (and a reason why Excel remains popular). Timescale: Days, Hours, Minutes, Seconds, Instantaneous. User behavior along that scale, slowest to fastest: come back tomorrow; go to lunch; take a break; get some coffee; check email/FB; take a sip of coffee; immerse yourself in work. Flow is possible only in the "less than 3 second" range.
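One way to apply the "less than 3 second" flow threshold in practice is to measure what share of queries actually lands inside it. The sketch below computes that share, plus the sub-second share, from a list of measured response times. The function name `latency_profile`, the constant name, and the dictionary keys are illustrative assumptions; only the 3-second and sub-second cut-offs come from the slides.

```python
from typing import Dict, Sequence

FLOW_THRESHOLD_SECONDS = 3.0  # the "less than 3 second" range from the slide


def latency_profile(latencies_seconds: Sequence[float]) -> Dict[str, float]:
    """Return the fraction of queries that were sub-second and the
    fraction that finished under the flow threshold."""
    if not latencies_seconds:
        raise ValueError("need at least one measurement")
    n = len(latencies_seconds)
    sub_second = sum(1 for t in latencies_seconds if t < 1.0)
    under_flow = sum(1 for t in latencies_seconds if t < FLOW_THRESHOLD_SECONDS)
    return {
        "sub_second": sub_second / n,
        "under_flow_threshold": under_flow / n,
    }
```

Running this over a query log before and after a caching change gives a direct check on whether more interactions now fall inside the flow range.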
  • 31. Future-proofing. The tool market is shifting, driven by new architectures that are enabled by new technologies. Front-end tools are evolving away from BI-as-publishing, which is going to increase the burden on the back-end data stores and cause interaction problems. You need to evaluate tools based on more detailed usage scenarios and interactive capabilities, less on report-building features.
  • 32. BI should support two sets of actions. One is monitoring the known, one is analyzing the unknown. [Diagram labels: Collect new data; Monitor; Analyze Exceptions; Analyze Causes; Decide; Act; No problem; No idea; Do nothing; Act on the process (usually days/longer timeframe); Act within the process (usually real-time to daily).]
  • 33. The real BI design point: context and point of use Information use is diverse and varies based on context: ▪  Get a quick answer ▪  Solve a one-off problem ▪  Make repetitive decisions ▪  Monitor routine processes ▪  Make complex decisions ▪  Choose a course of action ▪  Convince others to take action Different problems require different response times in order to be effective.
  • 34. How  expensive  was  performance?  500  GB  DW…   Maximum Capacities •  2 to 30 100MHz Intel Pentium processors •  Up to 3.5GB system memory •  Up to 1.7TB of on-line storage Base Configuration •  18 slot Sequent bus chassis •  1 Proc card - dual 100MHz Pentium CPUs •  1 2.1GB SCSI boot disk •  1 CD-ROM/QIC-525 1/4” Tape •  1 Memory controller (64MB, 256MB) •  1 Integrated Ethernet •  5-slot VMEbus chassis •  Room for 3 additional 5.25” devices Expansion Options •  Up to 400 SCSI-2 disks •  Up to 29 VMEbus slots •  Up to 8 QCIC I/O controllers •  Token Ring, FDDI LAN adapters •  Sync or Async communications ports Price: $1.6 million in 1993
  • 35. OLAP was a response-time answer. The Codd OLAP paper written for a vendor in 1993: state of the art client technology was the 60 MHz Intel Pentium, Windows version 3.1; server tech was the $1M+ database server. It's still hard to get less than 3 second response times from a round-trip to a DB. It's still hard to get interaction right when the BI model is mainly compose-compile-execute.
  • 36. You lied about it being in-memory I didn’t say it would all fit in at the same time…
  • 37. Differentiating in-memory claims. Tool vs Platform: OLAP is (generally) in-memory technology; there are tradeoffs in the choice. Platform: a) Conventional: use a large buffer pool and cache or pin everything in memory. Speeds up a DB, but not really "in-memory". b) Memory optimized: designed assuming all or mostly in memory; map the data needed for operations to memory and/or add features to recognize and use large-memory configurations. c) In-memory: purpose-built, the entire database is resident in main memory; the only disk access is loading on a cold start or logging changes.
  • 38. Some questions to start discussion: 1. Will this work with any database back-end? 2. Who are these features aimed at: end users or the people who define structures and manage data for the end users? 3. Are cube definitions static in this model? 4. Can cubes be populated in slices or layers based on what a person is looking at? 5. How do the caching improvements address cube-building times? 6. Is this addressing static performance management or dynamic? 7. Are virtual cubes defined by the user or admin or can they be automatic?
  • 39. About the Presenter. Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, data integration and data management. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributor at Forbes Online and Information Management. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net
  • 40. About Third Nature. Third Nature is a research and consulting firm focused on new and emerging technology and practices in analytics, business intelligence, and performance management. If your question is related to data, analytics, information strategy and technology infrastructure then you're at the right place. Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors. We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.
  • 41. CC Image Attributions. Thanks to the people who supplied the Creative Commons licensed images used in this presentation: train_to_sea.jpg - http://www.flickr.com/photos/innoxiuss/457069767/ ; well town hall.jpg - http://flickr.com/photos/tuinkabouter/1135560976/
  • 43. Upcoming Topics. September: ANALYTICS. October: DATA PROCESSING. www.insideanalysis.com
  • 44. Thank You for Your Attention