"Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about Real Time?" - Slides (including TIBCO Examples) from JAX 2014 Online

I discuss a good big data architecture which includes Data Warehouse / Business Intelligence + Apache Hadoop + Real Time / Stream Processing. Several real-world examples are shown. TIBCO offers some very nice products for realizing these use cases, e.g. Spotfire (Business Intelligence / BI), StreamBase (Stream Processing), BusinessEvents (Complex Event Processing / CEP) and BusinessWorks (Integration / ESB). TIBCO is also ready for Hadoop, offering connectors and plugins for many important Hadoop frameworks / interfaces such as HDFS, Pig, Hive, Impala, Apache Flume and more.

Transcript

  • 1. © Copyright 2000-2014 TIBCO Software Inc. Hadoop and Data Warehouse – Friends, Enemies or Profiteers? What about Real Time? Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de
  • 2. Disclaimer: These opinions are my own and do not necessarily represent those of my employer.
  • 3. Key Messages: Big Data is not just Hadoop, concentrate on Business Value! A good Big Data Architecture combines DWH, Hadoop and Real Time! The Integration Layer is getting even more important in the Big Data Era!
  • 4. Agenda • Terminology • Data Warehouse and Business Intelligence • Big Data Processing with Hadoop • Big Data Processing in Real Time
  • 5. Agenda • Terminology • Data Warehouse and Business Intelligence • Big Data Processing with Hadoop • Big Data Processing in Real Time
  • 6. Big Data Architecture: DWH / BI, Hadoop, Real Time
  • 7. DWH means analyzing OLAP Cubes http://www.exforsys.com/tutorials/msas/data-warehouse-database-and-oltp-database.html
  • 8. Big Data means analyzing Everything http://blogs.teradata.com/international/tag/hadoop/ • Store everything • Even without structure • Use whatever you need (now or later)
  • 9. Big Data: Three shifts in the Way we analyze Information • Messiness: Using ALL data, not just samples • Also bad data (e.g. Word spell checker, Google auto-complete and "did you mean..." recommendation) • Correlations: Instead of causalities • May not tell us WHY something is happening, but THAT it is happening • In many situations, this is good enough • What drug substance cures cancer? When should I buy an airplane ticket? • Datafication: Store, process, combine, reuse, enhance all data! • Digitalisation (Amazon Kindle → Read) vs. Datafication (Google Books → Read, Search, Process, ...) • Words become data: Google Books: not just read, but also search, analyse, etc. • Locations become data: GPS: not just navigation, but also insurance costs, economic routes, etc.
  • 10. What is Big Data? The combined Vs of Big Data: Volume (terabytes, petabytes), Variety (social networks, blog posts, logs, sensors, etc.), Velocity (realtime), Value
  • 11. Real Time Wikipedia Definition: • Real time programs must guarantee response within strict time constraints, often referred to as "deadlines". Real time responses are often understood to be in the order of milliseconds, and sometimes microseconds. • The term "near real time" refers to the time delay introduced by automated data processing or network transmission. • The distinction between the terms "near real time" and "real time" is somewhat nebulous and must be defined for the situation at hand. For this talk, I define: – Real time == response in nanoseconds || microseconds || milliseconds || <= one second – Near real time == (response time > one second)
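The working definition on slide 11 can be written down as a small check. The following is a minimal Python sketch; the function name `classify_latency` and the choice of seconds as the unit are assumptions made for illustration and are not part of the talk.

```python
def classify_latency(response_seconds: float) -> str:
    """Classify a response time using the talk's working definition.

    Anything answered within one second counts as "real time";
    everything slower counts as "near real time".
    """
    if response_seconds < 0:
        raise ValueError("response time cannot be negative")
    return "real time" if response_seconds <= 1.0 else "near real time"
```

Under this definition a response after 5 milliseconds (`classify_latency(0.005)`) falls into "real time", while a query that needs 3 seconds falls into "near real time".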
  • 12. Agenda • Terminology • Data Warehouse and Business Intelligence • Big Data Processing with Hadoop • Big Data Processing in Real Time
  • 13. Big Data Architecture: DWH / BI, Hadoop, Real Time
  • 14. DWH vs. BI • Data Warehouse (DWH) → Storage • Business Intelligence (BI) → Analytics • Both terms are often used as synonyms, i.e. when someone talks about a DWH, this might include analytics • BI can be used without a DWH
  • 15. Typical DWH Process http://wikibon.org/blog/not-your-fathers-data-analytics/ A DWH is "Business Case driven": • Reporting • Dashboards • Drill Down Analytics Different DWH Options: • Enterprise DWH (== EDW) • Department / Project DWH • Embedded BI (into Applications)
  • 16. BI == Reporting + Statistics + Data Discovery DWH BI
  • 17. BI Visualization
  • 18. Products DWH • SQL: e.g. MySQL • MPP: e.g. Teradata, EMC Greenplum, IBM Netezza – Scale very well (almost linearly) with very high performance, but hardware / software costs also increase a lot BI • Microsoft Excel • BI Tools: e.g. TIBCO Spotfire, Tableau, MicroStrategy Hint: Good BI tools • allow data discovery / visualization using different sources, not just DWH • are easy to use
  • 19. BI Tool Example: TIBCO Spotfire
  • 20. BI Tool Example: TIBCO Spotfire The whole team needs analytics. Spotfire is for everyone, helping users with a variety of skill levels to visualize, explore and share information. It has: • At-a-glance business facts for managers • Dashboards for front-line decision-makers • Visual discovery for business users • Deep data exploration for analysts • Advanced predictive analytics for statisticians • And beautiful visualizations to impress your executives
  • 21. Example: TIBCO Spotfire
  • 22. Live Demo: "TIBCO Spotfire" in action...
  • 23. DWH Real World Use Case http://spotfire.tibco.com/resources/content-center?Content%20Type=Case%20Studies
  • 24. DWH Real World Use Case http://spotfire.tibco.com/resources/content-center?Content%20Type=Case%20Studies
  • 25. Embedded BI Real World Use Case https://www.jaspersoft.com/embeddedShowcase/periscope.html
  • 26. Problems of a DWH No flexibility / agility • Just structured data • Just some (maybe aggregated) history data • Just good for already known business cases Low speed • ETL is batch, usually takes hours or sometimes even days • No proactive reactions possible → "too late architecture" High costs (per GB) • Just selected data • Older data is often moved out to archives
  • 27. Classic BI vs. Big Data BI
  • 28. Agenda • Terminology • Data Warehouse and Business Intelligence • Big Data Processing with Hadoop • Big Data Processing in Real Time
  • 29. Big Data Architecture: DWH / BI, Hadoop, Real Time
  • 30. Why no longer DWH, but Hadoop? Hadoop was built to solve problems of RDBMS and DWH… Benefits of Hadoop: • Store and analyze all data – all data == not just selected (maybe aggregated) data – all data == structured + semi-structured + unstructured → be more flexible, adapt to changing business cases • Better performance (massively parallel) • Ad hoc data discovery – also for big data volumes • Save money (commodity hardware, open source software)
  • 31. What is Hadoop? Apache Hadoop, an open-source software library, is a framework that allows for the distributed processing of large data sets across clusters of commodity hardware using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
  • 32. MapReduce Simple example: • Input: (very large) text files with lists of strings, such as: "318, 0043012650999991949032412004...0500001N9+01111+99999999999..." • We are interested just in some content: year and temperature (marked in red) • The MapReduce function has to compute the maximum temperature for every year
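The task on this slide can be sketched in plain Python without a Hadoop cluster. This is only an illustration of the map, shuffle and reduce steps: the input is assumed to be simplified `year,temperature` lines (a hypothetical format, since the fixed-width record shown on the slide is elided), and all function names are made up for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_record(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit one (year, temperature) pair per well-formed line."""
    parts = line.split(",")
    if len(parts) != 2:
        return  # malformed record: emit nothing
    year, raw_temp = parts[0].strip(), parts[1].strip()
    try:
        yield year, int(raw_temp)
    except ValueError:
        return  # non-numeric temperature: emit nothing


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted temperatures by year."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for year, temp in pairs:
        grouped[year].append(temp)
    return grouped


def reduce_max(year: str, temps: List[int]) -> Tuple[str, int]:
    """Reduce step: keep only the maximum temperature of one year."""
    return year, max(temps)


def max_temperature_per_year(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on a single machine."""
    pairs = (pair for line in lines for pair in map_record(line))
    return dict(reduce_max(year, temps) for year, temps in shuffle(pairs).items())
```

In a real cluster the framework would run many map and reduce tasks in parallel on different nodes and perform the shuffle over the network; here all three steps run in one process to keep the data flow visible.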
  • 33. Hadoop Products (features included, from few to many): Apache Hadoop = MapReduce + HDFS + Ecosystem
  • 34. Hadoop Ecosystem
  • 35. Hadoop Products (features included, from few to many): Apache Hadoop = MapReduce + HDFS + Ecosystem; Hadoop Distribution = Apache Hadoop + Packaging, Deployment-Tooling, Support
  • 36. Hadoop Distributions (… some more arising) EMR
  • 37. Hadoop Products (features included, from few to many): Apache Hadoop = MapReduce + HDFS + Ecosystem; Hadoop Distribution = Apache Hadoop + Packaging, Deployment-Tooling, Support; Big Data Suite = Hadoop Distribution + Tooling / Modeling, Code Generation, Scheduling, Integration
  • 38. Big Data Integration Suite: TIBCO BusinessWorks
  • 39. Live Demo: "TIBCO BusinessWorks" in action...
  • 40. Hadoop Real World Use Case: Replace ETL to improve Performance “The advantage of their new system is that they can now look at their data [from their log processing system] in anyway they want: • Nightly MapReduce jobs collect statistics about their mail system such as spam counts by domain, bytes transferred and number of logins. • When they wanted to find out which part of the world their customers logged in from, a quick [ad hoc] MapReduce job was created and they had the answer within a few hours. Not really possible in your typical ETL system.” http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data (no TIBCO reference)
  • 41. Hadoop Real World Use Case: Storage to reduce Costs Global Parcel Service • A lot of data must be stored "forever" • Numbers increase exponentially • Goal: As cheap as possible • Problem: Queries must still be possible (compliance!) • Solution: Commodity servers and "Hadoop querying" http://archive.org/stream/BigDataImPraxiseinsatz-SzenarienBeispieleEffekte/Big_Data_BITKOM-Leitfaden_Sept.2012#page/n0/mode/2up (no TIBCO reference)
  • 42. DWH or Hadoop? • Data: Structured (DWH) vs. All data (Hadoop) • Maturity: Established in Enterprise (DWH) vs. New concepts (Hadoop) • Tooling: Installed, good knowledge and experience (DWH) vs. New tools, coding required, business can still use SQL-similar queries or same BI tool (Hadoop) • Costs: High per GB (DWH) vs. Low per GB (Hadoop)
  • 43. DWH plus Hadoop? DWH and Hadoop complement each other very well • Store all data in Hadoop (cheap per GB) • ETL from Hadoop to DWH (expensive per GB) • Create specific reports / dashboards in DWH (leverage existing products and knowledge) • Do Ad Hoc (Big) Data Discovery directly in Hadoop, no DWH needed Good BI tools support both DWH and Hadoop! For example, TIBCO Spotfire has connectors to: • RDBMS (e.g. MySQL) • MPP (e.g. Teradata, IBM Netezza, Greenplum) • Hadoop (e.g. Hive, Impala) • In-Memory (e.g. TIBCO ActiveSpaces, SAP HANA)
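One way to picture the "ETL from Hadoop to DWH" step is to aggregate the raw events and load only the compact result into a relational store for reporting. In the sketch below a Python list stands in for the raw data kept in Hadoop and an in-memory SQLite database stands in for the DWH; the table name `logins_per_day`, the event fields `day` and `domain`, and all function names are assumptions for illustration.

```python
import sqlite3
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def aggregate_raw_events(raw_events: Iterable[Dict[str, str]]) -> Counter:
    """'Hadoop side': reduce the raw events to (day, domain) -> login count."""
    counts: Counter = Counter()
    for event in raw_events:
        counts[(event["day"], event["domain"])] += 1
    return counts


def load_into_dwh(conn: sqlite3.Connection, counts: Counter) -> None:
    """'DWH side': store only the small, aggregated result."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logins_per_day ("
        "day TEXT, domain TEXT, logins INTEGER, PRIMARY KEY (day, domain))"
    )
    rows = [(day, domain, n) for (day, domain), n in sorted(counts.items())]
    conn.executemany("INSERT OR REPLACE INTO logins_per_day VALUES (?, ?, ?)", rows)
    conn.commit()


def report(conn: sqlite3.Connection, day: str) -> List[Tuple[str, int]]:
    """A fixed report of the kind a dashboard would run against the DWH."""
    cursor = conn.execute(
        "SELECT domain, logins FROM logins_per_day "
        "WHERE day = ? ORDER BY logins DESC, domain",
        (day,),
    )
    return cursor.fetchall()
```

The raw events stay available for ad hoc discovery, while the report only ever touches the aggregated table, which is the division of labour the slide describes.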
  • 44. Recommendation DWH vs. Hadoop vs. XYZ • Short term: Use Hadoop (only) when you can save (a lot of) money or when you cannot solve your business problem without Hadoop. A lot of things have to be improved, e.g. governance, security, performance, and tool support. • Long term: Hadoop can replace DWH (as you can create a DWH on top of Hadoop with SQL interface already today)! • Be aware: A lot of other options emerge for analyzing big data besides Hadoop, e.g. - Analytical databases with SQL interface (MemSQL, Citus Data) - Log Analytics (Splunk, TIBCO LogLogic) - Graph databases (Neo4j, InfiniteGraph)
  • 45. Vendors' Strategy... Hadoop vendors push Hadoop as DWH replacement → Called e.g. "Enterprise Data Hub" (Cloudera) or "Data Lake" (Hortonworks) http://gigaom.com/2013/10/29/clouderas-plan-to-become-the-center-of-your-data-universe/ http://hortonworks.com/wp-content/uploads/downloads/2013/04/Hortonworks.ApacheHadoopPatternsOfUse.v1.0.pdf
  • 46. Vendors' Strategy... MPP / DWH vendors add Hadoop support as a complementary add-on to their DWH → Reason (probably): Market pressure! → Benefit: One platform (including tooling and support) for DWH and Hadoop
  • 47. Example: EMC combines DWH and Hadoop http://wikibon.org/wiki/v/EMC_Integrates_Greenplum_DB_and_Hadoop_with_Pivotal_HD http://www.gopivotal.com/big-data/pivotal-hd
  • 48. Example: Teradata combines DWH and Hadoop http://www.teradata.com/Teradata-Enterprise-Access-for-Hadoop/ http://gigaom.com/2014/04/07/teradata-says-hadoop-is-good-for-business-but-for-how-long/
  • 49. Hadoop evolving from Batch to Near Real Time Hadoop is MapReduce == Batch (== hours, minutes, seconds) • Good for complex transformations / computations of big data volumes • Not so good for ad hoc data exploration • Improvements: Hive Stinger (Hortonworks) etc. Non-MapReduce processing engines have been added in the meantime (YARN makes it possible) • Ad hoc data discovery (== seconds) • Hive / Pig with Apache Tez replacing MapReduce under the hood for data processing • New query engines, e.g. Impala (Cloudera) or Apache Drill (MapR) MPP vendors (e.g. Teradata, EMC Greenplum) also add their own query engines • Offer fast data exploration (without MapReduce) Some Hadoop problems remain • No good, easy tooling (Hadoop ecosystem) → might be solved in the next years • Missing maturity (alpha / beta versions) → might be solved in the next years • No "real time" (== ms, ns), but "near real time" (> 1 sec) → "too late architecture"
  • 50. Agenda • Terminology • Data Warehouse and Business Intelligence • Big Data Processing with Hadoop • Big Data Processing in Real Time
  • 51. Big Data Architecture: DWH / BI, Hadoop, Real Time
  • 52. Real Time: "The Two-Second Advantage" "A little bit of the right information, just a little bit beforehand – whether it is a couple of seconds, minutes or hours – is more valuable than all of the information in the world six months later… this is the two-second advantage." Vivek Ranadivé, Founder and CEO of TIBCO
  • 53. The Value of Data decreases over Time
  • 54. What is Big Data? The combined Vs of Big Data: Volume (terabytes, petabytes), Variety (social networks, blog posts, logs, sensors, etc.), Velocity (realtime); Fast Data
  • 55. Real Time Architecture? • Transaction Based Architectures (diagram labels: EVENTS, Mainframe/ERP/DB/App, ACTION, Transaction): Not Elastic, Doesn't Scale, "Always Late" architecture and analytics • Behavior Based Architectures (diagram labels: EVENTS, Mainframe/ERP/DB/App, ACTION, Data, Event and Analytics): Elastic, Scales, Real time architecture (Events, Data and Analytics)
  • 56. Complex Event / Stream Processing / In-Memory Concepts • Streams: Monitoring millions of events in a specific time window to react proactively • Stateful: Collect, filter and correlate events with state to anticipate outcomes and react proactively • Transactional: Highly performant transactional event processing Products vs. Frameworks • Products are mature, mission-critical, in production, e.g. TIBCO StreamBase, IBM InfoSphere Streams • Open Source Frameworks, e.g. "Apache Spark" and "Apache Storm" – Future will tell us about performance, tooling, support, etc. – Can be combined with Hadoop – Are complementary to Products such as TIBCO StreamBase In-Memory • Can also be used for "big data" (Terabytes possible!) • Usually complementary, i.e. they can be / have to be combined with stream processing / complex event processing
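The "Streams" concept above, reacting to events inside a time window, could be sketched as follows. This is a plain-Python illustration and not the API of StreamBase or of any framework; the class name, the 30-second window and the threshold of 3 used in the usage note are assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from typing import Deque


class SlidingWindowCounter:
    """Counts events in a sliding time window and flags bursts."""

    def __init__(self, window_seconds: float, threshold: int) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps: Deque[float] = deque()

    def on_event(self, timestamp: float) -> bool:
        """Record one event; return True if the window now holds
        at least `threshold` events (i.e. an alert should fire)."""
        self._timestamps.append(timestamp)
        # Evict events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold
```

With `SlidingWindowCounter(window_seconds=30, threshold=3)`, three events at seconds 0, 10 and 20 trigger an alert on the third event, whereas a single event at second 100 does not, because the earlier events have already left the window. The state kept between events (the deque) is what the slide calls "stateful" processing.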
  • 57. Stream Processing Architecture (diagram labels: LiveView Datamart, Continuous Query, Continuous Query Processor, Ad Hoc Query, Alerts, CEP, Messaging (low latency), Messaging (JMS), Social Media Data, Market Data, In-Memory, ESB Integration, Sensor Data, Historical Data, JDBC, ActiveSpaces, Enterprise data)
  • 58. Stream Processing Architecture (Example: TIBCO StreamBase) (diagram labels: TIBCO StreamBase, Continuous Query, Continuous Query Processor, Ad Hoc Query, Alerts, Active Tables, Trading Signal, Transaction Cost, Orders / Executions, Market Data, Alert Setting, TIBCO LiveView) Snapshot AND always-live updates. Quickly connect to streams. Anticipate opportunities, proactive action.
  • 59. Example: TIBCO StreamBase Tooling StreamBase Development Studio • Visual Development • Visual Debugging • Feed Simulation • Unit Testing StreamBase LiveView • Real Time Analytics and Visualization • Ad hoc queries • Alerts and Notifications • Web, Mobile and API Integration
  • 60. Real World: Real-Time Trade Surveillance (diagram labels: Applications, Integration, Normalization, Aggregation, Correlation, Rules, Alerts, Automation, Adapters and Handlers, StreamBase Server(s), StreamBase Studio for Developing EventFlow Applications, Data Management, Persistence Stores, Logs; Inputs: Market Data, Trade Data, Static Data, Systems Data; Outputs: Performance Benchmarks, Automation, Desktop Alerts)
  • 61. Real Time (Stream Processing) Real World Use Case: Real-Time Fraud Detection "The firm needs to monitor machine-driven algorithms, and look for suspicious patterns. Sounds simple, right? Not so simple! In this case, the patterns of interest required correlation of 5 streams of real-time data. Patterns happen within 15-30 second windows, during which thousands of dollars could be lost. Attacks come in bursts. The data required to find these patterns was loaded into a data warehouse and reports were checked each day. Decisions to act were made every day. LiveView now intercepts the data before it hit the warehouse by connecting LiveView to the source of data. It took 3 days to integrate these sources because it took that long to find someone who knew where 3 of the data streams came from! StreamBase detects fraud patterns in milliseconds. But the really interesting part came next. Once this firm could see patterns of fraud, they were faced with a new challenge: what to DO about it? How many times did the pattern need to be repeated until active surveillance is started? Should the action be quarantined for a period, or halted immediately? All these questions were new, and the answers to them keeps changing. The fact that the answers keep changing highlights the importance of ease of use. Analytics must be changed quickly and be made available to fraud experts - in some cases, in hours - as understanding deepens, and as the bad guys change their tactics. Better, higher value-add customer service for highly automated industries. Knowledge workers who anticipate sales opportunities. Spotting fraud in high-speed transactions streams and taking action." Some more use cases: http://streambase.typepad.com/streambase_stream_process/2012/04/streambase-liveview-10-3-stories-from-the-trenches.html
  • 62. Real Time (CEP + In-Memory) Real World Use Case "With 38 million fans, MGM knows how to put its customers first, it takes more than a smile too. Customers want a personalized, tailored experience, one that knows their name and can anticipate their needs. With the help of TIBCO technologies that leverage big data and give customers a digital identity, MGM can send personalized offers directly to customers, save them a seat, and have their favorite drink on the way. With multiple customer touch points and channels, MGM can reach customers in more ways, and in more places, than ever before." https://www.youtube.com/watch?v=X-7S3kCOx9k CEP: • Correlate • Analyze • Action In-Memory: • Enable Real Time • Only customers that have checked in
  • 63. Live Demo: "TIBCO StreamBase" in action...
  • 64. Real Time plus Hadoop? Hadoop: • Storage • Complex computing (MapReduce) Real Time: • Immediate (proactive) reactions – automated or manually by user • Monitor streaming data in Real Time Example: TIBCO StreamBase and its Apache Flume connector for reading streaming data from Hadoop / HDFS or sending streaming data to Hadoop / HDFS
  • 65. Real Time plus Hadoop Real World Use Case Use Case: • Predict pricing movement in live bets Hadoop: • Store all history information about all past bets • Use MapReduce to precompute odds for new matches, based on all history data TIBCO StreamBase: • Compute new odds in real time to react within a live game after events (e.g. when a team scores a goal) • Monitor stream data in real time dashboards http://www.casestudyu.com/news/2014/04/04/7762652.htm http://vimeo.com/91461315
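The split between batch and real time in this use case might look roughly like the sketch below. The probability model is deliberately naive and entirely hypothetical (a home-win share taken from history, shifted by a fixed amount per goal); it only illustrates that the batch side would precompute a baseline offline while the stream side updates it per event. All names and constants are assumptions.

```python
from typing import Sequence


def precompute_home_win_probability(history: Sequence[str]) -> float:
    """Batch step (the Hadoop role): share of past matches won by the
    home team. Each history entry is "home", "away" or "draw"."""
    if not history:
        return 0.5  # no data: fall back to an even baseline
    wins = sum(1 for result in history if result == "home")
    return wins / len(history)


class LiveOdds:
    """Real-time step (the stream-processing role): adjust the
    precomputed baseline whenever a goal event arrives."""

    GOAL_SHIFT = 0.15  # hypothetical probability shift per goal

    def __init__(self, baseline_probability: float) -> None:
        self.probability = baseline_probability

    def decimal_odds(self) -> float:
        """Decimal odds for a home win, rounded to two places."""
        return round(1.0 / self.probability, 2)

    def on_goal(self, scoring_side: str) -> float:
        """Update the home-win probability after a goal and return new odds."""
        shift = self.GOAL_SHIFT if scoring_side == "home" else -self.GOAL_SHIFT
        # Clamp so the odds stay finite.
        self.probability = min(0.95, max(0.05, self.probability + shift))
        return self.decimal_odds()
```

The expensive pass over all history runs once before the match; during the game each event only triggers a constant-time update, which is what makes a reaction within the live game feasible.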
  • 66. Recap: Big Data Architecture: DWH / BI, Hadoop, Real Time
  • 67. Off Topic: What about Integration?
  • 68. Off Topic Integration is not a topic of this session… However: It gets even more important in the future! The number of different data sources and technologies increases even more than in the past – CRM, ERP, Host, B2B, etc. will not disappear – DWH, Hadoop cluster, event / streaming server, In-Memory DB have to communicate – Cloud, Mobile, Internet of Things are not optional, but our future!
  • 69. Recap: Key Messages: Big Data is not just Hadoop, concentrate on Business Value! A good Big Data Architecture combines DWH, Hadoop and Real Time! The Integration Layer is getting even more important in the Big Data Era!
  • 70. Questions? Kai Wähner kwaehner@tibco.com, @KaiWaehner, www.kai-waehner.de