SOA & Big Data
Arnon Rotem-Gal-Oz

Some of the challenges and opportunities

Published in: Technology

  1. SOA & Big Data. Arnon Rotem-Gal-Oz
  2. Sept 2012: iOS6 launched with a new maps application
  3. But something went terribly wrong... (http://theamazingios6maps.tumblr.com/)
  4. Lessons from the iOS6 maps launch (http://theamazingios6maps.tumblr.com/):
     • It isn't just about getting all the data there
     • Algorithms are cool, but we need humans in the loop
     • Hire the right people
     • Test! Test! Test!
  5. It isn't just one pile of data (http://theamazingios6maps.tumblr.com/)
  6. Integrating Big data & SOA (image credit: Yoel Ben Avraham, http://www.flickr.com/photos/epublicist/3546059144/)
  7. Data Refinery (image credit: Ofer Berger, http://www.haifacity.com/allsites/allpic/a/A1738/A1738Pic3326.jpg)
  8. [Diagram] ETL integration; Department DB integration; File-based integration; Online integration (other labels: Server, DB)
  9. Object soup [diagram: a scatter of objects with opaque three-letter names such as ASB, BLT, AFT, TGI, FRY]
  10. Services [diagram: the same objects grouped into services: Customer, Orders, Promotions, Invoices]
  11. [Diagram: SOA components and their relations] Components: Service, Service consumer, Endpoint, Contracts, Messages, Policy. Relations: Adheres to, Governed by, Binds to, Exposes, Serves, Understands, Implements, Describes, Sends/receives. Key: Component, Relation.
  12. Customer, Interactions, Agents, Categories
  13. Integrating Big data & SOA (image credit: Yoel Ben Avraham, http://www.flickr.com/photos/epublicist/3546059144/)
  14. Saga [diagram] Components: Initiator, Coordinator*, Participator, Service consumer, Service. Relations and labels: Create context; Register / Registration; Prepare/commit/undo protocol; Perform activity; Compensate; Activities and replies. Key: SOA component, Pattern component, Relation, Concern/attribute.
  15. [Diagram: Hadoop Cluster] HCatalog; Data Management; HBase (Interactions); HBase (Customer); HBase (Interaction); HBase; ETL; Recordings; NIM; Raw Categories (HDFS); Resolved Interactions (HDFS), shown twice
  16. So, what's the problem?
  17. & Big data can't move
  18. Performance of joins in a distributed system sucks! [Diagram] Node 1: Interactions 0-99, customers A-H. Node 2: Interactions 200-299, customers I-M. Node 3: Interactions 100-199, customers N-Z. Example document: {"Interaction": {"id": "5", "participants": {"customer": [{"surname": "McDonalds", "name": "Old"}]}}}
  19. Cookie cutter scalability
  20. Cell architecture [diagram: Node 1, Node 2, Node 3, Node N]
  21. Cell Architecture [diagram labels: HBase (five instances), Categories, Customers, Interactions, Reference Data, ORCA, BUS, HDFS, …]
  22. Orchestration [diagram] Components: Endpoint, Service, Workflow engine, Workflow instance, Host. Relations: Initiate business process; Manage process; Route request; Invoke services; Schedule workflows; Manage workflows; Monitor workflows.
  23. Map Reduce processing pipeline [diagram: Hadoop Map/Reduce]
      Map pipeline (input: InteractionID, Segment Row; Customers in a local cache): Retrieve segment data and create segment document; Resolve segment Customer IDs; Categorize Segment; Update Segment document; Write Categories Results; Write Interaction.
      Reduce pipeline: Map Interaction & Segments; Update Interaction document; Categorize Interaction; Prepare data mart Export; Write Categories Results; Write Interaction (shown twice).
      Labels in parentheses on the slide: Customer, Categorization, Interaction, Datamart.
  24. Map Reduce processing pipeline [diagram: the Map pipeline only, as on slide 23]
  25. Data Facets
  26. [Diagram: data store landscape] Categories: Caching, Data grid, In-memory, Key-value store, Columnar, Graph, Document, Relational, NewSQL, Distributed file systems, Indexing, Analytics/MPP. Products: Memcached, GigaSpaces, Redis, GridGain, Oracle Coherence, WebSphere eXtreme Scale, Hama, HBase, Cassandra, Pregel, Accumulo, Hypertable, Neo4j, Hadoop, GlusterFS, RavenDB, ScaleBase, MongoDB, CouchDB, IndexTank, Amazon RDS, VoltDB, Apache Solr, Attivio, Aster Data, Microsoft PDW, ParAccel, SAP HANA, Oracle Exadata, HP Vertica, IBM Netezza, EMC Greenplum.
  27. Data is multi-tiered [diagram] Tiers: Datamart(s) (RDBMS); Cube (MOLAP); Real-time (in memory); Data warehouse (Hadoop/HBase). Retention labels: 1-7 days detailed; 6-12 months detailed; 6-? months aggregated; 1-3 years aggregated; 20 years detailed; aggregated.
  28. Data is multi-tiered [diagram] Tiers: Datamart(s) (Columnar); Real-time; Data warehouse (Hadoop/HBase). Retention labels: 1-7 days detailed; 6-12 months detailed; 20 years detailed; aggregated.
  29. SOA leaves us with a lot of isolated data
  30. Aggregated Reporting [diagram] Components: Service, Data backend, Endpoint, SQL endpoint, Landing area, ODS/DM. Flow labels: Pull data; Subscribed/pulled data; Ingest; Raw data; Clean; Transform; Transpose; Join; Load; Request reports; Produce Report; Report Out.
  31. [Diagram with numbered steps 1-5] Landing; Raw data; Load service; Transformation service; DW/ODS; Views; Report service
  32. [Diagram with numbered steps 1-10] Raw data (HDFS); ETL; Aggregation map/reduce; HBase; Details; Aggregates (map/reduce + ETL); Data mart; REST API; Report tool; Drill through
  33. Takeaways: SOA & Big data are better together
  34. Arnon Rotem-Gal-Oz. arnonr@nice.com, http://www.nice.com, http://arnon.me/soa-patterns, arnon@rgoarchitects.com, @arnonrgo, http://arnon.me
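
Slide 14 names the Saga roles (initiator, coordinator, participator) and the perform/compensate relations but shows no code. The following is a minimal Python sketch of that idea, not the author's implementation: `Activity`, `SagaCoordinator`, `register`, and `run` are hypothetical names chosen for illustration, and the prepare/commit part of the protocol is left out. It assumes each participant registers an activity together with the compensation that undoes it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Activity:
    """One participant step: an action plus the compensation that undoes it."""
    name: str
    perform: Callable[[], None]
    compensate: Callable[[], None]


class SagaCoordinator:
    """Hypothetical coordinator: runs registered activities in order."""

    def __init__(self) -> None:
        self._activities: List[Activity] = []
        self.log: List[str] = []

    def register(self, activity: Activity) -> None:
        self._activities.append(activity)

    def run(self) -> bool:
        completed: List[Activity] = []
        for activity in self._activities:
            try:
                activity.perform()
                self.log.append(f"performed {activity.name}")
                completed.append(activity)
            except Exception:
                self.log.append(f"failed {activity.name}")
                # Undo the completed steps in reverse order.
                for done in reversed(completed):
                    done.compensate()
                    self.log.append(f"compensated {done.name}")
                return False
        return True
```

When a later activity raises, the steps that already completed are compensated in reverse order, which is how a saga avoids holding a distributed lock across services.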
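
Slide 18 pairs a partitioning diagram with a document that embeds the customer inside the interaction. The sketch below illustrates why, assuming the ranges map to nodes in the order the slide lists them (that mapping is a reading of the flattened diagram, and all function names are hypothetical). A normalized join has to cross nodes whenever the interaction and its customer live on different partitions, while the denormalized document answers the same question with a local read.

```python
import json

# Assumed partition layout, read off slide 18 in the order listed.
INTERACTION_RANGES = {1: range(0, 100), 2: range(200, 300), 3: range(100, 200)}
CUSTOMER_RANGES = {1: "ABCDEFGH", 2: "IJKLM", 3: "NOPQRSTUVWXYZ"}


def node_for_interaction(interaction_id: int) -> int:
    for node, ids in INTERACTION_RANGES.items():
        if interaction_id in ids:
            return node
    raise KeyError(interaction_id)


def node_for_customer(surname: str) -> int:
    for node, letters in CUSTOMER_RANGES.items():
        if surname[0].upper() in letters:
            return node
    raise KeyError(surname)


def join_needs_network_hop(interaction_id: int, surname: str) -> bool:
    """True when the two rows of a normalized join sit on different nodes."""
    return node_for_interaction(interaction_id) != node_for_customer(surname)


# The slide's example document: the customer is embedded in the interaction.
DENORMALIZED = json.loads(
    '{"Interaction": {"id": "5", "participants": '
    '{"customer": [{"surname": "McDonalds", "name": "Old"}]}}}'
)


def customers_of(doc: dict) -> list:
    """Read participants from the document itself; no second lookup needed."""
    return doc["Interaction"]["participants"]["customer"]
```

The trade-off is the usual one for denormalization: reads stay local, but a change to a customer has to be propagated to every interaction document that embeds it.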
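
Slides 23 and 24 describe a map pipeline that enriches and categorizes segment rows keyed by interaction ID, followed by a reduce pipeline that folds the segments of each interaction into a data mart row. The sketch below mimics that shape in plain Python rather than on Hadoop. The keyword-based categorization rule, the field names, and the cache contents are all assumptions made for illustration; the slides do not specify them.

```python
from collections import defaultdict

# Stand-in for the slide's local cache of customers (contents are made up).
CUSTOMER_CACHE = {"c1": "Old McDonalds"}


def categorize(text: str) -> str:
    # Hypothetical rule; the slides do not say how categorization works.
    return "complaint" if "refund" in text.lower() else "general"


def map_segment(row: dict):
    """Map pipeline: segment row -> (interaction ID, enriched segment document)."""
    segment = {
        "interaction_id": row["interaction_id"],
        "customer": CUSTOMER_CACHE.get(row["customer_id"], "unknown"),
        "category": categorize(row["text"]),
    }
    return segment["interaction_id"], segment


def reduce_interaction(interaction_id, segments: list) -> dict:
    """Reduce pipeline: fold one interaction's segments into a data mart row."""
    return {
        "interaction_id": interaction_id,
        "customers": sorted({s["customer"] for s in segments}),
        "categories": sorted({s["category"] for s in segments}),
        "segment_count": len(segments),
    }


def run_job(rows: list) -> list:
    grouped = defaultdict(list)
    for row in rows:
        key, segment = map_segment(row)
        grouped[key].append(segment)  # plays the role of the shuffle phase
    return [reduce_interaction(k, v) for k, v in sorted(grouped.items())]
```

Keeping the customer lookup in a local cache inside the map step reflects the slide's point that the data stays put and the processing moves to it.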