SOA & Big Data

Some of the challenges and opportunities

© All Rights Reserved

SOA & Big Data Presentation Transcript

  • SOA & Big data - Arnon Rotem-Gal-Oz
  • Sept 2012: iOS 6 launched with a new maps application
  • But something went terribly wrong… (http://theamazingios6maps.tumblr.com/)
  • It isn't just about getting all the data there; algorithms are cool, but we need humans in the loop; hire the right people; test! test! test! (http://theamazingios6maps.tumblr.com/)
  • It isn't just one pile of data (http://theamazingios6maps.tumblr.com/)
  • Integrating Big data & SOA (image: Yoel Ben Avraham, http://www.flickr.com/photos/epublicist/3546059144/)
  • Data Refinery (image: Ofer Berger, http://www.haifacity.com/allsites/allpic/a/A1738/A1738Pic3326.jpg)
  • Integration styles (diagram): ETL integration, DB integration, file-based integration, online integration. Other labels: Department, Server, DB
  • Object soup (diagram: a jumble of objects with opaque three-letter names such as ASB, BLT, AFT)
  • Services (diagram: the same objects grouped into services: Customer, Orders, Promotions, Invoices)
  • SOA components and relations (diagram). Components: Service, Service consumer, Endpoint, Contracts, Policy, Messages. Relations: adheres to, governed by, binds to, exposes, serves, understands, implements, describes, sends/receives. Key: component, relation
  • Customer, Interactions, Agents, Categories
  • Integrating Big data & SOA (image: Yoel Ben Avraham, http://www.flickr.com/photos/epublicist/3546059144/)
  • Saga (diagram). Labels: Initiator, Service consumer, Coordinator*, Participator, Service, Prepare/commit/undo protocol, Register, Registration, Create context, Perform activity, Compensate, Activities and replies. Key: SOA component, pattern component, relation, concern/attribute
  • Hadoop Cluster (diagram). Labels: HCatalog, HBase, Data Management, Interactions, Customer, Interaction ETL, Recordings, NIM, Raw Categories (HDFS), Resolved Interactions (HDFS)
  • So, what's the problem?
  • & Big data can't move
  • Performance of joins in a distributed system sucks! (diagram: Node 1, Node 2, Node 3; interactions partitioned into 0-99, 200-299, 100-199; customers partitioned into A-H, I-M, N-Z) Example document: {"Interaction": {"id": "5", "participants": {"customer": [{"surname": "McDonalds", "name": "Old"}]}}}
  • Cookie cutter scalability
  • Cell architecture (diagram: Node 1, Node 2, Node 3, Node N)
  • Cell Architecture (diagram). Labels: HBase, HDFS, Categories, Customers, Interactions, Reference Data, ORCA BUS, …
  • Orchestration (diagram). Labels: Initiate business process, Endpoint, Service, Manage process, Schedule, Route request, Invoke services, Workflow instance, Workflow engine, Host workflows, Manage workflows, Monitor workflows
  • Map Reduce processing pipeline (diagram, Hadoop Map/Reduce). Map pipeline: retrieve segment data and create Segment document (Interaction); resolve Customer IDs (Customer), with a local cache of Customers; categorize segment (Categorization); update Segment document (Interaction); write Categories Results (Categorization); write Interaction (Interaction). Data passed between steps: InteractionID, Segment Row. Reduce pipeline, fed by Interaction & Segments: update Interaction document (Interaction); categorize Interaction (Categorization); prepare data mart export (Datamart); write Categories Results (Categorization); write Interaction (Interaction)
  • Map Reduce processing pipeline (diagram, Map only). Map pipeline: retrieve segment data and create Segment document (Interaction); resolve Customer IDs (Customer), with a local cache of Customers; categorize segment (Categorization); update Segment document (Interaction); write Categories Results (Categorization); write Interaction (Interaction). Data passed between steps: InteractionID, Segment Row
  • Data Facets
  • Data technology landscape (diagram). Categories: Caching, Data grid, In-memory, Columnar, Key-value store, Graph, Distributed file systems, Document, Relational, NewSQL, Indexing, Analytics/MPP. Products: Memcached, GigaSpaces, Redis, GridGain, Oracle Coherence, WebSphere eXtreme Scale, Hama, HBase, Cassandra, Pregel, Accumulo, Hypertable, Neo4j, Hadoop, GlusterFS, RavenDB, ScaleBase, MongoDB, CouchDB, IndexTank, Amazon RDS, VoltDB, Apache Solr, Attivio, Aster Data, Microsoft PDW, ParAccel, SAP HANA, Oracle Exadata, HP Vertica, IBM Netezza, EMC Greenplum
  • Data is multi-tiered (diagram). Tiers: Datamart(s) (RDBMS), Cube (MOLAP), Real-time (in memory), Data warehouse (Hadoop/HBase). Retention labels: 1-7 days detailed; 6-12 months detailed; 6-? months aggregated; 20 years detailed; 1-3 years aggregated; aggregated
  • Data is multi-tiered (diagram). Tiers: Datamart(s), Real-time, Data warehouse. Technology labels: Columnar, Hadoop/HBase. Retention labels: 1-7 days detailed; 6-12 months detailed; 20 years detailed; aggregated
  • SOA leaves us with a lot of isolated data
  • Aggregated Reporting (diagram). Labels: SQL endpoint, Endpoint, Service, Data backend, Request report, Produce reports, Report out, ODS/DM, Raw data, Landing area, Pull data, Subscribed/pulled data, Ingest, Load, Clean, Transform, Join, Transpose
  • Reporting flow (diagram with numbered steps 1-5). Labels: Report service, Views, Raw data, Load service, DW/ODS, Transformation service, Landing
  • Reporting flow on Hadoop (diagram with numbered steps 1-10). Labels: Report tool, Drill through, REST API, HBase, Aggregation map/reduce, ETL (map/reduce + ETL), Details, Aggregates, Raw data (HDFS), Data mart
  • Takeaways: SOA & Big data are better together
  • Arnon Rotem-Gal-Oz, arnonr@nice.com, http://www.nice.com, http://arnon.me/soa-patterns, arnon@rgoarchitects.com, @arnonrgo, http://arnon.me
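
The slide on join performance above contrasts data spread over three nodes with a single interaction document that embeds its customer. The following is a minimal Python sketch of why that matters, not the author's implementation. The node names, the pairing of each range with a node, and all function names are assumptions made for illustration; only the ranges and the sample document come from the slide.

```python
"""Sketch: nodes touched by a cross-partition join vs. an embedded document."""

# Assumed layout: interactions are partitioned by ID range and customers by
# the first letter of the surname. Which node holds which range is a guess.
INTERACTION_PARTITIONS = {
    "node1": range(0, 100),
    "node2": range(100, 200),
    "node3": range(200, 300),
}
CUSTOMER_PARTITIONS = {
    "node1": ("A", "H"),
    "node2": ("I", "M"),
    "node3": ("N", "Z"),
}


def interaction_node(interaction_id: int) -> str:
    """Return the (hypothetical) node that stores this interaction ID."""
    for node, id_range in INTERACTION_PARTITIONS.items():
        if interaction_id in id_range:
            return node
    raise ValueError(f"no partition holds interaction {interaction_id}")


def customer_node(surname: str) -> str:
    """Return the (hypothetical) node that stores this customer."""
    first = surname[0].upper()
    for node, (low, high) in CUSTOMER_PARTITIONS.items():
        if low <= first <= high:
            return node
    raise ValueError(f"no partition holds customer {surname!r}")


def nodes_for_join(interaction_id: int, surnames: list) -> set:
    """Nodes read when customers live in their own partitioned table."""
    nodes = {interaction_node(interaction_id)}
    nodes.update(customer_node(surname) for surname in surnames)
    return nodes


def nodes_for_embedded(document: dict) -> set:
    """Nodes read when the customer is embedded in the interaction."""
    return {interaction_node(int(document["Interaction"]["id"]))}


# The sample document from the slide.
document = {
    "Interaction": {
        "id": "5",
        "participants": {
            "customer": [{"surname": "McDonalds", "name": "Old"}],
        },
    }
}

customers = document["Interaction"]["participants"]["customer"]
surnames = [customer["surname"] for customer in customers]

join_nodes = nodes_for_join(5, surnames)
embedded_nodes = nodes_for_embedded(document)

print(sorted(join_nodes))      # ['node1', 'node2']
print(sorted(embedded_nodes))  # ['node1']
```

Under these assumptions the join reads from two nodes, while the embedded document is served by one, which is the trade-off the denormalized document on the slide appears to illustrate.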