Project "Babelfish"
A data warehouse to attack complexity

Prof. Dr. Christoph Denzler & Daniel Kröni
{christoph.denzler, daniel.kroeni}@fhnw.ch
  
Starting Position
•  Finnova is a software house developing a bankware solution for universal banks.
•  About 300 employees, 200 of them in development, engineering, application management and customer care
•  Banking System
   –  more than 7 million lines of code
   –  controlled by 15'000 parameters
   –  around 2000 UI screens
  
Inception
•  Software grew over the past 15 years
   –  approx. 13 person years of development per month
•  Architectural challenges
   –  new business models
   –  new regulations
   –  international customers
   –  bigger customers
   –  new technologies
→ How to keep track of
   –  architecture
   –  code
   –  tests
   –  customers' parametrization
   –  bug reports
   –  change requests
   –  developers' output
?
  
Product	
  
Concrete Problems
•  The business logic is changed. In which GUIs will this be visible?
•  A customer reports a bug on screen XY. Which parts of the code handle this screen and its data? Which developer is responsible for this code?
•  Does a new function break architectural guidelines? E.g. does it introduce dependency loops?
•  Which modules of the software do not have to be taken offline during a system upgrade?
•  Which tests need to be rerun after a change in code?
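
The dependency-loop question above can be sketched as a depth-first search over a call graph. This is only an illustration in plain Scala: the `Graph` representation (caller to callees) and the names `DependencyLoops` and `findCycle` are assumptions, not part of Babelfish.

```scala
object DependencyLoops {
  // Assumption: the call graph is given as caller -> set of callees.
  type Graph = Map[String, Set[String]]
  private type Result = Either[List[String], Set[String]]

  /** Returns the nodes of one dependency loop, or None if there is none. */
  def findCycle(graph: Graph): Option[List[String]] = {
    // `path` is the current DFS stack (most recent first);
    // `done` holds nodes whose successors were fully explored.
    def visit(node: String, path: List[String], done: Set[String]): Result =
      if (path.contains(node)) Left(node :: path.takeWhile(_ != node).reverse)
      else if (done(node)) Right(done)
      else
        graph.getOrElse(node, Set.empty[String])
          .foldLeft[Result](Right(done)) {
            case (Right(d), next) => visit(next, node :: path, d)
            case (found, _)       => found // a loop was already found, stop exploring
          }
          .map(_ + node)

    graph.keys
      .foldLeft[Result](Right(Set.empty[String])) {
        case (Right(d), start) => visit(start, Nil, d)
        case (found, _)        => found
      }
      .swap.toOption
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical packages A, B, C that call each other in a loop.
    val packages: Graph = Map("A" -> Set("B"), "B" -> Set("C"), "C" -> Set("A"))
    println(findCycle(packages))
  }
}
```

Running the check on the graph before and after a change would show whether the new function introduced a loop.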
  
Expectations
•  Improve quality of bankware solution by
   –  earlier detection of architecture violations
•  Improve issue handling by
   –  faster localization of bugs
•  Improve testing by
   –  testing only what has changed
•  Improve stability by
   –  reliable dependency information during deployment and production
  
System Overview
[Diagram labels: Import, WebAPI, Core System; within the Core System: Versioning, Schema, DSL]
Core System
•  Neo4j Graph Database
•  Model: Directed Property Graph
•  Nodes
•  Typed Edges
•  Properties
•  Quantities
   #Nodes ~ 6'300'000
   #Edges > 15'000'000
[Figure: two nodes, name: "Credit" and name: "Log", connected by a "calls" edge]
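
A directed property graph as listed above can be modelled in a few lines of plain Scala. This sketch does not use the Neo4j API; the case classes, the method names and the edge direction (Credit calls Log) are assumptions made for illustration.

```scala
object PropertyGraph {
  final case class Node(id: Long, properties: Map[String, Any])
  // Edges are directed (from -> to), typed by a label, and may carry properties.
  final case class Edge(from: Long, to: Long, label: String,
                        properties: Map[String, Any] = Map.empty)

  final case class Graph(nodes: Map[Long, Node], edges: Vector[Edge]) {
    /** Nodes reached by following edges of the given type in their direction. */
    def out(id: Long, label: String): Vector[Node] =
      edges.filter(e => e.from == id && e.label == label).map(e => nodes(e.to))

    /** Nodes reached by following edges of the given type against their direction. */
    def in(id: Long, label: String): Vector[Node] =
      edges.filter(e => e.to == id && e.label == label).map(e => nodes(e.from))
  }

  val credit: Node = Node(1L, Map("name" -> "Credit"))
  val log: Node    = Node(2L, Map("name" -> "Log"))

  // Assumption: the "calls" edge points from Credit to Log.
  val example: Graph = Graph(Map(1L -> credit, 2L -> log), Vector(Edge(1L, 2L, "calls")))

  def main(args: Array[String]): Unit =
    println(example.in(2L, "calls").map(_.properties("name")))
}
```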
  	
  
Core System
•  Version aware API
•  Access graph as of a specific version
•  Allows querying what changed
•  when, most often, together, ...
•  Mapping of versioned nodes to DB nodes
[Figure labels: logical node name: "Credit", LOC: 832; stored as name: "Credit" with version records LOC: 832, from: 13, to: _ and LOC: 750, from: 1, to: 12]

Logical Quantities
   #Nodes 2'046'128
   #Edges 4'292'867

Storage Quantities
   #Nodes ~ 6'300'000
   #Edges > 15'000'000
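
The mapping of a versioned node to stored records can be sketched as a list of validity intervals. The values for "Credit" are taken from the figure; the type names, the inclusive reading of `to`, and the use of `None` for the open end `_` are assumptions.

```scala
object VersionedNodes {
  /** One stored state of a node, valid for versions from..to (inclusive).
    * `to = None` stands for the open end written as `_` on the slide. */
  final case class VersionRecord(from: Int, to: Option[Int], properties: Map[String, Any]) {
    def validAt(version: Int): Boolean = from <= version && to.forall(version <= _)
  }

  final case class VersionedNode(name: String, history: List[VersionRecord]) {
    /** The logical node as of a specific version, if it existed then. */
    def asOf(version: Int): Option[Map[String, Any]] =
      history.find(_.validAt(version)).map(_.properties + ("name" -> name))

    /** Versions at which a new state starts, i.e. when the node changed. */
    def changedAt: List[Int] = history.map(_.from).sorted
  }

  val credit: VersionedNode = VersionedNode("Credit", List(
    VersionRecord(13, None, Map("LOC" -> 832L)),
    VersionRecord(1, Some(12), Map("LOC" -> 750L))
  ))

  def main(args: Array[String]): Unit = {
    println(credit.asOf(5))
    println(credit.asOf(13))
  }
}
```

Storing one record per state rather than one node per version is a plausible reason why the storage quantities exceed the logical ones, although the slides do not spell this out.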
  
Core System
•  Domain model
•  Common vocabulary with the partner
•  Index
•  Query language
[Schema figure: node type Package (name: String, LOC: Long); node type Release (id: Long, name: String); edge types Calls, Contains]
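
One way to make such a schema available to Scala code is to give every property a typed key, so that an expression like `Package.Name` carries the value type with it. This is a guess at the idea, not the actual Babelfish schema code; `Property`, `NodeType`, `EdgeType` and `Node` are hypothetical names.

```scala
object SchemaSketch {
  /** A property key that remembers the Scala type of its values. */
  final case class Property[A](name: String)

  sealed trait NodeType { def label: String }
  sealed trait EdgeType { def label: String }

  object Package extends NodeType {
    val label = "Package"
    val Name: Property[String] = Property[String]("name")
    val LOC: Property[Long]    = Property[Long]("LOC")
  }

  object Release extends NodeType {
    val label = "Release"
    val Id: Property[Long]     = Property[Long]("id")
    val Name: Property[String] = Property[String]("name")
  }

  object Calls extends EdgeType    { val label = "Calls" }
  object Contains extends EdgeType { val label = "Contains" }

  /** A stored node with untyped values; typed access goes through a Property key. */
  final case class Node(nodeType: NodeType, values: Map[String, Any]) {
    // Unchecked cast: this sketch trusts that stored values match the schema.
    def get[A](p: Property[A]): Option[A] = values.get(p.name).map(_.asInstanceOf[A])
  }

  def main(args: Array[String]): Unit = {
    val credit = Node(Package, Map("name" -> "Credit", "LOC" -> 832L))
    println(credit.get(Package.Name))
  }
}
```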
  
Core System
•  Custom Query Language
•  Schema aware
•  Version aware
•  Fast graph traversal
•  Describing the structure of paths, as with a formal grammar
•  Collecting properties on the way
•  SQL postprocessing
•  Implemented as an internal Scala DSL
•  Easy to extend
Query Language: Basics
•  Schema aware
   –  Refer to nodes / edges / properties
•  Graph navigation primitives
   –  V, E, inE, outV, outE, inV
•  Grammar style combinators
   –  ~, |, ?, *, +
[Figure labels: outE, inV, inE, outV, out, in]

   V(Package) ~ where(Package.Name)("Log") ~ in(_Calls_).+
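
To illustrate how grammar-style combinators can be built as an internal Scala DSL, here is a minimal sketch with set semantics: a path expression maps a set of start nodes to the set of nodes it reaches. It is not the Babelfish implementation. Nodes are plain strings, `V` and `where` are untyped simplifications of the forms in the query above, and the example edges are invented.

```scala
import scala.annotation.tailrec

object PathGrammar {
  type Node = String

  // Hypothetical edge store: (source, label, target).
  final case class Graph(edges: Set[(Node, String, Node)]) {
    def nodes: Set[Node] = edges.flatMap { case (a, _, b) => Set(a, b) }
  }

  /** A path expression maps a set of start nodes to the set of nodes it reaches. */
  final case class Path(run: (Graph, Set[Node]) => Set[Node]) {
    def ~(that: Path): Path = Path((g, s) => that.run(g, run(g, s)))      // sequence
    def |(that: Path): Path = Path((g, s) => run(g, s) ++ that.run(g, s)) // alternative
    def ? : Path = Path((g, s) => s ++ run(g, s))                         // zero or one
    def + : Path = Path { (g, s) =>                                       // one or more
      // Repeat the step until no new nodes appear, so cyclic graphs terminate.
      @tailrec
      def loop(frontier: Set[Node], seen: Set[Node]): Set[Node] = {
        val next = run(g, frontier) -- seen
        if (next.isEmpty) seen else loop(next, seen ++ next)
      }
      loop(s, Set.empty)
    }
    def * : Path = (this.+).?                                             // zero or more
  }

  /** All nodes of the graph, ignoring the incoming set. */
  val V: Path = Path((g, _) => g.nodes)
  def where(p: Node => Boolean): Path = Path((_, s) => s.filter(p))
  /** Follow edges with the given label against their direction. */
  def in(label: String): Path =
    Path((g, s) => g.edges.collect { case (a, l, b) if l == label && s(b) => a })
  /** Follow edges with the given label in their direction. */
  def out(label: String): Path =
    Path((g, s) => g.edges.collect { case (a, l, b) if l == label && s(a) => b })

  // Invented example edges; the real call graph is not given on the slides.
  val example: Graph = Graph(Set(
    ("Credit", "Calls", "Log"),
    ("Customer", "Calls", "Credit"),
    ("ZV", "Calls", "Log"),
    ("FX", "Calls", "Portfolio")
  ))

  // Everything that directly or transitively calls "Log".
  val callersOfLog: Path = V ~ where(_ == "Log") ~ in("Calls").+

  def main(args: Array[String]): Unit =
    println(callersOfLog.run(example, Set.empty))
}
```

Because `+` is an ordinary method on `Path`, new combinators can be added without changing a parser, which fits the "easy to extend" claim made for the internal DSL.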
  
Query Language: Basics
[Figure: graph of the packages Log, Credit, ZV, Customer, FX, Portfolio]

   V(Package) ~ where(Package.Name)("Log") ~ in(_Calls_).+
  
Query Language: Extensions
•  Labeling
   –  Name values for later processing
•  Extraction
   –  Select what you want in your table
•  SQL Postprocessing
   –  SQL is nice for aggregation

   from {
     V(Package) ~ in(_Calls_).+ ~ get(Package.Name).as("n")
   } extract { "n" } sql {
     "SELECT n FROM t1 ORDER BY n DESC"
   }
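
The three stages (label, extract, postprocess) can be imitated with plain Scala collections to show what each one contributes. This sketch does not run SQL: the sort stands in for `ORDER BY n DESC`, the rows are invented, and `Row`, `extract` and `sorted` are hypothetical names.

```scala
object ExtractAndSort {
  /** One traversal result: the collected values under their labels. */
  type Row = Map[String, String]

  /** Keep only the requested labels, in the given column order.
    * Fails if a row lacks one of the labels. */
  def extract(rows: Seq[Row], columns: Seq[String]): Seq[Seq[String]] =
    rows.map(row => columns.map(row))

  // Invented labelled results, as get(Package.Name).as("n") might collect them.
  val labelled: Seq[Row] = Seq(
    Map("n" -> "Credit"),
    Map("n" -> "ZV"),
    Map("n" -> "Customer")
  )

  // Stand-in for: SELECT n FROM t1 ORDER BY n DESC
  // (Scala's default String ordering; a database collation may sort differently.)
  val sorted: Seq[String] =
    extract(labelled, Seq("n")).map(_.head).sorted(Ordering[String].reverse)

  def main(args: Array[String]): Unit =
    sorted.foreach(println)
}
```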
  
Questions / Remarks
?/!
  
