By Antonio Castellón (as Blue-Infinity consultant), February 2014
for Philip Morris International R&D
CIKB
Software Architecture Design Proposal
Problem : Data Complex
Problem : Data Complex to Model
Problem : Dynamic Data ( Uncertainty )
End-user requirements and the data itself sometimes generate
different types of uncertainty
Problem : GUI - User experience
I’m not stupid, but … this interface is too complicated!
Problem : GUI - Adaptable + Flexible
Problem : GUI – Technology + Design
Be careful with awesome solutions that do not fit
both design and engineering at the same time
“The Solution” is a mix of 4 …
“The Solution” - Is a mix of …
An Architecture
A set of Data
A cool User Interface
And a mad developer to do it
(joke)
“The Solution” – Brick 1
An Architecture
Architecture – we aim to
• Reduce complexity
• Be reusable
• Be easy to deploy
• Allow dynamic updates
• Be adaptive
• Respond quickly
• Keep a low memory profile
• Provide security
• …
Architecture – The response
Architecture – The response
OSGi: Open Services Gateway initiative
Defines the standard.
Architecture – OSGi supported by
Architecture – OSGi implemented by …
. . .
Architecture – In summary, OSGi goals are …
Service Oriented +
Modular (bundles)
Bundle (x)
Service (x’)
Service (y)
Service (x)
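The service-oriented, modular idea above can be illustrated with a small sketch. This is not the OSGi API: the `ServiceRegistry` class, its `register`/`unregister`/`lookup` methods and the `GreeterBundle` class are hypothetical names, chosen only to show a bundle publishing a service under an interface name and other code looking it up at run time.

```python
from typing import Any, Callable, Dict, List


class ServiceRegistry:
    """Toy stand-in for a service registry (hypothetical, not the OSGi API)."""

    def __init__(self) -> None:
        # interface name -> services currently published under that name
        self._services: Dict[str, List[Any]] = {}

    def register(self, interface: str, service: Any) -> None:
        self._services.setdefault(interface, []).append(service)

    def unregister(self, interface: str, service: Any) -> None:
        providers = self._services.get(interface, [])
        if service in providers:
            providers.remove(service)

    def lookup(self, interface: str) -> List[Any]:
        # Return a copy so callers cannot modify the registry's own list
        return list(self._services.get(interface, []))


class GreeterBundle:
    """Hypothetical bundle: publishes a service on start, withdraws it on stop."""

    def start(self, registry: ServiceRegistry) -> None:
        self.service: Callable[[str], str] = lambda name: f"Hello, {name}"
        registry.register("greeter", self.service)

    def stop(self, registry: ServiceRegistry) -> None:
        registry.unregister("greeter", self.service)


registry = ServiceRegistry()
bundle = GreeterBundle()

bundle.start(registry)                      # the service becomes available
greetings = [svc("CIKB") for svc in registry.lookup("greeter")]

bundle.stop(registry)                       # dynamic update: the service goes away
remaining = registry.lookup("greeter")
```

Consumers depend only on the interface name, so a bundle can be started, stopped or replaced without touching the code that uses its service.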
Architecture – OSGi : Simple overview
Console, Logging, Admin, …
Web Server
WAB Application 1
WAB Application 2
…
Application Service 1
Application Service 2
…
OSGi Instance 1
JVM
…
Legend: bundles to be developed for us; bundles to be installed
A set of Data
“The Solution” – Brick 2
Data
NoSQL
( Not Only SQL )
Data – NoSQL – Different implementations
Data - NoSQL – Comparing data structure
Image from: http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/
Data - NoSQL – Compare
98% of the business
requirements
There are still billions of
nodes and relationships
Data – Our selection
Graph
Databases
Data – Graph Databases – Why?
Flexible data structure
It doesn’t matter if the relationships change in the future.
Closer match to business logic
Data – Graph Databases – Why?
Natural query system
You say what you want, not how to get it.
with recursive cluster (party, path, depth) as (
    select cast(@userId as character varying),
           cast(@userId as character varying),
           1
    union
    (
        select (case
                    when this.party = amc.userA then amc.userB
                    when this.party = amc.userB then amc.userA
                end),
               (this.path || '.' || (case
                    when this.party = amc.userA then amc.userB
                    when this.party = amc.userB then amc.userA
                end)),
               this.depth + 1
        from cluster this, chat amc
        where ((this.party = amc.userA and position(amc.userB in this.path) = 0)
            or (this.party = amc.userB and position(amc.userA in this.path) = 0))
          and this.depth < @depth + 1
    )
)
select party, path
from cluster
where not exists (
    select *
    from cluster c2
    where cluster.party = c2.party
      and (
            char_length(cluster.path) > char_length(c2.path)
            or (char_length(cluster.path) = char_length(c2.path)
                and cluster.path > c2.path)
      )
)
order by party, path;
SQL = several hours to execute
VS
START b = node:User(UserId='Manolo')
MATCH (b)--(friend)--(friendoffriend)
RETURN count(friendoffriend)
Cypher Language = 635ms
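To make the Cypher pattern above concrete, here is a minimal Python sketch of what a two-hop match does on an in-memory graph. The `FRIENDS` data and both helper functions are hypothetical. The sketch assumes a simple undirected graph (at most one relationship between two users) and, to my understanding of Cypher, a `count(...)` over a matched pattern counts matched paths rather than distinct people, so both variants are shown. It says nothing about the timings quoted above.

```python
from typing import Dict, Set

# Hypothetical friendship graph: user -> set of direct friends (undirected)
FRIENDS: Dict[str, Set[str]] = {
    "Manolo": {"Ana", "Luis"},
    "Ana": {"Manolo", "Luis", "Eva"},
    "Luis": {"Manolo", "Ana"},
    "Eva": {"Ana"},
}


def count_two_hop_paths(graph: Dict[str, Set[str]], start: str) -> int:
    """Count paths start--friend--other that do not walk back over the same edge."""
    paths = 0
    for friend in graph.get(start, set()):
        for other in graph.get(friend, set()):
            if other != start:  # going back to start would reuse the same edge
                paths += 1
    return paths


def distinct_two_hop_people(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """Collect the distinct people reachable in exactly two hops."""
    return {
        other
        for friend in graph.get(start, set())
        for other in graph.get(friend, set())
        if other != start
    }


path_count = count_two_hop_paths(FRIENDS, "Manolo")
people = distinct_two_hop_people(FRIENDS, "Manolo")
```

The traversal only touches the neighbourhood of the start node, which is the intuition behind the "say what you want" pattern: cost grows with the part of the graph visited, not with the size of a join table.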
Data - Graph Databases – Why?
Fits very well with complex data
Data - Graph Databases – Why?
Fits very well with bioinformatics
0.9 billion
relationships
Data – Graph Databases – Why?
Fast Prototyping and development
We don’t need to spend much time defining a fine-grained schema up front.
Data - Graph Databases – What is it?
Properties
Labels
Relationships
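The three building blocks named above (properties, labels, relationships) can be sketched as plain data structures. This is an illustrative model only, not the Neo4j API: the `Node` and `Relationship` classes and the sample/assay data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    node_id: int
    labels: Set[str] = field(default_factory=set)                # e.g. {"Sample"}
    properties: Dict[str, Any] = field(default_factory=dict)     # free-form key/value pairs


@dataclass
class Relationship:
    start: Node
    end: Node
    rel_type: str                                                # e.g. "MEASURED_BY"
    properties: Dict[str, Any] = field(default_factory=dict)


# Hypothetical data, for illustration only
sample = Node(1, {"Sample"}, {"name": "S-001"})
assay = Node(2, {"Assay"}, {"name": "A-17"})
measured = Relationship(sample, assay, "MEASURED_BY", {"date": "2014-02-01"})

# A new property can be attached later without any schema change
sample.properties["batch"] = "B7"
```

Because properties live on each node and relationship, adding a new attribute or a new relationship type does not require migrating a table definition.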
Data - Graph Databases - Implemented by …
Data - Graph Databases - Compare
Name: OrientDB
  API: Java Traverser API, Blueprints, Rexster
  Query methods: Own SQL-like query language, Gremlin
  Consistency: ACID, MVCC
  Staff (people) / Community: 3 / Low
Name: Neo4j
  API: Java, Python, JPython, Ruby, JRuby, JavaScript (Node.js), PHP, .NET, Django, Clojure, Spring, Scala, or REST (any language)
  Query methods: Cypher (native/preferred), Native Java APIs (special cases), Traverser API, REST, Blueprints, Gremlin
  Consistency: ACID
  Staff (people) / Community: 42 / Very High
Name: DEX
  API: Java, C++, .NET
  Query methods: Native Java, C# and C++ APIs, Blueprints, Gremlin
  Consistency: Consistency, durability and partial isolation and atomicity
  Staff (people) / Community: 5 / ?
Data - Graph Databases – Compare
Data - Graph Databases - Neo4j customers
Data - Graph Database - Neo4j - Partners
Data - Graph Database - Neo4j - Licenses
“The Solution” – Brick 3
A cool User Interface
GUI
+
GUI
UI Graphs
Model / View / Controller
(in the browser, using JavaScript)
JAX-RS (RESTful web services)
JSON responses
On OSGi bundle as a webservice
On Browser client
Data Driven Documents
GUI - AngularJS – What is it?
RESTful
+
JSON
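The RESTful + JSON exchange between the OSGi web service and the browser client could look roughly like the following. This is a minimal Python sketch of the payload only (the proposal's service would be written with JAX-RS); the `graph_to_json` helper and the `nodes`/`links` layout are assumptions, modelled on the shape commonly seen in D3.js force-layout examples.

```python
import json
from typing import List, Tuple


def graph_to_json(nodes: List[str], edges: List[Tuple[str, str]]) -> str:
    """Serialize a small graph as a JSON document with 'nodes' and 'links' arrays."""
    # Links refer to nodes by their position in the 'nodes' array
    index = {name: i for i, name in enumerate(nodes)}
    payload = {
        "nodes": [{"name": name} for name in nodes],
        "links": [{"source": index[a], "target": index[b]} for a, b in edges],
    }
    return json.dumps(payload, sort_keys=True)


# Hypothetical response body for a request such as GET /graph
body = graph_to_json(["Sample", "Assay"], [("Sample", "Assay")])
```

Keeping the server response to plain JSON means the browser side (AngularJS for the application logic, D3.js for the drawing) stays independent of how the data is stored.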
GUI - D3.js – What is it?
GUI - D3.js – Rich and cool interfaces
GUI - Examples
GUI - Licenses
No payment is required to use or modify their code.
“The Solution” – The last brick
At least a mad developer to
do it (joke)
Architecture – Current draft
KARAF :: OSGi kernel platform
Shell admin
Web admin console
ServiceMix (Optional) :: Enterprise Service Bus
Groovy 2.2.1 Runtime
Jetty Server 8.1.9 Runtime
CIKB
Neo4j 2.0.0 Server
Core (Business)
Database connector
CVS connector
SAW connector
LIMS connector
User Portal
UCSD connector
XML connector
AngularJS + D3.js
…
Admin Portal
…
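One way to read the connector boxes in the draft above is as plug-ins behind a common interface that the core calls uniformly. The sketch below is an assumption about that design, not something the slides specify: the `Connector` base class, the `fetch` method, the `Core` class and the sample XML are all hypothetical.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class Connector(ABC):
    """Hypothetical common interface for every data-source connector."""

    name: str = "unnamed"

    @abstractmethod
    def fetch(self) -> Iterable[Dict[str, str]]:
        """Return records from the underlying source as plain dictionaries."""


class XmlConnector(Connector):
    """Hypothetical XML connector: reads the attributes of <record> elements."""

    name = "XML"

    def __init__(self, xml_text: str) -> None:
        self._xml_text = xml_text

    def fetch(self) -> Iterable[Dict[str, str]]:
        root = ET.fromstring(self._xml_text)
        for element in root.iter("record"):
            yield dict(element.attrib)


class Core:
    """Hypothetical core: collects records from whatever connectors are plugged in."""

    def __init__(self) -> None:
        self._connectors: List[Connector] = []

    def plug(self, connector: Connector) -> None:
        self._connectors.append(connector)

    def collect(self) -> List[Dict[str, str]]:
        # Tag each record with the connector it came from
        return [
            {**record, "source": connector.name}
            for connector in self._connectors
            for record in connector.fetch()
        ]


core = Core()
core.plug(XmlConnector('<data><record id="1"/><record id="2"/></data>'))
records = core.collect()
```

With this shape, adding another source means adding one more `Connector` subclass, which matches the modular, bundle-per-concern approach described earlier.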
Thank you for your attention.
End

CIKB - Software Architecture Analysis Design


Editor's Notes

  • #2 Thanks for attending this presentation; I hope it meets your expectations. This is only a high-level description of the reasons for choosing the selected architecture and its tools. Do not hesitate to interrupt me if you have any questions; I will be glad to explain anything in more detail if that helps you understand the final solution better.
  • #3 The data is complex by definition: there are too many relationships between different nodes and different domains.
  • #4 Fitting the "real" world into a standard entity-relationship model is a nightmare, and it is a source of errors if something needs to change in the future (introducing new properties, new objects, new relationships, etc.).
  • #5 The important thing in any design is to capture at least 99% of the user requirements correctly, but this is impossible if the users introduce uncertainty for different reasons (and also when there are different users with different domains or points of view).
  • #6 The part where all software projects spend the most time is developing the user interface. It needs to be flexible and adaptable to different requirements and uses, or at least the technology used should provide the easiest way to create a good user experience. INTUITIVE
  • #7 It needs to be flexible and adaptable to different requirements, depending on the platforms on which it will be used.
  • #8 Selecting the correct technology is also key to a successful project. Not everything depends on the front end, and not everything depends on the back end either.
  • #9 This is our solution; we know that it is possible to do it using different approaches. All roads lead to Rome, but some are easier than others.
  • #10 A good solution is never easy to build, but if it is simple, it is much better.
  • #11 A good solution is never easy to build, but if it is simple, it is much better.
  • #16 Some of them are oriented toward web server applications, but others are more service-oriented.
  • #18 Each module/bundle is a service that publishes some functionality to the others through the OSGi framework in which they live.
  • #19 A good solution is never easy to build, but if it is simple, it is much better.
  • #20 It is a complement. This technology appeared several years ago, but in recent years it has been driven by requirements for scalability, clustering and performance.
  • #21 Unlike with RDBMSs, the implementations sometimes differ considerably from one solution to another, because each is based on a different paradigm and focuses on a different way of organizing data.
  • #24 It is a complement. This technology appeared several years ago, but in recent years it has been driven by requirements for scalability, clustering and performance.
  • #25 - The data model matches the way the domain experts think (e.g. lab people), not the way IT experts think. Good reference: http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/
  • #26 http://www.slideshare.net/ayeeson/0221-cypher-for-sql-professionals
  • #36 Cypher will probably become the standard for graph databases. ACID is the standard for data consistency.
  • #37 A good solution is never easy to build, but if it is simple, it is much better.
  • #45 A good solution is never easy to build, but if it is simple, it is much better.