This paper presents an approach to using semantic technologies to achieve better and more flexible integration of IT systems. The author believes that the described approach is applicable to a great many organizations, and that it can lead to far more dynamic IT architectures than what is common today.
2. Architectural challenges
• System architecture is not enterprise architecture
– what's good for a single system is not necessarily good for the
enterprise as a whole
– local decisions make sense locally, but not necessarily globally
• Organizational reorganizations
– mergers, acquisitions, political reorganizations, ...
– these all have implications for IT architecture
• Master data management
– of course you only have a single customer database
– ...until you buy a company that has their own
• Architecture dictators
– are a tempting solution to impose some general structure,
– but generally impede progress at the local level
3. Lego!
• The goal is not just an architecture that's correct today
• Because we know the situation will soon change
• The goal is an architecture that's easily adapted to tomorrow's environment
4. A step-by-step process
1. Reference model
2. Master data
3. Reference data
4. Generic services
5. Access control
6. Search
7. Semantic formats
5. Mapping the IT landscape (Step #1)
• Which IT systems exist?
• Which entities do they have?
• What are their properties?
• What web services exist?
• What is their input and output?
[Figure: model connecting App, Entity, Property, Service, and Format]
6. The value of the map (Step #1)
• Living high-level view of systems and services
– navigation/search/visualization,
– connect to relevant documentation, where it exists,
– far superior to PowerPoint/Word
• The main value is in the structure
– that is, the use of a proper semantic model
• No need to build or buy software
– it can all be done with existing open source components
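To make this concrete, here is a minimal sketch (all system and entity names are hypothetical) of holding the map as plain subject-predicate-object triples and querying it — a stand-in for a real semantic model in a triple store:

```python
# Minimal sketch: the system/service map as plain triples.
# All names (App1, customer-lookup, ...) are hypothetical examples.
triples = [
    ("App1", "has-entity", "Customer"),
    ("App2", "has-entity", "Customer"),
    ("App1", "offers-service", "customer-lookup"),
    ("customer-lookup", "has-output", "Customer"),
]

def query(s=None, p=None, o=None):
    """Return all triples matching the given pattern (None = wildcard)."""
    return [(ts, tp, to) for ts, tp, to in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# Which systems carry the Customer entity?
systems = [s for s, _, _ in query(p="has-entity", o="Customer")]
print(systems)  # ['App1', 'App2']
```

The same pattern-matching query supports navigation, search, and visualization over the map.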
7. Problems with the map (Step #1)
• There is no connection across systems
– shows that 7 systems have the concept "case"
– but so what?
• What is needed is a cross-mapping
– must show to what degree the 7 concepts overlap
– perhaps there are other concepts that mean the same, but have different names?
8. Build a reference model (Step #1)
• Model central concepts
– entities and their properties
– type hierarchy for both
– independent of any specific system; model the organization's understanding
• This is not a canonical data model
– not a format
– not forcing any systems to actually use the model
– systems can use entities/properties which do not (yet) exist in the model
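A sketch of what such a model might hold: entity types and property types, each in its own hierarchy. The type names are hypothetical illustrations, not part of the original model:

```python
# Hypothetical reference model: entity types and property types,
# each with an optional supertype (None = top of the hierarchy).
entity_types = {"Agreement": None, "PensionAgreement": "Agreement"}
property_types = {"date": None, "start-date": "date", "end-date": "date"}

def is_subtype(t, ancestor, hierarchy):
    """Walk the supertype chain to test subtype relationships."""
    while t is not None:
        if t == ancestor:
            return True
        t = hierarchy[t]
    return False

print(is_subtype("PensionAgreement", "Agreement", entity_types))  # True
print(is_subtype("start-date", "date", property_types))           # True
```

Because the model is descriptive rather than canonical, adding a type to either dictionary does not force any system to change.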
9. What is cohabitation? (Step #1)
• Law on individual pensions
– LOV-2008-06-27-62, § 3-7
• "Cohabitant here means a) a person with whom the customer shares a home and has children in common, b) a person with whom the customer lives in a marriage-like or partnership-like relationship, when it is substantiated that the relationship has lasted uninterrupted for the last five years before the customer's death, and there were no circumstances that would have prevented a lawful marriage or registered partnership from being entered into."
• Regulation on collecting information
– FOR-2005-07-08-826, item 1
• "cohabitants: persons who live together and have children in common."
• Law on guardianship ("vergemål")
– LOV-2010-03-26-9, § 2
• "In this law, cohabitants means two persons who live together in a marriage-like relationship."
• ...
11. Properties, too (Step #1)
[Figure: a Cohabitation entity connecting two Persons (cohab 1, cohab 2), with start date and end date properties, ...]
12. Connect the system data (Step #1)
[Figure: the reference model's Cohabitation concept connected to the corresponding concepts in App #1, App #2, and App #3: PENSION, XXX, VERGE, YYY, INNH, COHAB, COHABITERS, BPG_COHAB]
13. Degrees of correlation (Step #1)
[Figure: two concepts, A and B]
• Perfect correlation
– not unusual, but not always the case either
• Specialization
– that is, B is a narrower concept than A
• Overlap
– A and B share a common subset
• Resembles
– A and B are related, but the connection is not clear
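These degrees of correlation can themselves be recorded as data in the model. A sketch, using hypothetical concept names from the earlier cohabitation example:

```python
from enum import Enum

class Correlation(Enum):
    """Degrees of correlation between two concepts, as listed above."""
    PERFECT = "perfect"          # the concepts denote the same thing
    SPECIALIZATION = "narrower"  # B is a narrower concept than A
    OVERLAP = "overlap"          # A and B share a common subset
    RESEMBLES = "resembles"      # related, but the connection is unclear

# Hypothetical cross-mapping from system concepts to the reference
# model's Cohabitation concept.
cross_mapping = {
    ("PENSION.COHAB", "Cohabitation"): Correlation.SPECIALIZATION,
    ("VERGE.COHABITERS", "Cohabitation"): Correlation.OVERLAP,
}

print(cross_mapping[("PENSION.COHAB", "Cohabitation")].value)  # narrower
```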
14. Uses of the model (Step #1)
• We describe to understand
– we want to understand so that we can improve
• Analysis of the architecture
– starting point for a restructuring
– identify master data issues
– etc
• A data dictionary
– useful when converting legacy data
– useful for bug fixing
– key personnel no longer have to answer questions all the time
– etc
15. From documentation to services (Step #1)
• So far we've only discussed documentation for humans
– this is highly useful in a number of ways
– but it is only the beginning
• The model has a semantic structure
– therefore we can use it to build new kinds of services
16. Master data control (Step #2)
• Pick one system to be the master for each kind of data
– where this can really be centralized
• Other systems needing the data must become clients of the master
– this is a gradual transition
• We also need a protocol which the clients can use to retrieve the data
17. A service broker (Step #2)
[Figure: the Broker, backed by the reference model, sits in a layer above the ESB]
• A service which routes requests
• A layer above the ESB
– the ESB takes care of transport
– it might also broker between several ESBs
• Uses its knowledge of information and services
• Decouples clients from servers
• Makes the architecture a lot more flexible
18. Master data protocol (Step #2)
[Figure: Apps #1-#3 connected to the Broker via Atom (SDshare); App #4 looks up updates with a sync request (entity + time); the Broker uses the reference model]
• Used by clients to retrieve data
• Makes it possible to gradually migrate
• Master can change without the clients knowing
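A rough sketch of the client side of such an Atom-based protocol: poll a feed and pick out the entries updated since the last sync. The feed content here is a minimal hypothetical example, not the actual SDshare feed format:

```python
# Sketch: poll an Atom feed and select entries changed since the
# last sync. The feed below is a hypothetical, minimal example.
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>urn:cohab:1</id><updated>2012-01-05T00:00:00Z</updated></entry>
  <entry><id>urn:cohab:2</id><updated>2012-03-01T00:00:00Z</updated></entry>
</feed>"""

def updated_since(feed_xml, since):
    """Return ids of entries updated after the given timestamp.
    ISO 8601 timestamps in UTC compare correctly as strings."""
    root = ET.fromstring(feed_xml)
    return [e.findtext(ATOM + "id") for e in root.findall(ATOM + "entry")
            if e.findtext(ATOM + "updated") > since]

print(updated_since(feed, "2012-02-01T00:00:00Z"))  # ['urn:cohab:2']
```

This is what lets clients migrate gradually: they only ever ask "what changed since time T?", regardless of which system is currently the master.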
19. Collect reference data (Step #3)
• Most systems share a number of fairly static lists
– list of countries, list of diagnosis codes, list of provinces, ...
• There is no reason to maintain this in duplicate in different systems
– the lists also need common identifiers for the items
• The lists might as well go into the reference model
– can be retrieved from there by client systems
20. Generic lookup service (Step #4)
[Figure: App #1 asks the Broker "Give me an entity of type X with ID 23414"; the Broker asks "Who has data about X?" across Apps #2-#4 and Services #1-#3]
• Would not work "out of the box"
• Must be carefully set up so that it works
• Possible because this is a controlled environment
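The careful setup amounts to maintaining a routing table in the broker. A sketch, with hypothetical service names and data:

```python
# Sketch of the broker's routing: which service can supply entities
# of which type. All names and data here are hypothetical.
routing = {"Person": "service-1", "Cohabitation": "service-2"}

backends = {   # stand-ins for the actual services behind the broker
    "service-1": {"23414": {"type": "Person", "name": "Ola Nordmann"}},
    "service-2": {},
}

def lookup(entity_type, entity_id):
    """Route 'give me an entity of type X with ID n' to the right service."""
    service = routing[entity_type]
    return backends[service].get(entity_id)

print(lookup("Person", "23414")["name"])  # Ola Nordmann
```

The client never learns which service answered, which is exactly the decoupling the broker provides.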
21. Generic translation service (Step #4)
[Figure: translator services X->Y, Y->Z, X->Z, Y->X, Z->Y, Z->X available to the Broker]
• Sometimes the client wants format X, but the server can only supply Y
– the broker can find a translator service, and
– ensure that the translation happens automatically
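Finding a translator is a path search over the format pairs the broker knows about. A sketch (deliberately leaving out the direct X->Z translator to show that chains of translators can also be found):

```python
from collections import deque

# Translator services known to the broker, as (from, to) format pairs.
# Hypothetical set: the direct X->Z translator is left out on purpose.
translators = [("X", "Y"), ("Y", "Z"), ("Y", "X"), ("Z", "Y")]

def find_path(src, dst):
    """Breadth-first search for a chain of translators from src to dst."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for a, b in translators:
            if a == path[-1] and b not in seen:
                seen.add(b)
                queue.append(path + [b])
    return None  # no chain of translators exists

print(find_path("X", "Z"))  # ['X', 'Y', 'Z']
```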
22. Some science fiction (Step #4)
• We already have
– the structure of XML formats X and Y described, and
– connected to the reference model
• In some cases we can then generate the translator automatically
– made a prototype in 2004
– it worked!
– but it won't always work
23. Impact analysis (Step #4)
• If we register clients and their requests in the model, we know more about the uses of the architecture
• It becomes possible to find the answer to questions like
– "can we stop this service?"
– "does anyone use this format?"
– ...
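With usage registered as data, those questions become one-line queries. A sketch with hypothetical clients, services, and formats:

```python
# Sketch: registered client requests make impact questions queryable.
# All clients, services, and formats here are hypothetical.
usage = [
    {"client": "App1", "service": "customer-lookup", "format": "formatY"},
    {"client": "App3", "service": "customer-lookup", "format": "XTM"},
]

def users_of(service):
    """'Can we stop this service?' -> who would notice if we did."""
    return [u["client"] for u in usage if u["service"] == service]

def format_in_use(fmt):
    """'Does anyone use this format?'"""
    return any(u["format"] == fmt for u in usage)

print(users_of("customer-lookup"))  # ['App1', 'App3']
print(format_in_use("formatZ"))     # False
```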
24. Access control (Step #5)
• Rules for this are usually
– not documented anywhere,
– encoded in software all over the enterprise
• They can also be represented in the model
– user groups can be connected to the data they are allowed to access/modify
– there is no need to represent individual users
• All this can be retrieved via web services
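A sketch of what such rules could look like as model data: user groups connected to the entity types they may read or modify (group and operation names are hypothetical):

```python
# Sketch: access rules as data, connecting user groups (not
# individual users) to entity types. All names are hypothetical.
rules = {
    ("case-workers", "Cohabitation"): {"read", "modify"},
    ("auditors", "Cohabitation"): {"read"},
}

def allowed(group, entity_type, operation):
    """Check a group's rights; this is what a web service would expose."""
    return operation in rules.get((group, entity_type), set())

print(allowed("auditors", "Cohabitation", "read"))    # True
print(allowed("auditors", "Cohabitation", "modify"))  # False
```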
25. Generic querying with SPARQL (Step #6)
• More advanced lookup
– using SPARQL as the query language
– can do more than just looking up IDs
– doing queries "into the cloud"
• The reference model is used to interpret the query
– splits it up and delegates to different services
– the broker then assembles the result
• SPARQL is not a very powerful language
– that is why this is possible
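The split-and-join idea in miniature (this is not real SPARQL federation, just the delegation pattern with hypothetical data): each triple pattern is answered by the service holding that property, and the broker joins the partial results on the shared variable:

```python
# Sketch: each property is answered by a different service; the
# broker joins the partial results. All data here is hypothetical.
sources = {  # property -> (subject, value) pairs held by some service
    "name": [("p1", "Ola"), ("p2", "Kari"), ("p3", "Per")],
    "cohabits-with": [("p1", "p2")],
}

def answer(prop1, prop2):
    """Two-pattern query '?s prop1 ?o . ?s prop2 ?n': delegate each
    pattern to its source, then join the results on ?s."""
    names = dict(sources[prop2])
    return [(s, names[s]) for s, _ in sources[prop1] if s in names]

print(answer("cohabits-with", "name"))  # [('p1', 'Ola')]
```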
26. Semantic formats "on the wire" (Step #7)
• Ordinary XML formats are static
– semantic formats are dynamic
• New fields and entity types can be added
– without changing the format
– without confusing recipients
• Transparent support for subtyping
– allowing even more flexibility in interpretation
• Support for merging
– again transparent for recipients
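A sketch of why a triple-shaped wire format tolerates new fields: the recipient picks out the properties it knows and ignores the rest, so senders can add fields without breaking anyone (the payload below is hypothetical):

```python
# Sketch: a recipient of a triple-shaped payload keeps only the
# properties it understands. The payload here is hypothetical.
known = {"start-date", "end-date"}  # properties this recipient predates

incoming = [
    ("cohab-1", "start-date", "2001-04-01"),
    ("cohab-1", "end-date", "2009-12-31"),
    ("cohab-1", "registered-by", "App2"),  # new field, safely ignored
]

record = {p: o for s, p, o in incoming if p in known}
print(record)  # {'start-date': '2001-04-01', 'end-date': '2009-12-31'}
```

Subtyping works the same way: a recipient that knows the type hierarchy can treat an unknown subtype as its nearest known supertype.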
27. Scenario (Step #7)
[Figure: App #1 asks the Broker: "Need all X satisfying certain criteria, in format Y" (SPARQL, format Y); the Broker sends SPARQL queries to Service #1 and Service #2 (backed by a database), merges and filters the XTM results, and has a Translator produce format Y]
28. Conclusion
• Clients don't need to know where the data are
– much looser coupling
– much easier to restructure
• Clients do not need to relate to the many data models that are in use
– instead they need only refer to the reference model
• Clients do not need to worry about the data formats used by servers
– instead, they simply ask for data in the format they want