Building a semantic enterprise content management system from scratch v1


How we built a practical ontology-driven corporate intranet portal in the cloud in three months using off-the-shelf technology. Presented at SemTechBiz San Francisco, June 6th 2012.


1. Building a Semantic Enterprise Content Management System from Scratch. How we built a practical ontology-driven corporate intranet portal in the cloud in three months using off-the-shelf technology. SemTechBiz San Francisco, June 6th 2012. Ron Michael Zettlemoyer and Cliff Jurkiewicz (@ronmichael and @cessna_pilot)

2. fynydd (:in-id), noun. 1. a word of Welsh origin meaning mountain. 2. a company of big thinkers, innovative problem solvers, and doers. Mobile & Desktop Apps • Web Apps & Services • Semantic Knowledge Management • User Interface Design • Systems Architecture • Reporting & Analytics. fynydd.com

3. How we got here. @thomsonreuters #kolexperts • @jwindz • “Translational medicine meets the semantic web” • #semtech • 2009 #sla2009 • @candp #stardog • @ronmichael • @fynydd Cambridge • #semtechbiz 2012. Steve Jobs: “Creativity is just connecting things.”

4. Traditional enterprise content management. Andy Warhol: “They say that time changes things, but you actually have to change them yourself.”

5. Semantic enterprise content management represents, recognizes, and responds to the meaning of content and the goals of users.

6. Build it yourself. Julius Caesar: “Creating is the essence of life.”

7. Stand on the shoulders of giants. Henry Ford: “I invented nothing new. I simply assembled the discoveries of other people. Had I worked fifty or ten or even five years before, I would have failed. So it is with every new thing.”

8. Keep your head in the cloud. Henry David Thoreau: “If you have built castles in the air, your work need not be lost; that is where they should be.”

9. Be agile. Charles Darwin: “It is not the strongest of the species that survives, nor the most intelligent. It is the one that is the most adaptable to change.”

10. Tame your content. Dr. Seuss: “So the writer who breeds more words than he needs, is making a chore for the reader who reads.”

11. Architecture dotNetRDF

12. Foundation Microsoft SharePoint ? Cambridge

13. Ontology
• Define your goal: increase content findability
• Build simply and as you need it
• Provide simple management tools
• Sell stakeholders on its value
• Hide it from users

14. Browse
• Research and curate top level menus
• Generate dynamic sub menus
• Generate related content links
• Adopt friendly URLs
• Design beautiful pages

15. Search
• Start with autocomplete
• Use a “snap-to-grid” approach
• Make it contextual and personalized
• Provide federated and adaptive results
• Design beautiful search results

16. Search (diagram labels): User input • Context • Content metadata • Ontology • Content data • SPARQL • SQL • LINQ • Operations • Public datasets • Secret sauce • Analytical data • Results & suggestions

17. Administration
• Give authors manual & automatic tagging
• Show content-level analytics
• Build a great editor
• Design beautiful administrative tools

18. Keep moving. Lexus: “Anything not moving forward is moving backward.”

19. Start building. William Wordsworth: “To begin, begin.”

20. Libraries and Code
• dotNetRDF: http://dotnetrdf.org
• Squickl SQL data access library: https://github.com/ronmichael/squickl.net
• AWS Snapshot Scheduler: https://github.com/ronmichael/aws-snapshot-scheduler
• Stardog Bites MSSQL CLR extensions: https://github.com/ronmichael/stardog-bites-mssql
• CFrame Content Management Framework: https://github.com/ronmichael/cframe
• dotNetRDF Stardog Helper: https://github.com/ronmichael/dotnetrdf-stardog-helper

21. References
• Integrating Semantic Systems, John F. Sowa: http://go.fynydd.com/vxzum
• An Ontology-Based Knowledge Management Platform, Aldea et al: http://go.fynydd.com/opble
• Semantic Enterprise Content Management, Mark Fisher, Amit Sheth: http://go.fynydd.com/qfllv
• The Semantic Web and Entertainment Weekly, Donna Slawsky: http://go.fynydd.com/dygpj
• Improving Content Management with Semantic Technologies, Fernando Carolo and Leonardo Burlamaqui: http://go.fynydd.com/bpvor
• Content Management Bible, Bob Boiko: http://go.fynydd.com/xhjbi

22. fynydd.com Don’t forget your towel.

Editor's Notes

About three years ago Jesse Dudley was working at Thomson Reuters on a product called KOLexperts that identifies experts in the pharma and biotech industries by analyzing content in places like PubMed. She attended the Special Libraries Association (SLA) Conference in June of 2009 in DC and, because of her work on KOLexperts, she attended a presentation titled “Translational medicine meets the semantic web” by Olivier Bodenreider from the National Library of Medicine. This was her introduction to semtech, after which she started sharing semtech material with me and I passed it along to Fynydd. It had obvious value for a lot of the enterprise knowledge management tools we work on. So as we worked with customers interested in improving their knowledge sharing tools and intranets, we started experimenting with it and recommending it. We started working with Clark and Parsia and began building prototype content management systems that ran on Stardog, their new RDF database. This eventually resulted in a semantic content management prototype and framework we called Cambridge, which has been well received in various incarnations by a couple of clients. And then, almost exactly three years after SLA 2009, we are speaking at SemTechBiz 2012.
Traditional ECM is most often the intranet portal. It’s primitive, slow to change, hard to deploy. It’s broken. It’s time to change.
SECMS tries to solve some of these problems by understanding the meaning of content and the goals of users. SECMS is the intersection of meaning and goals. We store information in a more logical, standard format (RDF) and use a more modern, standard tool (SPARQL) to query it.
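To make the idea of storing metadata as triples and querying it by pattern concrete, here is a minimal Python sketch. It is only an illustration: it uses a plain in-memory list rather than Stardog or dotNetRDF, the entity names (`doc:travel-policy`, `topic:travel`, `role:employee` and so on) are hypothetical, and the `hasSubject` and `hasAudience` predicates are borrowed from the tagging notes further on.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical content metadata as (subject, predicate, object) triples.
TRIPLES: List[Triple] = [
    ("doc:travel-policy", "hasSubject", "topic:travel"),
    ("doc:travel-policy", "hasAudience", "role:employee"),
    ("doc:expense-form", "hasSubject", "topic:travel"),
    ("doc:expense-form", "hasAudience", "role:manager"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern.

    None acts as a wildcard, loosely like a variable in a SPARQL
    triple pattern.
    """
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def documents_about(topic: str, audience: str) -> List[str]:
    """Find documents tagged with both the given subject and audience."""
    about = {t[0] for t in match(TRIPLES, p="hasSubject", o=topic)}
    aimed = {t[0] for t in match(TRIPLES, p="hasAudience", o=audience)}
    return sorted(about & aimed)
```

In a real RDF store the same question would be a single SPARQL query with two triple patterns sharing one variable; the set intersection here stands in for that join.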
Some design principles. First is build it yourself. Often debated; there is no perfect answer. Why did we?
- The semtech marketplace for this kind of thing is in its infancy, especially UI and UX
- Innovative and cutting-edge solution
- Tools shape thinking; differentiate yourself
Next: don’t build all of it yourself. It’s the age of the mashup. Get advice and assistance from the best in the field. Build using the best software components and tools, whether open source or commercial.
The cliched cloud slide. Why does the cloud matter? Provisioning real servers is slow and costly, bureaucratic. Even if final deployment is onsite, cloud is great for prototyping. Scale quickly. Cheaper and more efficient servers. While prototyping you can never be sure what resources you’ll need.
Another cliched slide: agile development. But why does it matter? Talk to clients (end users, not management) and understand their problems. Build iteratively. Build a system that doesn’t require lots of documentation. Respond to change in business, marketplace, technology, capabilities.
Last design principle: sometimes you need to upgrade your content. Our policy & procedure story: we started by thinking about how to build a tool to deal with the existing content. But that content was written and organized for an old medium, paper, then pushed to PDF. It was redundant, disorganized, mixed together. Once we switched gears and rewrote and improved the content, the solution was easier to build and better for users.
Now for implementation.
- AWS: incredibly flexible and innovative
- .NET and C#: great framework and language, well accepted in the enterprise
- MSSQL: good for non-RDF needs, well accepted in the enterprise; SQL Express is free
- Stardog: great RDF database, fast and easy to use
- dotNetRDF: open source, talks to Stardog with ease
.NET is our platform, but what about a foundation? Build or buy? Lots of debate and procrastination.
- All choices required similar development times
- Build your own: faster to prototype, most flexible, better ability to innovate
- Avoid the politics of deciding between systems already in place (Lotus Quickr, TeamSite, SharePoint)
- A generic .NET solution moves easily into whatever framework the customer has or wants
One of our biggest problems was overcomplicating the ontology, e.g. trying to answer questions with it. Define the goal: findability. Don’t make it complicated; build as you need it. Treat the ontology like content, not code: build nice tools and prepare for it to change often. Biggest thing of all: don’t talk to users too much about ontology (or tech in general); it’s only a means to an end. But selling its value to stakeholders can work.
We initially planned for dynamic menus based on role, but that was too complicated and unnecessary. Curated top menus based on user research, card sorting, etc. worked best. Dynamic sub menus and related content links work. Friendly URLs are often forgotten; they are good for experts and for sharing. Beautiful pages (UX, layouts, whitespace and margins) improve browsability and user satisfaction.
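As a small illustration of the friendly URL point, the sketch below turns a section name and page title into a readable path. The function names and the `/section/title` layout are assumptions for the example, not the scheme used in Cambridge.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated slug."""
    # Decompose accented characters and drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def friendly_url(section: str, title: str) -> str:
    """Build a hypothetical /section/title path from two display names."""
    return "/" + slugify(section) + "/" + slugify(title)
```

A path built this way can be read aloud, guessed by expert users, and pasted into an email without an opaque ID.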
Don’t delay autocomplete; it improves search dramatically. Take your inputs and “snap them to a grid” to find an answer. Context is important, and so is personalization. Federation: include all types of results. Adaptive: build in your own analytics early on and use them for self-diagnosis and improvement. Beautiful results are easier to read.
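One way to read “snap to a grid” is mapping free-text input onto the nearest known label from the ontology. The Python sketch below shows prefix autocomplete plus that kind of snapping. Both the interpretation and the use of fuzzy string matching are assumptions, and the labels are hypothetical.

```python
import difflib
from typing import List, Optional

# Hypothetical labels that would normally come from the ontology.
LABELS: List[str] = [
    "Travel Policy",
    "Travel Expenses",
    "Training Calendar",
    "Benefits Enrollment",
]


def autocomplete(prefix: str, labels: List[str] = LABELS,
                 limit: int = 5) -> List[str]:
    """Suggest labels that start with what the user has typed so far."""
    p = prefix.strip().lower()
    if not p:
        return []
    return [label for label in labels if label.lower().startswith(p)][:limit]


def snap_to_grid(text: str, labels: List[str] = LABELS,
                 cutoff: float = 0.6) -> Optional[str]:
    """Map free-text input to the closest known label, or None if no label
    is similar enough."""
    by_lower = {label.lower(): label for label in labels}
    hits = difflib.get_close_matches(
        text.strip().lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[hits[0]] if hits else None
```

Once input is snapped to a known entity, the rest of the search can work with that entity and its relationships instead of raw keywords.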
Tagging: a simple approach of picking “subject” (hasSubject) and “audience” (hasAudience) entities from a hierarchical view of select pieces of the ontology. Expand to let authors choose other relationships (e.g. hasDestination mars). Simple auto-tagging recommendations by matching text; add more complex ones with tools like Open Calais? Inline analytics were a very valuable tool for authors and management. Of course, the editor has to be great, as should the entire admin, which is too often ignored.
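The simple auto-tagging by text matching mentioned above could look roughly like this Python sketch. The entity identifiers and labels are hypothetical, and whole-phrase matching is just one plausible reading of “matching text”.

```python
import re
from typing import Dict, List

# Hypothetical ontology entities and the labels that should trigger them.
ENTITY_LABELS: Dict[str, List[str]] = {
    "topic:travel": ["travel", "trip"],
    "topic:benefits": ["benefits", "health insurance"],
    "role:manager": ["manager", "supervisor"],
}


def suggest_tags(text: str,
                 entity_labels: Dict[str, List[str]] = ENTITY_LABELS) -> List[str]:
    """Recommend entities whose labels appear as whole words or phrases."""
    lowered = text.lower()
    suggestions = []
    for entity, labels in entity_labels.items():
        for label in labels:
            # \b keeps "trip" from matching inside a word like "stripe".
            if re.search(r"\b" + re.escape(label) + r"\b", lowered):
                suggestions.append(entity)
                break  # one matching label is enough for this entity
    return suggestions
```

The suggestions are only recommendations: the author still confirms them as hasSubject or hasAudience tags. Exact matching misses plurals and synonyms, which is where an entity extraction service would take over.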
Must constantly improve - plan and budget for it early on. Start with a basic tool that looks great and has some semantics, prove it, grow it. People are used to constant improvement - internet, cars, etc. Focus on search, navigation, UX and performance.
