Text Analytics & Linked Data Management As-a-Service
Marin Dimitrov, Alex Simov, Yavor Petkov
May 31st, 2015
Text Analytics & Linked Data Management -aaS / Wasabi’2015 #1May 2015
About Ontotext
• Provides products & solutions for content enrichment and metadata management
– 70 employees, headquarters in Sofia (Bulgaria)
– Sales presence in London, NYC & Boston
• Major clients and industries
– Media & Publishing
– Health Care & Life Sciences
– Cultural Heritage & Digital Libraries
– Government
– Education
Contents
• Semantic Technology adoption challenges
• The Self-Service Semantic Suite (S4)
• Lessons learned
Semantic Technology Adoption Challenges
Time-to-value gap (Gartner)
From Wasabi @ ESWC’2014
Performance, Integration, Penetration, Payback & ROI
Semantic Technology adoption
• Limiting factors
– Complexity & cost of existing solutions
– Limited resources to evaluate novel technologies (startups)
– Slow procurement processes, risk aversion (enterprises)
• How can we…
– Reduce time-to-market
– Reduce adoption risks
– Optimise costs
The Self-Service Semantic Suite (S4)
What is S4?
• Capabilities for text analytics, content enrichment and smart data management
– Text analytics for news, life sciences and social media
– RDF graph database as-a-service
– Access to large open knowledge graphs
• Available on-demand, anytime, anywhere
– Simple RESTful services
• Simple pay-per-use pricing
– No upfront commitments
What is S4?
Benefits
• Enables quick prototyping
– Instantly available, no provisioning & operations required
– Focus on building applications, don’t worry about infrastructure
• Free tier!
• Easy to start, shorter learning curve
– Various add-ons, SDKs and demo code
• Based on enterprise semantic technology
Text analytics with S4
• Text analytics services
– News annotation
– News categorisation
– Biomedical
– Twitter
• Entity linking & disambiguation
– Mappings to DBpedia & GeoNames instances
– Mappings to biomedical data sources (LinkedLifeData)
• HTML, MS Word, XML, plain text input
• Simple JSON output
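To make the "simple JSON in, simple JSON out" idea concrete, here is a minimal sketch of how a client might package a document for a RESTful annotation service and read entity links out of the reply. It is not the S4 API: the field names (`document`, `documentType`, `annotations`, `start`, `end`, `uri`), the MIME type list and the sample response are all assumptions made up for illustration.

```python
import json

# Assumed MIME types for the input formats named on the slide
# (plain text, HTML, XML, MS Word). The real service may differ.
SUPPORTED_TYPES = {
    "text/plain",
    "text/html",
    "text/xml",
    "application/msword",
}


def build_request(document: str, document_type: str = "text/plain") -> str:
    """Serialise an annotation request as JSON (hypothetical field names)."""
    if document_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported document type: {document_type}")
    return json.dumps({"document": document, "documentType": document_type})


def extract_links(response_body: str) -> list[tuple[str, str]]:
    """Return (surface text, linked URI) pairs from a hypothetical response."""
    payload = json.loads(response_body)
    text = payload["text"]
    links = []
    for annotation in payload.get("annotations", []):
        # Character offsets are assumed to index into the returned text.
        surface = text[annotation["start"]:annotation["end"]]
        links.append((surface, annotation["uri"]))
    return links


# Made-up response body, used only to exercise the parser.
sample_response = json.dumps({
    "text": "Ontotext is based in Sofia.",
    "annotations": [
        {"start": 21, "end": 26, "uri": "http://dbpedia.org/resource/Sofia"},
    ],
})

for surface, uri in extract_links(sample_response):
    print(surface, "->", uri)
```

Keeping the request builder and the response parser free of any network code means either side can be swapped once the actual service contract is known.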
News analytics example
S4 result
Fully managed RDF DB in the Cloud
• Low-cost graph DBaaS available 24/7
• Ideal for small & moderate data volumes
– Database options: 1M, 10M, 50M, 250M and 1B triples
• Instantly deploy new databases when needed
• Zero administration: automated operations, maintenance & upgrades
• Users pay only for the actual database utilisation
– Number of triples stored + number of queries per month
• OpenRDF REST API
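The sizing and pricing model above can be sketched in a few lines. The five database options come from the slide; the per-unit rates are hypothetical parameters, since the slides state only that cost depends on triples stored plus queries per month, not the actual prices or formula.

```python
import bisect

# Database size options listed on the slide, in triples.
TIERS = [1_000_000, 10_000_000, 50_000_000, 250_000_000, 1_000_000_000]


def smallest_tier(n_triples: int) -> int:
    """Return the smallest listed database option that can hold n_triples."""
    if n_triples < 0:
        raise ValueError("n_triples must be non-negative")
    index = bisect.bisect_left(TIERS, n_triples)
    if index == len(TIERS):
        raise ValueError("data volume exceeds the largest listed option")
    return TIERS[index]


def monthly_cost(
    n_triples: int,
    n_queries: int,
    rate_per_million_triples: float,
    rate_per_thousand_queries: float,
) -> float:
    """Usage-based cost: a storage term plus a query term.

    Both rates are placeholders supplied by the caller; the linear
    formula itself is an assumption, not a published price list.
    """
    storage = n_triples / 1_000_000 * rate_per_million_triples
    queries = n_queries / 1_000 * rate_per_thousand_queries
    return storage + queries


print(smallest_tier(3_500_000))
print(monthly_cost(2_000_000, 5_000, 1.0, 0.5))
```

Because the tiers are sorted, a binary search finds the fitting option directly, and a data set of exactly 1M triples still maps to the 1M option.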
Fully managed RDF DB in the Cloud
Knowledge graphs with S4
• SPARQL query endpoint to the FactForge semantic data warehouse
– 500 million entities / 5 billion triples
• Key LOD datasets integrated
– DBpedia, Freebase/WikiData, GeoNames, WordNet
– Dublin Core, SKOS, PROTON ontologies and vocabularies
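Querying a SPARQL endpoint such as the one described above usually follows the SPARQL 1.1 Protocol: the query text travels in a `query` parameter and the client asks for a result format through the `Accept` header. The sketch below only builds such a request and does not send it. The endpoint URL is a placeholder rather than the real FactForge address, and the query itself is an illustrative assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Placeholder endpoint; substitute the address of the actual service.
ENDPOINT = "https://example.org/sparql"

# Illustrative query: English labels of one DBpedia resource.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Sofia> rdfs:label ?label .
  FILTER (lang(?label) = "en")
} LIMIT 5
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol query via HTTP GET."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url[:60] + "...")
```

Passing the resulting object to `urllib.request.urlopen` would execute the query against whichever endpoint is configured; authentication, if the service requires it, is left out of this sketch.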
Cloud native architecture of S4
Elasticity vs High Availability vs Cost Efficiency
Lessons Learned
Lessons learned
• You must build a “cost aware” cloud platform
• Cloud-native architectures are more efficient, but more difficult to build
• A microservices architecture improves system resilience & agility, but is difficult to design right
• Extensive and continuous benchmarking & monitoring
– Some problems emerge only at large scale
• Assume failures will happen & design for resilience
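One common way to "assume failures will happen" on the client side of a microservice call is to retry with exponential backoff. The slides do not say which resilience techniques S4 uses, so the following is a generic sketch; the function name and its parameters are illustrative choices.

```python
import time


def call_with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on any exception with exponential backoff.

    The delay doubles after each failure: base_delay, 2 * base_delay, ...
    The last failure is re-raised so callers still see the real error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Usage: a stand-in operation that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


print(call_with_retries(flaky, base_delay=0.01))
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting; a production version would typically also add jitter and retry only on errors known to be transient.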
Thank you!
