Extracting value from data:
Introducing AnnoMarket - the cloud-based text annotation marketplace

Helen Lippell, Press Association

The research leading to these results has received funding from the European Community's
Seventh Framework Programme (FP7/2007-2013) under grant agreement n°296322.

Getting started

- AnnoMarket is a European research project
- PA part of a consortium with the University of Sheffield, French start-up IMR and Bulgarian semantic specialists Ontotext
- I work as a data wrangler within the PA technology team, working with linked data curation and semantic modelling
- Agenda:
  - Brief overview of text analytics
  - Introducing the AnnoMarket platform

Women in Data: NLP edition, Open Data Institute, 3 December 2013

Text analytics isn’t that new

- People have been trying to make sense of unstructured data for a long time
- Rosetta Stone an early use case!
  - Experts compared patterns in the 3 texts and eventually could identify entities in the previously incomprehensible hieroglyphics
- This 1950s definition is startlingly accurate:
  - H.P. Luhn, IBM Journal, 1958: "...utilize data-processing machines for auto-abstracting and auto-encoding of documents for creating interest profiles for each of the 'action points' in an organization. Both incoming and internally generated documents are automatically abstracted, characterized by a word pattern, and sent automatically to appropriate action points"
- Now there’s an unprecedented buzz around text analytics
  - Big Data movement
  - Semantics gaining traction in business applications

Who uses text analytics?

- Anyone who wants to derive value from unstructured data at scale
- Not just spooks…
- Scientific and technical
- Media and publishing
- Open data community
- Researchers
- Data-driven businesses
- Customer experience

What text analytics can do

- Named entity recognition
- Disambiguation
  - e.g. Iceland!
- Entity types
  - e.g. people, places, things, organisations
- Relevance
- Pattern-identified entities
  - e.g. amounts of money, postcodes
- Co-occurrence
- Classification and categorisation
- Sentiment analysis
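
To make "pattern-identified entities" concrete, here is a minimal sketch of how amounts of money and UK postcodes might be picked out with regular expressions. The patterns and the `find_pattern_entities` helper are simplified assumptions for illustration only; they are not taken from AnnoMarket or GATE, and real postcode and currency rules are broader.

```python
import re

# Simplified, illustrative patterns (assumptions, not production rules).
# Money: a currency symbol, digits with optional thousands separators and
# pence/cents, and an optional magnitude word.
MONEY = re.compile(
    r"[£$€]\s?\d+(?:,\d{3})*(?:\.\d{2})?(?:\s?(?:million|billion|bn|m)\b)?"
)
# Postcode: a rough approximation of the common UK layouts, e.g. "SW1A 1AA".
POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b")


def find_pattern_entities(text):
    """Return (type, matched text, start offset) tuples in document order."""
    found = []
    for label, pattern in (("Money", MONEY), ("Postcode", POSTCODE)):
        for match in pattern.finditer(text):
            found.append((label, match.group(0), match.start()))
    return sorted(found, key=lambda entity: entity[2])


if __name__ == "__main__":
    sample = "The council spent £1,250.50 on repairs near SW1A 1AA."
    for entity in find_pattern_entities(sample):
        print(entity)
```

Entities like these need no dictionary lookup, which is why they are listed separately from named entity recognition: the shape of the string is enough to identify them.
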
AnnoMarket

- The marketplace: an “App store” for text analysis services
  - Breaking down barriers to entry for SMEs (developers and end-users alike)
  - Built on robust, mature GATE applications (open-source with global community supported by the University of Sheffield)
- Benefits to end-users
  - Affordable, pay-for-what-you-need model
  - SaaS, cloud-based
  - Flexible input and output formats
- Benefits to suppliers
  - Payments system
  - Access to user base
- (A note on look and feel: It is basic at the moment!)
Running an annotation job

- Find a service
- Test it on site
- Upload documents or specify a custom crawl
- Manage server (GATE Teamware or Mimir)
- Platform handles execution of job, keeps user updated
- Download results or export to a GATE Mimir instance
- Formats include XML, HTML, PDF, DOC

- GATE Teamware – web-based management platform for annotation
- GATE Mimir – open-source framework for integrated semantic search

Uploading pipelines

- Straightforward process
- Standard components:
  - Pipeline – GATE saved application state
  - Supporting files (e.g. gazetteers)
  - Metadata for the platform and user-facing pages
- Files checked then put live
- Platform tracks usage and handles payments

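
The "files checked" step for an uploaded pipeline could look something like the sketch below. Everything here is a hypothetical illustration: the `PipelinePackage` class, the `check_package` function, the required metadata fields and the file extensions are assumptions, not AnnoMarket's actual upload format or validation rules.

```python
from dataclasses import dataclass, field

# Hypothetical metadata fields that user-facing pages might need.
REQUIRED_METADATA = ("name", "description", "tags")

# Assumed file extensions for a GATE saved application state.
APPLICATION_SUFFIXES = (".xgapp", ".gapp")


@dataclass
class PipelinePackage:
    """Hypothetical description of an uploaded pipeline bundle."""
    files: list                      # relative paths inside the upload
    metadata: dict = field(default_factory=dict)


def check_package(package):
    """Return a list of problems; an empty list means the bundle looks complete."""
    problems = []
    # The pipeline itself: a GATE saved application state.
    if not any(path.endswith(APPLICATION_SUFFIXES) for path in package.files):
        problems.append("no GATE saved application state found")
    # Metadata for the platform and user-facing pages.
    for key in REQUIRED_METADATA:
        if not package.metadata.get(key):
            problems.append(f"missing metadata field: {key}")
    return problems


if __name__ == "__main__":
    package = PipelinePackage(
        files=["application.xgapp", "gazetteers/cities.lst"],
        metadata={"name": "News NER", "description": "Demo", "tags": ["news"]},
    )
    print(check_package(package))
```

Supporting files such as gazetteers are deliberately not required in this sketch, since not every pipeline needs them.
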
AnnoMarket screenshots

[Screenshots: browsable portal, tag-based filtering, input config, output config]

News pipeline tool

- Customised pipeline which annotates named entities in the news domain (optimised for the UK)
- Leverages PA’s knowledge base and Linked Data references, also other entity types
Get involved

- Public beta
  - Register your interest now
  - We’ll email when it’s open
  - Free credit to early registrants
- Ultimate aim:
  - A sustainable platform that generates revenue for contributors who wouldn’t have an outlet otherwise
- Play with the platform
- Feed back to us – bugs, functionality, finding resources, what more you’d like to see, etc!

Get in touch

- Public beta – http://annomarket.com
- Project site – https://annomarket.eu
- @AnnoMarket
- helen.lippell@pressassociation.com
- @octodude

