Panacea Project Tutorial (MT Summit 2013)

TRL Group - Institute for Applied Linguistics (Universitat Pompeu Fabra)
PANACEA Tutorial
MT Summit
3rd September 2013
Marc Poch, Universitat Pompeu Fabra
marc.pochriera@upf.edu
Antonio Toral, Dublin City University
atoral@computing.dcu.ie
1
Tutorial Objectives
• Become familiar with the web service paradigm and its
advantages
• Become familiar with the PANACEA platform
• Learn how to search and use web services
• Learn how to chain web services to build more complex
processing pipelines (workflows)
• Learn how to create and share your own web services
2
Tutorial outline
• Introduction to the PANACEA platform [30 min]
• Tour of the PANACEA websites [15 min]
• Find and try web services [45 min]
• Interoperability [10 min]
• Break [15 min]
• Chain web services: workflows [60 min]
• Be part of the platform: creating and sharing web services
[30 min]
3
Introduction to the PANACEA platform [30 min]
4
PANACEA Project
Objectives
• Development of a platform (a space of interoperability
defined by standardized protocols and common
interfaces) for the easy integration of a variety of
software components, tools and methodologies deployed
as web services to configure a factory for the automation
of acquisition, processing and annotation of language
resources.
5
From local tools to sharing workflows
• Tools to be shared remotely
• Web Service wrapper
• The Registry
• Common Interfaces
• Format Converters
• Workflow editor and engine
• Sharing workflows
6
Platform definition
Formal definition:
• The PANACEA platform is an interoperability space based on tools,
guidelines, a Common Interface definition, and a “Travelling Object”
specification
Technical definition (components):
• Tools: Taverna, BioCatalogue, myExperiment, Soaplab, storage system
• Common Interface: WS interoperability
• Travelling Object: XCES, GrAF, CoNLL, LMF
• Documentation
7
Platform tools and portals
• Web Services: share tools (remotely run distributed tools).
Soaplab Server; JAX-WS, Axis, CXF, etc. (SOAP or REST)
• Registry (BioCatalogue): share and find Web Services.
PANACEA Registry: registry.elda.org
• Workflows (Taverna, www.taverna.org.uk): call / chain Web Services.
Clients: Java, Python, Perl, etc.
• Social network (myExperiment): share and find workflows.
PANACEA myExperiment: myexperiment.elda.org
PANACEA Platform: uses, adapts and improves myGrid tools for eScience (used
in biology, social science, music, astronomy, multimedia and chemistry).
8
Technological option:
Web Services
SOAPLAB 2
(SOAP)
• Easy deployment of command-line
tools as WS (Java, Python, C++,
UIMA, etc.)
• Clients: Java, Python, Perl, Taverna,
etc.
• No coding needed! Only metadata
• “Polling” techniques for long-running
tasks
• Web form to run the web services
• URL input / output ready
• PANACEA improvement for SOAP
messaging (network usage and memory)
• PANACEA limit for multiple users
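The “polling” bullet above can be illustrated with a small Python sketch: the client starts a job, then asks for its status at intervals instead of holding one long request open. This is not Soaplab's client API. The function `wait_for_job`, the `get_status` callable and the state names `COMPLETED` and `FAILED` are hypothetical stand-ins for whatever status call a real client exposes.

```python
import time
from typing import Callable

# Hypothetical terminal states; a real service defines its own names.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def wait_for_job(get_status: Callable[[], str],
                 interval_s: float = 2.0,
                 timeout_s: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Ask for the job status repeatedly until it is terminal or time runs out."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"job still {status!r} after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s


# Stand-in for a remote job that finishes on the third status request.
_fake_states = iter(["RUNNING", "RUNNING", "COMPLETED"])
final_status = wait_for_job(lambda: next(_fake_states), interval_s=0.0)
print(final_status)
```

Passing `sleep` as a parameter is only a convenience for trying the loop without real waiting.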
[Diagram: Web Services, Workflow editor (TAVERNA), Registry (BioCatalogue), Social network (myExperiment)]
9
Technological option:
Registry and myExperiment
• User friendly GUI
• Free, open source, continuously
maintained
• Search function
• User ratings (user feedback)
• Service annotations and Language
Categorization (PANACEA)
• Monitoring system (web service
status and data results)
10
Technological option:
Taverna
• User friendly GUI
• Free and open source
• Continuously maintained (v. 2.4)
• SOAP and REST web services
• Credentials manager (passwords,
certificates, etc.)
• Multiple-file processing (“lists”)
• PANACEA workflows, best
practices, videos, etc.:
• Parallelization, error recovery
(“retries”), polling
• PANACEA collaboration: bug fixing
and pre-release tests
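In Taverna, retries are configured on a workflow processor rather than written as code. The idea behind the “retries” bullet can still be sketched in a few lines of Python. The names `call_with_retries` and `flaky_service` are hypothetical and only illustrate the behaviour: repeat a failing call a fixed number of times before giving up.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(call: Callable[[], T], retries: int = 3) -> T:
    """Run call(); on failure, try again up to `retries` more times."""
    if retries < 0:
        raise ValueError("retries must be zero or positive")
    last_error: Optional[Exception] = None
    for _ in range(1 + retries):
        try:
            return call()
        except Exception as err:  # a real client would catch narrower errors
            last_error = err
    assert last_error is not None
    raise last_error


# Stand-in for a remote service that fails twice and then answers.
attempts = {"count": 0}


def flaky_service() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky_service, retries=3)
print(result, attempts["count"])
```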
11
Service providers
• How can I share the tools on my server?
Web Services: Soaplab Server; JAX-WS, Axis, CXF, etc. (SOAP or REST)
• Where can I make my Web Services public?
Registry (BioCatalogue). PANACEA Registry: registry.elda.org
12
Users
• Can I run tools without installing them?
Web Services
• Where can I find Web Services?
Registry
• How can I run Web Services?
Java, Python, Perl clients, etc.; web clients: Soaplab Spinet
• How can I chain WS?
Java, Python, Perl clients, etc.; Taverna workflows
13
Tour of the PANACEA websites [15 min]
14
http://panacea-lr.eu
15
http://registry.elda.org
http://myexperiment.elda.org
http://panacea-lr.eu/en/tutorials
http://panacea-lr.eu/en/info-for-professionals/documents/
The PANACEA website
• Main Page
• Link Buttons (Registry, myExperiment, tutorials and
documentation)
• Tutorials Page / Videos
• Documentation
• Deliverables
http://panacea-lr.eu
16
The Registry
• The PANACEA Registry is a BioCatalogue instance (the
source code has been used to deploy the registry on a server)
• Features:
• Annotation capabilities and categorization
• Search function
• Automatic status check system for web services
http://registry.elda.org
17
myExperiment
• The PANACEA myExperiment is a myExperiment
instance (the source code has been used to deploy it on a server)
• Features:
• Annotation capabilities
• Search function
• “Services tab beta” added to PANACEA myExperiment.
Users can list web services from the Registry and see
which workflows use them.
18
Find and try web services [45 min]
19
Registry tutorial
• http://registry.elda.org
• Global view of the Registry
• Search engine
• Categorization System
• Metadata and documentation
• Monitoring System
• http://vimeo.com/24790416
20
Calling Web Services
• Spinet web client (Soaplab web services)
• Taverna
• SOAP, WSDL. Examples (Perl, Python)
– http://ws02.iula.upf.edu/panacea/examples/soaplab-clients/soaplab_clients.zip
• Soaplab command-line client
sh $SOAPLAB_FOLDER/build/run/run-cmdline-client \
  -protocol axis1 \
  -e http://srv-cngl.computing.dcu.ie/panacea-soaplab2-axis/services/panacea.europarl_lowercase \
  -w -r input_direct_data "ASDA"
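For scripted use, the same call can be assembled from Python. The sketch below only builds the argument list. The flags and the endpoint are copied verbatim from the slide, while the helper name `build_soaplab_command` and the install location `/opt/soaplab2` are assumptions made for illustration.

```python
import posixpath
import shlex
from typing import List


def build_soaplab_command(soaplab_folder: str, endpoint: str,
                          input_name: str, input_value: str,
                          protocol: str = "axis1") -> List[str]:
    """Return the argument list for the Soaplab command-line client."""
    client = posixpath.join(soaplab_folder, "build", "run", "run-cmdline-client")
    return ["sh", client,
            "-protocol", protocol,
            "-e", endpoint,
            "-w", "-r", input_name, input_value]


command = build_soaplab_command(
    soaplab_folder="/opt/soaplab2",  # assumed install location
    endpoint="http://srv-cngl.computing.dcu.ie/panacea-soaplab2-axis/"
             "services/panacea.europarl_lowercase",
    input_name="input_direct_data",
    input_value="ASDA",
)
print(shlex.join(command))  # shell-quoted form, handy for logging
```

The list could then be handed to `subprocess.run(command, check=True)` on a machine where Soaplab is installed and the service is reachable.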
21
Spinet Tutorial
• Spinet is the Soaplab web client used to test and run WS
deployed on a Soaplab Server.
• Every service provider has at least one Soaplab server
• Demo:
- Access Spinet directly from the Registry:
“Test Form Location (Spinet Web Client):”
- Configure the mandatory parameters and run the WS
- 10 minutes to find and run some web services.
You can start from http://registry.elda.org
22
Exercise
Twitter NLP + Registry
(3rd party tool)
• This web service is based on the Twitter NLP tool developed by
Noah's ARK group.
• Noah's ARK group is Noah Smith's research group at the
Language Technologies Institute, School of Computer Science,
Carnegie Mellon University.
1. Search for the WS in the Registry
2. Check the monitoring system
3. Use the web client with the example data
23
WS advantages (for users)
• No installation
• No maintenance
• No machine resources
• Easily found on the Registry
• Usability
• Can be combined in workflows (share experiments)
24
Interoperability [10 min]
25
Interoperability
• Three levels of interoperability:
– Communication protocols: SOAP, REST
– Data
– Parameters
[Diagram, data: Tool A uses format N, Tool B format M, Tool C format L. Tool B does not “understand” format N!]
[Diagram, data: a single format N for Tool A, Tool B and Tool C. All tools understand the previous format]
[Diagram, parameters: Tool A and Tool B with the same parameter names (A, B, C, D), compared with one tool using Y, T, Q, Z and the other A, B, C, D]
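The data-level problem on this slide (Tool B does not “understand” format N) can be sketched as a compatibility check over a chain of tools. This is an illustration only: the `Tool` class and the `first_mismatch` function are hypothetical, and the format labels N, M, L and TO are taken from the slides as plain strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    reads: str   # format the tool accepts
    writes: str  # format the tool produces


def first_mismatch(chain: List[Tool]) -> Optional[str]:
    """Return a message for the first broken link in the chain, or None."""
    for upstream, downstream in zip(chain, chain[1:]):
        if upstream.writes != downstream.reads:
            return f"{downstream.name} does not understand format {upstream.writes}"
    return None


# Each tool with its own native format: the chain breaks at Tool B.
native = [Tool("Tool A", "N", "N"), Tool("Tool B", "M", "M"), Tool("Tool C", "L", "L")]

# Every tool reading and writing one common format: the chain is valid.
common = [Tool("Tool A", "TO", "TO"), Tool("Tool B", "TO", "TO"), Tool("Tool C", "TO", "TO")]

print(first_mismatch(native))
print(first_mismatch(common))
```

A format converter fits the same model as a tool whose `reads` and `writes` differ, placed between two incompatible tools.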
26
Common Interface
• A Common Interface (CI) defines the mandatory
parameters for every functionality, e.g. PoS
tagging:
PoS Tagger A
  MANDATORY: input, language
  OPTIONAL: Param A
PoS Tagger B
  MANDATORY: input, language
  OPTIONAL: PARAM 1, PARAM 2
http://panacea-lr.eu/en/info-for-professionals/documents/
http://registry.elda.org
27
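A client can use the Common Interface to check a request before sending it. The Python sketch below assumes only what the slide states, namely that PoS tagging requires `input` and `language`. The dictionary layout, the key `pos_tagging` and the function name `missing_mandatory` are hypothetical.

```python
from typing import Dict, List, Tuple

# Mandatory parameters per functionality, as listed on the slide for PoS tagging.
MANDATORY: Dict[str, Tuple[str, ...]] = {
    "pos_tagging": ("input", "language"),
}


def missing_mandatory(functionality: str, params: Dict[str, str]) -> List[str]:
    """List the mandatory parameters that the request does not provide."""
    return [name for name in MANDATORY[functionality] if name not in params]


# Two taggers share the mandatory names and differ only in optional ones.
request_tagger_a = {"input": "some text", "language": "en", "param_a": "x"}
request_tagger_b = {"input": "some text", "param_1": "y"}  # language forgotten

print(missing_mandatory("pos_tagging", request_tagger_a))
print(missing_mandatory("pos_tagging", request_tagger_b))
```

Because the mandatory names are identical for both taggers, one tagger can replace the other in a workflow without touching the mandatory part of the request.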
Travelling Object
• Travelling Object (TO): common data and metadata
format used in PANACEA to make components
interoperable
• TO1: minimal common vertical in-line format used by
deployed tools (based on the XCES standard)
• TO2: stand-off format. Based on the GrAF standard, the
XML serialization of LAF (ISO 24612, 2009)
• LMF: for lexical resources
• CoNLL: for parsers
• Converters and adapted WS outputs
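As an illustration of the parser-oriented format, the sketch below reads tab-separated CoNLL text into Python dictionaries. It assumes the ten-column CoNLL-X layout, which the slides do not confirm as the variant used in PANACEA. The two sample tokens are invented for the example.

```python
from typing import Dict, List

# Column names of the CoNLL-X layout (an assumption about the variant).
CONLLX_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel"]


def parse_conll(text: str) -> List[List[Dict[str, str]]]:
    """Split CoNLL text into sentences, each a list of token dictionaries."""
    sentences: List[List[Dict[str, str]]] = []
    current: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():          # a blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLX_FIELDS):
            raise ValueError(f"expected {len(CONLLX_FIELDS)} columns: {line!r}")
        current.append(dict(zip(CONLLX_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


# Invented two-token sentence, one token per line.
sample = "\n".join([
    "\t".join(["1", "Tools", "tool", "N", "NNS", "_", "2", "SBJ", "_", "_"]),
    "\t".join(["2", "run", "run", "V", "VBP", "_", "0", "ROOT", "_", "_"]),
    "",
])
parsed = parse_conll(sample)
print(len(parsed), parsed[0][1]["deprel"])
```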
28
Format Converters
31 Format converters on the PANACEA Registry
• Freeling to TO. CNR http://registry.elda.org/services/207
• KAF to TO. CNR http://registry.elda.org/services/208
• Basic Xces to txt. CNR http://registry.elda.org/services/209
• PoS tag. (Freeling treetagger) to GrAF. UPF http://registry.elda.org/services/142
• Dependency parsing (Freeling) to GrAF. UPF http://registry.elda.org/services/197
• Dependency CoNLL to GrAF. CNR http://registry.elda.org/services/254
• Word doc to txt. UPF http://registry.elda.org/services/112
• In-house mwe to LMF. CNR http://registry.elda.org/services/296
• Pdf to text. UPF http://registry.elda.org/services/116
• Multi. encodings converter (ISO, UTF, etc.). UPF http://registry.elda.org/services/114
• Aligner to TO. DCU http://registry.elda.org/services/69
• Sentence alignment to TMX. DCU http://registry.elda.org/services/219
• Treetagger to MOSES (factored models). DCU http://registry.elda.org/services/275
• UIMA to GrAF. ILSP http://registry.elda.org/services/182
Providers are encouraged to provide converters for the formats they are interested in
29
3rd party tools integration
• PANACEA WS wrapper (Soaplab) and the CI make
it easy for WS Providers to integrate 3rd party tools.
• ILSP tools (UIMA tools): UIMA
• Freeling: UPC
• Treetagger: University of Stuttgart
• Twitter NLP: Carnegie Mellon University
• MALT Parser: Uppsala University
• DeSR: Università di Pisa
• MOSES, GIZA++, other aligners: Edinburgh, etc.
• DELiC4MT, MT evaluation: DCU
• Berkeley tagger, parser, aligner: Berkeley University
30
Chain web services: workflows [60 min]
31
Workflows
• Once we can run WS...
• …it’s time to chain them
• Workflows are process chains that combine
multiple WS and/or processors.
• We use Taverna 2.4 http://www.taverna.org.uk
- Documentation:
http://dev.mygrid.org.uk/wiki/display/taverna/Documentation+and+Videos
Quick start guide, videos, etc.
32
Workflow tutorial
• Find and run a workflow
• http://vimeo.com/28449833
• Building a workflow from scratch
• http://vimeo.com/28450024
33
Workflow Demos I
Web cleaner and anonymizer
http://myexperiment.elda.org/workflows/98
• Input: a list of URLs to process
– Example: a web article from www.fifa.com
1. ILSP Web cleaner and text extractor WS
2. UPF Anonymizer WS
– Internally calls Freeling NER WS (3rd party tool): interoperability
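The shape of this workflow (a list of URLs, each passed through a cleaner and then an anonymizer) can be sketched in Python. The real workflow is a Taverna workflow calling remote web services. Here `web_cleaner` and `anonymizer` are local stand-ins that only mark what they received, `run_workflow` is a hypothetical helper, and the `example.org` URLs are placeholders.

```python
from typing import Callable, List, Sequence

Step = Callable[[str], str]


def run_workflow(steps: Sequence[Step], inputs: List[str]) -> List[str]:
    """Apply every step, in order, to each item of the input list."""
    results = []
    for item in inputs:
        for step in steps:
            item = step(item)
        results.append(item)
    return results


def web_cleaner(url: str) -> str:
    # Stand-in for the ILSP web cleaner and text extractor WS.
    return f"cleaned({url})"


def anonymizer(text: str) -> str:
    # Stand-in for the UPF anonymizer WS.
    return f"anonymized({text})"


urls = ["http://example.org/article-1", "http://example.org/article-2"]
outputs = run_workflow([web_cleaner, anonymizer], urls)
for line in outputs:
    print(line)
```

Processing the input as a list mirrors the “lists” feature mentioned for Taverna, where one workflow run handles many files or URLs.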
34
Video: http://ws02.iula.upf.edu/panacea/examples/videos/Panacea_web_cleaner_and_anonymization_v01.mp4
Workflow Demos II
Creation of a bilingual dictionary
(only FR-EN)
– http://myexperiment.elda.org/workflows/93
– Input: Pairs of Basic Xces Documents
• English: http://nlp.ilsp.gr/panacea/Bilingual/data/20101222/LAB_EN_FR/www.ilo.org/1.xml
• French: http://nlp.ilsp.gr/panacea/Bilingual/data/20101222/LAB_EN_FR/www.ilo.org/191.xml
1. Sentence alignment: Hunalign (3rd party tool, interoperability)
2. PoS tagging: Treetagger (3rd party tool, interoperability)
3. Build phrase tables: Moses (3rd party tool, interoperability)
4. Bilingual dictionary extractor
Video: http://ws02.iula.upf.edu/panacea/examples/videos/Panacea_bilingual_dictionary_extraction_v01.mp4
35
Be part of the platform: creating and sharing web
services [30 min]
36
Deploy your tools as WS
• There are multiple solutions:
– Soaplab, CLAM, Apache Axis2, Apache CXF,
Spring
http://soaplab.sourceforge.net/soaplab2
• PANACEA can provide tips on setting up a
Soaplab2 server
37
Share your WS
in the Registry
• Make your web services public:
gain visibility and make your work useful to others
• http://panacea-lr.eu/en/tutorials
• How to register WS
• http://panacea-lr.eu/system/tutorials/How%20to%20Register%20services%20in%20Panacea-v4.pdf
38
Thank you
Questions?
39
Panacea Project Tutorial (MT Summit 2013)

  • 1. PANACEA Tutorial MT Summit 3rd September 2013 Marc Poch, Universitat Pompeu Fabra marc.pochriera@upf.edu Antonio Toral, Dublin City University atoral@computing.dcu.ie 1
  • 2. Tutorial Objectives • Become familiar with the web service paradigm and its advantages • Become familiar with the PANACEA platform • Learn how to search and use web services • Learn how to chain web services to build more complex processing pipelines (workflows) • Learn how to create and share your own web services 2
  • 3. Tutorial outline • Introduction to the PANACEA platform [30 min] • Tour of the PANACEA webs [15 min] • Find and try web services [45 min] • Interoperability [10 min] • Break [15 min] • Chain web services: workflows [60 min] • Be part of the platform: creating and sharing web services [30 min] 3
  • 4. Introduction to the PANACEA platform [30 min] 4
  • 5. PANACEA Project Objectives  Development of a platform (a space of interoperability defined by standardized protocols and common interfaces) for the easy integration of a variety of software components, tools and methodologies deployed as web services to configure a factory for the automation of acquisition, processing and annotation of language resources. 5
  • 6. 6 Tools to be shared remotely Web Service wrapper The Registry Common Interfaces Format Converters Workflow editor and engine Sharing workflows From local tools to sharing workflows
  • 7. Platform definition • The PANACEA platform is an interoperability space based on tools, guidelines, a Common Interface definition, and a “Travelling Object” specification Components: • Tools: Taverna, BioCatalogue, myExperiment, Soaplab, storage system • Common Interface: WS interoperability • Travelling Object: XCES, GrAF, CoNLL, LMF • Documentation Formal definition Technical Definition 7
  • 8. Clients: Java, Python, Perl, etc. Platform tools and portals 8 JAX-WS, Axis, CXF, etc. Workflows Social Network RegistryWeb Services Share tools (remotely run distributed tools) Share and find Web Services Call / chain Web Services Share and find workflows SOAP or REST Soaplab Server Biocatalogue Taverna www.taverna.org.uk PANACEA Registry: registry.elda.org PANACEA myExperiment: myexperiment.elda.org myExperiment PANACEA Platform: uses, adapts and improves myGrid tools for eScience (used in biology, social science, music, astronomy, multimedia and chemistry).
  • 9. Technological option: Web Services SOAPLAB 2 (SOAP) • Easy deployment of command line tools as WS. (Java, Python, C++, UIMA, etc. ) • Clients: Java, Python, Perl, Taverna, etc. • No coding needed! Only metadata • “Polling” techniques for long lasting tasks • Web form to run the web services • URL input / output ready • PANACEA improvement for SOAP messaging (network usage and memory) • PANACEA limit multiple users TAVERNA BioCatalogue Web Services Workflow editor Registry Social network myExperiment 9
  • 10. Technological option: Registry and myExperiment SOAPLAB 2 (SOAP) • User friendly GUI • Free, open source, Continuously maintained • Search function • Users rating (users feedback) • Service annotations and Language Categorization (PANACEA) • Monitoring system (web service status and data results) TAVERNA BioCatalogue Web Services Workflow editor Registry Social network myExperiment 10
  • 11. Technological option: Taverna SOAPLAB 2 (SOAP) • User friendly GUI • Free and open source • Continuously maintained (v. 2.4) • SOAP and REST web services • Credentials manger (passwords, certificates, etc.) • Multiple files processing (“lists”) • PANACEA Workflows, best practises, videos, etc. : • Parallelization, Error recovery: “retries”, Polling • PANACEA collaboration: bug fixing and pre-release tests TAVERNA BioCatalogue Web Services Workflow editor Registry Social network myExperiment 11
  • 12. 12 Service providers How can I share the tools on my server? Where can I make my Web Services public? JAX-WS, Axis, CXF, etc. Web Services SOAP or REST Soaplab Server Registry Biocatalogue PANACEA Registry: registry.elda.org
  • 13. 13 Can I run tools without installing them? Where can I find Web Services? How can I run Web Services? How can I chain WS? Web Services Registry -Java, Python, perl clients, etc. - Web clients: Soaplab Spinet -Java, Python, perl clients, etc. - Taverna workflows Workflows Users
  • 14. Tour of the PANACEA webs [15 min] 14
  • 16. The PANACEA web • Main Page • Link Buttons (Registry, myExperiment, tutorials and documentation) • Tutorials Page / Videos • Documentation • Deliverables http://panacea-lr.eu 16
  • 17. The Registry • The PANACEA Registry is a BioCatalogue instance (the source code has been used to deploy the registry on a server) • Features: • Annotation capabilities and categorization • Search function • Automatic status check system for web services http://registry.elda.org 17
  • 18. myExperiment • The PANACEA myExperiment is a myExperiment instance (the source code has been used to deploy it on a server) • Features: • Annotation capabilities • Search function • “Services tab beta” added to PANACEA myExperiment. Users can list web services from the Registry and see in which workflows have been used.  18
  • 19. Find and try web services [45 min] 19
  • 20. Registry tutorial • http://registry.elda.org • Global view of the Registry • Search engine • Categorization System • Metadata and documentation • Monitoring System • http://vimeo.com/24790416 20
  • 21. Calling Web Services • Spinet web client (Soaplab web services) • Taverna • SOAP, WSDL. Examples (perl, python) – http://ws02.iula.upf.edu/panacea/examples/soaplab-clients/soaplab_clients.zip • Soaplab command-line client sh $SOAPLAB_FOLDER/build/run/run-cmdline-client -protocol axis1 -e http://srv-cngl.computing.dcu.ie/panacea-soaplab2-axis/services/panacea.europarl_lowercase -w -r input_direct_data "ASDA" 21
  • 22. Spinet Tutorial • Spinet is the Soaplab web client used to test and run WS deployed on a Soaplab Server. • Every Service provider has (at least) a Soaplab Server • the Demo... - Access Spinet directly from the Registry “Test Form Location (Spinet Web Client):” - Configure mandatory parameters and RUN the WS - 10 minutes to try to find and run some web services. You can start from http://registry.elda.org 22
  • 23. Exercise Twitter NLP + Registry (3rd party tool)  • This web service is based on the Twitter NLP tool developed by Noah's ARK group. • Noah's ARK group is Noah Smith's research group at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University. 1. Search the WS in the Registry 2. Check monitoring system 3. Use web client with example data 23
  • 24. WS advantages (for users) • No installation • No maintenance • No machine resources • Easily found on the Registry • Usability • Can be combined in workflows (share experiments) 24
  • 26. Interoperability • Three levels of interoperability: – Communication protocols: SOAP, REST – Data – Parameters [Figure: Tools A, B and C chained, first with different formats (N, M, L), where Tool B does not “understand” format N, then with a shared format, where all tools understand the previous format. A second panel shows Tool A and Tool B with matching parameters (A B C D / A B C D) and with mismatched parameters (Y T Q Z / A B C D).]
  • 27. Common Interface • A Common Interface (CI) defines the mandatory parameters for every functionality, e.g. PoS tagging: – PoS Tagger A: MANDATORY: input language; OPTIONALS: Param A – PoS Tagger B: MANDATORY: input language; OPTIONALS: PARAM 1, PARAM 2 • http://panacea-lr.eu/en/info-for-professionals/documents/ • http://registry.elda.org
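The idea behind the Common Interface can be sketched as a simple conformance check: every tool offering a functionality must expose that functionality's mandatory parameters, while optional parameters may differ. This is an illustrative sketch only, not the PANACEA CI definition. It assumes the slide's "MANDATORY: input language" stands for two parameters named `input` and `language`, and the names `COMMON_INTERFACE`, `missing_mandatory` and the example taggers are hypothetical.

```python
# Mandatory parameters per functionality. The entry below is an assumption
# based on the slide ("MANDATORY: input language"), read here as two
# parameters named "input" and "language".
COMMON_INTERFACE = {
    "pos_tagging": {"input", "language"},
}


def missing_mandatory(functionality, tool_parameters):
    """Return the mandatory CI parameters that a tool does not expose."""
    return sorted(COMMON_INTERFACE[functionality] - set(tool_parameters))


if __name__ == "__main__":
    # Conforming tool: both mandatory parameters plus one optional one.
    tagger_a = ["input", "language", "param_a"]
    # Hypothetical non-conforming tool, included only to show a failure.
    tagger_x = ["input", "param_1", "param_2"]
    print(missing_mandatory("pos_tagging", tagger_a))  # []
    print(missing_mandatory("pos_tagging", tagger_x))  # ['language']
```

Because both taggers on the slide share the same mandatory parameters, one can be swapped for the other in a workflow without rewiring its required inputs.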
  • 28. Travelling Object • Travelling Object (TO): common data and metadata format used in PANACEA to make components interoperable • TO1: minimal common vertical in-line format used by deployed tools (based on the XCES standard) • TO2: stand-off format. Based on the GrAF standard, the XML serialization of LAF (ISO 24612, 2009) • LMF: for lexical resources • CoNLL: for parsers • Converters and adapted WS outputs
  • 29. Format Converters • Format converters on the PANACEA Registry • Freeling to TO. CNR http://registry.elda.org/services/207 • KAF to TO. CNR http://registry.elda.org/services/208 • Basic Xces to txt. CNR http://registry.elda.org/services/209 • PoS tag. (Freeling treetagger) to GrAF. UPF http://registry.elda.org/services/142 • Dependency parsing (Freeling) to GrAF. UPF http://registry.elda.org/services/197 • Dependency CoNLL to GrAF. CNR http://registry.elda.org/services/254 • Word doc to txt. UPF http://registry.elda.org/services/112 • In-house mwe to LMF. CNR http://registry.elda.org/services/296 • Pdf to text. UPF http://registry.elda.org/services/116 • Multi. encodings converter (ISO, UTF, etc.). UPF http://registry.elda.org/services/114 • Aligner to TO. DCU http://registry.elda.org/services/69 • Sentence alignment to TMX. DCU http://registry.elda.org/services/219 • Treetagger to MOSES (factored models). DCU http://registry.elda.org/services/275 • UIMA to GrAF. ILSP http://registry.elda.org/services/182 • Providers are encouraged to provide converters for the formats they are interested in
  • 30. 3rd party tools integration • The PANACEA WS wrapper (Soaplab) and the CI make it easy for WS providers to integrate 3rd party tools. • ILSP tools are UIMA tools (UIMA) • Freeling (UPC) • Treetagger (University of Stuttgart) • Twitter NLP (Carnegie Mellon University) • MALT Parser (Uppsala University) • DeSR (Università di Pisa) • MOSES, GIZA++, other aligners (Edinburgh, etc.) • DELiC4MT, MT evaluation (DCU) • Berkeley tagger, parser, aligner (Berkeley University)
  • 31. Chain web services: workflows [60 min]
  • 32. Workflows • Once we can run WS... • …it’s time to chain them • Workflows are process chains that combine multiple WS and/or processors. • We use Taverna 2.4 http://www.taverna.org.uk – Documentation: http://dev.mygrid.org.uk/wiki/display/taverna/Documentation+and+Videos (quick start guide, videos, etc.)
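The notion of a workflow as a process chain can be illustrated outside Taverna with plain function composition. The sketch below is not the Taverna API: Taverna builds workflows graphically, and the steps here (`tokenize`, `lowercase`) are hypothetical local stand-ins for remote web services. It only shows the principle that the output of each step becomes the input of the next.

```python
from functools import reduce


def chain(*steps):
    """Compose steps left to right: each output feeds the next step."""
    def run(data):
        return reduce(lambda value, step: step(value), steps, data)
    return run


# Hypothetical local stand-ins for remote web services.
def tokenize(text):
    return text.split()


def lowercase(tokens):
    return [token.lower() for token in tokens]


if __name__ == "__main__":
    workflow = chain(tokenize, lowercase)
    print(workflow("PANACEA Web Services"))  # ['panacea', 'web', 'services']
```

Chaining only works when each step accepts what the previous one produces, which is the problem the Common Interface, the Travelling Object and the format converters address.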
  • 33. Workflow tutorial • Find and run a workflow • http://vimeo.com/28449833 • Building a workflow from scratch • http://vimeo.com/28450024
  • 34. Workflow Demos I • Web cleaner and anonymizer http://myexperiment.elda.org/workflows/98 • Input: a list of URLs to process – Example: a web article from www.fifa.com 1. ILSP Web cleaner and text extractor WS 2. UPF Anonymizer WS – Internally calls Freeling NER WS (3rd party tool; interoperability) • Video: http://ws02.iula.upf.edu/panacea/examples/videos/Panacea_web_cleaner_and_anonymization_v01.mp4
  • 35. Workflow Demos II • Creation of a bilingual dictionary (only FR-EN) – http://myexperiment.elda.org/workflows/93 – Input: pairs of Basic Xces documents • English: http://nlp.ilsp.gr/panacea/Bilingual/data/20101222/LAB_EN_FR/www.ilo.org/1.xml • French: http://nlp.ilsp.gr/panacea/Bilingual/data/20101222/LAB_EN_FR/www.ilo.org/191.xml 1. Sentence alignment: Hunalign (3rd party tool; interoperability) 2. PoS tagging: Treetagger (3rd party tool; interoperability) 3. Build phrase tables: Moses (3rd party tool; interoperability) 4. Bilingual dictionary extractor • Video: http://ws02.iula.upf.edu/panacea/examples/videos/Panacea_bilingual_dictionary_extraction_v01.mp4
  • 36. Be part of the platform: creating and sharing web services [30 min]
  • 37. Deploy your tools as WS • There are multiple solutions: – Soaplab, CLAM, Apache Axis2, Apache CXF, Spring • http://soaplab.sourceforge.net/soaplab2 • PANACEA can provide tips on setting up a Soaplab2 server
  • 38. Share your WS in the Registry • Provide your web services publicly: gain visibility, make your work useful for others • http://panacea-lr.eu/en/tutorials • How to register WS • http://panacea-lr.eu/system/tutorials/How%20to%20Register%20services%20in%20Panacea-v4.pdf