Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The Services

Free Webinar on the Lynx Services Platform LySP: Architecture and basic Services

The main objective of the Lynx research and innovation project is to create an ecosystem of smart cloud services to better manage compliance, based on a Legal Knowledge Graph (LKG) which integrates and links multilingual and heterogeneous compliance data sources, including legislation, case law, standards, regulations and private contracts, among others.

This webinar will provide insights into all smart services of the Lynx Services Platform (LySP), including demos of these services, for instance: Named Entity Recognition (NER) by DFKI, Relation Extraction and Question-Answering by SWC, Machine Translation by Tilde, and the Lexicala cross-lingual lexical data service by K Dictionaries.

  1. 1. BUILDING THE LEGAL KNOWLEDGE GRAPH FOR SMART COMPLIANCE SERVICES IN MULTILINGUAL EUROPE http://lynx-project.eu/ Lynx - Compliance made easy Legal Knowledge Graph for Multilingual Compliance Services Webinar: Lynx Services Platform (LySP) - Part 2: The Services 18/02/2021, 10.30am-11.30am CET
  2. 2. Agenda • Introduction & the Lynx project - 5’ Martin Kaltenböck (Co-Founder and CFO of Semantic Web Company, SWC) • Lynx Services Platform: The Services - Introduction - 5’ Artem Revenko (Director of Research & Innovation, Semantic Web Company) • Lynx Services Platform: The Services in Detail - 40’ María Navas-Loro (Ontology Engineering Group, Artificial Intelligence Department, UPM), Julian Moreno (Researcher at DFKI), Ruben Martinez (Manager Customer Service, Tilde), Pablo Calleja (Ontology Engineering Group, Artificial Intelligence Department, UPM), Ilan Kernerman (CEO at Lexicala by K Dictionaries), Christian Sageder (CEO at Cybly). • Questions & Answers - 10’
  3. 3. The Lynx project ICT14-2016-2017 (IA) Innovation action Pillar: Industrial Leadership Work Programme Year: H2020-2016-2017 Work Programme Part: Information and Communication Technologies TOPIC : Big Data PPP: cross-sectorial and cross-lingual data integration and experimentation Duration: 40 months Start date: 1st December 2017 Estimated Project Cost: €3,638,065.00 Requested EU Contribution: €2,959,247.52 Project Officer: Johan BODENKAMP/Pierre-Paul SONDAG
  4. 4. Our Aim
  5. 5. Our Mission • Smart services to better manage compliance • LKG of European legal and regulatory open data • Multilingual and multi-jurisdictional data
  6. 6. Lynx Services 16 Services: • 7 Enrichment (5 Annotation, 2 Conversion) • 4 Search and Information Retrieval • 2 Vocabulary • 3 Platform https://lynx-project.eu/doc/api/
  7. 7. Annotation Services
  8. 8. Temporal Expression Service (1) Finds the following types of expressions: • DATE: April, 23/05, in 1998. • TIME: At 2 o’clock, 5pm. • SET: every Thursday, twice a month. • DURATION: two days and a half, three years. • INTERVALS (ongoing): From 3rd of April to 6th May.
  9. 9. Temporal Expression Service (2) Once the previous expressions are found, they are normalized. (...) In 1998 (1) it increased exponentially; that summer (2) (...) (1) → 1998 (2) → summer of 1998 (1998-SU)
  10. 10. Temporal Expression Service (3) Languages covered: ● Legal-focused rule-based approaches: ○ Spanish ○ English ○ German ● Standard external tool for: ○ Italian ○ Dutch For more information, please check: https://www.youtube.com/watch?v=6-CwPal2ArE
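
The find-then-normalize flow of the Temporal Expression Service can be illustrated with a minimal sketch. It is not the Lynx implementation: the two regular expressions, the season codes, and the `normalize` helper are assumptions made for illustration, and it only covers four-digit years and phrases of the form "that summer", resolved against the most recently seen year.

```python
import re

# Hypothetical patterns: a four-digit year and an anaphoric season ("that summer").
YEAR = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")
SEASON = re.compile(r"\bthat (spring|summer|autumn|fall|winter)\b", re.IGNORECASE)
# Season codes assumed here to follow the TIMEX3-style value shown on the slide (1998-SU).
SEASON_CODES = {"spring": "SP", "summer": "SU", "autumn": "FA", "fall": "FA", "winter": "WI"}


def normalize(text):
    """Return (surface form, normalized value) pairs in reading order."""
    matches = [(m.start(), "year", m) for m in YEAR.finditer(text)]
    matches += [(m.start(), "season", m) for m in SEASON.finditer(text)]
    matches.sort(key=lambda item: item[0])

    results = []
    last_year = None  # anchor used to resolve "that <season>"
    for _, kind, m in matches:
        if kind == "year":
            last_year = m.group(1)
            results.append((m.group(0), last_year))
        elif last_year is not None:
            code = SEASON_CODES[m.group(1).lower()]
            results.append((m.group(0), f"{last_year}-{code}"))
    return results


print(normalize("In 1998 it increased exponentially; that summer it peaked."))
```

A season with no preceding year is skipped in this sketch; a real service would need a document creation date or similar anchor to resolve it.
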
  11. 11. Named Entity Recognition Service • Four model families: • General Domain: • Statistical language models (EN, DE) • BERT-based Neural Networks (EN, DE, ES) • Legal Domain (DE): • Conditional Random Fields (CRF) • Bidirectional Long Short-Term Memory Neural Networks (BiLSTM) • Corpus: German court decisions • 67,000 sentences and 54,000 entities • 7 coarse-grained classes and 19 fine-grained classes
  12. 12. German Named Entity Corpus
  13. 13. Geolocation Service ● Three approaches: • Statistical language model • Trained with a specific German and English corpus • 17 fine-grained classes • Dictionary based approach • Spanish dictionary of companies • Rule-based approach • Regular expressions for recognition of addresses
  14. 14. English and German Geo. Corpus
  15. 15. Rule-based approach
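
As an illustration of what a regular expression for address recognition might look like, here is a minimal sketch for German-style street addresses. The pattern, the list of street suffixes, and the `find_addresses` helper are assumptions for illustration only; the actual rules used by the Geolocation Service are not given in the slides.

```python
import re

# Hypothetical pattern: "<Street><suffix> <number>, <postcode> <City>".
ADDRESS = re.compile(
    r"(?P<street>[A-ZÄÖÜ][a-zäöüß]+(?:straße|strasse|weg|platz|allee|gasse))\s+"
    r"(?P<number>\d+[a-z]?),\s*"
    r"(?P<postcode>\d{4,5})\s+"  # 5 digits in Germany, 4 in Austria
    r"(?P<city>[A-ZÄÖÜ][a-zäöüß]+)"
)


def find_addresses(text):
    """Return one dict per address found, keyed by the named groups above."""
    return [m.groupdict() for m in ADDRESS.finditer(text)]


print(find_addresses("Die Kanzlei sitzt in der Musterstraße 12, 10115 Berlin."))
```
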
  16. 16. Entity Linking Link a target (“Jaguar”) in a context to the correct entity in a knowledge base. Assumption: All senses of the target are present in the knowledge base. Usually suitable for large knowledge bases, for example DBpedia, WordNet. Relax assumption -> decide if a target should be linked to some entity in knowledge base. Suitable for smaller enterprise knowledge graphs.
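
The relaxed assumption (a target may have no matching entity in the knowledge base) can be sketched as a scoring step followed by a threshold. Everything here is hypothetical: the toy knowledge base, the word-overlap (Jaccard) score, and the threshold value stand in for whatever model the Lynx service actually uses.

```python
# Toy knowledge base: surface form -> {entity id: short description}. Illustrative only.
KB = {
    "jaguar": {
        "kb:Jaguar_Cars": "british manufacturer of luxury cars",
        "kb:Jaguar_animal": "large cat species native to the americas jungle predator",
    }
}


def link(target, context, kb, threshold=0.2):
    """Return the best-matching entity id, or None if no candidate is similar enough."""
    context_words = set(context.lower().split())
    best_id, best_score = None, 0.0
    for entity_id, description in kb.get(target.lower(), {}).items():
        desc_words = set(description.lower().split())
        # Jaccard overlap between the context and the entity description.
        score = len(context_words & desc_words) / len(context_words | desc_words)
        if score > best_score:
            best_id, best_score = entity_id, score
    # Relaxed assumption: below the threshold, the target is left unlinked.
    return best_id if best_score >= threshold else None


print(link("Jaguar", "the jaguar is a large cat that hunts in the jungle", KB))
print(link("Jaguar", "jaguar is the codename of our internal billing release", KB))
```

The second call illustrates the enterprise case: the sense in the context (an internal codename) is not in the knowledge base, so no link is produced.
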
  17. 17. Conversion Services
  18. 18. Machine Translation Service Language challenges in the digital environment
  19. 19. Machine Translation - Benefits 1. Internal & external multilingual communication: improve the organization's communication culture, from your internal team to speaking the language of the customer. 2. Increase translation productivity by 35%: provide immediate human-like translations and facilitate large-volume text translation. 3. Enter new markets: scale your business, move content quickly and enter new markets as fast as possible while reducing the time and capital spent on projects.
  20. 20. Machine Translation - in Lynx - External service: used directly from the most up-to-date cloud platform with Neural MT technology & terminology capabilities. Regular technological updates - Source Document, Text and Annotation translation - Use-case specific: contracts, labor law, renewable energy (trained on Lynx partners' documents and identified sources)
  21. 21. Extractive Summarization Service ● Selection of relevant sentences • TF-IDF • Encode documents and calculate weights for sentences using TF-IDF • Centroids and composability of word embeddings • Extract keywords and concepts • Compose embeddings • Create the document's centroid • Project sentences into the embedding space • Relevance scores (distance to centroid)
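
The TF-IDF variant of sentence selection can be sketched in a few lines. This is an illustrative sketch, not the Lynx service: it treats each sentence as its own "document" when computing IDF, averages the term weights per sentence, and the `summarize` name and tokenizer are assumptions.

```python
import math
import re
from collections import Counter


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    n = len(tokenized)
    # Document frequency: in how many sentences each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = sum(count * math.log(n / df[term]) for term, count in tf.items())
        # Average weight per token, so long sentences are not favoured by length alone.
        scores.append(total / len(tokens) if tokens else 0.0)

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


doc = [
    "The contract starts in May.",
    "The contract ends in June.",
    "Termination requires written notice and a penalty fee.",
]
print(summarize(doc, k=1))
```

The centroid approach on the slide follows the same select-the-top-sentences shape, but scores each sentence by its distance to the document centroid in embedding space instead of by TF-IDF weight.
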
  22. 22. Abstractive Summarization Service ● Based on Neural Networks and Transformer encoders
  23. 23. Search and Information Retrieval
  24. 24. Cross-Lingual Search (1) • Full-text search in multilingual corpora • APIs for • Add / delete Lynx Documents to / from the search index; a Lynx Document Part is its own document in the index • Search documents / parts • Possibility of complex search queries • AND, OR, NOT, MUST, NEAR, (), phrases • Filters for metadata • The search term will be translated to the language of the corpora based on the targeted jurisdiction
  25. 25. Cross-Lingual Search (2) Example: Maternity leave Spain AND Austria • Steps: detect language → detect jurisdiction(s) to query + language → create an AST (abstract syntax tree) → translate and expand query → query index • Services used to annotate the query: GEO, NER, EL, Translation, Dictionary, Terminology • Detected language: English • Detected jurisdictions: Austria, Spain • Jurisdictions to query: Austria, Spain, EU • Languages: German, Spanish, English • Translated terms: permiso por maternidad, Karenzzeit, Maternity leave • Expanded query: (licencia de maternidad AND metadata.jurisdiction:ES) OR (Karenzzeit AND metadata.jurisdiction:AT) OR (maternity leave AND metadata.jurisdiction:EU) OR
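
The translate-and-expand step of this example can be sketched as follows. The lookup tables and the `expand` helper are hypothetical stand-ins for the GEO, Translation and Terminology services, and the sketch skips AST construction by simply dropping boolean operators from the input; it only shows how one clause per jurisdiction is assembled.

```python
# Hypothetical lookup tables standing in for the annotation and translation services.
JURISDICTIONS = {"spain": "ES", "austria": "AT"}
TRANSLATIONS = {
    ("maternity leave", "ES"): "licencia de maternidad",
    ("maternity leave", "AT"): "Karenzzeit",
    ("maternity leave", "EU"): "maternity leave",
}
OPERATORS = {"AND", "OR", "NOT"}


def expand(query):
    """Build one (term AND jurisdiction) clause per jurisdiction and join them with OR."""
    words = query.split()
    # Detect the jurisdictions named in the query; EU is always queried as well.
    codes = [JURISDICTIONS[w.lower()] for w in words if w.lower() in JURISDICTIONS]
    codes.append("EU")
    # What remains after removing country names and operators is the search term.
    term = " ".join(
        w for w in words if w.lower() not in JURISDICTIONS and w not in OPERATORS
    ).lower()
    clauses = [
        f"({TRANSLATIONS[(term, code)]} AND metadata.jurisdiction:{code})"
        for code in codes
    ]
    return " OR ".join(clauses)


print(expand("Maternity leave Spain AND Austria"))
```
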
  26. 26. Search and Information Retrieval (1) http://lkg.lynx-project.eu/ • Web Portal & RESTful API • Relies on an Elasticsearch DCM • Manages parts of documents as independent documents
  27. 27. Search and Information Retrieval (2) Parameters of search query • words/terms • collection • jurisdiction • language • part of another document • rows • ... Evaluation • Gold standard created by Cuatrecasas • Spanish Workers' Statute document • 152 questions (en/es) with answers (sections) • Achieved >85% accuracy • Experimentation with: • stems, synonyms, term extraction
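
One way such a gold-standard evaluation could be computed is sketched below. The slides do not define the accuracy measure, so this assumes top-1 accuracy (the first returned section must equal the gold section); the question ids, section labels, and the `accuracy` helper are made up for illustration.

```python
def accuracy(gold, predicted):
    """Fraction of gold questions whose top-ranked section matches the gold section."""
    correct = sum(1 for qid, section in gold.items() if predicted.get(qid) == section)
    return correct / len(gold)


# Illustrative data only: question id -> section of the statute that answers it.
gold = {"q1": "Art. 48", "q2": "Art. 37", "q3": "Art. 14", "q4": "Art. 52"}
predicted = {"q1": "Art. 48", "q2": "Art. 37", "q3": "Art. 15", "q4": "Art. 52"}

print(accuracy(gold, predicted))
```
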
  28. 28. Vocabulary Services
  29. 29. Dictionary Services Domain-independent lexical data • formats: XML, JSON, JSON-LD • endpoints: SPARQL, Lexicala API • languages: Dutch | English | German | Spanish
  30. 30. Dictionary Services: Entry Components • headwords and expressions, inflections and variants • phonetic transcription (IPA) and alternative script • part of speech, grammatical gender and number • subcategorization and valency • definitions, sense indication and disambiguation • examples of usage • synonyms, antonyms, domains, context • range of application, register, sentiment, geo usage • translations
  31. 31. Dictionary Services: Sample Entry
  32. 32. Dictionary Services: RDF Pipeline (1) • data modelled with OntoLex, adhering to the lexicog module • XML → JSON → JSON-LD conversion pipeline • incremental approach • mapping XML paths to the corresponding Linked Data elements • URI naming strategy established • implementation of the model • validation
  33. 33. Dictionary Services: RDF Pipeline (2)
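
The idea of mapping XML paths to Linked Data elements can be shown with a minimal sketch that goes straight from XML to a JSON-LD dictionary. The XML element names, the `example.org` base URI, and the `entry_to_jsonld` helper are assumptions for illustration and do not reflect the actual K Dictionaries schema or URI naming strategy; only the OntoLex and SKOS terms are standard vocabulary.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical source entry; the real XML schema is not shown in the slides.
XML_ENTRY = """
<Entry id="bow-n">
  <Headword>bow</Headword>
  <Sense id="s1"><Definition>a weapon for shooting arrows</Definition></Sense>
  <Sense id="s2"><Definition>the front part of a ship</Definition></Sense>
</Entry>
"""

BASE = "http://example.org/LexiconEN/"  # placeholder base URI


def entry_to_jsonld(xml_text):
    """Map each XML path of one entry to an OntoLex property in a JSON-LD document."""
    entry = ET.fromstring(xml_text)
    entry_id = entry.get("id")
    return {
        "@context": {
            "ontolex": "http://www.w3.org/ns/lemon/ontolex#",
            "skos": "http://www.w3.org/2004/02/skos/core#",
        },
        "@id": BASE + entry_id,
        "@type": "ontolex:LexicalEntry",
        # Entry/Headword -> ontolex:canonicalForm / ontolex:writtenRep
        "ontolex:canonicalForm": {"ontolex:writtenRep": entry.findtext("Headword")},
        # Entry/Sense -> ontolex:sense, Entry/Sense/Definition -> skos:definition
        "ontolex:sense": [
            {
                "@id": f"{BASE}{entry_id}#{sense.get('id')}",
                "@type": "ontolex:LexicalSense",
                "skos:definition": sense.findtext("Definition"),
            }
            for sense in entry.findall("Sense")
        ],
    }


doc = entry_to_jsonld(XML_ENTRY)
print(json.dumps(doc, indent=2))
```
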
  34. 34. Dictionary Services: Sample Query • A response to querying all lexical senses linked to the RDF entry :LexiconEN/bow-n, gathering the information originating from the different homographs as well as from other resources in which bow is given as a translation. • The query currently results in 56 possible senses in different languages of bow as an English noun across the Global series.
  35. 35. Terminology Service ● Corpus-based terminologies per pilot: Labour Law, Contracts, Industrial Standards ● Multilingual, disambiguated knowledge retrieved from the LLOD ● Languages covered: ○ Dutch ○ English ○ German ○ Spanish Available at: http://lkg.lynx-project.eu/kos
  36. 36. Lynx Webinar Series • Webinar 1: Lynx overall introduction When: 10.12.2020, 10.30am CET (1 hour) Recording: https://youtube.com/playlist?list=PLxa__IZYjIaiGbl3a-PyK3DqNhhMdnnHv • Webinar 2: 3 Business Cases on top of the Lynx Legal Knowledge Graph When: 14.1.2021, 10.30am CET Recording: https://youtube.com/playlist?list=PLxa__IZYjIaiDL2O22ureD_nLmtgRq9LB • Webinar 3: The Lynx Services Platform (LySP) - Part 1: Overview When: 11/02/2021, 11.30am CET Recording: https://youtube.com/playlist?list=PLxa__IZYjIahhiSXoJbVyxv_iAliExH5e • Webinar 4: The Lynx Services Platform (LySP) - Part 2: The Services When: 18/02/2021, 10.30am CET Recording: https://youtube.com/playlist?list=PLxa__IZYjIaiv5MeV7uZsujv-MOi6SE-a
  37. 37. CONTACTS CONSORTIUM Please raise your questions now…. http://lynx-project.eu/
