Using Neural SPARQL Machines to translate an utterance into a structured query for question answering over the Linked Open Data cloud.
Invited talk at the 6th Leipzig Semantic Web Day (LSWT2018).
Translating Natural Language into SPARQL for Neural Question Answering
1. TRANSLATING NATURAL LANGUAGE INTO SPARQL FOR NEURAL QUESTION ANSWERING
Tommaso Soru
AKSW, University of Leipzig, Germany
6th Leipzig Semantic Web Day (LSWT2018), 18 June 2018
2. LINKED OPEN DATA
👍 >10K published datasets
👍 ~150B triples as (s, p, o)
👎 Low accessibility
lod-cloud.net
3. SPARQL QUERY LANGUAGE
SELECT ?x WHERE {
?x a ontology:Person .
?x ontology:birthPlace dbpedia:Leipzig .
}
dbpedia:Walter_Ulbricht
dbpedia:Anita_Berber
dbpedia:Martin_Benno_Schmidt
…
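To make the two triple patterns concrete, here is a minimal sketch that evaluates them over a toy in-memory set of triples. It is not a SPARQL engine; the helper names `subjects` and `people_born_in` are hypothetical, and the triples are limited to entities that appear in these slides.

```python
# Toy triple set: (subject, predicate, object) tuples using entities from the slides.
TRIPLES = {
    ("dbpedia:Walter_Ulbricht", "a", "ontology:Person"),
    ("dbpedia:Walter_Ulbricht", "ontology:birthPlace", "dbpedia:Leipzig"),
    ("dbpedia:Anita_Berber", "a", "ontology:Person"),
    ("dbpedia:Anita_Berber", "ontology:birthPlace", "dbpedia:Leipzig"),
    ("dbpedia:Joe_Abercrombie", "a", "ontology:Person"),
    ("dbpedia:Joe_Abercrombie", "ontology:birthPlace", "dbpedia:Lancaster"),
}


def subjects(predicate, obj):
    """Return every subject s such that (s, predicate, obj) is in the triple set."""
    return {s for (s, p, o) in TRIPLES if p == predicate and o == obj}


def people_born_in(place):
    """Bind ?x to subjects that satisfy both triple patterns of the query."""
    persons = subjects("a", "ontology:Person")
    born_there = subjects("ontology:birthPlace", place)
    return sorted(persons & born_there)


print(people_born_in("dbpedia:Leipzig"))
# ['dbpedia:Anita_Berber', 'dbpedia:Walter_Ulbricht']
```

The shared variable ?x acts as a join: only subjects matching both patterns are returned.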
4. NATURAL LANGUAGE TO SPARQL
SELECT ?x WHERE {
?x a ontology:Person .
?x ontology:birthPlace dbpedia:Leipzig .
}
people born in Leipzig
who was born in Leipzig?
Leipzig is the birth place of whom?
5. MODELING NATURAL LANGUAGE
• Model semantics at word and phrase level.
• Be robust to small imperfections (e.g., a missing article).
• Handle question compositionality.
• Work with all human languages.
Language Model using Recurrent Neural Networks!
9. THE GENERATOR
Build question-query pairs from a set of manually-annotated templates.
where was <A> born?
select var_x where brack_open <A> dbo_birthPlace var_x sep_dot brack_close
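The template above can be instantiated roughly as follows. This is a minimal sketch, not the NSpM implementation: the function `generate_pairs` and the entity table are hypothetical, and the `dbr_` token prefix is assumed from the encoding shown on the interpreter slide.

```python
# The template pair shown on the slide.
TEMPLATE_QUESTION = "where was <A> born?"
TEMPLATE_QUERY = (
    "select var_x where brack_open <A> dbo_birthPlace var_x sep_dot brack_close"
)

# Hypothetical entity table: surface label -> encoded resource token.
ENTITIES = {
    "Walter Ulbricht": "dbr_Walter_Ulbricht",
    "Anita Berber": "dbr_Anita_Berber",
}


def generate_pairs(question_template, query_template, entities):
    """Fill the <A> placeholder on both sides to obtain training pairs."""
    pairs = []
    for label, token in entities.items():
        question = question_template.replace("<A>", label)
        query = query_template.replace("<A>", token)
        pairs.append((question, query))
    return pairs


pairs = generate_pairs(TEMPLATE_QUESTION, TEMPLATE_QUERY, ENTITIES)
for question, query in pairs:
    print(question, "->", query)
```

Each template yields as many question-query pairs as there are matching entities, which is what makes a small set of hand-written templates sufficient to produce a large training set.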
10. CHALLENGE #1: TEMPLATE DISCOVERY
where was <A> born?
select var_x where brack_open <A> dbo_birthPlace var_x sep_dot brack_close
[…] Joe Abercrombie (born 1974) – fantasy writer and film editor, was born in Lancaster and attended LRGS […]
dbpedia:Joe_Abercrombie ontology:birthPlace dbpedia:Lancaster
Idea! Mine templates from a large text corpus using entity pairs.
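One way to picture the mining idea is the sketch below. It is a simplified illustration, not the method used in the talk: `mine_template` is a hypothetical helper that assumes exact string matches of the entity labels and that the subject is mentioned before the object.

```python
import re


def mine_template(sentence, subject_label, object_label):
    """Turn a sentence mentioning a known entity pair into a candidate pattern.

    Returns None when either label is absent or when the subject does not
    precede the object in the sentence.
    """
    if subject_label not in sentence or object_label not in sentence:
        return None
    pattern = sentence.replace(subject_label, "<A>").replace(object_label, "<B>")
    # Keep only the span from the subject placeholder to the object placeholder.
    match = re.search(r"<A>.*?<B>", pattern)
    return match.group(0) if match else None


sentence = (
    "Joe Abercrombie (born 1974) – fantasy writer and film editor, "
    "was born in Lancaster and attended LRGS"
)
candidate = mine_template(sentence, "Joe Abercrombie", "Lancaster")
print(candidate)
```

Because the pair (Joe Abercrombie, Lancaster) is linked by ontology:birthPlace in the knowledge base, the extracted pattern becomes a candidate natural-language expression of that predicate. Real text would need filtering to drop the noisy middle part of such patterns.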
12. CHALLENGE #2: WORD EXPANSION
How to deal with synonyms and out-of-vocabulary words?
Credits: github.com/ahaas/synonymvis
Distributional Semantics
Similar words are represented by similar vectors (or word embeddings).
The language model handles word disambiguation using context.
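The notion of "similar vectors" is usually measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings are learned from text and have hundreds of dimensions, and the `nearest` helper is hypothetical.

```python
import math

# Hypothetical toy embeddings; the values are invented for illustration only.
EMBEDDINGS = {
    "born": [0.9, 0.1, 0.0],
    "birthplace": [0.8, 0.2, 0.1],
    "capital": [0.0, 0.9, 0.4],
}


def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, embeddings):
    """Return the other vocabulary word whose vector is closest to `word`."""
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine(embeddings[word], embeddings[w]))


print(nearest("birthplace", EMBEDDINGS))
```

With vectors like these, a word that never occurred in a template can still be mapped close to one that did, which is how embeddings help with synonyms and out-of-vocabulary words.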
13. THE INTERPRETER
Sequence interpretation for SPARQL query reconstruction.
select var_x where brack_open var_x rdf_type dbo_Person sep_dot var_x dbo_birthPlace dbr_Leipzig
(missing brack_close)
SELECT ?x WHERE {
?x a ontology:Person .
?x ontology:birthPlace dbpedia:Leipzig
}
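The reconstruction step can be sketched as a token-by-token decoding followed by a simple repair of unbalanced braces. The mapping tables below are assumptions inferred from the encodings visible on the slides, not the actual NSpM interpreter, and the output uses the dbo:/dbr: prefixes rather than ontology:/dbpedia:.

```python
# Assumed decoding tables, inferred from the token names on the slides.
REPLACEMENTS = {
    "brack_open": "{",
    "brack_close": "}",
    "sep_dot": ".",
    "rdf_type": "a",
}
PREFIXES = {"var_": "?", "dbo_": "dbo:", "dbr_": "dbr:"}


def decode_token(token):
    """Map one encoded token back to its SPARQL surface form."""
    if token in REPLACEMENTS:
        return REPLACEMENTS[token]
    for encoded, decoded in PREFIXES.items():
        if token.startswith(encoded):
            return decoded + token[len(encoded):]
    return token


def interpret(sequence):
    """Decode a token sequence and append any missing closing braces."""
    tokens = [decode_token(token) for token in sequence.split()]
    missing = tokens.count("{") - tokens.count("}")
    tokens.extend("}" for _ in range(max(missing, 0)))
    return " ".join(tokens)


sequence = (
    "select var_x where brack_open var_x rdf_type dbo_Person sep_dot "
    "var_x dbo_birthPlace dbr_Leipzig"
)
print(interpret(sequence))
# select ?x where { ?x a dbo:Person . ?x dbo:birthPlace dbr:Leipzig }
```

The repair step matters because a sequence model can stop early; closing the open group keeps the output syntactically valid SPARQL.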
14. CHALLENGE #3: COMPOSITIONALITY
people born in Dresden
?x a ontology:Person .
?x dbo:birthPlace dbr:Dresden .
+
what’s the capital of Saxony?
dbr:Saxony dbo:capital ?x .
=
people born in the capital of Saxony
?x a ontology:Person .
?x dbo:birthPlace ?y .
dbr:Saxony dbo:capital ?y .
Learn the correct variable assignments in the reconstructed query.
Curriculum Learning
Learn to translate in baby steps.
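The target of the composition can be written out symbolically, which shows exactly which variable assignments the model has to get right. This sketch is only an illustration of the expected output; in the talk the neural model is meant to learn this behaviour, and the `compose` function is hypothetical.

```python
def compose(outer, inner, shared_entity, inner_var="?x", fresh_var="?y"):
    """Merge two lists of triple patterns into one composed query body.

    The entity the inner question resolves to is replaced by a fresh variable
    in the outer patterns, and the inner answer variable is renamed to match.
    """
    outer_rewritten = [
        tuple(fresh_var if term == shared_entity else term for term in triple)
        for triple in outer
    ]
    inner_rewritten = [
        tuple(fresh_var if term == inner_var else term for term in triple)
        for triple in inner
    ]
    return outer_rewritten + inner_rewritten


people_born_in_dresden = [
    ("?x", "a", "ontology:Person"),
    ("?x", "dbo:birthPlace", "dbr:Dresden"),
]
capital_of_saxony = [("dbr:Saxony", "dbo:capital", "?x")]

composed = compose(people_born_in_dresden, capital_of_saxony, "dbr:Dresden")
for triple in composed:
    print(" ".join(triple), ".")
```

Note that ?x in the inner query must become ?y; reusing ?x would wrongly ask for people who are themselves the capital of Saxony.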
15. CURRENT STATE
• Non-funded work
• Involving people from these institutes:
• AKSW, University of Leipzig
• HTWK / Leipzig University of Applied Sciences
• Paderborn University
• Bonn University
• DBpedia’s Google Summer of Code 2018
• Looking for partnerships!
16. Tommaso Soru
AKSW Research Group
University of Leipzig
Germany
tsoru@informatik.uni-leipzig.de
http://tommaso-soru.it
🤖 https://github.com/AKSW/NSpM
Thank you.