This presentation describes a Multilingual Retrieval Interface for Structured Data on the Web, a talk given at the NLIWoD workshop at ISWC 2014. The approach is based on Grammatical Framework (GF) and on Semantic Web and Linked Data technologies.
Guadalinfo is the Andalusian public network of citizen access centres for the information society, with more than 800 centres offering ICT services, digital skills training, and support for community projects to promote equal opportunities. Guadalinfo centres, run by Local Innovation Agents, provide digital training tailored to citizens' needs to improve their employability, participation and quality of life.
An approach for automated matching of Linked Open Data at the schema level, with very high-scoring evaluation results based on comparison with manually mapped schemata of major LOD datasets against the PROTON upper-level ontology.
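As a rough illustration of schema-level matching, a label-similarity baseline can be sketched in Python. The class names below are invented for illustration, and real LOD-to-PROTON matching would also exploit structure and instance data, not labels alone:

```python
from difflib import SequenceMatcher

def best_match(label, ontology_labels):
    """Return the upper-ontology label most similar to `label`.

    A toy label-similarity matcher: real schema matching also uses
    structure, instances and semantics, not just string overlap.
    """
    def sim(a, b):
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return max(ontology_labels, key=lambda o: sim(label, o))

# Hypothetical labels; PROTON-style class names used for illustration only.
proton = ["Person", "Organization", "Location", "Event", "Document"]
print(best_match("foaf:Person", proton))
print(best_match("dbo:Organisation", proton))
```

Even this naive baseline maps prefixed dataset classes onto plausible upper-level classes, which hints at why label evidence is a common first signal in schema matchers.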
This document proposes a system to help users formulate queries and select relevant search engines. It will:
1. Create a thematic user profile based on pages they visit to understand their interests.
2. Translate keyword queries into conceptual queries using the user's interests.
3. Suggest pairs of conceptual queries and relevant search engines by matching query concepts to search engine topics.
The system aims to improve on single keyword queries and selecting only Google by providing more conceptual, targeted suggestions. It will be evaluated by comparing its query/engine suggestions to Google's default suggestions.
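The concept-to-topic matching in step 3 could look, in a much simplified form, like the following Python sketch; the engine names and topic sets are invented for illustration, not taken from the proposed system:

```python
def suggest_engines(query_concepts, engine_topics, top_k=2):
    """Rank engines by Jaccard overlap between the query's concepts
    and the topics each engine covers. A simplified sketch of
    concept-based engine selection."""
    q = set(query_concepts)
    def jaccard(topics):
        t = set(topics)
        return len(q & t) / len(q | t) if q | t else 0.0
    ranked = sorted(engine_topics,
                    key=lambda e: jaccard(engine_topics[e]),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical engines and topic profiles.
engines = {
    "medline-search": ["medicine", "biology", "health"],
    "code-search":    ["programming", "software"],
    "general-web":    ["news", "sports", "health", "programming"],
}
print(suggest_engines(["health", "medicine"], engines))
```

A specialized engine whose topics overlap the query concepts outranks the general-purpose one, which is the behaviour the proposal aims for over always defaulting to a single engine.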
The document deals with the sands and gravels used in concrete and mortar. It explains that the standards define the requirements for aggregates, such as their size, shape and purity. It also describes the tests used to evaluate the characteristics of aggregates, and the effect of the grading curve on the strength of concrete. Finally, it proposes the design of a plant for the production of high-quality sand for mortar and concrete.
TSH Masterclass: you've got seed... now what? - TechMeetups
• What does a Series A mean? It is your first round with professional investors, with true financial targets and commitments.
• How to get from seed to A: deliver an MVP, build out your team and get customer traction
o MVP: the core of your business to prove the business
o Committed team: working together full-time and covering the key bases of the business
o Product/Market Fit: proving that you have built something that target clients will pay at a commercially viable rate
• When do you need a Series A?
o When you have proved the above
o 3-6 months before you need the money
o When you need to get to the next level
• Investors matter
o Money isn't just money... remember that you will be in a close relationship with these investors for years! They should be a resource to help grow your business. They should be people that you would really like to work with!
o They will generally follow-up in subsequent funding rounds and will have influence over your future equity
• Start early: start to casually network and pitch 3-6 months ahead of time. Investors will want loads of data, but try not to let it distract you from running your business (consider preparing this ahead of time or appointing someone to focus on it)
• Build a relationship: have trust for the road ahead and understand their firm (their investment thesis)
• When pitching, be UNIQUE and put the vision in context of your competitors
• Agree on milestones with your investors early. This is important to align the vision of success and motivate the team. Less direct milestones might be of equal importance. Examples can include: user conversion/engagement rates, team & product development.
• You have a Series A: now what?
o Know your runway: how much time do you have before you need to fundraise?
o Start investing to grow the business
o Communicate the milestones that you agreed with your investors with your team
Set a communication schedule and keep to it
Communicate loads!
o Think about your own role (will you be the person who leads the company into Series B and beyond, or should you start specializing?)
o Use your board meetings as working sessions to problem-solve, and automate your data reporting so that you don't spend too much time generating data just before the meeting
• After Series A: rounds B/C are for expansion and team building; C/D and later rounds are for growth, including expanding to new locales, product lines, etc.
BBC JUICER API Presentation - for SeedHack 4.0 - BBC News Labs
This document describes the Juicer Data and APIs provided by Ontoba. It summarizes that the Juicer APIs allow querying of over 500k news articles that have been semantically annotated with DBpedia concepts, events and storylines. It provides endpoints and examples for finding concepts, concept occurrences, co-occurrences, and searching or querying news articles semantically through SPARQL. Live examples and documentation links are also included.
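As a hedged sketch of how a client might compose such a semantic query, the Python snippet below builds a SPARQL co-occurrence query and URL-encodes it for an endpoint. The endpoint URL and the `mentions` property are assumptions for illustration, not the Juicer's actual vocabulary; consult its documentation for the real endpoints:

```python
from urllib.parse import urlencode

# Hypothetical endpoint; see the Juicer documentation for real URLs and auth.
ENDPOINT = "https://example.org/juicer/sparql"

def cooccurrence_query(concept_uri, limit=10):
    """Build a SPARQL query for concepts mentioned in the same
    articles as `concept_uri`. The property URIs are illustrative
    placeholders, not the Juicer's actual annotation vocabulary."""
    return f"""
    SELECT ?other (COUNT(?article) AS ?n) WHERE {{
      ?article <http://example.org/mentions> <{concept_uri}> .
      ?article <http://example.org/mentions> ?other .
      FILTER(?other != <{concept_uri}>)
    }} GROUP BY ?other ORDER BY DESC(?n) LIMIT {limit}
    """

query = cooccurrence_query("http://dbpedia.org/resource/London")
url = ENDPOINT + "?" + urlencode({"query": query})
print(url[:60])
```

The same pattern (compose query, encode, GET) applies to any SPARQL-over-HTTP endpoint, which is what makes the annotated article corpus queryable programmatically.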
Multilingual Access to Cultural Heritage Content on the Semantic Web - ACL 2013 - Mariana Damova, Ph.D.
The document discusses building an ontology-based application to communicate museum content in multiple languages on the Semantic Web. It aims to make cultural heritage accessible to both humans and computers by generating natural language descriptions from semantic data. The application uses Grammatical Framework to linearize multiple museum datasets and ontologies into 15 languages. It addresses challenges in cross-linguistically representing classes, properties, word order, tense, and reference. The system was demonstrated generating descriptions of paintings from the Louvre museum in English and French.
Grammatical Framework for implementing multilingual frames and constructions - Normunds Grūzītis
We propose Grammatical Framework, GF, as a unified formalism and a toolkit for implementing both computational frame semantic grammars and computational construction grammars, allowing for seamless combination of both perspectives. We show that such grammars, as well as lexicons, can be extracted systematically and, thus, largely automatically from the existing semi-formal framenets and constructicons by extending the existing GF resource grammar library. Moreover, we propose GF as a framework for implementing multilingual frame semantic and construction grammars, currently testing our approach on English and Swedish, as well as Russian.
Can Deep Learning Techniques Improve Entity Linking? - Julien PLU
Julien Plu presented on using deep learning techniques to improve entity linking. He discussed using word embeddings and neural networks to better recognize and link entities in documents by understanding semantic relationships between words. Current supervised methods require large training sets and are not robust to new entity types, while unsupervised methods have difficulties computing relatedness between candidate entities. Deep learning approaches may help address these issues through their ability to learn complex patterns from large amounts of unlabeled text data.
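The embedding-based ranking idea can be illustrated with a toy cosine-similarity ranker in Python; the vectors below are made up for illustration, whereas a real system would use trained word or entity embeddings:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_candidates(mention_vec, candidates):
    """Rank candidate entities by embedding similarity to the
    mention's context vector (toy stand-in for a learned linker)."""
    return sorted(candidates,
                  key=lambda c: cosine(mention_vec, candidates[c]),
                  reverse=True)

# Invented vectors: context of "Paris" in a travel article vs. two candidates.
context = [0.9, 0.1, 0.0]
candidates = {
    "Paris_(France)": [0.8, 0.2, 0.1],
    "Paris_Hilton":   [0.1, 0.9, 0.3],
}
print(rank_candidates(context, candidates)[0])
```

Because similarity is computed in a shared vector space rather than against labelled examples, this kind of ranking does not need retraining for every new entity type, which is part of the appeal discussed in the talk.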
The ability to take data, understand it, visualize it and extract useful information from it is becoming a hugely important skill. How can you turn all those logs, histories of purchases and trades, or open government data, into useful information that helps your business make money?
In this talk, we'll look at doing data science using F#. The F# language is perfectly suited for this task: type providers integrate external data directly into the language, so your language suddenly _understands_ CSV, XML, JSON, REST services and other sources. The interactive development style makes it easy to explore data and test your algorithms as you're writing them. A rich set of libraries for working with data frames, time series and visualization gives you all the tools you need. And finally, F# easily integrates with statistical environments like R and Matlab, giving you access to industry-standard libraries.
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything... - Enrico Daga
Slides of the presentation at #ENDORSE2023
The SPARQL Anything project: http://sparql-anything.cc
Endorse Conference 2023, see
https://twitter.com/EULawDataPubs/status/1635663471349223425
--
Abstract:
What should a data integration framework for knowledge graph experts look like?
Existing approaches transform non-RDF data sources by applying ad-hoc transformations to existing ontologies (Any23), by using a mapping language (RML), or by expanding on existing standards with custom operators (SPARQL Generate). These solutions result either in code that is difficult to maintain and reuse, or require KG experts to learn a variety of languages and custom tools. Recent research on knowledge graph construction proposes the design of a façade, a notion borrowed from object-oriented software engineering. This idea is applied in SPARQL Anything, a system that allows querying heterogeneous resources as if they were in RDF, in standard SPARQL 1.1.
The SPARQL Anything project supports a wide variety of file formats, from popular ones (CSV, JSON, XML, Spreadsheets) to others that are not supported by alternative solutions (Markdown, YAML, DOCx, Bibtex). Features include querying Web APIs with high flexibility, parametrized queries, and chaining multiple transformations into complex pipelines.
We describe the design rationale of the SPARQL Anything system and its application in two EU-funded projects and in the industry. We provide references to an extensive set of reusable showcases. We report on the value-to-users of the founding assumptions of SPARQL Anything, compared to alternative solutions to knowledge graph construction.
Information-Rich Programming in F# with Semantic Data - Steffen Staab
Programming with rich data frequently implies that one needs to search for, understand, integrate and program with new data, with each of these steps constituting a major obstacle to successful data use.
In this talk we will explain and demonstrate how our approach, LITEQ (Language Integrated Types, Extensions and Queries for RDF Graphs), which is realized as part of the F# / Visual Studio environment, supports the software developer. Using the extended IDE the developer may now:
a. explore new, previously unseen data sources, which are either natively in RDF or mapped into RDF;
b. use the exploration of schemata and data in order to construct types and objects in the F# environment;
c. automatically map between data and programming-language objects in order to make them persistent in the data source;
d. have extended typing functionality added to the F# environment, resulting from the exploration of the data source and its mapping into F#.
Core to this approach is the novel node path query language, NPQL, which allows for interactive, intuitive exploration of data schemata and the data proper, as well as for the mapping and definition of types, object collections and individual objects. Beyond the existing type provider mechanism for F#, our approach also allows for property-based navigation and runtime querying for data objects.
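In the spirit of NPQL's property-based navigation (though not its syntax), the idea of following properties step by step through a graph can be sketched in Python over an in-memory triple set; the data is invented for illustration:

```python
# A minimal sketch of property-based navigation over an in-memory
# set of (subject, property, object) triples; illustrative data only.
triples = {
    ("ex:Alice", "ex:knows",   "ex:Bob"),
    ("ex:Bob",   "ex:knows",   "ex:Carol"),
    ("ex:Alice", "ex:worksAt", "ex:ACME"),
}

def navigate(nodes, prop):
    """Follow `prop` edges from each node in `nodes`,
    returning the set of reached objects."""
    return {o for s, p, o in triples for n in nodes if s == n and p == prop}

# Chaining two navigation steps: Alice -> knows -> knows.
friends_of_friends = navigate(navigate({"ex:Alice"}, "ex:knows"), "ex:knows")
print(friends_of_friends)
```

Each navigation step maps a node collection to a node collection, which is what makes this style of exploration composable and amenable to interactive use in an IDE.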
Knowledge graph construction with a façade - The SPARQL Anything Project - Enrico Daga
The document discusses a project called "SPARQL Anything" which aims to simplify knowledge graph construction by using SPARQL as the single language for representing and transforming diverse data formats into RDF. It presents an approach called "Facade-X", which defines a common RDF structure that can be applied over different formats like CSV, JSON, HTML, etc. This façade focuses on the RDF meta-model and aims to make minimal ontological commitments. The document outlines how Facade-X can be used to represent different formats and provides examples of using SPARQL to transform sample data into RDF without committing to a specific domain ontology.
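A simplified Python sketch of the Facade-X idea: containers become nodes, keys become predicates, and list items use RDF container-membership properties. The `urn:key:` predicate namespace and the blank-node labels are illustrative assumptions, not SPARQL Anything's exact output:

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def facade_x(value, subject="_:root", triples=None):
    """Map a JSON-like value to Facade-X-style triples: dicts and
    lists become container nodes, dict keys become predicates, and
    list items use rdf:_1, rdf:_2, ... membership properties.
    A sketch of the idea, not the tool's actual serialization."""
    if triples is None:
        triples = []
    if isinstance(value, dict):
        for key, v in value.items():
            emit(subject, "urn:key:" + key, v, triples)
    elif isinstance(value, list):
        for i, v in enumerate(value, start=1):
            emit(subject, f"{RDF}_{i}", v, triples)
    return triples

def emit(subject, predicate, v, triples):
    """Emit one triple; recurse into nested containers via fresh
    blank nodes (labelled by current triple count, for illustration)."""
    if isinstance(v, (dict, list)):
        node = f"_:b{len(triples)}"
        triples.append((subject, predicate, node))
        facade_x(v, node, triples)
    else:
        triples.append((subject, predicate, v))

doc = {"name": "alice", "tags": ["a", "b"]}
for t in facade_x(doc):
    print(t)
```

The point of the façade is visible even in this sketch: once any format is projected into this one container-shaped RDF structure, plain SPARQL suffices to query and reshape it.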
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab... - MeetupDataScienceRoma
The document covers Roberto Navigli's talk on BabelNet, a multilingual semantic network created by merging various knowledge resources, and on Babelfy, a state-of-the-art multilingual word sense disambiguation system that leverages BabelNet. The talk outlines BabelNet and Babelfy, demonstrates Babelfy for word sense disambiguation, and discusses how annotating with BabelNet provides combined coverage of the numerous resources it merges.
Summer School LD4SC 2015 - RDF(S) and SPARQL - Pieter Pauwels
Presentation about RDF(S) and SPARQL at the Summer School on Linked Data 4 Smart Cities, in Cercedilla (2015): http://smartcity.linkeddata.es/LD4SC/, organised by the Ontology Engineering Group of Universidad Politécnica de Madrid.
The document provides an introduction to ontology development and the Semantic Web. It discusses the history and development of the Semantic Web, from Sir Tim Berners-Lee's initial creation of the World Wide Web to recent applications like Siri. It also outlines the layered "cake" architecture of the Semantic Web, describing standards and technologies at each layer like RDF, OWL, and SPARQL that add structure and meaning to data on the Web. The document raises questions about the challenges of fully realizing the Semantic Web's potential for automated reasoning and concludes that while progress has been made, many areas remain the subject of ongoing research.
DM2E Content (Doron Goldfarb – ONB Austrian National Library) at Enabling humanities research in the Linked Open Web – DM2E final event (11 December 2014, Navacchio, Italy)
Keynote: new convergences between natural language processing and knowledge ... - semanticsconference
This document discusses the evolution of natural language processing (NLP) and knowledge engineering (KE) and their convergence in artificial intelligence. It outlines how deep learning is increasingly being used for NLP tasks like representation learning and distributional semantics. It also discusses semantic relations and challenges in extracting relations from text using NLP and KE techniques.
ISOcat and RELcat, two cooperating semantic registries - Menzo Windhouwer
M. Windhouwer, I. Schuurman. ISOcat and RELcat, two cooperating semantic registries. At the 24th Meeting of Computational Linguistics in the Netherlands (CLIN 24), Leiden, The Netherlands, January 17, 2014.
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a... - Julien PLU
The document presents research on developing an adaptive entity linking system called ADEL. It discusses 6 problems in entity linking and proposes research questions to address adaptivity to different text, entity types, knowledge bases, and languages. It describes ADEL's modular framework including extraction, linking, and pruning modules. Evaluation shows ADEL achieves state-of-the-art results on multiple datasets. Future work focuses on knowledge base and language adaptivity, improving the system, and engineering a distributed architecture.
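ADEL's modular extract-link-prune shape can be sketched as a toy pipeline; the module internals below are placeholders for illustration, not ADEL's actual algorithms or knowledge base:

```python
# A sketch of a modular entity-linking pipeline (extract -> link -> prune);
# each stage is a stand-in for ADEL's real, far more sophisticated modules.
def extract(text):
    """Toy extractor: capitalized tokens become candidate mentions."""
    return [w.strip(".,") for w in text.split() if w[0].isupper()]

def link(mentions, kb):
    """Toy linker: look each mention up in a knowledge-base dict."""
    return {m: kb.get(m) for m in mentions}

def prune(links):
    """Drop mentions with no knowledge-base entry (NIL candidates)."""
    return {m: e for m, e in links.items() if e is not None}

# Hypothetical mini knowledge base with DBpedia-style identifiers.
kb = {"Paris": "dbr:Paris", "Seine": "dbr:Seine"}
text = "The Seine flows through Paris at dusk."
print(prune(link(extract(text), kb)))
```

Keeping the stages separate is what the paper means by a modular framework: each module can be swapped (for a new language, entity type, or knowledge base) without rewriting the others.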
The Datalift Project aims to publish and interconnect government open data. It develops tools and methodologies to transform raw datasets into interconnected semantic data. The project's first phase focuses on opening data by developing an infrastructure to ease publication. The second phase will validate the platform by publishing real datasets. The goal of Datalift is to move data from its raw published state to being fully interconnected on the Semantic Web.
This presentation gives insight into the overall Horizon 2020 Programme, and more specifically into the period 2018-2020, with emphasis on ICT. Mariana Damova is the National Contact Point for Horizon 2020 ICT in Bulgaria.
Geography of Letters - The Spirituality of Sofia in the Historic Memory - Mariana Damova, Ph.D.
Presentation of the project The Spirituality of Sofia in the Historic Memory, given at the Round Table on future perspectives for Digital Humanities in SEE, within the Summer School in Advanced Tools for Digital Humanities and IT.
The document describes IndustryInform, a semantic-based search and recommendation service for business networking. It allows industrial enterprises to advertise themselves, helps potential clients and investors find matching businesses, and provides a Data-as-a-Service (DaaS) facility through annual subscriptions or pay-per-query plans. The service uses Semantic Web technologies and Linked Data to power searches across a database of over 50 million information units about 300,000 companies in 7 countries. Its features include extended search, company/user registration, and results displayed in table or Google-like formats. The system was developed by Mozaika's Humanizing Technologies Lab and is managed by an engineering team, a graphic designer, and a business/marketing team.
Can Deep Learning Techniques Improve Entity Linking?Julien PLU
Julien Plu presented on using deep learning techniques to improve entity linking. He discussed using word embeddings and neural networks to better recognize and link entities in documents by understanding semantic relationships between words. Current supervised methods require large training sets and are not robust to new entity types, while unsupervised methods have difficulties computing relatedness between candidate entities. Deep learning approaches may help address these issues through their ability to learn complex patterns from large amounts of unlabeled text data.
The ability to take data, understand it, visualize it and extract useful information from it is becoming a hugely important skill. How can you turn all those logs, histories of purchases and trades or open government data, into useful information that help your business make money?
In this talk, we’ll look at doing data science using F#. The F# language is perfectly suited for this task – type providers integrate external data directly into the language – your language suddenly _understands_ CSV, XML, JSON, REST services and other sources. The interactive development style makes it easy to explore data and test your algorithms as you’re writing them. Rich set of libraries for working with data frames, time series and for visualization gives you all the tools you need. And finally – F# easily integrates with statistical environments like R and Matlab, giving you access to the industry standard libraries.
Streamlining Knowledge Graph Construction with a façade: the SPARQL Anything...Enrico Daga
Slides of the presentation at #ENDORSE2023
The SPARQL Anything project: http://sparql-anything.cc
Endorse Conference 2023, see
https://twitter.com/EULawDataPubs/status/1635663471349223425
--
Abstract:
What should a data integration framework for knowledge graph experts look like?
Approaches can transform the non-RDF data sources by applying ad-hoc transformations to existing ontologies (Any23), using a mapping language (RML) or expanding on existing standards with custom operators (SPARQL Generate). These solutions result either in code that is difficult to maintain and reuse or require KG experts to learn a variety of languages and custom tools. Recent research on Knowledge Graph construction proposes the design of a façade, a notion borrowed from object-oriented software engineering. This idea is applied to SPARQL Anything, a system that allows querying heterogeneous resources as if they were in RDF, in standard SPARQL 1.1.
The SPARQL Anything project supports a wide variety of file formats, from popular ones (CSV, JSON, XML, Spreadsheets) to others that are not supported by alternative solutions (Markdown, YAML, DOCx, Bibtex). Features include querying Web APIs with high flexibility, parametrized queries, and chaining multiple transformations into complex pipelines.
We describe the design rationale of the SPARQL Anything system and its application in two EU-funded projects and in the industry. We provide references to an extensive set of reusable showcases. We report on the value-to-users of the founding assumptions of SPARQL Anything, compared to alternative solutions to knowledge graph construction.
Information-Rich Programming in F# with Semantic DataSteffen Staab
Programming with rich data frequently implies that one
needs to search for, understand, integrate and program with
new data - with each of these steps constituting a major
obstacle to successful data use.
In this talk we will explain and demonstrate how our approach,
LITEQ - Language Integrated Types, Extensions and Queries for
RDF Graphs, which is realized as part of the F# / Visual Studio-
environment, supports the software developer. Using the extended
IDE the developer may now
a. explore new, previously unseen data sources,
which are either natively in RDF or mapped into RDF;
b. use the exploration of schemata and data in order to
construct types and objects in the F# environment;
c. automatically map between data and programming language objects in
order to make them persistent in the data source;
d. have extended typing functionality added to the F#
environment and resulting from the exploration of the data source
and its mapping into F#.
Core to this approach is the novel node path query language, NPQL,
that allows for interactive, intuitive exploration of data schemata and
data proper as well as for the mapping and definition
of types, object collections and individual objects.
Beyond the existing type provider mechanism for F#
our approach also allows for property-based navigation
and runtime querying for data objects.
Knowledge graph construction with a façade - The SPARQL Anything ProjectEnrico Daga
The document discusses a project called "SPARQL Anything" which aims to simplify knowledge graph construction by using SPARQL as the single language for representing and transforming diverse data formats into RDF. It presents an approach called "Facade-X" which defines a common RDF structure that can be applied over different formats like CSV, JSON, HTML, etc. This facade focuses on the RDF meta-model and aims to apply minimal ontological commitments. The document outlines how Facade-X can be used to represent different formats and provides examples of using SPARQL to transform sample data into RDF without committing to a specific domain ontology.
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...MeetupDataScienceRoma
The document discusses Roberto Navigli giving a talk on BabelNet, a multilingual semantic network created by merging various knowledge resources, and Babelfy, a state-of-the-art multilingual word sense disambiguation system that leverages BabelNet. The talk outlines BabelNet and Babelfy, demonstrates Babelfy for word sense disambiguation, and discusses how working with BabelNet provides coverage of numerous resources by annotating with their combined knowledge.
The ability to take data, understand it, visualize it and extract useful information from it is becoming a hugely important skill. How can you turn all those logs, histories of purchases and trades or open government data, into useful information that help your business make money?
In this talk, we’ll look at doing data science using F#. The F# language is perfectly suited for this task – type providers integrate external data directly into the language – your language suddenly _understands_ CSV, XML, JSON, REST services and other sources. The interactive development style makes it easy to explore data and test your algorithms as you’re writing them. Rich set of libraries for working with data frames, time series and for visualization gives you all the tools you need. And finally – F# easily integrates with statistical environments like R and Matlab, giving you access to the industry standard libraries.
Summer School LD4SC 2015 - RDF(S) and SPARQLPieter Pauwels
Presentation about RDF(S) and SPARQL at the Summer School on Linked Data 4 Smart Cities, in Cercedilla (2015): http://smartcity.linkeddata.es/LD4SC/, organised by the Ontology Engineering Group of Universidad Politécnica de Madrid.
The document provides an introduction to ontology development and the Semantic Web. It discusses the history and development of the Semantic Web, from Sir Tim Berners-Lee's initial creation of the World Wide Web to recent applications like Siri. It also outlines the layered "cake" architecture of the Semantic Web, describing standards and technologies at each layer like RDF, OWL, and SPARQL that add structure and meaning to data on the Web. The document raises questions about the challenges of fully realizing the Semantic Web's potential for automated reasoning and concludes that while progress has been made, many areas remain the subject of ongoing research.
DM2E Content (Doron Goldfarb – ONB Austrian National Library) at Enabling humanities research in the Linked Open Web – DM2E final event (11 December 2014, Navacchio, Italy)
Keynote new convergences between natural language processing and knowledge ...semanticsconference
This document discusses the evolution of natural language processing (NLP) and knowledge engineering (KE) and their convergence in artificial intelligence. It outlines how deep learning is increasingly being used for NLP tasks like representation learning and distributional semantics. It also discusses semantic relations and challenges in extracting relations from text using NLP and KE techniques.
ISOcat and RELcat, two cooperating semantic registries (Menzo Windhouwer)
M. Windhouwer, I. Schuurman. ISOcat and RELcat, two cooperating semantic registries. At the 24th Meeting of Computational Linguistics in the Netherlands (CLIN 24), Leiden, The Netherlands, January 17, 2014.
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a... (Julien PLU)
The document presents research on developing an adaptive entity linking system called ADEL. It discusses 6 problems in entity linking and proposes research questions to address adaptivity to different text, entity types, knowledge bases, and languages. It describes ADEL's modular framework including extraction, linking, and pruning modules. Evaluation shows ADEL achieves state-of-the-art results on multiple datasets. Future work focuses on knowledge base and language adaptivity, improving the system, and engineering a distributed architecture.
The Datalift Project aims to publish and interconnect government open data. It develops tools and methodologies to transform raw datasets into interconnected semantic data. The project's first phase focuses on opening data by developing an infrastructure to ease publication. The second phase will validate the platform by publishing real datasets. The goal of Datalift is to move data from its raw published state to being fully interconnected on the Semantic Web.
This presentation gives insight into the overall Horizon 2020 Programme, and more specifically into the period 2018-2020, with emphasis on ICT. Mariana Damova is the National Contact Point for Horizon 2020 ICT in Bulgaria.
Geography of Letters - The Spirituality of Sofia in the Historic Memory (Mariana Damova, Ph.D)
Presentation of the project The Spirituality of Sofia in the Historic Memory at the Round table on the future perspectives for Digital humanities in SEE within the Summer School in Advanced Tools for Digital Humanities and IT
The document describes IndustryInform, a semantic-based search and recommendation service for business networking. It allows industrial enterprises to advertise themselves, helps potential clients and investors find matching businesses, and provides a data as a service facility (DaaS) through annual subscriptions or pay-per-query plans. The service uses semantic web technologies and linked data to power searches across a database of over 50 million information units about 300,000 companies in 7 countries. It has features like extended search, company/user registration, and results displayed in table or Google-like formats. The system was developed by Mozaika's Humanizing Technologies Lab and has an engineering team, graphic designer, and business/marketing team to manage it.
Mozaika is a research center and SME operating since 2013 in the areas of data science, natural language interfaces, and human insight. It provides consulting, R&D projects, and data as a service solutions tailored to human behavior. Mozaika has expertise in semantic technologies, cognitive systems, and multimodal interactivity. It has completed projects in business networking, human resources management, cultural heritage, education, and aerospace with clients and partners from both private companies and research organizations.
This document summarizes Mozaika, a research center focused on humanizing technologies. It discusses technologies that make emerging technologies more understandable and give people more control, including reducing data complexity through semantic technologies. It provides examples of Mozaika's projects involving skills matching, city experience summarization, satellite communications, linked open data, and e-publishing. The goal is for technology to better support and enhance humanity.
Communication Channels for the European Single Digital Market (Mariana Damova, Ph.D)
Presentation about the importance of tackling multilinguality in the strategy agenda for the European Digital Single Market, and about the role of language technology and the European language technology community, backed by public funding, in solving this issue.
This presentation targets leaders of cultural institutions in Bulgaria, informing them about the opportunities to publish cultural content in Bulgariana and in Europeana, and about the benefits of doing so.
This document discusses humanizing technologies and trends in developing technologies that are more human-centric. It provides examples of technologies being developed by Mozaika, a research center, to better integrate technologies into human lives in natural ways. Mozaika is working on projects involving summarization, skills matching, information management, publishing, and more using techniques like natural language processing, sentiment analysis, and semantic technologies.
Presentation held at a meeting in Bulgaria (Varna Regional Library), co-organized by Europeana, Bulgariana, the Varna Regional Library and BBIA, about Europeana.
This presentation is an overview of the Bulgarian participation in the virtual museum Europeana, and the path of establishing a National Aggregator to Europeana.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data (Kiwi Creative)
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You... (Aggregage)
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
End-to-end pipeline agility - Berlin Buzzwords 2024 (Lars Albertsson)
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long does it take for all downstream pipelines to be adapted to an upstream change?", the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
2. Semantic Data Infrastructures
[Diagram: a SPARQL query over an RDF repository: Leonardo, of rdf:type :Painter, :painted the painting "Mona Lisa", of rdf:type :Painting; the painter is the unknown being asked for.]
• Semantic Web
• Linked data
• SPARQL query language
10/19/2014
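The question in the diagram can be put to the repository as a SPARQL query. A minimal sketch, assuming a hypothetical namespace for the :Painter, :Painting and :painted terms shown on the slide:

```sparql
# Hypothetical base URI; the slide leaves the vocabulary's namespace unspecified.
PREFIX : <http://example.org/painting#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?painter WHERE {
  ?painter  rdf:type :Painter ;
            :painted ?painting .
  ?painting rdf:type :Painting ;
            rdfs:label "Mona Lisa" .
}
```

Against the toy repository of the diagram, ?painter would bind to Leonardo.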
3. Natural Interface through NL
Q:
EN: Who painted Mona Lisa? / Who is Mona Lisa’s painter? / Who created Mona Lisa?
FR: Qui a peint Mona Lisa ? / Qui est le peintre de Mona Lisa ? / Qui a créé Mona Lisa ?
DE: Wer hat Mona Lisa gemalt? / Wer ist der Maler von Mona Lisa? / Wer hat Mona Lisa geschaffen?
A: Leonardo da Vinci
NL to ontology interoperability
5. GF – Grammatical Framework
•Type-theoretical grammar formalism supporting multilingual applications
•Two-layered architecture
–Abstract syntax - semantics
–Concrete syntax – language-dependent surface structure
Abstract syntax:
cat NP, VP, S ;
fun Mary, John : NP ;
fun Love : NP -> VP ;
fun Pred : NP -> VP -> S ;

Concrete English syntax:
lincat NP, VP, S = {s : Str} ;
lin Mary = {s = "Mary"} ;
lin John = {s = "John"} ;
lin Love o = {s = "loves" ++ o.s} ;
lin Pred sub v = {s = sub.s ++ v.s} ;

Concrete syntax using the RGL:
lin Mary = mkNP (mkPN "Mary") ;
lin John = mkNP (mkPN "John") ;
lin Love o = mkVP (mkV2 "love") o ;
lin Pred sub v = mkS (mkCl sub v) ;
Ex: John loves Mary
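With the grammar above compiled, the GF shell can both parse the example sentence and linearize trees back to text. A sketch of such a session (the file name is an assumption):

```gf
$ gf GrammarEng.gf
> parse "John loves Mary"
Pred John (Love Mary)
> linearize Pred Mary (Love John)
Mary loves John
```

The same abstract tree, linearized through a French or German concrete syntax instead, yields the corresponding translation, which is what makes the two-layered architecture multilingual.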
7. YAQL (Yet Another Query language)
•A common architecture with one base module and domain knowledge representation
•Straightforward abstract syntax generation from the ontology, with just a minimum of lexical types
–Common noun – Kind
–Noun phrase – Entity
–Verb phrase – Property
–Verb phrase with higher arity – Relation
•Reusable generic grammar structure
Abstract syntax:
cat Move ; Query ;
fun MQuery : Query -> Move ;

Concrete syntax:
lincat Move = Utt ;
lincat Query = QS ;
[Diagram: the kinds of Move: Query, Command, Answer]
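The Kind/Entity/Property mapping above can be sketched as a generated abstract syntax module; the names below are purely illustrative, not taken from the actual YAQL grammars:

```gf
abstract PaintingQuery = {
  cat Kind ; Entity ; Property ; Query ;
  fun
    PaintingK : Kind ;                      -- common noun, from an owl:Class
    MonaLisaE : Entity ;                    -- noun phrase, from an individual
    PaintedP  : Entity -> Property ;        -- verb phrase, from an object property
    QWhich    : Kind -> Property -> Query ; -- "which <Kind> <Property>?"
}
```

Because the base module fixes the generic structure, plugging in a new domain amounts to generating functions like these from the ontology's classes, individuals and properties.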
8. YAQL and the Semantic Web
The category Kind gets coupled with OWL entities
10/19/2014
8
•Again a two-layered NL design, so that a new domain model can be easily integrated into the query module
•Bidirectional translation in 15 languages (Bulgarian, Catalan, Danish, Dutch, English, Finnish, French, Hebrew, Italian, German, Norwegian, Romanian, Russian, Spanish, Swedish)
[Diagram: Text, Query, Answer and Data layers on top of the Lexicon, the RGL and YAQL]
10. Who painted Mona Lisa?
English: Who painted t ?
QPainter t = mkQS pastTense (mkQCl who_IP paint_V2 t)
Finnish: Whose painting is t ?
QPainter t = mkQS (mkQCl (mkIP (E.GenIP who_IP) (mkN "maalaama")) t)
French: By whom is t ?
QPainter t = mkQS (mkQCl (mkIAdv by8agent_Prep who_IP) t)
Abstract syntax:
MQuery (QPainter (PTitle TMona_Lisa))

Concrete syntax (SPARQL):
MQuery q = "PREFIX painting: <http://spraakbanken.gu.se/rdf/owl/painting.owl#> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT DISTINCT " ++ q.wh1 ++ " WHERE { ?painting rdf:type painting:Painting ; rdfs:label ?title ; " ++ q.wh2 ++ q.prop ++ " }" ;
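Assuming, purely for illustration, that QPainter contributes ?painter for q.wh1 and a hypothetical painting:painted_by triple pattern for q.wh2 and q.prop, the linearization rule would render roughly as:

```sparql
PREFIX painting: <http://spraakbanken.gu.se/rdf/owl/painting.owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?painter WHERE {
  ?painting rdf:type painting:Painting ;
            rdfs:label ?title ;
            painting:painted_by ?painter .   # hypothetical property name
}
```

The restriction of ?title to "Mona Lisa" would similarly be contributed by the PTitle argument of the abstract tree.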
11. Evaluation
•User satisfaction
•Efficiency in terms of time, effort and cost
•Effectiveness: how the system scales up
Coverage: 1159 query patterns in 15 languages; 10 characteristics of cultural heritage (CH) objects
Extendibility: a new query grammar in 150 lines of code
Evaluation: random queries in 7 languages, with very few corrections from native informants
12. Conclusion
•NL to ontology interoperability approach
•Multilingual interface for retrieval of structured data from the Web
•Easily extendable initial base of YAQL transformations
•Great coverage of paraphrases
•Expert language/information engineers required
13. Thank you for your attention
Contacts:
dana.dannells@svenska.gu.se
ra.monique@gmail.com
mariana.damova@mozajka.co