Semantic search helps business people find answers to pressing questions by wading through oceans of information to find nuggets of meaningful information. In this presentation we’ll discuss how semantic search and content analysis technologies are starting to appear in the marketplace today. We’ll provide a recap of what semantic search is and what the key benefits are, then we’ll answer the following questions:
• Is semantic search a feature, an application, or enterprise system?
• How can I add semantic search to my existing work processes?
• Will I need to replace my existing content technologies?
• What will I need to do to prepare my content for semantic search?
• Is semantic search just for documents or can I search my data too?
• Can I use semantic search to find information on the internet and other public data sources?
• Are there standards to consider?
Talk at the 2nd Summer Workshop of the Center for Semantic Web Research (January 16, 2016, Santiago, Chile) about the construction of Yahoo's Knowledge Graph and associated research challenges.
Reflected Intelligence: Real world AI in Digital Transformation - Trey Grainger
The goal of most digital transformations is to create competitive advantage by enhancing customer experience and employee success, so giving these stakeholders the ability to find the right information at their moment of need is paramount. Employees and customers increasingly expect an intuitive, interactive experience where they can simply type or speak their questions or keywords into a search box, their intent will be understood, and the best answers and content are then immediately presented.
Providing this compelling experience, however, requires a deep understanding of your content, your unique business domain, and the collective and personalized needs of each of your users. Modern artificial intelligence (AI) approaches are able to continuously learn from both your content and the ongoing stream of user interactions with your applications, and to automatically reflect back that learned intelligence in order to instantly and scalably deliver contextually-relevant answers to employees and customers.
In this talk, we'll discuss how AI is currently being deployed across the Fortune 1000 to accomplish these goals, both in the digital workplace (helping employees more efficiently get answers and make decisions) and in digital commerce (understanding customer intent and connecting them with the best information and products). We'll separate fact from fiction as we break down the hype around AI and show how it is being practically implemented today to power many real-world digital transformations for the next generation of employees and customers.
Presentation of the Semantic Knowledge Graph research paper at the 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (Montreal, Canada - October 18th, 2016)
Abstract—This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) through dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain.
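The core idea in the abstract above (terms as nodes, intersecting postings lists as edges) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the toy corpus, the function names, and the lift-style `relatedness` score are all hypothetical, and the complementary uninverted index is omitted.

```python
from collections import defaultdict

# Toy corpus (hypothetical documents, for illustration only).
DOCS = {
    0: "java developer with hadoop experience",
    1: "senior java engineer hadoop spark",
    2: "registered nurse hospital care",
    3: "nurse practitioner clinic care",
    4: "java barista coffee shop",
}


def build_inverted_index(docs):
    """Map each term (node) to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def edge(index, *terms):
    """Materialize an edge on demand: the documents shared by all given terms."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def relatedness(index, total_docs, source, target):
    """Lift-style score: how much more often `target` appears alongside `source`
    than in the corpus overall. This scoring function is an assumption made
    for the sketch, not the one defined in the paper."""
    foreground = index.get(source, set())
    target_docs = index.get(target, set())
    if not foreground or not target_docs:
        return 0.0
    p_foreground = len(foreground & target_docs) / len(foreground)
    p_background = len(target_docs) / total_docs
    return p_foreground / p_background


INDEX = build_inverted_index(DOCS)
```

Because edges are computed from set intersections at query time, no edge list is stored: `edge(INDEX, "java", "hadoop")` yields the shared documents, and `relatedness(INDEX, len(DOCS), "java", "hadoop")` scores above 1 while `relatedness(INDEX, len(DOCS), "java", "nurse")` is 0.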
Natural Language Search with Knowledge Graphs (Haystack 2019) - Trey Grainger
To optimally interpret most natural language queries, it is necessary to understand the phrases, entities, commands, and relationships represented or implied within the search. Knowledge graphs serve as useful instantiations of ontologies which can help represent this kind of knowledge within a domain.
In this talk, we'll walk through techniques to build knowledge graphs automatically from your own domain-specific content, how you can update and edit the nodes and relationships, and how you can seamlessly integrate them into your search solution for enhanced query interpretation and semantic search. We'll have some fun with some of the more search-centric use cases of knowledge graphs, such as entity extraction, query expansion, disambiguation, and pattern identification within our queries: for example, transforming the query "bbq near haystack" into:
{ "filter": ["doc_type:\"restaurant\""], "query": { "boost": { "b": "recip(geodist(38.034780,-78.486790),1,1000,1000)", "query": "bbq OR barbeque OR barbecue" } } }
We'll also specifically cover use of the Semantic Knowledge Graph, a particularly interesting knowledge graph implementation available within Apache Solr that can be auto-generated from your own domain-specific content and which provides highly-nuanced, contextual interpretation of all of the terms, phrases and entities within your domain. We'll see a live demo with real world data demonstrating how you can build and apply your own knowledge graphs to power much more relevant query understanding within your search engine.
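The kind of query interpretation described above can be sketched as a small Python function that turns "bbq near haystack" into a request body shaped like the talk's example. Everything here is a hypothetical stand-in: the `SYNONYMS`, `DOC_TYPES`, and `LOCATIONS` tables play the role of a knowledge graph, `interpret` is an illustrative name, and writing the filter as a single string is an assumption made so the result serializes as valid JSON.

```python
import json

# Hypothetical lookup tables standing in for knowledge-graph lookups.
SYNONYMS = {"bbq": ["bbq", "barbeque", "barbecue"]}
DOC_TYPES = {"bbq": "restaurant"}
LOCATIONS = {"haystack": (38.034780, -78.486790)}


def interpret(query):
    """Split a query into keywords plus an optional "near <place>" pattern,
    then expand keywords and assemble a request dictionary."""
    tokens = query.lower().split()
    keywords, location = [], None
    i = 0
    while i < len(tokens):
        # Pattern identification: "near" followed by a known location.
        if tokens[i] == "near" and i + 1 < len(tokens) and tokens[i + 1] in LOCATIONS:
            location = LOCATIONS[tokens[i + 1]]
            i += 2
        else:
            keywords.append(tokens[i])
            i += 1

    # Query expansion: each keyword becomes an OR over its known spellings.
    # Grouping of multiple expanded keywords is not handled in this sketch.
    text = " ".join(" OR ".join(SYNONYMS.get(k, [k])) for k in keywords)

    # Entity typing: restrict results to the document type implied by a keyword.
    filters = sorted({f'doc_type:"{DOC_TYPES[k]}"' for k in keywords if k in DOC_TYPES})

    request = {}
    if filters:
        request["filter"] = filters
    if location:
        lat, lon = location
        boost = f"recip(geodist({lat:.6f},{lon:.6f}),1,1000,1000)"
        request["query"] = {"boost": {"b": boost, "query": text}}
    else:
        request["query"] = text
    return request


if __name__ == "__main__":
    print(json.dumps(interpret("bbq near haystack")))
```

A query with no recognized entities, such as `interpret("pizza")`, falls through to a plain keyword query, which keeps the sketch safe to apply to arbitrary input.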
Natural Language Search with Knowledge Graphs (Activate 2019) - Trey Grainger
To optimally interpret most natural language queries, it's important to understand a highly-nuanced, contextual interpretation of the domain-specific phrases, entities, commands, and relationships represented or implied within the search and within your domain.
In this talk, we'll walk through such a search system powered by Solr's Text Tagger and Semantic Knowledge Graph. We'll have fun with some of the more search-centric use cases of knowledge graphs, such as entity extraction, query expansion, disambiguation, and pattern identification within our queries: for example, transforming the query "best bbq near activate" into:
{!func}mul(min(popularity,1),100) bbq^0.91032 ribs^0.65674 brisket^0.63386 doc_type:"restaurant" {!geofilt d=50 sfield="coordinates_pt" pt="38.916120,-77.045220"}
We'll see a live demo with real world data demonstrating how you can build and apply your own knowledge graphs to power much more relevant query understanding like this within your search engine.
Slides for the iDB summer school (Sapporo, Japan) http://db-event.jpn.org/idb2013/
Typically, Web mining approaches have focused on enhancing or learning about user seeking behavior, ranging from query log analysis and click-through usage, through employing the web graph structure for ranking, to detecting spam or duplicate web pages. Lately, there has been a trend toward mining web content semantics and dynamics in order to enhance search capabilities, either by providing direct answers to users or by allowing for advanced interfaces or capabilities. In this tutorial we will look into different ways of mining textual information from Web archives, with a particular focus on how to extract and disambiguate entities, and how to put them to use in various search scenarios. Further, we will discuss how web dynamics affect information access and how to exploit them in a search context.
Natural Language Search with Knowledge Graphs (Chicago Meetup) - Trey Grainger
To optimally interpret most natural language queries, it's important to understand a highly-nuanced, contextual interpretation of the domain-specific phrases, entities, commands, and relationships represented or implied within the search and within your domain.
In this talk, we'll walk through such a search system powered by Solr's Text Tagger and Semantic Knowledge Graph. We'll have fun with some of the more search-centric use cases of knowledge graphs, such as entity extraction, query expansion, disambiguation, and pattern identification within our queries: for example, transforming the query "best bbq near activate" into:
{!func}mul(min(popularity,1),100) bbq^0.91032 ribs^0.65674 brisket^0.63386 doc_type:"restaurant" {!geofilt d=50 sfield="coordinates_pt" pt="38.916120,-77.045220"}
We'll see a live demo with real world data demonstrating how you can build and apply your own knowledge graphs to power much more relevant query understanding like this within your search engine.
The Next Generation of AI-powered Search - Trey Grainger
What does it really mean to deliver an "AI-powered Search" solution? In this talk, we’ll bring clarity to this topic, showing you how to marry the art of the possible with the real-world challenges involved in understanding your content, your users, and your domain. We'll dive into emerging trends in AI-powered Search, as well as many of the stumbling blocks found in even the most advanced AI and Search applications, showing how to proactively plan for and avoid them. We'll walk through the various uses of reflected intelligence and feedback loops for continuous learning from user behavioral signals and content updates, also covering the increasing importance of virtual assistants and personalized search use cases found within the intersection of traditional search and recommendation engines. Our goal will be to provide a baseline of mainstream AI-powered Search capabilities available today, and to paint a picture of what we can all expect just on the horizon.
Reflected Intelligence: Lucene/Solr as a self-learning data system - Trey Grainger
What if your search engine could automatically tune its own domain-specific relevancy model? What if it could learn the important phrases and topics within your domain, automatically identify alternate spellings (synonyms, acronyms, and related phrases) and disambiguate multiple meanings of those phrases, learn the conceptual relationships embedded within your documents, and even use machine-learned ranking to discover the relative importance of different features and then automatically optimize its own ranking algorithms for your domain?
In this presentation, you’ll learn how to do just that: evolve Lucene/Solr implementations into self-learning data systems that accept user queries, deliver relevance-ranked results, and automatically learn from your users’ subsequent interactions to continually deliver a more relevant experience for each keyword, category, and group of users.
Such a self-learning system leverages reflected intelligence to consistently improve its understanding of the content (documents and queries), the context of specific users, and the relevance signals present in the collective feedback from every prior user interaction with the system. Come learn how to move beyond manual relevancy tuning and toward a closed-loop system leveraging both the embedded meaning within your content and the wisdom of the crowds to automatically generate search relevancy algorithms optimized for your domain.
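The closed feedback loop described above can be illustrated with a minimal Python sketch that aggregates click signals per query and uses them to reorder results. This is an assumption-laden illustration rather than the talk's implementation: the signal log, the document ids, and the function names (`aggregate_signals`, `boosts_for`, `rerank`) are all hypothetical, and a real system would also weigh signal type, recency, and noise.

```python
from collections import Counter, defaultdict

# Hypothetical signal log: (query, clicked document id).
SIGNALS = [
    ("ipad", "doc_tablet"),
    ("ipad", "doc_tablet"),
    ("ipad", "doc_case"),
    ("iPad", "doc_tablet"),
    ("laptop", "doc_notebook"),
]


def aggregate_signals(signals):
    """Count clicks per (normalized query, document) pair."""
    counts = defaultdict(Counter)
    for query, doc_id in signals:
        counts[query.lower()][doc_id] += 1
    return counts


def boosts_for(counts, query, top_n=3):
    """Return up to top_n (doc_id, weight) pairs, where weight is the
    document's share of all clicks recorded for the query."""
    clicks = counts.get(query.lower())
    if not clicks:
        return []
    total = sum(clicks.values())
    return [(doc_id, n / total) for doc_id, n in clicks.most_common(top_n)]


def rerank(results, boosts):
    """Move boosted documents to the front, highest weight first.
    The sort is stable, so unboosted documents keep their original order."""
    weight = dict(boosts)
    return sorted(results, key=lambda doc_id: -weight.get(doc_id, 0.0))
```

For example, with `counts = aggregate_signals(SIGNALS)`, calling `rerank(["doc_case", "doc_cable", "doc_tablet"], boosts_for(counts, "ipad"))` moves the most-clicked document to the front. Re-running the aggregation as new signals arrive is what closes the loop.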
Thought Vectors and Knowledge Graphs in AI-powered Search - Trey Grainger
While traditional keyword search is still useful, pure text-based keyword matching is quickly becoming obsolete; today, it is a necessary but not sufficient tool for delivering relevant results and intelligent search experiences.
In this talk, we'll cover some of the emerging trends in AI-powered search, including the use of thought vectors (multi-level vector embeddings) and semantic knowledge graphs to contextually interpret and conceptualize queries. We'll walk through some live query interpretation demos to demonstrate the power that can be delivered through these semantic search techniques leveraging auto-generated knowledge graphs learned from your content and user interactions.
EnergyUse - A Collective Semantic Platform for Monitoring and Discussing Ener... - Gregoire Burel
Conserving fossil-based energy to reduce carbon emissions is key to slowing down global warming. The 2015 Paris agreement on climate change emphasised the importance of raising public awareness and participation to address this societal challenge. In this paper we introduce EnergyUse; a collective platform for raising awareness on climate change, by enabling users to view and compare the actual energy consumption of various appliances, and to share and discuss energy conservation tips in an open and social environment. The platform collects data from smart plugs, and exports appliance consumption information and community generated energy tips as linked data. In this paper we report on the system design, data modelling, platform usage and early deployment with a set of 58 initial participants. We also discuss the challenges, lessons learnt, and future platform developments.
[WEBINAR] Boost Your Paid Search ROI with Bing and Yahoo! Search - Trada
In this webinar, Trada and Microsoft team up to give you information about key adCenter tools, insights, and best practices that will help you optimize your Bing and Yahoo! Search campaign and improve your paid search ROI.
Whether you’re new to adCenter or would just like to improve your campaign’s performance, learn how to:
• Reach more customers.
• Raise conversion rates.
• Increase your return on advertising spend.
Check out Trada Reviews here: http://www.trada.com/trada-reviews/
Adding Semantic Edge to Your Content – From Authoring to Delivery - Ontotext
Within the last few years we have seen an ever-increasing demand for more accurate, user-specific content, which in turn overwhelms content providers. This is where smart publishing platforms come into play. They aim to bring the right content at the right time: digested, easy to comprehend, fast to navigate, and tailored to the readers’ personal interests.
The technologies that power them help publishers to automate the metadata enrichment process, making it more consistent, accurate and rich.
Ontological approach for improving semantic web search results - eSAT Journals
Abstract: We propose a personalized information system that provides more user-oriented information by considering context information such as personal profile, location, and click data. The system can provide associated search results from relations between objects, using context ontology models created from categorized layers of data. Based on the similarities between item descriptions and user profiles, and on the semantic relations between concepts, the system supports content-based and collaborative recommendation models. It applies user-defined rules based on the ontology description of the service and interoperates within any service domain that has an ontology description. Keywords: Ontology; Semantic Web; RDF; OWL
TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text - Leon Derczynski
Code: http://gate.ac.uk/wiki/twitie.html
Paper: https://gate.ac.uk/sale/ranlp2013/twitie/twitie-ranlp2013.pdf
Twitter is the largest source of microblog text, responsible for gigabytes of human discourse every day. Processing microblog text is difficult: the genre is noisy, documents have little context, and utterances are very short. As such, conventional NLP tools fail when faced with tweets and other microblog text. We present TwitIE, an open-source NLP pipeline customised to microblog text at every stage. Additionally, it includes Twitter-specific data import and metadata handling. This paper introduces each stage of the TwitIE pipeline, which is a modification of the GATE ANNIE open-source pipeline for news text. An evaluation against some state-of-the-art systems is also presented.
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval - Mauro Dragoni
The presentation provides an overview of what an ontology is and how it can be used for representing information and for retrieving data, with a particular focus on the linguistic resources available for supporting this kind of task. It also gives an overview of semantic-based retrieval approaches, highlighting the pros and cons of semantic approaches with respect to classic ones. Use cases are presented and discussed.
Employees, Business Partners and Bad Guys: What Web Data Reveals About Person... - Connotate
This presentation will discuss how to collect Web data with precision, transform it and then apply next-generation text analytics to reveal insights about the past activities of persons of interest and/or predict future outcomes. Featured guest speaker Claire Schmidt will discuss results of a project which proved the potential of using automated Web data collection and advanced analytics to identify potential child victims of exploitation.
Delivered at Enterprise Search and Discovery 2015, this presentation takes a look at the search landscape users enjoy outside the firewall and the expectations it fosters inside. It presents contemporary user research on enterprise search behavior and uses these findings to make recommendations to enhance enterprise search effectiveness.
While we have been busy trying to "define the damn thing" IA or answering the age old question of who rules, UX, IxDA or IA, the search engines have been busily transitioning to a machine mediated experience model for ranking. This means that SEO is now the responsibility of UX/IA whether we like it or not. This presentation lays out how search engines evaluate user experience and how we can influence this evaluation with an optimized design.
Search engines have changed a lot over the last 15 years, and optimizing Websites for them must keep up. This presentation looks at the search landscape and presents strategies and tactics for optimizing for today's search.
South Big Data Hub: Text Data Analysis Panel - Trey Grainger
Slides from Trey's opening presentation for the South Big Data Hub's Text Data Analysis Panel on December 8th, 2016. Trey provided a quick introduction to Apache Solr, described how companies are using Solr to power relevant search in industry, and provided a glimpse of where the industry is heading with regard to implementing more intelligent and relevant semantic search.
Walk Before You Run: Prerequisites to Linked Data - Kenning Arlitsch
Presentation on April 23, 2015 at the Amigos Library Services online conference: "Linked Data & RDF: New Frontiers in Metadata and Access"
Covers traditional SEO and Semantic Web Optimization, including Semantic Web Identity and a Schema.org project at Montana State University Library.
Search Solutions 2011: Successful Enterprise Search By Design (Marianne Sweeny)
When your colleagues say they want Google, they don’t mean the Google Search Appliance. They mean the Google Search user experience: pervasive, expedient and delivering the information that they need. Successful enterprise search does not start with the application features, is not part of the information architecture, does not come from a controlled vocabulary and does not emerge on its own from the developers. It requires enterprise-specific data mining, enterprise-specific user-centered design and fine tuning to turn “search sucks” into search success within the firewall. This presentation looks at action items, tools and deliverables for Discovery, Planning, Design and Post Launch phases of an enterprise search deployment.
Bearish SEO: Defining the User Experience for Google’s Panda Search Landscape (Marianne Sweeny)
The search sun shifted in March 2011 when Google started rolling out the beginning of the Panda update. Instead of using the famous PageRank, a link-based relevance calculation, Panda rests on a machine interpretation of user experience to decide which sites are most relevant to a searcher's quest for knowledge. This means that IA and UX practitioners need to start thinking about the machine implications of the way they structure information on the web, and think ahead about the human implications for how search engines present their sites in response to searcher queries. Bearish SEO will present real, actionable methods for content providers, information architects and user experience designers to directly influence search engine discoverability. Need is an experience. It is a state of being. The goal for this presentation is to ensure that user experience professionals become an integral part of designing search experience.
"Semantic Integration Is What You Do Before The Deep Learning". dev.bg Machine Learning seminar, 13 May 2019.
It's well known that 80% of the effort of a data scientist is spent on data preparation. Semantic integration is arguably the best way to spend this effort more efficiently and to reuse it between tasks, projects and organizations. Knowledge Graphs (KG) and Linked Open Data (LOD) have become very popular recently. They are used by Google, Amazon, Bing, Samsung, Springer Nature, Microsoft Academic, AirBnb… and any large enterprise that would like to have a holistic (360 degree) view of its business. The Semantic Web (web 3.0) is a way to build a Giant Global Graph, just like the normal web is a Global Web of Documents. IEEE already talks about Big Data Semantics. We review the topic of KGs and their applicability to Machine Learning.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphRAG is All You need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf (Peter Spielvogel)
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Semantic Search at Yahoo
1. Semantic Search at Yahoo
Presented by Peter Mika, Sr. Research Scientist, Yahoo Labs ⎪ April 16, 2014
2. The Semantic Web (2001-)
Part of Tim Berners-Lee's original proposal for the Web
Beginning of a research community
› Ontology engineering
› Logical inference
› Agents, web services
Rough start in deployment
› Misplaced expectations
› Lack of adoption
3. The Semantic Web, May 2001
“At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.”
(The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
Misplaced expectations?
4. Lack of adoption
Standardization ahead of adoption
› URI, RDF, RDF/XML, RDFa, JSON-LD,
OWL, RIF, SPARQL, OWL-S, POWDER …
Chicken and egg problem
› No users/use cases, hence no data
› No data, because no users/use cases
By 2007, some modest progress
› Metadata in HTML: microformats
› Linked Data: simplifying the stack
5. Web search by 2007
Large classes of queries are solved to perfection
Improvements in web search are harder and harder to come by
› Relevance models, hyperlink structure and interaction data
› Combination of features using machine learning
› Heavy investment in computational power
• real-time indexing, instant search, datacenters and edge services
6. Language issues
› Multiple interpretations
• jaguar
• paris hilton
› Secondary meaning
• george bush (and I mean the beer brewer in Arizona)
› Subjectivity
• reliable digital camera
• paris hilton sexy
› Imprecise or overly precise searches
• jim hendler
Complex needs
› Missing information
• brad pitt zombie
• florida man with 115 guns
• 35 year old computer scientist living in barcelona
› Category queries
• countries in africa
• barcelona nightlife
› Transactional or computational queries
• 120 dollars in euros
• digital camera under 300 dollars
• world temperature in 2020
Poorly solved information needs remain
Many of these queries would not be asked by users, who learned over time what search technology can and cannot do.
7. Web search by 2007
Are there even any true keyword queries?
› Lyrics, quotes and bugs… anything else?
Remaining challenges are not computational, but in modeling user cognition
› Need a deeper understanding of the query, the content and/or the world at large
8. Microsearch internal prototype (2007)
Personal and private homepage of the same person (clear from the snippet but it could be also automatically de-duplicated)
Conferences he plans to attend and his vacations from homepage plus bio events from LinkedIn
Geolocation
9. Enhanced Results
Computing abstracts is hard
› Summarization of HTML
• Template detection
• Selecting relevant snippets
• Composing readable text
› Efficiency constraints
Structured data to replace or complement text summary
› Key/value pairs
› Deep links
› Image or Video
10. Yahoo SearchMonkey (2008)
1. Extract structured data
› Semantic Web markup
• Example:
<span property="vcard:city">Santa Clara</span>
<span property="vcard:region">CA</span>
› Information Extraction
2. Presentation
› Fixed presentation templates
• One template per object type
› Applications
• Third-party modules to display data (SearchMonkey)
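The extraction step for Semantic Web markup can be illustrated with a minimal Python sketch. It is not SearchMonkey's implementation: it assumes flat, well-formed markup like the example above, uses only the standard library, and the names `PropertyExtractor` and `extract_properties` are hypothetical.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs from elements with a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []   # extracted (property, value) tuples
        self._stack = []  # `property` value (or None) of each open element

    def handle_starttag(self, tag, attrs):
        # Simplification: assumes every start tag has a matching end tag.
        self._stack.append(dict(attrs).get("property"))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack and self._stack[-1]:
            self.pairs.append((self._stack[-1], text))


def extract_properties(html):
    """Return the key/value pairs found in an HTML fragment."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


SAMPLE = (
    '<span property="vcard:city">Santa Clara</span>\n'
    '<span property="vcard:region">CA</span>'
)
print(extract_properties(SAMPLE))
```

The resulting key/value pairs are the kind of input a fixed presentation template could render. A full RDFa processor would also resolve prefixes and handle nesting, which this sketch does not attempt.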
11. Effectiveness of enhanced results
Explicit user feedback
› Side-by-side editorial evaluation (A/B testing)
• Editors are shown a traditional search result and enhanced result for the same page
• Users prefer enhanced results in 84% of the cases and traditional results in 3% (N=384)
Implicit user feedback
› Click-through rate analysis
• Long dwell time limit of 100s (Ciemiewicz et al. 2010)
• 15% increase in 'good' clicks
› User interaction model
• Enhanced results lead users to relevant documents (IV) even though they are less likely to be clicked than textual results (III)
• Enhanced results effectively reduce bad clicks!
See
› Kevin Haas, Peter Mika, Paul Tarjan, Roi Blanco: Enhanced results for web search. SIGIR 2011: 725-734
12. Adoption among search providers
Google announces Rich Snippets - June, 2009
› Faceted search for recipes - Feb, 2011
Bing tiles – Feb, 2011
Facebook's Like button and the Open Graph Protocol (2010)
› Shows up in profiles and news feed
› Site owners can later reach users who have liked an object
13. schema.org
Agreement on a shared set of schemas for common types of web content
› Bing, Google, and Yahoo! as initial founders (June, 2011)
• Yandex joins schema.org in Nov, 2011
› Similar in intent to sitemaps.org
• Use a single format to communicate the same information to all three search engines
schema.org covers areas of interest to all search engines
› Business listings (local), creative works (video), recipes, reviews and more
› Microdata, RDFa, JSON-LD syntax
Collaborative effort
› Growing number of 3rd party contributions
› schema.org discussions at public-vocabs@w3.org
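As an illustration of the JSON-LD syntax, the following Python sketch builds a schema.org `Recipe` description and wraps it in a script element. The types and properties (`Recipe`, `Person`, `AggregateRating`, `ratingValue`, `reviewCount`) are schema.org vocabulary; the helper `recipe_jsonld` and all sample values are made up for the example.

```python
import json


def recipe_jsonld(name, author, rating, review_count):
    """Build a schema.org Recipe description as a JSON-LD string."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": name,
        "author": {"@type": "Person", "name": author},
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }
    return json.dumps(doc, indent=2)


# Hypothetical sample values, for illustration only.
markup = recipe_jsonld("Apple pie", "Jane Doe", 4.5, 12)
print('<script type="application/ld+json">\n' + markup + "\n</script>")
```

The same information could be expressed inline with Microdata or RDFa attributes. JSON-LD keeps it in a separate block, so the visible HTML does not need to change.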
14. Adoption among publishers
R.V. Guha: Light at the end of the tunnel (ISWC 2013 keynote)
› Over 15% of all pages now have schema.org markup
› Over 5 million sites, over 25 billion entity references
› In other words
• Same order of magnitude as the web
See also
› P. Mika, T. Potter. Metadata Statistics for a Large Web Corpus, LDOW 2012
• Based on Bing US corpus
• 31% of webpages, 5% of domains contain some metadata
› WebDataCommons
• Based on CommonCrawl Nov 2013
• 26% of webpages, 14% of domains contain some metadata
15. Semantic Search
Active research field at the intersection of IR, NLP, DB and SemWeb
› ESAIR at SIGIR, SemSearch at ESWC/WWW, EOS and JIWES at SIGIR, Semantic Search at VLDB
Exploiting semantic understanding in the retrieval process
› User intent and resources are represented using semantic models
• Not just symbolic representations
› Semantic models are exploited in the matching and ranking of resources
Tasks
› information extraction
› information reconciliation/tracking
› query understanding
› retrieving/ranking entities/attributes/relations
› result presentation
16. Information extraction and reconciliation
Ontology management
› Editorially maintained OWL ontology with 300+ classes
› Covering the domains of interest of Yahoo
Information extraction
› Automated information extraction
• e.g. wrapper induction
› Metadata from HTML pages
• Focused crawler
› Public datasets (e.g. DBpedia)
› Proprietary data
Data fusion
› Manual mapping from the source schemas to the ontology
› Supervised entity reconciliation
• Kedar Bellare, Carlo Curino, Ashwin Machanavajjhala, Peter Mika, Mandar Rahurkar, Aamod Sane: WOO: A Scalable and Multi-tenant Platform for Continuous Knowledge Base Synthesis. PVLDB 2013
• Michael J. Welch, Aamod Sane, Chris Drome: Fast and accurate incremental entity resolution relative to an entity knowledge base. CIKM 2012
Curation and quality assessment
20. Yahoo's Knowledge Graph
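[Figure: fragment of the knowledge graph drawn as an entity/relationship (E/R) diagram]
› Entities: Chicago Cubs, Chicago, Barack Obama, Carlos Zambrano, Brad Pitt, Angelina Jolie, Steven Soderbergh, George Clooney, Ocean’s Twelve, Fight Club, Dust Brothers
› Relation labels: plays for, plays in, lives in, partner, directs, casts in, takes place in, music by
› Offer node: 10% off tickets (attached with the label "for")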
Nicolas Torzec: Making knowledge reusable at Yahoo!: a Look at the Yahoo! Knowledge Base (SemTech 2013)
21. Query understanding
~70% of queries contain a named entity
Entity linking in queries and query sessions
› Online as input to ranking
› Semantic log mining
• Laura Hollink, Peter Mika, Roi Blanco: Web usage mining with semantic analysis. WWW 2013: 561-570
See tutorial on Entity Linking and Retrieval by Edgar Meij, Krisztián Balog and Daan Odijk
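Entity linking in queries can be sketched, in its simplest form, as a dictionary lookup over query n-grams. The Python example below is a toy illustration under stated assumptions, not the method used at Yahoo: the alias table `ALIASES`, the entity identifiers, and the function `link_entities` are hypothetical, and no disambiguation by context or session is attempted.

```python
# Hypothetical alias dictionary: lower-cased surface form -> entity identifier.
ALIASES = {
    "brad pitt": "Brad_Pitt",
    "fight club": "Fight_Club",
    "chicago": "Chicago",
    "chicago cubs": "Chicago_Cubs",
}


def link_entities(query, aliases=ALIASES, max_len=3):
    """Greedy left-to-right, longest-match lookup of entity mentions in a query."""
    tokens = query.lower().split()
    links = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            if span in aliases:
                links.append((span, aliases[span]))
                i += n
                break
        else:
            i += 1  # no known mention starts at this token
    return links


print(link_entities("chicago cubs tickets"))
print(link_entities("brad pitt fight club"))
```

Preferring the longest match makes "chicago cubs" resolve to the team rather than the city. Ambiguous mentions such as "jaguar" would need ranking signals beyond a lookup table.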
22. Common retrieval tasks in Semantic Search
[Figure: diagram of semantic search tasks and their evaluation campaigns]
› Tasks: entity search, list search, list completion, related entity finding, question-answering
› Evaluation campaigns: SemSearch 2010/11, SemSearch 2011, TREC ELC task, TREC REF-LOD task, QALD 2012/13/14
23. Entity Retrieval evaluation
SemSearch challenge (2010/2011)
› Queries
• 50 entity-mention queries selected from the Search Query Tiny Sample v1.0 dataset, provided by the Yahoo! Webscope program
› Data
• Billion Triples Challenge 2009 data set
• Combination of crawls of multiple semantic search engines
› Evaluation
• Mechanical Turk
See report:
› Roi Blanco, Harry Halpin, Daniel M. Herzig, Peter Mika, Jeffrey Pound, Henry S. Thompson, Thanh Tran: Repeatable and reliable semantic search evaluation. J. Web Sem. 21: 14-29 (2013)
24. Glimmer: open-source retrieval engine over RDF data
› Extension of MG4J from University of Milano
› Indexing
• MapReduce-based
• Horizontal indexing (subject/predicate/object fields)
• Vertical indexing (one field per predicate)
› Retrieval
• BM25F with machine-learned weights for properties and domains
• 52% improvement over the best system in SemSearch 2010
› Roi Blanco, Peter Mika, Sebastiano Vigna: Effective and Efficient Entity Search in RDF Data. International Semantic Web Conference (1) 2011: 83-97
› https://github.com/yahoo/Glimmer/
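To make vertical indexing and BM25F concrete, here is a small Python sketch. It does not reproduce Glimmer or MG4J: the triples, the function names, the field weights (hand-set here, where the slide mentions machine-learned ones) and the particular idf variant are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy triples: (subject, predicate, literal object).
TRIPLES = [
    ("ex:e1", "name", "Brad Pitt"),
    ("ex:e1", "description", "American actor and film producer"),
    ("ex:e2", "name", "Fight Club"),
    ("ex:e2", "description", "1999 film starring Brad Pitt"),
]


def vertical_index(triples):
    """Vertical indexing: one token field per predicate for every subject."""
    docs = defaultdict(lambda: defaultdict(list))
    for subj, pred, obj in triples:
        docs[subj][pred].extend(obj.lower().split())
    return docs


def bm25f(query, docs, weights, k1=1.2, b=0.75):
    """Score subjects with BM25F: per-field term frequencies are length-normalised,
    weighted and summed, then saturated once per query term."""
    n_docs = len(docs)
    fields = list(weights)
    # Average field length; fall back to 1.0 if a field is empty everywhere.
    avg_len = {
        f: (sum(len(d[f]) for d in docs.values()) / n_docs) or 1.0 for f in fields
    }
    scores = {}
    for subj, doc in docs.items():
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in docs.values() if any(term in d[f] for f in fields))
            if df == 0:
                continue
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            pseudo_tf = 0.0
            for f in fields:
                norm = 1 - b + b * len(doc[f]) / avg_len[f]
                pseudo_tf += weights[f] * doc[f].count(term) / norm
            score += idf * pseudo_tf / (k1 + pseudo_tf)
        scores[subj] = score
    return scores


docs = vertical_index(TRIPLES)
scores = bm25f("brad pitt", docs, weights={"name": 3.0, "description": 1.0})
for subj in sorted(scores, key=scores.get, reverse=True):
    print(subj, round(scores[subj], 3))
```

With a higher weight on `name`, the entity whose name matches the query outranks the one that only mentions it in a description. A horizontal index would instead keep three fields (subject, predicate, object) regardless of how many distinct predicates exist.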
25. Application: entity displays in web search
Entity-seeking queries make up 40-50% of the query volume
› Jeffrey Pound, Peter Mika, Hugo Zaragoza: Ad-hoc object retrieval in the web of data. WWW 2010: 771-780
› Thomas Lin, Patrick Pantel, Michael Gamon, Anitha Kannan, Ariel Fuxman: Active objects: actions for entity-centric search. WWW 2012: 589-598
Show a summary of the most likely information-needs
› Including related entities for navigation
› Roi Blanco, Berkant Barla Cambazoglu, Peter Mika, Nicolas Torzec: Entity Recommendations in Web Search. ISWC 2013
28. Mobile search on the rise
Information access on-the-go requires hands-free operation
› Driving, walking, gym, etc.
• Americans spend 540 hours a year in their cars [1] vs. 348 hours browsing the Web [2]
~50% of queries are coming from mobile devices (and growing)
› Changing habits, e.g. iPad usage peaks before bedtime
› Limitations in input/output
[1] http://answers.google.com/answers/threadview?id=392456
[2] http://articles.latimes.com/2012/jun/22/business/la-fi-tn-top-us-brands-news-web-sites-20120622
29. Mobile search challenges and opportunities
Interaction
› Question-answering
› Support for interactive retrieval
› Spoken-language access
› Task completion
Contextualization
› Personalization
› Geo
› Context (work/home/travel)
• Try getaviate.com
30. Interactive, conversational voice search
Parlance EU project
› Complex dialogs within a domain
• Requires complete semantic understanding
Complete system (mixed license)
› Automated Speech Recognition (ASR)
› Spoken Language Understanding (SLU)
› Interaction Management
› Knowledge Base
› Natural Language Generation (NLG)
› Text-to-Speech (TTS)
Video
31. Task completion
We would like to help our users in task completion
› But we have trained our users to talk in nouns
• Retrieval performance decreases by adding verbs to queries
› We need to understand what the available actions are
Ongoing work in schema.org in modeling actions
› Understand what actions can be taken on a page
› Help users in mapping their query to potential actions
› Applications in web search, email etc.
33. Q&A
Many thanks to members of the Semantic Search team at Yahoo Labs Barcelona and to Yahoos around the world
Contact me
› pmika@yahoo-inc.com
› @pmika
› www.slideshare.net/pmika
› Ask about our internships and other opportunities
Editor's Notes
Non-trivial problems to solve without publisher assistance
When the competition is copying you, you know that you are doing something right.