We propose a framework to address an important challenge in the ongoing adoption of the “Web 2.0” in science and research, often referred to as “Research 2.0”. Microblogging is one of the trends with increasing leverage. The challenge in this thesis is to connect users of microblogging services such as Twitter based on specific common entities that are representative and truly matter to them. We investigated the possibilities of using social data to locate an expert who shares a very specific research topic. To enrich and verify this social data, we link such content to existing open data provided by the online community. We use semantic technologies (RDF, SPARQL), common ontologies (SIOC, FOAF, Dublin Core, SWRC) and Linked Data (DBpedia, GeoNames, COLINDA) to extract and mine data about scientific conferences from the context of microblogs. We identify users related to each other based on entities such as topics (tags), events, time, locations and persons (mentions). As a proof of concept we explain, implement and evaluate such a researcher profiling use case. It involves the development of a framework that recommends researchers based on the topics and conferences they have in common. This framework provides an API that allows quick access to the analyzed information. A demonstration application, the “Researcher Affinity Browser”, shows how the API supports developers in building rich internet applications for Research 2.0. This application also introduces the concept of “affinity”, which exposes the implicit proximity between entities and users based on the content those users produced. The usability of the demonstration application and the usefulness of the framework itself are investigated with an explicit evaluation questionnaire. This user feedback led to important conclusions about successful achievements and opportunities to further improve this effort.
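The “affinity” notion above can be illustrated with a minimal sketch: treat each user as the set of entities (tags, events, locations, mentions) mined from their microblogs, and measure overlap. The Jaccard formulation and the sample entity sets here are illustrative assumptions, not the thesis's exact metric.

```python
def affinity(entities_a, entities_b):
    """Jaccard overlap between two users' entity sets (tags, events,
    locations, mentions). A hypothetical stand-in for the thesis metric."""
    a, b = set(entities_a), set(entities_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

# Two hypothetical researchers described by entities mined from microblogs
alice = {"#linkeddata", "#iswc2012", "Ghent", "@bob"}
bob = {"#linkeddata", "#iswc2012", "Vienna", "@alice"}
print(affinity(alice, bob))  # 2 shared entities out of 6 distinct ones
```

Users sharing a conference hashtag and a topic tag thus score higher than users with no common entities, which is the intuition the Researcher Affinity Browser exposes.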
Supporting Sensemaking by Modelling Discourse as Hypermedia Networks, by Simon Buckingham Shum
10 July 2009: Presentation to W3C "Semantic Web for Health Care and Life Sciences" Interest Group: "Scientific Discourse" task group: http://esw.w3.org/topic/HCLSIG/SWANSIOC
A vision on collaborative computation of things for personalized analyses, by Daniele Gianni
Presentation delivered at the 3rd IEEE Track on Collaborative Modeling & Simulation - CoMetS'12. Please see http://www.sel.uniroma2.it/comets12/ for further details.
Modern learning models require linking experiences in training environments with experiences in the real world. However, data about real-world experiences is notoriously hard to collect. Social spaces bring new opportunities to tackle this challenge, supplying digital traces where people talk about their real-world experiences. These traces can become a valuable resource, especially in ill-defined domains that embed multiple interpretations. The paper presents a unique approach to aggregate content from social spaces into a semantically enriched data browser to facilitate informal learning in ill-defined domains. This work pioneers a new way to exploit digital traces about real-world experiences as authentic examples in informal learning contexts. An exploratory study is used to determine both strengths and areas needing attention. The results suggest that semantics can be successfully used in social spaces for informal learning, especially when combined with carefully designed nudges.
Designing an effective information architecture, by optimalworkshop
It’s such a waste when stuff is hard to find. In the book Ambient Findability, Peter Morville quotes a study that estimates that in a medium-sized hospital, 8,000 hours a year of staff time are spent explaining signs and redirecting people. That’s 4 person years!
Finding stuff online is even worse. According to IBM’s chairman, it’s estimated that there will be 44 times as much data and content coming over the next decade, reaching 35 zettabytes by 2020. That’s 35 followed by 21 zeros.
There is one thing you can do to help the madness. You can create an effective information architecture (IA) to connect people with the content that they’re looking for. In this practical workshop you’ll learn how to create an effective IA which will help ensure that your stuff is easy to find and provide your visitors with a great experience. You’ll leave with an armload of practical insights and tips, and with the inspiration to refine and test your own IA.
A Social Content Delivery Network for Scientific Cooperation: Vision, Design..., by Simon Caton
Data volumes have increased so significantly that we need to carefully consider how we interact with, share, and analyze data to avoid bottlenecks. In contexts such as eScience and scientific computing, a large emphasis is placed on collaboration, resulting in many well-known challenges in ensuring that data is in the right place at the right time and accessible by the right users. Yet these simple requirements create substantial challenges for the distribution, analysis, storage, and replication of potentially "large" datasets. Additional complexity is added through constraints such as budget, data locality, usage, and available local storage. In this paper, we propose a "socially driven" approach to address some of the challenges within (academic) research contexts by defining a Social Data Cloud and underpinning Content Delivery Network: a Social CDN (S-CDN). Our approach leverages digitally encoded social constructs via social network platforms that we use to represent (virtual) research communities. Ultimately, the S-CDN builds upon the intrinsic incentives of members of a given scientific community to address their data challenges collaboratively and in proven trusted settings. We define the design and architecture of a S-CDN and investigate its feasibility via a coauthorship case study as first steps to illustrate its usefulness.
A Visual Exploration Workflow as Enabler for the Exploitation of Linked Open ..., by Laurens De Vocht
Semantically annotating and interlinking Open Data results in Linked Open Data, which concisely and unambiguously describes a knowledge domain. However, the uptake of Linked Data depends on its usefulness to non-Semantic Web experts. Failing to support data consumers in understanding the added value of Linked Data and its possible exploitation opportunities could inhibit its diffusion. In this paper, we propose an interactive visual workflow for discovering and exploring Linked Open Data. We implemented the workflow for academic library metadata and carried out a qualitative evaluation. We assessed the workflow’s potential impact on data consumers, bridging the offer, as published Linked Open Data, and the demand, as requests for (i) higher-quality data and (ii) more applications that re-use data. More than 70% of the 34 test users agreed that the workflow fulfills its goal: it facilitates non-Semantic Web experts’ understanding of the potential of Linked Open Data.
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi..., by Laurens De Vocht
Searching for relationships between Linked Data resources is typically interpreted as a pathfinding problem: looking for chains of intermediary nodes (hops) forming the connection or bridge between these resources in a single dataset or across multiple datasets.
In many cases, centralizing all the needed linked data in a (specialized) repository or index in order to run the algorithm is not possible, or at least not desired. To address this, we propose an approach to top-k shortest pathfinding which optimally translates a pathfinding query into sequences of triple pattern fragment requests.
Triple Pattern Fragments were recently introduced as a solution to address the availability of data on the Web and the scalability of linked data client applications, preventing data-processing bottlenecks on the server.
The results are streamed to the client, thus allowing clients to do asynchronous processing of the top-k shortest paths.
We explain how this approach behaves using a training dataset, a subset of DBpedia with 10 million triples, and show the trade-offs compared to a SPARQL approach where all the data is gathered in a single triple store on a single machine.
Furthermore we investigate the scalability when increasing the size of the subset up to 110 million triples.
Software Package (NodeJS): npmjs.com/package/everything_is_connected_engine
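The core of top-k shortest pathfinding can be sketched as a priority-queue search that yields loop-free paths in order of length. This is a didactic local-graph version under assumed hop-count weights; the engine above instead resolves neighbours through batched triple pattern fragment requests and streams results to the client.

```python
import heapq

def top_k_shortest_paths(graph, source, target, k):
    """Enumerate the k shortest simple paths (by hop count) between two
    resources in an adjacency dict. Didactic sketch, not the npm engine."""
    heap = [(0, [source])]  # (hops so far, path so far)
    found = []
    while heap and len(found) < k:
        hops, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append(path)  # paths pop in non-decreasing length
            continue
        for neighbour in graph.get(node, []):
            if neighbour not in path:  # keep paths loop-free
                heapq.heappush(heap, (hops + 1, path + [neighbour]))
    return found

# Tiny hypothetical Linked Data neighbourhood
graph = {
    "dbr:Ghent": ["dbr:Belgium", "dbr:Lys"],
    "dbr:Lys": ["dbr:Belgium"],
    "dbr:Belgium": ["dbr:Europe"],
    "dbr:Europe": [],
}
print(top_k_shortest_paths(graph, "dbr:Ghent", "dbr:Europe", 2))
```

Because candidates leave the queue in length order, each path can be emitted to the client as soon as it is found, which is what makes the streaming, asynchronous processing described above possible.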
Discovering Meaningful Connections between Resources in the Web of Data, by Laurens De Vocht
Slides of LDOW2013 presentation, May 14th, Rio De Janeiro, Brazil
We will show that semantically annotated paths lead to discovering meaningful, non-trivial relations and connections between multiple resources in large online datasets such as the Web of Data. Graph algorithms have always been key in pathfinding applications (e.g., navigation systems). They make optimal use of available computation resources to find paths in structured data. Applying these algorithms to Linked Data can facilitate resolving complex queries that involve the semantics of the relations between resources. In this paper, we introduce a new approach for finding paths in Linked Data that takes into account the meaning of the connections and also deals with scalability. An efficient technique combining pre-processing and indexing of datasets is used for finding paths between two resources in large datasets within a couple of seconds. To demonstrate our approach, we have implemented a test case using the DBpedia dataset.
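The pre-processing and indexing idea can be sketched in two steps: build an adjacency index over the triples once, then answer path queries with a fast graph search over the index. The triples and the plain breadth-first search below are illustrative simplifications, not the paper's actual indexing technique.

```python
from collections import deque

def build_index(triples):
    """Pre-processing step: index (subject, predicate, object) triples
    into an adjacency map so path queries avoid scanning the dataset."""
    index = {}
    for s, p, o in triples:
        index.setdefault(s, []).append((p, o))
    return index

def find_path(index, start, goal):
    """Breadth-first search over the index; returns one shortest chain
    of (predicate, resource) hops, or None if no path exists."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for pred, nxt in index.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(pred, nxt)]))
    return None

# Hypothetical DBpedia-style triples
triples = [
    ("dbr:Tim_Berners-Lee", "dbo:almaMater", "dbr:Queen's_College_Oxford"),
    ("dbr:Queen's_College_Oxford", "dbo:city", "dbr:Oxford"),
]
index = build_index(triples)
print(find_path(index, "dbr:Tim_Berners-Lee", "dbr:Oxford"))
```

Keeping the predicates on each hop is what lets the resulting path be read as a semantically annotated chain rather than a bare sequence of nodes.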
A Framework Concept for Profiling Researchers on Twitter using the Web of Data, by Laurens De Vocht
Based upon findings and results from our recent research, we propose a generic framework concept for researcher profiling with application to the areas of “Science 2.0” and “Research 2.0”. The intensive growth of users in social networks such as Twitter has generated a vast amount of information. Many previous works have shown that social network users produce valuable content for profiling and recommendations. Our research focuses on identifying and locating experts for a specific research area or topic. In our approach we apply semantic technologies (RDF, SPARQL), common vocabularies (SIOC, FOAF, MOAT, Tag Ontology) and Linked Data (GeoNames, COLINDA).
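To make the expert-locating step concrete, a SPARQL query over SIOC and FOAF data might look as follows. This is a plausible sketch only: the graph pattern (accounts creating posts about a topic, linked to named persons) is an assumption about the modelling, not the framework's actual query.

```python
def expert_query(topic_uri):
    """Build a SPARQL query finding people whose posts concern a topic.
    The SIOC/FOAF pattern is a hypothetical sketch of the modelling."""
    return f"""
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT DISTINCT ?name WHERE {{
  ?account sioc:creator_of ?post .
  ?post sioc:topic <{topic_uri}> .
  ?account sioc:account_of ?person .
  ?person foaf:name ?name .
}}
"""

print(expert_query("http://dbpedia.org/resource/Linked_data"))
```

Such a query would typically be sent to a SPARQL endpoint holding the RDF-ized microblog data; swapping the topic URI retargets it to any research area.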
Aligning Web Collaboration Tools with Research Data for Scholars, by Laurens De Vocht
Resources for research are not always easy to explore, and rarely come with strong support for identifying, linking and selecting those that can be of interest to scholars. In this work we introduce a model that uses state-of-the-art semantic technologies to interlink structured research data and data from Web collaboration tools, social media and Linked Open Data. We use this model to build a platform that connects scholars, using their profiles as a starting point to explore novel and relevant content for their research. Scholars can easily adapt to evolving trends by synchronizing new social media accounts or collaboration tools and integrating them with new datasets. We evaluate our approach with a scenario of personalized exploration of research repositories, where we analyze real-world scholar profiles and compare them to a reference profile.
Providing Interchangeable Open Data to Accelerate Development of Sustainable ..., by Laurens De Vocht
Travelers expect access to tourism information at any time, anywhere, with any media. Mobile tourism guides, accessible via the Web, provide an omnipresent approach to this. However, it is expensive and not trivial to (re)model, translate and transform data over and over, which inhibits many players, including governments, from developing such applications. We report on our experience in running a project on mobile tourism in Flanders, Belgium, where we developed a methodology and reusable formalization for the data disclosure. We applied open data standards to achieve a reusable and interoperable data hub for mobile tourism. We organized working groups, resulting in a reusable formal specification and serialization of the domain model that is immediately usable for building mobile tourism applications. This increased awareness and led to semantic convergence, which is forming a regional foundation for developing sustainable mobile guides for tourism.
Big Linked Data ETL Benchmark on Cloud Commodity Hardware, by Laurens De Vocht
Linked Data storage solutions often optimize for low-latency querying and quick responsiveness. Meanwhile, in the back-end, offline ETL processes take care of integrating and preparing the data. In this paper we explain a workflow and the results of a benchmark that examines which Linked Data storage solution and setup should be chosen for different dataset sizes to optimize the cost-effectiveness of the entire ETL process. The benchmark executes diversified stress tests on the storage solutions. The results include an in-depth analysis of four mature Linked Data solutions with commercial support and full SPARQL 1.1 compliance. Whereas traditional benchmark studies generally deploy the triple stores on premises using high-end hardware, this benchmark uses publicly available cloud machine images for reproducibility and runs on commodity hardware. All stores are tested using their default configuration. In this setting Virtuoso shows the best performance in general. The other three stores show competitive results and have disjoint areas of excellence. Finally, it is shown that each store’s performance heavily depends on the structural properties of the queries, giving an indication of where vendors can focus their optimization efforts.
Examples of how Blueback tools can expand and enhance your Petrel workflow, by Mitch Sutherland
A poster designed to give you a flavour of some of the cool Petrel plug-ins available from Blueback Reservoir. For exploration, interpretation, inversion, QI, data analysis, geomodeling & project management.
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data, by Laurens De Vocht
Path-based storytelling with Linked Data on the Web gives users the ability to discover concepts in an entertaining and educational way. Given a query context, many state-of-the-art pathfinding approaches aim at telling a story that coincides with the user’s expectations by investigating paths over Linked Data on the Web. By taking serendipity into account in storytelling, we aim at improving and tailoring existing approaches towards better fitting user expectations, so that users are able to discover interesting knowledge without feeling unsure or even lost in the story facts. To this end, we propose to optimize both the estimation of links between facts and the selection of facts in a story, by increasing the consistency and relevancy of links between facts through additional domain delineation and refinement steps. In order to address multiple aspects of serendipity, we propose and investigate combinations of weights and heuristics in the paths that form the essential building blocks of each story. Our experimental findings with stories based on DBpedia indicate the improvements obtained when applying the optimized algorithm.
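Combining weights and heuristics on paths can be sketched as a scoring function that blends a relevance heuristic with a serendipity proxy. Both heuristics, the degree-based rarity proxy, and the weight values below are illustrative assumptions, not the paper's actual formulation.

```python
import math

def score_path(path, relevance, degree, w_rel=0.6, w_ser=0.4):
    """Blend two illustrative heuristics over a path of nodes:
    - relevance: how strongly each fact matches the query context;
    - serendipity: a bonus for routing through rarer, less obvious
      nodes, approximated as the inverse log of a node's link degree."""
    rel = sum(relevance.get(n, 0.0) for n in path) / len(path)
    ser = sum(1.0 / (1.0 + math.log(degree.get(n, 1))) for n in path) / len(path)
    return w_rel * rel + w_ser * ser

# Hypothetical story facts: relevance to the query and node link degrees
relevance = {"A": 1.0, "B": 0.5, "C": 0.9}
degree = {"A": 10, "B": 1000, "C": 5}
print(score_path(["A", "B"], relevance, degree))
print(score_path(["A", "C"], relevance, degree))  # rarer and more relevant
```

A pathfinder would rank candidate story paths by this score, so the chain through the rarer, more relevant node C wins over the chain through the very common hub B.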
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory..., by Laurens De Vocht
Linked Data offers an entity-based infrastructure to resolve indirect relations between resources, expressed as chains of links. If we can benchmark how effective retrieving chains of links from these sources is, we can motivate why they are a reliable addition to exploratory search interfaces. A vast number of applications could reap the benefits of insights in this field, especially all kinds of knowledge discovery tasks related, for instance, to ad-hoc decision support and digital assistance systems. In this paper, we explain a benchmark model for evaluating the effectiveness of associating chains of links with keyword-based queries. We illustrate the benchmark model with an example case using academic library and conference metadata, where we measured precision involving targeted expert users and directed it towards search effectiveness. This kind of typical semantic search engine evaluation, focusing on information retrieval metrics such as precision, is typically biased towards the final result only. However, in an exploratory search scenario, the dynamics of the intermediary links that could lead to potentially relevant discoveries are not to be neglected.
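The precision measurement described above boils down to a standard precision-at-k computation over expert relevance judgments. The chain identifiers below are hypothetical placeholders.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved link chains that expert users
    judged relevant to the keyword query."""
    top = retrieved[:k]
    return sum(1 for chain in top if chain in relevant) / k

# Hypothetical ranked chains for one query, and the expert-judged set
retrieved = ["chain1", "chain2", "chain3", "chain4"]
relevant = {"chain1", "chain3"}
print(precision_at_k(retrieved, relevant, 3))  # 2 of the top 3 are relevant
```

As the abstract notes, such a metric only judges the final ranking; it says nothing about whether the intermediary hops of a chain helped the user discover something along the way.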
Each government level uses its own information system. At the same time, citizens expect these government levels to adopt a user-centric approach and provide instant access to their data or to open government data. Therefore the applications at various government levels need to be interoperable in support of the ‘once-only’ principle: data is inputted and registered only once and then reused. Given government budget constraints and the cost and complexity of (re)modeling, translating and transforming data over and over, public administrations need to reduce interoperability costs. This is achieved by semantically aligning information between the different information systems of each government level. Semantically interoperable systems facilitate citizen-centered e-government services. This paper illustrates how the Open Standards for Linked Organizations program (OSLO) paved the way, bottom-up, from a broad basis of stakeholders towards a government-endorsed strategy. OSLO applied a generic process and methodology and provides practical insights on how to overcome the encountered hurdles: gaining political support and adoption, and reaching semantic agreement. The lessons learned in the region of Flanders (Belgium) can speed up the process in other countries that face the complexity of integrating information-intensive processes between different applications, administrations and government levels.
Oil 101: Introduction to Oil and Gas - Upstream, by EKT Interactive
What is Upstream? This Upstream content is derived from our Oil 101 Upstream ebook and can be found in our oil and gas learning community.
This Upstream module includes the following sections (use the links below for quick access):
-Introduction to Upstream
-Upstream Business Characteristics
-Oilfield Services
-Reserves – Formation and Importance
-Production – The First Step in Adding Value
-The Unconventional Future of Upstream
Upstream
What is Upstream? Most oil and gas companies’ business structures are segmented and organized according to business segment, assets, or function.
The upstream segment of the business is also known as the exploration and production (E&P) sector because it encompasses activities related to searching for, recovering and producing crude oil and natural gas.
The upstream segment is all about wells: where to locate them; how deep and how far to drill them; and how to design, construct, operate and manage them to deliver the greatest possible return on investment with the lightest, safest and smallest operational footprint.
Exploration
The exploration sector involves obtaining a lease and permission to drill from the owners of onshore or offshore acreage thought to contain oil or gas, and conducting necessary geological and geophysical (G&G) surveys required to explore for (and hopefully find) economic accumulations of oil or gas.
Drilling
There is always uncertainty in the geological and geophysical survey results. The only way to be sure that a prospect is favorable is to drill an exploratory well. Drilling is physically creating the “borehole” in the ground that will eventually become an oil or gas well. This work is done by rig contractors and service companies in the Oilfield Services business sector.
Production
The production sector of the upstream segment maximizes recovery of petroleum from subsurface reservoirs.
Introduction to Oil and Gas Industry from Upstream (Exploration & Production), Midstream (Transportation & Storage), to Downstream (Refining, Petrochemical, & Marketing)
Using Triple Pattern Fragments To Enable Streaming of Top-k Shortest Paths vi...Laurens De Vocht
Searching for relationships between Linked Data resources is typically interpreted as a pathfinding problem: looking for chains of intermediary nodes (hops) forming the connection or bridge between these resources in a single dataset or across multiple datasets.
In many cases centralizing all needed linked data in a certain (specialized) repository or index to be able to run the algorithm is not possible or at least not desired. To address this, we propose an approach to top-k shortest pathfinding, which optimally translates a pathfinding query into se- quences of triple pattern fragment requests.
Triple Pattern Fragments were recently introduced as a solution to address the availability of data on theWeb and the scalability of linked data client applications, preventing data processing bottlenecks on the server.
The results are streamed to the client, thus allowing clients to do asynchronous processing of the top-k shortest paths.
We explain how this approach behaves using a training dataset, a subset of DBpedia with 10 million triples, and show the trade-offs to a SPARQL approach where all the data is gathered in a single triple store on a single machine.
Furthermore we investigate the scalability when increasing the size of the subset up to 110 million triples.
Software Package (NodeJS): npmjs.com/package/everything_is_connected_engine
Discovering Meaningful Connections between Resources in the Web of DataLaurens De Vocht
Slides of LDOW2013 presentation, May 14th, Rio De Janeiro, Brazil
We will show that semantically annotated paths lead to discovering meaningful, non-trivial relations and connections between multiple resources in large online datasets such as the Web of Data. Graph algorithms have always been key in pathfinding applications (e.g., navigation systems). They make optimal use of available computation resources to find paths in structured data. Applying these algorithms to Linked Data can facilitate the resolving of complex queries that involve the semantics of the relations between resources. In this paper, we introduce a new approach for finding paths in Linked Data that takes into account the meaning of the connections and also deals with scalability. An efficient technique combining pre-processing and indexing of datasets is used for finding paths between two resources in largedatasets within a couple of seconds. To demonstrate our approach, we have implemented a testcase using the DBpedia dataset.
A Framework Concept for Profiling Researchers on Twitter using the Web of DataLaurens De Vocht
Based upon findings and results from our recent research we propose a generic frame-
work concept for researcher profiling with appliance to the areas of ”Science 2.0” and ”Research 2.0”. Intensive growth of users in social networks, such as Twitter generated a vast amount of information. It has been shown in many previous works that social networks users produce valuable content for profiling and recommendations. Our research focuses on identifying and locating experts for specific research area or topic. In our approach we apply semantic technologies like (RDF, SPARQL), common vocabularies (SIOC , FOAF, MOAT, Tag Ontology) and Linked Data (GeoNames , COLINDA).
Aligning Web Collaboration Tools with Research Data for ScholarsLaurens De Vocht
Resources for research are not always easy to explore, and
rarely come with strong support for identifying, linking and
selecting those that can be of interest to scholars. In this
work we introduce a model that uses state-of-the-art semantic technologies to interlink structured research data and data from Web collaboration tools, social media and Linked Open Data. We use this model to build a platform that connects scholars, using their proles as a starting point to explore novel and relevant content for their research. Scholars can easily adapt to evolving trends by synchronizing new social media accounts or collaboration tools and integrate then with new datasets. We evaluate our approach by a scenario of personalized exploration of research repositories where we analyze real world scholar profiles and compare them to a reference profile.
Providing Interchangeable Open Data to Accelerate Development of Sustainable ...Laurens De Vocht
Travelers expect access to tourism information at anytime, anywhere, with any media. Mobile tourism guides, accessible via the Web, provide an omnipresent approach to this. Thereby it is expensive and not trivial to (re)model, translate and transform data over and over. This inhibits many players, including governments, in developing such applications. We report on our experience in running a project on mobile tourism in Flanders, Belgium where we develop a methodology and reusable formalization for the data disclosure. We apply open data standards to achieve a reusable and interoperable datahub for mobile tourism. We organized working groups resulting in a re-usable formal specification and serialization of the domain model that is immediately usable for building mobile tourism applications. This increased the awareness and lead to semantic convergence which is forming a regional foundation to develop sustainable mobile guides for tourism.
Big Linked Data ETL Benchmark on Cloud Commodity HardwareLaurens De Vocht
Linked Data storage solutions often optimize for low latency querying and quick responsiveness. Meanwhile, in the back-end, offline ETL processes take care of integrating and preparing the data. In this paper we explain a workflow and the results of a benchmark that examines which Linked Data storage solution and setup should be chosen for different dataset sizes to optimize the cost-effectiveness of the entire ETL process. The benchmark executes diversified stress tests on the storage solutions. The results include an in-depth analysis of four mature Linked Data solutions with commercial support and full SPARQL 1.1 compliance. Whereas traditional benchmarks studies generally deploy the triple stores on premises using high-end hardware, this benchmark uses publicly available cloud machine images for reproducibility and runs on commodity hardware. All stores are tested using their default configuration. In this setting Virtuoso shows the best performance in general. The other tree stores show competitive results and have disjunct areas of excellence. Finally, it is shown that each store’s performance heavily depends on the structural properties of the queries, giving an indication of where vendors can focus their optimization efforts.
Examples of how Blueback tools can expand and enhance your Petrel workflowMitch Sutherland
A poster designed to give you a flavour of some of the cool Petrel plug-ins available from Blueback Reservoir. For exploration, interpretation, inversion, QI, data analysis, geomodeling & project management.
Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked DataLaurens De Vocht
Path-based storytelling with Linked Data on the Web provides users the ability to discover concepts in an entertaining and educational way. Given a query context, many state-of-the-art pathfinding approaches aim at telling a story that coincides with the user’s expectations by investigating paths over Linked Data on the Web. By taking into account serendipity in storytelling, we aim at improving and tailoring existing approaches towards better fitting user expectations so that users are able to discover interesting knowledge without feeling unsure or even lost in the story facts. To this end, we propose to optimize the link estimation between - and the selection of facts in a story by increasing the consistency and relevancy of links between facts through additional domain delineation and refinement steps. In order to address multiple aspects of serendipity, we propose and investigate combinations of weights and heuristics in paths forming the essential building blocks for each story. Our experimental findings with stories based on DBpedia indicate the improvements when applying the optimized algorithm.
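One plausible shape for such weighted heuristics is a linear score over a path's links, trading off link rarity (serendipity) against topical relevance. The function, weights, and lookup maps below are illustrative assumptions, not the paper's actual estimators.

```python
def path_score(path, rarity, relevance, w_rare=0.5, w_rel=0.5):
    """Score a path (a list of linked resources) as a weighted combination
    of average link rarity and average node relevance.

    Illustrative sketch only; the paper combines its own weights and
    heuristics over Linked Data paths."""
    if len(path) < 2:
        return 0.0
    hops = list(zip(path, path[1:]))
    rare = sum(rarity.get(hop, 0.0) for hop in hops) / len(hops)
    rel = sum(relevance.get(node, 0.0) for node in path) / len(path)
    return w_rare * rare + w_rel * rel

# Toy example: two candidate paths between the same endpoints.
rarity = {("A", "B"): 0.9, ("B", "D"): 0.8, ("A", "C"): 0.2, ("C", "D"): 0.3}
relevance = {"A": 1.0, "B": 0.6, "C": 0.9, "D": 1.0}
surprising = path_score(["A", "B", "D"], rarity, relevance)
obvious = path_score(["A", "C", "D"], rarity, relevance)
```

With these toy weights, the rarer path scores higher, which is the behaviour a serendipity-aware heuristic aims for while the relevance term keeps the story consistent.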
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Laurens De Vocht
Linked Data offers an entity-based infrastructure to resolve indirect relations between resources, expressed as chains of links. If we could benchmark how effective retrieving chains of links from these sources is, we could motivate why they are a reliable addition to exploratory search interfaces. A vast number of applications could reap the benefits of such insights, especially knowledge discovery tasks related, for instance, to ad-hoc decision support and digital assistance systems. In this paper, we explain a benchmark model for evaluating the effectiveness of associating chains of links with keyword-based queries. We illustrate the benchmark model with an example case using academic library and conference metadata, where we measured precision involving targeted expert users and directed it towards search effectiveness. This kind of typical semantic search engine evaluation, focusing on information retrieval metrics such as precision, is typically biased towards the final result only. However, in an exploratory search scenario, the dynamics of the intermediary links that could lead to potentially relevant discoveries are not to be neglected.
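The precision measurement described above can be sketched as precision-at-k over expert-judged chains. Identifying a chain by its endpoint pair is a simplification for illustration; the real benchmark judges full chains of links.

```python
def precision_at_k(retrieved_chains, relevant, k):
    """Fraction of the top-k retrieved chains judged relevant by experts.

    `relevant` is the set of chains the expert users marked as relevant;
    chains are identified here by their endpoint pair for brevity."""
    top_k = retrieved_chains[:k]
    hits = sum(1 for chain in top_k if chain in relevant)
    return hits / k if k else 0.0

# Toy judgement set over three retrieved chains.
retrieved = [("paperA", "confX"), ("paperB", "confX"), ("paperC", "confY")]
relevant = {("paperA", "confX"), ("paperC", "confY")}
p3 = precision_at_k(retrieved, relevant, 3)  # 2 of 3 chains are relevant
```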
Each government level uses its own information system. At the same time, citizens expect these government levels to adopt a user-centric approach and provide instant access to their data or to open government data. Therefore the applications at various government levels need to be interoperable in support of the ‘once-only’ principle: data is entered and registered only once and then reused. Given government budget constraints and the cost and complexity of (re)modeling, translating and transforming data over and over, public administrations need to reduce interoperability costs. This is achieved by semantically aligning information between the different information systems of each government level. Semantically interoperable systems facilitate citizen-centered e-government services. This paper illustrates how the Open Standards for Linked Organizations program (OSLO) paved the way bottom-up from a broad basis of stakeholders towards a government-endorsed strategy. OSLO applied a generic process and methodology and provided practical insights on how to overcome the encountered hurdles: political support and adoption, and reaching semantic agreement. The lessons learned in the region of Flanders (Belgium) can speed up the process in other countries that face the complexity of integrating information-intensive processes between different applications, administrations and government levels.
Oil 101: Introduction to Oil and Gas - UpstreamEKT Interactive
Oil 101: Introduction to Oil and Gas - Upstream
What is Upstream? This Upstream content is derived from our Oil 101 Upstream ebook and can be found in our oil and gas learning community.
This Upstream module includes the following sections (use the links below for quick access):
-Introduction to Upstream
-Upstream Business Characteristics
-Oilfield Services
-Reserves – Formation and Importance
-Production – The First Step in Adding Value
-The Unconventional Future of Upstream
Upstream
What is Upstream? Most oil and gas companies’ business structures are segmented and organized according to business segment, assets, or function.
The upstream segment of the business is also known as the exploration and production (E&P) sector because it encompasses activities related to searching for, recovering and producing crude oil and natural gas.
The upstream segment is all about wells: where to locate them; how deep and how far to drill them; and how to design, construct, operate and manage them to deliver the greatest possible return on investment with the lightest, safest and smallest operational footprint.
Exploration
The exploration sector involves obtaining a lease and permission to drill from the owners of onshore or offshore acreage thought to contain oil or gas, and conducting necessary geological and geophysical (G&G) surveys required to explore for (and hopefully find) economic accumulations of oil or gas.
Drilling
There is always uncertainty in the geological and geophysical survey results. The only way to be sure that a prospect is favorable is to drill an exploratory well. Drilling is physically creating the “borehole” in the ground that will eventually become an oil or gas well. This work is done by rig contractors and service companies in the Oilfield Services business sector.
Production
The production sector of the upstream segment maximizes recovery of petroleum from subsurface reservoirs.
Introduction to Oil and Gas Industry from Upstream (Exploration & Production), Midstream (Transportation & Storage), to Downstream (Refining, Petrochemical, & Marketing)
A sponsored supplement produced for Jisc on how researchers can cope with the data deluge of modern research techniques. Published by Times Higher Education on 25 November 2009
Keynote presentation by Professor Carole Goble at BOSC (Bioinformatics Open Source Conference) Long Beach, California, USA, July 14 2012. Co-located with ISMB, Intelligent Systems in Molecular Biology
Linked Data and Semantic Technologies can support a next generation of science. This talk shows examples of discovery, access, integration, analysis, and shows directions towards prediction and vision.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We also held a lovely workshop with the participants, trying to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
By Design, not by Accident - Agile Venture Bolzano 2024
Researcher Profiling based on Semantic Analysis in Social Networks
1. Researcher Profiling based on Semantic Analysis in Social Networks
Laurens De Vocht
Supervisors: Gonzalo Parra, Selver Softic
Promotors: Erik Duval, Martin Ebner
July 1, 2011
4. Definitions
Profiling
“Inferring unobservable information about users from observable information about them, that is their actions or their utterances.” (Zukerman and Albrecht, 2001)
Semantic Analysis
“A technique using semantic-based tools and ontologies in order to gain a deeper understanding of the information being stored and manipulated in an existing system” (McComb, 2004)
5. Problem Statement
Web users generate a massive unstructured information flow.
Who has scientific information relevant for me?
6. Problem Statement
Connecting researchers based on shared scientific events (conferences)
[Diagram: a Profiler/Analyzer combines a User Model of the researcher (user) and other researchers with an Event Model of scientific conferences and resources to produce a scientific profile.]
7. The Social Semantic Web
[Diagram (Gruber, 2007), built up over slides 7–11: on the Social Web side (human process), a community of researchers with conference experience produces semi-structured information through (micro)blogging, sharing, tagging and discussion. On the Semantic Web side (machine process), a system with a (faceted) search engine and a recommendation engine delivers clustered and analyzed data to a larger population of people interested in scientific conferences.]
12. The Social Semantic Web
‣ Hashtags as identifiers
‣ not always strong or consistent enough
‣ properties of good hashtags formalized
‣ helpful in assessment of valuable identifiers
(Laniado and Mika, 2007)
‣ Expert search/profiling with Linked Data
‣ aggregate and analyze certain types of data
‣ need to surpass limits of closed data sets
‣ LOD delivers multi-purpose data
(Stankovic et al., 2010)
13. Scope & Value of the Study
‣ Bridging research areas: Human-Computer Interaction & Semantic Analysis
‣ Mining usable data out of social networks (microblogs)
‣ Integration: social network data and Linked Open Data
‣ Framework-driven methodology based upon current state-of-the-art semantic tools
‣ Evaluation: proof-of-concept Research 2.0 application
14. Solution
[Diagram: data from social networks is annotated with community-approved ontologies (FOAF, SIOC) and interlinked with Linked Open Data; the Scientific Profiling Framework connects people and resources that share scientific affinities and supports applications built on top of it.]
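One plausible way to quantify the "affinity" between two researchers is the Jaccard overlap of the entities (tags, conferences, mentions) extracted from their posts. The thesis may weight entity types differently, so this is only a minimal sketch under that assumption.

```python
def affinity(entities_a, entities_b):
    """Jaccard overlap of the entity sets extracted from two users' posts.

    A minimal sketch of the 'affinity' concept; the actual framework may
    weight topics, events, locations and mentions differently."""
    a, b = set(entities_a), set(entities_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Toy example: two researchers sharing a conference hashtag and a topic.
alice = {"#www2011", "linkeddata", "sparql"}
bob = {"#www2011", "linkeddata", "hci"}
score = affinity(alice, bob)  # 2 shared entities of 4 total
```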
16. Framework: Overview
[Diagram, built up over slides 16–18: inputs are social networks (Twitter, archived via Grabeeter), the Linked Open Data Cloud (DBpedia, CoLinDa, GeoNames) and archived/cached Linked Data. The framework Aggregates and Annotates this data, Interlinks it into a semantic profiling network, Analyses it via the Scientific Profiling API, and Publishes scientific information in JSON or RDF (XML) output formats.]
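The aggregate/annotate/interlink/publish stages can be sketched as a toy pipeline. The function names, the hashtag-only entity extraction, and the `lod_index` mapping are illustrative assumptions; the real framework annotates with FOAF/SIOC ontologies and resolves entities against DBpedia, CoLinDa and GeoNames via SPARQL.

```python
import json

def aggregate(sources):
    """Collect raw posts from social sources (e.g. Twitter via Grabeeter)."""
    return [post for source in sources for post in source]

def annotate(posts):
    """Extract hashtags as candidate entities; the real framework maps
    content to FOAF/SIOC terms."""
    return [{"text": p, "tags": [w for w in p.split() if w.startswith("#")]}
            for p in posts]

def interlink(annotated, lod_index):
    """Attach Linked Open Data URIs; `lod_index` stands in for lookups
    against DBpedia, CoLinDa and GeoNames."""
    for post in annotated:
        post["links"] = [lod_index[t] for t in post["tags"] if t in lod_index]
    return annotated

def publish(annotated):
    """Serialize to a JSON output format like the one the API exposes."""
    return json.dumps(annotated)

# Hypothetical conference URI for illustration only.
lod_index = {"#eswc2011": "http://example.org/conf/eswc2011"}
out = publish(interlink(annotate(aggregate([["great talk #eswc2011"]])), lod_index))
```

Keeping each stage a pure function over the previous stage's output mirrors the framework's pipeline shape, where each step can be cached or replaced independently.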
32. Evaluation: Usability
‣ Definitely useful application
‣ Use of the map view makes sense
‣ People/Event split confusing
‣ View of own profile
‣ not a suitable starting point
‣ only useful in comparison
‣ shouldn't always be visible
‣ Person-specific affinities
‣ too much hidden
35. Evaluation: Usefulness
‣ Relevance: test users rate their search results
‣ Satisfaction questionnaire: targeted questions about usefulness; allows comments on the user interface
36. Evaluation: Usefulness
[Bar chart: number of users (0–4) reporting each share of relevant results, from 0% (None), 1–20% (A few), 21–40% (Less than one half), 41–60% (About one half), 61–80% (More than one half) and 81–99% (Almost all) to 100% (All).]
37. Evaluation: Usefulness (Questionnaire Results)
[Chart: statements rated on a 1–5 agreement scale:]
‣ Concept affinity
‣ Clear view of affinities between people
‣ Map & plot combination understood
‣ Deactivating filter fast enough
‣ Activating filter fast enough
‣ Never usability glitches
‣ Convention between views understood
‣ Information display not overwhelming (confusing)
‣ Relevant detailed person info
‣ Shown details correspond with ‘real life’ activities
‣ Enough relevant (new) persons
‣ Daily updating of information obvious
‣ Twitter data made more useful for researchers
38. Evaluation: Discussion
‣ Affinities exposed in an engaging way
‣ Relevant users rating: either many common entities trigger a positive rating, or common entities start a deeper investigation
‣ Reliability of person details hard to verify
‣ UI satisfaction is user dependent
‣ What does the user expect from an “Affinity Browser”?
‣ Test different scenarios to identify usage types?
39. Future Work
‣ Rank tags by importance, not just frequency of use
‣ Visualization: improve viewing of links between users and entities
‣ Multiple resources: better reliability and more verification of data
40. Conclusion
‣ The framework could support many social semantic-based applications
‣ Realized with current state-of-the-art technologies
‣ Interlinking with the Linked Open Data Cloud enriches social network data
‣ Researcher Affinity Browser
‣ exposes affinities between users
‣ user feedback positively affirms this new view on social data
‣ hashtags identified as conferences provide consistent links
Editor's Notes
Results: so far
To make progress in research it is important to get in touch and share ideas with people who share affinities. One of the most visible trends on the internet is the emergence of “Social Web” sites. Current online community sites are isolated from one another. The main reason for this lack of interoperability is the fact that common standards for data interchange still have to arise.
We propose a framework to address an important issue in the context of the ongoing adoption of the “Web 2.0” in science and research, often referred to as “Science 2.0” or “Research 2.0”. A growing number of people are linked via acquaintances, and online social networks such as Twitter allow indirect access to a huge amount of ideas. These ideas are contained in a massive human information flow. That users of these networks produce relevant data has been shown in many studies. The problem, however, lies in discovering and verifying such a stream of unstructured data items. Another related problem is locating an expert who could provide an answer to a very specific research question.
The goal is to build a semantic profiling framework that can support applications and services that try to improve the connecting of researchers. The main use case and application that the framework has to support is illustrated by what could be called “the conference case”. Scientists and researchers are interested in very specific topics; this is best verified by the conferences they are attending. Another trend is that they all blog and tweet about these events [14][10]. This creates huge opportunities for profiling. The attendees tweet about what they notice and what they remark as interesting for their own projects. What if we could connect these users using this information? We could call an application that does just that “Scientific Profiling”. This approach comes from the concept that the data produced in social networks can have true value if properly annotated and interlinked [5]. A second requirement is to create a suitable context in which this information can get meaning. This is very important to identify which ontologies should be used.
Social Semantic Web Application: A Collective Knowledge System.
The essential difference between the classic Web and the Semantic Web is that structured data is exposed in a structured way. For example, the classic Web might have a document that mentions a place, "Paris". The conventional way to find this document on the Web is to search for the term "Paris" in a search engine. Similarly, to find out more about the place one would plow through the search results on the term "Paris" and manually pick out the pages that seem to have something to do with the place. The heuristics employed by today's search engines for inferring what one means by the string "Paris" are biased by popularity, which means that one will encounter many pages about a celebrity heiress en route to the French capital.
The Semantic Web vision is to point to a representation of the entity, in this case a city, rather than its surface manifestation. Thus to find the city Paris, one would search for things known to be cities for entities whose names match "Paris", possibly limiting the results to cities of a certain size or in a particular country. Then one might look for information of the desired type about the city, such as maps, travel guides, restaurants, or famous people who lived in Paris during some period of history. The heuristics for searching the Semantic Web depend on conventions about how to represent things like cities (such as those specified in ontologies), and the availability of data which use these conventions. Such data is not available for most user contributions in the Social Web. To move to the next level of collective knowledge systems, it would be nice to get the benefits of structured data from the systems that give rise to the Social Web.
Gruber argues that the Social Web and the Semantic Web should be combined, and that collective knowledge systems are the "killer applications" of this integration.
The keys to getting the most from collective knowledge systems, toward true collective intelligence, are tightly integrating user-contributed content and machine-gathered data, and harvesting the knowledge from this combination of unstructured and structured information.
Social Semantic Web Application - A Collective Knowledge System.\nThe essential difference between the classic Web and the Semantic Web is that structured data is exposed in a structured way.  For example, the classic Web might have a document that mentions a place, "Paris".  The conventional way to find this document on the Web is to search for the term "Paris" in a search engine.  Similarly, to find out more about the place one would plow through the search results on the term "Paris" and manually pick out the pages that seem to have something to do with the place.  The heuristics employed by today's search engines for inferring what one means by the string "Paris" are biased by popularity, which means that one will encounter many pages about a celebrity heiress en route to the French capital.\nThe Semantic Web vision is to point to a representation of the entity, in this case a city, rather than its surface manifestation. Thus to find the city Paris, one would search for things known to be cities for entities whose names match "Paris", possibly limiting the results to cities of a certain size or in a particular country. Then one might look for information of the desired type about the city, such as maps, travel guides, restaurants, or famous people who lived in Paris during some period of history.  The heuristics for searching the Semantic Web depend on conventions about how to represent things like cities (such as those specified in ontologies), and the availability of data which use these conventions.  Such data is not available for most user contributions in the Social Web. To move to the next level of collective knowledge systems, it would be nice to get the benefits of structured data from the systems that give rise to the Social Web.\nGruber argues that the Social Web and the Semantic Web should be combined, and that collective knowledge systems are the "killer applications" of this integration.  
The keys to getting the most from collective knowledge systems, toward true collective intelligence, are tightly integrating user-contributed content and machine-gathered data, and harvesting the knowledge from this combination of unstructured and structured information.\n
Social Semantic Web Application - A Collective Knowledge System.\nThe essential difference between the classic Web and the Semantic Web is that structured data is exposed in a structured way.  For example, the classic Web might have a document that mentions a place, "Paris".  The conventional way to find this document on the Web is to search for the term "Paris" in a search engine.  Similarly, to find out more about the place one would plow through the search results on the term "Paris" and manually pick out the pages that seem to have something to do with the place.  The heuristics employed by today's search engines for inferring what one means by the string "Paris" are biased by popularity, which means that one will encounter many pages about a celebrity heiress en route to the French capital.\nThe Semantic Web vision is to point to a representation of the entity, in this case a city, rather than its surface manifestation. Thus to find the city Paris, one would search for things known to be cities for entities whose names match "Paris", possibly limiting the results to cities of a certain size or in a particular country. Then one might look for information of the desired type about the city, such as maps, travel guides, restaurants, or famous people who lived in Paris during some period of history.  The heuristics for searching the Semantic Web depend on conventions about how to represent things like cities (such as those specified in ontologies), and the availability of data which use these conventions.  Such data is not available for most user contributions in the Social Web. To move to the next level of collective knowledge systems, it would be nice to get the benefits of structured data from the systems that give rise to the Social Web.\nGruber argues that the Social Web and the Semantic Web should be combined, and that collective knowledge systems are the "killer applications" of this integration.  
The keys to getting the most from collective knowledge systems, toward true collective intelligence, are tightly integrating user-contributed content and machine-gathered data, and harvesting the knowledge from this combination of unstructured and structured information.\n
Laniado and Mika found that not all hashtags are used in the same way: not all of them aggregate messages around a community or a topic, not all of them endure in time, and not all of them have an actual meaning. In their work they addressed the issue of evaluating Twitter hashtags as strong identifiers, as a first step towards bridging the gap between Twitter and the Semantic Web. The first contribution of their paper is the formalization of the problem and the elaboration of a number of desired properties for a good hashtag to serve as a URI: frequency, specificity, consistency in usage, and stability over time. Based on these data, they tested the results obtained with the algorithms described in their paper, showing how a combination of the proposed measures can help in the task of assessing which tags are more likely to represent valuable identifiers. These results are promising with respect to anchoring Twitter hashtags to Semantic Web URIs and to detecting concepts and entities worth treating as new identifiers.

The authors concluded that expert search and profiling systems aggregate and analyze certain types of data depending on the types of expertise hypotheses they use. Traditional approaches tend to retrieve their data from closed or limited corpora. LOD, on the other hand, allows querying the whole Web like a huge database, thus surpassing the limits of closed data sets and closed online communities. They believe that this opens new possibilities for traditional expert search and profiling systems, which usually rely only on data from their local and limited databases or on unstructured data gathered from the Web. LOD also holds great promise to deliver multi-purpose data that can be used to find experts in many domains and under many different expertise hypotheses. In that paper they explored the potentials and drawbacks of LOD in comparison to traditional data sources used for expert search. They did not only ask what LOD can do, but also what one can do for LOD to make it an even better source of expertise evidence.
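The four desired properties named by Laniado and Mika can be approximated with simple counting. The following sketch uses illustrative proxies, not their exact formulas; the tuple layout of the input is an assumption made for this example:

```python
from collections import Counter

def hashtag_scores(tweets, tag):
    """Illustrative proxies for the four properties of a hashtag that
    could serve as an identifier: frequency, specificity, consistency,
    and stability. `tweets` is a list of (day, tag_set, word_set) tuples;
    this layout is an assumption for the sketch, not the paper's format."""
    with_tag = [t for t in tweets if tag in t[1]]
    # frequency: how often the tag is used at all
    frequency = len(with_tag) / len(tweets)
    # specificity: how rarely the bare word appears without the '#'
    word_uses = sum(1 for t in tweets if tag in t[2])
    specificity = len(with_tag) / ((len(with_tag) + word_uses) or 1)
    # consistency: how concentrated the co-occurring tags are
    co_tags = Counter(o for t in with_tag for o in t[1] if o != tag)
    consistency = (co_tags.most_common(1)[0][1] / len(with_tag)) if co_tags else 0.0
    # stability: fraction of observed days on which the tag was used
    days = {t[0] for t in tweets}
    stability = len({t[0] for t in with_tag}) / len(days)
    return dict(frequency=frequency, specificity=specificity,
                consistency=consistency, stability=stability)
```

A tag like a conference acronym would score high on specificity (it is rarely used as a plain word) and, during the event, high on frequency, which is what makes such tags good URI candidates.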
The study spans two main areas: semantic analysis and usability. The current state-of-the-art Semantic Web standards and processes are used as a foundation for this study. Researcher profiling applications integrate human-computer interaction (HCI) and expert finding. Anyone interested in the Semantic Web, microblogging, or profiling may find parts of this thesis relevant.

The approach presented aims at gaining more knowledge and mining usable data out of social networks, especially microblogs, with a framework-driven methodology based on Semantic Web standards and tools. Introducing the interesting aspects of microblogs, this thesis tries to answer how far they correspond with ideas from other research areas such as Science 2.0, Research 2.0, the Semantic Web, and Linked Data, and to outline the importance and relevance of such efforts with examples and arguments from current research and current work.

Note that neither the literature study nor the software architecture aims to give a broad overview of current Semantic Web and microblogging services. It is a carefully considered selection of articles that allows the development of researcher profiling applications. The architecture of the framework is designed with only the problem statement in mind. At this time it is not part of the research to find out how this could be extended to other resources or targets (e.g. mobile applications). It focuses on the integration of user data from a microblogging service with domain knowledge about scientific conferences.
The Semantic Web technology stack is well defined, and applying frameworks such as SIOC (Semantically-Interlinked Online Communities) [4] and FOAF (Friend-Of-A-Friend) [2] can lead to an interlinked and semantically rich knowledge source. This knowledge source will be built from user profiles and the content they produce on various social networks.

Twitter contains information on: people, organisations, locations, trends, ...
The LOD Cloud contains billions of triples about: geolocations, science, government, common knowledge, persons, news, ...
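Annotating a microblog post with these vocabularies can be sketched as plain triple generation. The SIOC and FOAF namespaces below are the published ones; the instance URIs (example.org) and the function name are placeholders for this illustration:

```python
# A minimal sketch of annotating one tweet with SIOC and FOAF terms.
# Namespaces are real; the example.org instance URIs are placeholders.
SIOC = "http://rdfs.org/sioc/ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF  = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def tweet_to_ntriples(user, tweet_id, text):
    """Return N-Triples lines describing one microblog post."""
    post = f"<http://example.org/tweet/{tweet_id}>"
    account = f"<http://example.org/user/{user}>"
    return [
        f"{post} <{RDF}type> <{SIOC}Post> .",
        f'{post} <{SIOC}content> "{text}" .',
        f"{post} <{SIOC}has_creator> {account} .",
        f"{account} <{RDF}type> <{SIOC}UserAccount> .",
        f'{account} <{FOAF}accountName> "{user}" .',
    ]
```

Once posts are expressed this way, they can be loaded into a triple store and joined against LOD Cloud data with SPARQL.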
Results so far
The idea is to design, develop and implement a framework that collects data from social networks and uses community-approved ontologies and linked open data to analyze and verify the data.
Aggregate your tweets and search in your tweets offline using the Grabeeter client. Grabeeter [45] is an application that allows you to search the tweets of a single Twitter user online and offline. In contrast to the Twitter API, Grabeeter provides all stored tweets and imposes no restriction over time. The Grabeeter web application uses the Twitter API to retrieve the tweets of predefined users. Tweets are stored in the Grabeeter database and on the file system as an Apache Lucene [2] index. In order to ensure an efficient search, tweets must be indexed.
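The role of the Lucene index can be illustrated with a toy inverted index. This is a stand-in sketch, not Grabeeter's actual implementation; class and method names are invented for the example:

```python
from collections import defaultdict

class TweetIndex:
    """A toy in-memory inverted index, standing in for the Apache Lucene
    index that makes a user's stored tweets searchable offline."""
    def __init__(self):
        self.postings = defaultdict(set)   # term -> set of tweet ids
        self.tweets = {}                   # tweet id -> original text

    def add(self, tweet_id, text):
        self.tweets[tweet_id] = text
        for term in text.lower().split():
            self.postings[term].add(tweet_id)

    def search(self, query):
        """Return all tweets containing every query term (AND semantics)."""
        terms = query.lower().split()
        if not terms:
            return []
        ids = set.intersection(*(self.postings[t] for t in terms))
        return [self.tweets[i] for i in sorted(ids)]
```

The point of indexing at write time is exactly the one made above: without it, every search would have to scan every stored tweet.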
The semantic profiling framework has to support a scientific profiling application as explained in the problem statement in chapter 1. The framework architecture consists of three layers:
1. Extraction layer: extracts data from various resources and annotates it using relevant ontologies for that specific data context.
2. Interlinking layer: is fed with annotated data (triples) and creates a SPARQL endpoint for it. It is responsible for requesting more data if needed for a certain information query. It parses high-level queries, translates them to SPARQL queries, and returns the results.
3. Analysis layer: here a user's information needs are translated into high-level queries that the interlinking layer understands. It also contains some metrics to rank and evaluate the returned results.
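The translation step in the Interlinking layer can be sketched as template expansion from a high-level query name to SPARQL. The template below is an assumption: the SIOC properties are real, but the sameAs-based link from tags to conferences and the query name are illustrative, not the framework's exact queries:

```python
# Sketch: mapping a high-level query plus parameters to a SPARQL string.
# The query name and the tag->conference linking pattern are illustrative.
TEMPLATES = {
    "conferences_of_user": """
SELECT DISTINCT ?conference WHERE {{
  ?post <http://rdfs.org/sioc/ns#has_creator> <{user}> ;
        <http://rdfs.org/sioc/ns#topic> ?tag .
  ?tag  <http://www.w3.org/2002/07/owl#sameAs> ?conference .
}}
""",
}

def translate(query_name, **params):
    """Expand a high-level query name into a concrete SPARQL query."""
    return TEMPLATES[query_name].format(**params)
```

The Analysis layer would call `translate("conferences_of_user", user=...)` and hand the resulting string to the SPARQL endpoint, keeping application code free of raw SPARQL.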
The user profile
Related entities for a user
Suggested conferences for a user
Suggested users & info for a specific event
The test application is deployed on the Google App Engine server. This makes maintenance straightforward and deployment simple.

Grabeeter consists of several scripts that crawl the registered users' Twitter accounts. Everything is stored in a MySQL database. Requests from the semantic profiling network query the Grabeeter MySQL database directly.

The semantic profiling server has several scripts that maintain the high-level functionality. Two scripts run periodically to keep the linked data network up to date; other scripts realize the API functionality.

The "provider" script checks the Grabeeter database for new users. If there are new users, their data is passed on to the Extraction module for annotation and triplification. For existing users, new tweets are fetched and triplified.

The "interlinking" script goes through all tags and first compares them against the CoLinDa repository. Any found conference tags are annotated appropriately. Secondly, the script checks whether tags represent a location or a common knowledge entity.

The scripts "person", "event" and "discovery" implement the API functionality. They use the arguments given by the REST call. Each call returns a JSON object containing the result. The script "allusers" returns a JSON array that contains all users currently in the system.
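The JSON contract of such an API call can be sketched as follows. The field names are a hypothetical shape for the "person" call, not the framework's exact schema:

```python
import json

def person_response(user, conferences, affinities):
    """Build the JSON object returned by a hypothetical 'person' API call.
    Field names are illustrative, not the framework's actual schema."""
    payload = {
        "user": user,                 # the queried user name
        "conferences": conferences,   # conference tags linked to the user
        "affinities": affinities,     # e.g. {"other_user": score}
    }
    return json.dumps(payload)

# A client would parse the response back into a dictionary:
resp = json.loads(person_response("alice", ["eswc2011"], {"bob": 0.4}))
```

Returning JSON keeps the REST API easy to consume from rich internet applications such as the Researcher Affinity Browser.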
Results so far
Avoid making the user interface itself an issue (keep it out of the way); the focus should be on the data. Fixed data: users are presented with a role and have to find a well-matching conference and expert.
Affinities as a starting point. Affinities are now facets to filter the result list of people. Instead of popup windows, tabs with details about each user appear at the bottom.
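One simple way to quantify such an affinity, and to use it as a facet filter, is set overlap between the entities two users produced. This is an illustrative sketch, not the thesis's exact affinity metric; the function names and data layout are assumptions:

```python
def affinity(entities_a, entities_b):
    """Jaccard overlap of the entity sets (tags, conferences, locations)
    two users produced; one plausible proxy for 'affinity'."""
    if not entities_a or not entities_b:
        return 0.0
    return len(entities_a & entities_b) / len(entities_a | entities_b)

def filter_by_facet(users, my_entities, facet, threshold=0.0):
    """Keep users whose entities for one facet overlap mine, mirroring
    the 'affinities as facets' interaction in the result list."""
    return [name for name, facets in users.items()
            if affinity(my_entities, facets.get(facet, set())) > threshold]
```

Raising the threshold narrows the result list, which is exactly what activating an affinity facet does in the demonstration application.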
Positive agreement among users:
1: Concept of affinity
3: Understandable combination with the affinity plot
7: Conventions between views understood
13: Twitter data is made more useful for researchers

No agreement among users:
2: Clear view of affinities between people
4, 5: Filter (de)activation
6: Never experienced usability glitches
8: Information display not overwhelming/confusing
12: Daily updates of information obvious
The more resources, the more types of entities can be interlinked to improve the verifiability of the results. The framework can easily be enriched with additional RDF resources; a new handle in the Interlinking module suffices. More effort is needed to add data from a source that is not yet available as RDF. In that case it is necessary to write an additional Model class for the Extraction module and a handle in the Annotator class that includes data from that module by annotating it appropriately. This process is completely comparable to the extraction of Twitter data presented in this thesis. On the high level, new functionality can easily be added by proper translation into SPARQL queries. As more data models and resources become available, it might be of interest to extend the API accordingly. Again, the same approach can be used as for the discovery and presentation of persons and scientific events.
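The Model-plus-Annotator extension pattern described above can be sketched as follows. All class, method, and source names here are illustrative assumptions, not the framework's actual code:

```python
# Sketch of the extension pattern: a new non-RDF source gets a Model
# class, and a handle registered with the Annotator turns its records
# into triples. All names here are illustrative.
class Model:
    """Base class: fetch raw records from some external source."""
    def records(self):
        raise NotImplementedError

class ConferenceFeedModel(Model):
    """Hypothetical new source that is not yet available as RDF."""
    def records(self):
        return [{"acronym": "eswc2011", "city": "Heraklion"}]

class Annotator:
    def __init__(self):
        self.handles = {}   # Model class -> record-to-triples function

    def register(self, model_cls, to_triples):
        self.handles[model_cls] = to_triples

    def annotate(self, model):
        to_triples = self.handles[type(model)]
        return [t for rec in model.records() for t in to_triples(rec)]

annotator = Annotator()
annotator.register(
    ConferenceFeedModel,
    lambda r: [f"<http://example.org/conf/{r['acronym']}> "
               f"<http://www.w3.org/2000/01/rdf-schema#label> "
               f"\"{r['acronym']}\" ."],
)
triples = annotator.annotate(ConferenceFeedModel())
```

Adding a further source then only requires another Model subclass and one `register` call, leaving the Interlinking and Analysis layers untouched.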
The framework serves as a powerful backend for a web service. In the requirements of the current framework we focused on the ability to extract, annotate and interlink data from Twitter and to make the linked data available as a SPARQL endpoint and a web service that allows high-level requests. The architecture is based on state-of-the-art technologies and brings in a novel approach to the usage and dissemination of knowledge accumulated in social networks. It uses semantic tools and techniques for application domains such as Research 2.0.

The web service behaves as a REST API and can support applications that want to propose interesting people or interesting scientific events to their users. It is possible to create an application that connects people who attend or mention the same scientific conference, as soon as both have made their social data available to the system. We have shown that the enrichment of social network data with linked data leads to a verifiable user profile that allows comparison with others alike.

The demonstration application introduces the concept "affinity". The concept has been used only a few times before, but for a similar purpose: to expose an otherwise hidden proximity to, or liking for, specific aspects. The usefulness of this approach and its presentation has been reviewed positively by test users from the target group, researchers. They appreciated the use of affinities. Their feedback confirmed what we learned from the literature study: the use of linked data shapes a whole new view on existing social data. By interlinking tags to scientific conferences we are able to display verified entities. We noted, for example, that the choice of hashtags led to enough identified conferences.