Should We Expect a Bang or a Whimper? Will Linked Data Revolutionize Scholar Authoring and Workflow Tools?
Jeff Baer, Senior Director of Product Management, Research Development Services, ProQuest
Exploration, visualization and querying of linked open data sources
Laura Po
Afternoon hands-on session talk at the second Keystone Training School, "Keyword search in Big Linked Data", held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
I will try to explain what question answering (QA) is, how we can get answers to questions posed in natural language, and how successful the field has been in that domain.
I have gained this knowledge from three proposed papers and from what I read around them.
The slides discuss the research agenda for search of the semantic web and the currently available search tools. The slides were prepared for an audience of information …
Open science can contribute to AI trustworthiness. This talk offers a categorization of scientific data platforms and a framing of AI trustworthiness, with pointers to open science contributions.
folksonomy, social tagging, tag clouds, automatic folksonomy construction, word clouds, wordle, context-preserving word cloud visualisation, CPEWCV, seam carving, inflate and push, star forest, cycle cover, quantitative metrics, realized adjacencies, distortion, area utilization, compactness, aspect ratio, running time, semantics in language technology
This 2-hour lecture was held at Amsterdam University of Applied Sciences (HvA) on October 16th, 2013. It gives a basic overview of core technologies used by ICT companies such as Google, Twitter, or Facebook. The lecture does not require a strong technical background and stays at a conceptual level.
From Search to Predictions in Tagged Information Spaces
Christoph Trattner
Tagging has gained tremendous popularity over the past few years. Looking into the literature on tagging, we find a lot of work on people's tagging motivation, their behavior, models that describe the folksonomy generation process, emergent semantic structures, etc., but interestingly we find rather little research showing the value of tags for searching an overloaded information space. Furthermore, there is a lot of literature on the tag or item prediction problem, but almost all of it looks at the issue from a data-driven perspective. To bridge this gap in the literature, we have conducted several in-depth studies showing the value of tags for lookup and exploratory search. We looked at the problem from a network-theoretic and an interface perspective, and we will show how useful tags are for searching. Furthermore, we reviewed the literature on memory processes from cognitive science and devised a number of novel recommender algorithms based on the ACT-R and MINERVA2 theories. We will show that these approaches not only predict tags and items extremely well, but also that they can explain the recommendation process better than current approaches.
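The abstract stops short of the underlying math, but the memory component at the heart of ACT-R-based recommenders is the standard base-level activation equation, B = ln(Σ_j t_j^(-d)). A minimal Python sketch under that assumption (function names, the decay rate, and the toy data are ours, not from the talk):

```python
import math

def base_level_activation(use_times, now, d=0.5):
    """ACT-R base-level activation: B = ln(sum_j (now - t_j) ** -d).
    Tags used frequently and recently are more 'active' in memory."""
    return math.log(sum((now - t) ** -d for t in use_times))

def rank_tags(tag_history, now, d=0.5):
    """Rank a user's tags by activation; tag_history maps each tag to the
    timestamps (in seconds) at which the user previously applied it."""
    return sorted(tag_history,
                  key=lambda t: base_level_activation(tag_history[t], now, d),
                  reverse=True)

# Toy example: 'python' was used often and recently, so it outranks 'cooking'.
now = 1_000_000.0
history = {"python": [now - 60, now - 3600, now - 86400],
           "cooking": [now - 30 * 86400]}
print(rank_tags(history, now))  # -> ['python', 'cooking']
```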
Recommending Tags with a Model of Human Categorization
Christoph Trattner
Social tagging involves complex processes of human categorization that have been the topic of much research in the cognitive sciences. In this paper we present a recommender approach for social tags whose principles are derived from some of the more prominent and empirically well-founded models from this research tradition. The basic architecture is a simple three-layer connectionist model. The input layer encodes patterns of semantic features of a user-specific resource, which are either latent topics elicited through Latent Dirichlet Allocation (LDA) or available external categories. The hidden layer categorizes the resource by matching the encoded pattern against already learned exemplar patterns. The latter are composed of unique feature patterns and associated tag distributions. Finally, the output layer samples tags from the associated tag distributions to verbalize the preceding categorization process. We have evaluated this approach on a real-world folksonomy gathered from Wikipedia bookmarks in Delicious. In the experiment our approach outperformed LDA, a well-established algorithm. We attribute this to the fact that our approach processes semantic information (either latent topics or external categories) across the three different layers, and this substantially enhances the recommendation performance. With this paper, we demonstrate that a theoretically guided design of algorithms not only holds potential for improving existing recommendation mechanisms, but it also allows us to derive more generalizable insights about how human information interaction on the Web is determined by both semantic and verbal processes.
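A minimal sketch of that three-layer flow as we read it from the abstract; the cosine matching rule, the similarity weighting, and all names are our assumptions, not the paper's exact model:

```python
import numpy as np

def recommend_tags(resource_topics, exemplars, n_tags=5, rng=None):
    """Input layer: LDA topic vector of the resource to be tagged.
    Hidden layer: match the pattern against learned exemplars, given as
    (topic_vector, {tag: count}) pairs. Output layer: sample tags from the
    similarity-weighted tag distributions."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = np.asarray(resource_topics, dtype=float)
    # Hidden layer: cosine similarity between the input and each exemplar.
    sims = np.array([x @ np.asarray(p, dtype=float) /
                     (np.linalg.norm(x) * np.linalg.norm(p) + 1e-12)
                     for p, _ in exemplars])
    weights = np.clip(sims, 0.0, None)
    if weights.sum() == 0.0:          # no similar exemplar: uniform fallback
        weights = np.ones_like(weights)
    weights = weights / weights.sum()
    # Output layer: pool tag distributions, weighted by exemplar similarity.
    pooled = {}
    for w, (_, tags) in zip(weights, exemplars):
        total = sum(tags.values())
        for tag, count in tags.items():
            pooled[tag] = pooled.get(tag, 0.0) + w * count / total
    names = list(pooled)
    probs = np.array([pooled[t] for t in names])
    probs = probs / probs.sum()
    k = min(n_tags, len(names))
    return list(rng.choice(names, size=k, replace=False, p=probs))

# Toy usage: two learned exemplars over a 3-topic space.
exemplars = [([0.9, 0.1, 0.0], {"python": 5, "code": 3}),
             ([0.0, 0.2, 0.8], {"recipes": 4, "food": 2})]
print(recommend_tags([0.8, 0.2, 0.0], exemplars, n_tags=2))
```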
Pistoia Alliance Harmonizing FAIR Data Catalog Approaches webinar
Pistoia Alliance
Multiple groups in the life sciences community have started their journey towards data FAIR-ification by implementing data catalogs, a clear first step towards finding your data. While in many cases the approaches are quite similar in both origin and intent, differing implementations could end up hampering interoperability and reuse. The Pistoia Alliance and the Linked Data Community of Practice hosted a panel discussion describing three implementations and their downstream goals:
[1] Pharma cross-omics data catalogs,
[2] Clinical data catalogs, and
[3] Bioschemas for dataset discoverability on the inter/intranet.
Semantic search helps business people find answers to pressing questions by wading through oceans of content to surface nuggets of meaningful information. In this presentation we’ll discuss how semantic search and content analysis technologies are starting to appear in the marketplace today. We’ll recap what semantic search is and what its key benefits are, then we’ll answer the following questions:
• Is semantic search a feature, an application, or an enterprise system?
• How can I add semantic search to my existing work processes?
• Will I need to replace my existing content technologies?
• What will I need to do to prepare my content for semantic search?
• Is semantic search just for documents or can I search my data too?
• Can I use semantic search to find information on the internet and other public data sources?
• Are there standards to consider?
Analysing & Improving Learning Resources Markup on the Web
Stefan Dietze
Talk at WWW2017 on LRMI adoption, quality and usage. Full paper here: http://papers.www2017.com.au.s3-website-ap-southeast-2.amazonaws.com/companion/p283.pdf.
The enterprise solution for all your data needs ...
Check out the advantages, the features, the reasons why, the strategy, ...
Don't hesitate to contact a GACP partner for more information about this amazing tool!
Social media as a tool for terminological research
TERMCAT
Anita Nuopponen - University of Vaasa
Niina Nissilä - University of Vaasa
VII EAFT Terminology Summit. Barcelona, 27-28 November 2014
Participatory Research Approaches With Disabled Students V3
Jane65
Seminar for the Higher Education Research Group at the University of Southampton that describes and evaluates the participatory methods used in LEXDIS, a research project exploring the e-learning experiences of disabled students.
VII Jornadas eMadrid "Education in exponential times": "Analysing and Altering MOOC Learners' Behaviours at Scale"
eMadrid network
Claudia Hauff, TU Delft, the Netherlands. 03/07/2017.
1. How do you describe the importance of data in analyticsC.docx
berthacarradice
1. How do you describe the importance of data in analytics? Can we think of analytics without data? Explain.
2. Considering the new and broad definition of business analytics, what are the main inputs and outputs to the analytics continuum?
3. Where do the data for business analytics come from? What are the sources and the nature of those incoming data?
4. What are the most common metrics that make for analytics-ready data?
Exercise Question:
Go to data.gov, a U.S. government-sponsored data portal that has a very large number of data sets on a wide variety of topics ranging from healthcare to education, climate to public safety. Pick a topic that you are most passionate about. Go through the topic-specific information and explanation provided on the site. Explore the possibilities of downloading the data, and use your favorite data visualization tool to create your own meaningful information and visualizations.
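For example, a minimal pandas/matplotlib sketch of that last step (the file name and the column names are placeholders for whatever data set you download):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder file and columns: substitute the CSV you downloaded from data.gov.
df = pd.read_csv("dataset.csv")
print(df.head())                     # inspect the raw records first
summary = df.groupby("category")["value"].mean().sort_values()
summary.plot(kind="barh", title="Mean value by category")
plt.tight_layout()
plt.show()
```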
Discussion:
Create a discussion thread (with your name) and answer the following question:
Discussion (Chapter 3): Why are the original/raw data not readily usable by analytics tasks? What are the main data preprocessing steps? List and explain their importance in analytics.
Note: The first post should be made by Wednesday 11:59 p.m., EST. I am looking for active engagement in the discussion. Please engage early and often.
Your response should be 250-300 words. Respond to two postings provided by your classmates.
There must be at least one APA-formatted reference (and APA in-text citation) to support the thoughts in the post. Do not use direct quotes; rather, rephrase the author's words and continue to use in-text citations.
F.A.T. City Video Analysis Content Define and Explain Fairness.docx
lmelaine
F.A.T. City Video Analysis Content: Define and Explain Fairness (20.0): Analysis comprehensively summarizes how Lavoie defines and explains fairness in the classroom.
F.A.T. City Video Analysis Content: Advice to Parents on Fairness (20.0): Analysis thoroughly describes Lavoie's advice to parents regarding fairness.
F.A.T. City Video Analysis Content: Assumptions (20.0): Analysis insightfully explains what Lavoie says about assumptions and why he discusses them.
F.A.T. City Video Analysis Content: Three Key Concepts (20.0): Analysis substantially summarizes three key concepts and includes realistic, thoughtful application to future professional practice.
Organization (10.0): The content is well organized and logical. There is a sequential progression of ideas related to each other. The content is presented as a cohesive unit and the audience is provided with a sense of the main idea.
Mechanics of Writing, including spelling, punctuation, grammar, and language use (10.0): Submission is virtually free of mechanical errors. Word choice reflects well-developed use of practice and content-related language. Sentence structures are varied and engaging.
Total Percentage: 100
English 2367 Detailed Outline Assignment:
A Detailed Outline for the Persuasive Research Essay
For this assignment, you are asked to start thinking about The Persuasive Research Essay you must write. To complete this assignment, please see the blank outline template below and submit it filled out with your own information/planning for your own persuasive research essay. This outline has a specific format, which is listed below with details, examples and a blank template for you to use/fill out with your topic. Your detailed outline submission must include all 3 sections listed: Topic Overview, Body Paragraphs and Conclusion. The final draft of your outline must be 2-3 pages.
1. Topic Overview: In this section, you should write up your introduction paragraph. This introduction paragraph should include:
· General information about the topic
· Background/context to help the reader understand both sides of the argument (list both sides of the argument as you see them)
· An overview of issues/points of view/ideas surrounding the topic
· Your thesis statement
A note on your thesis: Your thesis should make a statement that is supported by reasons: I believe this because of x, y and z reasons.
Example Thesis: Technology has positively influenced the business field because it has enhanced marketing, improved user interaction through advanced software programs, such as Microsoft Office, and it has helped make the work day more productive because of the invention of computers.
2. Body paragraphs/Sections: In this section, list at minimum 3 body paragraphs or sections. For each body paragraph, write up the topic sentence, and provide at least 1-2 things you’ll want to discuss in that paragraph. Then under each of the two things you’ll want to discuss, pick a source from your Annotat ...
UMAP2016 - Analyzing Aggregated Semantics-enabled User Modeling on Google+ an...
GUANGYUAN PIAO
In this paper, we study whether reusing Google+ profiles can provide reliable recommendations on Twitter to resolve the cold-start problem. Next, we investigate the impact of assigning different weights when aggregating user profiles from two OSNs, and show that giving a higher weight to the targeted OSN's profile yields the best performance in the context of a personalized link recommender system. Finally, we propose a user modeling strategy that combines entity- and category-based user profiles with a discounting strategy. Results show that our proposed strategy improves the quality of user modeling significantly compared to the baseline method.
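A minimal sketch of such a weighted aggregation; the convex-combination form, the value of alpha, and all names are our assumptions rather than the paper's exact scheme:

```python
def aggregate_profiles(target_profile, aux_profile, alpha=0.7):
    """Merge two interest profiles (dicts mapping an entity or category to a
    weight). alpha > 0.5 favours the targeted OSN, which the paper reports
    works best; here the target is the Twitter profile."""
    merged = {}
    for concept, w in target_profile.items():
        merged[concept] = merged.get(concept, 0.0) + alpha * w
    for concept, w in aux_profile.items():
        merged[concept] = merged.get(concept, 0.0) + (1.0 - alpha) * w
    total = sum(merged.values()) or 1.0
    return {c: w / total for c, w in merged.items()}

# Toy usage: the Google+ profile fills cold-start gaps in the Twitter profile.
twitter = {"machine learning": 0.6, "python": 0.4}
gplus = {"machine learning": 0.2, "photography": 0.8}
print(aggregate_profiles(twitter, gplus, alpha=0.7))
```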
The Semantic Web was long perceived as a purely academic discipline. The purpose of this presentation is to demonstrate the industrial potential this technology now has, and to showcase use cases in web search, recommender systems, and semantic marketing.
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries: Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns, and DIAR helps you find such seeds.
These are the slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2022.
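The slides describe DIAR only at a high level; as a toy illustration of the underlying idea (not the published algorithm), a byte whose mutation never changes the target's coverage signature can be treated as uninteresting and dropped:

```python
def trim_seed(seed: bytes, coverage_of) -> bytes:
    """Toy byte-trimming sketch: keep only bytes whose mutation changes the
    coverage signature. coverage_of is a callable mapping an input to a
    hashable signature, e.g. a hash of an instrumented run's coverage map."""
    baseline = coverage_of(seed)
    kept = bytearray()
    for i, b in enumerate(seed):
        mutated = seed[:i] + bytes([b ^ 0xFF]) + seed[i + 1:]
        if coverage_of(mutated) != baseline:
            kept.append(b)   # mutating this byte changed behaviour: keep it
    return bytes(kept)
```

In a real campaign the signature would come from instrumented executions (such as AFL's coverage bitmap), and a single bit-flip per byte is a much cruder probe than whatever analysis DIAR actually performs.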
Threats to mobile devices are more prevalent than ever and are increasing in scope and complexity. Users of mobile devices want to take full advantage of their devices' features, but many of those features provide convenience and capability at the expense of security. This best-practices guide outlines steps users can take to better protect their personal devices and information.
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
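As a back-of-envelope check (our arithmetic, not a figure from the talk), holding 100 ns of time error over 100 days implies an average fractional frequency accuracy on the order of:

```latex
\[
\bar{y} \;\lesssim\; \frac{\Delta t}{T}
  \;=\; \frac{100\ \mathrm{ns}}{100 \times 86400\ \mathrm{s}}
  \;=\; \frac{10^{-7}\ \mathrm{s}}{8.64 \times 10^{6}\ \mathrm{s}}
  \;\approx\; 1.2 \times 10^{-14}
\]
```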
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface of their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerabilities and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and on application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
DevOps and Testing slides at DASA Connect
Kari Kakkonen
Slides by me and Rik Marselis at the DASA Connect conference on 30 May 2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimizing testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
3. Goals: determine general needs for expert search; identify the obstacles for expert search on LOD; determine useful sources for expert search.
4. How do we search for experts in general? Different data corpora call for different approaches; each approach derives a list of experts from an expertise hypothesis.
5. Example expertise hypotheses:
• If a user wrote a scientific publication on topic X, then he is an expert on topic X.
• If a user wrote a Wikipedia page on topic X, then he is an expert on topic X.
• If a user edited or revised a document about topic X on a collaborative shared online workspace, then he might be an expert on topic X.
• If a user blogs a lot about topic X, then he might be an expert on topic X.
• If a user has lower entropy of interests, where topic X is a primary interest, then he is a better expert on topic X.
• If a user has a lot of e-mails on topic X, then he is an expert on topic X.
• If the user has resources/documents on topic X, then he is an expert on topic X.
• If a user has subscriptions to feeds on topic X, then he is an expert in topic X.
• If a user participates in a Q&A community on a topic X, then he is an expert on the topic X.
• If a user answers questions from experts, then he might himself be an expert: the more the user asking a question in a Q&A community is an expert, the more significant is the expertise of the user giving the answer.
• If a user participates in lots of email conversations about topic X, then he might be an expert.
• If a user answers lots of questions about topic X, then he is an expert on topic X.
• If the user discovers (and shares) "important/good" resources (i.e. resources which later become popular) on topic X, then he is an expert on topic X.
• If the user is among the first to find and share a good resource on topic X, then he is among the best experts on topic X.
• If the user participates in a collaborative software development project, then he might be an expert in the programming language used in the project.
• If a user claims in his resume/CV that he is skilled in a topic, then he might be an expert.
• If a user has obtained funded research grants in a certain (domain) field, then he is an expert in that field.
6. Each expertise hypothesis connects an expert candidate, expertise evidence, and an expertise topic. For example: if the user wrote a paper or saved a bookmark on topic X, then he/she is an expert on topic X; if the user saved a bookmark before the others did, or was retweeted, then he/she is a better-ranked expert on topic X.
7. Sources of expertise evidence for an expert candidate on an expertise topic: Activities (attending professional events, roles at events, experience, projects, bookmarking, ...); Reputation & Authority (social connectedness, blog popularity, ...); Content related to the user (blogs, publications, Wikipedia articles, ...).
8. Test cases applied to each hypothesis: T1: Does LOD contain data sets with the type of data needed for the hypothesis? T2: Are there relevant data in the concerned data sets? T3: Are there any links to the topics of competence? T4: Are there any links to the user data sources?
9. Test results for hypotheses related to content created by the user (tests T1-T4 as above):
• H1: If a user wrote a scientific publication on topic X, then he might be an expert on topic X (T1 +, T2 +, T3 +-, T4 +).
• H2: If a user wrote a Wikipedia page on topic X, then he might be an expert on topic X (T1 +, T2 +, T3 +, T4 -).
• H3: If a user blogs a lot about topic X, then he might be an expert on topic X (T1 +, T2 +, T3 +-, T4 +-).
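To make H1 concrete: "wrote a scientific publication on topic X" maps naturally onto a SPARQL query over a bibliographic LOD endpoint. A hedged Python sketch (the endpoint, vocabulary, and category are illustrative only; whether such links actually exist is exactly what tests T1-T4 probe):

```python
from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

# Illustrative endpoint and vocabulary; real LOD data sets differ (see T1-T4).
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dct: <http://purl.org/dc/terms/>
    PREFIX dbc: <http://dbpedia.org/resource/Category:>
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?author (COUNT(?work) AS ?works) WHERE {
        ?work dct:subject dbc:Semantic_Web ;   # works on topic X
              dbo:author ?author .             # linked to a candidate (H1)
    }
    GROUP BY ?author
    ORDER BY DESC(?works)
    LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["author"]["value"], row["works"]["value"])
```

T3 and T4 then ask whether the topic and author URIs interlink with topic taxonomies and with the candidate's other data sources.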
10. Test results for hypotheses related to users' online activities:
• H4: If a user answers questions (on topic X) from experts on topic X, then he might himself be an expert on topic X (T1 +, T2 -, T3 -, T4 -).
• H5: If a user is among the first to discover (and share) "important/good" resources (i.e. resources which later become popular) on topic X, then he might be an expert on topic X (T1 +, T2 -, T3 +, T4 -).
• H6: If a user participates in a collaborative software development project, then he might be an expert in the programming language used in the project (T1 +, T2 +, T3 +-, T4 +-).
11. Test results for hypotheses related to users' offline activities and achievements:
• H7: If a user claims in his resume/CV that he is skilled in topic X, then he might be an expert in topic X (T1 -, T2 -, T3 -, T4 -).
• H8: If a user has obtained funded research grants in a certain (domain) field, then he might be an expert in that field (T1 +, T2 +, T3 -, T4 +).
• H9: If a user has a certain position in a company, then he might be an expert on the topic related to his position (T1 +, T2 -, T3 -, T4 +-).
• H10: If a user supervises/teaches someone, then he might be an expert on the topic he/she teaches (T1 -, T2 -, T3 -, T4 -).
• H11: If a user has several years of experience working on something related to topic X, then he might be an expert in topic X (T1 -, T2 -, T3 -, T4 -).
• H12: If a user is a member of the organization committee of a professional event, then he might be an expert on the topic of the event (T1 +, T2 +, T3 -, T4 +).
• H13: If a user is giving a keynote or invited talk at a professional event, then he can be considered an expert in the domain topic of the event (T1 +, T2 +, T3 -, T4 +).
• H14: If a user is a chair of a session within a professional event, then he can be considered an expert in the topic of the session (and by generalization, also an expert in the domain topic of the event) (T1 +, T2 +, T3 -, T4 +).
• H15: If a user is presenting within a session of a professional event, then he can be considered an expert in the topic his presentation is about; by generalization, also in the topic of the session/event his presentation is part of (T1 +, T2 +, T3 -, T4 +).
12. Test results for hypotheses related to users' reputation:
• H17: If a user's blog about topic X gets lots of comments, then he might be an expert on topic X (T1 +, T2 +, T3 +-, T4 +-).
• H18: If a user has higher social connectedness with an expert in topic X, then he is considered to be a better expert in topic X (T1 +, T2 +, T3 +-, T4 +-).
13. Some benefits of Linked Data over traditional approaches:
• Traditional approaches: hypothesis-first; data bound to a specific approach; difficult to adapt; limited data sources.
• Linked Data: data-first; data reusable from multiple perspectives; easy to adapt to changes in hypothesis and user behavior; one query rules them all.
14. Some issues with current LOD usage: restricted and private data; lack of data; lack of details in the data; lack of interlinks (to topics, to user data sources); equivalence of trace data.
16. Some ideas addressing these issues:
• Lack of data: mailing lists, Q&A sites, podcasts, more events like SemanticWeb.org, extracting activities from Twitter.
• Lack of details in the data: guidelines, validators for data completeness, the Pedantic Web group, insisting on VoID descriptions.
• Lack of interlinks: automatic tools (Zemanta, Open Calais); crowdsourcing (Silk, Uberblick and the like) ...
This is what allows us to focus on a certain snapshot of LOD and still get useful results: we seek to evaluate the current state in order to draw general insights.