https://www.insight-centre.org/content/leveraging-matching-dependencies-guided-user-feedback-linked-data-applications
Presented at IIWeb2012
ABSTRACT
This paper presents a new approach for managing integration quality and user feedback for entity consolidation within applications consuming Linked Open Data. The quality of a dataspace containing multiple linked datasets is defined in terms of a utility measure based on domain-specific matching dependencies. The user is involved in the consolidation process through soliciting feedback about identity-resolution links, where each candidate link is ranked according to its benefit to the dataspace, calculated by approximating the improvement in dataspace utility. The approach is evaluated on real-world and synthetic datasets, demonstrating the effectiveness of the utility measure: dataspace integration quality improves while requiring fewer user-feedback iterations overall.
Implementing Semantic Web applications: reference architecture and challenges (Benjamin Heitmann)
Best paper award at the workshop for Semantic Web enabled software engineering 2009, at the International Semantic Web Conference 2009.
Full paper at: http://ceur-ws.org/Vol-524/swese2009_2.pdf
Summary of the slides and the paper:
* an empirical analysis of 98 Semantic Web applications based on an architectural analysis and an application functionality questionnaire
* a reference architecture for Semantic Web applications
* the main challenges of implementing Semantic Web technologies and their effect on an example application
* approaches for mitigating the challenges
An architecture for privacy-enabled user profile portability on the Web of Data (Benjamin Heitmann)
Presentation at the Heterogeneous Recommendation Workshop at the ACM Recommender Systems Conference 2010.
Providing relevant recommendations requires access to user profile data. Current social networking ecosystems allow third party services to request user authorisation for accessing profile data, thus enabling cross-domain recommendation. However, these ecosystems create user lock-in and social networking data silos, as the profile data is neither portable nor interoperable. We argue that innovations in reconciling heterogeneous data sources must also be matched by innovations in architecture design and recommender methodology. We present and qualitatively evaluate an architecture for privacy-enabled user profile portability, which is based on technologies from the emerging Web of Data (FOAF, WebIDs and the Web Access Control vocabulary). The proposed architecture enables the creation of a universal “private by default” ecosystem with interoperability of user profile data. The privacy of the user is protected by allowing multiple data providers to host their part of the user profile. This provides an incentive for more users to make profile data from different domains available for recommendations.
What your hairstyle says about your political preferences, and why you should... (Benjamin Heitmann)
Recent developments in the area of social networking have led to prominent users leaving Facebook due to privacy concerns. To really understand what motivated Facebook to implement these controversial changes, you have to look at the future of recommender systems. I will introduce my current research in the areas of multi-source, cross-domain and privacy-enabled user profiling and recommendation, and show how it relates to current developments in the social networking space.
Towards Expertise Modelling for Routing Data Cleaning Tasks within a Community of Knowledge Workers (Umair ul Hassan)
https://www.insight-centre.org/content/towards-expertise-modelling-routing-data-cleaning-tasks-within-community-knowledge-workers
Presented at ICIQ 2012
ABSTRACT:
Applications consuming data have to deal with a variety of data quality issues such as missing values, duplication, incorrect values, etc. Although automatic approaches can be utilized for data cleaning, the results can remain uncertain. Therefore, updates suggested by automatic data cleaning algorithms require further human verification. This paper presents an approach for generating tasks for uncertain updates and routing these tasks to appropriate workers based on their expertise. Specifically, the paper tackles the problem of modelling the expertise of knowledge workers for the purpose of routing tasks within collaborative data quality management. The proposed expertise model represents the profile of a worker against a set of concepts describing the data. A simple routing algorithm is employed for leveraging the expertise profiles to match data cleaning tasks with workers. The proposed approach is evaluated on a real-world dataset using human workers. The results demonstrate the effectiveness of using concepts for modelling expertise, in terms of the likelihood of receiving responses to tasks routed to workers.
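The concept-based routing idea in this abstract can be sketched as follows. The profile representation and scoring rule here are illustrative assumptions, not the paper's actual algorithm:

```python
from collections import Counter

def route_task(task_concepts, worker_profiles):
    """Return the worker whose concept profile best overlaps the task's concepts.

    worker_profiles maps a worker id to a Counter of concept weights,
    e.g. built from that worker's past responses.
    """
    def score(profile):
        # Sum the worker's weight for every concept the task mentions.
        return sum(profile.get(c, 0) for c in task_concepts)

    return max(worker_profiles, key=lambda w: score(worker_profiles[w]))

# Hypothetical worker profiles over data-describing concepts.
profiles = {
    "alice": Counter({"film": 5, "music": 1}),
    "bob": Counter({"geography": 4, "film": 1}),
}
print(route_task({"film", "music"}, profiles))  # alice
```

A real system would also handle worker availability and profile updates after each response; this sketch covers only the matching step.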
Enabling Case-Based Reasoning on the Web of Data (How to create a Web of Exp...) (Benjamin Heitmann)
Presentation at the "Reasoning from experiences on the Web" workshop (WebCBR 2010) at the International Conference on Case Based Reasoning 2010.
Abstract:
While case-based reasoning (CBR) has successfully been deployed on the Web, its data models are typically inconsistent with existing information infrastructure and standards. In this paper, we examine how CBR can operate on the emerging Web of Data, with mutual benefits. The expense of knowledge engineering and curating a case base can be reduced by using Linked Data from the Web of Data. While Linked Data provides experiential data from many different domains, it also contains inconsistencies, missing data and noise, which provide challenges for logic-based reasoning. CBR is well suited to provide alternative and robust reasoning approaches. We introduce (i) a lightweight CBR vocabulary which is suited for the open ecosystem of the emerging Web of Data, and provide (ii) a detailed example of a case base using data from multiple sources. We propose that for the first time the Web of Data provides data and a real context for open CBR systems.
Webinar reporting results from a Moxie Software usability study with 10 community managers. The study objective was to explore why the user interface design of a social computing platform matters for employee adoption.
Transitioning web application frameworks towards the Semantic Web (master the...) (Benjamin Heitmann)
Presents the results of a survey of 54 Semantic Web applications and shows how they fit into 6 broad application types/patterns. For every pattern, the capabilities, requirements and components are presented.
The full version of the master thesis is available at: http://eyaloren.org/pubs/heitmann-thesis.pdf
The survey itself is available at http://activerdf.org/survey
One-stop shop for software development information (Aftab Iqbal)
Discusses the issues developers face while interacting with the many software repositories, and the questions they usually have in mind while searching. Introduces the Linked Data approach to integrating information from different software repositories.
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence (Edward Curry)
Energy Intelligence platforms can help organizations manage power consumption more efficiently by providing a functional view of the entire organization so that the energy consumption of business activities can be understood, changed, and reinvented to better support sustainable practices. Significant technical challenges exist in terms of information management, cross-domain data integration, leveraging real-time data, and assisting users to interpret the information to optimize energy usage. This paper presents an architectural approach to overcome these challenges using a Dataspace, Linked Data, and Complex Event Processing. The paper describes the fundamentals of the approach and demonstrates it within an Enterprise Energy Observatory.
E. Curry, S. Hasan, and S. O’Riáin, “Enterprise Energy Management using a Linked Dataspace for Energy Intelligence,” in The Second IFIP Conference on Sustainable Internet and ICT for Sustainability (SustainIT 2012), 2012.
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...) (Benjamin Heitmann)
The SIOC data format enables mash-ups of community-focused content. This presentation introduces the SIOC format and the SIOC explorer web application, which allows you to browse and navigate such data. The slides also show how the SIOC explorer is implemented with ActiveRDF and Ruby on Rails.
Understanding Composite Web Applications with SharePoint 2010 (SharePoint Universe)
SharePoint 2010 makes it possible for IT to provide governance over enterprise application development, while allowing schools and universities to create robust web applications with or without coding.
To address the emerging importance of services and the relevance of relationships, we have developed and introduced the concept of the Open Semantic Service Network (OSSN). OSSNs are networks which relate services, with the assumption that firms make the information about their services openly available using suitable models. Services, relationships and networks are said to be open (similar to LOD) when their models are transparently available and accessible by external entities and follow an open world assumption. Networks are said to be semantic when they explicitly describe their capabilities and usage, typically using a conceptual or domain model, and ideally using Semantic Web standards and techniques. One limitation of OSSNs is that they were conceived without accounting for the dynamic behavior of service networks. In other words, they can only capture static snapshots of service-based economies, but do not include any mechanism to model the reactions and effects that services have on other services, or the notion of time.
Advanced Fuzzy Logic Based Image Watermarking Technique for Medical Images (IJARIIT)
Segmentation algorithms vary with the type of medical image, such as MRI, CT, US, etc. The current study can be further extended to develop a GUI-tool-based approach for separating the ROI. Additionally, a new technique for separating the ROI from the original image, applicable to all types of medical images, could be developed. The separated ROI can be stored with its xmin, xmax, ymin and ymax values so that, at the end of the embedding process and before transmitting the watermarked image, the segmented ROI can be attached to the watermarked image. Any medical image watermarking approach will be suitable if the ROI is segmented from the medical image with these four values; embedding of the watermark can then be done on the whole medical image. In this paper we work on different scans, such as CT scans, brain scans, etc.; our results are significantly higher than others.
A Capability Requirements Approach for Predicting Worker Performance in Crowdsourcing (Umair ul Hassan)
https://www.insight-centre.org/content/capability-requirements-approach-predicting-worker-performance-crowdsourcing
Presented at CollaborateCom 2013
Abstract:
Assigning heterogeneous tasks to workers is an important challenge for crowdsourcing platforms. Current approaches to task assignment have primarily focused on content-based approaches, qualifications, or work history. We propose an alternative and complementary approach that focuses on the capabilities workers employ to perform tasks. First, we model various tasks according to the human capabilities required to perform them. Second, we capture the capability traces of crowd workers' performance on existing tasks. Third, we predict the performance of workers on new tasks, with the help of capability traces, to make task routing decisions. We evaluate the effectiveness of our approach on three different tasks: fact verification, image comparison, and information extraction. The results demonstrate that we can predict workers' performance based on worker capabilities. We also highlight limitations and extensions of the proposed approach.
SLUA: Towards Semantic Linking of Users with Actions in Crowdsourcing (Umair ul Hassan)
https://www.insight-centre.org/content/slua-towards-semantic-linking-users-actions-crowdsourcing
Presented at ISWC 2013
Abstract:
Recent advances in web technologies allow people to help solve complex problems by performing online tasks in return for money, learning, or fun. At present, human contribution is limited to the tasks defined on individual crowdsourcing platforms. Furthermore, there is a lack of tools and technologies that support matching of tasks with appropriate users across multiple systems. A more explicit capture of the semantics of crowdsourcing tasks could enable the design and development of matchmaking services between users and tasks. The paper presents the SLUA ontology, which aims to model users and tasks in crowdsourcing systems in terms of the relevant actions, capabilities, and rewards. This model describes different types of human tasks that help in solving complex problems using crowds. The paper provides examples of describing users and tasks in some real-world systems with the SLUA ontology.
Effects of Expertise Assessment on the Quality of Task Routing in Human Computation (Umair ul Hassan)
https://www.insight-centre.org/content/effects-expertise-assessment-quality-task-routing-human-computation
Presented at SoHuman'12
Abstract:
Human computation systems are characterized by the use of human workers to solve computationally difficult problems. Expertise profiling involves assessment and representation of a worker’s expertise, in order to route human computation tasks to appropriate workers. This paper studies the relationship between the assessment workload on workers and the quality of task routing. Three expertise assessment approaches were compared with the help of a user study, using two different groups of human workers. The first approach requests workers to provide self-assessment of their knowledge. The second approach measures the knowledge of workers through their performance against tasks with known responses. We propose a third approach based on a combination of self-assessment and task-assessment. The results suggest that the self-assessment approach requires minimum assessment workload from workers during expertise profiling. By comparison, the task-assessment approach achieved the highest response rate and accuracy. The proposed approach requires less assessment workload, while achieving the response rate and accuracy similar to the task-assessment approach.
A Collaborative Approach for Metadata Management for Internet of Things (Umair ul Hassan)
https://www.insight-centre.org/content/collaborative-approach-metadata-management-internet-things-linking-micro-tasks-physical
Presented at CollaborateCom 2013
ABSTRACT:
There have been considerable efforts in modelling the semantics of the Internet of Things and its specific context. Acquiring and managing metadata related to physical devices and their surrounding environment becomes challenging due to the dynamic nature of the environment. This paper focuses on managing metadata for the Internet of Things with the help of crowds. Specifically, the paper proposes a collaborative approach for collecting and maintaining metadata through micro-tasks that can be performed using a variety of platforms, e.g. mobiles, laptops, kiosks, etc. The approach allows non-experts to contribute towards metadata management through micro-tasks, resulting in reduced cost and time. The applicability of the proposed approach is demonstrated through a use-case implementation for managing sensor metadata for energy management in small buildings.
Schema-agnostic queries over large-schema databases: a distributional semanti... (Andre Freitas)
The evolution of data environments towards growth in the size, complexity, dynamicity and decentralisation (SCoDD) of schemas drastically impacts contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications demand more complete data, produced by independent data sources, under different semantic assumptions and contexts of use. Most Database Management Systems (DBMSs) today target a closed communication scenario, where the symbolic schema of the database is known a priori by the database user, who is able to interpret it in an unambiguous way. The context in which the data is consumed and produced is well-defined, and it is typically the same context in which the data was created. In contrast, data management under SCoDD conditions targets an open communication scenario, where the symbolic system of the database is unknown to the user and multiple interpretation contexts are possible. In this case the database can be created under a different context from that of the database user. The emergence of this new data environment demands revisiting the semantic assumptions behind databases and designing data access mechanisms which can support semantically heterogeneous (open communication) data environments.
This work aims at filling this gap by proposing a complementary semantic model for databases, based on distributional semantic models. Distributional semantics provides a complementary perspective to the formal perspective of database semantics, and supports semantic approximation as a first-class database operation. Differently from models which describe uncertain and incomplete data, or probabilistic databases, distributional-relational models focus on the construction of conceptual approximation approaches for databases, supported by a comprehensive semantic model automatically built from large-scale unstructured data external to the database, which serves as a semantic/commonsense knowledge base. The semantic model can be used to support schema-agnostic queries, i.e. abstracting the data consumer from the specific conceptualization behind the data.
The proposed distributional-relational semantic model is supported by a distributional structured vector space model, named τ-Space, which represents structured data under a distributional semantic model representation and, in coordination with a query planning approach, supports a schema-agnostic query mechanism for large-schema databases. The query mechanism is materialized in the Treo query engine and is evaluated using schema-agnostic natural language queries.
The evaluation of the query mechanism confirms that distributional semantics provides a high-recall, medium-high-precision, and low-maintainability solution to cope with the abstraction and conceptual-level differences in schema-agnostic queries over large-schema/schema-less open domain datasets.
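A toy illustration of the distributional idea (this is not the actual τ-Space model; the vectors and attribute names below are invented): a query term can be matched to a schema attribute by comparing sparse distributional vectors, rather than by exact symbolic agreement:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts of term weights)."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Made-up co-occurrence vectors for two schema attributes; in a real
# distributional model these would be built from a large text corpus.
attributes = {
    "spouse": {"married": 3.0, "wife": 2.0, "husband": 2.0},
    "birthPlace": {"born": 3.0, "city": 1.0},
}

# A query term that never appears literally in the schema still matches
# the semantically closest attribute.
query_term = {"wife": 1.0, "married": 1.0}
best = max(attributes, key=lambda a: cosine(query_term, attributes[a]))
print(best)  # spouse
```

The point of the sketch is the abstraction: the query author never needs to know that the attribute is called "spouse", which is what "schema-agnostic" means here.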
A Multi-armed Bandit Approach to Online Spatial Task Assignment (Umair ul Hassan)
https://www.insight-centre.org/content/multi-armed-bandit-approach-online-spatial-task-assignment
Presented at UIC 2014
Abstract
Spatial crowdsourcing uses workers to perform tasks that require travel to different locations in the physical world. This paper considers the online spatial task assignment problem, in which spatial tasks arrive in an online manner and an appropriate worker must be assigned to each task. However, the outcome of an assignment is stochastic, since the worker can choose to accept or reject the task. The primary goal of the assignment algorithm is to maximize the number of successful assignments over all tasks. This presents an exploration-exploitation challenge: the algorithm must learn the task acceptance behavior of workers while selecting the best worker based on what it has learned so far. We address this challenge by defining a framework for online spatial task assignment based on a multi-armed bandit formalization of the problem. Furthermore, we adapt a contextual bandit algorithm to assign a worker based on the spatial features of tasks and workers. The algorithm simultaneously adapts the worker assignment strategy based on the observed task acceptance behavior of workers. Finally, we present an evaluation methodology based on a real-world dataset and evaluate the performance of the proposed algorithm against baseline algorithms. The results demonstrate that the proposed algorithm performs better in terms of the number of successful assignments.
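To illustrate the bandit framing above, here is a minimal sketch, not the paper's actual algorithm: an epsilon-greedy assigner in which each worker is an arm, the reward is task acceptance, and the worker's distance to the task serves as a simple spatial context feature. The class name and the distance-discounted scoring heuristic are assumptions.

```python
import math
import random

class EpsilonGreedyAssigner:
    """Online spatial task assignment as a multi-armed bandit:
    each worker is an arm; reward is 1 if the worker accepts the task."""

    def __init__(self, workers, epsilon=0.1):
        self.workers = workers                  # worker id -> (x, y) location
        self.epsilon = epsilon                  # exploration probability
        self.accepts = {w: 0 for w in workers}  # observed acceptances per worker
        self.trials = {w: 0 for w in workers}   # assignments offered per worker

    def _score(self, worker, task_xy):
        # Estimated acceptance rate (optimistic 1.0 for untried workers),
        # discounted by travel distance -- the spatial "context" feature.
        t = self.trials[worker]
        rate = self.accepts[worker] / t if t else 1.0
        wx, wy = self.workers[worker]
        dist = math.hypot(wx - task_xy[0], wy - task_xy[1])
        return rate / (1.0 + dist)

    def assign(self, task_xy):
        # Explore with probability epsilon, otherwise exploit the best score.
        if random.random() < self.epsilon:
            return random.choice(list(self.workers))
        return max(self.workers, key=lambda w: self._score(w, task_xy))

    def update(self, worker, accepted):
        # Learn from the observed accept/reject outcome.
        self.trials[worker] += 1
        self.accepts[worker] += 1 if accepted else 0
```

With epsilon set to 0 the assigner is purely greedy: a nearby worker is preferred until it starts rejecting tasks, after which the estimate drops and other workers are tried.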
https://www.insight-centre.org/content/research-toolbox-data-analysis-python-waternomics-case-study
This seminar aims to highlight the flexibility of Python as a useful programming language for everyday tasks in research. It is based on the presenter's experience in the Waternomics project and research experiments. The overall goal is to share experience of data access, manipulation, and visualization. The seminar will focus on the following main topics and their relevant Python libraries:
(1) The Python ecosystem for data science
(2) Data access with pandas, RDFLib, requests, json
(3) Data manipulation with numpy, scipy, statsmodels
(4) Data visualization with matplotlib, seaborn, and bokeh
(5) Tips and tricks (Jupyter server, pgfplots, LaTeX, PyCharm)
(6) Advanced libraries (scikit-learn, pyomo, NLTK)
The seminar is expected to use the full slot of the Reading Group session, with opportunities for questions and discussion between topics.
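As a small taste of the kind of everyday task the seminar covers, the sketch below parses a JSON payload (as might come back from a REST API queried with requests) and summarizes it. It deliberately uses only the standard library's json and statistics modules so it runs anywhere; the data and field names are made up.

```python
import json
import statistics

# Toy sensor payload; in practice this string might be the body of an
# HTTP response fetched with `requests` (hypothetical data).
payload = '{"sensor": "s1", "litres": [12.0, 15.0, 9.0]}'

record = json.loads(payload)                    # data access: parse JSON
mean_usage = statistics.mean(record["litres"])  # data manipulation: aggregate
print(round(mean_usage, 2))                     # prints 12.0
```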
E2.0 - Next Generation Portal and Content Management - muratc2a
Presentation titled "E2.0 - Next Generation Portal and Content Management - Oracle Success Stories", given by Andrew Gilboy for Oracle Day, 9 November 2009.
Layer 7 Mobile Security Workshop with CA Technologies and Forrester Research ... - CA API Management
The bring-your-own-device (BYOD) trend is in full swing as the growth of mobile devices within the enterprise explodes. How do you enable secure data access for mobile applications? How do you deal with user authentication? How do you allow broader adoption of enterprise applications on user owned devices? CA and Layer 7 outline solutions to these issues, explore different approaches to mobile security, and use case studies to illustrate how others have solved these problems.
This workshop was all about:
• The latest mobile trends and opportunities
• Emerging mobile risks and how these can be addressed
• A reference architecture for secure enterprise mobility
A Distributional Structured Semantic Space for Querying RDF Graph Data - Andre Freitas
The vision of creating a Linked Data Web brings together the challenge of allowing queries across highly heterogeneous and distributed datasets. In order to query Linked Data on the Web today, end users need to be aware of which datasets potentially contain the data and also which data model describes these datasets. The process of allowing users to expressively query relationships in RDF while abstracting them from the underlying data model represents a fundamental problem for Web-scale Linked Data consumption. This article introduces a distributional structured semantic space which enables data model independent natural language queries over RDF data. The center of the approach relies on the use of a distributional semantic model to address the level of semantic interpretation demanded to build the data model independent approach. The article analyzes the geometric aspects of the proposed space, providing its description as a distributional structured vector space, which is built upon the Generalized Vector Space Model (GVSM). The final semantic space proved to be flexible and precise under real-world query conditions achieving mean reciprocal rank = 0.516, avg. precision = 0.482 and avg. recall = 0.491.
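The mean reciprocal rank figure reported above can be computed as in the sketch below; this follows the standard definition of MRR and is not the authors' evaluation code.

```python
def mean_reciprocal_rank(ranked_results, relevant):
    """MRR over a batch of queries: the average of 1/rank of the first
    relevant answer per query (0 when no relevant answer is returned)."""
    total = 0.0
    for results, rel in zip(ranked_results, relevant):
        for rank, item in enumerate(results, start=1):
            if item in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_results)

# Two toy queries: first relevant hit at rank 1 and at rank 2.
print(mean_reciprocal_rank([["a", "b"], ["x", "y"]], [{"a"}, {"y"}]))  # 0.75
```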
Querying Heterogeneous Datasets on the Linked Data Web - Edward Curry
The growing number of datasets published on the Web as linked data brings both opportunities for high data availability and challenges inherent to querying data in a semantically heterogeneous and distributed environment. Approaches used for querying siloed databases fail at Web-scale because users don't have an a priori understanding of all the available datasets. This article investigates the main challenges in constructing a query and search solution for linked data and analyzes existing approaches and trends.
Northridge Webinar SharePoint 2010 Public Web - jfarq
Microsoft SharePoint continues to accelerate as a platform for both “in front of the firewall” solutions and “behind the firewall” solutions. Gartner has reported that more than 50% of its own client organizations are using SharePoint in some capacity, and with the recent introduction of SharePoint 2010 exponential growth is further anticipated. During this session, Northridge SharePoint consulting experts will discuss how SharePoint is more than an enterprise intranet, enterprise content management, and BI platform -- SharePoint is a solid foundation for external web solutions.
Whether you are currently leveraging your organization’s SharePoint platform investment for your external web marketing or business solutions, or considering it, this webinar will be valuable in understanding how the SharePoint platform aligns with your business and marketing requirements, including areas such as:
• User Experience & Creative Design
• Web Content Management
• Search
• Custom Application Development
• Rich Internet Applications
A Multidimensional Semantic Space for Data Model Independent Queries over RDF... - Andre Freitas
IEEE International Conference on Semantic Computing (ICSC 2011).
A Multidimensional Semantic Space for Data Model Independent Queries over RDF Data
André Freitas, João Gabriel Oliveira, Edward Curry, Seán O’Riain
http://andrefreitas.org/papers/preprint_multidimensional_ieee_icsc_2011.pdf
Abstract: The vision of creating a Linked Data Web brings together the challenge of allowing queries across highly heterogeneous and distributed datasets. In order to query Linked Data
on the Web today, end-users need to be aware of which datasets potentially contain the data and also which data model describes these datasets. The process of allowing users to expressively
query relationships in RDF while abstracting them from the underlying data model represents a fundamental problem for Web-scale Linked Data consumption. This article introduces a multidimensional semantic space model which enables data model independent natural language queries over RDF data. The center of the approach relies on the use of a distributional semantic model to address the level of semantic interpretation
demanded to build the data model independent approach. The final multidimensional semantic space proved to be flexible and precise under real-world query conditions, achieving mean reciprocal rank = 0.516, avg. precision = 0.482 and avg. recall = 0.491.
Identity access and privacy in the new hybrid enterprise slides - CA API Management
Identity, Access & Privacy in the New Hybrid Enterprise featuring Forrester Research, Inc.
Make sense of OAuth, OpenID Connect and UMA
Overview
In the new hybrid enterprise, organizations need to manage business functions that flow across their domain boundaries in all directions: partners accessing internal applications; employees using mobile devices; internal developers mashing up Cloud services; internal business owners working with third-party app developers.
Integration increasingly happens via APIs and native apps, not browsers. Zero Trust is the new starting point for security and access control and it demands Internet scale and technical simplicity – requirements the go-to Web services solutions of the past decade, like SAML and WS-Trust, struggle to solve.
This webinar from Layer 7 Technologies, featuring special guest Eve Maler of Forrester Research, Inc., will:
• Discuss emerging trends for access control inside the enterprise
• Provide a blueprint for understanding adoption considerations
You Will Learn
• Why access control is evolving to support mobile, Cloud and API-based interactions
• How the new standards (OAuth, OpenID Connect and UMA) compare to technologies like SAML
• How to implement OAuth and OpenID Connect, based on case study examples
• Futures around UMA and enterprise-scale API access
Presented by
• Scott Morrison
CTO, Layer 7 Technologies
• Eve Maler
Principal Analyst, Forrester Research, Inc.
Intranet 2.0 - Integrating Enterprise 2.0 into your corporate intranet - James Dellow
Enterprise 2.0 opportunities and challenges; The technology building blocks: Blogs, RSS,
tags, search and wikis; Implementation approaches: Nature or nurture? Pulling it all together and getting started.
This presentation was made as a workshop at Intranet '07 on 20th September, 2007 in Sydney, Australia. Note: This version of the presentation pack contains only key slides and omits additional reading materials provided.
The World Wide Web is booming and radically vibrant thanks to well-established standards and a widely accountable framework that guarantees interoperability at various levels of applications and of society as a whole. So far, the Web has largely functioned on the basis of human intervention and manual processing, but the next-generation Web, which researchers call the Semantic Web, aims at automatic processing and machine-level understanding. The Semantic Web becomes possible only if further levels of interoperability prevail among applications and networks. To achieve this interoperability and greater functionality among applications, the W3C has already released well-defined standards such as RDF/RDF Schema and OWL. Using XML alone as a tool for semantic interoperability has not achieved effective results and has failed to provide interconnection at a larger level. This motivates the inclusion of an inference layer at the top of the Web architecture and paves the way for a common design for encoding ontology representation languages in data models such as RDF/RDFS. In this research article, we give a clear account of the roots of Semantic Web research and its ontological background, which may help to deepen the understanding of named entities on the Web.
Leveraging Matching Dependencies for Guided User Feedback in Linked Data Applications
1. Digital Enterprise Research Institute www.deri.ie
Leveraging Matching Dependencies for Guided
User Feedback in Linked Data Applications
Umair ul Hassan, Sean O’Riain, Edward Curry
Digital Enterprise Research Institute
National University of Ireland, Galway
Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
2. Outline
Motivation & Problem Space
Identity Resolution on the Linked Open Data (LOD) Web
Proposed Approach
LOD Application Architecture
How it relates to existing works
Evaluation
Conclusion & Future Work
3. Overview
Identity Resolution in the Linked Open Data Web
Real-world entities have multiple identifiers in LOD
Identity resolution links have associated uncertainty
LOD Applications require user verification of links
Problem
Feedback for all links is infeasible for large datasets
LOD Applications have domain specific utility of links
Proposed Approach
Leverages matching dependencies to define domain specific
requirements of identity resolution
Ranks identity resolution links according to value of perfect information
4. Linked Open Data (LOD)
Expose and interlink datasets on the Web
Using URIs to identify “things” in your data
Using a graph representation (RDF) to describe URIs
Vision: The Web as a huge graph database
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
5. Linked Data Example
Identity resolution links
Multiple Identifiers
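The example above (one real-world entity with multiple identifiers, connected by identity resolution links) can be sketched as plain RDF triples serialized to N-Triples using only the standard library; the URIs are illustrative, not real dataset identifiers.

```python
# An entity described by one URI, with an owl:sameAs identity link to
# another URI believed to denote the same real-world entity
# (illustrative URIs).
triples = [
    ("http://dbpedia.org/resource/Galway",
     "http://www.w3.org/2000/01/rdf-schema#label", '"Galway"'),
    ("http://dbpedia.org/resource/Galway",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://sws.geonames.org/2964180/"),
]

def to_ntriples(triples):
    # Serialize as N-Triples: URIs in angle brackets, literals kept quoted.
    def term(t):
        return t if t.startswith('"') else "<" + t + ">"
    return "\n".join("{} {} {} .".format(term(s), term(p), term(o))
                     for s, p, o in triples)

print(to_ntriples(triples))
```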
6. Identity Resolution in LOD
Identity resolution is required for consolidation of data in
applications consuming LOD
Three sources of identity resolution links
Provided by data publishers (e.g. dbpedia.org)
Generated by consumer through tools (e.g. SILK, SEMIRI, RiMOM)
Maintained by third party web services (e.g. sameas.org)
Uncertainty associated with links
Due to multiple identity equivalence interpretations
Due to characteristics of link generation algorithms (similarity based)
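A similarity-based link generator of the kind mentioned above (SILK-style, but greatly simplified) can be sketched with the standard library's difflib; the URIs, labels, similarity measure, and threshold below are all illustrative assumptions.

```python
from difflib import SequenceMatcher

def candidate_links(entities_a, entities_b, threshold=0.8):
    # Propose an owl:sameAs candidate whenever label similarity
    # meets the threshold; real tools use richer comparators.
    links = []
    for uri_a, label_a in entities_a.items():
        for uri_b, label_b in entities_b.items():
            score = SequenceMatcher(None, label_a.lower(),
                                    label_b.lower()).ratio()
            if score >= threshold:
                links.append((uri_a, uri_b, round(score, 2)))
    return links

# Hypothetical entities from two datasets, keyed by URI with a label.
a = {"http://ex.org/a/AspirinDrug": "Aspirin"}
b = {"http://ex.org/b/aspirin": "aspirin",
     "http://ex.org/b/ibuprofen": "ibuprofen"}
print(candidate_links(a, b))
```

The score attached to each candidate is exactly the similarity-derived uncertainty the slide refers to: links are proposed, not guaranteed, and still need verification.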
7. Identity Resolution Problem
User feedback for uncertain links
Verify uncertain identity resolution links from users/experts
Improve quality of entity consolidation
Challenges
Domain specific semantic requirements
– How to define domain specific requirements of quality for Linked
Data applications?
Limited user attention
– How to rank candidate links according to their benefit to maximize
utility of user feedback?
8. Identity Resolution Problem
User feedback for uncertain links
Verify uncertain identity resolution links from users/experts
Improve quality of entity consolidation
Proposed Approach
Domain specific semantic requirements
– Leverage Matching Dependencies
Limited user attention
– Employ value of perfect information theory
9. LOD Application Architecture
[Architecture diagram: Utility Module, Feedback Module, and Consolidation Module, with matching dependencies, rules, candidate links, questions, feedback, utility improvement, and ranked feedback tasks flowing between them.]
Tom Heath and Christian Bizer (2011) Linked Data: Evolving the Web into a Global Data Space (1st edition), 1-136. Morgan & Claypool.
10. Related Work
Jeffery et al., “Pay-as-you-go user feedback for dataspace
systems,” in Proceedings of the 2008 ACM SIGMOD
Conference, 2008, pp. 847-860.
Utility:
In terms of cardinality of query results on dataspace
General metric not suitable for application specific data quality
Assumption:
Availability of global query statistics
– Problematic for Linked Open Data
11. Proposed Approach
Domain Specific Utility
Define utility in terms of user-specified rules, i.e. matching dependencies
Rank candidate links for user feedback according to the value of perfect information
Assumptions
Matching dependencies are either provided by the user or generated through existing tools
Utility is based on the satisfaction ratio of dependencies in the dataspace
12. Proposed Approach
Matching Dependencies
Matching Rule
Example
Utility of a rule: its contribution to the overall dataspace utility U(D, M)
Value of perfect information for candidate link m_k with confidence p_k:
g(m_k) = p_k · U(D_{m_k}, M ∪ {m_k}) + (1 − p_k) · U(D_{m_k}, M \ {m_k}) − U(D, M)
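The ranking idea on the slide above can be sketched as follows, under the simplifying assumptions that utility is the satisfaction ratio of matching rules and that confirming or rejecting a link reduces to adding or removing a triple; all names are hypothetical.

```python
def utility(dataspace, rules):
    # Utility as the satisfaction ratio of matching dependencies:
    # the fraction of rules that hold in the current dataspace.
    return sum(1 for rule in rules if rule(dataspace)) / len(rules)

def vpi(dataspace, rules, link, p):
    # g(m_k): expected utility after learning the truth of link m_k
    # (confirmed with probability p, rejected otherwise), minus the
    # current utility U(D, M).
    u_now = utility(dataspace, rules)
    u_accept = utility(dataspace | {link}, rules)  # user confirms the link
    u_reject = utility(dataspace - {link}, rules)  # user rejects the link
    return p * u_accept + (1 - p) * u_reject - u_now

# Toy dataspace: one rule is satisfied only when the sameAs link is present.
rules = [lambda d: ("a", "sameAs", "b") in d,
         lambda d: True]
space = set()
candidates = {("a", "sameAs", "b"): 0.9, ("a", "sameAs", "c"): 0.9}

# Rank candidate links for user feedback, highest expected gain first.
ranked = sorted(candidates, key=lambda m: vpi(space, rules, m, candidates[m]),
                reverse=True)
```

Here the link affecting a matching rule ranks first: verifying it can change utility, while the other candidate cannot, so asking the user about it wastes attention.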
13. Evaluation
Measure change in utility of a dataspace according to
matching rules after a specific number of feedback iterations
Candidate links generated by the Silk framework
14. Evaluation
Datasets
IIMB 2009 Dataset
- Data source: Instance Matching Benchmark 2009 (IIMB 2009 collection; reference ontology, with ontology #16 containing errors in data value attributes)
- Entity types: imdb:Movie
- Total triples: 291; entity IDs: 44; attributes: 9; values: 130
- Candidate links: 81; correct links: 22
UCI-Adult Dataset
- Data source: UCI Machine Learning Repository (US Census dataset; manually created duplicates and value errors)
- Entity types: foaf:Person
- Total triples: 64,000; entity IDs: 4,000; attributes: 16; values: 10,878
- Candidate links: 72; correct links: 72
Drug Dataset
- Data source: Instance Matching Benchmark 2010 (DrugBank and Sider datasets; interlinking between two datasets of the same domain)
- Entity types: drugbank:drugs, sider:drugs
- Total triples: 14,348; entity IDs: 5,696; attributes: 3; values: 8,473
- Candidate links: 94; correct links: 66
16. Conclusion
Matching dependencies provide an effective mechanism to:
Represent entity matching rules
Specify domain specific semantic requirements
Measure utility of dataspaces
Value of perfect information enables effective ranking strategy
for user feedback
In all three datasets, 100% of the utility improvement was reached with less than 40% of the user feedback iterations
17. Future Work
Expand to other data quality problems
Expand on types of dependencies such as comparable
dependencies and order dependencies
Allow multi-user feedback for collaborative data cleaning
Editor's Notes
Personal background
Executive summary vs. overview
The complete stack of Semantic Web technologies is based on open standards and protocols. Semantic Web technologies focus on the application layer of the Internet stack.
Go back to research question slides. Go back to the workflow and highlight what's needed.