Big Data Challenges and Trust Management: A Personal Perspective
A tutorial presented by Dr. Krishnaprasad Thirunarayan at the International Conference on Collaboration Technologies and Systems 2016 (CTS 2016)
Social networking sites are source of information for event detection, with specific reference of the road traffic activity
blockage and accidents or earth-quack sensing system. During this paper, we have a tendency to present a time period
observation system supposed for traffic occasion detection coming back from social media stream analysis. The system
fetches tweets coming back from social media/network as per a many search criteria; ways tweets/posts, by applying matter
content mining methods; last however not least works the classification of social networks posts. The goal is to assign
appropriate category packaging to each posts, as a result of connected with Associate in Nursing activity of traffic event or
maybe not. The traffic recognition system or framework was utilized for time period observation of varied areas of the road
network, taking into consideration detection of traffic occasions simply virtually in actual time, frequently before on-line
traffic news sites. All people utilized the support vector machine sort of a classification unit; what is more, we tend to
accomplish a good accuracy price of 95.76% by making an attempt a binary classification problem. All people were
conjointly capable to discriminate if traffic is triggered by Associate in nursing external celebration or not, by partitioning a
multi category classification issue and getting accuracy price of 88.89.
A Survey On Ontology Agent Based Distributed Data MiningEditor IJMTER
With the increased complexity in number of applications and due to large volume
of availability of data from heterogeneous sources, there is a need for the development of
suitable ontology, which can handle the large data set and present the mined outcomes for
evaluation intelligently. In the era of intensive data driven applications distributed data mining can
meet the challenges with the support of agents. This paper discusses the underlying principles for
effectiveness of modern agent-based systems for distributed data mining
Mutual redundancies and triple contingenciesleydesdorff
PowerPoint Presentation at the conference of International Society for Information Studies, Vienna, 3-7 June 2014; Session: Integration of the Philosophy of Information and Information Science
There are numerous ways to analyse the web information, generally web substance are housed in
large information sets and basic inquiries are utilized to parse such information sets. As the requests
expanded with time, mining web information amended to meet challenging task in a web analysis.
Machine learning methodologies are the most up to date one to go into these analysis forms. Different
approaches like decision trees, association rules, Meta heuristic and basic learning methods are embraced
for making web data appraisal and mining data from various web instances. This study will highlight these
approaches in perspective of web investigation. One of the prime goals of this exploration is to investigate
more data mining approaches alongside machine learning systems, and to express emerging collaboration
of web analytics with artificial intelligence.
Social networking sites are source of information for event detection, with specific reference of the road traffic activity
blockage and accidents or earth-quack sensing system. During this paper, we have a tendency to present a time period
observation system supposed for traffic occasion detection coming back from social media stream analysis. The system
fetches tweets coming back from social media/network as per a many search criteria; ways tweets/posts, by applying matter
content mining methods; last however not least works the classification of social networks posts. The goal is to assign
appropriate category packaging to each posts, as a result of connected with Associate in Nursing activity of traffic event or
maybe not. The traffic recognition system or framework was utilized for time period observation of varied areas of the road
network, taking into consideration detection of traffic occasions simply virtually in actual time, frequently before on-line
traffic news sites. All people utilized the support vector machine sort of a classification unit; what is more, we tend to
accomplish a good accuracy price of 95.76% by making an attempt a binary classification problem. All people were
conjointly capable to discriminate if traffic is triggered by Associate in nursing external celebration or not, by partitioning a
multi category classification issue and getting accuracy price of 88.89.
A Survey On Ontology Agent Based Distributed Data MiningEditor IJMTER
With the increased complexity in number of applications and due to large volume
of availability of data from heterogeneous sources, there is a need for the development of
suitable ontology, which can handle the large data set and present the mined outcomes for
evaluation intelligently. In the era of intensive data driven applications distributed data mining can
meet the challenges with the support of agents. This paper discusses the underlying principles for
effectiveness of modern agent-based systems for distributed data mining
Mutual redundancies and triple contingenciesleydesdorff
PowerPoint Presentation at the conference of International Society for Information Studies, Vienna, 3-7 June 2014; Session: Integration of the Philosophy of Information and Information Science
There are numerous ways to analyse the web information, generally web substance are housed in
large information sets and basic inquiries are utilized to parse such information sets. As the requests
expanded with time, mining web information amended to meet challenging task in a web analysis.
Machine learning methodologies are the most up to date one to go into these analysis forms. Different
approaches like decision trees, association rules, Meta heuristic and basic learning methods are embraced
for making web data appraisal and mining data from various web instances. This study will highlight these
approaches in perspective of web investigation. One of the prime goals of this exploration is to investigate
more data mining approaches alongside machine learning systems, and to express emerging collaboration
of web analytics with artificial intelligence.
This Presentation contains Project idea along with the project diagrams and methodology explained. This Project can be used in Different sectors like in Industry, in Prediction analysis, for trend analysis, for sales & profit calculations etc.
Modeling a geo spatial database for managing travelers demandijdms
The geo-spatial database is a new technology in database systems which allow storing, retrieving and
maintaining the spatial data. In this paper, we seek to design and implement a geo-spatial database for
managing the traveler’s demand with the aid of open-source tools and object-relational database package.
The building of geo-spatial database starts with the design of data model in terms of conceptual, logical
and physical data model and then the design has been implemented into an object-relational database. The
geo-spatial database is developed to facilitate the storage of geographic information (where things are)
with descriptive information (what things are like) into the vector model. The developed vector geo-spatial
data can be accessed and rendered in the form of map to create the awareness of existence of various
services and facilities for prospective travelers and visitors.
Designing for Online Collaborative SensemakingNitesh Goyal
We designed a collaborative analysis tool to explore the value of implicitly sharing insights and notes, without requiring analysts to explicitly push information or request it from each other. In an experiment, pairs of remote individuals played the role of crime analysts solving a set of serial killer crimes with both partners having some, but not all, relevant clues. When implicit sharing of notes was available, participants remembered more clues related to detecting the serial killer, and they perceived the tool as more useful compared to when implicit sharing was not available.
Integrating Web Services With Geospatial Data Mining Disaster Management for ...Waqas Tariq
Data Mining (DM) and Geographical Information Systems (GIS) are complementary techniques for describing, transforming, analyzing and modeling data about real world system. GIS and DM are naturally synergistic technologies that can be joined to produce powerful market insight from a sea of disparate data. Web Services would greatly simplify the development of many kinds of data integration and knowledge management applications. This research aims to develop a Spatial DM web service. It integrates state of the art GIS and DM functionality in an open, highly extensible, web-based architecture. The Interoperability of geospatial data previously focus just on data formats and standards. The recent popularity and adoption of Web Services has provided new means of interoperability for geospatial information not just for exchanging data but for analyzing these data during exchange as well. An integrated, user friendly Spatial DM System available on the internet via a web service offers exciting new possibilities for geo-spatial analysis to be ready for decision making and geographical research to a wide range of potential users.
A forecasting of stock trading price using time series information based on b...IJECEIAES
Big data is a large set of structured or unstructured data that can collect, store, manage, and analyze data with existing database management tools. And it means the technique of extracting value from these data and interpreting the results. Big data has three characteristics: The size of existing data and other data (volume), the speed of data generation (velocity), and the variety of information forms (variety). The time series data are obtained by collecting and recording the data generated in accordance with the flow of time. If the analysis of these time series data, found the characteristics of the data implies that feature helps to understand and analyze time series data. The concept of distance is the simplest and the most obvious in dealing with the similarities between objects. The commonly used and widely known method for measuring distance is the Euclidean distance. This study is the result of analyzing the similarity of stock price flow using 793,800 closing prices of 1,323 companies in Korea. Visual studio and Excel presented calculate the Euclidean distance using an analysis tool. We selected “000100” as a target domestic company and prepared for big data analysis. As a result of the analysis, the shortest Euclidean distance is the code “143860” company, and the calculated value is “11.147”. Therefore, based on the results of the analysis, the limitations of the study and theoretical implications are suggested.
This is a brief a brief review of current multi-disciplinary and collaborative projects at Kno.e.sis led by Prof. Amit Sheth. They cover research in big social data, IoT, semantic web, semantic sensor web, health informatics, personalized digital health, social data for social good, smart city, crisis informatics, digital data for material genome initiative, etc. Dec 2015 edition.
Krishnaprasad Thirunarayan, Trust Management: Multimodal Data Perspective,
Invited Tutorial, The 2015 International Conference on Collaboration
Technologies and Systems (CTS 2015), June 2015
Kno.e.sis Approach to Impactful Research & Training for Exceptional CareersAmit Sheth
Abstract
Kno.e.sis (http://knoesis.org) is a world-class research center that uses semantic, cognitive, and perceptual computing for gathering insights from physical/IoT, cyber/Web, and social and enterprise (e.g., clinical) big data. We innovate and employ semantic web, machine learning, NLP/IR, data mining, network science and highly scalable computing techniques. Our highly interdisciplinary research impacts health and clinical applications, biomedical and translational research, epidemiology, cognitive science, social good, policy, development, etc. A majority of our $12+ million in active funds come from the NSF and NIH. In this talk, I will provide an overview of some of our major research projects.
Kno.e.sis is highly successful in its primary mission of exceptional student outcomes: our students have exceptional publication and real-world impact and our PhDs compete with their counterparts from top 10 schools for initial jobs in research universities, top industry research labs, and highly competitive companies. A key reason for Kno.e.sis' success is its unique work culture involving teamwork to solve complex problems. Practically all our work involves real-world challenges, real-world data, interdisciplinary collaborators, path-breaking research to solve challenges, real-world deployments, real-world use, and measurable real-world impact.
In this talk, I will also seek to discuss our choice of research topics and our unique ecosystem that prepares our students for exceptional careers.
This tutorial presents tools and techniques for effectively utilizing the Internet of Things (IoT) for building advanced applications, including the Physical-Cyber-Social (PCS) systems. The issues and challenges related to IoT, semantic data modelling, annotation, knowledge representation (e.g. modelling for constrained environments, complexity issues and time/location dependency of data), integration, analy- sis, and reasoning will be discussed. The tutorial will de- scribe recent developments on creating annotation models and semantic description frameworks for IoT data (e.g. such as W3C Semantic Sensor Network ontology). A review of enabling technologies and common scenarios for IoT applications from the data and knowledge engineering point of view will be discussed. Information processing, reasoning, and knowledge extraction, along with existing solutions re- lated to these topics will be presented. The tutorial summarizes state-of-the-art research and developments on PCS systems, IoT related ontology development, linked data, do- main knowledge integration and management, querying large- scale IoT data, and AI applications for automated knowledge extraction from real world data.
Related: Semantic Sensor Web: http://knoesis.org/projects/ssw
Physical-Cyber-Social Computing: http://wiki.knoesis.org/index.php/PCS
Smart Data - How you and I will exploit Big Data for personalized digital hea...Amit Sheth
Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014.
Abstract from:
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there is rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health that related to taking better decisions about our health, fitness, and well-being. Consider for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has different set of relevant data!
In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what are the risk of having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city.
A colloquium talk given by Dr. Saeedeh Shekarpour of Kno.e.sis Center at IBM Almaden Research Center, 10 Feb 2017.
Abstract:
Cognitive computing is being the prevalent term for referring a series of computing which mimics human brain. This emerging computing is expected from various disciplines such as natural language processing, information retrieval, inference systems and information extraction, etc. Using knowledge graphs, which jointly enhance semantics as well as the structure of data, is fundamental in promoting capabilities of your computing towards cognitive computing. In this talk, I share my experiences in using knowledge graph for various purposes such as question answering system, information extraction.
Krishnaprasad Thirunarayan, Value-Oriented Big Data Processing with Applications,
Invited Talk, The 2015 International Conference on Collaboration
Technologies and Systems (CTS 2015), June 2015.
This Presentation contains Project idea along with the project diagrams and methodology explained. This Project can be used in Different sectors like in Industry, in Prediction analysis, for trend analysis, for sales & profit calculations etc.
Modeling a geo spatial database for managing travelers demandijdms
The geo-spatial database is a new technology in database systems which allow storing, retrieving and
maintaining the spatial data. In this paper, we seek to design and implement a geo-spatial database for
managing the traveler’s demand with the aid of open-source tools and object-relational database package.
The building of geo-spatial database starts with the design of data model in terms of conceptual, logical
and physical data model and then the design has been implemented into an object-relational database. The
geo-spatial database is developed to facilitate the storage of geographic information (where things are)
with descriptive information (what things are like) into the vector model. The developed vector geo-spatial
data can be accessed and rendered in the form of map to create the awareness of existence of various
services and facilities for prospective travelers and visitors.
Designing for Online Collaborative SensemakingNitesh Goyal
We designed a collaborative analysis tool to explore the value of implicitly sharing insights and notes, without requiring analysts to explicitly push information or request it from each other. In an experiment, pairs of remote individuals played the role of crime analysts solving a set of serial killer crimes with both partners having some, but not all, relevant clues. When implicit sharing of notes was available, participants remembered more clues related to detecting the serial killer, and they perceived the tool as more useful compared to when implicit sharing was not available.
Integrating Web Services With Geospatial Data Mining Disaster Management for ...Waqas Tariq
Data Mining (DM) and Geographical Information Systems (GIS) are complementary techniques for describing, transforming, analyzing and modeling data about real world system. GIS and DM are naturally synergistic technologies that can be joined to produce powerful market insight from a sea of disparate data. Web Services would greatly simplify the development of many kinds of data integration and knowledge management applications. This research aims to develop a Spatial DM web service. It integrates state of the art GIS and DM functionality in an open, highly extensible, web-based architecture. The Interoperability of geospatial data previously focus just on data formats and standards. The recent popularity and adoption of Web Services has provided new means of interoperability for geospatial information not just for exchanging data but for analyzing these data during exchange as well. An integrated, user friendly Spatial DM System available on the internet via a web service offers exciting new possibilities for geo-spatial analysis to be ready for decision making and geographical research to a wide range of potential users.
A forecasting of stock trading price using time series information based on b...IJECEIAES
Big data is a large set of structured or unstructured data that can collect, store, manage, and analyze data with existing database management tools. And it means the technique of extracting value from these data and interpreting the results. Big data has three characteristics: The size of existing data and other data (volume), the speed of data generation (velocity), and the variety of information forms (variety). The time series data are obtained by collecting and recording the data generated in accordance with the flow of time. If the analysis of these time series data, found the characteristics of the data implies that feature helps to understand and analyze time series data. The concept of distance is the simplest and the most obvious in dealing with the similarities between objects. The commonly used and widely known method for measuring distance is the Euclidean distance. This study is the result of analyzing the similarity of stock price flow using 793,800 closing prices of 1,323 companies in Korea. Visual studio and Excel presented calculate the Euclidean distance using an analysis tool. We selected “000100” as a target domestic company and prepared for big data analysis. As a result of the analysis, the shortest Euclidean distance is the code “143860” company, and the calculated value is “11.147”. Therefore, based on the results of the analysis, the limitations of the study and theoretical implications are suggested.
This is a brief a brief review of current multi-disciplinary and collaborative projects at Kno.e.sis led by Prof. Amit Sheth. They cover research in big social data, IoT, semantic web, semantic sensor web, health informatics, personalized digital health, social data for social good, smart city, crisis informatics, digital data for material genome initiative, etc. Dec 2015 edition.
Krishnaprasad Thirunarayan, Trust Management: Multimodal Data Perspective,
Invited Tutorial, The 2015 International Conference on Collaboration
Technologies and Systems (CTS 2015), June 2015
Kno.e.sis Approach to Impactful Research & Training for Exceptional CareersAmit Sheth
Abstract
Kno.e.sis (http://knoesis.org) is a world-class research center that uses semantic, cognitive, and perceptual computing for gathering insights from physical/IoT, cyber/Web, and social and enterprise (e.g., clinical) big data. We innovate and employ semantic web, machine learning, NLP/IR, data mining, network science and highly scalable computing techniques. Our highly interdisciplinary research impacts health and clinical applications, biomedical and translational research, epidemiology, cognitive science, social good, policy, development, etc. A majority of our $12+ million in active funds come from the NSF and NIH. In this talk, I will provide an overview of some of our major research projects.
Kno.e.sis is highly successful in its primary mission of exceptional student outcomes: our students have exceptional publication and real-world impact and our PhDs compete with their counterparts from top 10 schools for initial jobs in research universities, top industry research labs, and highly competitive companies. A key reason for Kno.e.sis' success is its unique work culture involving teamwork to solve complex problems. Practically all our work involves real-world challenges, real-world data, interdisciplinary collaborators, path-breaking research to solve challenges, real-world deployments, real-world use, and measurable real-world impact.
In this talk, I will also seek to discuss our choice of research topics and our unique ecosystem that prepares our students for exceptional careers.
This tutorial presents tools and techniques for effectively utilizing the Internet of Things (IoT) for building advanced applications, including the Physical-Cyber-Social (PCS) systems. The issues and challenges related to IoT, semantic data modelling, annotation, knowledge representation (e.g. modelling for constrained environments, complexity issues and time/location dependency of data), integration, analy- sis, and reasoning will be discussed. The tutorial will de- scribe recent developments on creating annotation models and semantic description frameworks for IoT data (e.g. such as W3C Semantic Sensor Network ontology). A review of enabling technologies and common scenarios for IoT applications from the data and knowledge engineering point of view will be discussed. Information processing, reasoning, and knowledge extraction, along with existing solutions re- lated to these topics will be presented. The tutorial summarizes state-of-the-art research and developments on PCS systems, IoT related ontology development, linked data, do- main knowledge integration and management, querying large- scale IoT data, and AI applications for automated knowledge extraction from real world data.
Related: Semantic Sensor Web: http://knoesis.org/projects/ssw
Physical-Cyber-Social Computing: http://wiki.knoesis.org/index.php/PCS
Smart Data - How you and I will exploit Big Data for personalized digital hea...Amit Sheth
Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014.
Abstract from:
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there is rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health that related to taking better decisions about our health, fitness, and well-being. Consider for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has different set of relevant data!
In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what are the risk of having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city.
A colloquium talk given by Dr. Saeedeh Shekarpour of Kno.e.sis Center at IBM Almaden Research Center, 10 Feb 2017.
Abstract:
Cognitive computing is being the prevalent term for referring a series of computing which mimics human brain. This emerging computing is expected from various disciplines such as natural language processing, information retrieval, inference systems and information extraction, etc. Using knowledge graphs, which jointly enhance semantics as well as the structure of data, is fundamental in promoting capabilities of your computing towards cognitive computing. In this talk, I share my experiences in using knowledge graph for various purposes such as question answering system, information extraction.
Krishnaprasad Thirunarayan, Value-Oriented Big Data Processing with Applications,
Invited Talk, The 2015 International Conference on Collaboration
Technologies and Systems (CTS 2015), June 2015.
This presentation will discuss how the structured data, together with the semantically indexed/mined entities in semi-structured and unstructured data, are contributing to researches beyond libraries, especially in digital humanities. It aims to explore the opportunities and strategies to use, reuse, share, and effectively elaborate the smart data -- generated or to be generated -- in libraries.
Presentation at the AAAI 2013 Fall Symposium on Semantics for Big Data, Arlington, Virginia, November 15-17, 2013
Additional related material at: http://wiki.knoesis.org/index.php/Smart_Data
Related paper at: http://www.knoesis.org/library/resource.php?id=1903
Abstract: We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. We organize our research around the five V's of Big Data, where four of the Vs are harnessed to produce the fifth V - value. To handle the challenge of Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle the challenge of Variety, we resort to the use of semantic models and annotations of data so that much of the intelligent processing can be done at a level independent of heterogeneity of data formats and media. To handle the challenge of Velocity, we seek to use continuous semantics capability to dynamically create event or situation specific models and recognize new concepts, entities and facts. To handle Veracity, we explore the formalization of trust models and approaches to glean trustworthiness. The above four Vs of Big Data are harnessed by the semantics-empowered analytics to derive Value for supporting practical applications transcending physical-cyber-social continuum.
Data Science Innovations is a guest lecture for the Advanced Data Analytics (an Introduction) course at the Advanced Analytics Institute at University of Technology Sydney
Knowledge Management in the AI Driven Scintific SystemSubhasis Dasgupta
In this dynamic talk, we'll explore the transformative role of AI in scientific knowledge management. We'll delve into how AI revolutionizes data organization, analysis, and hypothesis testing, enhancing efficiency and discovery. Highlighting the seamless integration with existing research processes, we'll address the training and ethical considerations of AI adoption. Through real-world examples, we'll demonstrate AI's impact on scientific breakthroughs, emphasizing the shift towards more collaborative and innovative research landscapes. This presentation aims to inspire the scientific community to embrace AI, leveraging its potential to redefine the boundaries of knowledge and innovation.
Thoughts on Knowledge Graphs & Deeper ProvenancePaul Groth
Thinking about the need for deeper provenance for knowledge graphs but also using knowledge graphs to enrich provenance. Presented at https://seminariomirianandres.unirioja.es/sw19/
Una breve introduzione alla data science e al machine learning con un'enfasi sugli scenari applicativi, da quelli tradizionali a quelli più innovativi. La overview copre la definizione di base di data science, una overview del machine learning e esempi su scenari tradizionali, Recommender systems e Social Network Analysis, IoT e Deep Learning
Data Science Innovations : Democratisation of Data and Data Science suresh sood
Data Science Innovations : Democratisation of Data and Data Science covers the opportunity of citizen data science lying at the convergence of natural language generation and discoveries in data made by the professions, not data scientists.
Introduction To Data Mining: Introduction - The evolution of database
system technology - Steps in knowledge discovery from database process
- Architecture of a data mining systems - Data mining on different kinds
of data - Different kinds of pattern - Technologies used - Applications -
Major issues in data mining - Classification of data mining systems - Data
mining task primitives - Integration of a data mining system with a
database or data warehouse system.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Big Data Challenges and Trust Management at CTS -2016
1. Big Data Challenges and Trust Management
Krishnaprasad Thirunarayan (T. K. Prasad)
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
2. Part I
Value-Oriented Multimodal Big Data
Processing with Applications
Krishnaprasad Thirunarayan
(T. K. Prasad)
t.k.prasad@wright.edu
http://knoesis.wright.edu/tkprasad
5. The most profound technologies
are those that disappear. They
weave themselves into the fabric
of everyday life until they are
indistinguishable from it.
Mark Weiser
10/31/16 5
6. Outline
• 5 V’s of Big Data Research
• Semantic Perception for Scalability and Decision
Making
• Lightweight semantics to manage heterogeneity
– Cost-benefit trade-off continuum
• Hybrid Knowledge Representation and Reasoning
– Anomaly, Correlation, Causation
10/31/16 6
7. 5V’s of Big Data Research
Volume
Velocity
Variety
Veracity
Value
10/31/16
Big Data => Smart Data
7
9. Big Crisis Data: Social Media in Disasters and Time-Critical Situations
By Carlos Castillo
1. Volume: Data Acquisition, Storage, and Retrieval
2. Vagueness: Natural Language and Semantics
3. Variety: Classification and Clustering
4. Virality: Networks and Information Propagation
5. Velocity: Online Methods and Data Streams
6. Volunteers: Humanitarian Crowdsourcing
7. Veracity: Misinformation and Credibility
8. Validity: Biases and Pitfalls of Social Media Data
9. Visualization: Crisis Maps and Beyond
10.Values: Privacy and Ethics
10/31/16 9
12. Volume : Challenge
10/31/16
• Sensors (due to IoT) offer
unprecedented access to granular
data that can be transformed into
powerful knowledge (and decisions).
Without an integrated business
analytics platform, though, sensor
data will just add to information
overload and escalating noise.
http://www.sas.com/en_us/insights/big-data/internet-of-things.html
12
13. Volume : (1) Semantic Perception
(multimodal data abstraction and integration)
10/31/16 14
14. SSN
Ontology
2 Interpreted data
(deductive)
[in OWL]
e.g., threshold
1 Annotated Data
[in RDF]
e.g., label
0 Raw Data
[in TEXT]
e.g., number
3 Interpreted data
(abductive)
[in OWL]
e.g., diagnosis
Intellego
“150”
Systolic blood pressure of 150 mmHg
Elevated
Blood
Pressure
Hyperthyroidism
……
15
Levels of Abstraction
22. Traffic Data Analysis
Histogram of speed values
collected from June 1st 12:00 AM to June 2nd 12:00 AM
Histogram of travel time values
collected from June 1st 12:00 AM to June 2nd 12:00 AM
24
10/31/16
23. Relating Sensor Time Series Data
to Scheduled/Unscheduled Events
Image credit: http://traffic.511.org/index
Multiple events
Varying influence
interact with each other
2510/31/16
26. Volume with a Twist
Resource-constrained reasoning on mobile-
devices
10/31/16 28
27. Cory Henson’s Thesis Statement
Machine perception can be
formalized using semantic web
technologies to derive abstractions
from sensor data using background
knowledge on the Web, and
efficiently executed on resource-
constrained devices.
10/31/16 29
28. * based on Neisser’s cognitive model of perception
Observe
Property
Perceive
Feature
Explanation
Discrimination
1
2
Perception Cycle* that exploits background knowledge / domain models
Abstracting raw data
for human
comprehension
Focus generation for
disambiguation and action
(incl. human in the loop)
Prior Knowledge
10/31/16 30
30. Prior knowledge on the Web
W3C Semantic Sensor
Network (SSN) Ontology Bi-partite Graph
10/31/16 32
31. Prior knowledge on the Web
W3C Semantic Sensor
Network (SSN) Ontology Bi-partite Graph
10/31/16 33
32. Virtues of Our Approach to Semantic Perception
Blends simplicity, effectiveness, and scalability.
• Declarative specification of explanation and discrimination;
• With contemporary relevant applications (e.g., healthcare);
• Using improved encodings/algorithms that are necessary
(“tractable” resource needs for typical problem sizes) and
significant (asymptotic order of magnitude gain); and
• Prototyped using extant PCs and mobile devices.
10/31/16 40
33. O(n3) < x < O(n4) O(n)
Efficiency Improvement
• Problem size increased from 10’s to 1000’s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear
Evaluation on a mobile device
41
34. Variety
Syntactic and semantic heterogeneity
• in textual and sensor data,
• in (legacy) materials data
• in (long tail) geosciences data
10/31/16 46
35. Variety (What?): Materials/Geosciences Use Case
• Structured Data (e.g., relational)
• Semi-structured, Heterogeneous Documents
(e.g., Publications and technical specs, which
usually include text, numerics, maps and images)
• Tabular data (e.g., ad hoc spreadsheets and
complex tables incorporating “irregular” entries)
10/31/16 47
36. Variety (How?): (1) Granularity of Semantics & Applications
• Lightweight semantics: File and document-level
annotation to enable discovery and sharing
• Richer semantics: Data-level annotation and
extraction for semantic search and summarization
• Fine-grained semantics: Data integration,
interoperability and reasoning in Linked Open Data
Cost-benefit trade-off continuum
10/31/16 48
37. Variety (What?) : Sensor Data Use Case
Develop/learn domain models to exploit
complementary and corroborative information
to obtain improved situational awareness
• To relate patterns in multimodal data to
“situation”
• To integrate machine sensed and human
sensed data
• Example Application:
SemSOS :
Semantic Sensor Observation Service
10/31/16 51
38. Variety: (2) Hybrid KRR
Blending data-driven models with declarative
knowledge
– Data-driven: Bottom-up, correlation-based,
statistical
– Declarative: Top-down, causal/taxonomical,
logical
– Refine structure to better estimate parameters
E.g., Traffic Analytics using PGMs + KBs
10/31/16 52
39. Variety (Why?): Hybrid KRR
Data can help compensate for our overconfidence
in our own intuitions and reduce the extent to
which our desires distort our perceptions.
-- David Brooks of New York Times
However, inferred correlations require clear
justification that they are not coincidental, to
inspire confidence.
10/31/16 53
40. Variety (How?): Hybrid KRR
Blending data-driven models with declarative
knowledge
– Structure learning from data
– Enhance structure
• By refining direction of dependency
– Disambiguation
– Filtering
• By augmenting with taxonomy
– nomenclature and relationships
– Improved Parameter learning from data
E.g., Traffic Analytics using PGMs + KBs
10/31/16 54
41. • Due to common cause or origin
– E.g., Planets: Copernicus > Kepler > Newton > Einstein
• Coincidental due to data skew or misrepresentation
– E.g., Tall policy claims made by politicians!
• Coincidental new discovery
– E.g., Hurricanes and Strawberry Pop-Tarts Sales
• Strong correlation vs causation
– E.g., Spicy foods vs Helicobacter Pyroli : Stomach Ulcers
• Anomalous and accidental
– E.g., CO2 levels and Obesity
• Correlation turning into causations
– E.g., Pavlovian learning: conditional reflex
Anomalies, Correlations, Causation
10/31/16 55
42. Veracity
Lot of existing work on Trust ontologies, metrics and
models, and on Provenance tracking
• Homogeneous data: Statistical techniques
• Heterogeneous data: Semantic models
10/31/16 56
43. Veracity: Confession of sorts!
Trust is well-known,
but is not well-understood.
The utility of a notion testifies
not to its clarity but rather to the
philosophical importance of
clarifying it.
-- Nelson Goodman
(Fact, Fiction and Forecast, 1955)
10/31/16 57
44. (More on) Value
Learning domain models from “big data” for
prediction
E.g., Harnessing Twitter "Big Data" for Automatic
Emotion Identification
10/31/16 58
45. (More on) Value
Discovering gaps and enriching domain models
using data
E.g., Data driven knowledge acquisition method for
domain knowledge enrichment in the healthcare
10/31/16 59
46. Conclusions
• Glimpse of our research organized around
the 5 V’s of Big Data
• Discussed research directions in harnessing Value
– Semantic Perception (Volume)
– Continuum of Semantic models to manage
Heterogeneity (Variety)
– Hybrid KRR: Probabilistic + Logical (Variety)
– Continuous Semantics (Velocity)
– Trust Models (Veracity)
10/31/16 60
47. http://knoesis.wright.edu/tkprasad
Department of Computer Science and Engineering
Wright State University, Dayton, Ohio, USA
Kno.e.sis: Ohio Center of Excellence in Knowledge-enabled Computing
10/31/16
Special Thanks to: Pramod Anantharam, Dr. Cory Henson
Krishnaprasad Thirunarayan, Amit P. Sheth: Semantics-Empowered Big Data
Processing with Applications. AI Magazine 36(1): 39-54 (2015)
61
48. Part II
Trust Management :
Multimodal Data Perspective
Krishnaprasad Thirunarayan
(T. K. Prasad)
t.k.prasad@wright.edu
http://knoesis.wright.edu/tkprasad
49. Broad Outline
• Real-life Motivational Examples (Why?)
• Trust : Characteristics and Related Concepts (What?)
• Trust Ontology (What?)
– Type, Value, Process, Scope
• Gleaning Trustworthiness (How?) + Robustness to Attack
– Practical Examples of Trust Metrics
– Comparative Analysis of Bayesian Approaches to Trust
• Research Challenges (Why-What-How?)
– APPLICATIONS: E.g., Sensor Networks, Social Networks, Interpersonal
– ISSUES: E.g., Credibility, Scalability, Resiliency (Distributed Consensus)
10/31/2016 Trust Management: T. K. Prasad 63
51. Interpersonal
• With which neighbor should we leave our
children over the weekend when we are
required to be at the hospital?
• Who should be named as a guardian for our
children in the Will?
10/31/2016 Trust Management: T. K. Prasad 65
52. Social
• To click or not to click a http://bit.ly-URL
• To rely or not to rely on a product review
(when only a few reviews are present, or the
reviews are conflicting)?
10/31/2016 Trust Management: T. K. Prasad 66
53. Sensors
10/31/2016 Trust Management: T. K. Prasad 67
• Weather sensor network predicts a potential tornado in
the vicinity of a city.
• Issue: Should we mobilize emergency response teams
ahead of time?
• Van’s TCS (Traction Control System) indicator light came
on intermittently, while driving.
• Issue: Which is faulty: the indicator light or the traction
control system?
• Van’s Check Engine light came on, while driving.
• Issue: Which is faulty: the indicator light or the
transmission control system ?
54. Man-Machine Hybrid
Collaborative Systems
The 2002 Uberlingen Mid-air Collision (between
Bashkirian Airlines Flight 2937 and DHL Flight
611) occurred because the pilot of one of the
planes trusted the human air traffic controller
(who was ill-informed about the unfolding
situation), instead of the electronic TCAS system
(which was providing conflicting but correct
course of action to avoid collision).
http://en.wikipedia.org/wiki/2002_Uberlingen_mid-air_collision
10/31/2016 Trust Management: T. K. Prasad 68
55. Man-Machine Hybrid
Collaborative Systems
In hybrid situations, artificial agents
should reason about the
trustworthiness and deceptive
actions of their human counter
parts. People and agents in virtual
communities will deceive, and will
be deceived.
10/31/2016 Trust Management: T. K. Prasad 69
Castelfranchi and Tan, 2002
56. Common Issues and Context
• Uncertainty
– About the validity of a claim or
assumption
• Vulnerability
– Past Experience
• Need for action
Critical decision with potential for loss
10/31/2016 Trust Management: T. K. Prasad 70
57. Why Track Trust?
• In Mobile Ad Hoc Networks (MANETs),
trust enables dynamic determination of
secure routes.
– Efficiency: To improve throughput
• By avoiding nodes facing bad channel condition
– Robustness : To detect malicious nodes
• When attackers enter the network in spite of
secure key distribution/authentication
10/31/2016 Trust Management: T. K. Prasad 71
58. Why Track Trust?
• In sensor networks, it allows detection of
faults and transient bad behaviors due to
environmental effects.
• In cognitive radio networks, it can enable
selection of optimal channel (less noisy,
less crowded channels).
10/31/2016 Trust Management: T. K. Prasad 72
59. Why Track Trust?
• In E-commerce:
–To predict future behavior in a
reliable manner.
–To incentivize “good” behavior and
discourage “bad” behavior.
–To detect malicious entities.
10/31/2016 Trust Management: T. K. Prasad 73
60. The Two Sides of Trust
• Trustor assesses trustee for
dependability.
• Trustee casts itself in positive light to
trustor.
• Trust is a function of trustee's
perceived trustworthiness and the
trustor's propensity to trust.
10/31/2016 Trust Management: T. K. Prasad 74
61. Trust and Related Concepts
10/31/2016 Trust Management: T. K. Prasad 76
(What is trust?)
62. Trust Definition : Psychology slant
Trust is the psychological state
comprising a willingness to be
vulnerable in expectation of a
valued result.
10/31/2016 Trust Management: T. K. Prasad
Ontology of Trust, Huang and Fox, 2006
Josang et al’s Decision Trust
77
63. Trust Definition : Psychology slant
Trust in a person is a commitment to
an action based on a belief that the
future actions of that person will
lead to good outcome.
10/31/2016 Trust Management: T. K. Prasad
Golbeck and Hendler, 2006
78
64. Trust Definition : Probability slant
Trust (or, symmetrically, distrust)
is a level of subjective probability
with which an agent assesses
that another agent will perform
a particular action, both before
and independently of such an
action being monitored …
10/31/2016 Trust Management: T. K. Prasad
Can we Trust Trust?, Diego Gambetta, 2000
Josang et al’s Reliability Trust
79
65. Trustworthiness Definition :
Psychology Slant
Trustworthiness is a collection of
qualities of an agent that leads them
to be considered as deserving of
trust from others (in one or more
environments, under different
conditions, and to different degrees).
10/31/2016 Trust Management: T. K. Prasad
http://www.iarpa.gov/rfi_trust.html
80
66. Trustworthiness Definition :
Probability slant
Trustworthiness is the objective
probability that the trustee
performs a particular action on
which the interests of the
trustor depend.
10/31/2016 Trust Management: T. K. Prasad
Solhaug et al, 2007
81
67. Trust vs Trustworthiness : My View
Trust Disposition
Depends on
Potentially Quantified Trustworthiness Qualities
+
Context-based Trust Threshold
E.g.*, In the context of trusting strangers, people in
the West will trust for lower levels of trustworthiness
than people in the Gulf.
10/31/2016 Trust Management: T. K. Prasad
*Bohnet et al, 5/2010
82
68. Reputation is Overloaded
Community-based Reputation
vs.
Temporal Reputation
(Cf. Community endorsement of merit,
achievement, reliability, etc.)
(Cf. Sustained good behavior over time elicits
temporal reputation-based trust.)
10/31/2016 Trust Management: T. K. Prasad 86
69. Trust vs. Reputation
Reputation can be a basis for trust.
However, they are different notions*.
• I trust you because of your good reputation.
• I trust you despite your bad reputation.
• Do you still trust Toyota brand?
10/31/2016 Trust Management: T. K. Prasad
*Josang et al, 2007
87
70. Trust vs. (Community-based)
Reputation
Trust :: Reputation
::::
Local :: Global
::::
Subjective :: Objective
(Cf. Security refers to resistance to attacks.)
10/31/2016 Trust Management: T. K. Prasad 88
71. Trust is well-known,
but is not well-understood.
The utility of a notion
testifies not to its clarity but
rather to the philosophical
importance of clarifying it.
-- Nelson Goodman
(Fact, Fiction and Forecast, 1955)
10/31/2016 Trust Management: T. K. Prasad 90
72. Trust Ontology
10/31/2016 Trust Management: T. K. Prasad 91
(What is trust?)
Illustration of Knowledge Representation and Reasoning:
Relating Semantics to Data Structures and Algorithms
73. Example Trust Network -
Different Trust Links with Local Order on out-links
• Alice trusts Bob for recommending good car
mechanic.
• Bob trusts Dick to be a good car mechanic.
• Charlie does not trust Dick to be a good car
mechanic.
• Alice trusts Bob more than Charlie, for
recommending good car mechanic.
• Alice trusts Charlie more than Bob, for
recommending good baby sitter.
10/31/2016 Trust Management: T. K. Prasad
*Thirunarayan et al, IICAI 2009
92
74. Digression: Illustration of Knowledge
Representation and Reasoning
• Abstract and encode clearly delineated “subarea”
of knowledge in a formal language.
– Trust Networks => node-labeled, edge-labeled
directed graph (DATA STRUCTURES)
• Specify the meaning in terms of how “network
elements” relate to or compose with each other.
– Semantics of Trust, Trust Metrics => using logic or
probabilistic basis, constraints, etc. (SEMANTICS)
• Develop efficient graph-based procedures
– Trust value determination/querying (INFERENCE
ALGORITHMS)
10/31/2016 Trust Management: T. K. Prasad 93
75. 10/31/2016 Trust Management: T. K. Prasad 94
(In recommendations)
(For capacity to act)
(For lack of
capacity to act)
76. Trust Ontology*
6-tuple representing a trust relationship:
{type, value, scope, process}
Type – Represents the nature of trust relationship.
Value – Quantifies trustworthiness for comparison.
Scope – Represents applicable context for trust.
Process – Represents the method by which the value is
created and maintained.
trustor trustee
10/31/2016 Trust Management: T. K. Prasad
*Anantharam et al, NAECON 2010
95
77. Trust Ontology:
Trust Type, Trust Value, and Trust Scope
Trust Type*
Referral Trust – Agent a1 trusts agent a2’s ability to
recommend another agent.
(Non-)Functional Trust – Agent a1 (dis)trusts agent a2’s
ability to perform an action.
Cf. ** trust in belief vs. trust in performance
Trust Value
E.g., Star rating, numeric rating, or partial ordering.
Trust Scope*
E.g., Reliable forwarding of data.
10/31/2016 Trust Management: T. K. Prasad
*Thirunarayan et al, IICAI 2009
** Huang and Fox, 2006
96
78. Multidimensional
Trust Scopes in Ecommerce
• Trust in a vendor to deliver on
commitments.
• Trust in vendor's ethical use of
consumer data.
• Trust in Internet communication being
secure.
• Plus: Propensity/Disposition to trust
10/31/2016 Trust Management: T. K. Prasad 97
79. Trust Ontology:
Trust Process
Represents the method by which the value
is computed and maintained.
Primitive (for functional and referral links)*
Reputation – based on past behavior (temporal) or
community opinion.
Policy – based on explicitly stated constraints.
Evidence – based on seeking/verifying evidence.
Provenance – based on lineage information.
Composite (for admissible paths)**
Propagation (Chaining and Aggregation)
10/31/2016 Trust Management: T. K. Prasad
*Anantharam et al, NAECON 2010
**Thirunarayan et al, IICAI 2009
98
81. Unified Illustration of Trust Processes
Scenario : Hiring Web Search Engineer - An R&D Position
Various Processes :
• (Temporal-based) Reputation: Past job experiences
• (Community-based) Reputation: Multiple
references
• Policy-based: Score cutoffs on screening test
• Provenance-based: Department/University of
graduation
• Evidence-based: Multiple interviews (phone, on-
site, R&D team)
10/31/2016 Trust Management: T. K. Prasad 101
82. Deception
• Deception is the betrayal of trust.
–Ironically, trust makes us prone to
deception.
–Knowing what features are used to
glean trustworthiness can also assist
in avoiding detection while deceiving.
10/31/2016 Trust Management: T. K. Prasad 102
84. Trust Metrics and Trust Models
• Trust Metric => How is primitive trust
represented and computed?
• E.g., Real number, Finite levels, Partial Order.
• Trust Model => How is composite trust
computed or propagated?
10/31/2016 Trust Management: T. K. Prasad 104
Y. L. Sun, et al, 2/2008
85. Trust Models : Ideal Approach
• Capture semantics of trust using
– axioms for trust propagation, or
– catalog of equivalent trust networks.
• Develop trust computation rules
for propagation (that is, chaining
and aggregation) that satisfy the
axioms or equivalence relation.
10/31/2016 Trust Management: T. K. Prasad 105
86. Direct Trust : Functional and Referral
Reputation-based Process
10/31/2016 Trust Management: T. K. Prasad 106
(Using large number of observations)
87. Using Large Number of Observations
• Over time (<= Referral + Functional) :
Temporal Reputation-based Process
– Mobile Ad-Hoc Networks
– Sensor Networks
• Quantitative information
(Numeric data)
• Over agents (<= Referral + Functional) :
Community Reputation-based Process
– Product Rating Systems
• Quantitative + Qualitative information
(Numeric + text data)
10/31/2016 Trust Management: T. K. Prasad 108
88. Desiderata for Trustworthiness
Computation Function
• Initialization Problem : How do we get initial value?
• Update Problem : How do we reflect the observed
behavior in the current value dynamically?
• Trusting Trust* Issue: How do we mirror uncertainty
in our estimates as a function of observations?
• Law of Large Numbers: The average of the results obtained from a
large number of trials should be close to the expected value.
• Efficiency Problem : How do we store and update
values efficiently?
10/31/2016 Trust Management: T. K. Prasad
*Ken Thompson’s Turing Award Lecture: “Reflections on Trusting Trust”
109
90. Beta-distribution : Gently
• Consider a (potentially unfair) coin that comes up
HEADS with probability p and TAILS with probability
(1 – p).
• Suppose we perform ( r + s ) coin tosses and the coin
turns up with HEADS r times and with TAILS s times.
• What is the best estimate of the distribution of the
probability p given these observations?
=> Beta-distribution with parameters ( r+1, s+1 )
10/31/2016 Trust Management: T. K. Prasad
f(p; r+1, s+1)
111
91. Beta Probability Density Function(PDF)
x is a probability,
so it ranges from 0-1
If the prior distribution of x is
uniform, then the beta
distribution gives posterior
distribution of x after
observing a-1 occurrences
of event with probability x
and b-1 occurrences of the
complementary event with
probability (1-x).
10/31/2016 Trust Management: T. K. Prasad 112
92. a= 5
b= 5
a= 1
b= 1
a= 2
b= 2
a= 10
b= 10
a = b, so the PDF’s are symmetric w.r.t 0.5.
Note that the graphs get narrower as (a+b) increases.
10/31/2016 Trust Management: T. K. Prasad 113
93. a= 5
b= 25
a= 5
b= 10
a= 25
b= 5
a= 10
b= 5
a ≠ b, so the PDF’s are asymmetric w.r.t . 0.5.
Note that the graphs get narrower as (a+b) increases.
10/31/2016 Trust Management: T. K. Prasad 114
94. Beta-distribution - Applicability
• Dynamic trustworthiness can be
characterized using beta probability
distribution function gleaned from total
number of correct (supportive) r = (a-1)
and total number of erroneous
(opposing) s = (b-1) observations so far.
• Overall trustworthiness (reputation) is its
mean: a/a +b
10/31/2016 Trust Management: T. K. Prasad 116
95. Why Beta-distribution?
• Intuitively satisfactory, mathematically precise, and
computationally tractable
• Initialization Problem : Assumes that all probability values
are equally likely.
• Update Problem : Updates (a, b) by incrementing a for
every correct (supportive) observation and b for every
erroneous (opposing) observation.
• Trusting Trust Issue: The graph peaks around the mean, and
the variance diminishes as the number of observations
increase, if the agent is well-behaved.
• Efficiency Problem: Only two numbers stored/updated.
10/31/2016 Trust Management: T. K. Prasad 117
96. Information Theoretic Interpretation
of Trustworthiness Probability
• Intuitively, probability values of 0 and 1 imply
certainty, while probability value of 0.5 implies
a lot of uncertainty.
• This can be formalized by mapping probability
in [0,1] to trust value in [–1,1], using
information theoretic approach.
10/31/2016 Trust Management: T. K. Prasad
Y. L. Sun, et al, 2/2008
118
97. Information Theoretic Interpretation
of Trustworthiness Probability
• T(trustee : trustor, action) =
if 0.5 <= p
then 1 – H(p) /* 0.5 <= p <= 1 */
else H(p) – 1 /* 0 <= p <= 0.5 */
where
H(p) = – p log2(p) – (1 – p) log2(1 – p)
10/31/2016 Trust Management: T. K. Prasad 119
98. Plot of T(trustee : trustor, action) vs. p
10/31/2016 Trust Management: T. K. Prasad 122
99. Direct Trust : Functional
Policy-based Process
10/31/2016 Trust Management: T. K. Prasad 123
(Using Trustworthiness Qualities)
100. General Approach to Trust Assessment
• Domain dependent qualities for determining
trustworthiness
– Based on Content / Data
– Based on External Cues / Metadata
• Domain independent mapping to trust values
or levels
– Quantification through abstraction and
classification
10/31/2016 Trust Management: T. K. Prasad 124
101. Example: Wikipedia Articles
• Quality (content-based)
– Appraisal of information provenance
• References to peer-reviewed publication
• Proportion of paragraphs with citation
– Article size
• Credibility (metadata-based)
– Author connectivity
– Edit pattern and development history
• Revision count
• Proportion of reverted edits - (i) normal (ii) due to vandalism
• Mean time between edits
• Mean edit length.
10/31/2016 Trust Management: T. K. Prasad 125
Sai Moturu, 8/2009
102. (cont’d)
• Quantification of Trustworthiness
– Based on Dispersion Degree Score
(Extent of deviation from mean)
• Evaluation Metric
– Ranking based on trust level (determined from
trustworthiness scores), and compared to gold
standard classification using Normalized
Discounted Cumulative Gain (NDCG)
• RATINGS: featured, good, standard, cleanup, and stub.
• NDCG: error penalty proportional to the rank.
10/31/2016 Trust Management: T. K. Prasad 126
103. Indirect Trust : Referral + Functional
Variety of Trust Metrics and Models
10/31/2016 Trust Management: T. K. Prasad 128
(Using Propagation – Chaining and Aggregation over Paths)
104. Trust Propagation Frameworks
• Chaining, Aggregation, and Overriding
• Trust Management
• Abstract properties of operators
• Reasoning with trust
• Matrix-based trust propagation
• The Beta-Reputation System
• Algebra on opinion = (belief, disbelief, uncertainty)
10/31/2016 Trust Management: T. K. Prasad
Guha et al., 2004
Richardson et al, 2003
Josang and Ismail, 2002
132
Massa-Avesani, 2005
Bintzios et al, 2006
Golbeck – Hendler, 2006 Sun et al, 2006
Thirunarayan et al, 2009
105. Trust Propagation Algorithms
• Top-down
• 1: Extract trust DAG (eliminate cycles)
• 2: Predict trust score for a source in a target by
aggregating trust scores in target inherited
from source’s “trusted” parents weighted with
trust value in the corresponding parent.
–Computation is level-by-level
–Alternatively, computation can be based on
paths.
10/31/2016 Trust Management: T. K. Prasad
Golbeck – Hendler, 2006
133
106. Trust Propagation Algorithms
• Bottom-up
• 1: Extract trust DAG (eliminate cycles)
• 2: Predict trust score for a source in a target by
aggregating trust scores in target inherited
from target’s “trusted” neighbors weighted
with trust value in the corresponding neighbor.
–Computation is level-by-level
–Alternatively, computation can be based on
paths.
10/31/2016 Trust Management: T. K. Prasad
Massa-Avesani, 2005
Bintzios et al, 2006
134
107. Top-down vs Bottom-up (visualized)
10/31/2016 Trust Management: T. K. Prasad 135
source
Target
source
Target
w3
w2
w1
w3w2w1
Trust-Network
108. Example: Comparative Analysis
10/31/2016 Trust Management: T. K. Prasad 136
Different Interpretation:
q distrusts s (Bintzios et al’s)
vs
q has no information about
the trustworthiness of s (our’s,
Golbeck rounding algorithm)
109. Indirect Trust : Referral + Functional
Variety of Bayesian Trust Models
10/31/2016 Trust Management: T. K. Prasad 140
With Applications to Mobile Ad hoc Networks
Wireless Sensor Networks, etc.
110. Direct Trust : Functional and Referral
• Direct Trust for Packet Forwarding
• S = Number of packets forwarded
• F = Number of packets dropped
• S + F = Total number of requests for packet forwarding
• Direct Trust for Recommendations
• S = Number of times observed direct trust for packet
forwarding approximates expected indirect trust for
packet forwarding (trust over transit path : r+f)
• F = Number of times observed direct trust for packet
forwarding does not approximate expected indirect
trust for packet forwarding (trust over transit path : r+f)
10/31/2016 Trust Management: T. K. Prasad 141
111. Indirect Trust : Functional and Referral
• Indirect Trust for Packet Forwarding
– Used when direct trust is not available
»(overriding behavior)
• Chain links for a path from a recommender to the target
– Multiplicative
• Aggregate over multiple (parallel) paths from
recommenders to the target
– Unclear, in general
• Indirect Trust for Recommendations
• Obtained implicitly through computed referral trust
10/31/2016 Trust Management: T. K. Prasad 142
112. Trust Propagation Rules :
Axioms for Trust Models
Rule 1: Concatenation
propagation does
not increase trust.
Rule 2: Multipath
propagation does
not reduce trust.
10/31/2016 Trust Management: T. K. Prasad 143
Sun et al, 2006
|T(A1,C1)| <= min(|R(A1,B1)|, |T(B1,C1)|)
0<=T(A1,C1) <= T(A2,C2) for R1 > 0 and T2 >= 0
0>=T(A1,C1) >= T(A2,C2) for R1 > 0 and T2 < 0
113. (cont’d)
Rule 3: Trust based on multiple referrals from a
single source should not be higher than that
from independent sources.
10/31/2016 Trust Management: T. K. Prasad 144
Sun et al, 2006
0<=T(A1,C1) <= T(A2,C2) for R1, R2, R3 > 0 and T2 >= 0
0>=T(A1,C1) >= T(A2,C2) for R1, R2, R3 > 0 and T2 < 0
114. Trust Propagation Rules :
Implementation
10/31/2016 Trust Management: T. K. Prasad
Sun et al, 2006
1 2
145
115. Trust Paths Visualized for Scalability:
Semantics unclear based on Sun et al’s spec
10/31/2016 Trust Management: T. K. Prasad 151
Bottom-up computation
reflects our needs better?
116. Trust : Functional and Referral
• Direct Trust for Primitive Actions based on
• S = Number of success actions
• F = Number of failed actions
• S + F = Total number of actions
• Indirect Trust via Recommendations based on
summing direct experiences of recommenders
• Sk = Number of success actions for kth recommender
• Fk = Number of failed actions for kth recommender
• No chaining for referrals
10/31/2016 Trust Management: T. K. Prasad 153
Denko-Sun 2008
117. Cumulative Trust using
Direct Experience and Recommendations
• Cumulative Trust is obtained by using total
number of success actions and failed actions
from direct experience (ns,nu) and from i
(indirect experiences through)
recommendations (ns
r,nu
r).
10/31/2016 Trust Management: T. K. Prasad 154
118. Contents of [Ganeriwal et al, 2007] Paper
• (a,b)-parameters to compute trust of i in j is
obtained by combining direct observations
(aj,bj) with indirect observations (aj
k,bj
k) from
k weighted by (ak,bk) using [Josang-Ismail,
2002] chaining/discounting rule.
• Obtains cumulative trust by combining direct trust from
a functional link and indirect trusts using paths
containing one referral link and one functional link.
• However, it does not distinguish functional and referral
trust.
10/31/2016 Trust Management: T. K. Prasad 155
119. Security Issues:
Threats and Vulnerabilities
10/31/2016 Trust Management: T. K. Prasad 156
Attacks and Robustness Analysis
120. Attacks
• Trust Management is an attractive target for
malicious nodes.
Bad mouthing attack (Defamation)
Dishonest recommendations on good nodes (calling them
bad)
Ballot stuffing attack (Collusion)
Dishonest recommendations on bad nodes (calling them
good)
Sybil attack
Creating Fake Ids
Newcomer attack
Registering as new nodes
10/31/2016 Trust Management: T. K. Prasad 157
121. Attacks
• Inconsistency in time-domain
On-Off attack
Malicious node behaves good and bad alternatively to
avoid detection
Sleeper attack
Malicious node acquires high trust by behaving good
and then striking by behaving bad
Inconsistency in node-domain
Conflicting Behavior Attack
Provide one recommendation to one set of peers and a
conflicting recommendation to a disjoint set of peers
10/31/2016 Trust Management: T. K. Prasad 158
122. Security : Robustness w.r.t Attacks
Bad mouthing attack
Example: Competent nodes downplay competitions.
Example: Can diminish throughput due to lost capacity.
Approach:
Separate functional and referral trust, updating
referral trust to track good recommendations
Trust composition rules ensure that low or
negative referral trust does not impact decision
Low trust nodes can be branded as malicious and
avoided. (Not viable if majority collude.)
10/31/2016 Trust Management: T. K. Prasad 159
123. Security : Robustness w.r.t Attacks
Ballot stuffing attack
Example: Malicious nodes collude to recommend each
other.
Example: Can cause unexpected loss of throughput.
Approach:
Feedback : Cross-check actual functional
performance with expected behavior via referral,
and update (reward/penalize) referral trust (in
parent) accordingly (in addition to updating
functional trust (in target))
10/31/2016 Trust Management: T. K. Prasad 160
124. Security : Robustness w.r.t. Attacks
Sybil attack
Create Fake Ids to take blame for malicious
behavior (dropping packets)
Newcomer attack
Register as new node to erase past history
Approach
Requires separate (key-based or security token-
based) authentication mechanism (with TTP) to
overcome these attacks.
10/31/2016 Trust Management: T. K. Prasad 161
125. Security : Robustness w.r.t Attacks
On-Off attack
Sleeper attack
Example: Due to malice or environmental changes
Approach:
Use forgetting factor (0<=b<=1):
k good/bad actions at t1
= k * b(t2 – t1) good/bad actions at t2 (> t1)
10/31/2016 Trust Management: T. K. Prasad 162
126. Forgetting Factor
k good/bad actions at t1 = k * b(t2 – t1) good/bad actions at t2 (> t1)
• High b value (0.9) enhances memorized time
window, while low b value (0.001) reduces it.
– High b enables malicious nodes (on-off/sleeper
attackers) to use prior good actions to mask
subsequent intentional bad actions.
• Reduces reliability.
– Low b forces legitimate nodes to be avoided due
to short spurts of unintentional bad actions.
• Reduces throughput.
10/31/2016 Trust Management: T. K. Prasad 163
127. Adaptive Forgetting Factor
• Intuition: Bad actions are remembered for a
longer duration than good actions.
• Actions performed with high trust forgotten quicker than
actions performed with low trust.
Choose b equal to ( 1 – p )
Choose b = 0.01 when p in [0.5,1] else 0.9
Example: Similar ideas used in Ushahidi
Note: Effectively, more good actions are
necessary to compensate for fewer bad actions,
to recover trust.
10/31/2016 Trust Management: T. K. Prasad 164
128. Security : Robustness w.r.t. Attacks
Conflicting Behavior Attack
Malicious node divide and conquer, by behaving
differently (resp. by providing different
recommendations) to different peers, causing
peers to provide conflicting recommendations to
source about the malicious node (resp. about
some target), reducing source’s referral trust in
some peers.
Eventually, this causes recommendations of some peers
to be ignored incorrectly.
10/31/2016 Trust Management: T. K. Prasad 165
129. Example
• Peer Node Set 1: 1, 2, 3, and 4
• Peer Node Set 2: 5, 6, 7, and 8
• Malicious node 0 behaves well towards nodes in
Set 1 but behaves badly towards nodes in Set 2.
• When node 9 seeks recommendations from
nodes in Set 1 U Set 2 on node 0, node 9 receives
conflicting recommendations on malicious node
0, causing referral trust in nodes in Set 1 or
nodes in Set 2 to be lowered.
=> Eventually throughput lowered
10/31/2016 Trust Management: T. K. Prasad 166
130. Security : Robustness w.r.t. Attacks
Conflicting Behavior Attack
Issue: Can recommenders get feedback
to reduce trust in malicious node?
Otherwise, referral trust cannot be
relied upon for detecting malicious
nodes.
10/31/2016 Trust Management: T. K. Prasad 167
131. 10/31/2016 Trust Management: T. K. Prasad 171
APPROACH/
METRIC
Trust Type /
Context
Trust Model /
Foundation
Robustness to
Attacks
D[3] /
Binary
Functional / One Trivial chaining /
Beta-PDF
Ballot-stuffing;
Bad-mouthing
G[4] /
Binary
Functional /
Indistinguishable
Josang-Ismail
discounting /
Beta-PDF
Ballot-stuffing;
Bad-mouthing;
Sleeper and On-
off
S[6] /
Binary
Functional + Referral
/ One
Limited chaining
and aggregation /
Beta-PDF
Ballot-stuffing;
Bad-mouthing;
Sleeper and On-
off
Q[28] / Multi-level Functional + Referral /
Multiple
No /
Bayesian
Ad Hoc
Ballot-stuffing;
Bad-mouthing;
Sleeper and On-
off; Sybil
Ours /
Multi-level
Functional + Referral /
Multiple
No /
Dirichlet-PDF
Ballot-stuffing;
Bad-mouthing;
Sleeper and On-
off; Conflicting
behavior
133. Generic Directions
• Finding online substitutes for traditional cues
to derive measures of trust.
• Creating efficient and secure systems for
managing and deriving trust, in order to
support decision making.
10/31/2016 Trust Management: T. K. Prasad
Josang et al, 2007
173
134. Robustness Issue
You can fool some of the
people all of the time, and all
of the people some of the
time, but you cannot fool all
of the people all of the time.
Abraham Lincoln,
16th president of US (1809 - 1865)
10/31/2016 Trust Management: T. K. Prasad 174
135. Trust : Social Networks vs
Machine Networks
• In social networks such as Facebook, trust is
often subjective, while in machine networks
and social networks such as Twitter, trust can
be given an objective basis and approximated
by trustworthiness.
• Reputation is the perception that an agent
creates through past actions about its
intentions and norms.
– Reputation can be a basis for trust.
10/31/2016 Trust Management: T. K. Prasad 175
138. Concrete Application
• Applied Beta-PDF to Mesowest Weather Data
– Used quality flags (OK, CAUTION, SUSPECT)
associated with observations from a sensor
station over time to derive reputation of a sensor
and trustworthiness of a perceptual theory that
explains the observation.
– Perception cycle used data from ~800 stations,
collected for a blizzard during 4/1-6/03.
10/31/2016 Trust Management: T. K. Prasad 178
139. Concrete Application
• Perception Cycle
• Trusted Perception Cycle
– http://www.youtube.com/watch?v=lTxzghCjGgU
10/31/2016 Trust Management: T. K. Prasad 179
140. Research Issues
• Outlier Detection
– Homogeneous Networks
• Statistical Techniques
– Heterogeneous Networks (sensor + social)
• Domain Models
• Distinguishing between abnormal phenomenon
(observation), malfunction (of a sensor), and
compromised behavior (of a sensor)
– Abnormal situations
– Faulty behaviors
– Malicious attacks
10/31/2016 Trust Management: T. K. Prasad 181
Ganeriwal et al, 2008
142. Our Research
• Study semantic issues relevant to trust
• Proposed model of trust/trust metrics to
formalize indirect trust
10/31/2016 Trust Management: T. K. Prasad 183
143. Quote
• Guha et al:
While continuous-valued trusts are
mathematically clean, from the standpoint
of usability, most real-world systems will
in fact use discrete values at which one
user can rate another.
• E.g., Epinions, Ebay, Amazon, Facebook, etc all
use small sets for (dis)trust/rating values.
10/31/2016 Trust Management: T. K. Prasad 185
144. Our Approach
Trust formalized in terms of partial orders
(with emphasis on relative magnitude)
Local but realistic semantics
Distinguishes functional and referral trust
Distinguishes direct and inferred trust
Direct trust overrides conflicting inferred trust
Represents ambiguity explicitly
10/31/2016 Trust Management: T. K. Prasad
Thirunarayan et al , 2009
145. Formalizing the Framework
• Given a trust network (Nodes AN, Edges RL U
PFL U NFL with Trust Scopes TSF, Local
Orderings ⪯ANxAN), specify when a source can
trust, distrust, or be ambiguous about a
target, reflecting local semantics of:
• Functional and referral trust links
• Direct and inferred trust
• Locality
18710/31/2016 Trust Management: T. K. Prasad
146. 10/31/2016 Trust Management: T. K. Prasad 188
(In recommendations)
(For capacity to act)
(For lack of
capacity to act)
147. Benefits of Formal Analysis
• Enables detecting and avoiding
unintended consequences.
– An earlier formalization preferred “certain”
conclusion from a relatively less trustworthy
source over “ambiguous” conclusion from a
relatively more trustworthy source.
The whole problem with the world is
that fools and fanatics are always so
certain of themselves, but wiser people
so full of doubts. — Betrand Russell
10/31/2016 Trust Management: T. K. Prasad 192
148. Research Issues
• Determination of trust / influence from
social networks
– Text analytics on communication
– Analysis of network topology
• E.g., follower relationship, friend relationship, etc.
• HOLY GRAIL: Direct Semantics in favor of
Indirect Translations
10/31/2016 Trust Management: T. K. Prasad 195
149. Research Issues :
Credibility and Tweets
• Social Media is a source of News, means
for tracking Diseases, and coordination in
Disaster Scenarios
• But is fraught with Rumors (e.g., during
Mumbai Bombings), Lies (e.g., during
Boston Marathon Bombing), and Fakes
(e.g., during Hurricane Sandy)
10/31/2016 Trust Management: T. K. Prasad 196
151. Top 10 Credibility Indicators
from Morris et al. [37]
10/31/2016 Trust Management: T. K. Prasad 198
Feature Average Credibility Impact
Tweet is a RT from a trusted source 4.08
Author is an expert on the topic 4.04
Author is someone you follow 4.00
Contains URL you followed 3.93
Author is someone you have heard of 3.93
Author’s account is verified 3.92
Author often tweets on topic 3.74
Author has many tweets with similar content 3.71
Author uses personal photo as avatar 3.70
Author often mentioned and retweeted 3.69
152. Research Issues :
Multimodal Integration
• Intelligent integration of mobile sensor
and social data for situational awareness
– To exploit corroborative and
complementary evidence provided by them
– To obtain qualitative and quantitative
context
– To improve robustness and completeness
10/31/2016 Trust Management: T. K. Prasad 199
156. Research Issues
• Linguistic clues that betray
trustworthiness
• Experiments for gauging interpersonal
trust in real world situations
– *Techniques and tools to detect and amplify
useful signals in Self to more accurately predict
trust and trustworthiness in Others
10/31/2016 Trust Management: T. K. Prasad 203
*IARPA-TRUST program
157. Research Issues
• Other clues for gleaning trustworthiness
– Face (in photo) can effect perceived
trustworthiness and decision making
–Trust-inducing features of e-commerce
sites can impact buyers
– Personal traits: religious beliefs, age,
gullibility, benevolence, etc
– Nature of dyadic relationship
10/31/2016 Trust Management: T. K. Prasad 204
158. Research Issues
• Study of cross-cultural differences in
trustworthiness qualities and trust thresholds
to better understand
– Influence
• What aspects improve influence?
– Manipulation
• What aspects flag manipulation?
10/31/2016 Trust Management: T. K. Prasad 205
160. Research Issues
• Trust-aware resource management and
scheduling
– Clients specify resource
preferences/requirements/constraints
• Trust models for P2P systems
– To detect bad domains
– To detect bogus recommendations and attacks
10/31/2016 Trust Management: T. K. Prasad 207
Azzedin and Maheshwaran, 2002-2003
Azzedin and Ridha, 2010
Bessis et al, 2011
161. Research Issues : Resilience
• Resiliency is the ability to maintain an
acceptable level of service even with
faults and challenges to normal operation.
• Coping with failures in computer systems
– Failed component stops working
– Failed component sends conflicting
information to different parts of a system.
(Byzantine Fault)
• Agreement in the presence of faults.
10/31/2016 Trust Management: T. K. Prasad 208
162. Large Scale Graph Processing for
Scalable Implementation
10/31/2016 Trust Management: T. K. Prasad 209
163. Pregel
• Scalable general-purpose graph processing
system primarily designed for Google cluster.
• Inspired from Bulk Synchronous Parallel
164. Pregel: Anatomy of Superstep
• Read messages sent to vertex V in superstep S-1.
• Send messages to other vertices that will be
delivered in superstep S+1.
• Modify the state of V and it’s outgoing edges
• Can change the graph topology by adding,
deleting or modifying edges as well as vertices.
165. Giraph
• Open source implementation of Pregel.
• Can be executed as a Hadoop job.
worker worker
Task Tracker
JobTracker
NameNode
ZooKeeper
worker worker
Task Tracker
worker worker
Task Tracker
master worker
Task Tracker
Giraph implementation leveraging Hadoop
167. Bayesian Trust Management Framework :
Multi-level Trust Metric
10/31/2016 Trust Management: T. K. Prasad 214
Illustrating a General Approach
Quercia et al 2006
Josang and Haller 2007
Thirunarayan et al 2012
168. General Outline
• Motivation : Multi-level trust management
• Mathematical Foundation: Dirichlet
Distribution
• Implementation and Behavior Details:
– Local Trust Data Structures
– Trust Formation
– Bayesian Trust Evolution
• Analysis of Robustness to Attacks: Security
• Evaluation: Example trace vs. experiment
10/31/2016 Trust Management: T. K. Prasad 215
169. Conclusion
• Provided simple examples of trust (Why?)
• Explained salient features of trust (What?)
• Showed examples of gleaning trustworthiness
(How?)
• Touched upon research challenges in the
context of sensor, social, and interpersonal
networks.
• Described an outline for multi-level trust
management
10/31/2016 Trust Management: T. K. Prasad 216
170. http://knoesis.wright.edu/tkprasad
Department of Computer Science and Engineering
Wright State University, Dayton, Ohio, USA
Kno.e.sis: Ohio Center of Excellence in Knowledge-enabled Computing
10/31/16
Special Thanks to: Dr. Pramod Anantharam, Dr. Cory Henson, Professor Amit Sheth,
Dharan Kumar Althuru, Jacob Ross
K. Thirunarayan, et al: Comparative Trust Management with Applications:
Bayesian Approaches Emphasis. Future Generation Comp. Syst. 31: 182-199
(2014)
[Course: http://cecs.wright.edu/~tkprasad/courses/cs7600/cs7600.html]
217