Presented at the 31st ACM User Interface Software and Technology Symposium (UIST), 2018. Paper: https://www.ischool.utexas.edu/~ml/papers/nguyen-uist18.pdf
Talk given at Delft University speaker series on "Crowd Computing & Human-Centered AI" (https://www.academicfringe.org/). November 23, 2020. Covers two 2020 works:
(1) Anubrata Das, Brandon Dang, and Matthew Lease. Fast, Accurate, and Healthier: Interactive Blurring Helps Moderators Reduce Exposure to Harmful Content. In Proceedings of the 8th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2020.
(2) Alexander Braylan and Matthew Lease. Modeling and Aggregation of Complex Annotations via Annotation Distances. In Proceedings of the Web Conference, pages 1807--1818, 2020.
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno... Matthew Lease
This document summarizes a presentation about designing human-AI partnerships for fact-checking misinformation. It discusses using crowdsourced rationales to improve the accuracy and cost-efficiency of annotation tasks. It also addresses challenges in designing interfaces for automatic fact-checking models, such as integrating human knowledge and reasoning to correct errors and account for bias. The goal is to develop mixed-initiative systems where humans and AI can jointly reason and personalize fact-checking.
Designing Human-AI Partnerships to Combat Misinformation Matthew Lease
The document discusses designing human-AI partnerships to combat misinformation. It describes a prototype partnership where a human and AI work together to fact-check claims. The partnership aims to make the AI more transparent and address user bias by allowing the user to adjust the perceived reliability of news sources, which then changes the AI's political leaning analysis and fact checking results. The discussion wraps up by noting challenges like avoiding echo chambers and assessing potential harms, as well as opportunities for AI to reduce bias and increase trust through explainable, interactive systems.
Explainable Fact Checking with Humans in-the-loop Matthew Lease
Invited Keynote at KDD 2021 TrueFact Workshop: Making a Credible Web for Tomorrow, August 15, 2021.
https://www.microsoft.com/en-us/research/event/kdd-2021-truefact-workshop-making-a-credible-web-for-tomorrow/#!program-schedule
Talk given August 29, 2018 at the 1st Biannual Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES 2018). Paper: https://www.ischool.utexas.edu/~ml/papers/lease-desires18.pdf
Presentation given at the Linguistic Data Consortium (LDC), University of Pennsylvania, April 2019. Based on presentations at the 6th ACM Collective Intelligence Conference, 2018 and the 6th AAAI Conference on Human Computation & Crowdsourcing (HCOMP), 2018. Blog post: https://blog.humancomputation.com/?p=9932.
AI & Work, with Transparency & the Crowd Matthew Lease
The document discusses designing human-AI partnerships and the role of crowdsourcing in AI systems. It summarizes work on designing AI assistants to work with humans, using crowds to help fact-check information, and explores challenges around protecting crowd workers who review harmful content or do "dirty jobs". It advocates for more research on ethics in AI and using crowds to help check work for ethical issues.
Introduction to Data Science and Large-scale Machine Learning Nik Spirin
This document is a presentation about data science and artificial intelligence given by James G. Shanahan. It provides an outline that covers topics such as machine learning, data science applications, architecture, and future directions. Shanahan has over 25 years of experience in data science and currently works as an independent consultant and teaches at UC Berkeley. The presentation provides background on artificial intelligence and machine learning techniques as well as examples of their successful applications.
Smart Data - How you and I will exploit Big Data for personalized digital hea... Amit Sheth
Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014.
Abstract from:
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there has been rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health, which relates to making better decisions about our health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what is the risk of her having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain-specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in cognitive models.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city.
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver... Saurabh Mishra
This group reviewed data and measurements indicating the positive potential of AI to serve Sustainable Development Goals (SDG’s). Alongside these optimistic inquiries, this group also investigated the risks of AI in areas such as privacy, vulnerable populations, human rights, workplace and organizational policy. The socio-political consequences of AI raise many complex questions which require continued rigorous examination.
Philosophy of Big Data: Big Data, the Individual, and Society Melanie Swan
Philosophical concepts elucidate the impact the Big Data Era (exabytes per year of scientific, governmental, corporate, and personal data being created) is having on our sense of ourselves as individuals in society: information generators in constant dialogue with a pervasive information climate.
Citizen Sensor Data Mining, Social Media Analytics and Applications Amit Sheth
Opening talk at Singapore Symposium on Sentiment Analysis (S3A), February 6, 2015, Singapore. http://s3a.sentic.net/#s3a2015
Abstract
With the rapid rise in the popularity of social media, and near-ubiquitous mobile access, the sharing of observations and opinions has become commonplace. This has given us unprecedented access to the pulse of a populace and the ability to perform analytics on social data to support a variety of socially intelligent applications -- be it for brand tracking and management, crisis coordination, organizing revolutions, or promoting social development in underdeveloped and developing countries.
I will review: 1) understanding and analysis of informal text, esp. microblogs (e.g., issues of cultural entity extraction and role of semantic/background knowledge enhanced techniques), and 2) how we built Twitris, a comprehensive social media analytics (social intelligence) platform.
I will describe the analysis capabilities along three dimensions: spatio-temporal-thematic, people-content-network, and sentiment-emotion-intent. I will couple technical insights with identification of computational techniques and real-world examples using live demos of Twitris (http://twitris2.knoesis.org).
A talk at the Urban Science workshop at the Puget Sound Regional Council, July 20, 2014, organized by the Northwest Institute for Advanced Computing, a joint effort between Pacific Northwest National Laboratory and the University of Washington.
A key contemporary trend emerging in big data science is the Quantified Self (QS) - individuals engaged in the deliberate self-tracking of any kind of biological, physical, behavioral, or transactional information as n=1 individuals or in groups. This is giving rise to interesting pools of individual data, group data, and big data which can be interlinked to create a new era of highly-targeted value-specific consumer applications. There are significant opportunities in big data to develop models to support QS data collection, integration, analysis, and use for personal lifestyle and consumption management. There are also opportunities to provide leadership in designing consumer-friendly standards and etiquette regarding the use of personal and collective data. Next-generation QS big data applications and services could include tools for rendering QS data meaningful in behavior change, establishing baselines and variability in objective metrics, applying new kinds of pattern recognition techniques, and aggregating multiple self-tracking data streams from wearable electronics, biosensors, mobile phones, genomic data, and cloud-based services. Potential limitations regarding QS activity need to be considered including consumer non-adoption, data privacy and sharing concerns, the digital divide, ease-of-use, and social acceptance.
Presented at the Panel on Sensor, Data, Analytics and Integration in Advanced Manufacturing, at the Connected Manufacturing track of the Bosch-USA-organized "Leveraging Public-Private Partnerships for Regional Growth Summit". Panel statement: Sensors, data and analytics are the core of any smart manufacturing system. What are the main challenges to create actionable outputs, replicate systems and scale efficiency gains across industries?
Moderator: Thomas Stiedl, Bosch
Panelists:
1. Amit Sheth, Wright State University
2. Howie Choset, Carnegie Mellon University
3. Nagi Gebraeel, Georgia Institute of Technology
4. Brian Anthony, Massachusetts Institute of Technology
5. Yarom Polsky, Oak Ridge National Laboratory
For an in-depth look:
Smart IoT: IoT as a human agent, human extension, and human complement
http://amitsheth.blogspot.com/2015/03/smart-iot-iot-as-human-agent-human.html
Semantic Gateway: http://knoesis.org/library/resource.php?id=2154
SSN Ontology: http://knoesis.org/library/resource.php?id=1659
Applications of Multimodal Physical (IoT), Cyber and Social Data for Reliable and Actionable Insights: http://knoesis.org/library/resource.php?id=2018
Smart Data: Transforming Big Data into Smart Data...: http://wiki.knoesis.org/index.php/Smart_Data
Historic use of the term Smart Data (2004): http://www.scribd.com/doc/186588820
Designing Cybersecurity Policies with Field Experiments Gene Moo Lee
This document summarizes Gene Moo Lee's research on using randomized field experiments to evaluate the effectiveness of cybersecurity policies at the organizational level. The research aims to set up an independent institution to monitor organizations' cybersecurity levels and evaluate how information disclosure impacts behavior. The experiment involved randomly assigning over 7,900 US organizations to control, private disclosure, or public disclosure treatment groups. Preliminary results found that private disclosure did not change behavior but public disclosure via a website reduced spam volumes, especially for organizations that initially had large spam volumes or less competition. Further analysis of the effects is ongoing.
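The core of the experimental design described above is balanced random assignment of organizations to control, private-disclosure, and public-disclosure arms. The sketch below is an illustrative reconstruction of that step only; the function and arm names are my own, not code from the study.

```python
import random

def assign_treatments(org_ids, arms=("control", "private", "public"), seed=0):
    """Randomly assign each organization to one treatment arm.

    Shuffling once and dealing round-robin keeps group sizes within
    one of each other, so the arms are as balanced as possible.
    """
    rng = random.Random(seed)  # fixed seed makes the assignment reproducible
    ids = list(org_ids)
    rng.shuffle(ids)
    return {org: arms[i % len(arms)] for i, org in enumerate(ids)}

# ~7,900 organizations, as in the experiment described above
groups = assign_treatments(range(7900))
```

A real field experiment would likely stratify on covariates (e.g., baseline spam volume) before randomizing; the round-robin deal shown here is the simplest balanced variant.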
An invited talk in the Big Data session of the Industrial Research Institute meeting in Seattle Washington.
Some notes on how to train data science talent and exploit the fact that the membrane between academia and industry has become more permeable.
The Philosophy of Big Data is the branch of philosophy concerned with the foundations, methods, and implications of big data: the definitions, meaning, conceptualization, knowledge possibilities, truth standards, and practices in situations involving very large data sets that are big in volume, velocity, variety, veracity, and variability.
The document provides an overview of funding and active projects at Kno.e.sis as of December 2015. Key details include total extramural funds exceeding $8.3 million with the majority obtained that year from competitive NSF and NIH sources. Active projects focus on areas such as context-aware harassment detection on social media, monitoring drug trends on social media, disaster management using social and physical sensing, and modeling social behavior for healthcare utilization in depression. The summary highlights student and faculty involvement and accomplishments across multiple funded projects.
Teaching, Assessment and Learning Analytics: Time to Question Assumptions Simon Buckingham Shum
Presented by the Assessment Research Centre
and the Melbourne Centre for the Study of Higher Education
Teaching, Assessment and Learning Analytics: Time to Question Assumptions
Simon Buckingham Shum
Professor of Learning Informatics, and Director of the Connected Intelligence Centre (CIC)
University of Technology Sydney
When: 11.30 am - 12.30 pm, Wed. 13 Sep 2017
Where: Frank Tate Room, Level 9, 100 Leicester St, Carlton
This will be a non-technical talk accessible to a broad range of educational practitioners and researchers, designed to provoke a conversation that provides time to question assumptions. The field of Learning Analytics sits at the convergence of two fields: Learning (including learning technology, educational research and learning/assessment sciences) and Analytics (statistics; visualisation; computer science; data science; AI). Many would add Human-Computer Interaction (e.g. participatory design; user experience; usability evaluation) as a differentiator from related fields such as Educational Data Mining, since the Learning Analytics community attracts many with a concern for the sociotechnical implications of designing and embedding analytics in educational organisations.
Learning Analytics is viewed by many educators with the same suspicion they reserve for AI or “learning management systems”. While in some cases this is justified, I will question other assumptions with some learning analytics examples which can serve as objects for us to think with. I am curious to know what connections/questions arise when these are shared.
Simon Buckingham Shum is Professor of Learning Informatics at the University of Technology Sydney, where he was appointed in August 2014 to direct the new Connected Intelligence Centre. Previously he was Professor of Learning Informatics and an Associate Director at The UK Open University’s Knowledge Media Institute. He is active in the field of Learning Analytics as a co-founder and former Vice President of the Society for Learning Analytics Research, and Program Co-Chair of LAK18, the International Learning Analytics and Knowledge Conference. Previously he co-founded the Compendium Institute and Learning Emergence networks. Simon brings a Human-Centred Informatics (HCI) approach to his work, with a background in Psychology (BSc, York), Ergonomics (MSc, London) and HCI Design Argumentation (PhD, York). He co-edited Visualizing Argumentation (2003) followed by Knowledge Cartography (2008, 2nd Edn. 2014), and with Al Selvin, wrote Constructing Knowledge Art (2015). He was recently appointed as a Fellow of The RSA. http://Simon.BuckinghamShum.net
The document discusses learning analytics and cognitive automation, and their implications for education. It begins by outlining how cognitive automation is automating routine cognitive work. This will impact learning analytics, as analytics aggregate lower-level data and AI automates routine cognitive tasks. As a result, humans must focus on higher-order skills like creativity, ethics, resilience and curiosity. The document then provides examples of learning analytics research focusing on dispositions, teamwork and learning beyond the classroom. It argues analytics could assess holistic development if they evaluate integration of knowledge, skills and dispositions over time.
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ... Amit Sheth
Featured Keynote at Worldcomp'14, July 2014: http://www.world-academy-of-science.org/worldcomp14/ws/keynotes/keynote_sheth
Video of the talk at: http://youtu.be/2991W7OBLqU
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there has been rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is human health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information, etc.). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will put forward the concept of Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If I am an asthma patient, for all the data relevant to me with the four V-challenges, what I care about is simply, “How is my current health, and what is the risk of having an asthma attack in my personal situation, especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain-specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city. I will present examples from a couple of these.
Strategic Network Formation in a Location-Based Social Network Gene Moo Lee
This document summarizes Gene Moo Lee's presentation on strategic network formation in location-based social networks (LBSNs). It introduces three research questions about how mobile users form friendships and how user similarity can be measured. It then provides an overview of Lee's structural model of network formation, his approach to measuring user similarity using topic models, and an empirical analysis using a large LBSN dataset to examine how different factors influence friendship links.
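Measuring user similarity with topic models, as described above, typically reduces to comparing users' topic-proportion vectors. The sketch below shows one common choice, cosine similarity; the vectors and function name are illustrative assumptions, not taken from Lee's model.

```python
import math

def topic_similarity(p, q):
    """Cosine similarity between two users' topic-proportion vectors
    (e.g., distributions over latent topics inferred from their posts).

    Returns a value in [0, 1] for non-negative topic proportions,
    with 1.0 meaning identical topic mixes.
    """
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

# hypothetical 3-topic mixes for two users
user_a = [0.7, 0.2, 0.1]
user_b = [0.6, 0.3, 0.1]
sim = topic_similarity(user_a, user_b)
```

Such a pairwise similarity score could then enter a network-formation model as one covariate predicting whether a friendship link forms.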
This document discusses algorithmic fairness and the impacts of machine learning and AI systems on society. It provides an outline of topics to be covered, including sources of algorithmic bias, examination of key research papers in the field, and sketching out a different direction for the discussion. The document reviews several influential papers that proposed definitions of fairness for algorithms and techniques for achieving fairness, but also showed limitations and tradeoffs. It discusses how combining the results of these papers suggests that achieving perfect algorithmic fairness through technical means alone may not be possible. The document argues for taking a broader view of the problem that considers social and normative issues, in addition to technical approaches.
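One of the fairness definitions such papers formalize, demographic parity, is simple enough to illustrate with a toy metric. The sketch below is illustrative only and not drawn from any specific paper discussed; the data and function name are assumptions.

```python
def demographic_parity_gap(preds, groups):
    """Largest difference in positive-prediction rates across groups.

    A gap of 0 satisfies demographic parity; the impossibility results
    alluded to above show this generally cannot be achieved jointly
    with other fairness criteria such as calibration.
    """
    rates = {}
    for g in set(groups):
        member_preds = [p for p, gg in zip(preds, groups) if gg == g]
        rates[g] = sum(member_preds) / len(member_preds)
    vals = sorted(rates.values())
    return vals[-1] - vals[0]

# toy binary predictions for two groups of four people each
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)  # rates 0.75 vs 0.25, gap 0.5
```

Even this toy example shows why "fairness" needs a social framing: whether a 0.5 gap is unjust depends on context the metric cannot see.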
The document discusses how digital infrastructures like algorithms, social media platforms, and search engines reflect and perpetuate existing biases and inequities in society. It provides examples of how algorithms have exhibited biases against women in hiring and racial disparities in criminal risk assessments. Social media platforms have enabled the spread of misinformation and the harvesting of personal data without consent. Search engines aim to provide unbiased access to information but can also normalize hateful or extremist views through their commercial priorities and lack of vetting. Overall, the document argues that digital infrastructures are shaped by the values of their creators and serve to preserve existing power structures and inequities unless changes are made.
Introduction to Data Science and Large-scale Machine LearningNik Spirin
This document is a presentation about data science and artificial intelligence given by James G. Shanahan. It provides an outline that covers topics such as machine learning, data science applications, architecture, and future directions. Shanahan has over 25 years of experience in data science and currently works as an independent consultant and teaches at UC Berkeley. The presentation provides background on artificial intelligence and machine learning techniques as well as examples of their successful applications.
Smart Data - How you and I will exploit Big Data for personalized digital hea...Amit Sheth
Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014.
Abstract from:
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there is rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health that related to taking better decisions about our health, fitness, and well-being. Consider for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has different set of relevant data!
In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what are the risk of having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city.
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Saurabh Mishra
This group reviewed data and measurements indicating the positive potential of AI to serve Sustainable Development Goals (SDG’s). Alongside these optimistic inquiries, this group also investigated the risks of AI in areas such as privacy, vulnerable populations, human rights, workplace and organizational policy. The socio-political consequences of AI raise many complex questions which require continued rigorous examination.
Philosophy of Big Data: Big Data, the Individual, and SocietyMelanie Swan
Philosophical concepts elucidate the impact the Big Data Era (exabytes/year of scientific, governmental, corporate, personal data being created) is having on our sense of ourselves as individuals in society as information generators in constant dialogue with the pervasive information climate.
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsAmit Sheth
Opening talk at Singapore Symposium on Sentiment Analysis (S3A), February 6, 2015, Singapore. http://s3a.sentic.net/#s3a2015
Abstract
With the rapid rise in the popularity of social media, and near ubiquitous mobile access, the sharing of observations and opinions has become common-place. This has given us an unprecedented access to the pulse of a populace and the ability to perform analytics on social data to support a variety of socially intelligent applications -- be it for brand tracking and management, crisis coordination, organizing revolutions or promoting social development in underdeveloped and developing countries.
I will review: 1) understanding and analysis of informal text, esp. microblogs (e.g., issues of cultural entity extraction and role of semantic/background knowledge enhanced techniques), and 2) how we built Twitris, a comprehensive social media analytics (social intelligence) platform.
I will describe the analysis capabilities along three dimensions: spatio-temporal-thematic, people-content-network, and sentiment-emption-intent. I will couple technical insights with identification of computational techniques and real-world examples using live demos of Twitris (http://twitris2.knoesis.org).
A talk at the Urban Science workshop at the Puget Sound Regional Council July 20 2014 organized by the Northwest Institute for Advanced Computing, a joint effort between Pacific Northwest National Labs and the University of Washington.
A key contemporary trend emerging in big data science is the Quantified Self (QS) - individuals engaged in the deliberate self-tracking of any kind of biological, physical, behavioral, or transactional information as n=1 individuals or in groups. This is giving rise to interesting pools of individual data, group data, and big data which can be interlinked to create a new era of highly-targeted value-specific consumer applications. There are significant opportunities in big data to develop models to support QS data collection, integration, analysis, and use for personal lifestyle and consumption management. There are also opportunities to provide leadership in designing consumer-friendly standards and etiquette regarding the use of personal and collective data. Next-generation QS big data applications and services could include tools for rendering QS data meaningful in behavior change, establishing baselines and variability in objective metrics, applying new kinds of pattern recognition techniques, and aggregating multiple self-tracking data streams from wearable electronics, biosensors, mobile phones, genomic data, and cloud-based services. Potential limitations regarding QS activity need to be considered including consumer non-adoption, data privacy and sharing concerns, the digital divide, ease-of-use, and social acceptance.
Presented at the panel on Sensors, Data, Analytics and Integration in Advanced Manufacturing, in the Connected Manufacturing track of the Bosch-USA-organized "Leveraging Public-Private Partnerships for Regional Growth" Summit. Panel statement: Sensors, data and analytics are the core of any smart manufacturing system. What are the main challenges in creating actionable outputs, replicating systems, and scaling efficiency gains across industries?
Moderator: Thomas Stiedl, Bosch
Panelists:
1. Amit Sheth, Wright State University
2. Howie Choset, Carnegie Mellon University
3. Nagi Gebraeel, Georgia Institute of Technology
4. Brian Anthony, Massachusetts Institute of Technology
5. Yarom Polsky, Oak Ridge National Laboratory
For an in-depth look:
Smart IoT: IoT as a human agent, human extension, and human complement
http://amitsheth.blogspot.com/2015/03/smart-iot-iot-as-human-agent-human.html
Semantic Gateway: http://knoesis.org/library/resource.php?id=2154
SSN Ontology: http://knoesis.org/library/resource.php?id=1659
Applications of Multimodal Physical (IoT), Cyber and Social Data for Reliable and Actionable Insights: http://knoesis.org/library/resource.php?id=2018
Smart Data: Transforming Big Data into Smart Data...: http://wiki.knoesis.org/index.php/Smart_Data
Historic use of the term Smart Data (2004): http://www.scribd.com/doc/186588820
Designing Cybersecurity Policies with Field Experiments - Gene Moo Lee
This document summarizes Gene Moo Lee's research on using randomized field experiments to evaluate the effectiveness of cybersecurity policies at the organizational level. The research aims to set up an independent institution to monitor organizations' cybersecurity levels and evaluate how information disclosure impacts behavior. The experiment involved randomly assigning over 7,900 US organizations to control, private disclosure, or public disclosure treatment groups. Preliminary results found that private disclosure did not change behavior but public disclosure via a website reduced spam volumes, especially for organizations that initially had large spam volumes or less competition. Further analysis of the effects is ongoing.
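The experimental design described above can be sketched in a few lines; the group names and the spam-volume comparison below are illustrative assumptions, not the authors' actual analysis code:

```python
import random
from statistics import mean

def assign_groups(org_ids, seed=0):
    """Randomly assign organizations to the three experimental arms."""
    rng = random.Random(seed)
    shuffled = list(org_ids)
    rng.shuffle(shuffled)
    arms = ("control", "private_disclosure", "public_disclosure")
    return {org: arms[i % 3] for i, org in enumerate(shuffled)}

def avg_spam_change(spam_before, spam_after, assignment, arm):
    """Mean change in spam volume for organizations in one arm."""
    deltas = [spam_after[o] - spam_before[o]
              for o, a in assignment.items() if a == arm]
    return mean(deltas)
```

Because assignment is randomized, a larger post-treatment drop in the public-disclosure arm than in the control arm can be attributed to the disclosure itself rather than to pre-existing differences between organizations.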
An invited talk in the Big Data session of the Industrial Research Institute meeting in Seattle Washington.
Some notes on how to train data science talent and exploit the fact that the membrane between academia and industry has become more permeable.
The Philosophy of Big Data is the branch of philosophy concerned with the foundations, methods, and implications of big data: the definitions, meaning, conceptualization, knowledge possibilities, truth standards, and practices in situations involving very large data sets that are big in volume, velocity, variety, veracity, and variability.
The document provides an overview of funding and active projects at Kno.e.sis as of December 2015. Key details include total extramural funds exceeding $8.3 million with the majority obtained that year from competitive NSF and NIH sources. Active projects focus on areas such as context-aware harassment detection on social media, monitoring drug trends on social media, disaster management using social and physical sensing, and modeling social behavior for healthcare utilization in depression. The summary highlights student and faculty involvement and accomplishments across multiple funded projects.
Teaching, Assessment and Learning Analytics: Time to Question Assumptions - Simon Buckingham Shum
Presented by the Assessment Research Centre
and the Melbourne Centre for the Study of Higher Education
Teaching, Assessment and Learning Analytics: Time to Question Assumptions
Simon Buckingham Shum
Professor of Learning Informatics, and Director of the Connected Intelligence Centre (CIC)
University of Technology Sydney
When: 11.30 am - 12.30 pm, Wed. 13 Sep 2017
Where: Frank Tate Room, Level 9, 100 Leicester St, Carlton
This will be a non-technical talk accessible to a broad range of educational practitioners and researchers, designed to provoke a conversation that provides time to question assumptions. The field of Learning Analytics sits at the convergence of two fields: Learning (including learning technology, educational research and learning/assessment sciences) and Analytics (statistics; visualisation; computer science; data science; AI). Many would add Human-Computer Interaction (e.g. participatory design; user experience; usability evaluation) as a differentiator from related fields such as Educational Data Mining, since the Learning Analytics community attracts many with a concern for the sociotechnical implications of designing and embedding analytics in educational organisations.
Learning Analytics is viewed by many educators with the same suspicion they reserve for AI or “learning management systems”. While in some cases this is justified, I will question other assumptions with some learning analytics examples which can serve as objects for us to think with. I am curious to know what connections/questions arise when these are shared.
Simon Buckingham Shum is Professor of Learning Informatics at the University of Technology Sydney, where he was appointed in August 2014 to direct the new Connected Intelligence Centre. Previously he was Professor of Learning Informatics and an Associate Director at The UK Open University’s Knowledge Media Institute. He is active in the field of Learning Analytics as a co-founder and former Vice President of the Society for Learning Analytics Research, and Program Co-Chair of LAK18, the International Learning Analytics and Knowledge Conference. Previously he co-founded the Compendium Institute and Learning Emergence networks. Simon brings a Human-Centred Informatics (HCI) approach to his work, with a background in Psychology (BSc, York), Ergonomics (MSc, London) and HCI Design Argumentation (PhD, York). He co-edited Visualizing Argumentation (2003) followed by Knowledge Cartography (2008, 2nd Edn. 2014), and with Al Selvin, wrote Constructing Knowledge Art (2015). He was recently appointed as a Fellow of The RSA. http://Simon.BuckinghamShum.net
The document discusses learning analytics and cognitive automation, and their implications for education. It begins by outlining how cognitive automation is automating routine cognitive work. This will impact learning analytics, as analytics aggregate lower-level data and AI automates routine cognitive tasks. As a result, humans must focus on higher-order skills like creativity, ethics, resilience and curiosity. The document then provides examples of learning analytics research focusing on dispositions, teamwork and learning beyond the classroom. It argues analytics could assess holistic development if they evaluate integration of knowledge, skills and dispositions over time.
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ... - Amit Sheth
Featured Keynote at Worldcomp'14, July 2014: http://www.world-academy-of-science.org/worldcomp14/ws/keynotes/keynote_sheth
Video of the talk at: http://youtu.be/2991W7OBLqU
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there is rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is human health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information, etc.). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will put forward the concept of Smart Data, which is realized by extracting value from Big Data to benefit not just large companies but each individual. If I am an asthma patient, for all the data relevant to me with the four V-challenges, what I care about is simply, “How is my current health, and what is the risk of having an asthma attack in my personal situation, especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain-specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP.
For harnessing volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city. I will present examples from a couple of these.
Strategic Network Formation in a Location-Based Social Network - Gene Moo Lee
This document summarizes Gene Moo Lee's presentation on strategic network formation in location-based social networks. It introduces three research questions about how mobile users form friendships and measures user similarity. It then provides an overview of Lee's structural model of network formation, approach to measuring user similarity using topic models, and empirical analysis using a large LBSN dataset to examine how different factors influence friendship links.
This document discusses algorithmic fairness and the impacts of machine learning and AI systems on society. It provides an outline of topics to be covered, including sources of algorithmic bias, examination of key research papers in the field, and sketching out a different direction for the discussion. The document reviews several influential papers that proposed definitions of fairness for algorithms and techniques for achieving fairness, but also showed limitations and tradeoffs. It discusses how combining the results of these papers suggests that achieving perfect algorithmic fairness through technical means alone may not be possible. The document argues for taking a broader view of the problem that considers social and normative issues, in addition to technical approaches.
The document discusses how digital infrastructures like algorithms, social media platforms, and search engines reflect and perpetuate existing biases and inequities in society. It provides examples of how algorithms have exhibited biases against women in hiring and racial disparities in criminal risk assessments. Social media platforms have enabled the spread of misinformation and the harvesting of personal data without consent. Search engines aim to provide unbiased access to information but can also normalize hateful or extremist views through their commercial priorities and lack of vetting. Overall, the document argues that digital infrastructures are shaped by the values of their creators and serve to preserve existing power structures and inequities unless changes are made.
Determining the Fit and Impact of CTI Indicators on Your Monitoring Pipeline ... - Alex Pinto
Implementing an appropriate data processing pipeline to make good use of your indicators of compromise is a problem that has been successfully addressed over the last few years. Even with all the push for automation and orchestration, a fundamental question remains: WHICH data should I be ingesting in my detection pipelines? There is no lack of data available, shared or not, paid or not. But how do I keep my CTI/IR team from spinning their wheels on a pile of CTI mud?
This talk will discuss statistical analysis you can do with the CTI indicators you collect and your own network telemetry to define:
- FIT: How well the CTI data applies to your own traffic. CTI vendors always talk about vertical-specific threats, but is that measurable and verifiable?
- IMPACT: How much your true-positive detections were assisted by matches and link analysis derived from those CTI feeds.
- COVERAGE: Is your current mix of CTI feeds providing "intelligence" on the current threats that you should actually be concerned with?
Those concepts will be introduced and explained with minimal math background needed, and pseudo-code (and real code!) will be provided to assist organizations in performing those experiments in their own environments. We hope those tools will help attendees better evaluate the quality of the CTI feeds they ingest from their open sources, paid providers, and sharing communities.
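One simple way to operationalize the FIT and IMPACT ideas is as overlap ratios between a feed and your own telemetry. This is a minimal sketch under assumed definitions (indicator-set intersection for FIT, matched true-positive fraction for IMPACT), not the speaker's actual formulas:

```python
def fit_score(feed_indicators, observed_indicators):
    """FIT: share of a feed's indicators that ever appear in your own telemetry.

    A feed full of indicators you never observe contributes little detection value.
    """
    feed = set(feed_indicators)
    if not feed:
        return 0.0
    return len(feed & set(observed_indicators)) / len(feed)

def impact_score(true_positive_alerts, feed_indicators):
    """IMPACT: share of confirmed (true-positive) alerts that matched a feed indicator."""
    feed = set(feed_indicators)
    if not true_positive_alerts:
        return 0.0
    matched = sum(1 for alert in true_positive_alerts
                  if alert["indicator"] in feed)
    return matched / len(true_positive_alerts)
```

For example, a feed of four indicators of which two ever appear in your traffic would score FIT = 0.5; comparing such scores across feeds helps decide which subscriptions actually fit your environment.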
How do we train AI to be Ethical and Unbiased? - Mark Borg
The document discusses recent achievements in AI such as improvements in speech recognition and image captioning. It then addresses the widespread use of AI and potential benefits as well as concerns regarding issues like data bias, model reliability, misuse of AI systems, and adversarial AI. The document argues that addressing these technical issues and social implications will help maximize the benefits of AI.
This document discusses perspectives on artificial intelligence (AI) from technology leaders and experts. It summarizes views that AI will benefit humanity by helping to solve major challenges, but could also pose existential risks if not developed responsibly. The document also outlines how AI is rapidly advancing and transforming industries like automotive, healthcare, and personal assistance. While AI may displace some jobs, it could also create new types of work. Overall the document expresses an optimistic view of AI's potential if issues around ethics, safety, and economic impacts are adequately addressed.
This document discusses bias in artificial intelligence and algorithms. It begins with an introduction to the topic and why it is important. It then explores how to detect bias through various fairness metrics and how to mitigate bias through preprocessing, inprocessing, and postprocessing techniques. The document provides examples of different sources of bias and strategies to address them. It also recommends resources like the AI Fairness 360 toolkit to help evaluate models for fairness and identify potential biases.
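To make the fairness-metric idea concrete, here is a minimal sketch of one common metric, demographic parity difference (the gap in positive-prediction rates between two groups). The metric choice and binary-group setup are illustrative assumptions; toolkits such as AI Fairness 360 implement this and many other metrics:

```python
def demographic_parity_difference(predictions, groups):
    """Absolute gap in positive-prediction rates between two groups.

    predictions: iterable of 0/1 model outputs
    groups:      iterable of group labels (exactly two distinct values)
    A value near 0 suggests the model selects both groups at similar rates.
    """
    labels = sorted(set(groups))
    assert len(labels) == 2, "expects exactly two groups"
    rates = {}
    for g in labels:
        picked = [p for p, gg in zip(predictions, groups) if gg == g]
        rates[g] = sum(picked) / len(picked)
    return abs(rates[labels[0]] - rates[labels[1]])
```

Pre-, in-, and post-processing mitigation techniques then aim to drive metrics like this toward zero, typically trading off some accuracy in the process.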
Crowdsourcing & ethics: a few thoughts and references - Matthew Lease
Extracts and addendums from an earlier talk, for those interested in ethics and related issues in regard to crowdsourcing, particularly research uses. Slides updated Sept. 2, 2013.
This document provides an overview of artificial intelligence and machine learning. It begins by defining AI as computer systems that can perform tasks autonomously and adaptively. Machine learning is described as getting computers to learn without being explicitly programmed. Examples of machine learning in daily life are discussed. The basics of supervised and unsupervised learning are explained. Ethical issues around AI like bias, fairness, and determining appropriate use are then discussed. Options for addressing these issues like ensuring diversity of data and viewpoints are presented. The document concludes by providing recommendations for further learning.
The law and ethics of data-driven artificial intelligence - PyData
By Aileen Nielsen
PyData New York City 2017
This talk is a completely non-technical discussion of how the law currently regulates artificial intelligence (if it does at all) and what is likely to change in the near future. The talk is geared towards technically-minded practitioners of data driven intelligence with the aim of increasing discussion of the social and ethical impact of data driven AI and how to code responsibly.
This document discusses some of the challenges of recommendation systems and how they can go wrong if not implemented carefully. It provides three key points:
1) Recommendation systems are difficult to implement well due to issues like cold starts, sparsity of data, and the potential to propagate biases in the data. Features used and their interactions must be chosen carefully.
2) These systems can negatively impact people's lives if they are not developed with considerations for fairness, ethics, and accountability. Models must be interpretable and avoid learning or promoting prejudices.
3) Thorough testing is needed using diverse datasets, including outliers, to identify and prevent unfair, biased, or harmful behaviors before systems are deployed.
Responsible AI in Industry (Tutorials at AAAI 2021, FAccT 2021, and WWW 2021) - Krishnaram Kenthapadi
This document provides an overview of explainable AI techniques. It discusses how explainable AI aims to make AI models more transparent and understandable by providing explanations for their predictions. Various explanation methods are covered, including model-specific techniques like interpreting gradients in neural networks, as well as model-agnostic approaches like Shapley values from game theory. The document explains how explanations are important for building user trust in AI systems and can help with debugging, analyzing robustness, and extracting rules from complex models.
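The Shapley-value idea from game theory can be made concrete with a tiny brute-force implementation: each player's (feature's) value is its average marginal contribution across all orderings. Exact enumeration is only feasible for a handful of features; practical explainability toolkits approximate it. The value function below is a stand-in, not any particular model:

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values via average marginal contribution over all orderings.

    players: list of feature names
    value:   function mapping a frozenset of players to a payoff
    """
    n_orderings = 0
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        n_orderings += 1
        coalition = frozenset()
        for p in order:
            # marginal contribution of p when joining this coalition
            phi[p] += value(coalition | {p}) - value(coalition)
            coalition = coalition | {p}
    return {p: phi[p] / n_orderings for p in phi}
```

A useful sanity check is the efficiency property: the Shapley values always sum to the payoff of the full coalition, which is one reason they are attractive for attributing a prediction across features.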
Artificial intelligence (AI) refers to a constellation of technologies, including machine learning, perception, reasoning, and natural language processing. While the field has been pursuing principles and applications for over 65 years, recent advances, uses, and attendant public excitement have returned it to the spotlight. The impact of early AI systems is already being felt, bringing with it challenges and opportunities, and laying the foundation on which future advances in AI will be integrated into social and economic domains. The potential wide-ranging impact makes it necessary to look carefully at the ways in which these technologies are being applied now, whom they’re benefiting, and how they’re structuring our social, economic, and interpersonal lives.
Emerging Technologies in Data Sharing and Analytics at Data61 - Liming Zhu
This document provides an overview of Data61, Australia's national science agency, and its work in emerging technologies related to data sharing and analytics. It discusses Data61's strategic goals and focus areas such as artificial intelligence, cybersecurity, digital agriculture, and quantum technologies. It also summarizes Data61's work establishing Australia's National AI Centre and its research on topics like blockchain, federated learning, and regulatory technologies.
Privacy in AI/ML Systems: Practical Challenges and Lessons Learned - Krishnaram Kenthapadi
How do we protect the privacy of users when building large-scale AI based systems? How do we develop machine learning models and systems taking fairness, accuracy, explainability, and transparency into account? Model fairness and explainability and protection of user privacy are considered prerequisites for building trust and adoption of AI systems in high stakes domains. We will first motivate the need for adopting a “fairness, explainability, and privacy by design” approach when developing AI/ML models and systems for different consumer and enterprise applications from the societal, regulatory, customer, end-user, and model developer perspectives. We will then focus on the application of privacy-preserving AI techniques in practice through industry case studies. We will discuss the sociotechnical dimensions and practical challenges, and conclude with the key takeaways and open challenges.
Shift AI 2020: How to identify and treat biases in ML Models | Navdeep Sharma... - Shift Conference
Shift AI was a success, connecting hundreds of professionals that were eager to propel the progress of AI and discuss the newest technologies in data mining, machine learning and neural networks. More at https://ai.shiftconf.co/.
Talk description:
With all the breakthroughs in the Machine Learning space, ML models are now being used, more than ever, to make decisions affecting human lives. Hence the quality of a model can no longer be judged by accuracy, precision, and recall alone. It is important to ensure that each individual and group of people is treated equally, without the historical bias present in the data. This talk focuses on some of the many potential ways to establish fairness metrics for ML models in your organization, along with the learnings and challenges I encountered while building a fairness tool for data scientists and business stakeholders.
Demo: Algorithmic Fairness Tool (AFT) was an innovation project, done at Accenture The Dock, which focused on bringing the latest research from academia and building a tool for the industry.
Automated Models for Quantifying Centrality of Survey Responses - Matthew Lease
Research talk presented at "Innovations in Online Research" (October 1, 2021)
Event URL: https://web.cvent.com/event/d063e447-1f16-4f70-a375-5d6978b3feea/websitePage:b8d4ce12-3d02-4d24-897d-fd469ca4808a.
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio... - Matthew Lease
Presentation at the 1st Biannual Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES 2018). August 30, 2018. Paper: https://www.ischool.utexas.edu/~ml/papers/kutlu-desires18.pdf
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E... - Matthew Lease
Presentation at the 6th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), July 7, 2018. Work by Tanya Goyal, Tyler McDonnell, Mucahid Kutlu, Tamer Elsayed, and Matthew Lease. Pages 41-49 in conference proceedings. Online version of paper includes corrections to official version in proceedings: https://www.ischool.utexas.edu/~ml/papers/goyal-hcomp18
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for... - Matthew Lease
Invited Talk at the ACM JCDL 2018 Workshop on Cyberinfrastructure and Machine Learning for Digital Libraries and Archives. https://www.tacc.utexas.edu/conference/jcdl18
Deep Learning for Information Retrieval: Models, Progress, & Opportunities - Matthew Lease
Talk given at the 8th Forum for Information Retrieval Evaluation (FIRE, http://fire.irsi.res.in/fire/2016/), December 10, 2016, and at the Qatar Computing Research Institute (QCRI), December 15, 2016.
Systematic Review is e-Discovery in Doctor’s Clothing - Matthew Lease
This document discusses opportunities for collaboration between researchers working in systematic reviews and electronic discovery (e-discovery). It notes similarities in the challenges both fields face, including the need for high recall with bounded costs and reliance on multi-stage review pipelines. The document proposes that technologies developed for semi-automated citation screening and crowdsourcing could help address current limitations. It concludes by encouraging information retrieval researchers to investigate open problems in systematic reviews as opportunities to advance technologies beyond other tasks and help bring together interested parties through forums like the TREC Total Recall track.
Crowd computing utilizes both crowdsourcing and human computation to solve problems. Crowdsourcing enables more efficient and scalable data collection and processing by outsourcing tasks to a large, undefined group of people. Human computation allows software developers to incorporate human intelligence and judgment into applications to provide capabilities beyond current artificial intelligence. Examples discussed include Amazon Mechanical Turk, various crowd-powered applications, and how crowdsourcing has helped label large datasets to train machine learning models.
The Rise of Crowd Computing (December 2015) - Matthew Lease
Crowd computing is rising in two waves: the first uses crowds to label large amounts of data for artificial intelligence applications; the second delivers applications that go beyond AI's abilities by incorporating human computation. Open problems remain around ensuring high-quality outputs, task design, understanding the worker context and experience, and addressing ethics concerns around opaque platforms and working conditions. The future holds potential for empowering crowd work but also risks like digital sweatshops if worker freedoms and conditions are not considered.
Beyond Mechanical Turk: An Analysis of Paid Crowd Work Platforms - Matthew Lease
The document summarizes a presentation about analyzing paid crowd work platforms beyond Mechanical Turk. It discusses how Mechanical Turk has dominated research on paid crowdsourcing due to its early popularity, but that it has limitations. The presentation conducts a qualitative study of 7 alternative crowd work platforms to identify distinguishing capabilities not found on MTurk, such as different payment models, richer worker profiles, and support for confidential tasks. It aims to increase awareness of other platforms to further inform practice and research on crowdsourcing.
Toward Effective and Sustainable Online Crowd Work - Matthew Lease
New forms of online crowd work enabled by technology present both opportunities for innovation and risks of harm that require careful consideration. This document discusses three main issues. First, some crowd work tasks may enable illegal or unethical goals. Second, the lack of regulation means crowd work practices sometimes exploit vulnerable workers by not ensuring informed consent. Third, multi-stakeholder discussions are needed to develop win-win solutions that balance costs, quality, and what is fair for all parties in a global context. The goal is to learn from each other and find ways to encourage ethical practices.
Crowdsourcing: From Aggregation to Search Engine Evaluation - Matthew Lease
This document provides an overview of statistical crowdsourcing and its applications. It discusses crowdsourcing platforms like Amazon Mechanical Turk and how they have enabled large-scale data labeling for tasks in areas like natural language processing. It also summarizes research on using crowdsourcing to evaluate search engines and benchmarks different statistical consensus methods for aggregating judgments from crowds. Finally, it presents work on using psychometrics and crowdsourcing to model multidimensional relevance through structured surveys and factor analysis.
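The baseline consensus method against which weighted statistical models (e.g., Dawid-Skene-style approaches) are usually benchmarked is plain majority vote over each item's labels. A minimal sketch, with hypothetical item and label names:

```python
from collections import Counter

def majority_vote(worker_labels):
    """Aggregate crowd judgments per item by simple majority vote.

    worker_labels: dict mapping item -> list of labels from different workers.
    Ties are broken by the first-seen label (Counter preserves insertion order).
    """
    consensus = {}
    for item, labels in worker_labels.items():
        (label, _count), = Counter(labels).most_common(1)
        consensus[item] = label
    return consensus
```

Majority vote treats all workers as equally reliable; the statistical consensus methods benchmarked in this line of work instead weight workers by estimated accuracy, which typically helps when label redundancy is low or worker quality varies widely.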
Talk at AAAI Human Computation 2013 Workshop on Scaling Speech, Language Understanding and Dialogue through Crowdsourcing (November 9, 2013): http://faculty.washington.edu/mtjalve/HCOMP2013.Workshop.html
Crowdsourcing & Human Computation: Labeling Data & Building Hybrid Systems - Matthew Lease
This document provides an overview of crowdsourcing and human computation. It begins with examples of using Amazon Mechanical Turk for basic tasks like labeling data. It then discusses how crowdsourcing can be used for more complex applications and discusses factors like incentive design, quality control, and platform selection. The document provides guidance on task design, experiment workflow, and usability considerations for effective crowdsourcing.
Talk presented at the ID360 Conference (http://identity.utexas.edu/id360), May 1, 2013. Paper: http://ssrn.com/abstract=2228728. Joint work with Jessica Hullman, Jeffrey P. Bigham, Michael S. Bernstein, Juho Kim, Walter S. Lasecki, Saeideh Bakhshi, Tanushree Mitra, and Robert C. Miller.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Choosing The Best AWS Service For Your Website + API.pptx
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
1. Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
Joint work with
An Thanh Nguyen (UT), Byron Wallace (Northeastern), & more!
Matt Lease
School of Information, University of Texas at Austin
@mattlease • ml@utexas.edu
Slides: slideshare.net/mattlease
2. “The place where people & technology meet”
~ Wobbrock et al., 2009
“iSchools” now exist at over 65 universities around the world
www.ischools.org
What’s an Information School?
Matt Lease (UT Austin) • Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact-Checking
3. Information Literacy
National Information Literacy Awareness Month,
US Presidential Proclamation, October 1, 2009.
“Though we may know how to find the information
we need, we must also know how to evaluate it.
Over the past decade, we have seen a crisis of
authenticity emerge. We now live in a world where
anyone can publish an opinion or perspective, whether
true or not, and have that opinion amplified…”
4. “Truthiness”
“Truthiness is tearing apart our country... It used to
be everyone was entitled to their own opinion, but
not their own facts. But that’s not the case anymore.”
– Stephen Colbert (Jan. 25, 2006)
9. Fake News Challenge
12. Challenges
• Fair, Accountable, & Transparent (AI)
– Why trust “black box” classifier?
– How do we reason about potential bias?
– Do people really only want to know true vs. false?
– How to integrate human knowledge/experience?
• Joint AI + Human Reasoning, Correct Errors, Personalization
• How to design strong Human + AI Partnerships?
– Horvitz, CHI’99: mixed-initiative design
– Dove et al., CHI’17 “Machine Learning As a Design Material”
13. MemeBrowser (Ryu et al., ACM HyperText’12)
http://odyssey.ischool.utexas.edu/mb
14. • Crowdsourced stance labels
– Hybrid AI + Human (near real-time) Prediction
• Joint model of stance, veracity, & annotators
– Interaction between variables
– Interpretable
• Source code on GitHub
Nguyen et al., AAAI’18
15. This Work
Demo!
16. This Work
http://fcweb.pythonanywhere.com
17. Primary Interface
18. Source Reputation
19. System Architecture
• Google Search API
• Two logistic regression models
– Stance (Ferreira & Vlachos ’16) w/ same features
• average accuracy > 70%, but variable across claims
– Veracity (Popat et al. ‘17)
– scikit-learn, L1 regularization, liblinear solver, & default parameters
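The classifier configuration named above can be sketched as follows. This is a minimal illustration of the stated setup (scikit-learn logistic regression, L1 penalty, liblinear solver, otherwise default parameters); the TF-IDF features and toy labels here are stand-ins, not the paper's actual feature set or data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Logistic regression with L1 regularization and the liblinear solver,
# as named on the slide; all other parameters left at their defaults.
stance_clf = make_pipeline(
    TfidfVectorizer(),
    LogisticRegression(penalty="l1", solver="liblinear"),
)

# Toy usage: headlines paired with illustrative stance labels.
X = ["Claim is confirmed by officials", "Report denies the claim"]
y = ["for", "against"]
stance_clf.fit(X, y)
print(stance_clf.predict(["Officials confirm the report"]))
```

The L1 penalty induces sparse feature weights, which also supports the interpretability goals discussed earlier.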
20. Data: Train & Test
Emergent (Ferreira & Vlachos ’16)
Accuracy of prediction models
27. User Study 1: Setup
• 2 Groups: Control vs. System
– 113 participants (58 control, 55 system) – MTurk
• 1. Asked to predict claim veracity
– Likert: Def. False, Prob. F., neutral, Prob. True, Def. T.
– Error: distance on the Likert scale
• For a Definitely False claim, Prob. False → error = 1, Def. True → error = 4
• 2. Shown model’s claim prediction
– Asked whether they wanted to change their answer
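The distance-based error metric above can be written out directly. A minimal sketch, using the 5-point Likert scale from the slide; the label strings are illustrative.

```python
# 5-point Likert scale from the study, ordered from false to true.
LIKERT = ["Def. False", "Prob. False", "Neutral", "Prob. True", "Def. True"]

def likert_error(answer: str, truth: str) -> int:
    """Error as absolute distance on the Likert scale.

    For a Definitely False claim, answering Prob. False gives error 1,
    while Def. True gives the maximum error of 4.
    """
    return abs(LIKERT.index(answer) - LIKERT.index(truth))

print(likert_error("Prob. False", "Def. False"))  # 1
print(likert_error("Def. True", "Def. False"))    # 4
```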
28. Before seeing model's claim prediction
• “System” group:
– claims 1–2: higher avg. prediction error
– claim 4: lower error
– claims 3, 5: only small differences
• Human accuracy in claim prediction roughly
follows model’s accuracy in stance prediction
– i.e., helped when model correct, hurt when not
29. After seeing model's claim prediction
• “System” group:
– Smaller answer changes on model errors
– Changed answers more often than the "control" group
• Human accuracy in claim prediction roughly
follows model’s accuracy in veracity prediction
– i.e., helped when model correct, hurt when not
30. Study 1: Statistical Significance (1 of 2)
• Mixed-effects Generalized Linear Model (GLM)
• Before seeing model’s claim predictions
– CSP: # correct stance predictions seen
– WSP: # wrong stance predictions seen
– 2-tail test includes unlikely possibility that seeing
correct stance predictions increases human error
31. Study 1: Statistical Significance (2 of 2)
• Mixed-effects Generalized Linear Model (GLM)
• After seeing model’s claim predictions
– CSP: # correct stance predictions seen
– WSP: # wrong stance predictions seen
– 2-tail test includes unlikely possibility that seeing
correct stance predictions increases human error
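The significance analysis above can be sketched in code. This is an illustrative stand-in, not the paper's analysis: it fits a linear mixed-effects model (via statsmodels, as a simplification of the mixed-effects GLM named on the slide) with per-participant random intercepts, regressing prediction error on the CSP and WSP counts. All column names and data values are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical layout: one row per (participant, claim), recording the
# participant's prediction error and how many correct (CSP) and wrong
# (WSP) stance predictions they had seen.
df = pd.DataFrame({
    "error":       [0, 1, 2, 1, 0, 3, 1, 2, 0, 1, 2, 0],
    "CSP":         [3, 2, 0, 1, 4, 0, 2, 1, 3, 2, 1, 4],
    "WSP":         [0, 1, 3, 2, 0, 4, 1, 2, 0, 1, 2, 0],
    "participant": ["a", "a", "b", "b", "c", "c",
                    "d", "d", "e", "e", "f", "f"],
})

# Mixed-effects model: error ~ CSP + WSP with a random intercept per
# participant; the summary reports two-tailed tests on each coefficient.
model = smf.mixedlm("error ~ CSP + WSP", df, groups=df["participant"])
result = model.fit()
print(result.summary())
```

The random intercepts absorb per-participant skill differences, so the CSP/WSP coefficients isolate the effect of seeing correct vs. wrong stance predictions.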
33. User Study 2: Setup
• 2 Groups: Control vs. Slider
– 109 participants (51 control, 58 slider) – MTurk
• 1. Asked to predict claim veracity
– Likert: Def. False, Prob. F., neutral, Prob. True, Def. T.
and indicate confidence in prediction
– Score range (−20, +20): accuracy × confidence
• e.g., a correct answer with 75% confidence → 20 × 75% = 15
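The scoring rule above can be sketched as a one-liner. A minimal illustration: the slide only spells out the correct-answer case, so treating an incorrect answer as −20 × confidence is an assumption here.

```python
def score(correct: bool, confidence: float) -> float:
    """Accuracy x confidence, mapped onto a (-20, +20) range.

    `confidence` is the participant's self-reported confidence in [0, 1].
    Correct answers earn +20 * confidence; the -20 * confidence penalty
    for incorrect answers is an assumed symmetric extension, so confident
    wrong answers are penalized most.
    """
    return (20 if correct else -20) * confidence

print(score(True, 0.75))   # 15.0, matching the slide's example
```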
34. User Study 2: Results
• "Slider" group:
– Claims 1–3: higher score on average
– Claims 4–5: lower first quartiles, but same medians
• Some participants negatively impacted by slider
– No statistically significant difference on average
35. Discussion & Future Work
• Fact Checking & IR (Lease, DESIRES’18)
– How to diversify search results for controversial topics?
– Information evaluation (e.g., vaccination & autism)
• Potential harm as well as good
– Potential added confusion, data / algorithmic bias
– Potential for personal “echo chamber”
– Adversarial applications
• Future Work
– Making personalization more visible
– Collaborative use, small-scale and large-scale
• 1st author Nguyen looking for a postdoc!
36. Conclusion
• Fact-checking is more than black-box prediction
– Interaction, exploration, trust
• We proposed a mixed-initiative human + AI
partnership for fact-checking
– Back-end AI + front-end interface/interaction
– Support AI + human collaboration
– Fair, Accountable, & Transparent (FAT) AI
37. Thank You!
Slides: slideshare.net/mattlease
Lab: ir.ischool.utexas.edu