There has been much effort on studying how social media sites, such as Twitter, help propagate information in different situations, including spreading alerts and SOS messages in an emergency. However, existing work has not addressed how to actively identify and engage the right strangers at the right time on social media to help effectively propagate intended information within a desired time frame. To address this problem, we have developed two models: (i) a feature-based model that leverages people's exhibited social behavior, including the content of their tweets and social interactions, to characterize their willingness and readiness to propagate information on Twitter via the act of retweeting; and (ii) a wait-time model based on a user's previous retweeting wait times to predict her next retweeting time when asked. Based on these two models, we build a recommender system that predicts the likelihood that a stranger will retweet information when asked, within a specific time window, and recommends the top-N qualified strangers to engage with. Our experiments, including live studies in the real world, demonstrate the effectiveness of our work.
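The ranking step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual models: `wait_time_prob` assumes a hypothetical exponential wait-time distribution, and the candidate names and scores are invented.

```python
# Hypothetical sketch of the top-N ranking step: each candidate stranger
# has a predicted retweet likelihood (feature-based model) and a
# probability of retweeting within the deadline (wait-time model); here
# we rank candidates by the product of the two. All numbers are toy values.
from math import exp

def wait_time_prob(mean_wait_hours, window_hours):
    """P(retweet within window) under an assumed exponential wait-time model."""
    rate = 1.0 / mean_wait_hours
    return 1.0 - exp(-rate * window_hours)

def top_n_strangers(candidates, window_hours, n):
    """candidates: list of (user, retweet_likelihood, mean_wait_hours)."""
    scored = [(user, p * wait_time_prob(w, window_hours))
              for user, p, w in candidates]
    scored.sort(key=lambda x: x[1], reverse=True)
    return scored[:n]

candidates = [("alice", 0.8, 2.0), ("bob", 0.9, 24.0), ("carol", 0.5, 1.0)]
print(top_n_strangers(candidates, window_hours=4.0, n=2))
```

Note how "bob", despite the highest raw likelihood, drops in the ranking because his typical wait time exceeds the four-hour window.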
Presented at Intelligent User Interfaces 2014, Haifa, Israel. February 27, 2014.
Using Public Social Media to Find Answers to Questions – Jeffrey Nichols
The document discusses using public social media to find answers to questions. It presents research on asking targeted questions to individuals on social media and analyzing their responses. A prototype called TSA Tracker asked people at airports to share security wait times. Over 40% of people responded, providing useful information. The results suggest it is feasible to obtain quality information and answers to questions from strangers on social media, especially if an incentive is provided. Future work involves building technology to better target messaging and support this question asking process at scale.
Using the Crowd to Understand and Adapt User Interfaces – Jeffrey Nichols
Using crowdsourcing techniques, researchers have explored adapting existing user interfaces and developing new interface models. CoScripter allows crowds to capture and share task traces, which can then be used to generate mobile applications through tools like Highlight. CoCo leverages crowdsourced scripts to enable conversational interfaces. Researchers also experimented with using crowds to build abstract models of interfaces, though this work remains preliminary. Harnessing crowds offers potential for engineering new interfaces and adapting existing ones, but significant challenges remain in areas like model construction.
CHI 2014 Panel: Opportunities and Risks of Discovering Personality Traits fro... – Jeffrey Nichols
We will conduct a panel at CHI 2014 discussing the opportunities and implications of the coming wave of new analytics that allow individuals' intrinsic traits, such as personality and motivations, to be mined from their behaviors on social platforms.
For more information, see the Facebook page for the panel here:
https://www.facebook.com/events/633305060096913/
My talk from Carnegie Mellon's HCII Seminar on April 24, 2013.
Abstract:
On some social media platforms, such as Twitter, YouTube, Pinterest, and Tumblr, much of the content generated by users is publicly accessible, and communication can be easily initiated between strangers who have never previously communicated. The communities that have risen up around these platforms, particularly on Twitter, can also be inclusive and supportive of interactions between strangers. The public and open nature of these communities creates an opportunity to build a new kind of crowdsourcing system, in which individuals who may be good candidates to complete various tasks are identified based on their published content. We explore the potential of such a system through several information collection tasks, examining the response rate and information quality that can be obtained through such a system. We also explore a means of leveraging users' previous social media content to predict their likelihood of response and optimize our system's collection behavior. At IBM Research - Almaden, we are now looking to extend these ideas to additional domains, including proactive and reactive customer support, and precision marketing campaigns.
Applying Noisy Knowledge Graphs to Real Problems – DataWorks Summit
Knowledge graphs (KGs) have recently emerged as a powerful way to represent knowledge in multiple communities, including data mining, natural language processing and machine learning. Large-scale KGs like Wikidata and DBpedia are openly available, while in industry, the Google Knowledge Graph is a good example of proprietary knowledge that continues to fuel impressive advances in Google's semantic search capabilities. Yet, both crowdsourced and automatically constructed KGs suffer from noise, both during KG construction and during search and inference. In this talk, I will discuss how to build and use such knowledge graphs effectively, despite the noise and sparsity of labeled data, to solve real-world social problems such as providing insights in disaster situations, and helping law enforcement fight human trafficking. I will conclude by providing insight on the lessons learned, and the applicability of research techniques to industrial problems. The talk will be designed to appeal both to business and technical leaders.
Semantic Analysis to Compute Personality Traits from Social Media Posts – Giulio Carducci
This document discusses using semantic analysis of social media posts to automatically compute personality traits based on the Five Factor Model. It presents the background on using language to predict personality traits and describes word embeddings to represent words as vectors. An experiment is described that uses a dataset of social media posts with known personality scores to train models like SVM and LASSO to predict the Big Five personality traits of openness, conscientiousness, extraversion, agreeableness, and neuroticism. The models are tested on datasets from MyPersonality and Twitter, achieving mean squared errors between 0.3 and 0.7. Future work proposes expanding the approach to larger datasets and additional features.
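The representation step the summary describes (averaging word vectors into a post vector, then scoring it with a trained linear model) can be sketched as follows. The toy 3-d embeddings and trait weights are invented stand-ins, not values from the study, which used full pretrained embeddings and SVM/LASSO regressors.

```python
# Illustrative sketch: a post is mapped to the mean of its word vectors,
# and a linear model (stand-in for the trained SVM/LASSO regressors)
# scores one trait. Real systems use pretrained embeddings with
# hundreds of dimensions; these 3-d vectors are toys.
TOY_EMBEDDINGS = {
    "love":  [0.9, 0.1, 0.2],
    "party": [0.8, 0.7, 0.1],
    "alone": [0.1, 0.2, 0.9],
    "quiet": [0.2, 0.1, 0.8],
}

def post_vector(tokens):
    """Mean of the embeddings of known tokens (zero vector if none known)."""
    vecs = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

# Hypothetical learned weights for a single trait (e.g. extraversion).
WEIGHTS, BIAS = [0.6, 0.5, -0.7], 0.3

def predict_trait(tokens):
    v = post_vector(tokens)
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, v))

print(predict_trait(["love", "party"]))
print(predict_trait(["alone", "quiet"]))
```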
Empathic inclination from digital footprints
Marco Polignano, Pierpaolo Basile, Gaetano Rossiello, Marco de Gemmis and Giovanni Semeraro
University of Bari “Aldo Moro”, Dept. of Computer Science, Italy
ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging ... – GUANGYUAN PIAO
This document summarizes a study that leverages followee biographies on Twitter to infer interests for passive users. The study extracted entities from followee bios and names, finding bios provided over twice as many entities on average. It then used these entities to build user interest profiles and test different modeling strategies. Strategies that used followee bios outperformed those using only names, with interest propagation through DBpedia performing best. The authors conclude leveraging followee bios can provide more informative user profiles for improving recommendation performance.
This study proposes methods to automatically extract and recommend popular and relevant URLs from Twitter to support serendipitous learning for software developers. The researchers collected tweets from seed Twitter users and extracted URLs, calculating 14 features for each. URLs were labeled for relevance and a supervised learning-to-rank model and unsupervised Borda count approach were used to recommend URLs. The supervised approach achieved better performance with an NDCG of 0.832. Future work includes automatically categorizing URLs and building a full recommendation system.
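The NDCG figure quoted above can be computed with the standard definition sketched below; the relevance grades are toy labels, not the study's annotations.

```python
# Sketch of the NDCG metric behind the 0.832 figure: DCG discounts each
# relevance grade by log2(rank + 1), and NDCG normalizes by the DCG of
# the ideal (relevance-sorted) ordering.
from math import log2

def dcg(relevances):
    return sum(rel / log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# URLs in system order, with toy labels (2 = highly relevant,
# 1 = somewhat relevant, 0 = not relevant).
print(ndcg([2, 0, 1, 2]))
```

A perfectly ordered list scores exactly 1.0; any misordering of graded items scores strictly less.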
Collective Spammer Detection in Evolving Multi-Relational Social Networks – Turi, Inc.
This document describes research on detecting collective spammer behavior in evolving multi-relational social networks. The researchers analyzed a sample of over 300 million users from Tagged.com, extracting graph structure and sequential features to classify users as spammers or legitimate with up to 88% accuracy. They also improved abuse reporting systems by incorporating reporter credibility and collective reasoning, achieving 87% AUC.
This document provides an introduction to machine learning concepts including definitions of machine learning, training and test data, and different machine learning techniques. It defines machine learning as a field that allows machines to learn from data without being explicitly programmed. It describes how training data is used to teach a machine and test data is used to evaluate how well a machine has learned. The document outlines common machine learning techniques including supervised learning techniques like classification and regression as well as unsupervised learning techniques like clustering. It provides examples of different algorithms for each technique.
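The train/test split idea described above can be illustrated with a tiny from-scratch classifier; the 1-nearest-neighbour rule and toy data here are stand-ins for the supervised techniques the slides survey.

```python
# Minimal illustration of training vs. test data: fit on one portion,
# evaluate on the held-out portion. The classifier is a from-scratch
# 1-nearest-neighbour rule over 1-D points.
import random

def nn_predict(train, x):
    """Label of the closest training point (squared distance, 1-D)."""
    return min(train, key=lambda p: (p[0] - x) ** 2)[1]

data = [(v, "low") for v in (1, 2, 3, 4)] + [(v, "high") for v in (8, 9, 10, 11)]
random.seed(0)
random.shuffle(data)
train, test = data[:6], data[6:]   # hold out 2 points for evaluation

accuracy = sum(nn_predict(train, x) == y for x, y in test) / len(test)
print(accuracy)
```

Because the two classes are well separated, the held-out points are classified correctly; with noisier data the test accuracy is exactly the quantity that reveals overfitting.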
Pavan Kapanipathi, Prateek Jain, Chitra Venkataramani, Amit Sheth, User Interests Identification on Twitter Using a Hierarchical Knowledge Base, ESWC 2014, May 2014.
Paper at: http://j.mp/user-ig
More at: http://wiki.knoesis.org/index.php/Hierarchical_Interest_Graph
This document summarizes a study that analyzed meme diffusion on Twitter and developed an agent-based model to simulate meme competition for limited user attention. The study collected Twitter retweet data, analyzed statistical properties of meme diffusion networks and user behavior, and found that users' limited attention and social network structure were sufficient to reproduce the observed patterns. The model demonstrated that strong or weak user attention failed to match real-world data, indicating attention level affects meme popularity distributions.
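A stripped-down version of such an agent-based model, with a short memory as the limited-attention constraint, might look like the following. The parameters and the fully mixed "network" are illustrative assumptions, not the paper's setup.

```python
# Toy agent-based meme model: each agent keeps a bounded memory
# (limited attention). At each step an agent either invents a new meme
# or retweets one from memory; a few random followers then push it into
# their own memories, evicting the oldest item.
import random
from collections import Counter, deque

def simulate(n_agents=50, memory_size=3, steps=2000, p_new=0.1, seed=42):
    rng = random.Random(seed)
    memories = [deque(maxlen=memory_size) for _ in range(n_agents)]
    popularity = Counter()
    next_meme = 0
    for _ in range(steps):
        agent = rng.randrange(n_agents)
        if rng.random() < p_new or not memories[agent]:
            meme, next_meme = next_meme, next_meme + 1   # invent a meme
        else:
            meme = rng.choice(list(memories[agent]))     # retweet from memory
        popularity[meme] += 1
        for follower in rng.sample(range(n_agents), 5):  # toy mixed network
            memories[follower].append(meme)
    return popularity

pop = simulate()
counts = sorted(pop.values(), reverse=True)
print(counts[:5])   # most-retweeted memes
```

Even this toy version exhibits the qualitative effect the study reports: retweeting from a bounded memory concentrates attention, so a few memes accumulate far more shares than the rest.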
System U: Computational Discovery of Personality Traits from Social Media for... – Michelle Zhou
The document discusses a system called System U that uses computational methods to discover personality traits, values, needs, and emotional styles of individuals from their social media posts. It aims to provide personalized experiences at scale. The system analyzes text using psycholinguistic analytics and models to predict traits according to frameworks like the Big 5 personality traits. It was validated through studies showing its predictions correlated well with standard personality surveys for most people. Further field studies on Twitter showed traits could help predict who would respond to recommendations or help others.
Presented at OECD Workshop on Systematic Reviews in the Scope of the Endocrine Disrupter Testing and Assessment (EDTA) Conceptual Framework Level 1 in Paris, France
This document summarizes findings from an analysis of usage data from JSTOR before and after the implementation of discovery services at academic institutions. The analysis found varying decreases in usage of JSTOR content across different discovery services and sizes of academic institutions. Further investigation revealed that subject metadata configuration, library customization of discovery systems, and data syndication practices all significantly impacted search and access of content. The document advocates for libraries to spend more time configuring discovery systems and for publishers to improve their data syndication.
The document summarizes research on detecting social spam on Twitter using features inherent in tweets. The researchers collected two datasets of spam and non-spam tweets and defined four feature sets including user, content, n-gram and sentiment features. Random forest classification using a combination of user and n-gram features achieved comparable spam detection to prior work but without requiring historical user data. Computational efficiency was highest for the most accurate user and n-gram features. Future work could study spamming behavior across datasets.
User Interests Identification From Twitter using Hierarchical Knowledge Base – Pavan Kapanipathi
Twitter, due to its massive growth as a social networking platform, has been in focus for the analysis of its user-generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts, to determine user interests, has been an active area of research in the recent past. These approaches typically use available public knowledge-bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledge-bases to create richer user profiles is yet to be explored. In this work, we leverage hierarchical relationships present in knowledge-bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items based on a varied level of conceptual abstractness. We demonstrate the effectiveness of our approach through a user study which shows an average of approximately eight of the top ten weighted hierarchical interests in the graph being relevant to a user's interests.
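One way the hierarchical propagation idea could work is sketched below; the toy hierarchy and decay factor are assumptions for illustration, not the paper's algorithm.

```python
# Hypothetical sketch of hierarchical interest propagation: weights of
# entities spotted in tweets are pushed up a category hierarchy with a
# decay factor, yielding interest scores at several levels of
# abstractness. The hierarchy below is a toy stand-in for Wikipedia
# category links.
TOY_HIERARCHY = {   # child -> parent
    "FC Barcelona": "Football clubs",
    "Football clubs": "Sports",
    "Python (language)": "Programming languages",
    "Programming languages": "Computing",
}

def propagate(entity_weights, decay=0.5):
    scores = dict(entity_weights)
    for entity, weight in entity_weights.items():
        node, w = entity, weight
        while node in TOY_HIERARCHY:      # walk up to ever-broader categories
            node = TOY_HIERARCHY[node]
            w *= decay
            scores[node] = scores.get(node, 0.0) + w
    return scores

interests = propagate({"FC Barcelona": 4.0, "Python (language)": 2.0})
print(interests)
```

The decayed scores at broader categories ("Sports", "Computing") are what lets a recommender match items at a coarser level of abstraction than the spotted entities themselves.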
[CS570] Machine Learning Team Project (I know what items really are) – Kunwoo Park
This document summarizes a team's approach to predicting which items users might be interested in using a recommendation system. It describes extracting features from user and item metadata to train an SVM model, but this was too computationally expensive. Instead, the team used logistic regression with stochastic gradient descent. They tested features like age, gender and network similarities. Their combined model outperformed random prediction baselines on the KDD Cup 2012 Track 1 dataset.
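Logistic regression trained with stochastic gradient descent, the approach the team settled on, can be written from scratch in a few lines; the two-feature toy data below stands in for the real user/item metadata features.

```python
# From-scratch logistic regression with stochastic gradient descent:
# each step samples one example and moves the weights against the
# gradient of the log-loss.
import random
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_sgd(data, lr=0.5, epochs=200, seed=1):
    rng = random.Random(seed)
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        x, y = rng.choice(data)
        p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        g = p - y                      # gradient of log-loss w.r.t. the logit
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g
    return w, b

# Linearly separable toy data: label 1 when both features are large.
data = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.9, 0.8), 1), ((0.8, 0.9), 1)]
w, b = train_sgd(data)
probs = [sigmoid(w[0] * x[0] + w[1] * x[1] + b) for x, _ in data]
print([round(p, 2) for p in probs])
```

SGD's appeal here is exactly what the summary notes: each update touches one example, so training scales to datasets where a kernel SVM is too expensive.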
In this research, we employ exponential random graph (ERG) models to expand our understanding of the variables predicting the formation of learning ties in 'Ask' subreddit communities.
Predicting user demographics in social networks - Invited Talk at University ... – Nikolaos Aletras
Automatically inferring user demographics in social networks is useful for both social science research and a range of downstream applications in marketing and politics. Our main hypothesis is that language use in social networks is indicative of user attributes. This talk presents recent work on inferring a new set of socioeconomic attributes, i.e. occupational class, income and socioeconomic class. We define a predictive task for each attribute where user-generated content is utilised to train supervised non-linear methods for classification and regression, i.e. Gaussian Processes. We show that our models achieve strong predictive accuracy in all of the three demographics while our analysis sheds light to factors that differentiate users between occupations, income level and socioeconomic classes.
How can we mine, analyse and visualise the Social Web?
In this lecture, you will learn about mining social web data for analysis. Data preparation and gathering basic statistics on your data.
MediaEval 2017 Retrieving Diverse Social Images Task (Overview) – multimediaeval
This document provides an overview of the Retrieving Diverse Social Images Task that took place in 2017. The task involved refining image search results for queries by providing a ranked list of up to 50 photos that are both relevant and diverse representations of the query. 6 teams participated by submitting a total of 29 runs that were evaluated based on precision, cluster recall, and F1-measure at different cut-off levels. The best performing run used neural network-based clustering and text relevance feedback to improve over the initial Flickr search results. The document also discusses the provided dataset, participant results, and brief conclusions about successful methods and opportunities for future improvements.
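The evaluation metrics named above can be computed as follows; the relevance and cluster labels are toy values for illustration.

```python
# Sketch of the task's metrics: precision@k rewards relevant photos,
# cluster recall@k rewards covering the query's distinct clusters,
# and F1 combines the two.
def precision_at_k(results, k):
    return sum(1 for rel, _ in results[:k] if rel) / k

def cluster_recall_at_k(results, k, n_clusters):
    covered = {c for rel, c in results[:k] if rel}
    return len(covered) / n_clusters

def f1(p, r):
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

# Each result: (is_relevant, cluster_id); assume the query has 3 clusters.
ranked = [(True, 0), (True, 0), (False, None), (True, 1), (True, 2)]
p = precision_at_k(ranked, 5)            # 4 of 5 photos relevant
cr = cluster_recall_at_k(ranked, 5, 3)   # all 3 clusters covered
print(round(f1(p, cr), 3))
```

The tension the task probes is visible even in this toy list: repeating cluster 0 keeps precision high but wastes slots that cluster recall would reward.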
Similar to Who will RT this?: Automatically Identifying and Engaging Strangers on Twitter to Spread Information (20)
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack – shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 – Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
UiPath Test Automation using UiPath Test Suite series, part 5 – DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Removing Uninteresting Bytes in Software Fuzzing – Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Who will RT this?: Automatically Identifying and Engaging Strangers on Twitter to Spread Information

1. Who will RT this?
Automatically Identifying and Engaging Strangers on Twitter to Spread Information

Kyumin Lee, Jalal Mahmud, Jilin Chen, Michelle X. Zhou, Jeffrey Nichols
Utah State University & IBM Research – Almaden
jwnichols@us.ibm.com
@jwnichls
2. Public Social Media Contains a Wealth of Information about Individuals…

3. Public Social Media Contains a Wealth of Information about Individuals…
Can we harness this information for something useful?
• Collect Information
• Spread Information
• Identify people to recruit to do various tasks
4. Information Collection: TSA Tracker
Crowdsourcing airport security wait times through Twitter
http://tsatracker.org/ (@tsatracker, @tsatracking)

Step #1. Watch for people tweeting about being in an airport
Step #2. Ask nicely if they would share the wait time to help others
Step #3. Collect responses and share relevant data on the web site
Step #4. Say thank you!

Key Result: Would people respond to questions from strangers? Yes…around 40% of the time
[Nichols et al. CSCW 2012, Mahmud et al. IUI 2013]
5. Today: Information Spreading
• Relevant marketing campaign messages
• Alerts and SOS messages in an emergency
• Etc.

6. Today: Information Spreading
Challenge: a low percentage of people respond to this task
• Can we predict who will retweet and direct requests only to them?
• Can we predict who will retweet more quickly?
7. Our Process
1. Data Collection
2. Feature Extraction
3. Feature Selection
4. Model Building
5. Evaluation
8. Ground-Truth Data Collection

Public Safety (location-based)
• Randomly selected users who tweeted from the San Francisco bay area (via geo-tags)
• Contacted 1,902 users
• 52 (2.8%) retweeted our message
• Message reached a total of 18,670 followers

Bird Flu (topic-based)
• Randomly selected users who posted one of the following words in at least one tweet: “bird flu”, “H5N1” and “avian influenza”
• Contacted 1,859 users
• 155 (8.4%) retweeted our message
• Message reached a total of 184,325 followers
9. Resulting Data

In total:
• Contacted 3,761 strangers
• 207 positive examples, 3,554 negative examples

For each user we contacted, we collected:
• Twitter profile (screen name, tweet count, etc.)
• The people they followed and their followers
• Up to 200 recent messages
• Ground truth (“retweeter” or “non-retweeter”)
11. Feature Extraction

Feature Categories
• Profile Features
• Social Network Features
• Personality Features
• Activity Features
• Past Retweeting Features
• Readiness Features

Profile Features
• Longevity (age) of the account
• Length of screen name
• Whether the user profile has a description
• Length of the description
• Whether the user profile has a URL

Social Network Features
• Number of users followed (friends)
• Number of followers
• Ratio of number of friends to number of followers
12. Feature Extraction: Personality Features

Users’ word usage has been found to predict their personality.
• Linguistic Inquiry and Word Count (LIWC) dictionary
• Personality features derived from LIWC categories [Yarkoni 2010, Mahmud 2013]
13. Feature Extraction: Activity Features

• Number of status messages
• Number of direct mentions (e.g., @johny) per status message
• Number of URLs per status message
• Number of hashtags per status message
• Number of status messages per day during the entire account life (= total number of posted status messages / longevity)
• Number of status messages per day during the last one month
• Number of direct mentions per day during the last one month
• Number of URLs per day during the last one month
• Number of hashtags per day during the last one month
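As a concrete illustration of the activity features above, the following sketch computes a few of them from a user's recent tweets. The input shape (a list of text/timestamp pairs) and the feature names are my own illustrative choices, not the authors' actual pipeline; note that the slide's lifetime rate divides the user's total posted message count by account longevity, while this sketch only has the collected sample to work with.

```python
import re
from datetime import datetime, timedelta

def activity_features(tweets, account_created, now):
    """Compute a few activity features from a sample of a user's tweets.

    tweets          : list of (text, datetime) pairs
    account_created : datetime the account was created
    now             : datetime of feature extraction
    """
    n = len(tweets)
    mentions = sum(len(re.findall(r'@\w+', text)) for text, _ in tweets)
    urls = sum(len(re.findall(r'https?://\S+', text)) for text, _ in tweets)
    hashtags = sum(len(re.findall(r'#\w+', text)) for text, _ in tweets)
    longevity_days = max((now - account_created).days, 1)
    # Messages posted in the last 30 days ("last one month" on the slide).
    last_month = [t for t in tweets if t[1] >= now - timedelta(days=30)]
    return {
        'mentions_per_status': mentions / n if n else 0.0,
        'urls_per_status': urls / n if n else 0.0,
        'hashtags_per_status': hashtags / n if n else 0.0,
        # Slide uses total posted count / longevity; sample count used here.
        'statuses_per_day_lifetime': n / longevity_days,
        'statuses_per_day_last_month': len(last_month) / 30,
    }
```

The per-status ratios are averages over the collected messages, so they are robust to how many tweets the API returns (up to 200 here).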
14. Feature Extraction: Past Retweeting and Readiness Features

Past Retweeting Behavior
• Number of retweets per status message: R/N
• Average number of retweets per day
• Fraction of retweets whose original messages were posted by strangers not in her social network

Readiness Based on Previous Activity
• Tweeting Likelihood (Day)
• Tweeting Likelihood (Hour)
• Entropy of Tweeting Likelihood (Day)
• Entropy of Tweeting Likelihood (Hour)
• Tweeting Steadiness
• Tweeting Inactivity
15. Predicting Retweeters

Training and Test Sets
• Each dataset (public safety and bird flu) was randomly split into a training set (2/3 of the data) and a testing set (1/3 of the data)

5 Predictive Models
• Random Forest, Naïve Bayes, Logistic Regression, SMO (SVM) and AdaBoostM1

Handling Class Imbalance
• Used both over-sampling (SMOTE) and weighting approaches (cost-sensitive approach)
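The class imbalance here is severe: 207 retweeters versus 3,554 non-retweeters, roughly 1:17. One common cost-sensitive weighting scheme (the "balanced" heuristic used by several ML toolkits; the paper does not state its exact costs, so this is an assumption) weights each class inversely to its frequency:

```python
def balanced_class_weights(counts):
    """Weight each class c as w_c = N / (k * n_c), where N is the total
    number of examples, k the number of classes, and n_c the class size.
    Misclassifying a rare retweeter then costs far more than
    misclassifying a common non-retweeter."""
    total = sum(counts.values())
    k = len(counts)
    return {c: total / (k * n) for c, n in counts.items()}

# Class sizes from the ground-truth data on slide 9.
weights = balanced_class_weights({'retweeter': 207, 'non-retweeter': 3554})
```

With these counts the retweeter class is weighted about 17 times more heavily than the non-retweeter class, which is what lets a classifier trained on such skewed data still predict the positive class at all.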
16. Feature Selection

Computed the χ² value for each feature in the training set.

21 features selected by χ² in the Public Safety dataset:
• Profile: longevity of the account; length of the description; has description in profile
• Social-network: |following|; ratio of number of friends to number of followers
• Activity: |URLs| per day; |direct mentions| per day; |hashtags| per day; |status messages| per day during the last one month
• Past Retweeting: |retweets| per status message; |retweets| per day
• Readiness: Tweeting Likelihood of the Day; Tweeting Likelihood of the Day (Entropy)
• Personality: 7 LIWC features (Inclusive, Achievement, Humans, Time, Sadness, Articles, Nonfluencies); 1 Facet feature (Modesty)

46 features selected by χ² in the Bird Flu dataset:
• Activity: |URLs| per day; |direct mentions| per day; |hashtags| per day; |URLs| per status message; |direct mentions| per status message; |hashtags| per status message; |status messages|; |status messages| per day during the entire account life
• Past Retweeting: |retweets| per status message; |retweets| per day; |URLs| per retweet message
• Readiness: Tweeting Likelihood of the Hour (Entropy)
• Personality: 24 LIWC features (Inclusive, Total Pronouns, 1st Person Plural, 2nd Person, 3rd Person, Social Processes, Positive Emotions, Numbers, Other References, Occupation, Affect, School, Anxiety, Hearing, Certainty, Sensory Processes, Death, Body States, Positive Feelings, Leisure, Optimism, Negation, Physical States, Communication); 8 Facet features (Liberalism, Assertiveness, Achievement Striving, Self-Discipline, Gregariousness, Cheerfulness, Activity Level, Intellect); 2 Big5 features (Conscientiousness, Openness)

Activity, personality, readiness and past retweeting feature groups have more significant power.
Six significant features are common to both datasets.
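For a binary feature against the binary retweeter label, the χ² score used for selection reduces to the standard 2×2 contingency-table statistic. A minimal sketch (my own illustration of the selection criterion, not the authors' code):

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 contingency table
        [[a, b],   # feature present:  retweeter / non-retweeter
         [c, d]]   # feature absent:   retweeter / non-retweeter
    via the shortcut chi2 = N*(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)).
    Higher values mean the feature is more strongly associated with
    the retweeter label; features can be ranked by this score."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom
```

A feature that is independent of the label (equal proportions in both rows) scores 0; ranking features by this score and keeping the top-scoring ones yields the per-dataset lists above.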
17. Evaluating Retweeter Prediction

Only the significant features are used for prediction.

Prediction accuracy (Public Safety):

Classifier        AUC     F1      F1 of Retweeter
Basic
  Random Forest   0.638   0.958   0
  Naïve Bayes     0.619   0.939   0.172
  Logistic        0.640   0.958   0
  SMO             0.500   0.96    0
  AdaBoostM1      0.548   0.962   0.1
SMOTE
  Random Forest   0.606   0.916   0.119
  Naïve Bayes     0.637   0.923   0.132
  Logistic        0.664   0.833   0.091
  SMO             0.626   0.813   0.091
  AdaBoostM1      0.633   0.933   0.129
Cost-Sensitive (Weighting, showing the best results in each model)
  Random Forest   0.692   0.954   0.147
  Naïve Bayes     0.619   0.93    0.042
  Logistic        0.623   0.938   0.123
  SMO             0.633   0.892   0.133
  AdaBoostM1      0.678   0.956   0.125

Prediction accuracy (Bird Flu):

Classifier        AUC     F1      F1 of Retweeter
Basic
  Random Forest   0.707   0.877   0.066
  Naïve Bayes     0.670   0.834   0.222
  Logistic        0.751   0.878   0.067
  SMO             0.500   0.876   0
  AdaBoostM1      0.627   0.878   0.067
SMOTE
  Random Forest   0.707   0.819   0.236
  Naïve Bayes     0.679   0.724   0.231
  Logistic        0.76    0.733   0.258
  SMO             0.729   0.712   0.278
  AdaBoostM1      0.709   0.837   0.292
Cost-Sensitive (Weighting, showing the best results in each model)
  Random Forest   0.785   0.815   0.296
  Naïve Bayes     0.670   0.767   0.24
  Logistic        0.735   0.742   0.243
  SMO             0.676   0.738   0.256
  AdaBoostM1      0.669   0.87    0.031

We use Random Forest for all following experiments.
18. Comparison with Two Baselines

Baselines
• Random people contact: randomly select and ask a subset of qualified candidates
• Popular people contact: sort candidates in our test set by their follower count in descending order

Comparison of retweeting rates in the testing set:

Approach                  Public Safety   Bird flu
Random People Contact     2.6%            8.3%
Popular People Contact    3.1%            8.5%
Our Prediction Approach   13.3%           19.7%
19. Incorporating Time Constraints

• Can we predict when a person will retweet given a request?
• We use an exponential distribution model to estimate a user’s retweeting wait time with a probability.
  • Assume that retweeting events follow a Poisson process.
  • 1/λ = the average wait time for a user, based on prior retweeting wait times
• When a person’s average wait time 1/λ = 180 min, the retweeting probability within 200 minutes is larger than 0.6.
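Under the Poisson-process assumption, a user's wait time T is exponentially distributed with rate λ, so P(T ≤ t) = 1 − e^(−λt) = 1 − e^(−t / mean_wait). A minimal sketch reproducing the slide's worked example:

```python
import math

def retweet_prob_within(mean_wait, window):
    """P(T <= window) for an exponential wait time with mean `mean_wait`
    (both in the same unit, e.g. minutes): the CDF 1 - exp(-lambda * t)
    with lambda = 1 / mean_wait."""
    return 1.0 - math.exp(-window / mean_wait)

# Slide example: average wait 180 min, 200-minute window.
p = retweet_prob_within(180, 200)   # ~0.67, i.e. larger than 0.6
```

The 70% contact threshold on the next slide then amounts to contacting a user only when retweet_prob_within(mean_wait, window) >= 0.7 for the chosen time window.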
20. Evaluation with Time Constraints

• Compare the two baselines with our prediction approach, with and without the wait-time model
• For the wait-time model, we contact a user only if she has at least a 70% probability of retweeting within a certain time window (say, 6 hours).
• We experimented with different time windows such as 6, 12, 18 or 24 hours. The table below shows the averaged retweeting rate.

Comparison of retweeting rates with time constraints (average retweeting rate in the testing set):

Approach                                    Public Safety   Bird flu
Random People Contact                       2.2%            6.5%
Popular People Contact                      2.7%            6.4%
Our Prediction Approach                     13.3%           13.6%
Our Prediction Approach + Wait-Time Model   19.3%           14.7%
21. Evaluating in Terms of Benefit & Cost

• Benefit: number of people exposed to the message
• Cost (N): the number of people contacted
• K: the number of retweeters
• Unit-Info-Reach-Per-Person (normalized benefit)

Comparison of information reach (Unit-Info-Reach-Per-Person):

Approach                                    Public Safety   Bird flu
Random People Contact                       6               85
Popular People Contact                      11              116
Our Prediction Approach                     106             135
Our Prediction Approach + Wait-Time Model   153             155
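The slide does not spell out the Unit-Info-Reach-Per-Person formula; one natural reading (an assumption on my part, consistent with the definitions of benefit and cost above) is the total follower reach of the K retweeters divided by the N people contacted:

```python
def unit_info_reach(retweeter_follower_counts, num_contacted):
    """Normalized benefit: people exposed per person contacted.

    retweeter_follower_counts : follower count of each of the K users
                                who actually retweeted (benefit terms)
    num_contacted             : N, the number of people contacted (cost)

    The exact definition is not given on the slide, so this is one
    plausible reading, not the authors' confirmed formula."""
    return sum(retweeter_follower_counts) / num_contacted
```

For example, two retweeters with 5,000 and 1,000 followers out of 60 contacted users would yield a unit info reach of 100 exposed people per contact.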
22. Live Experiment

• To validate the effectiveness of our approach in a live setting, we used our recommender system to test our approach against the two baselines
• Randomly selected 426 candidates who had recently tweeted about “bird flu” in July 2013
• Each approach selected its top 100 candidates based on its criteria

Comparison of retweeting rates in the live experiment:

Approach                  Retweeting Rate
Random People Contact     4%
Popular People Contact    9%
Our Prediction Approach   19%

Comparison of retweeting rates in the live experiment with time constraints:

Approach                                    Average Retweeting Rate
Random People Contact                       4%
Popular People Contact                      8.7%
Our Prediction Approach                     18%
Our Prediction Approach + Wait-Time Model   18.5%
23. To wrap up…

• We have presented a feature-based prediction model that can automatically identify the right individuals at the right time on Twitter
• We have also described a time estimation model
• In the experiments, our approaches doubled the retweeting rates over the two baselines
• With our time estimation model, our approach outperformed other approaches significantly
• Overall, our approach effectively identifies qualified candidates for retweeting a message within a given time window
26. Why Retweet a Stranger’s Request?

We randomly selected 50 people who retweeted and asked them why they chose to retweet (33 replied).

Main reasons to retweet our requested message:
• Trustworthiness of the content: “Because it contained a link to a significant report from a reputable media news source”
• Content relevance: “Because it happened in my neighborhood”
• Content value: “my followers should know this or they may think this info is valuable”
27. Real-Time Retweeter Recommendation

The interface of our retweeter recommendation system: (a) left panel: system-recommended candidates, and (b) right panel: a user can edit and compose a retweeting request.