This presentation discusses the research paper titled "How to Ask for a Favor: A Case Study on the Success of Altruistic Requests" by Tim Althoff, Cristian Danescu-Niculescu-Mizil, and Dan Jurafsky.
This is a university class presentation discussing the research paper; all credit and rights belong to the original authors.
Measuring the Quality of Online Service - Jin Young Kim
This document discusses methods for measuring the quality of online services. It describes how major companies like Google, Facebook, and Netflix collect data through user behavior, panel surveys, and direct user feedback at different stages of their services. Panel surveys can provide insights but have limitations, while user behavior data is abundant but noisy. The document also provides examples of how to design panel surveys and side-by-side evaluations to assess search engine result pages. It concludes that the best approach is to combine various data collection methods depending on the service characteristics and lifecycle.
Fairness in Search & RecSys (NAVER Search Colloquium) - Jin Young Kim
As search and recommender systems play a growing role in society, the fairness of their results has recently become a prominent concern. This talk covers fairness issues in search and recommender systems and their remedies: various ways to define fair search and recommendation results, the resource-allocation and stereotyping problems caused by a lack of fairness, and, drawing on recent research, what solutions exist at each stage of developing a search or recommender system. It closes with practical considerations for building fair systems in the real world.
Cross-cultural variation in the perception of impoliteness - Cristina Vidal
1) The document summarizes a study on perceptions of impoliteness across different cultures, reporting on student experiences in England, China, Finland, Germany, and Turkey.
2) It analyzes the experiences through the framework of quality face, social identity face, and relational face. For example, remarks targeting someone's group or insulting their personal qualities were seen as impolite.
3) Preliminary results found some differences between cultures, such as Chinese students viewing violations of equity or rights as more impolite, possibly due to Confucian influences. However, the document cautions against stereotyping cultures and calls for more nuanced research.
There are two types of implicatures:
1. Conventional implicatures - Implicatures that are directly associated with the use of certain words or expressions. For example, the word "but" conventionally implicates contrast.
2. Conversational implicatures - Implicatures that are generated based on the Cooperative Principle and Grice's Maxims of Conversation. For example, if someone says "I'm out of gas" in response to being asked for a ride, they conversationally implicate that they cannot give you a ride.
The key difference is that conventional implicatures are directly linked to the meaning of words and expressions, while conversational implicatures are inferred from the context and Grice's Maxims.
This document discusses theories of politeness from a socio-pragmatic perspective. It outlines Brown and Levinson's influential theory of politeness from 1978, which proposes that politeness arises from people's desire to protect each other's "face" or public self-image. Brown and Levinson identify two types of face - positive face, which is the desire to be approved of, and negative face, which is the desire to not be imposed on. They suggest politeness strategies like indirect speech acts that mitigate potential threats to another's face. The document also reviews other approaches to politeness including social norm, conversational contact, and maxims approaches.
My talk from Carnegie Mellon's HCII Seminar on April 24, 2013.
Abstract:
On some social media platforms, such as Twitter, Youtube, Pinterest, and tumblr, much of the content generated by users is publicly accessible and communication can be easily initiated between strangers who have never previously communicated before. The communities that have risen up around these platforms, particularly on Twitter, can also be inclusive and supportive of interactions between strangers. The public and open nature of these communities creates an opportunity to create a new kind of crowdsourcing system, where individuals are identified who may be good candidates to complete various tasks based on their published content. We explore the potential of such a system through several information collection tasks, examining the response rate and information quality that can be obtained through such a system. We also explore a means of leveraging users' previous social media content to predict their likelihood of response and optimize our system's collection behavior. At IBM Research - Almaden, we are now looking to extend these ideas to additional domains, including proactive and reactive customer support, and precision marketing campaigns.
In this video we talk about what a US is and how to gather the information needed to make a good one, with the help of two case studies.
You can find the video that goes with this here https://www.youtube.com/watch?v=nK9LHXa8x7A
SIGIR 2016 presentation slide for paper: Xin Qian, Jimmy Lin, and Adam Roegiest. Interleaved Evaluation for Retrospective Summarization and Prospective Notification on Document Streams. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 175-184, July 2016, Pisa, Italy.
An Introduction to the World of User Research - Methods
What is user research? Why do we do it? How do we do it? User research consultants Dr Jennifer Klatt and Ben Smith from Methods Digital (https://methodsdigital.co.uk/) have kindly put together this slide deck to take you through the basics.
Quality vs. Access case study: Complete a full paper outline - makdul
This document provides an outline for analyzing a case study on the tension between quality of care and access to care in the healthcare system. The outline includes sections for introduction, stakeholders, overview, analysis, recommendations, and conclusion. The background information provided discusses how the Affordable Care Act raised Medicaid reimbursement levels and is now tying quality measurements to reimbursement levels. This could result in some patient groups facing reduced access to care. The analysis section is meant to address how the payment system could be modified to reward quality without negatively impacting access for low-income or less healthy patients.
Auditing search engines for differential satisfaction across demographics - Amit Sharma
This document presents a framework for auditing search engines to detect differences in user satisfaction across demographics. It describes three methods for more meaningful auditing that control for natural demographic variations: 1) Context matching to select near-identical user activity, 2) A hierarchical query-level model to borrow strength across popular queries, and 3) A query-level pairwise model to directly estimate relative satisfaction between user pairs for the same query. The framework found some light trends of older users being more satisfied but showed auditing is nuanced and different from measuring metrics on binned traffic alone. It provides a general approach for auditing systems using different metrics and user groups.
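The query-level pairwise idea can be made concrete with a small sketch. This is an illustrative reconstruction, not the paper's actual method: the session data, the binary "satisfied" flag, and the group labels are all hypothetical. The key point it demonstrates is that only queries issued by both demographic groups contribute to the estimate, which controls for the groups searching for different things.

```python
# Minimal sketch of query-level pairwise auditing: compare a satisfaction
# proxy between two groups, but only within the same query.
from collections import defaultdict

def pairwise_satisfaction_gap(sessions):
    """sessions: list of (query, group, satisfied) tuples,
    with group in {"A", "B"} and satisfied in {0, 1}."""
    by_query = defaultdict(lambda: {"A": [], "B": []})
    for query, group, satisfied in sessions:
        by_query[query][group].append(satisfied)

    diffs = []
    for query, groups in by_query.items():
        # Only queries issued by both groups contribute, which controls
        # for differences in *what* each group searches for.
        if groups["A"] and groups["B"]:
            rate_a = sum(groups["A"]) / len(groups["A"])
            rate_b = sum(groups["B"]) / len(groups["B"])
            diffs.append(rate_a - rate_b)
    return sum(diffs) / len(diffs) if diffs else 0.0

sessions = [
    ("weather", "A", 1), ("weather", "B", 1),
    ("tax forms", "A", 0), ("tax forms", "B", 1),
    ("weather", "A", 1), ("rare query", "A", 0),  # no B pair: excluded
]
print(pairwise_satisfaction_gap(sessions))  # → -0.5
```

Averaging per-query gaps rather than pooling all sessions is what distinguishes this from "measuring metrics on binned traffic alone."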
Measuring effectiveness of machine learning systems - Amit Sharma
Many online systems, such as recommender systems or ad systems, are increasingly being used in societally critical domains such as education, healthcare, finance and governance. A natural question to ask is about their effectiveness, which is often measured using observational metrics. However, these metrics hide cause-and-effect processes between these systems, people's behavior and outcomes. I will present a causal framework that allows us to tackle questions about the effects of algorithmic systems and demonstrate its usage through evaluation of Amazon's recommender system and a major search engine. I will also discuss how such evaluations can lead to metrics for designing better systems.
The document evaluates the quality of life of residents living in the Bob and Judy Charles SmartHome run by Imagine!. Data was collected through phone interviews and the Supports Intensity Scale before and one year after moving in. Results found that most quality of life indicators like safety, choices, and relationships increased while support needs decreased. There were also some positive correlations between subjective and objective quality of life reports. In conclusion, living in the SmartHome enhanced residents' quality of life by providing greater independence, access, and interaction with their environment.
6.6 Family and Youth Program Measurement Simplified
Speaker: Iain De Jong
Effective homeless assistance systems rely on quality data and performance measurement. This workshop will describe simple steps to evaluate program outcomes as well as practical strategies for using data systems to support a performance-based homeless assistance system.
This document provides an overview of a public and patient engagement training session hosted by Community & Voluntary Action Tameside (CVAT) and Healthwatch Tameside. The training covered frameworks for assessing the scope and impact of proposals in order to determine the appropriate level of public engagement. Participants worked through case studies to practice applying the frameworks. They considered questions around understanding impact, identifying stakeholders, and planning evaluation. The goal was to equip participants with tools for meaningful public involvement in health and social care projects.
This document summarizes a discussion between Christy Gilchrist from Wellspan Health and Todd Tullis from goBalto on using site intelligence and predictive analytics to improve clinical trial feasibility assessments, site selection, startup, and performance evaluation. Some key points discussed include:
1) Using data analysis of electronic health records and epidemiological models to better predict patient enrollment expectations and feasibility at sites.
2) Measuring site and sponsor responsiveness to startup tasks in real-time to facilitate faster resolution of issues.
3) Evaluating site performance against enrollment goals, compliance goals, and business goals to help sites improve for future trials.
4) Sharing post-study performance data with sites to build
This document summarizes a presentation by Iain De Jong on data and performance measurement for homelessness services. The presentation covers: why collecting good data is important; key definitions like inputs, activities, outputs and outcomes; how to create a data typology and logic models; setting targets and doing data analysis; meeting funder expectations; and creating a data-focused organizational culture. Common problems with data like confusing outputs and outcomes are also addressed. The goal is to help organizations better use data to understand their work and drive improvements in serving clients experiencing homelessness.
Crowdsourcing can provide medical insights from patients and physicians. It involves soliciting contributions from online communities rather than traditional employees. There are three main crowdsourcing groups: general population, disease-specific communities, and physician-specific platforms. Crowdsourcing offers advantages like cost, speed, and geographic reach but also risks if not properly utilized and interpreted. It has potential uses including understanding patient experiences, preferences, and comprehension. Costs vary by project but general crowdsourcing typically aims to compensate at least minimum wage. Crowdsourcing shows promise if risks are mitigated and results are contextualized.
The Safe Shelter Collaborative is a project dedicated to finding more shelter faster for a greater diversity of human trafficking and domestic violence survivors. This deck provides overview information, a hold for a live demo, and appendices that include results from the pilot, the research we've done on where to launch next, and what it takes to participate in the project in general.
Analyzing behavioral data for improving search experience - Pavel Serdyukov
This document discusses behavioral data analysis from search click logs to improve search experiences. It provides an overview of Yandex's efforts to share anonymized click data through hosting public challenges on relevance prediction, switching detection, and personalized search. These challenges helped analyze user behavior and identify challenges around sparse query and click data for tail queries, lack of feedback beyond the first search results page, and limitations of offline evaluation metrics. The talk outlines approaches to address these challenges, such as propagating click-through rates between similar queries, examining lower ranked results, and developing click model-based offline metrics.
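The "propagating click-through rates between similar queries" approach for sparse tail queries can be sketched as shrinkage toward a neighbor-pooled CTR. This is a generic empirical-Bayes-style sketch under my own assumptions (the `prior_strength` parameter and the notion of a pre-computed neighbor pool are hypothetical), not Yandex's actual model.

```python
# Sketch of CTR propagation: with few impressions, the estimate leans on
# similar queries; with many, it trusts the query's own click data.
def smoothed_ctr(clicks, impressions,
                 neighbor_clicks, neighbor_impressions,
                 prior_strength=20):
    """Shrink the query's raw CTR toward the CTR pooled from its
    similar-query neighbors; prior_strength is in pseudo-impressions."""
    neighbor_ctr = neighbor_clicks / neighbor_impressions
    return (clicks + prior_strength * neighbor_ctr) / (impressions + prior_strength)

# Tail query: 1 click in 2 impressions (raw CTR 0.5, very noisy);
# neighbors: 30 clicks in 300 impressions (CTR 0.1).
print(smoothed_ctr(1, 2, 30, 300))  # pulled strongly toward 0.1
```

As the query accumulates its own impressions, the neighbor prior's influence fades, so head queries are essentially unaffected.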
The document discusses survey design and data collection. It covers several key topics:
1. What should be measured including characteristics, channels, outcomes and assumptions based on a theory of change. Accurate and precise indicators are important.
2. Methods of data collection such as surveys, qualitative methods, and tests. Good measures are accurate without bias and precise without random error.
3. Challenges in measurement including things people don't know well or want to talk about, abstract concepts, things not directly observable, and things best directly observed through protocols. Data collection requires reliability, validity, integrity, accuracy and timeliness.
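The distinction in point 2, accuracy (no systematic bias) versus precision (no random error), is easy to see in a small simulation. The true value and error magnitudes below are fabricated purely for illustration.

```python
# Illustrative simulation: a biased indicator is systematically off
# (inaccurate), while a noisy one scatters widely around the truth
# (imprecise) yet averages out correctly.
import random

random.seed(0)
TRUE_VALUE = 100.0

biased = [TRUE_VALUE + 10 + random.gauss(0, 1) for _ in range(1000)]  # precise, inaccurate
noisy = [TRUE_VALUE + random.gauss(0, 15) for _ in range(1000)]       # accurate, imprecise

def mean(xs):
    return sum(xs) / len(xs)

print(round(mean(biased) - TRUE_VALUE, 1))  # ~10: bias survives averaging
print(round(mean(noisy) - TRUE_VALUE, 1))   # ~0: noise averages out, but any
                                            # single reading may be far off
```

The practical upshot matches the text: more data cures random error but never cures bias, which is why indicator design matters before collection starts.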
An Engaging Click ... or how can user engagement measurement inform web search - Mounia Lalmas-Roelleke
A good search engine is one where users come regularly, type their queries, get their results, and leave quickly. In terms of user engagement metrics from web analytics, this translates to low dwell time, often low CTR, but a very high return rate. But user engagement is not just about this. User engagement is a complex phenomenon that requires a number of approaches for its measurement: we can ask users about their experience through questionnaires, we can observe where they look or move the mouse, and we can calculate various web analytics metrics. The aim of this talk is to discuss how current work on user engagement, not necessarily specific to web search, can provide insights that put search into a broader perspective.
This presentation is part of Search Solutions 2013, 27 November 2013, at the BCS HQ. A first version of this talk was given at the SIGIR 2013 Industry Day by Ricardo Baeza-Yates.
This document summarizes a study on the effect of using social network sites for business marketing in Bahraini organizations. The study aimed to identify how useful social networks are for marketing, their effective utilization, and their relationship to profit and customer loyalty. A literature review covered the history and characteristics of social networks. The researcher conducted a survey of 65 Bahraini businesses and analyzed the results. The study found that social network marketing can increase awareness and loyalty if used positively, but not reputation. It also found a strong correlation between social media use and increased inquiries, orders, revenue, market share and profit.
What is a needs assessment, and how to write one - NeveenJamal
A needs assessment is a systematic process for determining and addressing needs, or "gaps", between current conditions and desired conditions or "wants".
A needs assessment is a process used by organizations to determine priorities, make organizational improvements, or allocate resources. It involves determining the needs, or gaps, between where the organization envisions itself in the future and the organization's current state
A needs assessment is a part of planning processes
This document provides an overview of a public and patient engagement training session focused on engagement planning. It introduces the trainers and organizations involved. It reviews engagement concepts like levels of involvement and unexpected responses. Exercises guide participants to plan engagement for case studies, including identifying target audiences, appropriate methods, questions, consent processes, and impact evaluation. The training emphasizes structured planning, understanding audiences, managing expectations, and providing feedback.
Measuring user engagement: the do, the do not do, and the we do not know - Mounia Lalmas-Roelleke
In the online world, user engagement refers to the quality of the user experience that emphasises the phenomena associated with wanting to use an application longer and more frequently. User engagement is a multifaceted, complex phenomenon, which gives rise to a number of measurement approaches. Common ways to evaluate user engagement include self-report measures, e.g., questionnaires; physiological methods, e.g., cursor and eye tracking; and web analytics, e.g., number of site visits and click depth. These methods represent various trade-offs in terms of setting (laboratory versus in the wild), object of measurement (user behaviour, affect or cognition) and scale of data collected. This talk will present various efforts aiming at combining approaches to measure engagement. A particular focus will be what these measures, individually and combined, can and cannot tell us about user engagement. The talk will use examples of studies on news sites, social media, and native advertising.
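Two of the web-analytics measures named above, dwell time and return rate, can be computed from a visit log with very little machinery. The log below is fabricated example data; real analytics pipelines would of course sessionize raw events first.

```python
# Sketch: compute average dwell time and return rate from a visit log.
from datetime import datetime

visits = [  # (user, session start, session end) — fabricated data
    ("u1", "2024-01-01 09:00", "2024-01-01 09:05"),
    ("u1", "2024-01-02 09:00", "2024-01-02 09:01"),
    ("u2", "2024-01-01 10:00", "2024-01-01 10:02"),
]

def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")

# Dwell time: seconds spent per visit, averaged over all visits.
dwell_times = [(parse(end) - parse(start)).total_seconds()
               for _, start, end in visits]
avg_dwell = sum(dwell_times) / len(dwell_times)

# Return rate: share of users who came back for a second visit.
users = {u for u, _, _ in visits}
returning = {u for u in users if sum(1 for v in visits if v[0] == u) > 1}
return_rate = len(returning) / len(users)

print(avg_dwell)    # 160.0 seconds
print(return_rate)  # 0.5
```

This also illustrates the talk's point that single metrics mislead: for a search engine, low dwell plus high return rate is success, while for a news site the same numbers might signal failure.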
Encouraging Reading of Diverse Political Viewpoints with a Browser Widget - Sean Munson
The Internet gives individuals more choice in political news and information sources and more tools to filter out disagreeable information. Citing the preference described by selective exposure theory — people prefer information that supports their beliefs and avoid counter-attitudinal information — observers warn that people may use these tools to access only agreeable information and thus live in ideological echo chambers.
We report on a field deployment of a browser extension that showed users feedback about the political lean of their weekly and all time reading behaviors. Compared to a control group, showing feedback led to a modest move toward balanced exposure, corresponding to 1-2 visits per week to ideologically opposing sites or 5-10 additional visits per week to centrist sites.
Global Situational Awareness of A.I. and where it's headed - vikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we're lucky, we'll be in an all-out race with the CCP; if we're unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
The Building Blocks of QuestDB, a Time Series Database - javier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review some of the changes we have made over the past two years to deal with late and unordered data, non-blocking writes, read replicas, and faster batch ingestion.
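One common way to handle the late and unordered data the talk mentions, sketched here in plain Python and not intended as QuestDB's actual implementation, is to keep recent rows in a sorted staging buffer and commit only rows older than a lateness bound, past which no late arrival can displace them. The class and parameter names below are hypothetical.

```python
# Sketch of a staging buffer for out-of-order time-series ingestion.
import bisect

class OutOfOrderBuffer:
    def __init__(self, max_lateness):
        self.max_lateness = max_lateness  # how far behind a row may arrive
        self.pending = []                 # kept sorted by timestamp
        self.committed = []               # final, timestamp-ordered rows

    def append(self, ts, value):
        bisect.insort(self.pending, (ts, value))
        # Commit every buffered row at or below (newest_ts - max_lateness):
        # those rows can no longer be displaced by a late arrival.
        watermark = self.pending[-1][0] - self.max_lateness
        while self.pending and self.pending[0][0] <= watermark:
            self.committed.append(self.pending.pop(0))

buf = OutOfOrderBuffer(max_lateness=10)
for ts, v in [(100, "a"), (105, "b"), (103, "c"), (120, "d")]:
    buf.append(ts, v)
print(buf.committed)  # → [(100, 'a'), (103, 'c'), (105, 'b')]
```

The trade-off is latency versus correctness: a larger `max_lateness` tolerates later arrivals but delays when rows become visible to readers.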
2. Methods of data collection such as surveys, qualitative methods, and tests. Good measures are accurate without bias and precise without random error.
3. Challenges in measurement including things people don't know well or want to talk about, abstract concepts, things not directly observable, and things best directly observed through protocols. Data collection requires reliability, validity, integrity, accuracy and timeliness.
An Engaging Click ... or how can user engagement measurement inform web searc...Mounia Lalmas-Roelleke
A good search engine is one when users come very regularly, type their queries, get their results, and leave quickly. With user engagement metrics from web analytics, these translate to a low dwell time, often low CTR, but a very high return rate. But user engagement is not just about this. User engagement is a complex phenomenon that requires a number of approaches for its measurement: we can ask the user about their experience though questionnaires, we can observe where they look or move the mouse, and we can calculate various web analytic metrics. The aim of this talk is to discuss how current work on user engagement, not necessary specific to web search, can provide insights into putting search into more broader perspectives.
This presentation is part of Search Solutions 2013, 27 November 2013, at the BCS HQ. A first version of this talk was given at the SIGIR 2013 Industry Day by Ricardo Baeza-Yates.
This document summarizes a study on the effect of using social network sites for business marketing in Bahraini organizations. The study aimed to identify how useful social networks are for marketing, their effective utilization, and their relationship to profit and customer loyalty. A literature review covered the history and characteristics of social networks. The researcher conducted a survey of 65 Bahraini businesses and analyzed the results. The study found that social network marketing can increase awareness and loyalty if used positively, but not reputation. It also found a strong correlation between social media use and increased inquiries, orders, revenue, market share and profit.
what is a needs assessment , How to write a needs assessmentNeveenJamal
A needs assessment is a systematic process for determining and addressing needs, or "gaps" between current conditions and desired conditions or "wants“
A needs assessment is a process used by organizations to determine priorities, make organizational improvements, or allocate resources. It involves determining the needs, or gaps, between where the organization envisions itself in the future and the organization's current state
A needs assessment is a part of planning processes
This document provides an overview of a public and patient engagement training session focused on engagement planning. It introduces the trainers and organizations involved. It reviews engagement concepts like levels of involvement and unexpected responses. Exercises guide participants to plan engagement for case studies, including identifying target audiences, appropriate methods, questions, consent processes, and impact evaluation. The training emphasizes structured planning, understanding audiences, managing expectations, and providing feedback.
Measuring user engagement: the do, the do not do, and the we do not knowMounia Lalmas-Roelleke
In the online world, user engagement refers to the quality of the user experience that emphasises the phenomena associated with wanting to use an application longer and frequently. User engagement is a multifaceted, complex phenomenon; this gives rise to a number of measurement approaches. Common ways to evaluate user engagement include self-report measures, e.g., questionnaires; physiological methods, e.g. cursor and eye tracking; and web analytics, e.g., number of site visits, click depth. These methods represent various trade-off in terms of the setting (laboratory versus in the wild), object of measurement (user behaviour, affect or cognition) and scale of data collected. This talk will present various efforts aiming at combining approaches to measure engagement. A particular focus will be what these measures individually and combined can tell us and not tell about user engagement. The talk will use examples of studies on news sites, social media, and native advertising.
Encouraging Reading of Diverse Political Viewpoints with a Browser WidgetSean Munson
The Internet gives individuals more choice in political news and information sources and more tools to filter out disagreeable information. Citing the preference described by selective exposure theory — people prefer information that supports their beliefs and avoid counter-attitudinal information — observers warn that people may use these tools to access only agreeable information and thus live in ideological echo chambers.
We report on a field deployment of a browser extension that showed users feedback about the political lean of their weekly and all time reading behaviors. Compared to a control group, showing feedback led to a modest move toward balanced exposure, corresponding to 1-2 visits per week to ideologically opposing sites or 5-10 additional visits per week to centrist sites.
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
The Building Blocks of QuestDB, a Time Series Databasejavier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Aggregage
This webinar will explore cutting-edge, less familiar but powerful experimentation methodologies which address well-known limitations of standard A/B Testing. Designed for data and product leaders, this session aims to inspire the embrace of innovative approaches and provide insights into the frontiers of experimentation!
Analysis insight about a Flyball dog competition team's performanceroli9797
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
2. • Requests : Act of asking formally for something.
• Core of many social media systems.
• Factors that lead members to satisfy a request
remain largely unknown.
How to Ask for a Favor : A Case Study on the Success of Altruistic Requests 2
3. • Does the Language of the Request Matter?
• If Yes, How Does it Matter?
• Is it possible to predict the success or failure of
Altruistic Requests?
No Incentive in return for the favor
4. • Goal was to understand what motivates people to
give when they do not receive anything tangible in
return.
• Developed a framework for controlling potential
confounds while studying the role of two aspects
that characterize compelling requests:
– Social Factors
– Linguistic Factors
5. (image-only slide)
6. • What is being requested?
• What does the giver receive in return?
• Group Dynamics
• In peer-to-peer giving, people are more likely to give
to projects that others are already giving to.
(Mitra and Gilbert 2014; Mollick 2014; Etter, Grossglauser, and Thiran 2013; Ceyhan, Shi, and Leskovec 2011; Teevan, Morris, and Panovich 2011; Burke et al. 2007; Wash 2013; Cialdini 2001)
7. • Reddit's Random Acts of Pizza (RAOP): an online
community facilitating sending and receiving free
pizzas between strangers.
• As they say, "Together we aim to restore faith
in humanity, one slice at a time."
8. • What? – All requests ask for the same thing, a pizza.
• Incentive? – No incentives or rewards
• Group Dynamics? – Social Networking, Requests are
largely Textual
• Request Satisfiers? – Single User
9. • 21,577 posts (8th Dec. 2010 – 29th Sept. 2012)
• Only users with a single request, 5728
requests.
• Average success rate 24.6%
• Split dataset mirroring the success rate:
– Development (70%)
– Test Set (30%)
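The 70/30 split that mirrors the overall 24.6% success rate can be sketched as a stratified split. Everything below is illustrative (synthetic labels, hypothetical function name), not the authors' code:

```python
import random

def stratified_split(items, labels, dev_frac=0.7, seed=42):
    """Split indices so both sets mirror the overall success rate."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    cut_p, cut_n = int(len(pos) * dev_frac), int(len(neg) * dev_frac)
    dev = sorted(pos[:cut_p] + neg[:cut_n])
    test = sorted(pos[cut_p:] + neg[cut_n:])
    return dev, test

# Synthetic stand-in data at roughly the paper's 24.6% success rate.
labels = [1] * 246 + [0] * 754
requests = list(range(len(labels)))
dev, test = stratified_split(requests, labels)
```

Because positives and negatives are partitioned separately, both the development and test sets keep (approximately) the same success rate as the full dataset.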
10. (image-only slide)
11. 1. Textual Factors
– Politeness
– Evidentiality
– Reciprocity
– Sentiment
– Length
2. Social Factors
– Status
– Similarity
3. Narratives
– Money
– Job
– Student
– Family
– Craving
12. • “My gf and I have hit some hard times with her
losing her job and then unemployment as well for
being physically unable to perform her job due to
various hand injuries as a server in a restaurant. She
is currently petitioning to have unemployment
reinstated due to medical reasons for being unable
to perform her job, but until then things are really
tight and ANYTHING would help us out right now.
I [...] would certainly return the favor again when I
am able to reciprocate.”
Length : Favorable for success
Evidentiality : Urgent requests are met fast.
Reciprocity : Promise to return
the favor
13. • Temporal Factors : Temporal or Seasonal effects are controlled
• Politeness : Measure politeness by extracting 19 individual
features from the computational politeness model
(Danescu-Niculescu-Mizil et al. 2013).
• Evidentiality : Presence of an image link, providing evidence
for their claim (86% of images in random sample included
some kind of evidence)
• Reciprocity : Whether the request includes phrases like "pay it
forward", "pay it back" or "return the favor".
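A minimal sketch of the reciprocity feature, assuming plain substring matching over the three phrases named on the slide (the paper's exact matching rules may differ):

```python
# Phrases taken from the slide; matching strategy is an assumption.
RECIPROCITY_PHRASES = ("pay it forward", "pay it back", "return the favor")

def has_reciprocity(text):
    """Binary feature: does the request promise to reciprocate?"""
    lowered = text.lower()
    return any(phrase in lowered for phrase in RECIPROCITY_PHRASES)
```

For the slide-12 example request, `has_reciprocity("I would certainly return the favor again when I am able")` evaluates to `True`.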
• Sentiment : Extracted sentiment annotations for each
sentence of the request using the Stanford CoreNLP package;
count features based on lexicons of positive and negative
words from LIWC; and emoticon detection.
• Length: Total number of words in request.
• Status : Karma points, whether or not the user has posted
on RAOP before, and user account age.
• Narrative : Measure usage of all 5 narratives with word-count
features that count how often a given request mentions
words from the previously defined narrative lexicons.
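The narrative word-count features can be sketched as lexicon lookups. The mini-lexicons below are hypothetical stand-ins; the paper's manually defined lexicons are larger:

```python
import re

# Hypothetical mini-lexicons for illustration only.
NARRATIVE_LEXICONS = {
    "money": {"rent", "bill", "bills", "broke", "paycheck"},
    "job": {"job", "work", "unemployment", "interview"},
    "student": {"college", "school", "exam", "semester"},
    "family": {"family", "kids", "wife", "husband", "mom"},
    "craving": {"craving", "drunk", "celebrate", "party"},
}

def narrative_counts(text):
    """Count how often a request mentions words from each narrative lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {name: sum(token in lexicon for token in tokens)
            for name, lexicon in NARRATIVE_LEXICONS.items()}
```

For example, `narrative_counts("Lost my job and can't pay rent or bills")` counts one "job" mention and two "money" mentions.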
15. Logistic regression results (temporal and textual factors):

Factor group      Coefficient                Estimate   SE
Temporal Control  Community Age              -0.13***   0.01
                  First Half of Month         0.22**    0.08
Politeness        Gratitude                   0.27**    0.08
Evidentiality     Including Image             0.81***   0.17
Reciprocity       Reciprocity                 0.32**    0.10
Sentiment         Strong Positive Sentiment   0.14      0.08
                  Strong Negative Sentiment  -0.07      0.08
Length            Length in 100 words         0.30***   0.05
***p < 0.001, **p < 0.01, *p < 0.05
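The estimates in the table are log-odds contributions. A sketch of how they translate into a predicted success probability via the logistic link; the intercept below is a made-up placeholder, since the slide does not report one:

```python
import math

# Estimates copied from the slide's table.
COEFS = {
    "community_age": -0.13,
    "first_half_of_month": 0.22,
    "gratitude": 0.27,
    "including_image": 0.81,
    "reciprocity": 0.32,
    "length_in_100_words": 0.30,
}
INTERCEPT = -1.5  # assumption, not from the paper

def success_probability(features):
    """Logistic link: p = 1 / (1 + exp(-(intercept + sum of beta * x)))."""
    z = INTERCEPT + sum(COEFS[name] * value for name, value in features.items())
    return 1 / (1 + math.exp(-z))

# A terse 50-word request vs. a 150-word one with image,
# gratitude, and a promise of reciprocity.
bare = success_probability({"length_in_100_words": 0.5})
rich = success_probability({"length_in_100_words": 1.5, "including_image": 1,
                            "gratitude": 1, "reciprocity": 1})
```

Whatever the true intercept, the positive coefficients mean the longer, image-backed, grateful, reciprocating request always comes out with higher predicted odds.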
16. Logistic regression results (status and narrative factors):

Factor group  Coefficient             Estimate   SE
Status        Karma                    0.13***   0.02
              Posted in RAOP before    1.34***   0.16
Narratives    Narrative Craving       -0.34***   0.09
              Narrative Family         0.22*     0.09
              Narrative Job            0.26**    0.09
              Narrative Money          0.19**    0.08
              Narrative Student        0.09      0.09
***p < 0.001, **p < 0.01, *p < 0.05
17. Length : 50 words
Narrative : Craving
Length: 50 words
Narrative : Job and Money
Length : 150 words
Narrative : Job and Money
Includes Picture, Gratitude,
and Reciprocity
18. (image-only slide)
19. • Predicting held-out requests (1.6k)
• Model: Area under receiver operating
characteristic curve (ROC AUC)
– Illustrate performance of binary classifier systems.
– Graphical plot.
– True Positive Rate vs. False Positive Rate
(Friedman, Hastie, and Tibshirani 2010; DeLong, DeLong, and Clarke-Pearson 1988)
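ROC AUC can also be read as a rank statistic: the probability that a randomly chosen successful request is scored above a randomly chosen unsuccessful one. A minimal self-contained sketch (not the paper's tooling):

```python
def roc_auc(labels, scores):
    """AUC as the probability a random positive outranks a
    random negative; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For instance, `roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])` gives 0.75: three of the four positive/negative pairs are ranked correctly. A perfect ranker scores 1.0 and a random guess about 0.5, matching the baselines in the next slide.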
20. Predictive performance on held-out requests:
Feature ROC AUC (***p < 0.001)
Random Baseline 0.500
Unigram Baseline 0.621***
Bigram Baseline 0.618***
Trigram Baseline 0.618***
Text Features 0.625***
Social Features 0.576***
Temporal Features 0.579***
Temporal + Social 0.638***
Temporal + Social + Text 0.669***
Temporal + Social + Text + Unigram 0.672***
21. • User similarity (in terms of interest and
activity) had no significant effect on giving.
• User similarity was not included as a feature in
the logistic regression model since the authors could
only observe givers for a small subset of requests.
22. • The language of a request matters a lot.
• The narrative of the request also plays a role in success.
• Reciprocity matters : Promise to pay it forward
• Pro-social behavior towards requestors who are
of high status.
23. • Expressing gratitude is also vital.
• And finally we can say:
Success is Predictable!
24. (closing slide)
We live in a time where people increasingly turn to the web for help. And if we do not get satisfactory answers from existing web pages, what do we do? We turn to real people, but still on the web: we ask questions in online forums. These requests form the core of many social media systems such as stackoverflow.com, donorschoose.org, reddit.com, etc. The factors that lead community members to satisfy a request remain largely unknown. If we can understand these dynamics and factors, users can be educated to formulate better requests, and the findings have applications in social psychology and linguistic pragmatics.
This paper analyzes the language of the requests. The main questions the authors try to answer in this paper are:
How to disentangle (control) the effects of these factors and focus on language? Is it possible to find a community or setup where these complexities are addressed?
Let's see the characteristics of this community and how it resolves the confounds/complexities we are facing.
This community helps control all the confounds and provides us with an unusually clear picture of the effects of language and social factors on success. This is the ideal setting to understand the effect of language.
The language of the request matters a lot, especially in these kinds of communities where requests are altruistic in nature.
The authors divided the factors into 3 main categories:
Politeness : A person experiencing gratitude is more likely to behave prosocially. However, gratitude is only one component of politeness; others are deference, greetings, apologies, etc. The authors try to answer a more general question here: does a polite request make you more likely to be successful?
Evidentiality : Urgent requests are met more frequently than non-urgent requests.
Reciprocity : Generalized reciprocity (forward to another community member)
Sentiment : The behavioral literature points to the sentiments of the persons involved. But here we can only refer to the sentiment of the text.
Length: A longer request will be interpreted as showing more effort, and gives the opportunity to provide more convincing evidence of the requester's situation.
Status: Studies in social psychology found that high status attracts help more often.
Similarity : People are more likely to help those who resemble them.
Narratives: Different kinds of narratives were identified based on previous literature using topic modeling and related techniques. The success rate varies a lot across different clusters of words (narrative groups). Some topics cover the same or multiple narratives (some noise) and some mostly consist of function/stop words. Therefore, the authors used those topics as a starting point (together with similar LIWC categories) to manually define the five narratives. Dropped topics: Friend, Time, Gratitude, Pizza, General.
Temporal Factors: Controlled by measuring specific week days, months, hours etc.
Linguistic Inquiry and Word Count (LIWC) is a text analysis software program.
All these factors are measured, and the success probability of a request is modeled in a logistic regression framework that allows reasoning about the significance of one factor given all the other factors, using success as the dependent variable and textual, social and temporal features as independent variables.
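The modeling setup can be illustrated with a minimal gradient-descent logistic regression on toy data. Everything below is a sketch under stated assumptions (one synthetic feature, hand-rolled fitting); the real model uses the full feature set and standard estimation tools:

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Minimal stochastic-gradient logistic regression (illustrative only)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1 / (1 + math.exp(-z))
            err = p - yi  # gradient of the log loss w.r.t. z
            b -= lr * err
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
    return w, b

# Toy 1-D data: feature is request length in hundreds of words;
# longer requests succeed more often, echoing the positive length effect.
X = [[0.2], [0.4], [0.6], [0.9], [1.2], [1.5]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
p_long = 1 / (1 + math.exp(-(b + w[0] * 1.5)))
p_short = 1 / (1 + math.exp(-(b + w[0] * 0.2)))
```

On this toy data the fitted length coefficient comes out positive, so a long request gets a higher predicted success probability than a short one; with the paper's full model the same logic runs over all textual, social and temporal features at once.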
The likelihood-ratio test discussed above to assess model fit is also the recommended procedure to assess the contribution of individual "predictors" to a given model.
If the p-value is small enough to claim statistical significance, that just means there is strong evidence that the coefficient is different from 0.
Politeness: Out of 19 features, only gratitude is significant.
Evidentiality: Highly significant; including an image makes a request more likely to succeed (need and urgency).
Reciprocity: Willingness to give back to the community, high significance.
Sentiment: Stops being significantly correlated with success when controlling for other variables.
Status: Account age is strongly correlated with karma; senior users have high status.
Narratives: The narratives significantly improve the fit, except the "Student" narrative. Narratives that clearly communicate need are more successful than those that do not.
Let's take an example to understand and interpret the results.
Median Length is 74 words.
Examples assume Median Karma and Community age.
We have seen that textual, social and temporal features all significantly improve the fit of the logistic regression model. Let's test to what degree the model is able to generalize and predict the success of unseen requests from the held-out test set.
The L1-penalized estimation method shrinks the estimates of the regression coefficients toward 0 relative to the maximum likelihood estimates.
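That shrinkage can be illustrated with the soft-thresholding operator the L1 penalty induces on a coefficient (a sketch of the behavior, not the paper's estimation code):

```python
def soft_threshold(beta, lam):
    """L1 penalty shrinks an estimate toward 0 by lam;
    small estimates are zeroed out entirely."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0
```

With a penalty of 0.25, an estimate of 0.5 shrinks to 0.25, while an estimate of 0.05 is set exactly to 0; this is why L1 regularization both shrinks coefficients and performs feature selection.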
A receiver operating characteristic (ROC) curve is a graphical plot which illustrates the performance of a binary classifier system. It is created by plotting the true positive rate vs. the false positive rate. A perfect model scores 1 and a random guess scores around 0.5.
In the fields of computational linguistics and probability, an n-gram is a contiguous sequence of n items from a given sequence of text or speech.
DNA sequencing example with base pairs …AGCTTCGA…: unigrams A, G, C, T, T, C, G, A, …; bigrams AG, GC, CT, TT, TC, CG, GA, …; trigrams AGC, GCT, CTT, TTC, TCG, CGA, …
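The n-gram extraction behind the unigram/bigram/trigram baselines (and the DNA example above) can be sketched as:

```python
def ngrams(seq, n):
    """All contiguous length-n subsequences of seq."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

seq = list("AGCTTCGA")
bigrams = ["".join(g) for g in ngrams(seq, 2)]   # AG, GC, CT, TT, TC, CG, GA
trigrams = ["".join(g) for g in ngrams(seq, 3)]  # AGC, GCT, CTT, TTC, TCG, CGA
```

For the text baselines, the same function runs over word tokens instead of characters, and the resulting n-gram counts become features.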
There is no significant difference between the textual model and the uni-, bi- and trigram baselines, even though the textual model has only 9 features while the baselines have many more.
Lastly, the table also demonstrates that adding the unigram model does not significantly improve predictive accuracy. This shows that the concise set of textual factors accounts for almost all the variance.
It is worth pointing out that the authors are purposefully dealing with a very difficult setting: since the goal is to assist users during request creation, they do not use any factors that can only be observed later (e.g. responses, updates, comments).
User similarity was measured by representing users by their interests, i.e., the set of subreddits in which they have posted at least once.
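One way to score that interest overlap is Jaccard similarity of the two users' subreddit sets; the choice of Jaccard here is an assumption for illustration, not confirmed by the slides:

```python
def jaccard(a, b):
    """Overlap of two users' subreddit sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0

# Hypothetical users and subreddit histories.
requester = {"Random_Acts_Of_Pizza", "AskReddit", "jobs"}
giver = {"AskReddit", "jobs", "Frugal"}
similarity = jaccard(requester, giver)  # 2 shared of 4 total -> 0.5
```

A score of 1.0 means identical posting histories and 0.0 means no shared subreddits; per slide 21, this kind of similarity turned out to have no significant effect on giving.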