This document discusses methods for measuring the quality of online services. It describes how major companies like Google, Facebook, and Netflix collect data through user behavior, panel surveys, and direct user feedback at different stages of their services. Panel surveys can provide insights but have limitations, while user behavior data is abundant but noisy. The document also provides examples of how to design panel surveys and side-by-side evaluations to assess search engine result pages. It concludes that the best approach is to combine various data collection methods depending on the service characteristics and lifecycle.
Fairness in Search & RecSys (NAVER Search Colloquium) - Jin Young Kim
As search and recommendation systems play a growing social role, the fairness of their results has recently become a major concern. This talk covers fairness issues in search and recommendation systems and their possible solutions: the various ways of defining fair search and recommendation results, the resource-allocation and stereotyping problems caused by a lack of fairness, and the remedies available at each stage of system development, drawing on recent research. Finally, practical considerations for building fair systems in the real world are discussed.
SIGIR Tutorial on IR Evaluation: Designing an End-to-End Offline Evaluation P...Jin Young Kim
This tutorial aims to provide attendees with a detailed understanding of an end-to-end evaluation pipeline based on human judgments (offline measurement). The tutorial will give an overview of the state-of-the-art methods, techniques, and metrics necessary for each stage of the evaluation process. We will mostly focus on evaluating an information retrieval (search) system, but other tasks such as recommendation and classification will also be discussed. Practical examples will be drawn both from the literature and from real-world usage scenarios in industry.
Little Known Features of Qualtrics Research Suite That Will Make Your Life Ea...Qualtrics
Have you ever had one of those moments where you think to yourself, "How did I not know this before?" Join us for a fast-paced webinar as we uncover some of our favorite features to help you make a bigger impact in your research.
Presented: April 1, 2015
A bridge between two worlds – where qual and quant meet: Slides from UX Austr...U1 Group
In a combined presentation with Telstra, we take a unique, fresh, and evidence-based approach to the often-controversial topic of qual or quant. We will demonstrate how linking quantitative with qualitative techniques can significantly improve the ability to understand customers, and consequently design services to meet their needs, improve experiences, and ultimately measure success.
Leveraging business intelligence with service design frameworks
Most companies collect a large amount of data in the form of customer feedback, but due to the structure and size it is often underutilised. Let us show you how we created a service framework using this information for Telstra – one that tests the end-to-end customer experience by aligning both quantitative and qualitative research, the best of both worlds! See the techniques we applied, as well as how the framework for Telstra’s products and services relates to service design and testing.
This service framework has provided a better, more holistic service experience for customers. The feedback from our qualitative counterparts has been amazing; it has revolutionised the way they do UX and CX research. Not only do they use it as a tool to understand existing service environments, they can now prioritise findings on key user and customer experiences that have the biggest impact in driving changes and improvements.
Instead of just relying on a small sample of information to make a conclusion about a market or experience, researchers now have the added value of quantitative information to gain further credibility with stakeholders – and ultimately drive better business outcomes.
We hope that our presentation will help you take away what we have learned, and what strategies we recommend, to maximise outcomes for your business too.
Tips and tricks for effective in-app customer research and surveys. This slideshare walks you through the growing importance of mobile research, the questions you should be asking when designing an in-app survey, and a few best practices we've discovered while working with our enterprise customers.
Intro talk on lean unmoderated user testing given at General Assembly, Los Angeles in spring 2013. Covers basics, benefits & limitations, when to test, what to test, and a case study.
Remove the weight of user testing. Make user testing as light weight as possible, to embed it in your design processes without slowing down your development.
UX STRAT Online 2020: Dr. Martin Tingley, NetflixUX STRAT
Over the years, the Netflix UI has evolved from a sparse and static webpage into an immersive, video-centric experience tailored to a variety of platforms. In this talk, I’ll describe the simple but powerful framework that Netflix uses to evolve the product experience: we ask our members, through online A/B tests, which of several possible experiences resonate with them. I’ll also describe the steps we are taking to democratize access to experimentation across the company so that we can explore more ideas and identify those that deliver more value to our members.
User research for Product Managers - Product Tank London Jan 17Morag McLaren
As the head of product for a User Experience Research company I gathered feedback from our clients to help other product managers get user research embedded within their companies.
We talked about getting buy-in from stakeholders, getting started with UX and proving its value and also some of the common tools and methodologies involved.
A primer on A/B testing and its application in ecommerce. A necessary tool in every product manager's arsenal. Covers the principles behind setting up a good test and the statistical tools required to analyze results.
Intro to User Journey Maps for Building Better Websites - Cornell Drupal Camp...Anthony D. Paul
You’ve asked the right questions and maybe you have some personas. There’s a heap of feature requests from your client and a whole lot of content to organize into a sitemap (IA) document and wireframes. However, something’s not sitting right and you wonder how the site fits into the bigger customer journey with the client’s brand, business, and products.
In this talk, I’ll show you how to get started with taking all of that subject matter expertise you’ve been collecting in your mind, and to convert it into one of several useful types of journey maps. I’ll share process, examples, context on how they fit into a larger project, and show how they help bring agreement among your client decision-makers.
• Understand the benefits of thinking through a user journey outside of your website.
• See the variety of types of journey maps and identify where and when to use them.
• Build and use journey maps to shape client conversations and audit decisions.
Machine learning applications nurturing growth of various business domainsShrutika Oswal
Machine learning is a field in which machines become smarter and help humans make better decisions based on patterns learned from previous data. The technique is not new, but it is gaining fresh momentum. A machine learning algorithm learns from previous records and analyzes the data; without human intervention, it generates its own recommendations, adds them to its database as experience, and uses them for further processing. In short, the machine learns from its own experience and gives you progressively better output.
Machine learning is an iterative process: as more data is added, machines learn from fresh feeds of data and independently adapt to handle new data without constant human intervention. Machine learning was earlier used to predict what is happening in a business, but now machine learning algorithms also suggest what action should be taken to move the business forward.
This presentation reports the results of a literature survey of machine learning applications nurturing the growth of various business domains. More specifically, it gives a brief introduction to machine learning, the four major types of machine learning, and the enhancements that various machine learning algorithms bring to different business domains.
Social Entrepreneur meets Technology, by CEO 황진솔 (Jinsol Hwang) - Jin Young Kim
With the year-end approaching, many of you will be taking part in giving. Our guest, CEO Jinsol Hwang, runs The Bridge, a startup with a new business model that combines donation and investment. In this seminar he will talk about The Bridge's business, as well as appropriate technology suited to conditions in less-developed countries.
Hello Data Science: Improving Life and Work with Data Science (Startup Alliance talk) - Jin Young Kim
Slides from a public talk on data science given at Startup Alliance on December 22. This is an extended version of the deck actually presented, with slides omitted for time and various links added back in.
- Myths and truths about data
- The data science process and points to watch out for
- Data science cases for business growth
- Writing a book with the help of data science
You can find more materials on data science on my homepage, Facebook, Twitter, and Brunch.
http://www.hellodatascience.com/
For more details about the event, see the Onoffmix page: http://onoffmix.com/event/59334
Thanks to your participation, the CS study-abroad meetup concluded successfully. As with the previous data science meetup, I feel we learned more from your experiences and knowledge than from my own presentation. I would like to thank our panelists once again: Byungho Lee (who runs CSUhak.info) and Wook Park (a senior alumnus of my school, appointed this year as an assistant professor of Electronic Engineering at Kyung Hee University). Byungho Lee shared his experience of pursuing a master's degree abroad, and Wook Park, having completed his degree in Korea, balanced a discussion that could easily have tilted toward studying abroad.
To share the outcome of the event with more people, the slides and videos are posted here.
Slides presented at a liberal-arts seminar at a large hospital.
The audience already had plenty of technical knowledge and was mostly curious about real cases, so I shared many insights from my own experience. However, I came away thinking that in clinical settings, solutions are unlikely to be adopted unless approached as medical devices rather than as platforms or technologies.
An Introduction to the World of User ResearchMethods
What is user research? Why do we do it? How do we do it? User research consultants Dr Jennifer Klatt and Ben Smith from Methods Digital (https://methodsdigital.co.uk/) have kindly put together this slide deck to take you through the basics.
The future for performance management, quality and true continuous improvement for local council planning services. Uses much of the data that councils already send to government, supplements it with some new approaches to customer and quality feedback, and brings it all together in one tidy, holistic report.
October 2013 - Public legal education (PLE) is increasingly delivered online. This webinar will look at how to leverage a number of free or low-cost online tools (including Google Analytics and iPerceptions surveys) to acquire data to measure your impact and align with your key performance indicators or KPIs.
Other tools that will be discussed include online user testing tools and what metrics matter when it comes to social media evaluation.
It may be easier than ever today to collect data, but many marketers still find themselves scratching their heads when trying to decide how best to sift through it to uncover the gems. What’s often even more difficult, however, is developing reports that incite action and encourage future investment in the right strategies and optimizations – especially when findings challenge the status quo.
In this session, Ben Magnuson, Senior Data Strategist at One North, explores how to deliver reports that your stakeholders will actually care to read. Specifically, he dives into how you can shift your reporting strategy to ensure you are:
* Establishing the right baselines and goals to help you more accurately benchmark your progress towards KPIs
* Moving beyond simply showing your work to provide the right level of context around data trends that matter
* Including stakeholders in the development of metrics to prevent surrogation, or the confusion of strategic intent with the metrics meant to represent it
* Creating an influential narrative around your results that helps you overcome bias, combat conventional thought and improve decision making
• Why Analytics
• Google Analytics Step by Step
• Who are the people who visit my website?
• What brought visitors to your site?
• What do they do once they get on the website?
• Did they do what you wanted them to do?
Digital Marketing Course Week 4: Digital AnalyticsAyca Turhan
Fourth week slides of eMarketing Course at Hacettepe University taught by Ayca Turhan.
Topics covered within the presentation include:
Digital Analytics
Conversion Optimization
Testing
Google Analytics Examples
For more please visit: www.aycaturhan.com/man423
Data - How to Use it & When by Square and Call Rail Product LeaderProduct School
Main Takeaways:
-It’s important to define success metrics before you start building your product or feature
-The goal is to validate a hypothesis--what you think users want--and the data might invalidate your hypothesis. This is a good thing! Keep iterating in that direction to identify and test the next hypothesis.
-Data is important to point you in the right direction, but it won’t answer “what is the perfect product?” To figure this out, you need to experiment, fail and repeat as fast as possible to find success.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects demand to grow and supply to evolve, driven by institutional investment rotating out of offices and into work-from-home (“WFH”) infrastructure, and by the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms such as PageRank typically operate on Compressed Sparse Row (CSR), an adjacency-list-based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Measuring the Quality of Online Service - Jin Young Kim
1. Measuring the Quality of Online Service
Jin Young Kim
Senior Applied Scientist
Microsoft Web Search and AI
2. About Jin Young Kim
• Data Scientist at Microsoft
• Quantified Self Enthusiast
(10 years of happiness tracking)
• Author of ‘Hello, Data Science’
(#1 Bestseller in Korea)
3. Data is the key ingredient for all of these issues
• Data for training and evaluating ML models
• Data for discovering the defect and issues
• Data for monitoring the health of existing service
• Data for measuring the value of new service
4. Issues in Online Service Development
• Planning
• How to set business objective & plan?
• Implementation
• How to train and improve ML models?
• Evaluation
• How much are users satisfied with the service?
(Cycle: Plan → Execute → Evaluate)
6. Case Study: Data Collection for Restaurants
• Customer Behavior
• Facial expression
• Quantity of leftovers
• Pace of dining
Only a limited range of data is available, possibly with a lot of noise
7. Case Study: Data Collection for Restaurants
• Panel Survey
• Satisfaction with the food
• Satisfaction with the service
• Satisfaction with the environment
Surveys can provide insights into customer satisfaction, but with some caveats
8. Data Collection for Online Service
• User Behavior
• Various ‘signals’ from behavioral data
• Only a limited range of data is available, with a lot of noise
• Requires a substantial user base
• Panel Survey
• Hire a group of panelists, or use crowdsourcing
• Collect feedback on all aspects of service quality
• Incurs the cost of hiring and maintaining a panel
9. Data Collection for Online Service (2)
• Direct User Feedback
• Request real-time feedback from customers
• Typically low response rate, with potential nuisance to users
• Widely used for personalized services (e.g., recommendation)
(Diagram: Panel Survey, User Behavior, User Feedback)
11. How do major online service companies collect data for measurement?
12. Search Engine: Google / Bing
• Early stage: panel-based survey
• Late stage: user behavior-based experiments
• Source: Google
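The behavior-based experiments mentioned here are typically online A/B tests. As a minimal sketch of the statistics behind one (the click counts below are made up for illustration, not taken from any real experiment), a two-proportion z-test compares the click-through rates of a control and a treatment:

```python
from math import sqrt

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic for comparing the click-through rates of two variants."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)            # pooled rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))  # pooled std. error
    return (p_a - p_b) / se

# Hypothetical experiment: 10,000 impressions per variant.
z = two_proportion_z(520, 10_000, 480, 10_000)
print(round(z, 2))  # |z| > 1.96 would be significant at the 5% level
```

Here |z| is about 1.30, so this made-up CTR difference would not reach significance; production experiments rely on far larger samples and more refined satisfaction metrics than raw clicks.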
14. Social Network: Facebook
• Before: use only user behavior
• Nowadays: user behavior + panel survey + user feedback
• Source: Slate / Quora
“We could surface content users are actually satisfied with, instead of click-bait, by using panel surveys and user feedback in addition to signals from user behavior.”
- Julie Zhuo, Product Design VP at Facebook
17. Movie Recommendations from Netflix
Algorithm A vs. Algorithm B
Can you tell which algorithm is better? Even the users themselves can't!
18. Movie Recommendations from Netflix (2)
The results below are more relevant, but users engage more with the ones above
19. So, how should I collect data for my service?
Service characteristics:
• What signals can we extract from user behavior?
• Are there incentives for users to provide feedback?
Feasibility of collection:
• Do you already have a substantial volume of active users?
• Can a panel evaluate the user experience as a substitute?
Cost of collection:
• Do you have a marketing budget for building a user base, or for a panel survey?
21. Evaluation based on user behavior
• Which result did users click?
• Is a click the only measure of satisfaction?
• How long did a user stay on a result?
• Is a longer dwell time always better?
• Do users perform searches repeatedly?
• Does loyalty mean satisfaction?
User behavior is an important clue, but a noisy one.
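The questions on this slide can be made concrete as simple log-derived metrics. A minimal sketch, assuming a hypothetical log schema with `clicked` and `dwell_seconds` fields; the 30-second SAT-click threshold is a common convention in the IR literature, not a universal rule:

```python
SAT_DWELL_SECONDS = 30  # common heuristic: a >=30s dwell suggests satisfaction

def sat_click_rate(log):
    """Fraction of clicks whose dwell time suggests a satisfied user."""
    clicks = [r for r in log if r["clicked"]]
    if not clicks:
        return 0.0
    sat = sum(r["dwell_seconds"] >= SAT_DWELL_SECONDS for r in clicks)
    return sat / len(clicks)

log = [
    {"query": "crowdsourcing", "clicked": True,  "dwell_seconds": 95},
    {"query": "crowdsourcing", "clicked": True,  "dwell_seconds": 4},
    {"query": "crowdsourcing", "clicked": False, "dwell_seconds": 0},
]
print(sat_click_rate(log))  # 0.5
```

Even this tiny example shows the noise problem: the 4-second click counts as dissatisfaction, but the user may simply have found the answer instantly.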
22. How can you design a panel survey for SERP evaluation?
How would you evaluate the search results for the query ‘crowdsourcing’?
Bad
Good
Excellent
Perfect
Q: Why do you think so?
23. Alternative: Evaluating a Webpage
How would you evaluate the search results for the query ‘crowdsourcing’?
Bad
Good
Excellent
Perfect
Q: Why do you think so?
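Graded labels like these are usually rolled up into a ranking metric such as DCG. A sketch, assuming the common gain mapping Bad=0, Good=1, Excellent=2, Perfect=3 (actual mappings vary by team):

```python
import math

GAIN = {"Bad": 0, "Good": 1, "Excellent": 2, "Perfect": 3}

def dcg(labels):
    """Discounted Cumulative Gain over a ranked list of judged results."""
    return sum(GAIN[label] / math.log2(rank + 2)   # rank 0 -> discount of 1
               for rank, label in enumerate(labels))

# Hypothetical judgments for a SERP, top result first:
print(round(dcg(["Perfect", "Good", "Bad", "Excellent"]), 2))  # 4.49
```

The log discount encodes the assumption that results further down the page matter less, which is exactly why behavioral clicks alone (heavily biased toward the top position) need judged labels as a complement.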
24. Alternative: Side-by-Side SERP Evaluation
Q: How would you compare two results?
Left much better
Left slightly better
About the same
Right slightly better
Right much better
Q: Why do you think so?
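Side-by-side judgments are typically aggregated into a per-query preference score. A minimal sketch, mapping the five options onto a -2..+2 scale (the numeric values are an illustrative assumption, not a standard):

```python
SCALE = {
    "Left much better":       2,
    "Left slightly better":   1,
    "About the same":         0,
    "Right slightly better": -1,
    "Right much better":     -2,
}

def mean_preference(votes):
    """Mean judge preference: > 0 favors the left SERP, < 0 the right."""
    return sum(SCALE[v] for v in votes) / len(votes)

votes = ["Left much better", "About the same",
         "Right slightly better", "Left slightly better"]
print(mean_preference(votes))  # 0.5 -> judges lean toward the left SERP
```

A follow-up significance test across many queries would then decide whether the new ranker actually wins overall.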
26. Summary…
• As a first step in data science, plan on collecting high-quality data
• Combine various data collection methods depending on the characteristics and lifecycle of the service
• It takes a lot of care to get a panel survey done right
27. For more information…
• What you need to know about data even if you’re not a Data Scientist
• SIGIR’2015 Tutorial on Offline Search Evaluation
• Offline Evaluation for Information Retrieval
Foundations and Trends in Information Retrieval (to appear)
Editor's Notes
The title of this event is ‘How We Use Data’; today I will focus on data collection.
As anyone who has worked with data will agree, once you have good data, processing and using it becomes relatively easy.
Most of these are data problems.
There are many types of data, but the key is measuring customers' responses to the service.
The development process of an online service can be roughly divided into ~.
Various issues exist at each stage.
To aid understanding, let's take a restaurant as an example. What data can we obtain from customer behavior?
The missing data can be obtained through a panel survey.
A panel survey hires a panel that represents the customers and listens to their opinions.
These data collection methods can be applied directly to improving online services as well.
We have looked at two methods so far; what if we combine them? That is, collect feedback from users in real time.
If not done properly, however, this yields low response rates and may even annoy users.
Now let's look at how major online service companies use these data collection methods.
First, consider the case of search services, my own area of work.
A variety of experimental techniques are used to improve search services ~
(covered in more detail later)
Facebook reportedly used only user logs in its early days, but recently ~
By adding panel surveys and user feedback to user-behavior logs in its feed ranking, Facebook could surface content users are satisfied with instead of click-bait content.
Netflix reportedly uses different data for evaluating its search and its recommendation services.
One reason is that it is hard to evaluate the results of a personalized recommendation service with a survey.
For example, compare the results of two recommendation algorithms: even the users themselves cannot easily judge which is better!
No ground for comparison / What if the judge doesn’t understand the intent?
No ground for comparison / What if the judge doesn’t understand the intent?