This is a step-by-step instruction for controlled experiment design. I tried to simplify this complex and tedious process into a relatively simple, easy-to-follow recipe. These are the lecture slides I developed for CS3248: Design of Interaction Systems in the School of Computing, National University of Singapore.
Exploratory testing has gained significant attention in industry and research in recent years. However, as with many “buzzword” technologies, its introduction and application are not straightforward. Exploratory testing is not only black or white - scripted or exploratory - but covers all shades of grey in between. Within the EASE industrial excellence center, we have run an industrial workshop on exploratory testing that helps provide an understanding of how to choose feasible levels of exploration. We will present the concept of levels of exploration in exploratory testing, the outcomes of the workshop, and relevant empirical research findings on exploratory testing.
Industry-Academia Communication in Empirical Software Engineering - Per Runeson
Researchers in software engineering must communicate with industry practitioners, both engineers and managers. Communication may be about collaboration buy-in, problem identification, empirical data collection, solution design, evaluation, and reporting. To gain mutual benefit from the collaboration, ensuring relevant research and improved industry practice, researchers and practitioners must be good at communicating. The basis for a researcher to be good at industry-academia communication is firstly to be “bi-lingual”: understanding and being able to translate between these “languages” is essential. Secondly, it is about being “bi-cultural”: understanding the incentives in industry and academia, respectively, is a basis for finding balances between, e.g., rigor and relevance in the research. Time frames are another aspect that differs between the two cultures. Thirdly, the choice of communication channels is key to reaching the intended audience. A wide range of channels exists, from face-to-face meetings, via tweets and blogs, to academic journal papers and theses, each having its own audience and purposes. The keynote speech will explore the challenges of industry-academia communication, based on two decades of collaboration experiences, both successes and failures. It aims to support primarily the academic side of the communication, to help achieve industry impact through rigorous and relevant empirical software engineering research.
Popular Delusions, Crowds, and the Coming Deluge: End of the Oracle? - Bob Binder
Invited talk at the 20th CREST Open Workshop, The Oracle Problem for Automated Software Testing. University College London. May 21, 2012.
Covers pragmatic innovations for test oracles, a new oracle taxonomy, a characterization of test oracles, and open challenges.
A unit test is an automated piece of code that invokes a unit of work in the system and then checks a single assumption about the behavior of that unit of work. This presentation is all about unit testing, and both developers and testers will benefit from it.
For our Human-Computer Interaction (HCI) course project, we are developing a mobile app named "Announcer".
This is the project report of our "Announcer" mobile app.
Visit our blog to learn more:
yujinnohikari.blogspot.com
Prototyping software credit: justinmind.com
Chapter 9: Evaluation techniques
from
Dix, Finlay, Abowd and Beale (2004).
Human-Computer Interaction, third edition.
Prentice Hall. ISBN 0-13-239864-8.
http://www.hcibook.com/e3/
Using Touchscreen Operant Systems to Study Cognitive Behaviors in Rodents - InsideScientific
In this exclusive webinar sponsored by Lafayette Instrument, experts discuss novel rodent Touchscreen systems, referred to as Bussey-Saksida Systems, in terms of the animal environment, various behavioral Tasks, data analysis, methodology and prescribed best practices. In addition, information on integrating Touchscreen behavioral Tasks with video tracking, optogenetics and electrophysiology is shared.
Background Information:
Translational neuroscience has driven the need to create validated rodent and primate touchscreen ‘Tasks’ designed to mimic similar tests in clinical research. These Tasks cover many different aspects of cognitive behavior; one example is PAL (Paired Associate Learning), a task shown to be very sensitive in detecting early onset of Alzheimer’s disease in humans. This well-established human task displays six different images in six locations on a touchscreen; the subject has to remember where each image was shown. If a mistake is made, the images are shown to the subject again, introducing learning memory into the Task.
In the rodent version of the Task, subjects are first trained to touch the screens, then to initiate a task, and finally to learn which location each of three images belongs to. The animal environment is optimized to focus the subject's attention on the Task displayed. In addition, to better understand the cognitive processes at play in the touchscreen Tasks, these systems have been further developed to integrate behavioural tests with electrophysiology recording, optogenetic stimulation, or video tracking.
Moderated vs Unmoderated Research: It’s Time to Say ELMO (Enough, Let’s Move On!) - UserZoom
Does this sound familiar? Researchers sitting around a meeting table arguing about which methods to use, especially when it comes to unmoderated remote testing vs moderated? Usually without any empirical data?
In this webinar we'll give you the power of data to say "ELMO!" (Enough, let’s move on!) and end the argument once and for all.
We collected this data by conducting 10 moderated and 10 unmoderated remote sessions across six tasks on Patagonia.com, in order to show how moderated and unmoderated remote studies compare in terms of the number and severity of usability issues surfaced.
Register for this upcoming webinar and discover the theoretical and actual strengths and weaknesses of various user research methods to stop the argument before it even begins.
User Experiments in Human-Computer Interaction - Dr. Arindam Dey
This lecture covers the basics of user experiment design in human-computer interaction. Computer scientists and developers often create interfaces for a particular purpose. This lecture explains how a user experiment can be designed and conducted to systematically compare one interface with the other.
These slides provide an introduction to usability testing. This well-known method in user-centred design is used to improve products by having participants interact with them and by measuring their performance and responses.
I presented this topic as a guest lecturer to first-year Psychology students at the University of Twente on February 6, 2017. Providing examples and best practices from Dutch digital design agency Mirabeau, I explained the required steps for the preparation, moderation, and analysis of usability tests. Moreover, I highlighted the importance of psychologists’ knowledge, (research) methods, and skills for design, which I believe to be invaluable.
Organizing Your First Website Usability Test - Cornell Drupal Camp 2016, part 4 - Anthony D. Paul
You’ve built a shiny, new Drupal site. You asked your grandma and your client if they like it and they both do. However, you’re lying awake at night wondering if you’re missing something—because you know you’re not the end user. You yearn for actionable feedback.
In this talk, I’ll distill my background in usability research into a how-to framework for taking your site and conducting your first unmoderated usability test. I’ll cover what to look for, best practices in facilitation, tools on the cheap, and how to glean the most from a brief window of time.
Organizing Your First Website Usability Test - WordCamp Toronto 2016 - Anthony D. Paul
You’ve built a shiny, new WordPress site. You asked your co-worker and your boss if they like it and they both do. However, you’re lying awake at night wondering if you’re missing something—because you know you’re not the end user. You yearn for actionable feedback. In this talk, I’ll distill my background in usability research into a how-to framework for taking your site and conducting your first unmoderated usability test. I’ll cover why and when you should be running usability tests; how to set research goals and draft a script for them; setting up your lab environment and capturing feedback; and best practices for facilitation, minimizing bias, keeping users on task and gleaning the most from each brief test.
Introduction to Usability Testing for Survey Research - Caroline Jarrett
The basics of how to incorporate usability testing into the development process of a survey. Workshop first presented at the SAPOR conference, Raleigh, North Carolina, USA, October 2011 by Emily Geisen of RTI and Caroline Jarrett of Effortmark.
Seungwon Hwang: Entity Graph Mining and Matching - Michael Shilman
This talk introduces the problem of matching web-scale entity graphs, such as multilingual name graphs and social network graphs, to solve difficult problems such as name translation or social ID finding. While existing approaches focus on using textual (or phonetic) similarity or Web co-occurrences, this approach combines the strengths of the two and significantly outperforms the state of the art. We present our evaluation results using real-life entity graphs.
Collective Intelligence
- Introduction
- Collective Intelligence
- Creative Research Practices
- Why you should take the course
- Assignment 1
- Feedback
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security-analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 2022.
Pushing the limits of ePRTC: 100ns holdover for 100 days - Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced performance... - SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
SAP Sapphire 2024 - ASUG301: Building better apps with SAP Fiori - Peter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Enhancing Performance with Globus and the Science DMZ - Globus
ESnet has led the way in helping national facilities—and many other institutions in the research community—configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
Observability Concepts EVERY Developer Should Know - DeveloperWeek Europe - Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and I will share these foundational concepts to build on:
A tale of scale & speed: How the US Navy is enabling software delivery from l... - sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio, using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30 May 2024. We discuss what testing is, what agile testing is, and finally what testing in DevOps is. We also held a lovely workshop with the participants, trying to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 - Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
Controlled Experiments - Shengdong Zhao
1. How to Design Controlled
Experiment in HCI?
Shengdong Zhao
NUS-HCI Lab
National University of Singapore
This material is mostly developed by Shengdong Zhao, with certain materials taken
from Maneesh Agarwala’s slides (used with permission). You are free to use the
material as long as the original authors are acknowledged.
2. Outline
The 5 Step Approach to Experiment Design
1. Define the research question
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
Steps 1 and 2 will be covered in the tutorial
Steps 3, 4, and 5 will be covered in future lectures
11. Video
• http://www.youtube.com/watch?v=bATkA0Usoio
Paper
– Shengdong Zhao, Pierre Dragicevic, Mark H. Chignell, Ravin Balakrishnan, Patrick Baudisch (2007). earPod: Eyes-free Menu Selection with Touch Input and Reactive Audio Feedback. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI), pp. 1395-1404.
13. The 5 Step Approach to
Experiment Design
1. Define the research question
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
14. The 5 Step Approach to
Experiment Design
1. Define the research question
– Step 1.1 Start with a general question
– Step 1.2 Define target population
– Step 1.3 Define task(s)
– Step 1.4 Define measure(s)
– Step 1.5 Define factor(s)
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
15. Step 1.1: Start with a General
Question
How does earPod
compare with iPod’s menu
in terms of performance?
16. Step 1.2: Define Target
Population
General question: How does earPod compare
with iPod’s menu in terms of performance?
Target population?
Question: for whom did we design earPod?
17. Step 1.3: Define Task(s)
General question: How does earPod compare
with iPod’s menu in terms of performance?
Task(s): menu selection
However, the menu selection task has endless
possibilities: single short menu, single long menu,
hierarchical menus
18. Step 1.3: Define Task(s)
Key insight: experiment design needs to decide
what subset of tasks is appropriate to test.
Question: how do you choose the subset?
19. Step 1.4: Define Measures
Question: How does earPod compare with iPod’s
menu in terms of performance?
Measures: performance
In HCI, we typically use three measures to quantify
performance:
– Speed
– Accuracy
– Learnability
Key insight: need to define “testable” measures
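To make these measures "testable", they can be computed directly from trial logs. A minimal Python sketch; the trial records, field layout, and numbers are hypothetical, not from the earPod study:

```python
# Sketch: computing the three common HCI performance measures
# (speed, accuracy, learnability) from hypothetical trial logs.
from statistics import mean

trials = [
    # (block, completion_time_sec, was_error)
    (1, 2.9, False), (1, 3.4, True),  (1, 3.1, False),
    (2, 2.4, False), (2, 2.6, False), (2, 2.8, True),
]

speed = mean(t for _, t, _ in trials)                        # mean completion time
accuracy = 1 - sum(err for *_, err in trials) / len(trials)  # fraction correct

# Learnability: drop in mean completion time from the first block to the last.
blocks = sorted({b for b, *_ in trials})
per_block = {b: mean(t for bb, t, _ in trials if bb == b) for b in blocks}
learning_gain = per_block[blocks[0]] - per_block[blocks[-1]]

print(round(speed, 2), round(accuracy, 2), round(learning_gain, 2))  # 2.87 0.67 0.53
```

A real analysis would aggregate per participant and condition before comparing techniques, but the definitions are the same.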
20. Step 1.5: Define (Other) Factors
Question: How does earPod compare with iPod’s
menu in terms of performance?
Factors: other than the different types of tasks, what
other factors can influence the measures?
Again: the number of factors is unlimited …
• Scenario of use
• Input device
• Background of the user
– Educational level
– Gender
– Ethnic background
– Age
– …
Key insight: experiment design needs to determine a
subset of factors to test.
Question: how to choose the factors?
21. Let’s Review Step 1
Step 1: Define the research
question
–Step 1.1: Start with a general
question
–Step 1.2: Define target population
–Step 1.3: Define task(s)
–Step 1.4: Define measure(s)
–Step 1.5 Define factor(s)
22. Let’s Practice
Example 1: “earPod vs. iPod”
1.1: General Question
– How does earPod compare with iPod’s menu
in terms of performance?
1.2: Target Population
– Young generation
1.3: Task(s)?
– e.g., Menu selection for three types of breadth
(4, 8, 12) and two types of depth (1, 2),
content of the menu is from common
categories
1.4: Measures?
– Speed, accuracy, learning
1.5: (Other) Factors?
– Single-task vs. multi-tasking
– …
23. Let’s Practice Again
Example 2: “Opti” vs. “Qwerty” Keyboard
1.1: General question
– How does the “opti” keyboard layout compare with the “qwerty”
keyboard in performance?
1.2: Target Population?
– Computer users?
1.3: Task(s)?
– Type “the quick brown fox jumps over the lazy dog”
1.4: Measure(s)?
– Speed, accuracy, learning
1.5: (Other) Factors?
– Device: Touch typing vs. stylus?
– Screen size: different screen size?
24. The 5 Step Approach to
Experiment Design
1. Define the research question
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
25. Step 2: Define Variables
Type of variables
• Independent variable (IV)
– Factors that are manipulated in the experiment
– Have multiple levels
• Dependent variable (DV)
– Factors which are measured
• Control variable
– Attributes that will be fixed throughout experiment
– Confound – attribute that varied and was not accounted for
• Problem: Confound rather than IV could have caused change in DVs
– Confounds make it difficult/impossible to draw conclusions
• Random variable
– Attributes that are randomly sampled
– Increases generalizability
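One way to keep these four roles explicit, and to catch unassigned (potentially confounding) attributes early, is to record each variable's role when planning the study. A minimal sketch; the `Variable` class and the example entries are illustrative, not part of the slides:

```python
# Sketch: tagging each experiment variable with its role
# (independent, dependent, control, random).
from dataclasses import dataclass, field

@dataclass
class Variable:
    name: str
    role: str                                   # "independent", "dependent",
                                                # "control", or "random"
    levels: list = field(default_factory=list)  # only IVs have manipulated levels

design = [
    Variable("technique", "independent", ["earPod", "iPod"]),
    Variable("completion_time", "dependent"),
    Variable("room_lighting", "control"),
    Variable("participant_age", "random"),
]

ivs = [v.name for v in design if v.role == "independent"]
print(ivs)  # ['technique']
```

Any attribute of the setup that appears in none of the four lists is a candidate confound and should be assigned a role before running the experiment.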
26. Type of Independent
Variables
• Primary
– The most important independent variable(s) that you
want to investigate
– For the question “How does earPod compare with
iPod’s menu in terms of performance?”, the primary
focus of interest is the type of
device/technique (earPod vs. iPod), so the primary IV
is device/technique
• Secondary
– The other interesting factors you want to manipulate
in the experiment. They help to answer the main
question in a richer way. For example, a secondary IV
for the earPod vs. iPod experiment could be the
scenario of use (stationary vs. mobile). This variable
helps to answer the primary question (“how does
earPod compare with iPod?”) in a richer way: earPod
may work better while mobile, and iPod better while
stationary, etc.
28. Let’s Try
Example 1: earPod vs. iPod
• Independent variables
– Technique
• 2 levels (earPod vs. iPod)
– usage scenario
• 2 levels (single-task vs. dual-task)
– menu breadth
• 3 levels (4, 8, 12)
– menu depth
• 2 levels (1, 2)
• Dependent variables
– Speed (measured in completion time)
– Accuracy (measured in percentage of errors)
– Learning (measured in speed & accuracy change over
time)
29. Let’s Try
Example 1: earPod vs. iPod
• Control variables
– Same computer, experiment time, environment,
instruction, etc.
• Random variables
– Attributes of participants: age, gender, background,
etc.
30. Let’s Try Again
Example 2: “Opti” vs. “Qwerty” keyboard layout
• Independent variables
– Type of keyboard
• 2 levels (opti vs. qwerty)
– Input method
• 2 levels (touch vs. stylus)
– Screen size
• 3 levels (watch, mobile phone, tablet)
• Dependent variables
– Speed (measured in word per minute)
– Accuracy (measured in?)
– Learning (measured in speed & accuracy change
over time)
31. Example 2
Example 2: “Opti” vs. “Qwerty” keyboard layout
• Control variables
– Same computer, experiment time, environment,
instruction, etc.
• Random variables
– Attributes of participants: age, gender, background,
etc.
32. How to Design Controlled
Experiment in HCI?
Part 2
Shengdong Zhao
NUS-HCI Lab
National University of Singapore
33. Review: The 5 Step Approach
to Experiment Design
1. Define the research question
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
34. Let’s Review Step 1: Define
the Research Question
1. Define the research question
– Step 1.1 Start with a general question
– Step 1.2 Define target population
– Step 1.3 Define task(s)
– Step 1.4 Define measure(s)
– Step 1.5 Define factor(s)
37. Let’s Review Step 1
Example 1: “earPod vs. iPod”
1.1: General Question
– How does earPod compare with iPod’s menu
in terms of performance?
1.2: Target Population
– Mobile device users (mostly young
generation)
1.3: Task(s)?
– e.g., Menu selection for three types of breadth
(4, 8, 12) and two types of depth (1, 2),
content of the menu is from common
categories
1.4: Measures?
– Speed, accuracy, learning
1.5: (Other) Factors?
– Single-task vs. multi-tasking
– …
38. Let’s Review Step 2
Example 1: earPod vs. iPod
• Independent variables
– Technique
• 2 levels (earPod vs. iPod)
– usage scenario
• 2 levels (single-task vs. dual-task)
– menu breadth
• 3 levels (4, 8, 12)
– menu depth
• 2 levels (1, 2)
• Dependent variables
– Speed (measured in completion time)
– Accuracy (measured in percentage of errors)
– Learning (measured in speed & accuracy change over
time)
39. Let’s Review Step 2
Example 1: earPod vs. iPod
• Control variables
– Same computer, experiment time, environment,
instruction, etc.
• Random variables
– Attributes of participants: age, gender, background,
etc.
40. Confounding Variable
Any variable other than the independent variables that can
possibly explain the change in measures
Example 1 – three techniques are compared (A, B, C)
• All participants are tested on A, followed by B, followed
by C
– Performance might improve due to practice
– “Practice” is a confounding variable (because it explains the
changes in measures but it is not an IV)
Example 2 – two search engine interfaces are compared
(Google vs. new)
• All participants have prior experience with Google, but
no experience with the new interface
– “Prior experience” is a confounding variable
Note: Practice & Prior experience are two important
confounding variables we need to control. More on this
topic later …
41. The 5 Step Approach to
Experiment Design
1. Define the research question
2. Determine variables
3. Arrange conditions
From Independent Variables to Experimental Conditions
4. Decide blocks and trials
5. Set instruction and procedures
42. What is a condition?
• Let’s start with an example
• A particular independent variable “Technique”
has two levels: earPod and iPod.
– If it is the only independent variable considered, this
experiment has two conditions
• However, an experiment rarely only has 1
independent variable, suppose there is another
independent variable “Menu Breadth” with 3
levels (4, 8, 12).
– There are 2 (Techniques) x 3 (Menu Breadth) = 6
experimental conditions
– Each unique combination of the different levels of
the various independent variables (such as earPod, 4)
is an experimental condition
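As a minimal sketch (in Python, with illustrative variable names), the experimental conditions are simply the cross product of the levels of each independent variable:

```python
from itertools import product

techniques = ["earPod", "iPod"]   # IV 1: 2 levels
menu_breadths = [4, 8, 12]        # IV 2: 3 levels

# Each unique combination of levels is one experimental condition.
conditions = list(product(techniques, menu_breadths))
print(len(conditions))   # 2 (Techniques) x 3 (Menu Breadth) = 6
```

With more IVs, pass more level lists to product(); the number of conditions multiplies accordingly.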
43. How can We Test These
Conditions?
• Method 1:
– Recruit 6 participants, one for each condition (this is
also called between-subject design, which means
the conditions are tested between different subjects)
• P1: earPod, 4
• P2: earPod, 8
• P3: earPod, 12
• P4: iPod, 4
• P5: iPod, 8
• P6: iPod, 12
– What’s the problem with this approach?
• What about individual differences?
• To balance individual differences, we need lots of
participants
• Key insight: this method is expensive
44. How can We Test It?
• Method 2:
– Recruit the same participants to test all 6 conditions
(this is also called within-subject design since all
conditions are tested within the same subject)
– This method is much more economical
– What’s the problem with this approach?
• Practice (or order effect) as a confounding variable
• However, in many cases, this effect can be controlled
45. Control Order Effect using
Counter-balancing
If we assume the order effect is symmetric, which
means A -> B = B->A, and is linear, which means the
increment between different conditions is about the
same, we can use counter-balancing to cancel the
effect out.
E.g., we assume the transfer effects of (A
after B) and (B after A) are both 10
Participant 1: A followed by B (A B)
Participant 2: B followed by A (B A)
Observation: the order effect equally affects both A
and B, so the absolute relationship between A and B is
not changed.
However, a minimum number of participants is
needed for counter-balancing to work
46. Counter-balancing with 3
Levels
What if an IV has 3 levels: A, B, C?
If the same assumptions hold (effects are
symmetric and equal in size), we need to counter-
balance as follows:
P1: A B C
P2: A C B
P3: B A C
P4: B C A
P5: C A B
P6: C B A
What about 4 levels, 5 levels, 6 levels, …?
4 levels = 4! (24) orders, 5 levels = 5! (120) orders, …
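The factorial growth above can be checked with a short Python sketch (illustrative names; itertools.permutations enumerates every possible presentation order):

```python
from itertools import permutations
from math import factorial

levels = ["A", "B", "C"]
orders = list(permutations(levels))   # every possible presentation order
print(len(orders))                    # 3! = 6

# Full counter-balancing needs (a multiple of) one participant per order,
# so the requirement grows factorially with the number of levels:
for n in range(3, 7):
    print(n, "levels ->", factorial(n), "orders")
```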
47. Introducing Partial Counter-
balancing: Latin Square
Latin square:
– Ensures each level appears in every position in order
equally often:
ABC
BCA
CAB
Assume A-B = B-A = A-C = C-A = B-C = C-B = 10:
P1: a + (b+10) + (c+20)
P2: b + (c+10) + (a+20)
P3: c + (a+10) + (b+20)
Average A = (3a+30)/3 = a + 10
Average B = (3b+30)/3 = b + 10
Average C = (3c+30)/3 = c + 10
Every level gets the same +10 offset, so the comparison stays fair.
However, if A-B = B-A = 10, A-C = C-A = 20, B-C = C-B = 30:
P1: a + (b+10) + (c+50)
P2: b + (c+30) + (a+30)
P3: c + (a+20) + (b+40)
Average A = (3a+50)/3 = a + 50/3
Average B = (3b+50)/3 = b + 50/3
Average C = (3c+80)/3 = c + 80/3
The offsets no longer match (C carries a larger one), so a Latin
square does not fully cancel unequal order effects.
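A short Python sketch (hypothetical transfer values taken from this slide) averages the order-effect offset each condition receives across the three Latin-square rows, showing when the offsets cancel and when they do not:

```python
# Offsets added by order effects in a 3x3 Latin square (rows ABC, BCA, CAB).
# transfer[(X, Y)] = carry-over from having done X before Y.
square = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]

def average_offsets(transfer):
    totals = {"A": 0, "B": 0, "C": 0}
    for order in square:
        for i, item in enumerate(order):
            for earlier in order[:i]:          # every preceding condition
                totals[item] += transfer[(earlier, item)]
    return {k: v / len(square) for k, v in totals.items()}

pairs = [(x, y) for x in "ABC" for y in "ABC" if x != y]
symmetric = {p: 10 for p in pairs}             # all transfers equal
print(average_offsets(symmetric))              # {'A': 10.0, 'B': 10.0, 'C': 10.0}

asymmetric = dict(symmetric)
asymmetric.update({("A", "C"): 20, ("C", "A"): 20,
                   ("B", "C"): 30, ("C", "B"): 30})
print(average_offsets(asymmetric))             # A and B get 50/3, but C gets 80/3
```

When all transfers are equal, every condition receives the same offset; when they differ, the offsets differ, so a Latin square only partially controls unequal order effects.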
48. Steps for Arranging Conditions
for Within-Subject Design
3.1: List all Independent Variables and their levels
3.2: Decide counter-balancing strategy for each
variable
3.3: Determine the minimum No. of participants
3.4: Arrange the overall design
3.5: Determine detailed arrangement for each
participant
49. Example 1: earPod vs. iPod
Assume we have three IVs
Step 3.1: list the IV and their levels
– Technique (2 levels: earPod, iPod)
– Scenario of use (2 levels: single-task, multi-task)
– Menu depth (2 levels: 1, 2)
Step 3.2: determine counter-balancing strategies
for each IV
– Choices: 1) fully counter-balancing, 2) Latin-square,
3) no counter-balancing (sequential)
– Question: how do we decide which strategy to use?
• It depends on how interesting the independent variable is
• It depends on how many resources we have
50. Example 1: earPod vs. iPod
Step 3.2: Counter-balancing strategies
– Technique (fully counter-balance)
– Scenario of use (fully counter-balance)
– Menu depth (no counter-balance, sequential)
– Why?
Step 3.3: Determine the minimum No. of
participants
– Minimum No. = 2 Tech. conditions X 2 Scenario
conditions X 1 Menu depth arrangement = 4
– Question: if Menu depth is also fully counter-
balanced, how many participants do we need?
– Question: if Technique has 3 levels and is fully
counter-balanced (assume Menu depth is not counter-
balanced), how many participants do we need?
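The minimum-participant calculation in Step 3.3 can be sketched in Python (the strategy names and the orderings() helper are illustrative, not a standard API):

```python
from math import factorial

# Minimum participants = product of the number of distinct orderings
# contributed by each IV's counter-balancing strategy.
def orderings(levels, strategy):
    if strategy == "full":      # all permutations of the levels
        return factorial(levels)
    if strategy == "latin":     # one Latin-square row per level
        return levels
    return 1                    # sequential: a single fixed order

ivs = [(2, "full"),   # Technique: fully counter-balanced -> 2 orders
       (2, "full"),   # Scenario:  fully counter-balanced -> 2 orders
       (2, "none")]   # Menu depth: sequential            -> 1 order

minimum = 1
for levels, strategy in ivs:
    minimum *= orderings(levels, strategy)
print(minimum)   # 2 x 2 x 1 = 4 participants minimum
```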
51. Step 3.4: Determine the overall arrangement
Technique orders: (T1, T2) or (T2, T1)
×
Scenario orders: (Single-task, Multi-task) or (Multi-task, Single-task)
Step 3.5: Determine arrangement for each
participant
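Steps 3.4 and 3.5 can be sketched as a cross product of the counter-balanced orders, yielding one arrangement per participant (Python, with illustrative labels):

```python
from itertools import product

technique_orders = [("T1", "T2"), ("T2", "T1")]
scenario_orders = [("Single-task", "Multi-task"),
                   ("Multi-task", "Single-task")]

# Crossing the two counter-balanced orders yields one arrangement
# per participant (4 in total, matching the minimum from Step 3.3).
for pid, (t_order, s_order) in enumerate(
        product(technique_orders, scenario_orders), start=1):
    print(f"P{pid}: techniques {t_order}, scenarios {s_order}")
```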
52. In-class Exercise: Example 2
Step 3.1: List IVs
– Technique (3 levels: A, B, C)
– Scenario of use (2 levels: single-task, multi-task)
Step 3.2: Decide counter-balancing strategy
Step 3.3: Determine Minimum No. of Participants
Step 3.4: Determine the overall arrangement
Step 3.5: Determine individual arrangement for
each participant
54. However, Counter-balancing May
not Always Work
Counter-balancing assumes symmetric transfer
and linear increment
– A-B transfer == B-A transfer
– A-B transfer == B-C transfer
If transfer is asymmetric or increments are non-linear
– i.e., A-B transfer > or < B-A transfer, or A-B transfer ≠ B-C transfer
– We have to use a between-subjects design
– In addition, some factors have to be between-subject
• Age, Gender, etc.
55. No. of Condition Reduction
Strategies
In experiment design, one major problem we often face
is that there are many possible relevant factors. It’s
important for experiment designers to pick the most
important/interesting factors to test.
Run a few independent variables at a time
– If strong effect, include variable in future studies
– Otherwise pick fixed control value for it
Not all within-subject IVs need to be counter-balanced
– If we are not interested in the absolute difference among
different levels, we don’t need to counter-balance. E.g.,
Menu Breadth, Menu Depth, etc.
56. Exercise: earPod vs. iPod
• Independent variables
– Technique (2 levels: earPod vs. iPod)
– Usage scenario (2 levels: single vs multi-tasking)
– Menu breadth (3 levels: 4, 8, 12)
– Menu depth (2 levels: 1, 2)
• Question: which of these factors need counter-
balancing?
57. Let’s Review: Between- vs.
Within-Subject Design
• Method 1: use a lot of participants, randomly
assign them to each technique (between-subject
design)
– Drawback: costly
• Method 2: use the same participant to test both
techniques (within-subject design)
– Drawback: practice effect
59. Steps for Arranging Conditions
for Within-Subject Design
3.1: List all Independent Variables and their levels
3.2: Decide counter-balancing strategy for each
variable
3.3: Determine the minimum No. of participants
3.4: Arrange the overall design
3.5: Determine detailed arrangement for each
participant
60. The 5 Step Approach to
Experiment Design
1. Define the research question
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
61. Definitions
• Trial
– A single repetition of a single condition/cell
– A number of trials are used to increase reliability
• Block*
– An entire section of the experiment
– Repeated to analyze learning
* Block has other definitions. This is a simplified definition for the
purpose of this assignment.
62. Trials in each block: same content,
with order randomized
Block: same arrangement, repeated
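The block/trial arrangement can be sketched in Python (the condition tuples are illustrative, and the seed is fixed only to make the example reproducible):

```python
import random

# Illustrative conditions for one participant: breadth x depth for earPod.
conditions = [("earPod", b, d) for b in (4, 8, 12) for d in (1, 2)]
trials_per_condition = 4
n_blocks = 3

random.seed(42)   # fixed seed so the arrangement is reproducible
blocks = []
for _ in range(n_blocks):
    # Same content in every block; only the presentation order changes.
    block = conditions * trials_per_condition
    random.shuffle(block)
    blocks.append(block)

print(len(blocks), "blocks of", len(blocks[0]), "trials each")
```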
64. Determine Number of
Blocks/Repetitions
• Reasonable experiment duration
– Time Constraint and Fatigue
– Typically within 1 hour
• However, after pre- and post-experiment interviews,
only about 45 minutes are left for the main experiment
– In some cases, up to 2 hours
• Enough data points for significant effects
65. Step 4: Determine Blocks and
Trials
• Step 4.1: estimate the time for each trial
(typically at least 3 trials per condition)
• Step 4.2: estimate the time for each block
• Step 4.3: balance the trials and blocks so that
the main part of the experiment is within 45
minutes
• Step 4.4: combine with the condition
arrangement
66. Exercise
Full experiment design: earPod vs. iPod
Independent variables
– Technique (2 levels: earPod vs. iPod)
– Usage scenario (2 levels: single vs multi-tasking)
– Menu breadth (3 levels: 4, 8, 12)
– Menu depth (2 levels: 1, 2)
Block = 3
Trials per condition = 4
Each trial takes roughly 10 seconds to finish
Question: how is the experiment arranged?
Question: how long will the experiment take?
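One way to work out the exercise (a back-of-envelope Python sketch; the numbers are taken from the slide):

```python
# Back-of-envelope duration estimate for the earPod vs. iPod exercise.
conditions = 2 * 2 * 3 * 2      # technique x scenario x breadth x depth = 24
blocks = 3
trials_per_condition = 4
seconds_per_trial = 10

total_trials = conditions * trials_per_condition * blocks
total_minutes = total_trials * seconds_per_trial / 60
print(total_trials, "trials,", total_minutes, "minutes")  # 288 trials, 48.0 minutes
```

At 48 minutes this slightly exceeds the 45-minute guideline from Step 4.3, so the designer might trim blocks or trials per condition.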
67. The 5 Step Approach to
Experiment Design
1. Define the research question
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
68. Detailed Steps
Step 5.1: Recruit participants (determine target
users and randomize)
Step 5.2: Consent form and pre-experiment
questionnaire
Step 5.3: Instructions
Step 5.4: Practice trials
Step 5.5: Main experiment with breaks
Step 5.6: Post-experiment questionnaire and
interview
Step 5.7: Debriefing
70. The Participants’ Standpoint
Testing is a distressing experience
– Pressure to perform
– Feeling of inadequacy
– Looking like a fool in front of your peers, your boss,…
Golden rule:
subjects should always
be treated with respect!!!
(from “Paper Prototyping” by Snyder)
71. Treating Subjects With Respect
Follow human subject protocols
– Individual test results will be kept confidential
– Users can stop the test at any time
– Users are aware of (and understand) the monitoring technique
– Their performance will have no implication on their life
– Records will be made anonymous
• Videos
Use standard informed consent form
– Especially for quantitative tests
– Be aware of legal requirements
72. Conducting the Experiment
Before the experiment
– Have them read and sign the consent form
– Explain the goal of the experiment
• In a way accessible to users
• Be careful about the demand characteristic
– Participants biased towards experimenter’s hypothesis
– Answer questions
During the experiment
– Stay neutral
– Never indicate displeasure with users’ performance
After the experiment
– Debrief users
• Inform users about the goal of the experiment
– Answer any questions they have
73. The Importance of Practice
Trials
• earPod
– New technique, no one has
seen it
• iPod
– Existing technique, many
people used or seen it
• Question: how do we control
this?
74. Pilot Study and Protocols
Always pilot it first!
– Reveals unexpected problems
– Can’t change experiment design after starting it
Always follow same steps – use a checklist
Get consent from subjects
Debrief subjects afterwards
75. Let’s Review the Entire
Process
1. Define the research question
2. Determine variables
3. Arrange conditions
4. Decide blocks and trials
5. Set instruction and procedures
76. Step 1: Define the Research
Question
Defining the research question has 5 sub-steps
Step 1.1 Start with a general question
Step 1.2 Define the target population
Step 1.3 Define task(s)
Step 1.4 Define measure(s)
Step 1.5 Define factor(s)
78. Step 3: Arranging Conditions for
Within-Subject Design
3.1: List all Independent Variables and their levels
3.2: Decide counter-balancing strategy for each
variable
3.3: Determine the minimum No. of participants
3.4: Arrange the overall design
3.5: Determine detailed arrangement for each
participant
79. Step 4: Determine Blocks and
Trials
• Step 4.1: estimate the time for each trial
(typically at least 3 trials per condition)
• Step 4.2: estimate the time for each block
• Step 4.3: balance the trials and blocks so that
the main part of the experiment is within 45
minutes
• Step 4.4: combine with the condition
arrangement
80. Step 5: Set Instruction &
Procedures
Step 5.1: Recruit participants (determine target
users and randomize)
Step 5.2: Consent form and pre-experiment
questionnaire
Step 5.3: Instructions
Step 5.4: Practice trials
Step 5.5: Main experiment with breaks
Step 5.6: Post-experiment questionnaire and
interview
Step 5.7: Debriefing
82. Next Time
• Don’t forget to submit G1 by Sunday 23:59
• Bring your questions to tutorial next Tuesday
• We will teach you storyboarding, sketching, and
low-fidelity prototyping in the lab and lecture next
week