Understanding the impact of a search system’s response latency on its users’ searching behaviour has recently been an active research topic in the information retrieval and human-computer interaction areas. Along the same lines, this paper focuses on the user impact of search latency and makes the following two contributions. First, through a controlled experiment, we reveal the physiological effects of response latency on users and show that these effects are present even at small increases in response latency. We compare these effects with the information gathered from self-reports and show that the physiological measures capture the nuanced attentional and emotional reactions to latency much better. Second, we carry out a large-scale analysis using a web search query log obtained from Yahoo to understand the change in the way users engage with a web search engine under varying levels of increasing response latency. In particular, we analyse the change in the click behaviour of users when they are subject to increasing response latency and reveal significant behavioural differences.
1. Unconscious Physiological Effects of Search
Latency on Users and Their Click Behaviour
Miguel Barreda-Ángeles (Eurecat), Ioannis Arapakis (Yahoo Labs), Xiao Bai (Yahoo Labs),
B. Barla Cambazoglu (Yahoo Labs), Alexandre Pereda-Baños (Eurecat)
2. Introduction
§ The core research in IR has been on improving the efficiency of
search systems with the eventual goal of satisfying the
information needs of users
§ Most research in this direction has had a very system-oriented
viewpoint
§ The impact of efficiency improvements on users’ searching
behaviour and experience has been left unexplored
3. Human Information Processing
§ We are not consciously aware of the
mental processes determining our
behaviour
§ Such unconscious influences range
from basic or low-level mental
processes to high-level psychological
processes like motivations,
preferences, or complex behaviours
5. Web Search Latency
§ Previous research in the context of web search has shown that
response latency values lower than a certain threshold go
unnoticed by users
§ These conclusions are based on self-report methods, which are
inherently limited, since users cannot provide information that is
not consciously available to them
§ We cannot completely dismiss the possibility that even small
latency increases affect the web search experience
6. Study Focus
§ Impact of response latency increase on user behaviour in web
search
§ Smaller latency values (≤1000ms) that may not be consciously
perceived by users
§ We employ two different yet complementary approaches:
• a small-scale controlled user study
• a large-scale query log analysis
8. Experimental Design
§ Repeated-measures design
§ One independent variable
• search latency* (with four levels in milliseconds: 0, 500, 750, and 1,000)
§ 19 participants (female = 2, male = 17)
§ Dependent variables:
• experienced positive and negative affect
• level of focused attention
• perceived system usability
• participants’ physiological responses
* Search latency was adjusted by a desired amount using a custom-made JS script deployed via Greasemonkey.
9. Procedure
§ Participants performed four search tasks
• evaluate the performance of four different backend search systems
• submit as many navigational queries as possible from a list of 200 randomly
sampled web domains
• for each query they were asked to locate the target URL among the first ten
results of the SERP
§ Training queries were used to allow participants to familiarize
themselves with the “default” search site speed
10. Psychophysiological Measures of Engagement
§ User Engagement Scale (UES)
• Positive affect (PAS)
• Negative affect (NAS)
• Perceived usability
• Felt involvement and focused attention
§ IBM’s Computer System Usability Questionnaire
(CSUQ)
• System usefulness (SYSUSE)
§ Electrodermal activity (EDA)
§ Electromyography [corrugator supercilii] (EMG-CS)
11. Characteristics of Psychological Methods
§ Helpful in unveiling attentional and emotional reactions not
consciously available to us
§ Offer high temporal and spatial resolution
§ Robust against cognitive biases (e.g., social desirability bias*)
§ Always provide “honest” responses
§ No direct question to the subject, no direct answer
§ The information on the research questions has to be inferred
from the variations in the physiological signals and the way they
are related to psychological constructs
* The tendency of survey respondents to answer questions in a manner that will be viewed favorably by others.
12. Physiological Data
§ Mixed multilevel models (a regression-based approach)
• allows comparison of data at different levels
• Level 1: conditions within-subjects
• Level 2: subjects
• allows including random terms in the model for random factors
• random intercepts for between-subject variability; accounts for the difference in means between
subjects
• useful for physiological data, since between subject variability can be much larger than variability
due to experimental conditions, and, therefore, can mask it
• random slopes for the effects of time and order of presentation
• deals with autocorrelated data (e.g., physiological data)
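As a rough sketch of how such a model could be specified, the snippet below uses statsmodels’ mixed linear model; the long-format schema, the column names, and the file name are assumptions for illustration, not the authors’ actual pipeline.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant x query x second,
# with assumed columns: participant, latency (0/500/750/1000), second (1-10),
# order (presentation order), and eda (µS).
df = pd.read_csv("eda_long.csv")

# Latency and second enter as fixed categorical factors; each participant
# gets a random intercept (between-subject variability) plus random slopes
# for time and order of presentation, mirroring the slide. Note this sketch
# does not explicitly model residual autocorrelation.
model = smf.mixedlm(
    "eda ~ C(latency) + C(second)",
    df,
    groups=df["participant"],
    re_formula="~second + order",
)
result = model.fit()
print(result.summary())
```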
13. EDA Signal
§ Applied 200ms smoothing filter & artifact removal
§ A temporal series was constructed from each physiological signal
§ Averaged the data every 1-second period (480 points == ~ 8 minutes)
§ Each 10-second period following a query submission was visually
inspected for SCRs (skin conductance responses)
§ Data sample: 132 SCRs; 10 points (seconds) by SCR
[Figure: skin conductance (µS) over the 0-12 seconds after stimulus onset, illustrating the shape of a skin conductance response]
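A minimal sketch of this preprocessing chain follows; the 100 Hz sampling rate, the synthetic signal, and the moving-average form of the 200ms smoothing filter are all assumptions (the slide does not specify them), and artifact removal and SCR detection are omitted.

```python
import numpy as np
import pandas as pd

FS = 100  # assumed sampling rate (Hz); not stated on the slide
raw = 16 + 0.1 * np.random.randn(FS * 480)  # synthetic ~8-minute EDA trace (µS)
eda = pd.Series(raw, index=pd.timedelta_range(start="0s", periods=len(raw), freq="10ms"))

# 200 ms smoothing filter, here implemented as a centred moving average.
smoothed = eda.rolling(int(0.2 * FS), center=True, min_periods=1).mean()

# Average into 1-second bins: ~480 points for the ~8-minute session.
per_second = smoothed.resample("1s").mean()

# Extract the 10-second window following each query submission; the
# query times below are placeholders.
query_times = [pd.Timedelta(seconds=s) for s in (30, 90, 150)]
windows = [per_second[t:t + pd.Timedelta(seconds=10)].to_numpy() for t in query_times]
```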
14. EDA Signal
§ Factors considered in the model:
§ random intercept for participants
§ random slope for time and order of presentation
§ fixed factors:
§ latency (4 conditions)
§ seconds (10 seconds)
15. EDA Results
§ Significant increases in the values of EDA through SCRs associated
with the three latency conditions
§ This can be interpreted as follows: when there is an SCR response, it is
more intense in the three latency conditions (500ms, 750ms and
1000ms) compared to the 0ms condition, i.e., the arousal is higher
for those conditions than for the 0ms condition
[Figure: SCR amplitude (µS) over the 10 seconds after query onset, one curve per latency condition: 0ms, 500ms, 750ms, 1000ms]
EDA Model
Fixed factors     Coefficients
Intercept         -0.31*
Latency 500ms      0.50***
Latency 750ms      0.42**
Latency 1000ms     0.60***
Seg 2              0.11***
Seg 3              0.36***
Seg 4              0.68***
Seg 5              0.88***
Seg 6              0.90***
Seg 7              0.80***
Seg 8              0.74***
Seg 9              0.72***
Seg 10             0.69***
16. EMG-CS Signal
§ Band-pass filter 30-500Hz & artifact removal
§ A temporal series was constructed from each physiological
signal
§ Averaged the data every 1-second period (480 points == ~ 8
minutes)
§ Included the data for the entire 3-second period after each query
submission
§ Outliers excluded. Data sample: 7256 samples (4 seconds per
query)
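For concreteness, a sketch of the filtering and binning steps is given below; the 2000 Hz sampling rate, the 4th-order Butterworth design, and the rectification before averaging are assumptions beyond what the slide states, and artifact removal is again omitted.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 2000  # assumed EMG sampling rate (Hz); must exceed 1000 Hz for a 500 Hz band edge
b, a = butter(4, [30, 500], btype="bandpass", fs=FS)

emg_raw = np.random.randn(FS * 480)       # synthetic ~8-minute EMG-CS trace
emg_filtered = filtfilt(b, a, emg_raw)    # zero-phase band-pass, 30-500 Hz

# Rectify, then average into 1-second bins (~480 points), as on the slide.
per_second = np.abs(emg_filtered).reshape(-1, FS).mean(axis=1)
```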
17. EMG-CS Signal
§ Factors considered in the model:
§ random intercept for participants
§ random slope for time and order of presentation
§ fixed factors:
§ latency (4 conditions)
§ seconds (4 seconds)
18. EMG-CS Results
§ Significant increases in the values of EMG
associated with the three latency conditions
§ Since EMG over the corrugator supercilii is related
to the negative valence of emotions, the
three latency conditions produced a more
negative valence compared to the 0ms latency
condition.
EMG-CS Model
Fixed factors     Coefficients
Intercept          0.0188***
Latency 500ms      0.0019***
Latency 750ms      0.0034***
Latency 1000ms     0.0010*
Seg 1              0.0000393
Seg 2              0.0002397***
Seg 3              0.0003163***
20. Entropy Analysis
§ We compute two entropy-based features for the EDA and EMG-CS
data:
• Shannon entropy
• Permutation entropy
§ Entropy has been extensively used in signal processing and pattern
recognition
§ In information theory, entropy measures the disorder or uncertainty
associated with a discrete, random variable, i.e., the expected value of
the information in a message
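Both features can be computed in a few lines; the histogram bin count, embedding order, and delay below are illustrative defaults, not values taken from the paper.

```python
import numpy as np
from math import factorial

def shannon_entropy(signal, bins=16):
    """Shannon entropy (bits) of the signal's amplitude histogram."""
    counts, _ = np.histogram(signal, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def permutation_entropy(signal, order=3, delay=1):
    """Permutation entropy (Bandt & Pompe), normalised to [0, 1]."""
    n = len(signal) - (order - 1) * delay
    counts = {}
    for i in range(n):
        # Ordinal pattern: the ranking of the samples in each window.
        pattern = tuple(np.argsort(signal[i:i + order * delay:delay]))
        counts[pattern] = counts.get(pattern, 0) + 1
    p = np.array(list(counts.values())) / n
    return -np.sum(p * np.log2(p)) / np.log2(factorial(order))

# Example: entropies of a per-second physiological series.
x = np.random.randn(480)
print(shannon_entropy(x), permutation_entropy(x, order=3))
```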
22. Setup
§ Random sample of 30m web search queries obtained from Yahoo
Search (issued by approximately 6m users)
§ Each age group involved at least 100K users
§ Similar number of female and male users
§ To control for differences due to geolocation or device, we select
queries issued:
• within the US
• to a particular search data center
• from desktop computers
23. Latency measurement
§ We use the end-to-end (user perceived) latency values
§ We quantify engagement using the clicked page ratio metric
[Diagram: breakdown of end-to-end latency between the user, the search frontend, and the search backend into the components tuf, tfb, tpre, tproc, tpost, tbf, tfu, and trender]
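Reading the diagram as an additive decomposition, the user-perceived latency is the sum of its components; the sketch below simply makes that reading explicit (the additive model and the illustrative values are our assumptions, with names following the slide’s notation).

```python
# Assumed decomposition: end-to-end latency as the sum of the diagram's
# components (user<->frontend, frontend<->backend, the backend's
# pre-processing, processing, and post-processing phases, and client-side
# rendering). All values in milliseconds.
def end_to_end_latency(t_uf, t_fb, t_pre, t_proc, t_post, t_bf, t_fu, t_render):
    return t_uf + t_fb + t_pre + t_proc + t_post + t_bf + t_fu + t_render

print(end_to_end_latency(20, 5, 10, 150, 10, 5, 20, 80))  # illustrative values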
24. Engagement metrics
§ We compare the presence of clicks for two given query instances
(qfast, qslow) that:
• are submitted by the same user
• have the same query string
• match the same search results
§ Click presence (click-on-fast, click-on-slow)
§ Click count (click-more-on-fast, click-more-on-slow)
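A sketch of how such pairs and both metrics could be derived from a query log follows; the schema (user_id, query, results_hash, latency_ms, num_clicks) and the file name are hypothetical.

```python
import pandas as pd

# Hypothetical log: one row per query instance.
log = pd.read_csv("query_log.csv")  # user_id, query, results_hash, latency_ms, num_clicks

# Pair instances submitted by the same user, with the same query string,
# matching the same results; keep only (fast, slow) orderings.
pairs = log.merge(log, on=["user_id", "query", "results_hash"], suffixes=("_fast", "_slow"))
pairs = pairs[pairs["latency_ms_fast"] < pairs["latency_ms_slow"]]

# Click presence: a click on exactly one instance of the pair.
click_on_fast = ((pairs["num_clicks_fast"] > 0) & (pairs["num_clicks_slow"] == 0)).mean()
click_on_slow = ((pairs["num_clicks_slow"] > 0) & (pairs["num_clicks_fast"] == 0)).mean()

# Click count: more clicks on one instance than on the other.
click_more_on_fast = (pairs["num_clicks_fast"] > pairs["num_clicks_slow"]).mean()
click_more_on_slow = (pairs["num_clicks_slow"] > pairs["num_clicks_fast"]).mean()

print(click_on_fast / click_on_slow, click_more_on_fast / click_more_on_slow)
```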
25. Results
[Figure: fraction of query pairs (left axis) and the click-on-fast/click-on-slow ratio (right axis) vs. latency difference in milliseconds (0, 500, 750, 1000); series: click-on-fast, click-on-slow, ratio]
Fig. 1: Fast or slow query response preference according to the click presence metric.
26. Results
[Figure: fraction of query pairs (left axis) and the click-more-on-fast/click-more-on-slow ratio (right axis) vs. latency difference in milliseconds (0, 500, 750, 1000); series: click-more-on-fast, click-more-on-slow, ratio]
Fig. 2: Fast or slow query response preference according to the click count metric.
27. Conclusions
§ As the response latency of the search engine reaches higher
values, the arousal and the negative valence of the experienced
emotions increase as well
§ Physiological data showed that the three latency conditions were
associated with:
• higher arousal (SCR data)
• higher negative valence (EMG-CS data)
§ This can be interpreted as a more emotional and negative
experience: a worse experience
28. Conclusions
§ Although the latency effects did not produce changes in the
self-reported data, their impact on users’ physiological
responses is evident
§ Even if such short latency increases of under 500ms are not
consciously perceived, they have sizeable physiological
effects that can contribute to the overall user experience
29. Conclusions
§ A large-scale query log analysis ascertained the effect on the
clicking behaviour of users and revealed a significant decrease
in users’ engagement with the search result page, even at
small increases in latency
§ This highlights the need for a more interdisciplinary approach
to the evaluation of human information processing in HCI
research
30. Thank you for your attention!
iarapakis
http://www.slideshare.net/iarapakis/sigir15