Recommendation algorithm using reinforcement learning - Arithmer Inc.
Slides for a study session given by Lu Juanjuan at Arithmer Inc.
They summarize recent methods for recommendation systems using reinforcement learning.
Arithmer Inc. is a mathematics company that began at the University of Tokyo Graduate School of Mathematical Sciences. We apply modern mathematics to bring new, advanced AI systems to solutions in a wide range of fields. We believe it is our job to work out how to use AI well to improve work efficiency and to produce results that are useful to people and society.
These are the slides used at an Arithmer Seminar given by Dr. Masaaki Uesaka at Arithmer Inc.
They summarize recent methods for quality assurance of machine learning models.
The Arithmer Seminar is held weekly; professionals from within our company give lectures on their respective areas of expertise.
Applying Deep Learning to Enhance Momentum Trading Strategies in Stocks - Lawrence Takeuchi
Contact author: larrytakeuchi@gmail.com
Abstract
We use an autoencoder composed of stacked restricted Boltzmann machines to extract features from the history of individual stock prices. Our model is able to discover an enhanced version of the momentum effect in stocks without extensive hand-engineering of input features and deliver an annualized return of 45.93% over the 1990-2009 test period versus 10.53% for basic momentum.
Recommender systems analyze patterns of user interest in products to provide personalized recommendations. They seek to predict the rating or preference that a user would give to an item. Some of the most successful realizations of latent factor models are based on matrix factorization...
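To make the matrix factorization idea concrete, here is a minimal sketch, not taken from the slides, that learns user and item factor vectors by stochastic gradient descent so that their dot product approximates the observed ratings. The function names, the hyperparameters (`k`, `lr`, `reg`, `epochs`) and the `(user, item, rating)` triple format are all assumptions made for illustration.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=2000, seed=0):
    """Learn latent factors P (users) and Q (items) from rating triples.

    ratings: list of (user_index, item_index, rating) tuples (hypothetical format).
    Returns (P, Q) where the predicted rating is dot(P[u], Q[i]).
    """
    rng = random.Random(seed)
    # Small random initialisation of the latent factors.
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating of item i by user u."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))
```

Calling `predict(P, Q, u, i)` for a pair that was never rated gives the model's estimate for that unseen rating, which is the quantity a recommender of this kind ranks items by.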
Fuzzy clustering and fuzzy c-means partition cluster analysis and validation ... - IJECEIAES
A hard partition clustering algorithm assigns a point that is equally distant from several clusters to just one of them, even though the point could plausibly be assigned to the other clusters as well. Fuzzy cluster analysis instead assigns membership coefficients to data points that are equidistant between two clusters, so that such points belong to more than one cluster at the same time. For a subset of the CiteScore dataset, the fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were implemented to study the data points that lie equally distant from each other. Before analysis, the clusterability of the dataset was evaluated with the Hopkins statistic, which resulted in 0.4371, a value < 0.5, indicating that the data is highly clusterable. The optimal number of clusters was determined using the NbClust package, where 9 indices proposed a 3-cluster solution as best. Further, an appropriate value of the fuzziness parameter m was evaluated to determine the distribution of membership values as m varies from 1 to 2. The coefficient of variation (CV), also known as relative variability, was evaluated to study the spread of the data. The time complexity of the fuzzy clustering (fanny) and fuzzy c-means algorithms was evaluated by keeping the number of data points constant and varying the number of clusters.
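As an illustration of how membership coefficients behave for equidistant points, the sketch below computes the standard fuzzy c-means membership update, u_j = 1 / sum_k (d_j / d_k)^(2/(m-1)), for one point and a fixed set of cluster centers. It is not the paper's R code (fanny or fcm); the function name and the handling of a point that coincides with a center are assumptions, and the formula requires m > 1.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster.

    point, centers: coordinate tuples; m: fuzziness parameter (must be > 1).
    Returns a list of memberships that sums to 1.
    """
    if m <= 1.0:
        raise ValueError("the membership formula is only defined for m > 1")
    dists = [math.dist(point, c) for c in centers]
    # Assumed convention: a point lying exactly on a center belongs fully
    # to that center (shared equally if several centers coincide).
    on_center = [d == 0.0 for d in dists]
    if any(on_center):
        n = sum(on_center)
        return [1.0 / n if hit else 0.0 for hit in on_center]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
```

For a point midway between two centers, for example `fcm_memberships((0.0, 0.0), [(-1.0, 0.0), (1.0, 0.0)])`, both memberships come out equal, whereas a hard partition would have to pick one cluster.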
Slides for an Arithmer Seminar given by Dr. Daisuke Sato (Arithmer) at Arithmer Inc.
The topic is "explainable AI".
The "Arithmer Seminar" is held weekly; professionals from within and outside our company give lectures on their respective areas of expertise.
The slides are made by the lecturer from outside our company, and shared here with his/her permission.
Enhanced ID3 algorithm based on the weightage of the Attribute - AM Publications
The ID3 algorithm, a decision tree classification algorithm, is very popular due to its speed and simplicity of construction, but it has its own drawbacks: when classifying, ID3 tends to choose attributes with a large number of values, and practical complexities arise from this. To solve this problem, the proposed algorithm uses the importance of the attributes and classifies accordingly to produce effective rules. The proposed algorithm uses attribute weightage when calculating the information gain for attributes with few values and performs better than the classical ID3 algorithm. The proposed algorithm is applied to real data, namely the selection of employees in a firm for appraisal based on a few important attributes.
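The bias toward many-valued attributes can be seen directly from classical information gain. The sketch below implements only the classical ID3 criterion; the weighting scheme of the proposed algorithm is not specified in this abstract and is deliberately not reproduced. The function names and the row format (a list of dicts) are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Classical ID3 information gain of splitting rows on one attribute.

    rows: list of dicts (hypothetical format); attribute, target: dict keys.
    """
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Expected entropy remaining after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder
```

An attribute with a distinct value in every row, such as an employee ID, splits the data into single-row groups with zero entropy, so it always attains the maximum gain even though it cannot generalize. This is the behavior that motivates re-weighting attributes.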
Modeling Crude Oil Prices (CPO) using General Regression Neural Network (GRNN) - AI Publications
Time series modeling is often associated with forecasting certain characteristics in the next period. One forecasting method developed in recent years uses artificial neural networks, more popularly known as neural networks. Using a neural network to forecast time series can be a good solution, but the difficulty lies in choosing the network architecture and the right training method. The General Regression Neural Network (GRNN) is a radial basis network model used to approximate a function. GRNN reaches a solution quickly because it does not need iterations to estimate the weights. The model's architecture has a number of units in the pattern layer equal to the number of input data. One application of GRNN is predicting crude oil prices. Training and testing on the data gave a testing RMSE of 1.9355 and a training RMSE of 1.1048. The model gives a fairly accurate prediction, as shown by the closeness of the targets to the outputs.
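The claim that GRNN needs no iterative weight estimation follows from its form: the prediction is a kernel-weighted average of the training targets, with one pattern unit per training sample. The following is a minimal sketch of that standard formulation with a Gaussian kernel, not the authors' implementation; the function name and the smoothing parameter `sigma` are assumptions.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=1.0):
    """GRNN output for input x as a Gaussian-weighted average of targets.

    train_x: list of input tuples (one pattern unit each);
    train_y: matching targets; sigma: kernel width (assumed hyperparameter).
    """
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed: x is too far from the data for this sigma.
        raise ValueError("sigma is too small for this input")
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

"Training" here amounts to storing the samples, so the only quantity left to tune is `sigma`, typically chosen by comparing error on held-out data.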
Provides a brief overview of what machine learning is, how it works (theory), how to prepare data for a machine learning problem, an example case study, and additional resources.
Driving Style and Behavior Analysis based on Trip Segmentation over GPS Info... - Marco Brambilla
Over one billion cars interact with each other on the road every day. Each driver has his own driving style, which could impact safety, fuel economy and road congestion. Knowledge about the driving style of the driver could be used to encourage "better" driving behaviour through immediate feedback while driving, or by scaling auto insurance rates based on the aggressiveness of the driving style.
In this work we report on our study of driving behaviour profiling based on unsupervised data mining methods. The main goal is to detect the different driving behaviours, and thus to cluster drivers with similar behaviour.
This paves the way to new business models related to the driving sector, such as Pay-How-You-Drive insurance policies and car rentals.
Driver behavioral characteristics are studied by collecting information from GPS sensors on the cars and by applying three different analysis approaches (DP-means, Hidden Markov Models, and Behavioural Topic Extraction) to the contextual scene detection problems on car trips, in order to detect different behaviour along each trip. Subsequently, drivers are clustered into similar profiles on that basis, and the results are compared with a human-defined ground truth for driver classification. The proposed framework is tested on a real dataset containing sampled car signals. While the different approaches show relevant differences in trip segment classification, the coherence of the final driver clustering results is surprisingly high.
Introduction to Machine Learning: Machine Learning (ML) is a type of Artificial Intelligence (AI) that allows software applications to become more accurate at predicting outcomes without being explicitly programmed to do so. Machine learning algorithms use historical data as input to predict new output values.
Spotify uses a range of Machine Learning models to power its music recommendation features including the Discover page and Radio. Due to the iterative nature of training these models they suffer from IO overhead of Hadoop and are a natural fit to the Spark programming paradigm. In this talk I will present both the right way as well as the wrong way to implement collaborative filtering models with Spark. Additionally, I will deep dive into how Matrix Factorization is implemented in the MLlib library.
Alleviating cold-user start problem with users' social network data in recomm... - Eduardo Castillejo Gil
This work explores the possibility of using relevant data from users' social networks to alleviate the cold-user problem in the recommender system domain. The proposed solution extracts the most valuable node in the graph generated by checking in at a venue with an Android application using the Foursquare API. By obtaining the recommendations for this node, we estimate the probability of some categories being similar to the user's tastes...
Exploratory Search Beyond the Query-Response Paradigm - Takehiro Yamamoto
This slide deck is a summary of the paper "Exploratory Search Beyond the Query-Response Paradigm", written by White et al. The slides are written in Japanese. http://rerank.jp/lab/
A Statistical Approach to Resolve Conflicting Requirements in Pervasive Compu... - Osama M. Khaled
Pervasive computing systems are complex and challenging. In this research, our aim is to build a robust reference architecture for pervasive computing derived from real business needs and based on process re-engineering practices. We derived requirements from different sources grouped by selected quality features and worked on refining them by identifying the conflicts among these requirements, and by introducing solutions for them. We checked the consistency of these solutions across all the requirements. We built a mathematical model that describes the degrees of consistency with the requirements model and showed that they are normally distributed within that scope.
Full paper:
https://www.researchgate.net/publication/316582796_A_Statistical_Approach_to_Resolve_Conflicting_Requirements_in_Pervasive_Computing_Systems?_iepl%5BviewId%5D=GYzOmb1HA01ScDf1IJ1T8eif&_iepl%5BprofilePublicationItemVariant%5D=default&_iepl%5Bcontexts%5D%5B0%5D=prfpi&_iepl%5BtargetEntityId%5D=PB%3A316582796&_iepl%5BinteractionType%5D=publicationTitle
Citation
Osama M. Khaled, Hoda M. Hosny, Mohamed Shalan (2017). A Statistical Approach to resolve conflicting requirements in pervasive computing systems. The 12th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2017), Porto, Portugal, 28-29 April 2017.
Carrying out OLAP analyses in hands-free scenarios requires lean forms of communication between the users and the system, based for instance on natural language. In this paper we introduce VOOL, a framework specifically devised for vocalizing the insights resulting from OLAP sessions. VOOL is self-configurable, extensible, and is aware of the user's intentions expressed by OLAP operators. To avoid overwhelming the user with very long descriptions, we pursue the vocalization of selected insights automatically extracted from query results. These insights are detected by a set of modules, each returning a set of independent insights that characterize data. After describing and formalizing our approach, we evaluate it in terms of efficiency and effectiveness.
Looker's Ben Porterfield - Asking The Right Questions - Heavybit
Ben offers his insight into data analysis with this Heavybit Speaker Series talk. He includes information on the data he finds key to strategic product pivot, the difference between good and bad customer questions, and how to take your data science to the next level. Check out the video of this talk at: http://www.heavybit.com/library/video/2015-02-10-ben-porterfield
Predictive Model and Record Description with Segmented Sensitivity Analysis (... - Greg Makowski
Describing a predictive data mining model can provide a competitive advantage for solving business problems with a model. The SSA approach can also provide reasons for the forecast for each record. This can help drive investigations into fields and interactions during a data mining project, as well as identify "data drift" between the original training data and the current scoring data. I am working on an open source version of SSA, first in R.
[AFEL] Neighborhood Troubles: On the Value of User Pre-Filtering To Speed Up ... - Emanuel Lacić
In this paper, we present work-in-progress on applying user pre-filtering to speed up and enhance recommendations based on Collaborative Filtering. We propose to pre-filter users in order to extract a smaller set of candidate neighbors who exhibit a high number of overlapping entities, and to compute the final user similarities based on this set. To realize this, we exploit features of the high-performance search engine Apache Solr and integrate them into a scalable recommender system. We have evaluated our approach on a dataset gathered from Foursquare, and our evaluation results suggest that our proposed user pre-filtering step can help to achieve both better runtime performance and an increase in overall recommendation accuracy.
Stat-weight Improving the Estimator of Interleaved Methods Outcomes with Stat... - Sease
Interleaving is an online evaluation approach for information retrieval systems that compares the effectiveness of ranking functions in interpreting the users’ implicit feedback. Previous work such as Hofmann et al. (2011) has evaluated the most promising interleaved methods at the time, on uniform distributions of queries. In the real world, usually, there is an unbalanced distribution of repeated queries that follows a long-tailed users’ search demand curve. This paper first aims to reproduce the Team Draft Interleaving accuracy evaluation on uniform query distributions and then focuses on assessing how this method generalises to long-tailed real-world scenarios. The replicability work raised interesting considerations on how the winning ranking function for each query should impact the overall winner for the entire evaluation. Based on what was observed, we propose that not all the queries should contribute to the final decision in equal proportion. As a result of these insights, we designed two variations of the ∆AB score winner estimator that assign to each query a credit based on statistical hypothesis testing. To reproduce, replicate and extend the original work, we have developed from scratch a system that simulates a search engine and users’ interactions from datasets from the industry. Our experiments confirm our intuition and show that our methods are promising in terms of accuracy, sensitivity, and robustness to noise.
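For readers unfamiliar with Team Draft Interleaving, the sketch below shows a simplified variant of the idea: two rankers take turns contributing their best not-yet-shown document, with ties in the number of picks broken by a coin flip, and each click is later credited to the ranker that contributed the clicked document. It is not the system described in the paper, and it does not implement the statistically weighted ∆AB estimators; all names are assumptions.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, seed=None):
    """Interleave two rankings; return (mixed list, doc -> team map)."""
    rng = random.Random(seed)
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    mixed, team_of = [], {}

    def next_unseen(ranking):
        return next((d for d in ranking if d not in team_of), None)

    while True:
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = rng.choice("AB")
        doc = next_unseen(rankings[team])
        if doc is None:
            # This team has nothing left; let the other team fill in.
            team = "B" if team == "A" else "A"
            doc = next_unseen(rankings[team])
            if doc is None:
                break
        mixed.append(doc)
        team_of[doc] = team
        picks[team] += 1
    return mixed, team_of


def click_credit(team_of, clicked_docs):
    """Count clicks credited to each team for one query."""
    credit = {"A": 0, "B": 0}
    for doc in clicked_docs:
        if doc in team_of:
            credit[team_of[doc]] += 1
    return credit
```

The per-query winner is the team with more credited clicks. How those per-query outcomes are aggregated into an overall winner is exactly the step that the paper proposes to weight by statistical significance.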
Similar to Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect User Preferences? (at EVIA 2016) (20)
UiPath Test Automation using UiPath Test Suite series, part 6 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf - Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Enhancing adoption of Open Source Libraries. A case study on Albumentations.AI - Vladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
UiPath Test Automation using UiPath Test Suite series, part 5 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect User Preferences? (at EVIA 2016)
1. Two-layered Summaries for Mobile Search:
Does the Evaluation Measure Reflect User Preferences?
Makoto P. Kato (Kyoto U.), Tetsuya Sakai (Waseda U.),
Takehiro Yamamoto (Kyoto U.), Virgil Pavlu (Northeastern U.),
and Hajime Morita (Kyoto U.)
3. IR Systems in Ten-Blue-Link Paradigm
Enter query
Click SEARCH button
Scan ranked list of URLs
Click URL
Read URL contents
Get all desired information
Long way to get all desired information
4. MobileClick System
Enter query
Click SEARCH button
Get all desired information
Go beyond the "ten-blue-link" paradigm, and tackle
information retrieval rather than document retrieval
LCD is better in terms of the weight, size and
energy saving. OLED shows a better black
color, a faster response speed, and a wider
view angle.
Advantage of OLED
Advantage of LCD
Task: Given a search query,
return a two-layered textual output
System output
OLED LCD difference
Phone: 046-223-3636. Fax: 046-223-3630. Address: 118-1 Nurumizu, Atsugi, 243-8551. Email: soumu@shonan-atsugi.jp.
Visiting hours: general ward Mon-Fri 15-20; Sat & Holidays 13-20 / Intensive Care Unit (ICU) 11-11:30, 15:30, 19-19:30.
Skip
5. • Given a query, a set of iUnits, and a set of intents,
generate a two-layered summary
iUnit Summarization Subtask at NTCIR-12
iUnit
A series of evaluation workshops
Designed to enhance IA research
…
NTCIR
Input: Query
Input: iUnit set
Intents
News
Schedule
…
Input: Intents
M-measure
0.5
The NTCIR Workshop is a
series of evaluation
workshops designed to
enhance research in
information access
technologies including
information retrieval,
summarization, extraction,
question answering, etc.
News
Schedule
Tasks
2nd layer
20/Jan./2016: Task Registration Due
06/Jan./2016: Document Set Release
Jan.-May/2016: Dry Run
Mar.-July/2016: Formal Run
01/Aug./2016: Evaluation Results Due
01/Aug./2016: Task overview release
15/Sep./2016: Paper submission Due
01/Nov./2016: All paper Due
09-12/Dec./2016: NTCIR-11 Conference
Output: Two-layered summary
Evaluation metric
designed for mobile
information access
Lay out iUnits so that
any types of users can be immediately satisfied
Challenge
7. Does the Evaluation Measure
Reflect User Preferences?
Research Question Addressed in This Work
M-measure
0.5 0.4
User preference
(# of users who prefer A (B))
10 4
0.5 > 0.4
10 > 4
A B
A > B
A > B
=
Same?
Which is higher? Which is better?
9. Overview of Data
napoleon
Queries
Documents
Web search
Born on the island of Corsica
Defeated at the Battle of Waterloo
Established legal equality and religious
toleration an innovator
iUnits
Extraction
Achievement
Skill
Career
Clustering
Intents
iUnit
summarization
Input
Input
10. • Queries
– 100 English/Japanese queries
– Most of which were ambiguous/underspecified
– Selected from four categories:
celebrity, location, definition, and QA (similar to NTCIR 1CLICK-2)
• Documents
– 500 commercial search engine results for each query
from which iUnits were extracted
Queries and Documents
CELEBRITY       LOCATION                 DEFINITION        QA
hulk hogan      bank adelanto            bitcoin           what is mirror made of
bruno mars      cafe killeen             divers disease    how to cook coleslaw
sharon stone    cincinnati art museum    windows 7         role of animal tail
Examples
11. • Definition
– Atomic information pieces relevant to a given query
• The number of iUnits
– 2,317 (23.8 iUnits per query) for English
– 4,169 (41.7 iUnits per query) for Japanese
iUnits
Born on the island of Corsica General of the Army of Italy
Defeated at the Battle of Waterloo One of the most controversial political figures
won at the Battle of Wagram
Established legal equality and religious
toleration an innovator
Baptised as a Catholic
Absent during Peninsular War Cut off European trade with Britain
Examples of iUnits for query “Napoleon”
12. • An intent can be defined as
– A specific interpretation of an ambiguous query
(“Mac OS” and “car brand” for “jaguar”), or
– An aspect of a faceted query
(“windows 8” and “windows 10” for “windows”)
• Obtained by clustering iUnits
Intents
Achievement
Skill
Career
Born on the island of Corsica
Defeated at the Battle of Waterloo
Established legal equality and religious
toleration an innovator
Absent during Peninsular War
iUnits Intents
Clustering
14. • Importance of iUnits in terms of an intent
• Intent probability P(i|q)
– Probability of having intent i for a given query q
Per-intent iUnit Importance and Intent Probability
In terms of intent “Definition”:
iUnit                                  Importance
A series of evaluation workshops       5
Task Registration Due 20/Jun./2016     3
In terms of intent “Schedule”:
iUnit                                  Importance
A series of evaluation workshops       2
Task Registration Due 20/Jun./2016     5
Intent        Prob.
Definition    0.4
Schedule      0.3
Tasks         0.3
For details, see our MobileClick-2 overview paper
15. • Consider single-layered summary evaluation
• U-measure [Sakai and Dou. SIGIR2013]
– Higher if more important iUnits appear earlier
Evaluation of iUnit Summarization (Single-layer Case)
Summary layout:
  u1 u2
  u3
Trailtext (reading path): u1 (10 chars), u2 (5 chars), u3 (10 chars)
Create a list of iUnits by assuming that users read text from left to right, from top to bottom.
U-measure:
$$U = \sum_{r \ge 1} G(u_r)\left(1 - \frac{\mathrm{pos}(u_r)}{L}\right)$$
$u_r$: r-th iUnit
$G(u)$: importance of $u$
$\mathrm{pos}(u)$: offset of $u$ from the beginning
$L$: patience parameter
Example: $U = G(u_1)(1 - 10/L) + G(u_2)(1 - 15/L) + G(u_3)(1 - 25/L)$
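As a minimal sketch of the U-measure defined on this slide, the Python below sums the importance of each iUnit weighted by the linear decay 1 - pos/L. The function name `u_measure`, the gain values, and the choice L = 100 are illustrative assumptions; only the offsets 10, 15, and 25 come from the slide.

```python
from typing import List, Tuple


def u_measure(trailtext: List[Tuple[float, float]], patience: float) -> float:
    """Sum G(u_r) * (1 - pos(u_r) / L) over the iUnits of one trailtext.

    Each item is (gain, pos): the importance of the iUnit and its offset
    in characters from the beginning of the trailtext.
    """
    return sum(gain * (1.0 - pos / patience) for gain, pos in trailtext)


# Offsets 10, 15 and 25 follow the slide; the gains and L are made up.
trailtext = [(3.0, 10), (2.0, 15), (1.0, 25)]
u_value = u_measure(trailtext, patience=100.0)
print(round(u_value, 4))
```

Moving an important iUnit earlier in the trailtext lowers its offset and therefore raises the score, which is the behavior the slide describes.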
16. • M-measure
– Expectation of U-measure over multiple trailtexts
$$M = \sum_{\mathbf{t}} P(\mathbf{t})\,U(\mathbf{t})$$
1. Generate trailtexts by assuming that
– Users read a summary from the top of the first layer
– Users click on an intent if they are interested in it
M-measure
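$P(\mathbf{t})$: probability of trailtext $\mathbf{t}$
$U(\mathbf{t})$: U-measure of trailtext $\mathbf{t}$
[Figure: two-layered summary with link l1 and iUnits u1 to u4. Users interested in Intent 1 (P(i1|q)) and users interested in Intent 2 (P(i2|q)) yield the trailtexts "u1 u2 u3 u4" and "u1 u2 u3".]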
17. 2. Compute the expectation of U-measure
Evaluation of iUnit Summarization (Two-layer Case)
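[Figure: two-layered summary with links l1 and l2 and iUnits u1 to u6]
Trailtext (t) (reading path)    U       P(t)
t1: u1 u2 u3 u4 u5              0.44    P(t1) = P(i1|q) = 0.75
t2: u1 u2 u3 u6                 0.12    P(t2) = P(i2|q) = 0.25
P(t2) = P(i2|q) because trailtext t2 is read by users interested in i2.
M-measure:
$$M = \sum_{\mathbf{t}} P(\mathbf{t})\,U(\mathbf{t}) = 0.36$$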
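The expectation on this slide can be checked with a few lines of Python. This is only a sketch: the function name `m_measure` and the check that the probabilities sum to 1 are assumptions added for illustration, while the probabilities (0.75, 0.25) and U values (0.44, 0.12) are the ones shown on the slide.

```python
from typing import List, Tuple


def m_measure(trailtexts: List[Tuple[float, float]]) -> float:
    """Expected U-measure: sum of P(t) * U(t) over all trailtexts."""
    total_probability = sum(p for p, _ in trailtexts)
    # Assumption for this sketch: trailtext probabilities form a distribution.
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("trailtext probabilities should sum to 1")
    return sum(p * u for p, u in trailtexts)


# (P(t), U(t)) pairs taken from the slide.
m_value = m_measure([(0.75, 0.44), (0.25, 0.12)])
print(round(m_value, 2))  # 0.36
```

The result, 0.75 × 0.44 + 0.25 × 0.12 = 0.36, matches the M-measure value on the slide.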
20. • Users were asked to select one of:
the left one is better,
the right one is better,
equally good, or
equally bad
• Criteria:
(1) How much useful information you can get
from the summary, and
(2) How quickly you can get useful information
from the summary
Instruction in Pairwise Comparison
21. • 𝑳 of U-measure in M-measure
– $U = \sum_{r \ge 1} G(u_r)\,\max\left(0,\ 1 - \frac{\mathrm{pos}(u_r)}{L}\right)$
– 𝐿 is a patience parameter that controls how the
gain of iUnits decreases as the user reads the text
• Simple variants of M-measure
– Use only first layer
– Use only second layer
– Use a uniform distribution for 𝑃 𝑖 𝑞
Settings of M-measure
[Figure: two-layered summary with link l1 and iUnits u1 to u4, next to a plot of the decay 1 − pos(u_r)/L against pos(u_r) for L = 100 and L = 200]
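To make the role of the patience parameter concrete, the sketch below tabulates the clipped decay max(0, 1 - pos/L) for the two values of L shown on the slide. The function name `decay` and the sample positions are illustrative assumptions.

```python
def decay(pos: float, patience: float) -> float:
    """Clipped linear decay max(0, 1 - pos / L) applied to an iUnit's gain."""
    return max(0.0, 1.0 - pos / patience)


# Sample offsets (in characters) chosen for illustration only.
positions = [0, 50, 100, 150, 200]
table = {L: [decay(p, L) for p in positions] for L in (100, 200)}
for L, values in table.items():
    print(f"L={L}:", values)
```

With L = 100 every iUnit at offset 100 or later contributes nothing, whereas with L = 200 an iUnit at offset 150 still keeps a quarter of its gain, so a larger L models a more patient reader.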
22. Interpretation of Results
[Figure: scatter plot. x-axis: diff. of M-measure (M(A) - M(B)); y-axis: (num. of votes for A) / (total num. of votes). Quadrants where the M-measure and the user preference pick the same system (A is better, or B is better) are labeled "Agree"; the other two are labeled "Disagree".]
Each dot represents a pair of systems (A, B) for a particular query
Agreement = (#dots in Agree) / (#dots)
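The agreement ratio can be sketched as follows. Each dot is represented as a pair (M(A) - M(B), share of votes for A); the function name `agreement`, the sample dots, and the decision to count ties (a difference of exactly 0 or a vote share of exactly 0.5) as disagreement are assumptions, since the slide does not say how boundary cases were handled.

```python
from typing import List, Tuple


def agreement(dots: List[Tuple[float, float]]) -> float:
    """Fraction of dots where the M-measure and the user votes pick the same system.

    Each dot is (m_diff, vote_ratio): M(A) - M(B) and votes for A / total votes.
    """
    if not dots:
        raise ValueError("at least one (system pair, query) dot is required")
    agree = sum(
        1
        for m_diff, vote_ratio in dots
        # Agree: both favor A, or both favor B. Ties count as disagreement here.
        if (m_diff > 0 and vote_ratio > 0.5) or (m_diff < 0 and vote_ratio < 0.5)
    )
    return agree / len(dots)


# Hypothetical dots, not data from the experiment.
dots = [(0.10, 0.7), (-0.20, 0.3), (0.05, 0.4), (-0.10, 0.8)]
score = agreement(dots)
print(score)
```

In this made-up example the first two dots agree and the last two disagree, giving an agreement of 0.5.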
23. Experimental Results for Different Patience Parameters
[Figure: agreement versus patience parameter L. English: L = 93.75, 750, 6000, 24000. Japanese: L = 31.25, 125, 2000, 8000.]
LOW agreement for LOW patience parameter (L=93.75)
HIGH agreement for HIGH patience parameter (L=24000)
Agreement is high (70-74%) for both of the languages
24. Experimental Results for Simple Variants of M-measure
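[Figure: agreement for the original M-measure and its simple variants, with variants annotated "Worse", "Slightly worse", and "Close"; panels labeled 24000 and 2000]
Use of the second layer and intent probability
improves the agreement (but the first layer doesn’t)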
25. • Possible explanations include
– The quality of the second layer correlates with the
quality of the whole summary
– Users decided the quality of the summary mainly
based on the second layer
• We asked the users to look at the second layer in the
assessment
Why did only the 2nd layer correlate well with the user pref.?
26. • Conclusions
– Proposed M-measure
• A special case of intent-aware U-measure for two-
layered summarization
– Measured the agreement between
M-measure and user preferences
• Agreement was high (70-74%)
• Future work
– Error analysis
– Address “why did only the second layer correlate
well with the user preferences?”
Conclusions and Future Work