This intervention draws on ongoing experiments within the OECD-led Statistical Information System Collaboration Community (SIS-CC) to enable AI applications with SDMX. One important use case is using AI to improve the accessibility and discoverability of data: whilst UX techniques, lexical search improvements, and data harmonisation can take statistical organisations to a good level of accessibility, a structural (or “cognitive”) gap remains between data users' needs and data producers' constraints. That is where AI, and most importantly NLP and LLM techniques, could potentially make a difference. The “StatsBot” could be the natural language, conversational engine that facilitates access to and use of the data, leveraging the semantics of any SDMX source.
The objective of the presentation is to propose a technical approach and a way forward to achieve this goal and create the StatsBot as a universal, open asset usable by all statistical organisations. In a first step, the concept tested is to use Large Language Models with the Apache Solr index of SDMX objects to transform natural language queries into SDMX queries. In a second step, results could be framed as a natural language statement complementing the top-k search results. For the initial PoCs, which aim to demonstrate functional features and feasibility, a commercial LLM (such as OpenAI GPT-4) will be used; at a later stage, substitution with an open source LLM will be analysed. The presentation will include the results of the first experimental work and lessons learnt, and will scope the future work that should define the path to a production-grade, fully open source, and universal StatsBot.
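To make the "natural language query to SDMX query" step more concrete, here is a minimal sketch of the last stage only: turning a set of dimension selections into an SDMX REST data query URL. It assumes that an earlier step (for example an LLM combined with the Solr index) has already produced the selections; that step is not shown. The function name, the base URL, the dataflow id and the dimension ids are all hypothetical and are not taken from the presentation.

```python
from urllib.parse import urlencode


def build_sdmx_data_url(base_url, dataflow, dimension_order, selections,
                        start_period=None, end_period=None):
    """Build an SDMX REST data query URL (hypothetical helper).

    dimension_order: dimension ids, in the order defined by the data structure.
    selections: maps a dimension id to the list of selected codes.
                Dimensions without a selection are left empty (wildcard).
    """
    # One key position per dimension; several codes are OR-ed with "+".
    key_parts = ["+".join(selections.get(dim, [])) for dim in dimension_order]
    # Use the "all" keyword when nothing is selected at all.
    key = ".".join(key_parts) if any(key_parts) else "all"

    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period

    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical output of the natural-language step for a question such as
# "annual figures for France and Germany since 2020".
example_url = build_sdmx_data_url(
    "https://example.org/rest/",             # assumed endpoint
    "DF_EXAMPLE",                            # assumed dataflow id
    ["FREQ", "REF_AREA", "INDICATOR"],       # assumed dimension order
    {"FREQ": ["A"], "REF_AREA": ["FRA", "DEU"]},
    start_period="2020",
)
print(example_url)
# https://example.org/rest/data/DF_EXAMPLE/A.FRA+DEU.?startPeriod=2020
```

Keeping the URL construction deterministic and separate from the language model means the model only has to choose codes that exist in the index, which is easier to validate than free-form query text.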
Copenhagen business school drives sustainability at roskilde festival using c...Carlos Tomas
A nonprofit humanitarian organization in Denmark gains faster insights into crowd movements and sales data, supporting sustainability, achieves smart optimization of safety, catering, energy, water and waste management and increases scalability with cloud-based analytics and a cloud-based Big Data lab on IBM Bluemix, IBM SPSS and IBM Watson Analytics technologies, running on a SoftLayer infrastructure, to seamlessly correlate data from multiple sources.
This presentation introduces open source, open source GIS, OSGeo. This talk was given to the people who attended 'Capacity Building For National Surveying and Geographic Information Institute' program.
ICT research in the context of European Union
CASE SUMMER SCHOOL ON APPLIED SOFTWARE ENGINEERING
APPLIED SOFTWARE PROCESS MANAGEMENT AND TESTING
JULY 6-10, 2009, BOZEN/BOLZANO, ITALY
A short introduction to the term "Openness" as it is used by three different organizations in the geospatial domain, the OGC, OSGeo and OSM - plus an outlook where we should be going.
Keynote on software sustainability given at the 2nd Annual Netherlands eScience Symposium, November 2014.
Based on the article
Carole Goble ,
Better Software, Better Research
Issue No.05 - Sept.-Oct. (2014 vol.18)
pp: 4-8
IEEE Computer Society
http://www.computer.org/csdl/mags/ic/2014/05/mic2014050004.pdf
http://doi.ieeecomputersociety.org/10.1109/MIC.2014.88
http://www.software.ac.uk/resources/publications/better-software-better-research
ITAC 2016 Where Open Source Meets Audit AnalyticsAndrew Clark
Open source software is taking the computer science community and IT departments by storm. The breadth of options, the timeliness of updates, the price, and the sense of community are all contributing factors to the rise of open source computing. For many years audit analytics has been confined to the Computer Assisted Auditing Techniques, CAAT, software vendors ACL, IDEA and now Arbutus. However, these software programs require extensive training to use effectively, are not very flexible, and in most cases fail to provide the outcome auditors are expecting. Moving to an open source platform based around the python ecosystem allows for true customization of analytics, and provides a common language to interact with your IT department. By using the same set of tools, an auditing department can move from rudimentary AP duplicate tests all the way to advanced classification and clustering machine learning tests. Although the barrier to entry for open source software is higher than for most CAATs, with cross-functional collaboration, a truly customized, sustainable, and highly effective analytics program can be created.
Open AR Cloud One Year Since the LaunchOpen AR Cloud
Open AR Cloud gave an update on its activity and an outlook on its 2020 mission of building the core pieces of a decentralised Open Spatial Computing platform (OSPC).
Presentation Slides of the Open AR Cloud presentation at AWE EU (Munich Nov. 17th)
Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...Trivadis
During major irregularities, the service desks of airline companies are heavily overloaded for short periods of time. A chatbot could help out during these peak hours. In this session we show how SWISS International Airlines developed a chatbot for irregularity handling. We shed light on the challenges, such as sensitive customer data and a company starting its journey into the cloud.
E2D3 is an open source, interactive, dynamic data visualization platform on Excel.
It's Easy, Useful, and Intuitive.
Use E2D3 app for powerful presentation of your data.
Business X Design - People, Planet & ProductCyber-Duck
A talk about intersectional design and how accessibility and sustainability overlap to create better products and experiences for both people and businesses.
Since native vector-based search was introduced in Apache Lucene, many features have been developed, but support for multiple vectors in a dedicated KNN vector field remained to be explored. Having the possibility of indexing (and searching) multiple values per field unlocks the possibility of working with long textual documents, splitting them into paragraphs, and encoding each paragraph as a separate vector: a scenario that is often encountered by many businesses. This talk explores the challenges, the technical design and the implementation activities that happened during the work for this contribution to the Apache Lucene project. The audience is expected to get an understanding of how multi-valued fields can work in a vector-based search use case and how this feature has been implemented.
How To Implement Your Online Search Quality Evaluation With KibanaSease
Online testing represents a fundamental method to assess the performance of a ranking model in practical applications, providing the information needed to improve and better understand its behavior. Despite the advantages, the currently available evaluation tools have certain limitations. For this reason, we will present an alternative and customized approach to evaluate ranking models using Kibana. The talk will begin with an overview of online testing, including its benefits and drawbacks. Then, we will provide an in-depth exploration of our Kibana implementation, detailing the reasons behind our approach. Attendees will learn about the various tools provided by Kibana, and with practical examples, we will show how to create visualizations and dashboards, complete with queries and code, to compare different rankers. Attending this presentation will provide participants with valuable knowledge on how to leverage Kibana for the purpose of evaluating ranking models on custom metrics and on specific contexts such as the most popular and “populous” queries.
Introducing Multi Valued Vectors Fields in Apache LuceneSease
Since native vector-based search was introduced in Apache Lucene, many features have been developed, but support for multiple vectors in a dedicated KNN vector field remained to be explored. Having the possibility of indexing (and searching) multiple values per field unlocks the possibility of working with long textual documents, splitting them into paragraphs and encoding each paragraph as a separate vector: a scenario that is often encountered by many businesses. This talk explores the challenges, the technical design and the implementation activities that happened during the work for this contribution to the Apache Lucene project. The audience is expected to get an understanding of how multi-valued fields can work in a vector-based search use case and how this feature has been implemented.
Stat-weight Improving the Estimator of Interleaved Methods Outcomes with Stat...Sease
Interleaving is an online evaluation approach for information retrieval systems that compares the effectiveness of ranking functions in interpreting the users’ implicit feedback. Previous work such as Hofmann et al. (2011) has evaluated the most promising interleaved methods at the time, on uniform distributions of queries. In the real world, usually, there is an unbalanced distribution of repeated queries that follows a long-tailed users’ search demand curve. This paper first aims to reproduce the Team Draft Interleaving accuracy evaluation on uniform query distributions and then focuses on assessing how this method generalises to long-tailed real-world scenarios. The replicability work raised interesting considerations on how the winning ranking function for each query should impact the overall winner for the entire evaluation. Based on what was observed, we propose that not all the queries should contribute to the final decision in equal proportion. As a result of these insights, we designed two variations of the ∆AB score winner estimator that assign to each query a credit based on statistical hypothesis testing. To reproduce, replicate and extend the original work, we have developed from scratch a system that simulates a search engine and users’ interactions from datasets from the industry. Our experiments confirm our intuition and show that our methods are promising in terms of accuracy, sensitivity, and robustness to noise.
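For readers unfamiliar with Team Draft Interleaving, the following is a minimal sketch of the basic method: two rankers take turns "drafting" their best not-yet-shown document, and clicks are credited to the team that contributed the clicked document. It is a simplified illustration, not the simulation system or the statistically weighted estimators described in the abstract; the function names are hypothetical.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng=None):
    """Interleave two rankings; return (interleaved list, doc -> team)."""
    rng = rng or random.Random()
    interleaved, team_of = [], {}
    picks = {"A": 0, "B": 0}

    def next_unseen(ranking):
        # Highest-ranked document of this ranker not yet shown.
        return next((d for d in ranking if d not in team_of), None)

    while True:
        cand_a, cand_b = next_unseen(ranking_a), next_unseen(ranking_b)
        if cand_a is None and cand_b is None:
            break
        # The team with fewer picks goes next; ties are broken by coin flip.
        a_turn = picks["A"] < picks["B"] or (
            picks["A"] == picks["B"] and rng.random() < 0.5)
        if (a_turn and cand_a is not None) or cand_b is None:
            doc, team = cand_a, "A"
        else:
            doc, team = cand_b, "B"
        interleaved.append(doc)
        team_of[doc] = team
        picks[team] += 1
    return interleaved, team_of


def credit_clicks(team_of, clicked_docs):
    """Count clicks per team; the team with more clicks wins the query."""
    wins = {"A": 0, "B": 0}
    for doc in clicked_docs:
        if doc in team_of:
            wins[team_of[doc]] += 1
    return wins


shown, teams = team_draft_interleave(
    ["d1", "d2", "d3"], ["d2", "d4", "d1"], rng=random.Random(0))
print(shown, credit_clicks(teams, clicked_docs=[shown[0]]))
```

In this plain form every query counts equally towards the overall winner, which is exactly the assumption the abstract proposes to relax for long-tailed query distributions.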
How does ChatGPT work: an Information Retrieval perspectiveSease
In this talk, we will explore the underlying mechanisms of ChatGPT, a large-scale language model developed by OpenAI, from the perspective of Information Retrieval (IR). We will delve into the process of training the model using massive amounts of data, the techniques used to optimize the model’s performance, and how the IR concepts such as tokenization, vectorization, and ranking are used in generating responses. We will also discuss how ChatGPT handles contextual understanding and how it leverages the power of transfer learning to generate high-quality and relevant responses. Software engineers will gain insights into how a modern conversational AI system like ChatGPT works, providing a better understanding of its strengths and limitations, and how to best integrate it into their software applications.
This abstract has been fully written by ChatGPT with the simple prompt in input <Write an abstract for a talk called “How does ChatGPT work? An Information Retrieval perspective”, the audience is software engineers>.
How To Implement Your Online Search Quality Evaluation With KibanaSease
Online testing remains the optimal way to prove how your ranking model performs in your real-world scenario. It can lead to many advantages such as having a direct interpretation of the results and confirming the estimation of offline tests. It gives a better understanding of the ranking model behaviour and builds a solid foundation to learn from to improve it.
Nowadays, the available evaluation tools have some limitations and in this talk, we will describe an alternative and customised approach for evaluating ranking models through the use of Kibana.
First of all, we give an overview of online testing, highlighting the pros and cons and describing the state-of-the-art.
We then dive into Kibana’s implementation and the reasons behind it. We will explore the tools Kibana provides, with their constraints for real-world applications, and show, through practical examples, how to create dashboards (with queries and code) to compare different models.
Learning To Rank has been the first integration of machine learning techniques with Apache Solr allowing you to improve the ranking of your search results using training data.
One limitation is that documents have to contain the keywords that the user typed in the search box in order to be retrieved (and then reranked). For example, the query “jaguar” won’t retrieve documents containing only the terms “panthera onca”. This is called the vocabulary mismatch problem.
Neural search is an Artificial Intelligence technique that allows a search engine to reach those documents that are semantically similar to the user’s information need without necessarily containing those query terms; it learns the similarity of terms and sentences in your collection through deep neural networks and numerical vector representation (so no manual synonyms are needed!).
This talk explores the first Apache Solr official contribution about this topic, available from Apache Solr 9.0.
We start with an overview of neural search (Don’t worry - we keep it simple!): we describe vector representations for queries and documents, and how Approximate K-Nearest Neighbor (KNN) vector search works. We show how neural search can be used along with deep learning techniques (e.g., BERT) or directly on vector data, and how we implemented this feature in Apache Solr, giving usage examples!
Join us as we explore this new exciting Apache Solr feature and learn how you can leverage it to improve your search experience!
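As a rough illustration of the idea behind vector search, the sketch below ranks documents by cosine similarity between a query vector and document vectors. It is an exact, brute-force scan over hand-made toy vectors, not the approximate nearest-neighbour index used in Apache Solr, and in practice the vectors would come from a trained model such as BERT. All names and numbers are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_search(query_vector, doc_vectors, k):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), doc_id)
              for doc_id, vec in doc_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional "embeddings" (assumed values, not model output).
doc_vectors = {
    "jaguar_car": [0.9, 0.1, 0.0],
    "panthera_onca": [0.1, 0.9, 0.2],
    "python_snake": [0.0, 0.6, 0.8],
}
query_vector = [0.2, 0.9, 0.1]  # stands in for "jaguar" meant as an animal

print(knn_search(query_vector, doc_vectors, k=2))
# ['panthera_onca', 'python_snake']
```

The document about "panthera onca" is retrieved even though it shares no keyword with the query, which is the behaviour that addresses the vocabulary mismatch problem. An approximate index trades a small loss in recall for avoiding the full scan shown here.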
SHARE Virtual Discovery Environment (Share-VDE) is a library-driven initiative that brings together the bibliographic catalogues and authority files of a community of libraries in a shared discovery environment based on linked data.
One of the main challenges is the massive amount of data the system is supposed to manage in terms of Search, Manipulation, and Presentation.
Dense Retrieval with Apache Solr Neural Search.pdfSease
Neural Search is an industry derivation from the academic field of Neural Information Retrieval. More and more frequently, we hear about how Artificial Intelligence (AI) permeates every aspect of our lives, and this also includes software engineering and Information Retrieval.
In particular, the advent of Deep Learning introduced the use of deep neural networks to solve complex problems that could not be solved simply by an algorithm. Deep Learning can be used to produce a vector representation of both the query and the documents in a corpus of information. Search, in general, comprises four primary steps:
- generate a representation of the query that describes the information need
- generate a representation of the document that captures the information contained in it
- match the query and the document representations from the corpus of information
- assign a score to each matched document in order to establish a meaningful document ranking by relevance in the results.
With the Neural Search module, Apache Solr is introducing support for neural network based techniques that can improve these four aspects of search.
Neural Search Comes to Apache Solr_ Approximate Nearest Neighbor, BERT and Mo...Sease
The first integrations of machine learning techniques with search made it possible to improve the ranking of your search results (Learning To Rank) – but one limitation has always been that documents had to contain the keywords that the user typed in the search box in order to be retrieved. For example, the query “tiger” won’t retrieve documents containing only the terms “panthera tigris”. This is called the vocabulary mismatch problem and over the years it has been mitigated through query and document expansion approaches.
Neural search is an Artificial Intelligence technique that allows a search engine to reach those documents that are semantically similar to the user’s query without necessarily containing those terms; it avoids the need for long lists of synonyms by automatically learning the similarity of terms and sentences in your collection through the utilisation of deep neural networks and numerical vector representation.
Word2Vec model to generate synonyms on the fly in Apache Lucene.pdfSease
If you want to expand your query/documents with synonyms in Apache Lucene, you need to have a predefined file containing the list of terms that share the same semantics. It’s not always easy to find a list of basic synonyms for a language and, even if you find it, it doesn’t necessarily match your contextual domain.
The term “daemon” in the domain of operating system articles is not a synonym of “devil” but it’s closer to the term “process”.
Word2Vec is a two-layer neural network that takes as input a text and outputs a vector representation for each word in the dictionary. Two words with similar meanings are identified with two vectors close to each other.
How to cache your searches_ an open source implementation.pptxSease
Caches are used in IT systems to store data in dedicated structures for fast access so that future requests can be served faster. They are an effective tool to store the query results and speed up future query executions in information retrieval systems.
An open-source system like Apache Solr uses three different caches: queryResultCache, filterCache, and documentCache.
In this talk, we will focus on queryResultCache and filterCache and we will see, through practical examples, how they are used to handle different types of queries.
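To illustrate why caching query results speeds up repeated searches, here is a generic least-recently-used (LRU) cache sketch keyed by the query string. It is not Apache Solr's cache implementation or configuration; the class name, the key format and the stored values are assumptions made for the example.

```python
from collections import OrderedDict


class LRUQueryCache:
    """Tiny LRU cache: query key -> cached result (e.g. a list of doc ids)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        return None

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used


cache = LRUQueryCache(max_size=2)
cache.put("q=solr&sort=score desc", [3, 7, 9])
cache.put("q=lucene&sort=score desc", [1, 2])
cache.get("q=solr&sort=score desc")           # refreshes the first entry
cache.put("q=kibana&sort=score desc", [5])    # evicts the "lucene" entry
print(cache.get("q=lucene&sort=score desc"))  # None
```

A cache of ordered result lists and a cache of unordered filter matches differ mainly in what they store as the value; the eviction logic sketched here would be the same for both.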
Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...Sease
RRE is an open-source search quality evaluation tool that can be used to produce a set of reports about the quality of a system, iteration after iteration, and that can be integrated within a continuous integration infrastructure to monitor quality metrics after each release.
Many aspects remained problematic though:
– how to directly evaluate a middle layer search-API that communicates with Apache Solr or Elasticsearch?
– how to easily generate explicit and implicit ratings without spending hours on tedious json files?
– how to better explore the evaluation results? with nice widgets and interesting insights?
Rated Ranking Evaluator Enterprise solves these problems and much more.
Join us as we introduce the next generation of open-source search quality evaluation tools, exploring the internals and real-world scenarios!
This presentation will start by introducing how Apache Lucene can be used to classify documents using data structures that already exist in your index instead of having to generate and supply external training sets. The focus will be on extensions of the Lucene Classification module that come in Lucene 6.0 and the Lucene Classification module's incorporation into Solr 6.1. These extensions will allow you to classify at a document level with individual field weighting, numeric field support, lat/lon fields etc. The Solr ClassificationUpdateProcessor will be explored and how to use it including basic and advanced features like multi class support and classification context filtering. The presentation will include practical examples and real world use cases.
Advanced Document Similarity with Apache LuceneSease
Whether your core domain involves real-world entities (such as hotels, restaurants, cars ...) or text documents, searching for entities similar to a given one is a very common use case for most systems that involve information retrieval. This presentation will start by describing how widespread this problem is across a variety of scenarios and how you can use the More Like This feature in the Apache Lucene library to solve it. Building on the introduction, the focus will be on how the More Like This module works internally, all the components involved end to end, the BM25 text similarity metric, and how it has been included through a conspicuous refactoring and testing process. The presentation will include real-world usage examples and future developments such as improved query building through positional phrase queries and term relevancy scoring pluggability.
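For reference, the BM25 metric mentioned above can be sketched in its textbook form as follows. This is a standalone illustration over pre-tokenised toy documents, not the Lucene implementation (whose exact normalisation details may differ), and the default values for `k1` and `b` are the commonly quoted ones rather than anything stated in the presentation.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenised document against the query with textbook BM25."""
    if not documents:
        return []
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalisation: long documents are penalised via b.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


documents = [
    ["hotel", "rome", "pool"],
    ["hotel", "paris"],
    ["car", "rental", "rome"],
]
# In a "more like this" setting, the query terms would be the most
# representative terms extracted from the input document.
print(bm25_scores(["hotel", "rome"], documents))
```

The first document matches both query terms and therefore scores highest; the other two match one term each, with the shorter one scoring slightly higher because of length normalisation.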
Search Quality Evaluation: a Developer PerspectiveSease
Search quality evaluation is an evergreen topic every search engineer ordinarily struggles with. Improving the correctness and effectiveness of a search system requires a set of tools that help measure the direction in which the system is going.
The slides will focus on how a search quality evaluation tool can be seen from a practical developer perspective, how it could be used for producing a deliverable artifact and how it could be integrated within a continuous integration infrastructure.
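As an example of the kind of measurement such a tool produces, the sketch below computes two common offline metrics, precision at k and nDCG at k, from the graded relevance ratings of a ranked result list. The function names are hypothetical and the nDCG variant uses linear gain; this is not the code of any specific evaluation tool.

```python
import math


def precision_at_k(ratings, k):
    """Fraction of the top-k results with a rating above zero."""
    return sum(1 for r in ratings[:k] if r > 0) / k


def dcg_at_k(ratings, k):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(r / math.log2(rank + 2) for rank, r in enumerate(ratings[:k]))


def ndcg_at_k(ratings, k):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(ratings, reverse=True), k)
    return dcg_at_k(ratings, k) / ideal if ideal > 0 else 0.0


# Ratings of the results returned for one query, in ranked order
# (0 = not relevant, 3 = highly relevant); values are made up.
ratings = [3, 0, 2, 1]
print(precision_at_k(ratings, 2), round(ndcg_at_k(ratings, 4), 3))
```

Tracking such numbers per query and per release in a continuous integration run is what makes it possible to see whether a change moved search quality in the intended direction.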
Music Information Retrieval is about retrieving information from music entities.
The slides will introduce the basic concepts of the music language, pass through different kinds of music representations, and end by describing some low-level features that are used when dealing with music entities.
Rated Ranking Evaluator: an Open Source Approach for Search Quality EvaluationSease
To provide a standard, unified and approachable technology, we developed the Rated Ranking Evaluator (RRE), an open source tool for evaluating and measuring the search quality of a given search infrastructure. RRE is modular, compatible with multiple search technologies and easy to extend. It is composed of a core library and a set of modules and plugins that give it the flexibility to be integrated into automated evaluation processes and continuous integration flows.
This talk will introduce RRE, describe its latest developments and demonstrate how it can be integrated into a project to measure and assess the search quality of your search application.
In the last few years, Artificial Intelligence applications have become more and more sophisticated and often operate like algorithmic “black boxes” for decision-making. Due to this fact, some questions naturally arise when working with these models: why should we trust a certain decision taken by these algorithms? Why and how was this prediction made? Which variables mostly influenced the prediction? The most crucial challenge with complex machine learning models is therefore their interpretability and explainability. This talk aims to illustrate an overview of the most popular explainability techniques and their application in Learning to Rank. In particular, we will examine in depth a powerful library called SHAP with both theoretical and practical insights; we will talk about its amazing tools to give an explanation of the model behaviour, especially how each feature impacts the model’s output, and we will explain to you how to interpret the results in a Learning to Rank scenario.
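The quantity that SHAP-style explanations are built on is the Shapley value of each feature. The sketch below computes exact Shapley values by enumerating all feature subsets for a tiny, made-up additive scoring function; it does not use the SHAP library or its API, and the feature names and weights are invented. Exact enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley value of each feature for a set function value_fn."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                gain = value_fn(coalition | {feature}) - value_fn(coalition)
                total += weight * gain
        result[feature] = total
    return result


# Hypothetical ranking model: the score is a sum of per-feature
# contributions, counted only for the features that are "present".
CONTRIBUTIONS = {"text_score": 2.0, "freshness": 1.0, "click_rate": 0.5}


def toy_model_score(present_features):
    return sum(CONTRIBUTIONS[f] for f in present_features)


print(shapley_values(toy_model_score, list(CONTRIBUTIONS)))
```

For an additive model like this one, each feature's Shapley value equals its own contribution, and the values sum to the difference between the full score and the empty-set score. That additivity is what lets the per-feature values be read as "how much each feature moved the model output".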
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology pushes into IT, I was wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our lovely cloud native principles as well? What benefits could both technologies bring to each other?
Let me take these questions and take you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and get it to work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into which approaches I already have working for real.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this is illustrated with link prediction over knowledge graphs, but the argument is general.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by Rik Marselis and me from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We ended with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
When SDMX meets AI-Leveraging Open Source LLMs To Make Official Statistics More Accessible And Discoverable.pptx
1. 29/10/2023
Alessandro Benedetti
When SDMX Meets AI:
Leveraging Open Source LLMs to Make Official
Statistics More Accessible and Discoverable
9th SDMX Global Conference
2. ‣ Born in Tarquinia (ancient Etruscan city in Italy)
‣ R&D Software Engineer
‣ Director
‣ Master's degree in Computer Science
‣ PC member for ECIR, SIGIR and Desires
‣ Apache Lucene/Solr PMC member/committer
‣ Elasticsearch/OpenSearch expert
‣ Passionate about semantic search, NLP and machine learning technologies
‣ Beach Volleyball player and Snowboarder
ALESSANDRO BENEDETTI
WHO AM I ?
3. ‣ Headquartered in London/distributed
‣ Open-source Enthusiasts
‣ Apache Lucene/Solr experts
‣ Elasticsearch/OpenSearch experts
‣ Community Contributors
‣ Active Researchers
‣ Hot Trends: Neural Search, Natural Language Processing, Learning To Rank, Document Similarity, Search Quality Evaluation, Relevance Tuning
SEArch SErvices
www.sease.io
4. AGENDA
LLMs and Open Source
SIS-CC to enable AI applications with SDMX
From Natural Language to structured queries
Findings and Future Work
5. AI, Machine learning and Deep Learning
https://sease.io/2021/07/artificial-intelligence-applied-to-search-introduction.html
6. WHAT IS A LARGE LANGUAGE MODEL?
● Transformers
● Next-token prediction and masked language modeling
● Estimate the likelihood of each possible word (in its vocabulary) given the previous sequence
● Learn the statistical structure of language
● Pre-trained on huge quantities of text
https://towardsdatascience.com/how-chatgpt-works-the-models-behind-the-bot-1ce5fca96286
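To make "estimate the likelihood of each possible word given the previous sequence" concrete, here is a toy sketch. It is not how a transformer works: it conditions only on the single previous word and uses raw counts over a made-up corpus, whereas an LLM conditions on the whole previous sequence. The function name and corpus are illustrative assumptions.

```python
from collections import Counter, defaultdict


def next_word_probabilities(corpus: str, previous: str) -> dict:
    """Estimate P(next word | previous word) from bigram counts (toy stand-in for an LLM)."""
    words = corpus.lower().split()
    followers = defaultdict(Counter)
    # Count how often each word follows each other word in the corpus.
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1
    counts = followers[previous.lower()]
    total = sum(counts.values())
    # Normalise the counts into probabilities; empty dict if the word was never seen.
    return {word: count / total for word, count in counts.items()} if total else {}


# Made-up corpus, for illustration only.
corpus = "emissions in australia emissions in europe emissions by sector"
probabilities = next_word_probabilities(corpus, "in")
print(probabilities)
```

In this toy corpus "in" is followed once by "australia" and once by "europe", so each gets half of the probability mass.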
7. OPEN SOURCE LARGE LANGUAGE MODELS
Search + LLMs KPI’s:
Operational
Search Session
Improve Search-driven
Business Metrics.
What KPI’s specific to
LLMs?
What Data metrics?
Combine Metric for
Business?
Focus on limited KPI’s
that impact business.
Track customers onsite
Behaviors for positive or
negative trends.
https://sease.io/2023/06/how-to-choose-the-right-large-language-model-for-your-domain-open-source-edition.html
● Generalists
○ Falcon
○ LLaMA
■ alpaca
■ vicuna
… many others!
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
8.
● OECD-led initiative (the Organisation for Economic Co-operation and Development)
● The Statistical Information System Collaboration Community
● .Stat Suite and Apache Solr
https://siscc.org/developers/technology/
SIS-CC TO ENABLE AI APPLICATIONS WITH SDMX
10. Use case example
Natural language query:
What were the sulfur oxide emissions in
Australia in 2013?
2 Requests to LLM
Search API
GENERATIVE EXTRACTIVE
{ "name":"Emissions of air pollutants",
"id":"ds-siscc-qa:DF_AIR_EMISSIONS"}] },
"Dimensions":[ "Country", "Pollutant", "Unit", "Year"]}, …
Statistic
Filters
{
"Country": ["0|Australia#AUS#"],
"Pollutant": ["0|Sulphur Oxides#SOX#"],
"Unit Of measure": ["0|Total man-made emissions#TOT#"],
"Year": "2013"
}
FROM NATURAL LANGUAGE TO STRUCTURED QUERIES
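The filter values on this slide follow a pattern such as "0|Australia#AUS#". The sketch below splits such a value into its parts. The format is inferred only from the examples in these slides: reading the leading number as a hierarchy level, and assuming labels never contain "#" or "|", are assumptions, and the names FacetNode and parse_facet_value are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class FacetNode(NamedTuple):
    label: str  # human-readable name, e.g. "Australia"
    code: str   # code between the '#' markers, e.g. "AUS"


def parse_facet_value(value: str) -> Tuple[int, List[FacetNode]]:
    """Split '<level>|<label>#<code>#|<label>#<code>#...' into a level and a path of nodes."""
    level_text, *segments = value.split("|")
    nodes = []
    for segment in segments:
        # Each segment looks like 'Label#CODE#'; the trailing '#' leaves an empty tail.
        label, code, _ = segment.split("#")
        nodes.append(FacetNode(label, code))
    return int(level_text), nodes


level, path = parse_facet_value("1|Environment#ENV#|Air and climate#ENV_AC#")
print(level, [node.code for node in path])
```

Keeping label and code separate means a later step can build a query from the codes and still show the labels to the user.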
11. ● How do you update this approach over time?
● New large language models?
● Better prompts?
● How to fit user interactions?
IMPROVING OVER TIME
12. RESULTS: What were the sulfur oxide emissions in Australia in 2013
GPT Generative answer is:
['Sulfur dioxide emissions', 'Air
pollution', 'Environmental impact', 'Fossil
fuel combustion', 'Acid rain']
GPT Extractive answer is:
{'srQMgwl_en_ss': ['1|Environment#ENV#|Air
and climate#ENV_AC#'], 'dimensions_en_ss':
['Time period', 'Reference area',
'Pollutant', 'Country']}
Dataflow retrieved is:
[{'id': 'ds-siscc-qa:DF_AIR_EMISSIONS', 'name':
'Emissions of air pollutants', 'description':
''}]
4 dimensions are available for the above
dataflow:
['Country', 'Year', 'Pollutant', 'Variable']
{
"dataflow": [
{
"description": "",
"id": "ds-siscc-qa:DF_AIR_EMISSIONS",
"name": "Emissions of air pollutants"
}
],
"filters": {
"Country": "0|Australia#AUS#",
"Pollutant": "0|Sulphur Oxides#SOX#",
"Variable": "0|Total man-made emissions#TOT#",
"Year": "2013"
},
"natural_language_query": "What were the sulfur oxide emissions in Australia in 2013"
}
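One possible guard for this step is to check the filter names proposed by the LLM against the dimensions actually available for the retrieved dataflow. This is a hypothetical sketch, not the code behind the presented results. The function name split_filters is an assumption, and the sample data reuses the "Unit Of measure" key from the earlier use-case slide, which is not among the four dimensions listed here.

```python
def split_filters(filters: dict, available_dimensions: list) -> tuple:
    """Separate filters whose dimension exists in the dataflow from those that do not."""
    allowed = set(available_dimensions)
    accepted = {name: value for name, value in filters.items() if name in allowed}
    rejected = {name: value for name, value in filters.items() if name not in allowed}
    return accepted, rejected


# Dimensions reported for the dataflow on this slide.
available = ["Country", "Year", "Pollutant", "Variable"]
# Filters as shown on the use-case slide (values simplified to plain strings).
llm_filters = {
    "Country": "0|Australia#AUS#",
    "Pollutant": "0|Sulphur Oxides#SOX#",
    "Unit Of measure": "0|Total man-made emissions#TOT#",
    "Year": "2013",
}
accepted, rejected = split_filters(llm_filters, available)
print(sorted(accepted))
print(rejected)
```

Rejected entries could be sent back to the LLM with the list of valid dimension names, or logged as test cases for prompt refinement.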
13. ● Promising! - LLMs are good at query expansion (generative or extractive)
● gpt-3.5-turbo-instruct -> new models can do much better!
● 4k tokens (for both prompt and response) are not enough
● Mistakes - Sometimes dimension values are associated with the wrong dimension; more prompt engineering is needed!
FINDINGS
14. ● [Solr] Fine-tune the dimension retrieval Solr query
● [LLM] Select the best model to date - Explore the state of the art (both commercial and open source)
● [LLM] Refine the prompts according to the model
● [LLM] Implement integration tests with the most common failures -> LLM/prompt engineering to solve them
● [Solr] Finalise the dataflow retrieval Solr query
● [Performance] Stress test the solution
● [Quality] Set up queries/expected documents
THE ROAD TO PRODUCTION
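The "[Quality] Set up queries/expected documents" item could be sketched as a small evaluation loop that scores where the expected dataflow appears in the results. Mean reciprocal rank is one common choice of metric and is an assumption here, as are the stub search function, the second query, and the dataflow id "ds-siscc-qa:DF_EXAMPLE".

```python
def reciprocal_rank(ranked_ids: list, expected_id: str) -> float:
    """Return 1/position of the expected document, or 0.0 if it was not retrieved."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == expected_id:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(cases: list, search) -> float:
    """Average the reciprocal rank over (query, expected dataflow id) pairs."""
    scores = [reciprocal_rank(search(query), expected) for query, expected in cases]
    return sum(scores) / len(scores)


def stub_search(query: str) -> list:
    # Stand-in for the real pipeline: always returns the same canned ranking.
    return ["ds-siscc-qa:DF_AIR_EMISSIONS", "ds-siscc-qa:DF_EXAMPLE"]


cases = [
    ("What were the sulfur oxide emissions in Australia in 2013?", "ds-siscc-qa:DF_AIR_EMISSIONS"),
    ("hypothetical second query", "ds-siscc-qa:DF_EXAMPLE"),  # made-up case
]
mrr = mean_reciprocal_rank(cases, stub_search)
print(mrr)
```

With the stub, the first case finds its dataflow at rank 1 and the second at rank 2, so the score is 0.75. Replacing stub_search with the real retrieval call would let the same cases track quality as models and prompts change.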
15. ● StatsBot - More conversation!
● Retrieval Augmented Generation
● Results Summarization
FUTURE WORK