With the rise of opinionated, full-featured frameworks and Object-Relational Mappers, we – as software developers – generally build systems where our data is directly linked to the current state of our application; one row in the database equates to one entity’s state within the system. Only ever knowing the current state of the data is adequate for many systems, but imagine the possibilities if one had access to the state of every entity at any given time, and to how that state was reached. Understanding the steps that led to the current state is just as important as understanding the state itself.
Enter Event Sourcing: instead of persisting the current state of our entities, we store historical events about our data. This pattern changes how we store and process our data, but is surprisingly lightweight and performant. In this talk I will present the basic concepts behind Event Sourcing and the positive implications it has on usability, visualization, and analytics within our applications. We’ll see how naturally it couples with the Event-oriented world of Reactive systems. Finally, we’ll examine some practical use cases and when one should and should not consider implementing the pattern. Event Sourcing will completely change how you think about your data.
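To make the pattern concrete, here is a minimal, language-neutral sketch of Event Sourcing in Python (the account entity and event names are invented for illustration; this is not code from any of the talks below): state is never updated in place, it is derived by replaying an append-only journal, which is also what gives us "state at any point in time".

```python
from dataclasses import dataclass

# An append-only journal: the full history of what happened, never mutated.
@dataclass(frozen=True)
class Event:
    entity_id: str
    kind: str          # e.g. "FundsDeposited" -- illustrative names only
    amount: int = 0

journal: list[Event] = []

def append(event: Event) -> None:
    journal.append(event)  # we only ever add; there is no UPDATE or DELETE

def current_balance(entity_id: str, as_of: int | None = None) -> int:
    """Derive state by replaying history; `as_of` gives the state after
    the first `as_of` events, i.e. at an earlier point in time."""
    events = journal if as_of is None else journal[:as_of]
    balance = 0
    for e in events:
        if e.entity_id != entity_id:
            continue
        if e.kind == "FundsDeposited":
            balance += e.amount
        elif e.kind == "FundsWithdrawn":
            balance -= e.amount
    return balance

append(Event("acct-1", "AccountOpened"))
append(Event("acct-1", "FundsDeposited", 100))
append(Event("acct-1", "FundsWithdrawn", 30))
print(current_balance("acct-1"))           # 70: current state
print(current_balance("acct-1", as_of=2))  # 100: state as it was after two events
```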
In Onebip we developed a reporting system based on CQRS (Command Query Responsibility Segregation) and Event Sourcing using MongoDB.
In this talk I will introduce CQRS and Event Sourcing concepts, describe our path and the technical and conceptual challenges we faced, and cover the strengths of our solution and the areas where there's room for improvement.
Gr8conf US 2015 - Intro to Event Sourcing with Groovy by Steve Pember
As Grails developers we generally build systems where our data is directly linked to the current state of our application; one row in the database equates to one entity’s current state. Only ever knowing the current state of the data is adequate for many systems, but imagine the possibilities if one had access to the state of the data at any point in time.
Enter Event Sourcing: instead of persisting the current state of our Domain Objects, we store historical events about our data. This pattern changes how we store and process our data, but is surprisingly lightweight and performant. In this talk I will present the basic concepts of Event Sourcing and the positive effects it can have on analytics and performance. We’ll see how naturally it couples with the Event-oriented world of modern Reactive systems, and how easily it can be implemented in Groovy. We’ll examine some practical use cases and example implementations. Event Sourcing will change how you think about your data.
As businesses grow, so does the complexity of their software. New features, new models, and new background processes all continue to be added... and developers struggle to make sense of it all. Yet the end user demands a swift and functional experience when interacting with your application. It is paramount to be open to alternative patterns that help tame complex, high-demand services. Two such patterns are command-query responsibility segregation (CQRS) and event sourcing (ES).
Command-query responsibility segregation is an architectural pattern for user-facing applications that extends from the now standard Model-View-Controller (MVC) pattern and is an alternative to the CRUD pattern. At its core, CQRS is about changing how we think of and work with our data by introducing two types of models: all user actions become commands, and a read-only query model powers our views. Commands and queries are logically separated, providing additional decoupling of our application. CQRS also calls for changes in how we store and structure our data.
Enter event sourcing. Instead of persisting the current state of our domain objects or entities, we record historical events about our data. The key advantage is that we can examine our application data at any point in time, rather than just the current state. This pattern changes how we persist and process our data but is surprisingly efficient.
While each of the two patterns can be used exclusively, they complement each other beautifully and facilitate the construction of decoupled, scalable applications or individual services. Stephen Pember explores the fundamentals of each pattern and offers several examples and demonstration code to show how one might actually go about implementing CQRS and ES. Steve discusses task-based UIs and domain-driven design as he outlines some of the advantages—and challenges—that ThirdChannel has seen when developing systems using CQRS and ES over the past year.
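As a rough sketch of the command/query split described above (invented names and an in-memory store for illustration; this is not ThirdChannel's implementation): user actions flow through command handlers on the write side, while views read from a separate, read-only model kept in sync by a projection.

```python
# Write side: user actions arrive as explicit commands with their own handler.
class RenameProduct:
    def __init__(self, product_id: str, new_name: str):
        self.product_id = product_id
        self.new_name = new_name

write_store: dict[str, dict] = {}   # authoritative state (write model)
read_model: dict[str, str] = {}     # denormalized view that the UI queries

def handle(command: RenameProduct) -> None:
    """Command handler: validates, mutates the write model..."""
    product = write_store.setdefault(command.product_id, {})
    product["name"] = command.new_name
    project(command.product_id, command.new_name)  # ...then updates the read side

def project(product_id: str, name: str) -> None:
    """Projection: keeps the read model current; queries never touch the write store."""
    read_model[product_id] = name

def query_product_name(product_id: str) -> str | None:
    return read_model.get(product_id)  # read-only; no business logic here

handle(RenameProduct("p-1", "Widget Pro"))
print(query_product_name("p-1"))  # Widget Pro
```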
Market Basket Analysis in SQL Server Machine Learning Services by Luca Zavarella
Market Basket Analysis is a methodology for identifying relationships among the large numbers of products purchased by different consumers. It was born as a Data Mining technique to support cross-selling and the shelf placement of products, but it is also used in medical diagnosis, in bioinformatics, in the analysis of society on the basis of personal data, and more. In this session we will see how the new Machine Learning Services allow us to derive insights from this analysis directly in SQL Server, using the R programming language.
Observability - The Good, the Bad and the Ugly (XP Days 2019, Kiev, Ukraine) by Aleksandr Tavgen
A talk about approaches to observability. Do we need millions of metrics? Anomalies vs. regularities? Can Machine Learning help us? We'll also look at some capabilities of the Flux language by InfluxData.
It covers the general problem of building monitoring and observability without killing your Ops team's motivation with false positives and unexplained alerts.
We'll look at the problems in this space, the pitfalls and anti-patterns, and how to get it right.
How to manage a monitoring zoo, and the spaghettification of dashboards. Why Uber needs 9 billion metrics (¯\_(ツ)_/¯) and why this is an anti-pattern. Metrics as a stream of data. We'll talk about the new Flux language from InfluxData, with a bit of time series analysis and the definition of pipelines in Flux for metrics data. A drunkard's walk on your metrics, or why to measure randomness.
Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen (InfluxData)
Aleksandr Tavgen from Playtech, the world’s largest online gambling software supplier, will share how they are using InfluxDB 2.0, Flux, and the OpenTracing API to gain full observability of their platform. In addition, he will share how InfluxDB has served as the glue to cope with multiple sets of time series data, especially in the case of understanding online user activity — a use case that is normally difficult without the math functions now available with Flux.
Using Time Series for Full Observability of a SaaS Platform (DevOps.com)
Aleksandr Tavgen from Playtech, the world’s largest online gambling software supplier, will share how they are using InfluxDB 2.0, Flux, and the OpenTracing API to gain full observability of their platform. In addition, he will share how InfluxDB has served as the glue to cope with multiple sets of time series data.
ChatGPT
Data analysis is the process of inspecting, cleaning, transforming, and modeling data to discover useful information, draw conclusions, and support decision-making. It involves applying various techniques and methods to extract insights from data sets, often with the goal of uncovering patterns, trends, relationships, or making predictions.
Here's an overview of the key steps and techniques involved in data analysis:
Data Collection: The first step in data analysis is gathering relevant data from various sources. This can include structured data from databases, spreadsheets, or surveys, as well as unstructured data such as text documents, social media posts, or sensor readings.
Data Cleaning and Preprocessing: Once the data is collected, it often needs to be cleaned and preprocessed to ensure its quality and suitability for analysis. This involves handling missing values, removing duplicates, addressing inconsistencies, and transforming data into a suitable format for analysis.
Exploratory Data Analysis (EDA): EDA involves examining and understanding the data through summary statistics, visualizations, and statistical techniques. It helps identify patterns, distributions, outliers, and potential relationships between variables. EDA also helps in formulating hypotheses and guiding further analysis.
Data Modeling and Statistical Analysis: In this step, various statistical techniques and models are applied to the data to gain deeper insights. This can include descriptive statistics, inferential statistics, hypothesis testing, regression analysis, time series analysis, clustering, classification, and more. The choice of techniques depends on the nature of the data and the research questions being addressed.
Data Visualization: Data visualization plays a crucial role in data analysis. It involves creating meaningful and visually appealing representations of data through charts, graphs, plots, and interactive dashboards. Visualizations help in communicating insights effectively and spotting trends or patterns that may be difficult to identify in raw data.
Interpretation and Conclusion: Once the analysis is performed, the findings need to be interpreted in the context of the problem or research objectives. Conclusions are drawn based on the results, and recommendations or insights are provided to stakeholders or decision-makers.
Reporting and Communication: The final step is to present the results and findings of the data analysis in a clear and concise manner. This can be in the form of reports, presentations, or interactive visualizations. Effective communication of the analysis results is crucial for stakeholders to understand and make informed decisions based on the insights gained.
Data analysis is widely used in various fields, including business, finance, marketing, healthcare, social sciences, and more. It plays a crucial role in extracting value from data, supporting evidence-based decision-making, and driving actionable insights.
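To make the steps above concrete, here is a toy sketch of a cleaning-to-reporting pipeline using pandas (the column names and values are invented for illustration):

```python
import pandas as pd

# Steps 1-2: collect and clean (inline data stands in for a real source here).
df = pd.DataFrame({
    "region": ["north", "north", "south", "south", None],
    "sales":  [120.0, 95.0, 80.0, None, 60.0],
})
df = df.dropna()            # handle missing values
df = df.drop_duplicates()   # remove duplicate records

# Step 3: exploratory data analysis via summary statistics.
print(df.describe())        # count, mean, std, quartiles of numeric columns

# Steps 4-6: a simple aggregation whose result one could visualize or report.
by_region = df.groupby("region")["sales"].mean()
print(by_region)            # mean sales per region
```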
Sean Kandel - Data profiling: Assessing the overall content and quality of a data set (huguk)
The task of “data profiling”—assessing the overall content and quality of a data set—is a core aspect of the analytic experience. Traditionally, profiling was a fairly cut-and-dried task: load the raw numbers into a stat package, run some basic descriptive statistics, and report the output in a summary file or perhaps a simple data visualization. However, data volumes can be so large today that traditional tools and methods for computing descriptive statistics become intractable; even with scalable infrastructure like Hadoop, aggressive optimization and statistical approximation techniques must be used. In this talk Sean will cover technical challenges in keeping data profiling agile in the Big Data era. He will discuss both research results and real-world best practices used by analysts in the field, including methods for sampling, summarizing and sketching data, and the pros and cons of using these various approaches.
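One classic approach to the sampling problem described here is reservoir sampling, sketched below as a generic illustration (a standard technique, not necessarily the method Sean presents): it keeps a fixed-size uniform random sample of a stream far too large to hold in memory, which is exactly the regime the talk describes.

```python
import random
from typing import Iterable, TypeVar

T = TypeVar("T")

def reservoir_sample(stream: Iterable[T], k: int, seed: int = 42) -> list[T]:
    """Keep a uniform random sample of k items from a stream of unknown
    length, using O(k) memory: item i (0-based) replaces a reservoir slot
    with probability k / (i + 1)."""
    rng = random.Random(seed)
    reservoir: list[T] = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)   # uniform over [0, i]
            if j < k:
                reservoir[j] = item
    return reservoir

# Profile a "huge" stream by computing statistics on a small sample of it.
sample = reservoir_sample(range(10_000_000), k=1000)
print(sum(sample) / len(sample))  # approximates the stream mean (~5,000,000)
```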
Sean is Trifacta’s Chief Technical Officer. He completed his Ph.D. at Stanford University, where his research focused on user interfaces for database systems. At Stanford, Sean led development of new tools for data transformation and discovery, such as Data Wrangler. He previously worked as a data analyst at Citadel Investment Group.
Lecture 2: Data, pre-processing and post-processing
Chapters 2,3 from the book “Introduction to Data Mining” by Tan, Steinbach, Kumar.
Chapter 1 from the book Mining Massive Datasets by Anand Rajaraman and Jeff Ullman
Besides Circulation, How else is the print collection being used? Reporting o... by Ray Schwartz
Given the diminishing circulation of our print collections, William Paterson University has decided to aggressively count all kinds of uses of our print volumes. One is the counting of ‘browses’. With the upgrade of Voyager 9.0, we are able to mine the event table for such data. However, there are caveats as to what is actually being recorded. This session will go over the pitfalls and challenges of using the event table and illustrate the graphing of both historical circulation and browsing data.
Telemetry allows the long-term collection of data over days, weeks or months from animals in their home cages. Performing chronic studies presents a number of opportunities in terms of experimental design but also results in the collection of very large data sets. Large data sets come with challenges for the collection and analysis of the data. This webinar covers common issues encountered when acquiring and analyzing large data sets from chronic telemetry studies, and potential solutions such as scheduling data sampling and automating data collection and analysis. ADInstruments LabChart will be used as an example of how this can be achieved using macros in data acquisition systems. The LabChart macros introduced in this webinar are available for download here.
Association rules are a data mining technique used to discover interesting relationships or associations among a set of items or variables in large datasets. The technique is commonly applied in market basket analysis, where the goal is to find relationships between items frequently purchased together.
Here's an overview of association rules:
Itemset: An itemset is a collection of items that appear together in a transaction or record. In market basket analysis, an itemset can represent a combination of products purchased together.
Support: Support is a measure of how frequently an itemset appears in the dataset. It is calculated as the ratio of the number of transactions containing the itemset to the total number of transactions. High support indicates that the itemset occurs frequently.
Confidence: Confidence measures the strength of an association between two itemsets. It is calculated as the ratio of the number of transactions containing both itemsets to the number of transactions containing the first itemset. High confidence indicates a strong association.
Lift: Lift is a measure of the strength of the association between two itemsets, taking into account the expected frequency of the itemset occurring by chance. It is calculated as the ratio of the observed support to the expected support if the two itemsets were independent. Lift greater than 1 indicates a positive association.
Apriori Algorithm: The Apriori algorithm is a popular algorithm used to mine association rules. It starts by identifying frequent itemsets with support above a specified threshold. Then, it generates candidate itemsets and prunes those that do not meet the minimum support requirement. The process is repeated iteratively until no more frequent itemsets can be found.
Association Rules: Association rules are the final output of the analysis. They consist of an antecedent (a set of items) and a consequent (another set of items). The rules indicate that if the antecedent is present, the consequent is likely to be present as well. The rules are typically represented in the form antecedent → consequent with associated measures like support, confidence, and lift.
Interpretation and Application: Association rules provide valuable insights into the relationships between items or variables. They can be used for decision-making, marketing strategies, product recommendations, and cross-selling opportunities. For example, a retailer can use association rules to identify product bundles, optimize store layouts, or design targeted marketing campaigns.
It's important to note that association rules are sensitive to data quality, transaction size, and the choice of support and confidence thresholds. Fine-tuning these parameters and domain expertise are essential to obtain meaningful and actionable results.
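To tie the definitions above together, here is a small worked example that computes support, confidence, and lift directly from their definitions (the basket data is invented):

```python
# Each transaction is the set of items in one basket (toy data).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter"},
    {"bread", "butter"},
]
n = len(transactions)

def support(itemset: set) -> float:
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / n

antecedent, consequent = {"bread"}, {"butter"}

supp_both = support(antecedent | consequent)   # P(bread and butter together)
confidence = supp_both / support(antecedent)   # P(butter | bread)
lift = confidence / support(consequent)        # observed vs. independence

print(f"support={supp_both:.2f} confidence={confidence:.2f} lift={lift:.2f}")
# support=0.60 confidence=0.75 lift=0.94 -> lift < 1: a slightly negative
# association in this toy data, despite the high confidence.
```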
If you have any specific questions or need further clarification on association rules, feel free to ask!
This is a follow-up to a previous talk on hacking my energy monitor. In this talk I go into detail on how I used Machine Learning techniques from the area of Anomaly Detection to draw more value from my data collection.
Spring I/O 2024 - Flexible Spring with Event Sourcing by Steve Pember
Event Sourcing is a modern but non-trivial data model for building scalable and powerful systems. Instead of mapping a single Entity to a single row in a datastore, in an Event Sourced system we persist all changes for an Entity in an append-only journal. This design provides a wealth of benefits: a built-in Audit Trail, Time-Based reporting, powerful Error Recovery, and more. It creates flexible, scalable systems and can easily evolve to meet changing organizational demands. That is, once you have some experience with it. Event Sourcing is straightforward in concept, but it does bring additional complexity and a learning curve that can be intimidating. People coming from traditional ORM systems often wonder: how does one model relations between Entities? How is Optimistic Locking handled? What about datastore constraints?
Based on over eight years of experience with building ES systems in Spring applications, we will demonstrate the basics of Event Sourcing and some of the common patterns. We will see how simple it can be to model events with available tools like Spring Data JPA, JOOQ, and the integration between Spring and Axon. We’ll walk through sample code in an application that demonstrates many of these techniques. However, it’s also not strictly about the code; we’ll see how a process called Event Modeling can be a powerful design tool to align Subject Matter Experts, Product, and Engineering. Attendees will leave with an understanding of the basic Event Sourcing patterns, and hopefully a desire to start creating their own Journals.
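One of the questions raised above, optimistic locking, is commonly answered in Event Sourced systems with an expected-version check at append time. A minimal sketch of that general pattern follows (not the Axon or Spring Data API; the names are illustrative):

```python
class ConcurrencyError(Exception):
    """Raised when another writer appended to the stream since we loaded it."""

# journal maps an entity's stream id to its ordered list of events
journal: dict[str, list[dict]] = {}

def append_events(stream_id: str, expected_version: int, events: list[dict]) -> int:
    """Append only if the stream is still at the version the caller loaded.
    This is optimistic locking: no lock is held while the caller works."""
    stream = journal.setdefault(stream_id, [])
    if len(stream) != expected_version:
        raise ConcurrencyError(
            f"expected version {expected_version}, stream is at {len(stream)}"
        )
    stream.extend(events)
    return len(stream)  # the new version

# Writer A and writer B both load the stream at version 0...
append_events("order-1", expected_version=0, events=[{"kind": "OrderPlaced"}])
try:
    # ...so writer B's append is rejected and it must reload and retry.
    append_events("order-1", expected_version=0, events=[{"kind": "OrderCancelled"}])
except ConcurrencyError as e:
    print("retry needed:", e)
```

In a relational event store the same guarantee typically falls out of a unique constraint on (stream_id, version): a concurrent writer's insert simply fails and triggers a reload-and-retry.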
Anatomy of a Spring Boot App with Clean Architecture - Spring I/O 2023 by Steve Pember
In this presentation we will introduce the general philosophy of Clean Architecture, Hexagonal Architecture, and Ports & Adapters, discussing why these approaches are useful and offering general guidelines for introducing them to your code. Chiefly, we will show how to implement these patterns within your Spring (Boot) applications. Through a publicly available reference app, we will demonstrate what these concepts can look like within Spring and walk through a handful of scenarios: isolating core business logic, ease of testing, and adding a new feature or two.
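The Ports & Adapters idea can be pictured compactly (a generic sketch with invented names, not the reference app's code): the core depends only on a port (an interface it owns), and infrastructure plugs in behind it as an adapter.

```python
from typing import Protocol

# Port: an interface owned by the core; it knows nothing about infrastructure.
class OrderRepository(Protocol):
    def save(self, order_id: str, total: float) -> None: ...

# Core business logic depends only on the port, so it is trivial to test.
def place_order(repo: OrderRepository, order_id: str, total: float) -> None:
    if total <= 0:
        raise ValueError("order total must be positive")
    repo.save(order_id, total)

# Adapter: an infrastructure detail that satisfies the port. Swapping this
# for a JPA- or JOOQ-backed adapter would not touch place_order at all.
class InMemoryOrderRepository:
    def __init__(self) -> None:
        self.rows: dict[str, float] = {}
    def save(self, order_id: str, total: float) -> None:
        self.rows[order_id] = total

repo = InMemoryOrderRepository()
place_order(repo, "o-42", 19.99)
print(repo.rows)  # {'o-42': 19.99}
```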
Similar to Richer Data History with Event Sourcing
SACon 2019 - Surviving in a Microservices Environment by Steve Pember
Many presentations on microservices offer a high-level view of the architecture; rarely do you hear what it’s like to work in such an environment. Stephen Pember shares his experience migrating from a monolith to microservices across several companies, highlighting the mistakes made along the way and offering advice.
Surviving in a Microservices Environment (abridged) by Steve Pember
Many presentations on Microservices offer a high-level view; rarely does one hear what it’s like to work in such an environment. Individual services are somewhat trivial to develop, but now you suddenly have countless others to track. You’ll become obsessed over how they communicate. You’ll have to start referring to the whole thing as “the Platform”. You will have to take on some considerable DevOps work and start learning about deployment pipelines, metrics, and logging.
Don’t panic. In this presentation we’ll discuss what we learned over the past four years by highlighting our mistakes. We’ll examine what a development lifecycle might look like for adding a new service, developing a feature, or fixing bugs. We’ll see how team communication is more important than one might realize. Most importantly, we’ll show how - while an individual service is simple - the infrastructure demands are now much more complicated: your organization will need to introduce and become increasingly dependent on various technologies, procedures, and tools - ranging from the ELK stack to Grafana to Kubernetes. Lastly, you’ll come away with the understanding that your resident SREs will become the most valued members of your team.
Over the past few years, Gradle has become a popular build tool in the JVM space. This is not surprising, considering the power and the features it brings compared with its competitors. However, one thing Gradle lacks is the history and collective knowledge that other alternatives enjoy: how does one organize a Gradle project in an ‘idiomatic’ fashion?
We feel that we’ve put together a decent build pipeline for each of our microservices over the years, and each one starts with its build.gradle file(s). We’d like to share it, although we’re not sure if it’s the ‘correct’ way.
In this talk, we’ll walk through a sample project structure and build process. We’ll discuss the various checks and tools we use (e.g. Sonar, CodeNarc, Jenkins) at each step of the build. We’ll explain how each of the components in the process works for us, and share samples of our Groovy scripts. Most importantly, though, we’d like to hear what the audience are using in their builds!
Many presentations on Microservices offer a high level view; rarely does one hear what it’s like to work in such an environment. Individual services are somewhat trivial to develop, but now you suddenly have countless others to track. You’ll become obsessed over how they communicate. You’ll have to start referring to the whole thing as “the Platform”. You will have to take on some DevOps work and start learning about deployment pipelines, metrics, and logging.
Don’t panic. In this presentation we’ll discuss what we, at ThirdChannel, learned over the past four years. We’ll examine what a development lifecycle might look like for adding a new service, developing a feature, or fixing bugs. We’ll dive a bit into DevOps and see how one will become dependent on various metric and centralized logging tools, like Kubernetes and the ELK stack. Finally we’ll talk about team communication and organization… and how they are likely the most important tool for surviving a Microservices development team.
Reactive applications & reactive programming result in flexible, concise, performant code and are a superior alternative to the old thread-based programming model. The reactive approach has gained popularity for a simple reason: we need alternative designs and architectures to meet today’s demands. However, it can be difficult to shift one’s mind to think in reactive terms, particularly when one realizes that we must be Reactive up and down the entire programming stack.
In this talk we’ll explore what it means to be ‘Reactive’. We’ll examine some of the more interesting tools available to us, some of which come from the Groovy community. Specifically we’ll cover Ratpack, RxGroovy, React, and RabbitMq - along with examples and a sample implementation. We’ll demonstrate how effectively they can work together at each level of the stack - from the front end, to the back end, to handling http requests and message queue events - and how easy it can be to go Reactive all the way down.
Event storage offers many practical benefits to distributed systems, providing a complete record of state changes over time, but there are a number of challenges when building an event store mechanism. Stephen Pember explores some of the problems you may encounter and shares real-world patterns for working with event storage.
Harnessing Spark and Cassandra with Groovy by Steve Pember
This talk is an introduction to a powerful combination in the big data space: Apache Spark and Cassandra. Spark is a cluster-computing framework that allows users to perform calculations against resilient in-memory datasets using a functional programming interface. Cassandra is a linearly scalable, fault tolerant, decentralized datastore. These two technologies are complicated, but integrate well and provide such a level of utility that whole companies have formed around them.
In this talk we’ll learn how Spark and Cassandra can be leveraged within your Groovy Application: Spark normally asks for a Scala environment. We’ll talk about Spark and Cassandra from a high level and walk through code examples. We’ll discuss the pitfalls of working with these technologies - like modeling your data appropriately to ensure even distribution in Cassandra and general packaging woes with Spark - and ways to avoid them. Finally, we’ll explore how we at ThirdChannel are using these technologies.
Surviving in a microservices environment by Steve Pember
Many presentations on Microservices offer a high level view; rarely does one hear what it’s like to work in such an environment. Individual services are somewhat trivial to develop, but now you suddenly have countless others to track. You’ll become obsessed over how they communicate. You’ll have to start referring to the whole thing as “the Platform”. You will have to take on some DevOps work and start learning about deployment pipelines, metrics, and logging.
Don’t panic. In this presentation we’ll discuss what we learned over the past three years. We’ll examine what a development lifecycle might look like for adding a new service, developing a feature, or fixing bugs. We’ll dive a bit into DevOps and see how one will become dependent on various metric and centralized logging tools, like Kubernetes and the ELK stack. Finally we’ll talk about team communication and organization... and how they are likely the most important tool for surviving a Microservices development team.
Surviving in a Microservices Environment by Steve Pember
Cloud Native Microservice architectures have become increasingly popular over the past few years, and for good reasons: smaller, efficient codebases, finely targeted scaling options, and the ability to do continuous deployment along with continuous integration, among others. All potentially very powerful features. However - as with most things - Microservices bring tradeoffs in terms of application complexity: working with an individual service is easy; overall application development becomes increasingly complex. Perhaps too complex for your average web application.
Many presentations on the Microservice phenomena offer either a high level view on what it is, compare and contrast it with the Monolith pattern, or discuss how to migrate from a Monolith to Microservices, but rarely does one hear what it’s like to actually work in such an environment. Frankly, it can be intimidating for someone accustomed to a traditional monolithic development experience. Individual services are somewhat trivial to develop, but now you suddenly have countless others to keep track of. You may become lost: with all these services, is anyone directing the overall development? You’ll become obsessed over how and when they communicate. You’ll have to start referring to the application on the whole as “the Platform”. It’ll soon become difficult or even impossible to run the whole Platform on a development laptop. You may even have to take on some DevOps work, and start learning about deployment pipelines, and whole new worlds of metrics and logging.
Don’t panic. In this presentation we’ll discuss what we learned working with a Microservice platform for the past three years. We’ll cover what to expect when joining a Microservice team and what the situation will look like as the team size grows. We’ll see how critical inter-service testing strategies are to the success of the team. We’ll examine what a development lifecycle might look like for adding a new service, developing a new feature, or fixing bugs. We’ll dive a bit into DevOps and see how one will become dependent on various metric and centralized logging tools, like Kubernetes and the ELK stack. Finally we’ll talk about communication, team organization strategies, and how they are likely the most important tool for surviving a Microservices development team.
An introduction to Reactive applications, Reactive Streams, and options for the JVM by Steve Pember
The term “reactive” has lately become a buzzword, with a variety of definitions around the Web. When you hear “reactive,” what do you think of? Reactive Streams? The Reactive Manifesto? ReactJS? These terms may seem unrelated, but they share a common core concept.
Reactive applications and Reactive programming result in flexible, concise, performant code and are a superior alternative to the old, standard thread-based imperative programming model. The Reactive approach has gained popularity recently for one simple reason: we need alternative designs and architectures to meet today’s demands. However, it can be difficult to shift one’s mind to think in Reactive terms due to how accustomed we’ve become to the imperative style.
Stephen Pember explores the various definitions of Reactive and Reactive programming with the goal of providing techniques for building efficient, scalable applications. Steve dives into the key concepts of Reactive Streams and examines some sample implementations—including how ThirdChannel is currently using reactive libraries in production code. Steve looks at some of the open source options available in the JVM—including Reactor, RxJava, and Ratpack—giving you an idea of where to begin with the reactive ecosystem. If Reactive is new to you, this should be an excellent introduction.
Are you tired of Hibernate? Is GORM too heavy for your current project? Do you like having more control over your SQL? Do you like flexible DSLs? Try JOOQ!
JOOQ (Java Object Oriented Querying) is light-weight alternative to classic data access solutions or ORMs like Hibernate, JPA, JDBC, and GORM. JOOQ's goal is to give the developer a flexible DSL for building typesafe, database agnostic SQL queries, and attempts to convince the developer of a ‘database-first’ approach to building their application. In this talk we’ll quickly present an introduction to JOOQ from a high level, discuss its features, and see several examples of how we’re using JOOQ to great effect with many Spring Boot and Ratpack apps within our platform.
Reactive Streams and the Wide World of Groovy by Steve Pember
The concept of Reactive Streams (aka Reactive Extensions, Reactive Functional Programming, or simply Rx) has become increasingly popular recently, and with good reason. The Reactive Streams specification provides a universal abstraction for asynchronously processing data received across multiple sources (e.g. database, user input, third-party services), and includes mechanisms for controlling the rate at which data is received. This makes it a powerful tool within a Microservice platform. And did we mention that the Groovy lang community is quite involved?
In this talk we’ll explore the various features and concepts of Reactive Streams. We’ll talk about some typical use cases for Rx and more importantly, how to implement them. We’ll focus primarily on RxGroovy and Ratpack, then provide example implementations that show you how to get started with this powerful technique.
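The rate-control mechanism the Reactive Streams specification provides, backpressure, can be sketched conceptually as follows (a simplified Python illustration of demand signaling, not the RxGroovy or Ratpack API):

```python
from typing import Iterator

class Subscription:
    """A pull-based stream: the subscriber signals demand with request(n),
    so a fast producer can never flood a slow consumer."""
    def __init__(self, source: Iterator[int]):
        self.source = source

    def request(self, n: int) -> list[int]:
        # Produce at most n items: production is driven by downstream demand.
        out: list[int] = []
        for _ in range(n):
            try:
                out.append(next(self.source))
            except StopIteration:
                break
        return out

# An effectively unbounded upstream; the subscriber pulls affordable batches.
sub = Subscription(iter(range(10**9)))
print(sub.request(3))  # [0, 1, 2]
print(sub.request(2))  # [3, 4]
```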
An Introduction to Reactive Applications, Reactive Streams, and Options for the JVM by Steve Pember
The term “reactive” has lately become a buzzword, with a variety of definitions around the Web. When you hear reactive, what do you think of? Reactive Streams? The Reactive Manifesto? ReactJS? These terms may seem unrelated, but they share a common core concept.
Reactive applications and reactive programming result in flexible, concise, performant code and are a superior alternative to the old standard thread-based imperative programming model. The reactive approach has gained popularity recently for one simple reason: we need alternative designs and architectures to meet today’s demands. However, it can be difficult to shift one’s mind to think in reactive terms due to how accustomed we’ve become to the imperative style.
Stephen Pember explores the various definitions of reactive and reactive programming with the goal of providing techniques for building efficient, scalable applications. Steve dives into the key concepts of Reactive Streams and examines some sample implementations—including how ThirdChannel is currently using reactive libraries in production code. Steve looks at some of the open source options available in the JVM—including Reactor, RxJava, and Ratpack—giving attendees an idea of where to begin with the reactive ecosystem. If reactive is new to you, this should be an excellent introduction.
Richer Data History with Event Sourcing (SpringOne 2GX 2015) by Steve Pember
A common pattern in application development is to build systems where the data is directly linked to the current state of the application; one row in the database equates to one entity’s current state. Only ever knowing the current state of the data is adequate for many systems, but imagine the possibilities if one had access to the state of the data at any point in time. Enter Event Sourcing: instead of persisting the current state of our Domain Objects or Entities, we record historical events about our data. This pattern changes how we persist and process our data, but is surprisingly lightweight. In this talk I will present the basic concepts of Event Sourcing and the positive effects it can have on analytics and performance. We’ll discuss how storing historical events provides extremely powerful views into our data at any point in time. We’ll see how naturally it couples with the Event-oriented world of modern Reactive systems, and how easily it can be implemented in Groovy. We’ll examine some practical use cases and example implementations in Ratpack. Event Sourcing will change how you think about your data.
SpringOne 2GX 2015 - Reactive Options for Groovy by Steve Pember
Reactive applications and Reactive programming are an alternative to the standard thread-based imperative programming model that can result in flexible, concise code. The Reactive approach has gained popularity recently for one simple reason: we need alternative designs and architectures to meet today’s demands. However, it can be difficult to shift one’s mind to think in Reactive terms. It doesn’t help that the descriptions around the web can be contradictory and the library documentation can be obscure. In this talk, we’ll explore the concepts of Reactive and Reactive Programming. We’ll demonstrate some of the useful Reactive functions and examine some practical implementations - including how we’re currently using Reactive libraries in production code. Most importantly, we’ll look at some of the open source options available to us in the Groovy community, including Reactor, RxJava, and Ratpack. If Reactive is new to you, this should be an excellent introduction.
Gr8conf US 2015: Reactive Options for Groovy by Steve Pember
Reactive applications and Reactive programming are an alternative to the standard thread-based imperative programming model that can result in flexible, concise code. The Reactive approach has gained popularity recently for one simple reason: we need alternative designs and architectures to meet today’s demands. However, it can be difficult to shift one’s mind to think in Reactive terms. It doesn’t help that the descriptions around the web can be contradictory and the library documentation can be obscure.
In this talk, we’ll explore the concepts of Reactive and Reactive Programming. We’ll demonstrate some of the useful Reactive functions and examine some practical implementations - including how we’re currently using Reactive libraries in production code. Most importantly, we’ll look at some of the open source options available to us in the Groovy community, including Reactor, RxJava, and the Java 8 stream API. If Reactive is new to you, this should be an excellent introduction.
Groovy Options for Reactive Applications - Greach 2015 by Steve Pember
Performance demands placed on the web applications we build have drastically increased over the past few years. The Reactive approach has gained popularity recently for one simple reason: we need alternative designs and architectures to meet today’s demands. Speed is everything.
Reactive applications and Reactive programming are an alternative to the standard thread-based imperative programming model that can result in flexible, concise code. However, it can be difficult to shift one’s mind to think in Reactive terms. It doesn’t help that the descriptions around the web can be contradictory and the library documentation can be obscure.
In this talk, we’ll explore the concepts of Reactive and Reactive Programming. We’ll demonstrate some of the useful Reactive functions and examine some practical implementations – including how we’re currently using Reactive libraries in production code. Most importantly, we’ll look at some of the open source options available to us in the Groovy community, including Reactor, RxJava, and the Java 8 stream API. If Reactive is new to you, this should be an excellent introduction.
Smart TV Buyer Insights Survey 2024 by 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Securing your Kubernetes cluster: a step-by-step guide to success! by KatiaHIMEUR1
Today, after several years of existence, with an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been easier to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
The Art of the Pitch: WordPress Relationships and Sales by Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 by Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
The Metaverse and AI: how can decision-makers harness the Metaverse for their... by Jen Stirrup
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 To discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Enhancing Performance with Globus and the Science DMZGlobus
ESnet has led the way in helping national facilities—and many other institutions in the research community—configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
55. THIRDCHANNEL
Event Sourcing Challenges
• Additional Work To Apply
• Additional CPU Processing
• Non-Traditional Concept of Models
• More Storage Required VS non-ES
• Reduced Transactions / Database Level Constraints
• May Be Confusing For Junior Engineers
81. THIRDCHANNEL
Why Event Sourcing
• More Than an Audit Log
• Data Storage is Inexpensive
• Used By All Long-Running Businesses
• Only Structural Model That Does Not Lose Information
• Simplified Testing and Debugging
• Ideal for Business Analysis
111. THIRDCHANNEL
Implementation (Theory)
• Simple Base Objects
• Aggregate and Event SubClasses Have Transient Properties
• Aggregates Receive and Play Events
• Aggregates Require Distinction Between New and Historical Changes
• Event Service Layer Required
144. THIRDCHANNEL
Querying Events
• All Queries are Projections, including Current State
• Returning Current State is Easy
• Other Projections Can Be Tough
• Try Reactive Streams
• Initial Projection Work May Require Dev Time
• Consider Cacheable Query Layers
Before I begin, I just want to run through a quick scenario with you all
Picture your bank in your mind. For many of us, this may not exactly be a happy thought.
There’s likely several brick and mortar branches for your particular bank in the area.
However, being tech folks, I don’t suppose it’s a stretch to surmise that you all primarily interact with your bank via their website, yes?
Imagine you went to your bank’s wonderful website, entered your information, and logged in successfully.
I hope no one here works for Farmers. I typed "bank website" into Google, and this was the first result.
And, upon logging in, you click the link to check your balance. In doing so, you’re presented with a screen that just shows you “Balance, $100”, but with no context around that number.
This may be fine if you expected there to be $100…
But what if the balance was negative and you weren’t expecting that? What happened to the Christmas check from grandma that should have cleared by now?
What if that’s all your bank balance was, just a simple number?
What if that was all your bank could tell you? What if your balance was simply a column in a row in a database somewhere?
… and What if you didn’t agree?
How angry would you be?
Can you imagine the arguments you’d have with the teller or an agent over the phone, trying to figure out if your latest paycheck was deposited?
Luckily, that’s not how things are done. Banks store your account’s entire history.
Every transaction you make with your bank. Every Credit or Debit made is logged, along with an audit trail of who (e.g. which teller) made the change.
To get your balance, your bank simply adds up each of these transactions
Banks may also periodically record what the balance was at specific points in time, to avoid recalculating everything from the beginning of time.
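In code, that derivation is just a fold over the transaction history. Here’s a minimal sketch; the credit/debit model and all names are illustrative, not any bank’s actual schema:

```java
import java.math.BigDecimal;
import java.util.List;

// A bank balance is not stored; it is derived by replaying every
// credit and debit in order. All names here are illustrative.
public class BalanceReplay {

    record Transaction(String type, BigDecimal amount) {}

    static BigDecimal balance(List<Transaction> history) {
        BigDecimal balance = BigDecimal.ZERO;
        for (Transaction tx : history) {
            balance = tx.type().equals("CREDIT")
                    ? balance.add(tx.amount())
                    : balance.subtract(tx.amount());
        }
        return balance;
    }

    public static void main(String[] args) {
        var history = List.of(
                new Transaction("CREDIT", new BigDecimal("150.00")),
                new Transaction("DEBIT",  new BigDecimal("50.00")));
        System.out.println(balance(history)); // prints 100.00
    }
}
```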
There’s a certain advantage to this idea, that we can record all modifications to our data - in this case, the credits and the debits - as EVENTS that occur within our system.
For example, Your bank is able to tell you exactly how they arrived at your account balance.
-What about your company’s software?
-Can you tell your users or internal business analysts how you are arriving at the data your application presents to them?
…
-Today I’m going to present a method called Event Sourcing that does just that, and show how it can fuel the competitiveness of your company.
Bold, right?
With that, this is ‘Richer Data History with Event Sourcing’. My name is Steve, and I work for a company called ThirdChannel which is located right across the river in Cambridge.
Today I’d like to go over the following topics:
-Event Sourcing at a high level,
-Challenges
Add query!
Let’s begin
Where an object in memory maps directly to a row in a database, even if that row may be split via joins
* an update made to a model updates a column in your database
* in this method, the only thing you know about your data is what it looks like right now.
Event Sourcing says “that’s fine, but we’re going to do something a bit different”.
Instead of storing the current state of our models, we’re going to store facts about our models
Every successful user interaction (and that word successful is important, which I’ll get to later) generates a series of facts or ‘events’ within our system
This stream of events is persisted in our database in the order they occurred, as a journal of what has transpired.
(That ’s’ is supposed to be ‘current state’; I meant to change this slide.)
These events can then be played back against our Domain Object, building it up to the state it would be at any given point in time, although this is most likely the Current State
A stream of events represents a particular object in aggregate
Which means I should talk about the two main concepts behind Event Sourcing
An Event represents something that has occurred within your system. The past tense is important when describing them. It represents an intentional user action, or the result of a user action, that almost always results in the manipulation or state change of an Object.
Things like “BankAccountOpened”, “CurrencyDeposited” are decent names for events
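As a sketch, events like these are just small, immutable, past-tense objects; the fields below are assumptions for illustration:

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.UUID;

// Illustrative event types; names follow the past-tense convention.
interface DomainEvent {
    UUID aggregateId();
    Instant occurredAt();
}

record BankAccountOpened(UUID aggregateId, Instant occurredAt, String owner)
        implements DomainEvent {}

record CurrencyDeposited(UUID aggregateId, Instant occurredAt, BigDecimal amount)
        implements DomainEvent {}
```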
The objects which are affected by Events are referred to as an Aggregate. They generally serve as a root for a stream of events; they represent the state of an event stream ‘in aggregate’.
Akin to a domain model. It doesn’t have to be, though. It can be, say, relationships between objects. For example, at ThirdChannel we model the assignment relationship between our users and what we call programs as an Aggregate. Along with many other things.
Event Sourcing is a purely additive model…
there are no deletes or updating of events. Events are immutable once written to our journal
This is a powerful notion, if you consider the implications: using Event Sourcing, no data is ever lost or ignored.
<pause>
Now, when I need to retrieve information about my Aggregates, I simply play back all of the events that have occurred in the past in order to build the data up to a specific point in time, generally the current date, thus getting the current state of our data.
One of the Key points: by maintaining all events, we’re able to access the current state of our aggregates (again, or objects), certainly, but we can also access the state of our data or aggregates at any point in time.
Which is huge.
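To make that concrete, here’s a hypothetical replay helper: rebuilding state is a left fold over the ordered stream, and an ‘as of’ cutoff gives you the state at any historical moment:

```java
import java.time.Instant;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical helper: state is a left fold over the ordered event stream.
// Stopping at an 'asOf' cutoff yields the state at any point in time.
class Replayer {
    static <A, E> A replay(A initial,
                           List<E> orderedEvents,
                           Function<E, Instant> occurredAt,
                           Instant asOf,
                           BiFunction<A, E, A> apply) {
        A state = initial;
        for (E event : orderedEvents) {
            if (occurredAt.apply(event).isAfter(asOf)) break; // stop at the target date
            state = apply.apply(state, event);
        }
        return state;
    }
}
```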
I’m sure some of you are thinking “waiiit, if I never get rid of anything, certainly that has tradeoffs, too?” Specifically, performance. What happens if I have thousands, or even millions of events I have to apply?
You’re right, and that’s a great observation.
Luckily, Event Sourcing recommends an early optimization known as ‘Snapshots’
A Snapshot is just what you’d think it would be: a recording of the details of your Aggregate at that moment in time. Persisted forever
As we consume and create events, periodically we persist a snapshot, containing the state of the aggregate at that point in time.
-When replaying events, we load from the most recent snapshot, then apply only the events between when that snapshot was taken, and the targeted end date. So, in this case,…
-I’ll get into some more specifics around snapshots in a bit.
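In the meantime, here’s a rough sketch of snapshot-aware loading; the store interfaces are hypothetical stand-ins for the real persistence layer:

```java
import java.util.List;

// Sketch: start from the latest snapshot, then apply only the events that
// came after it. SnapshotStore and EventStore are hypothetical interfaces.
class SnapshotLoading {

    record Snapshot<A>(A state, long revision) {}

    interface SnapshotStore<A> { Snapshot<A> latestFor(String aggregateId); }
    interface EventStore<E>    { List<E> eventsAfter(String aggregateId, long revision); }
    interface Applier<A, E>    { A apply(A state, E event); }

    static <A, E> A load(String aggregateId,
                         SnapshotStore<A> snapshots,
                         EventStore<E> events,
                         Applier<A, E> applier) {
        // Assumes a snapshot exists; a real loader would fall back to an
        // empty aggregate at revision 0.
        Snapshot<A> snap = snapshots.latestFor(aggregateId);
        A state = snap.state();
        for (E e : events.eventsAfter(aggregateId, snap.revision())) {
            state = applier.apply(state, e); // only the delta since the snapshot
        }
        return state;
    }
}
```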
Before I get to the next section, I’m going to go over another example that I’ll reference back to periodically
Suppose we were building an ecommerce app and we are building the ‘shopping’ cart feature.
This is one of my favorite examples, by the way.
Naive, ORM, relational -> join table with quantity
Event Sourcing -> <identify page components> are not saved as a join table or a single row.
Instead, w/ ES, system stores all commands you’ve issued / replays for current state
<list events> Quickly remove base from the cart, before placing the order and generating an OrderPlacedEvent
View doesn’t display raw events
Data backing the view is built up from events to form an Object intended for View
Object is Transient -> object will be garbage collected and no direct representation exists on disk
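A sketch of that idea, with illustrative event types: the cart the user sees is folded up from events on demand, and nothing shaped like it lives on disk:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// The view-backing cart object is transient: folded up from events,
// used, then garbage collected. All types here are illustrative.
class CartProjection {

    sealed interface CartEvent permits ItemAddedToCart, ItemRemovedFromCart {}
    record ItemAddedToCart(String sku, int quantity) implements CartEvent {}
    record ItemRemovedFromCart(String sku, int quantity) implements CartEvent {}

    static Map<String, Integer> currentCart(List<CartEvent> events) {
        Map<String, Integer> items = new LinkedHashMap<>();
        for (CartEvent e : events) {
            if (e instanceof ItemAddedToCart a) {
                items.merge(a.sku(), a.quantity(), Integer::sum);
            } else if (e instanceof ItemRemovedFromCart r) {
                items.merge(r.sku(), -r.quantity(), Integer::sum);
                // drop lines that hit zero (a real projection would also
                // guard against going negative)
                if (items.getOrDefault(r.sku(), 0) <= 0) items.remove(r.sku());
            }
        }
        return items;
    }
}
```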
That brings up the next step, Working with Objects
And this is where Event Sourcing will start to hurt your brain.
In order to fully grasp what Event Sourcing is, it’s important to realize that…:
All objects that are ‘displayed’ to the user in your View layer are simply transient derivatives of your event stream.
They are ephemeral and must be built up from the events to be used in a traditional manner within your application
Finally, I argue that structuring our data in this way is akin to the way our brains work; it’s natural.
Internally, your mind is able to tell you the current state of your knowledge about things. This current state is formed by a series of observations / facts / events in your past.
You’re able to replay these events in your mind, and also remember your knowledge at that point in time.
Our minds aren’t perfect, though, and sometimes we violate the ES rules by deleting Events. Ooops.
Let’s take me as an Example. Even if you’ve never seen or met me before today, your mind has already recorded a series of facts which is driving your mental model of me. For example:
FeatureObservedEvent
ActionObservedEvent
-Now, if I were to suddenly make a rude gesture at the audience…
that would apply a new event to your mental model. Your current state opinion of me would likely be negative
although you could remember a time before you thought negatively of me.
“Man, Steve seemed like an alright guy until he flipped off the audience. What a jerk”
That, I think, is the basis for what Event Sourcing is.
It may be a bit early, but are there any questions so far?
Then Let’s move on to the next section, Challenges or Difficulties with ES.
Or as I like to call it…
Right now, you may be suspicious.
You may be thinking:
“I mean, what you’re describing sounds like a ton of extra work to implement.”
-Not to mention a ton of overhead in processing these events, even if we do make good use of these snapshots you mention
How can you operate in a world without Models?!
And yes, that’s true.
*pause*
Furthermore, here’s some more bad things:
Storing every event that occurs within your system will almost certainly require more storage space
hoarders
Now this is a truly difficult one.
- Will have reduced Database Level Constraints, like Foreign Keys, null checks, unique checks, etc.
- Instead, we have to rely on our software for transactions and these database constraints
Now, this is usually where I lose the more seasoned developers.
Because our properties tend to be transients, serialized within the event, we lose things like foreign key constraints at the db level
Instead, in Event Sourcing, these checks tend to move within Transactional blocks within your code.
Unsure what to put for this slide. I’ve been using a good deal of Spring and Hibernate lately, but thought if I just put a @Transactional annotation, that might be strange.
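For what it’s worth, here’s a rough, Spring-flavored sketch of the idea; AccountEventStore and its methods are hypothetical, not a real library:

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Sketch: a check the database would normally enforce (here, a unique
// email) moves into a transactional block in application code.
@Service
class OpenAccountHandler {

    private final AccountEventStore eventStore; // hypothetical event store

    OpenAccountHandler(AccountEventStore eventStore) {
        this.eventStore = eventStore;
    }

    @Transactional
    public void handle(String email) {
        // What used to be a UNIQUE constraint is now an explicit check,
        // guarded by the surrounding transaction.
        if (eventStore.existsAccountWithEmail(email)) {
            throw new IllegalStateException("email already registered: " + email);
        }
        eventStore.append(new AccountOpened(email));
    }

    record AccountOpened(String email) {}

    interface AccountEventStore {
        boolean existsAccountWithEmail(String email);
        void append(AccountOpened event);
    }
}
```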
Plus, ES can also be difficult for Junior Engineers
I’ve noticed that people really cling to the Model View Controller way of life.
- This is a very different way of building our applications, particularly for the web
Recommending a different structure for the Model can make people wary.
- Telling people that the Model becomes a “Transient object derived from the event stream” scares them
About a year ago, I was talking with a group at a Meetup about this very subject. Afterwards I was confronted by a rather ornery fellow.
He said, and I paraphrase: “Steve, I can see you are passionate about this subject… but I gotta tell you that it’s just really, really dumb….”
-I have access to tons of log files
- If I wanted all of my events, I’d just search / harvest my database’s transaction log! That has everything!”
- He was a lot angrier and confrontational than a normal human should be. I withered under his furious comments.
-But, his comment about the transaction log is interesting. Does everyone know about this?
-Relational databases (and I know Redis has a variation of this) will save every action taken against your data, and in the event of error recovery, can use the log to rebuild the current state of your database.
- A transaction log is actually a series of Events used … to recreate the current state of your data. Sound familiar?
As crazy as this all might sound, I’d like to argue that Event Sourcing actually has huge Benefits
Next Up, “don’t worry, ES is worth it”
First, Going back to the concept of a transaction or an Audit Log…
Why is it that I’d want something fancy like ES, when I can audit a log file or look at my transaction log?
There’s a subtle difference between an Audit Log and an Event Stream.
Audit logs tell the history of what happened in your system or what was persisted to the database.
Events tell the story of what actually happened in your domain: the user’s intent, not just what was persisted.
Furthermore, having the Events as a first-order member of your platform can give you enhanced information around what your users or systems are doing, beyond what gets written to the database.
And it’s easier to work with if the events are integrated within the platform already.
Incidentally, an event object typically will have attached to it information about the user which generated the event, which also makes ES a perfect Audit Log
Data storage is crazy cheap. Last I looked, AWS basic SSD storage was $0.013 per gigabyte-hour, which equates to…
If you’re at the point where those pennies matter, you probably have bigger problems.
next.
What I find interesting is that Event Sourcing, or a non-digital analogue of it, is used by every ‘mature’, or long running business.
Just Like I went over in the beginning, banks and accounting methods operate with Event Sourcing
Bankers additionally even use snapshots of your balance in an additional column beyond credits and debits
Lawyers!
If a contract needs to be adjusted, is the contract thrown out and re-written? No. Rather, ‘addendums’ are placed on the contract.
To figure out what the contract actually involves, one has to read the initial contract and then each successive addendum, in order, to work out what the thing actually says.
In addition, I argue that all business problems can be successfully modeled with - and benefit greatly from - event sourcing
How many of you all have delete statements in your code?
Remember: there is no delete, ES is the only structural model which does not lose information.
Event Sourcing simplifies Testing and Debugging. A bold claim, I know.
Testing is easier / simplified
with ES, you unit test the events, then later you can simply assert that specific events are applied during integration testing
In addition, debugging is easy, because we have the entire history of our data.
We can look back through our Aggregates’ timelines <next> and examine them at any point in history.
Thus I can see what the historical state of the aggregate… or all my aggregates… was at a particular point in time, along with how it reached that state and who caused those changes.
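Here’s a sketch of that testing style, using JUnit 5 and a deliberately tiny, illustrative cart aggregate: unit-test an event’s effect in isolation, then assert which events a command emits:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.junit.jupiter.api.Test;

// Illustrative only: the aggregate is inlined so the sketch is self-contained.
class ShoppingCartTest {

    record ItemAddedToCart(String sku, int quantity) {}

    static class ShoppingCart {
        private final Map<String, Integer> items = new HashMap<>();
        private final List<Object> uncommitted = new ArrayList<>();

        void apply(ItemAddedToCart e) { items.merge(e.sku(), e.quantity(), Integer::sum); }

        void addItem(String sku, int quantity) {   // the command
            var event = new ItemAddedToCart(sku, quantity);
            apply(event);                          // mutate transient state
            uncommitted.add(event);                // record for persistence
        }

        int quantityOf(String sku) { return items.getOrDefault(sku, 0); }
        List<Object> uncommittedEvents() { return uncommitted; }
    }

    @Test
    void itemAddedEventIncreasesQuantity() {
        var cart = new ShoppingCart();
        cart.apply(new ItemAddedToCart("sku-42", 2));
        assertEquals(2, cart.quantityOf("sku-42"));
    }

    @Test
    void addingAnItemEmitsExactlyOneEvent() {
        var cart = new ShoppingCart();
        cart.addItem("sku-42", 2);
        assertEquals(List.of(new ItemAddedToCart("sku-42", 2)), cart.uncommittedEvents());
    }
}
```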
ahem… I’m sure you’re all keenly aware, but 2015 is the year they visited in this movie. “Where’s my Hoverboard?!”
Anyway.
If we at some point note that there’s an error or discrepancy in our data…
debugging or tracing the error is a snap
-We can find the faulty or conflicting event, know who executed it, when they executed it, and what lead up to the bad state.
And then we can emit a new event to ‘patch’ the issue
If we want to get even crazier, we could go to a specific point in our data’s timeline, then fire fake events in order to simulate alternate timelines.
pause
This has interesting applications for, say, a.b. testing, stress, and disaster testing.
If any of these past few slides reminds you of git… well, how astute. Git is like recursive event sourcing. Ever look at the reflog?
Event sourcing is the ideal storage mechanism for business analysis
-because Event Sourced systems do not lose data, they’re future proofed against any crazy reports that your business analysts may need in the future
suppose one such analyst came to your Ecommerce / shopping cart team asking for… all shoppers who add items to their cart and then remove them within 5 minutes. They want to know who, and which products
with non-ES, and the naive way I mentioned earlier, you might have to build some sort of tracking table, or mark additional rows with a timestamp… I dunno.
Regardless, then you deploy… and then wait for the data to gather, as users add and remove products
with Event Sourcing, you write a query for that report, you deploy… and then what do you have? If you’re thinking: all of the data, obviously… well, you’re wrong.
You have MORE than everything. We can generate the report for how it would look at every point in our history.
Which makes the company and your business analysts extremely happy. There’s nothing they like better than a good report.
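A sketch of what that report might look like as a projection over the history; the event shapes are illustrative. The point is that it runs retroactively over everything that ever happened:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Find shoppers who removed an item within five minutes of adding it,
// across the full event history. Event shapes are illustrative.
class RemorseReport {

    record CartEvent(String shopper, String sku, String type, Instant at) {}
    record Finding(String shopper, String sku, Instant addedAt, Instant removedAt) {}

    static List<Finding> quickRemovals(List<CartEvent> orderedHistory) {
        Map<String, Instant> pendingAdds = new HashMap<>(); // "shopper|sku" -> added time
        List<Finding> findings = new ArrayList<>();
        for (CartEvent e : orderedHistory) {
            String key = e.shopper() + "|" + e.sku();
            if (e.type().equals("ITEM_ADDED")) {
                pendingAdds.put(key, e.at());
            } else if (e.type().equals("ITEM_REMOVED")) {
                Instant added = pendingAdds.remove(key);
                if (added != null
                        && Duration.between(added, e.at())
                                   .compareTo(Duration.ofMinutes(5)) <= 0) {
                    findings.add(new Finding(e.shopper(), e.sku(), added, e.at()));
                }
            }
        }
        return findings;
    }
}
```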
Querying over the events, presenting different Views on them, is often called a Projection
Perhaps the biggest advantage of ES for me.
look at specific events across one stream
look at specific event types across all streams
Grabbing the Current State of an aggregate is a projection, and probably the easiest: take all events for an Aggregate, in order.
There’s a good deal else to find. In our shopping cart example:
find items in cart for any given date or time range
find items that were removed for any given date or time range
find average rate of items removed vs items purchased for any given…
find average duration between items being added and then being removed for any…
All of the current FMRs or Agents (excuse the font, please!)
which one of us made those transitions
a timeline of each agent’s transition within the program
applications with a long gap between entering the system and being waitlisted or interviewed, to see how long a candidate waits until we contact them
how long on average, an agent lasts before being fired and/or average time for agents that have quit
Turnover for a date range
And I can tell you that information at ANY POINT IN TIME. e.g. the average quit rate might be different now than 6 months ago, for example
that’s all I could think of off the top of my head
Which is amazing, right? Just from that one relation.
What would happen.. if I started to correlate other event streams?
<Pause>
Even after this presentation, if you’re still skeptical of the benefits… and you think this is the silliest thing you’ve ever heard of
..be aware that the decision can be out of your hands.
Event Sourcing is often chosen or driven by Management out of business needs, and ‘hacky’ analogues are shoehorned into an existing system AFTER the fact.
Rather, no one says “We need Event Sourcing”, but there are discussions around the problems that need solving
I know I have
Anyway, next up is implementation
first let’s discuss the theoretical approaches.
Pure Event Sourcing is fairly simple in terms of implementation
There are really only three base objects that you have to worry about.
-First up, the Aggregate. You have the id, which should be a UUID, the current revision number, and the type (or ‘clazz’ if you’re working with java)
-The current revision number is used for optimistic locking and to see how advanced our aggregate is.
-The type is used by our system when we want to load the aggregate into a more meaningful class in the system, say, a SubClass of Aggregate. In our example, the ShoppingCart class would SubClass from Aggregate
Next up, Event
-id, revision, aggregate_id, the date with timestamp, the type, the user id, and then ‘data’
-data is a serialized representation of the event type’s properties. Generally, JSON or XML is recommended as the storage mechanism in the data column.
- this could also be a more efficient mechanism, like Google’s Protocol Buffers or Apache Avro
All Events should be named in the past tense, as they should reflect something successful that happened in the PAST
Lastly, we have Snapshot.
Again, we want to serialize the properties of the aggregate at that moment in time.
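Here’s one possible JPA mapping of those three base tables. The columns follow the slides; everything else (names, field types, the converters a real mapping would need) is an assumption:

```java
import java.time.Instant;
import java.util.UUID;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Lob;

// Sketch only: real mappings need no-arg constructors, column annotations,
// and converters (e.g. for Instant and UUID, depending on the provider).
@Entity
class AggregateRow {
    @Id UUID id;        // aggregate identity
    long revision;      // optimistic locking / how advanced the stream is
    String type;        // concrete class to load this aggregate as
}

@Entity
class EventRow {
    @Id UUID id;
    long revision;      // position within the aggregate's stream
    UUID aggregateId;
    Instant occurredAt;
    String type;        // concrete event class
    UUID userId;        // who caused it: a built-in audit trail
    @Lob String data;   // serialized event properties (e.g. JSON)
}

@Entity
class SnapshotRow {
    @Id UUID id;
    long revision;      // revision of the last event folded into this snapshot
    UUID aggregateId;
    @Lob String data;   // serialized aggregate state at that moment
}
```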
Next,
I should be clear about what - exactly - is being serialized into those data fields.
Aggregates and Events, or at least classes that implement Aggregate and Event, contain, themselves, transient properties which are generally not persisted to the database.
Plain old object with explicit transient properties
each has corresponding event or events
The Event itself has transient properties, whose values are persisted to the database.
Also, if anyone notices that I’m using JPA annotations and I earlier mentioned that this is an alternative to ORMs… appreciate the irony. This is from a small demo app.
Events Modify Transient values on the Aggregate
It’s almost a Visitor pattern. As Events are generated, they are applied to an Aggregate.
Aggregates are built up or, in the case of loading an aggregate, building back up. Event by Event
my actual aggregate class may have several properties; however, they are all transient, in the sense that they are not persisted locally to the aggregate… e.g. not in the same table.
When the aggregate is first created, all of these transients are at their default value, and the playback of the events will restore them to whichever point in time I want.
shopping cart -> Order placement should only charge credit card the first time the event is created
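A sketch of that distinction, with an illustrative ‘isNew’ flag: side effects like charging the card fire only when an event is first created, never during replay:

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Illustrative aggregate: state is transient and rebuilt by replay; the
// 'isNew' flag is one (assumed) way to separate new from historical events.
class ShoppingCartAggregate {

    private final transient Map<String, Integer> items = new HashMap<>(); // not persisted

    void apply(ItemAddedToCart event, boolean isNew) {
        items.merge(event.sku(), event.quantity(), Integer::sum);
    }

    void apply(OrderPlaced event, boolean isNew) {
        if (isNew) {
            chargeCreditCard(event.total()); // side effect only the first time
        }
        items.clear();
    }

    private void chargeCreditCard(BigDecimal total) {
        // hypothetical payment-gateway call; deliberately omitted
    }

    record ItemAddedToCart(String sku, int quantity) {}
    record OrderPlaced(BigDecimal total) {}
}
```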
In addition, you’ll also need a service layer to store and load the events and aggregates
it must remember to load events in order of their revision number for the correct aggregate, and initiate the event serialization and de-serialization processes
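A rough sketch of that service layer, with hypothetical repository and serializer interfaces: it loads a stream back in revision order and deserializes each row into its event class:

```java
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

// Sketch: EventRepository and EventSerializer are hypothetical interfaces,
// not a real library.
class EventService {

    interface EventRepository {
        List<StoredEvent> findByAggregateId(UUID aggregateId);
        void save(StoredEvent event);
    }
    interface EventSerializer {
        Object deserialize(String type, String data);
    }
    record StoredEvent(UUID aggregateId, long revision, String type, String data) {}

    private final EventRepository repository;
    private final EventSerializer serializer;

    EventService(EventRepository repository, EventSerializer serializer) {
        this.repository = repository;
        this.serializer = serializer;
    }

    List<Object> loadStream(UUID aggregateId) {
        return repository.findByAggregateId(aggregateId).stream()
                .sorted(Comparator.comparingLong(StoredEvent::revision)) // order matters
                .map(e -> serializer.deserialize(e.type(), e.data()))
                .toList();
    }
}
```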
And those are the basics. That’s not too bad, eh?
Unfortunately… there are a few more practical considerations to go over that are a reality for any real ES system.
When working with ES, feel free to add new Transient properties to aggregates and Events.
Typically, one adds a new transient property to their Aggregate, and then has one or more corresponding new events.
Incidentally, no database changes will be required.
event ‘versioning’
when hydrating, you have to know how to deal with old event versions as well as new ones
Because an aggregate has a set of transient properties that are manipulated by events, removing one can wreak havoc on the events that were there.
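One common way to handle this (the shapes and split logic below are assumptions, for illustration) is ‘upcasting’: migrate old payloads into the current event shape at hydration time, so the stored history never changes:

```java
// Illustrative upcaster: old events are never rewritten on disk; they are
// translated to the current shape as they are loaded.
class AddressUpcaster {

    record CustomerMovedV1(String address) {}             // old shape
    record CustomerMovedV2(String street, String city) {} // current shape

    static CustomerMovedV2 upcast(CustomerMovedV1 old) {
        // naive split; a real migration encodes whatever the old data allows
        String[] parts = old.address().split(",", 2);
        return new CustomerMovedV2(parts[0].trim(),
                                   parts.length > 1 ? parts[1].trim() : "");
    }
}
```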
While snapshots seem awesome, snapshot only rarely. Greg Young, one of the leading voices in the ES community, claims that he doesn’t snapshot an aggregate until it reaches 1,000 events.
You have to juggle the time cost of the additional query for the snapshot versus the processing of the small event objects.
I had originally built this out to enumerate many more items, but then realized that they could be grouped into distinct sections.
So, next up…
Persistence: How, exactly, do we store these events?
With the naive use case, here’s my entire schema… or at least, without Snapshot.
The snapshot is very similar to aggregate, just with a data text field
Beyond the database itself, there are several open source frameworks available to help with event sourcing.
Here are some of the better options
These are the ones I’m aware of that are in active development.
NEventStore, Prooph, and Akka. Akka is interesting, in that every object you work with is persisted as an event stream, but it doesn’t explicitly call itself an event sourcing library.
Of these, I’d probably recommend Akka, provided you’re on the JVM. The persistence component is available as a standalone jar
When looking for a storage mechanism, there are many options available to you, and, for the most part, you’ll be fine no matter which database option you choose.
Now, here are some of the better options
First, EventStore, the database. Highly specialized for working with events and generating projections.
Written By Greg Young, who is perhaps the most well known person in the Event Sourcing community.
Redis is a fantastic choice. It’s excellent for rapid writes.
Furthermore, Redis itself has a few options for persisting to disk, one of which (the append-only file) is essentially event sourcing.
And of course, you cannot go wrong with a good, old-fashioned Relational Database.
I would suggest sticking with a standard relational database if you’re already using one;
switching to something like Event Sourcing is already enough change.
And now we get to a particularly hairy topic…
How am I going to handle all of these events and find what I’m looking for?
All queries within ES are often referred to as Projections over the Event Stream. This includes the concept of the current state.
Load the aggregate or aggregates by id, load all of the events and replay the events to get back to current state
Fairly easy and straightforward
If we’re querying across the current state of the properties, we can:
query the most recent of the events responsible for that property (e.g., a user’s Role)
Or..
Had a rather onerous query from our codebase, but then decided a few minutes ago that it’s probably not a great idea to show proprietary / embarrassing code.
maintain a synchronized copy of your aggregates in another table; always sync current state into it, and then query that table
I generally like to keep any data synchronization to a minimum, but this approach can be an attractive convenience for current state searches
Has everyone heard of stream programming, reactive streams, or reactive extensions?
I’m primarily used to the JVM space, so coming from there, we have the ReactiveX libraries (which come out of Netflix, with bindings for many languages), Reactor, Java 8’s stream api, and I’m sure the people at Typesafe could add something to this.
This is a topic that can be a presentation unto itself
In fact, the Event Store database, that’s exactly what they do.
One has to write Projections using Streams within a web interface, which then become queryable by clients.
However, as you can probably see, this is fairly difficult. The development team will need to spend time writing the projections.
If you have analysts on your team that are used to writing sql, well, it’s going to be much more difficult for them.
In fact, you may want to hire a fabled Data Scientist, or at least find someone who has experience with Statistics.
I believe that they would have the best chance of finding meaning in the swaths of historical data you’re gathering.
Because this can be so tricky to work with, and because I generally don’t like synchronization, I would absolutely recommend building a queryable cache layer.
In addition, one thing we do at 3c for some of our data is to have a separate service which pulls data from the service gathering the events, and packages it up nicely for querying by other services.
separate reporting / read service
Consider feeding events into additional services or tools, particularly those that are stream friendly, like Splunk or Apache Storm
First, I’d like to point out that we at ThirdChannel have open sourced a small library that we’re using internally for doing Event Sourcing on the JVM.
I took the screenshot, as I was unsure of the capacity of the wifi here
Event Sourcing is an additive only, lossless data storage pattern which has insanely high potential for data analysis. It is, however, tricky to work efficiently with.
I wouldn’t recommend it for certain applications; say, small static-content websites (e.g. a restaurant’s or a business’s marketing site). Nor does it make sense to apply it to every domain object in a system. However, key data in your application that you use to drive your business can benefit greatly from this approach.