Grant Allen, CTO and Chief Product Officer at Dow Jones, explains how to deploy Flowable at scale in AWS.
It was presented at FlowFest 2018 in Barcelona, Spain.
Serverless integration with Knative and Apache Camel on Kubernetes - Claus Ibsen
This presentation will introduce Knative, an open source project that adds serverless capabilities on top of Kubernetes, and present Camel K, a lightweight platform that brings Apache Camel integrations into the serverless world. Camel K allows running Camel routes on top of any Kubernetes cluster, leveraging Knative serverless capabilities such as “scaling to zero”.
We will demo how Camel K can connect cloud services or enterprise applications using its 250+ components and how it can intelligently route events within the Knative environment via enterprise integration patterns (EIP).
Target Group: Developers, architects and other technical people - a basic understanding of Kubernetes is an advantage
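For readers who have not used Camel K before, an integration is written in the plain Camel Java DSL. The sketch below is a minimal, hypothetical example (the endpoints are placeholders, not from the talk); with the Camel K CLI it would be run as `kamel run Hello.java`, and Knative can scale the resulting pod to zero while the route is idle.

```java
// A minimal Camel K integration sketch in the Camel Java DSL.
import org.apache.camel.builder.RouteBuilder;

public class Hello extends RouteBuilder {
    @Override
    public void configure() {
        from("timer:tick?period=3000")                  // fire every 3 seconds
            .setBody().simple("Hello from Camel K #${header.CamelTimerCounter}")
            .to("log:info");                            // log is one of the 250+ components
    }
}
```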
This document discusses JavaScript deobfuscation techniques using abstract syntax trees (ASTs). It begins by explaining goals of JavaScript obfuscation like blocking reverse engineering and bypassing antivirus detection. Common obfuscation techniques like eval packing and JSFuck are described. The document then discusses approaches to deobfuscation including runtime execution and manual analysis. It focuses on the benefits of partial evaluation using AST traversal and subtree reduction to perform operations like constant folding and function inlining. Examples are provided of challenges in evaluating complex data structures and functions. The conclusion is that AST-based deobfuscation is difficult but can counter some obfuscation techniques through multi-pass analysis and function hoisting.
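To make the constant-folding idea concrete, here is a toy, language-agnostic sketch of subtree reduction over an expression AST, written in Java rather than over a real JavaScript AST. Deobfuscators apply the same bottom-up pass, often repeatedly, which is the multi-pass analysis the document mentions.

```java
// Toy constant folding by subtree reduction: constant subtrees collapse to a
// single literal node; non-constant subtrees are kept, partially folded.
interface Node { Node fold(); }

record Num(double value) implements Node {
    public Node fold() { return this; }
}

record BinOp(char op, Node left, Node right) implements Node {
    public Node fold() {
        Node l = left.fold(), r = right.fold();
        if (l instanceof Num a && r instanceof Num b) {     // both sides constant?
            return new Num(switch (op) {
                case '+' -> a.value() + b.value();
                case '*' -> a.value() * b.value();
                default  -> throw new IllegalArgumentException("op " + op);
            });
        }
        return new BinOp(op, l, r);                         // keep the folded subtree
    }
}

// Example: (2 + 3) * 4 folds to 20 in one pass:
// new BinOp('*', new BinOp('+', new Num(2), new Num(3)), new Num(4)).fold()
```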
How to survive the zombie scrum apocalypse - Mia Horrigan
A couple of years ago Christiaan Verwijs and Johannes Schartau coined the term ‘Zombie-Scrum’. What's it all about?
Well, at first sight Zombie Scrum seems to be normal Scrum. But it lacks a beating heart. The Scrum teams do all the Scrum events, but a potentially releasable increment is rarely the result of a Sprint. Zombie Scrum teams have a very unambitious definition of what ‘done’ means, and no drive to extend it. They see themselves as a cog in the wheel, unable and unwilling to change anything and have a real impact: I’m only here to code! Zombie Scrum teams show no response to a failed or successful Sprint and have no intention of improving their situation. Actually, nobody cares about this team; the stakeholders forgot about its existence a long time ago.
Zombie Scrum is Scrum, but without the beating heart of working software, and it’s on the rise. This workshop will help you recognise the symptoms and causes of Zombie Scrum and show you what you can do to start combating and treating it. Knowing what causes Zombie Scrum might help prevent a further outbreak and avert the apocalypse.
Where is my bottleneck? Performance troubleshooting in Flink - Flink Forward
Flink Forward San Francisco 2022.
In this talk, we will cover various topics around performance issues that can arise when running a Flink job and how to troubleshoot them. We’ll start with the basics, like understanding what the job is doing and what backpressure is. Next, we will see how to identify bottlenecks and which tools or metrics can be helpful in the process. Finally, we will also discuss potential performance issues during the checkpointing or recovery process, as well as some tips and Flink features that can speed up checkpointing and recovery times.
by Piotr Nowojski
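As a rough illustration of the checkpoint-tuning knobs a talk like this covers, the sketch below uses Flink's StreamExecutionEnvironment API; the interval values are illustrative assumptions, not recommendations from the speaker.

```java
// Checkpointing settings that influence checkpoint and recovery times.
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointTuning {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);                         // checkpoint every 60 s
        env.getCheckpointConfig().enableUnalignedCheckpoints();  // helps under backpressure
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(30_000);
    }
}
```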
MongoDB and Machine Learning with Flowable - Flowable
Joram Barrez, Principal Software Engineer at Flowable, explains how to run Flowable on MongoDB.
It was presented at FlowFest 2018 in Barcelona, Spain.
Monitoring Kubernetes with Elasticsearch Services - Ted Jung, Consulting Arch... - Amazon Web Services Korea
This document provides an agenda and overview of a presentation on using the Elastic Stack to gain observability of a Kubernetes environment. The presentation covers introducing the Elastic Stack and its components, challenges of monitoring and logging Kubernetes, and a demonstration of using Filebeat, Metricbeat, and APM tools to collect logs, metrics, and traces from a Kubernetes cluster.
Dynamic Rule-based Real-time Market Data Alerts - Flink Forward
Flink Forward San Francisco 2022.
At Bloomberg, we deal with high volumes of real-time market data. Our clients expect to be notified of any anomalies in this market data, which may indicate volatile movements in the markets, notable trades, forthcoming events, or system failures. The parameters for these alerts are always evolving and our clients can update them dynamically. In this talk, we'll cover how we utilized the open source Apache Flink and Siddhi SQL projects to build a distributed, scalable, low-latency and dynamic rule-based, real-time alerting system to solve our clients' needs. We'll also cover the lessons we learned along our journey.
by Ajay Vyasapeetam & Madhuri Jain
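One common way to implement dynamically updatable rules in plain Flink is the broadcast state pattern. The sketch below is an illustration only, not Bloomberg's code: the Tick and Rule types and the threshold logic are hypothetical, and the Siddhi SQL integration from the talk is omitted.

```java
// Dynamic thresholds via Flink's broadcast state pattern: rule updates arrive
// on a second stream and take effect without redeploying the job.
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class DynamicAlerts {
    public static class Tick { public String symbol; public double price; }
    public static class Rule { public String symbol; public double threshold; }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Tick> ticks = env.fromElements(new Tick());  // placeholder source
        DataStream<Rule> rules = env.fromElements(new Rule());  // e.g. from a Kafka topic

        MapStateDescriptor<String, Double> desc =
            new MapStateDescriptor<>("rules", Types.STRING, Types.DOUBLE);
        BroadcastStream<Rule> broadcastRules = rules.broadcast(desc);

        ticks.connect(broadcastRules)
            .process(new BroadcastProcessFunction<Tick, Rule, String>() {
                @Override
                public void processElement(Tick t, ReadOnlyContext ctx,
                                           Collector<String> out) throws Exception {
                    Double limit = ctx.getBroadcastState(desc).get(t.symbol);
                    if (limit != null && t.price > limit) {
                        out.collect("ALERT " + t.symbol + " > " + limit);
                    }
                }
                @Override
                public void processBroadcastElement(Rule r, Context ctx,
                                                    Collector<String> out) throws Exception {
                    ctx.getBroadcastState(desc).put(r.symbol, r.threshold); // live update
                }
            })
            .print();

        env.execute("dynamic-alerts");
    }
}
```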
Function Mesh for Apache Pulsar, the Way for Simple Streaming Solutions - StreamNative
Pulsar Function is a succinct computing abstraction Apache Pulsar provides to express simple ETL and streaming tasks. The simplicity is twofold: a simple interface and simple deployment. As adoption grew, we realized that the ability to run natively on the cloud and to combine multiple functions into a single unit are key to user success. We developed this new feature -- Function Mesh -- to support these new requirements.
This talk aims to provide a thorough walkthrough of the new Function Mesh feature, including its design, implementation, use cases, and examples, to help people seeking simple streaming solutions understand this powerful new tool in Apache Pulsar.
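For context on the "simple interface" half of that claim: a Pulsar Function is just a class implementing the org.apache.pulsar.functions.api.Function interface. This minimal example (the classic exclamation transform, not taken from the talk) shows the shape.

```java
// A minimal Pulsar Function: the framework handles all consume/produce plumbing.
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

public class ExclamationFunction implements Function<String, String> {
    @Override
    public String process(String input, Context context) {
        // Invoked once per message on the input topic(s).
        return input + "!";
    }
}
```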
Oracle API Gateway is a software product that allows clients to access backend enterprise services in a simplified and secure manner. It includes components like the core gateway, policy studio for creating policies, and analytics for reporting. The document provides an overview of the basic architecture and components of Oracle API Gateway and outlines the steps for installing, configuring, and managing the gateway and its related tools.
Delight: An Improved Apache Spark UI, Free, and Cross-Platform - Databricks
This document introduces Delight, an improved Apache Spark UI created by Data Mechanics. Delight provides high-level visualizations of Spark jobs to help users identify inefficiencies and reduce costs. It collects metrics during and after jobs to show CPU usage, task duration, and efficiency. Delight is open source and works on any Spark platform by installing an agent. Data Mechanics aims to further enhance Delight with real-time metrics, driver memory collection, and automated recommendations.
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote... - Databricks
The increasing challenge of serving ever-growing data driven by AI and analytics workloads makes disaggregated storage and compute more attractive, as it enables companies to scale their storage and compute capacity independently to match data and compute growth rates. Cloud-based big data services are gaining momentum as they provide simplified management, elasticity, and a pay-as-you-go model.
Monitoring MySQL Replication lag with Prometheus & pt-heartbeat - Julien Pivotto
This document discusses monitoring MySQL replication delay using mysqld_exporter and pt-heartbeat. It describes how pt-heartbeat works to track replication lag by storing timestamps in MySQL. It then details how the author contributed code to mysqld_exporter to integrate this functionality, including relevant configuration options, metrics, and example alerts. The talk encourages contributions to further improve open source monitoring tools.
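The underlying idea is simple enough to sketch: pt-heartbeat periodically writes a timestamp row on the primary, and lag on a replica is "now minus the last replicated timestamp". The JDBC sketch below assumes pt-heartbeat's default heartbeat table and ts column; treat the exact SQL and connection details as illustrative.

```java
// Reading pt-heartbeat style replication lag from a replica over JDBC.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HeartbeatLag {
    public static void main(String[] args) throws Exception {
        try (Connection replica = DriverManager.getConnection(
                 "jdbc:mysql://replica:3306/heartbeat", "monitor", "secret");
             Statement st = replica.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT TIMESTAMPDIFF(MICROSECOND, ts, UTC_TIMESTAMP(6)) FROM heartbeat")) {
            if (rs.next()) {
                // Lag is the age of the newest heartbeat that replication delivered.
                System.out.printf("replication lag: %.3f s%n", rs.getLong(1) / 1_000_000.0);
            }
        }
    }
}
```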
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ... - Flink Forward
Moving from Lambda and Kappa Architectures to Kappa+ at Uber
Kappa+ is a new approach developed at Uber to overcome the limitations of the Lambda and Kappa architectures. Whether your realtime infrastructure processes data at Uber scale (well over a trillion messages daily) or only a fraction of that, chances are you will need to reprocess old data at some point.
There can be many reasons for this. Perhaps a bug fix in the realtime code needs to be retroactively applied (aka backfill), or there is a need to train realtime machine learning models on the last few months of data before bringing the models online. Kafka's data retention is limited in practice and generally insufficient for such needs. So data must be processed from archives. Aside from addressing such situations, enabling efficient stream processing on archived as well as realtime data also broadens the applicability of stream processing.
This talk introduces the Kappa+ architecture, which enables the reuse of streaming realtime logic (stateful and stateless) to efficiently process any amount of historic data without requiring it to be in Kafka. We shall discuss the complexities involved in this kind of processing and the specific techniques employed in Kappa+ to tackle them.
A presentation from an internal meeting on Message Broker Systems and RabbitMQ. RabbitMQ is open source message broker software that implements the Advanced Message Queuing Protocol (AMQP).
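A minimal round trip with the RabbitMQ Java client makes the AMQP workflow concrete; the host and queue name below are placeholders.

```java
// Declare a queue, publish a message, and consume it back with amqp-client 5.x.
import java.nio.charset.StandardCharsets;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class HelloAmqp {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            channel.queueDeclare("hello", false, false, false, null);
            channel.basicPublish("", "hello", null,
                                 "Hello AMQP!".getBytes(StandardCharsets.UTF_8));
            channel.basicConsume("hello", true,
                (tag, delivery) -> System.out.println(
                    new String(delivery.getBody(), StandardCharsets.UTF_8)),
                tag -> { });
            Thread.sleep(500); // give the consumer a moment before closing
        }
    }
}
```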
Processing Semantically-Ordered Streams in Financial Services - Flink Forward
Flink Forward San Francisco 2022.
What if my data is already in order? Stream Processing has given us an elegant and powerful solution for running analytic queries and logic over high volumes of continuously arriving data. However, in both Apache Flink and Apache Beam, the notion of time-ordering is baked in at a very low level, making it difficult to express computations that are interested in a semantic-, rather than time-ordering of the data. In financial services, what often matters the most about the data moving between systems is not when the data was created, but in what order, to the extent that many institutions engineer a global sequencing over all data entering and produced by their systems to achieve complete determinism. How, then, can financial institutions and others best employ Stream Processing on streams of data that are already ordered? I will cover various techniques that can make this work, as well as seek input from the community on how Flink might be improved to better support these use-cases.
by Patrick Lucas
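One technique in this space is to buffer events per key until the next expected sequence number arrives, restoring the semantic order inside a KeyedProcessFunction. The sketch below is a simplified illustration rather than anything from the talk: the Event type is hypothetical, and timeout and gap handling are omitted.

```java
// Re-establishing a sequence-number order per key with Flink managed state.
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

class Event { public long seq; }  // assumed shape: carries a global sequence number

public class SequenceOrderer extends KeyedProcessFunction<String, Event, Event> {
    private transient ValueState<Long> nextSeq;      // next sequence to emit
    private transient MapState<Long, Event> buffer;  // out-of-order arrivals

    @Override
    public void open(Configuration parameters) {
        nextSeq = getRuntimeContext().getState(
            new ValueStateDescriptor<>("nextSeq", Long.class));
        buffer = getRuntimeContext().getMapState(
            new MapStateDescriptor<>("buffer", Long.class, Event.class));
    }

    @Override
    public void processElement(Event e, Context ctx, Collector<Event> out) throws Exception {
        long expected = nextSeq.value() == null ? 0L : nextSeq.value();
        buffer.put(e.seq, e);
        while (buffer.contains(expected)) {          // drain in sequence order
            out.collect(buffer.get(expected));
            buffer.remove(expected);
            expected++;
        }
        nextSeq.update(expected);
    }
}
```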
We've added the presentation used by John Walter, Solution Architect for Red Hat's Training and Certification team, from our Accelerating with Ansible webinar. He discussed the emergence of radically simple Ansible automation and answered questions from attendees. Learn how Ansible automates cloud provisioning, configuration management, application deployment, intra-service orchestration, and many other IT needs. Also learn how Ansible is designed for multi-tier deployments from day one and how Ansible models your IT infrastructure by describing how all your systems inter-relate, rather than just managing one system at a time.
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi - DataWorks Summit
At Comcast, our team has been architecting a customer experience platform which is able to react to near-real-time events and interactions and deliver appropriate and timely communications to customers. By combining the low latency capabilities of Apache Flink and the dataflow capabilities of Apache NiFi we are able to process events at high volume to trigger, enrich, filter, and act/communicate to enhance customer experiences. Apache Flink and Apache NiFi complement each other with their strengths in event streaming and correlation, state management, command-and-control, parallelism, development methodology, and interoperability with surrounding technologies. We will trace our journey from starting with Apache NiFi over three years ago and our more recent introduction of Apache Flink into our platform stack to handle more complex scenarios. In this presentation we will compare and contrast which business and technical use cases are best suited to which platform and explore different ways to integrate the two platforms into a single solution.
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic - DataScienceConferenc1
This document provides an overview of the Databricks platform. It discusses how Databricks combines features of data warehouses and data lakes to create a "data lakehouse" that supports both business intelligence/reporting and data science/machine learning use cases. Key components of the Databricks platform include Apache Spark, Delta Lake, MLFlow, Jupyter notebooks, and Delta Live Tables. The platform aims to unify data engineering, data warehousing, streaming, and data science tasks on a single open-source platform.
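As a small illustration of the lakehouse building block mentioned above, here is what writing and reading a Delta Lake table looks like through the Spark Java API. The session config lines follow Delta's documented setup, and the path and data are placeholders; the delta-spark dependency is assumed on the classpath.

```java
// Round-tripping a Delta Lake table with the Spark Java API.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class DeltaRoundTrip {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("delta-demo")
            .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
            .config("spark.sql.catalog.spark_catalog",
                    "org.apache.spark.sql.delta.catalog.DeltaCatalog")
            .getOrCreate();

        Dataset<Row> df = spark.range(100).toDF();  // a tiny demo dataset
        df.write().format("delta").mode("overwrite").save("/tmp/events");
        spark.read().format("delta").load("/tmp/events").show(5);
    }
}
```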
Lambda Architecture has been a common way to build data pipelines for a long time, despite difficulties in maintaining two complex systems. An alternative, Kappa Architecture, was proposed in 2014, but many companies are still reluctant to switch to Kappa. And there is a reason for that: even though Kappa generally provides a simpler design and similar or lower latency, there are a lot of practical challenges in areas like exactly-once delivery, late-arriving data, historical backfill and reprocessing.
In this talk, I want to show how you can solve those challenges by embracing Apache Kafka as a foundation of your data pipeline and leveraging modern stream-processing frameworks like Apache Flink.
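For the exactly-once challenge specifically, Flink's transactional KafkaSink (the connector API introduced around Flink 1.14) is a common building block. The sketch below uses placeholder broker, topic, and prefix values; exactly-once also requires checkpointing, since transactions commit on checkpoints.

```java
// An end-to-end exactly-once Kafka sink in Flink via Kafka transactions.
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOncePipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(30_000);  // transactions commit on checkpoints

        DataStream<String> events = env.fromElements("a", "b", "c");  // stand-in source

        KafkaSink<String> sink = KafkaSink.<String>builder()
            .setBootstrapServers("broker:9092")
            .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic("enriched-events")
                .setValueSerializationSchema(new SimpleStringSchema())
                .build())
            .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
            .setTransactionalIdPrefix("pipeline-1")
            .build();

        events.sinkTo(sink);
        env.execute("exactly-once-pipeline");
    }
}
```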
OSMC 2022 | Ignite: Observability with Grafana & Prometheus for Kafka on Kube... - NETWAYS
Self-managing a highly scalable distributed system with Apache Kafka® at its core is not an easy feat. That’s why operators prefer tooling such as Confluent Control Center for administering and monitoring their deployments. However, sometimes, you might also like to import monitoring data into a third-party metrics aggregation platform for service correlations, consolidated dashboards, root cause analysis, or more fine-grained alerts. If you’ve ever asked a question along these lines: Can I export JMX data from Confluent clusters to my monitoring system with minimal configuration? What if I could correlate this service’s data spike with metrics from Confluent clusters in a single UI pane? Can I configure some Grafana dashboards for Confluent clusters?
This talk will enable you to achieve the following:
Monitoring Your Event Streams: Integrating Confluent with Prometheus and Grafana (this article)
Monitoring Your Event Streams: Tutorial for Observability Into Apache Kafka Clients
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum... - HostedbyConfluent
For a long time we have discussed how much data we can keep in Kafka. Can we store data forever, or do we remove data after a while and perhaps keep the history in a data lake on object storage or HDFS? With the advent of Tiered Storage in the Confluent Enterprise Platform, storing data in Kafka for much longer is very feasible. So can we replace a traditional data lake with just Kafka? Maybe at least for the raw data? But what about accessing the data, for example using SQL?
KSQL allows for processing data in a streaming fashion using an SQL-like dialect. But what about reading all the data of a topic? You can reset the offset and still use KSQL. But there is another family of products, so-called query engines for Big Data. They originate from the idea of reading Big Data sources such as HDFS, object storage or HBase using the SQL language. Presto, Apache Drill and Dremio are the most popular solutions in that space. Lately these query engines have also added support for Kafka topics as a source of data. With that, you can read a topic as a table and join it with information available in other data sources. The idea, of course, is not real-time streaming analytics but batch analytics directly on the Kafka topic, without having to store it in big data storage.
This talk answers how well these tools support Kafka as a data source. What serialization formats do they support? Is there some form of predicate push-down, or do we always have to read the complete topic? How performant is a query against a topic compared to a query against the same data sitting in HDFS or an object store? And finally, will this allow us to replace our data lake, or at least part of it, with Apache Kafka?
The document discusses intra-cluster replication in Apache Kafka, including its architecture, where partitions are replicated across brokers for high availability. Kafka uses a leader and in-sync replicas approach to provide strongly consistent replication while tolerating failures. Performance considerations in Kafka replication include latency and durability tradeoffs for producers and optimizing throughput for consumers.
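The producer-side tradeoff can be shown in a few lines: acks=all makes the producer wait for the in-sync replica set to acknowledge each write, trading latency for durability. Broker and topic names below are placeholders.

```java
// A producer configured for durability over latency (acks=all).
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DurableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.ACKS_CONFIG, "all");  // wait for all in-sync replicas
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("events", "key", "value"));
        }
    }
}
```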
Enabling product personalisation using Apache Kafka, Apache Pinot and Trino w... - HostedbyConfluent
Our core banking platform has been built using domain-driven design and microservices, and whilst this provides many well-known advantages, it also presents some challenges. Data encapsulation results in each application having its own data store, and it becomes impossible to query the state of a customer’s relationship in totality to provide the right products. This challenge becomes even harder if we want to personalize products based on aggregate values of a customer’s behavior over potentially large periods of time.
In this session, we describe how we overcame this problem to enable dynamic charging and rewards based on customer behavior in a banking scenario. We cover:
• How we guarantee consistency between our event stream and our OLTP databases using the Outbox pattern (sketched after this list).
• The design decisions faced when considering the schema designs in Pinot and how we balanced flexibility and latency using Trino
• Two patterns for enriching the event stream using Kafka Streams and how we dealt with late-arriving events and transactions.
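As a hedged illustration of the first bullet: in the Outbox pattern the business row and the outbox event are written in one database transaction, so a downstream relay (Debezium or a polling publisher, for example) can publish to Kafka without dual writes. The table and column names below are hypothetical, not the bank's schema.

```java
// Writing a business row and its outbox event atomically over JDBC.
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class OutboxWriter {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:postgresql://db/bank")) {
            conn.setAutoCommit(false);
            try (PreparedStatement tx = conn.prepareStatement(
                     "INSERT INTO transactions(id, account_id, amount) VALUES (?, ?, ?)");
                 PreparedStatement outbox = conn.prepareStatement(
                     "INSERT INTO outbox(aggregate_id, event_type, payload) VALUES (?, ?, ?)")) {
                tx.setLong(1, 42L);
                tx.setLong(2, 7L);
                tx.setBigDecimal(3, new BigDecimal("9.99"));
                tx.executeUpdate();
                outbox.setLong(1, 7L);
                outbox.setString(2, "TransactionPosted");
                outbox.setString(3, "{\"amount\": 9.99}");
                outbox.executeUpdate();
                conn.commit();  // both rows or neither: stream and OLTP stay consistent
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }
}
```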
Leveraging Nexus Repository Manager at the Heart of DevOps - SeniorStoryteller
Mike Worthington of Sonatype gave a presentation about leveraging Nexus Repository Manager. He discussed how Nexus can be used at different stages of a software development lifecycle, from a simple caching proxy to improve speed and consistency, to full integration with continuous integration and continuous delivery pipelines to improve quality. Worthington also explained how Nexus can be used to manage software components, enforcing policies on open source usage and alerting on policy violations. He emphasized that the repository is the hub that connects development, testing, and deployment across teams and environments.
SAP IT session on SAP Screen Personas at TechEd 2013 - Peter Spielvogel
Martin Lang's presentation on how SAP IT is using SAP Screen Personas to make screens more intuitive. He discusses two use cases: Accrual Cockpit and Time Entry for Interns. In both cases, users are more productive as they require fewer keystrokes to get their work done.
The document provides an overview of Camunda BPM and discusses typical questions executives may have when considering the product. It recommends a roadmap for introducing Camunda that involves first getting approval, then implementing an initial project to prove success before taking on additional projects. The roadmap outlines key tasks, stakeholders, and tips at each stage, and notes how Camunda can provide support. It also compares Camunda to robotic process automation (RPA) and emphasizes Camunda's ability to orchestrate end-to-end business processes across systems.
Delivering Analytical Workspaces and Rich Interactive Reports - Juan Fabian
Juan Rafael hosted a Power BI User Group event in Lima, Peru in June 2017. The event covered interactive reporting solutions in Dynamics 365, modern business documents for the application suite, and analytical workspaces. Juan Rafael discussed how to author reports using Power BI Desktop and extend applications using Visual Studio 2015 to create analytical workspaces and interactive reports. He also showed how legacy report designs can be reimagined for Dynamics 365 applications.
World of Watson 2016 - Data lake or Data Swamp - Keith Redman
All impoundments of water need a constant inflow of mostly pollution-free water, or they become stagnant. The Data Lake is no different.
IBM views the difference between the Data Lake and the Data Swamp as the constant flow of mostly pollution-free information that is governed and has its lifecycle managed. Check out these sessions on Information Governance to see how you can keep your Data Lake crystal clean.
This document provides a summary of Nayyar Shabbar's work experience and qualifications. He has over 20 years of experience in business analysis, project management, data warehousing, business intelligence, and analytics. He has worked on numerous large-scale projects for banks, insurance companies, and government organizations. Nayyar has extensive experience leading teams and delivering projects on time while working with various technologies.
The document summarizes the Fall 2020 release of the OpsRamp IT operations platform. It highlights new features for discovery and monitoring, event and incident management, and remediation and automation. These include an auto-monitoring wizard, curated dashboards, expanded container and cloud monitoring, human interaction for automated workflows, and multi-instance loops. The webinar included demonstrations of these features and a discussion.
The document discusses factors that drive sourcing success and describes the Quaest sourcing platform. Key points include:
1) Sourcing strategies are based on sophisticated analyses, but experience and interaction are also important to ensure practical execution. Price accounts for 20% of impact while specification and demand management make up 80%.
2) The Quaest platform provides configurable sourcing apps and tools, pre-defined analytics, and a process environment hosted on Microsoft SharePoint. It includes over 50 apps, 100 analytics, and support from a tutor network.
3) Quaest provides clients with dedicated sites that hold client data separately from other clients. The solution is integrated with Microsoft Office and hosted securely in Germany.
Business intelligence: A tool that could help your business - Beyond Intelligence
Business intelligence (BI) is a set of tools and technologies that analyze raw data and convert it into useful information that can help businesses make better decisions. BI databases store facts, like sales amounts, and assign multiple attributes to each fact using a "star schema." This allows for fast analysis of data stored in online analytical processing (OLAP) databases. OLAP databases have dimensions like time, accounts, customers, and products that can be aggregated on the fly. Extract, transform, and load (ETL) tools are used to retrieve raw data, transform it into a format for OLAP databases, and load the data into "cubes" for analysis. User interfaces for BI tools include web interfaces for sharing reports, mobile interfaces
Harish Gaddale is an integration architect with 8 years of experience in integration architecture and development. He has extensive experience with TIBCO products and full SDLC, and has worked on numerous integration projects for customers like NXP Semiconductors. Some of his responsibilities have included designing integration architectures, developing interfaces, unit testing, supporting go-lives, and providing technical consultancy. He is proficient in technologies like TIBCO Suite, Web services, EDI, and SaaS integrations with applications such as Workday, Ariba, and Salesforce.
Modern Thinking: How Big Data and Cognitive are changing Marketing strategy
By: Ismael Yuste, Strategic Cloud Engineer, Google Cloud
Presentation: An introduction to Google's Big Data solutions
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o... - Daniel Zivkovic
Two #ModernDataStack talks and one DevOps talk: https://youtu.be/4R--iLnjCmU
1. "From Data-driven Business to Business-driven Data: Hands-on #DataModelling exercise" by Jacob Frackson of Montreal Analytics
2. "Trends in the #DataEngineering Consulting Landscape" by Nadji Bessa of Infostrux Solutions
3. "Building Secure #Serverless Delivery Pipelines on #GCP" by Ugo Udokporo of Google Cloud Canada
We ran out of time for the 4th presenter, so the event will CONTINUE in March... stay tuned! Compliments of #ServerlessTO.
This document discusses implementing a data warehouse for an insurance company called Axiom Care. It outlines the problems with the company's current inability to provide strategic information to decision makers. The proposed solution is to create a data warehouse with schemas representing policy creation and claims processing data. It provides a work breakdown structure and job schedule for the project. Risks are also identified such as unrealistic timelines, budget issues, and technical challenges around data quality and scalability.
As one of the top four Dutch financial institutions, SNS Bank in the Netherlands made a strategic decision to use technology to empower its customers online by fully automating its service and selling channels. In order to effectively move toward a full-scale straight-through processing (STP) experience, SNS Bank chose to achieve its goals by making use of open source software, service-oriented architecture (SOA), and business process management (BPM).
In this session, SNS Bank’s Michel Blok and Red Hat’s Eric Schabell will:
- Take attendees through the history of SNS Bank, laying the groundwork for the vision and strategy for choosing JBoss open source solutions
- Explain the move from a traditional bank to a modern Internet bank providing innovative selling channels
- Describe the existing architecture, detailing the impact this move has had on existing IT systems and the migration efforts to position open source solutions
- Provide a closer look at the lessons learned along the way, giving insight into a working open source STP BPM solution that is cost effective, reliable, flexible, and tailored to evolve with SNS Bank into the future
The SAP Startup Focus Program – Tackling Big Data With the Power of Small by ... - Codemotion
Geared exclusively towards helping startups master big data to the benefit of their users, the SAP Startup Focus Program has truly gone global since its initiation in March 2012. The in-memory database platform SAP HANA forms the basis of this initiative.
Marcus and Sönke from the SAP Innovation Center will introduce the program and provide technical insights into the unique capabilities of SAP HANA in a hands-on manner.
A Journey to a Serverless Business Intelligence, Machine Learning and Big Dat... - DataWorks Summit
In this talk we will describe the journey we made with one of our customers, Volotea, to deploy a serverless Business Intelligence (BI), Machine Learning (ML) and Big Data (BD) platform on the Cloud. The new platform leverages Platform-as-a-Service (PaaS) Cloud services, and it is the result of the reengineering and extension of an existing platform based on Cloud Infrastructure-as-a-Service (IaaS) services and bare-metal systems. Managing and maintaining BI, ML and BD platforms based on bare-metal or IaaS deployments is not a straightforward task, and as size and complexity grow, we often find ourselves spending more and more time in tasks that are rather administrative, more than of a development or analytics nature. That is exactly what Volotea realized, and together we envisioned and executed a plan to lift and reengineer their platform into a new solution that leverages Microsoft Azure PaaS services. We have delivered a solution that manages to greatly reduce the administrative burden as well as the technical complexity when implementing new use cases. The new platform is based on the Microsoft Azure stack and it includes Azure Data Lake, Azure Data Lake Analytics, Azure Data Factory, Azure Machine Learning and Azure SQL Database. Join us in this talk where we will share our lessons learned and we will discuss how to plan and execute such an endeavor.
The document analyzes the needs for creating a SAP PI integration platform at Vattenfall Sweden. It discusses Vattenfall's business environment undergoing changes due to energy market regulations. An analysis is presented of the options of SAP PI and Microsoft BizTalk for the integration platform. The analysis finds that while SAP PI should be used for SAP-to-SAP integration, BizTalk is more suitable currently due to lack of SAP PI expertise, its unclear roadmap, BizTalk's maturity, and avoiding vendor lock-in. The proposal is to use BizTalk for 1-2 years and monitor SAP PI's development to potentially switch to it if it becomes more mature and widely adopted.
The document provides information about SAP modules and ERP concepts. It defines ERP as enterprise resource planning software that integrates various business functions like finance, manufacturing, supply chain, and human resources. The key modules described include FI, CO, MM, SD, and Basis. It also outlines SAP architecture and positions like functional consultant. Dashboards and visualization tools are presented as ways to access and report on ERP data.
Similar to Deploying Flowable at scale in AWS
This document summarizes Dario Nascimben's presentation on creating a flexible workflow using Flowable. It discusses the need for flexibility in knowledge-intensive tasks where not all cases can be predefined. It presents a taxonomy of flexibility, including flexibility by change, underspecification, deviation, and design. It then describes how Dario created custom tasks in Flowable by defining stages as JSON objects, allowing flexibility while still tracking progress. This approach decouples the real process from the software-embedded process and provides benefits like rapid development and support for CMMN standards.
How SAP uses Flowable as its BPMN engine for SAP CP Workflow - Flowable
This document discusses SAP's use of Flowable as the BPMN engine for SAP Cloud Platform Workflow. It provides an overview of SAP Cloud Platform and how Workflow fits into the platform. It also describes the architecture of SAP Cloud Platform Workflow and how it supports both PaaS and SaaS models. Additionally, it outlines SAP's journey to migrating from Activiti to Flowable as the BPMN engine.
1) SAP has evolved its business process management capabilities since 1996 from a focus on embedded workflows to intelligent BPM in the cloud.
2) The document discusses SAP's capabilities for intelligent business process management including intelligent RPA, business rules, process visibility, process mining, and multicloud.
3) Examples are provided of customers achieving a 15% productivity increase in oil well operations and over a 10x reduction in capital approval times through automated workflows.
Flowable BPM has many low-code features from its core BPMN, CMMN and DMN models. The enterprise version has additional models that help define more complex solutions.
The document summarizes updates to the Flowable project, including strong growth in the community, a focus on releases 6.4 and 6.5, and improvements to the BPMN, CMMN, and DMN engines. New features include better support for CMMN models, entity linking, improved event handling, batch processing, and history cleanup. Upcoming work includes the 6.5 release, documentation, and blog posts on event architectures and combining CMMN and BPMN.
Migrating business process instances is non-trivial, but Flowable provides advanced capabilities to migrate complex processes, also in batch and test modes.
The document discusses challenges with error analysis in BPMN and CMMN execution using Flowable. It notes that not all necessary data is captured in historic tables due to rollbacks not being stored and transactional behavior. Examples are provided where failures in asynchronous jobs, straight-through processes, and service tasks result in no failure data being recorded. The document then covers logging capabilities in Flowable, including log events captured during transactions, and how Flowable Insight can integrate with logging for improved error analysis. Next steps discussed are enhancing logging event types and controls and further developing Flowable Insight features.
Flowable Business Processing from Kafka Events - Flowable
Slides of the Presentation "Flowable Business Processing from Kafka Events" given by Joram Barrez (Software Architect at Flowable) and Tijs Rademakers (VP of Engineering at Flowable) at DevoXX Belgium, 04.11.2019 - 06.11.2019.
BpmNEXT2019 - The Case of Intentional Process - Flowable
“The Case of the Intentional Process” given by our Chief Product Officer, Paul Holmes-Higgin, and our Chief Technology Officer, Micha Kiener, at bpmNEXT 2019 in Santa Barbara, California.
Joram Barrez and Tijs Rademakers, Principal Software Engineers at Flowable, present the current state of (Flowable) things.
It was presented at FlowFest 2018 in Barcelona, Spain.
Flowable: Building a crowd-sourced document extraction and verification system - Flowable
This document describes a crowd-sourced document verification system built using Flowable to replace a legacy solution. The system orchestrates machine and human tasks at scale to verify financial documents. It uses Flowable's workflow engine embedded with a Spring Boot application. The architecture includes custom UIs, a mobile app, and real-time notifications. Lessons learned include understanding asynchronous tasks, failure handling, and process migration bottlenecks with large history tables. The outcome is a highly scalable system handling millions of tasks per month across hundreds of concurrent users.
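To make "Flowable's workflow engine embedded with a Spring Boot application" concrete, here is a minimal sketch of starting a process instance through Flowable's RuntimeService API. The process key and variable are hypothetical, not from the talk.

```java
// Starting one BPMN process instance per uploaded document via RuntimeService.
import java.util.Map;
import org.flowable.engine.RuntimeService;
import org.springframework.stereotype.Service;

@Service
public class VerificationService {
    private final RuntimeService runtimeService;  // auto-configured by the Flowable starter

    public VerificationService(RuntimeService runtimeService) {
        this.runtimeService = runtimeService;
    }

    public String startVerification(String documentId) {
        return runtimeService
            .startProcessInstanceByKey("verifyDocument", Map.of("documentId", documentId))
            .getId();
    }
}
```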
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows a seven-step method for delivering its services to customers, called the Software Development Life Cycle (SDLC).
Requirement — Collecting the requirements is the first phase in the SDLC process.
Feasibility Study — after completing the requirements phase, they evaluate the project's feasibility before moving on to design.
Design — in this phase, they start designing the software.
Coding — when designing is completed, the developers start coding for the software.
Testing — once coding of the software is done, the testing team starts testing.
Installation — after testing is complete, the application is deployed to the live server and launched!
Maintenance — after development is complete and customers start using the software, it is maintained and updated as needed.
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris - Neo4j
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Discover the latest innovations from Neo4j, including the latest cloud integrations and product improvements that make Neo4j an essential choice for developers building applications with connected data and generative AI.
An Enterprise Resource Planning system includes various modules that reduce any business's workload. Additionally, it organizes workflows, which drives enhanced productivity. Here is a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite - Google
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
See My Other Reviews Article:
(1) TubeTrivia AI Review: https://sumonreview.com/tubetrivia-ai-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
OpenMetadata Community Meeting - 5th June 2024OpenMetadata
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed the data quality capabilities that are integrated with the Incident Manager, providing a complete solution for your data observability needs. Watch the end-to-end demo of the data quality features; a generic sketch of such a test appears after this entry.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* How to get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
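As a tool-agnostic illustration of the kind of test case discussed above, the following Python sketch embeds a simple data-quality check in an ETL step. The test name, row shape, and failure handling are assumptions; a real deployment would report results to a platform like OpenMetadata and its Incident Manager rather than raising locally.

```python
# Generic data-quality test wired into an ETL step (illustrative only).
def test_no_null_ids(rows: list[dict]) -> dict:
    """Count rows whose 'id' column is missing; the test passes when none are."""
    failed = sum(1 for r in rows if r.get("id") is None)
    return {"test": "no_null_ids", "passed": failed == 0, "failed_rows": failed}

def etl_step(rows: list[dict]) -> list[dict]:
    result = test_no_null_ids(rows)
    if not result["passed"]:
        # In a real pipeline this would open an incident / fire an alert.
        raise ValueError(f"Data quality failure: {result}")
    return rows

etl_step([{"id": 1}, {"id": 2}])  # passes silently
```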
Workshop - Innovating with Generative AI and Knowledge GraphsNeo4j
Go beyond the media hype around AI and discover practical techniques for using AI responsibly across your organization's data. Explore how knowledge graphs can increase accuracy, transparency, and explainability in generative AI systems. You will leave with hands-on experience combining data relationships and LLMs to bring domain-specific context and improve reasoning.
Bring your laptop and we will walk you through setting up your own generative AI stack, with practical, coded examples to get you started in minutes.
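In the same spirit as the workshop's hands-on examples, here is a minimal sketch of grounding an LLM prompt with graph context. It assumes a local Neo4j instance with placeholder credentials, an invented Product schema, and a hypothetical call_llm() helper, so treat it as an outline rather than the workshop's actual material.

```python
# Sketch: fetch graph neighborhood from Neo4j and use it to ground an LLM prompt.
from neo4j import GraphDatabase

# Placeholder connection details for a local instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def graph_context(product: str) -> str:
    """Collect facts around a product node as plain-text triples."""
    query = (
        "MATCH (p:Product {name: $name})-[r]->(n) "
        "RETURN type(r) AS rel, n.name AS target LIMIT 25"
    )
    with driver.session() as session:
        rows = session.run(query, name=product)
        return "\n".join(f"{product} {row['rel']} {row['target']}" for row in rows)

def call_llm(prompt: str) -> str:
    # Hypothetical helper: plug in your LLM provider of choice here.
    raise NotImplementedError

def ask(question: str, product: str) -> str:
    prompt = f"Context:\n{graph_context(product)}\n\nQuestion: {question}"
    return call_llm(prompt)
```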
Artificial Intelligence and XPath Extension FunctionsOctavian Nadolu
The purpose of this presentation is to provide an overview of how you can use AI from XSLT, XQuery, Schematron, or XML Refactoring operations, the potential benefits of using AI, and some of the challenges we face.
Software Engineering, Software Consulting, Tech Lead, Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Transaction, Spring MVC, OpenShift Cloud Platform, Kafka, REST, SOAP, LLD & HLD.
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to help students learn programming -- could variable roles also help deep neural models perform coding tasks? We conduct an exploratory study.
- These are the slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne, Australia.
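For readers unfamiliar with the concept, the snippet below annotates a few classic variable roles from Sajaniemi's taxonomy; the function itself is an invented example, not code from the study.

```python
# Illustrative variable roles (Sajaniemi's taxonomy) annotated in comments --
# the kind of extra signal the study explores feeding to neural models of code.
def average_above(values, threshold):
    count = 0          # stepper: counts qualifying items one step at a time
    total = 0.0        # gatherer: accumulates a running sum
    largest = None     # most-wanted holder: best value seen so far
    for v in values:   # v: most-recent holder, the latest element examined
        if v > threshold:   # threshold: fixed value, never reassigned
            count += 1
            total += v
            if largest is None or v > largest:
                largest = v
    return (total / count if count else 0.0), largest

print(average_above([3, 9, 4, 12], threshold=5))  # ((9 + 12) / 2, 12)
```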
E-commerce Application Development Company.pdfHornet Dynamics
Your business can reach new heights with our assistance, as we design solutions specifically tailored to your goals and vision. Our eCommerce application solutions digitally coordinate all retail operations to meet marketplace demands while maintaining business continuity.
Launch Your Streaming Platforms in MinutesRoshan Dwivedi
The claim of launching a streaming platform in minutes might be a bit of an exaggeration, but there are services that can significantly streamline the process. Here's a breakdown:
Pros of Speedy Streaming Platform Launch Services:
No coding required: These services often use drag-and-drop interfaces or pre-built templates, eliminating the need for programming knowledge.
Faster setup: Compared to building from scratch, these platforms can get you up and running much quicker.
All-in-one solutions: Many services offer features like content management systems (CMS), video players, and monetization tools, reducing the need for multiple integrations.
Things to Consider:
Limited customization: These platforms may offer less flexibility in design and functionality compared to custom-built solutions.
Scalability: As your audience grows, you might need to upgrade to a more robust platform or encounter limitations with the "quick launch" option.
Features: Carefully evaluate which features are included and if they meet your specific needs (e.g., live streaming, subscription options).
Examples of Services for Launching Streaming Platforms:
Muvi (muvi.com)
Uscreen (uscreen.tv)
Alternatives to Consider:
Existing Streaming platforms: Platforms like YouTube or Twitch might be suitable for basic streaming needs, though monetization options might be limited.
Custom Development: While more time-consuming, custom development offers the most control and flexibility for your platform.
Overall, launching a streaming platform in minutes might not be entirely realistic, but these services can significantly speed up the process compared to building from scratch. Carefully consider your needs and budget when choosing the best option for you.
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its supersonic, subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them any less interesting - quite the opposite.
Come join this talk for tips and tricks on using Quarkus, covering some of its lesser-known features, extensions, and development techniques.
WhatsApp offers simple, reliable, and private messaging and calling services for free worldwide. With end-to-end encryption, your personal messages and calls are secure, ensuring only you and the recipient can access them. Enjoy voice and video calls to stay connected with loved ones or colleagues. Express yourself using stickers, GIFs, or by sharing moments on Status. WhatsApp Business enables global customer outreach, facilitating sales growth and relationship building through showcasing products and services. Stay connected effortlessly with group chats for planning outings with friends or staying updated on family conversations.
1. Deploying Flowable at scale in AWS
Dow Jones Professional Information Business - Flowfest, November 2018
2. Introduction
● Dow Jones Professional Information Business - what does that really mean?
● Dealing with content, human research, and automated processing at scale
● Our historic approach, and why we needed BPM
● Choosing Flowable - as easy as ABC (anything but closed-source :) )
● Our architecture, how we manage and evolve it
● Challenges
● Future Expansion and Long term goals
3. Dow Jones Professional Information Business
A Brief Overview
Millions of articles flow into our research engine, Factiva.
4. Dow Jones Professional Information Business
A Brief Overview (cont.)
A range of processing steps occur in our content pipeline, variously using tools like rules-based coding engines, normalisation/transformation, ML models, etc.
5. Dow Jones Professional Information Business
A Brief Overview (cont.)
Historically, we scaled with people. If we needed to cover more content, or extract more structured data, we'd just add more people!
6. Dow Jones Professional Information Business
A Brief Overview (cont.)
We knew BPM was part of the solution. While it doesn't magically do all of the work, it helps us standardise, decide what a "task" really is, highlight the best areas for automation, unlock insight, and more!
7. Choosing Flowable
● Light vs Heavy BPM
● Ease of Adoption
● Other features, e.g. Forms
● Support model(s)
12. Challenges
● Decoupling the BPM engine from the custom workflow application (sketched below)
● User/group integration with the external user management application
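As a rough illustration of that decoupling, the sketch below drives a remote Flowable engine through its standard REST API instead of embedding it in the workflow application; the base URL, credentials, and process definition key are placeholders, not Dow Jones's actual setup.

```python
# Sketch: starting a process on a decoupled Flowable engine via its REST API.
import requests

FLOWABLE = "http://localhost:8080/flowable-rest/service"  # placeholder URL
AUTH = ("rest-admin", "test")  # placeholder demo credentials

def start_process(definition_key: str, variables: dict) -> str:
    """Start a process instance without embedding the engine in the caller."""
    payload = {
        "processDefinitionKey": definition_key,
        "variables": [{"name": k, "value": v} for k, v in variables.items()],
    }
    resp = requests.post(f"{FLOWABLE}/runtime/process-instances",
                         json=payload, auth=AUTH)
    resp.raise_for_status()
    return resp.json()["id"]

# e.g. start_process("contentReviewProcess", {"articleId": "A-42"})
```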
13. Future Expansion and Long Term Goals
● History / archive solution
● Deeper analytics; share data with the operational and customer data lake
● Parallel deployments across other major businesses within Dow Jones - Ad Tech
● Using endpoints from AWS SageMaker to serve ML models