Monitoring modern real-time distributed infrastructure is complex and expensive. In this talk we explore Riemann and, specifically, how Riemann's low latency helped us get real-time metrics from our distributed systems.
Riemann is an open-source monitoring tool that aggregates events from servers and applications. It uses a powerful stream processing language to aggregate events in real-time. Riemann can process millions of events per second, making it well-suited for monitoring dynamic distributed systems. It allows full control over infrastructure and application monitoring through a highly configurable Clojure-based configuration file. Events in Riemann are immutable Clojure maps that get passed through configurable streams for aggregation, modification, and alerting. Streams can filter, transform, and route events to indexes, databases, or alerting systems. This provides a flexible way to monitor systems and applications and respond to issues in real-time.
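To make the model above concrete, here is a minimal Python analogy of Riemann's stream idea (Riemann itself is configured in Clojure, so everything below — the `where` combinator, the sink names — is illustrative, not Riemann's actual DSL): events are immutable maps pushed through composable stream functions that filter and route them to sinks.

```python
# Hypothetical Python analogy of Riemann's stream model: events are
# immutable maps, streams are functions that pass events to children.
from types import MappingProxyType

alerts = []   # stands in for an alerting sink
index = {}    # stands in for Riemann's index, keyed by (host, service)

def where(pred, *children):
    """Forward an event to the child streams only if pred(event) is true."""
    def stream(event):
        if pred(event):
            for child in children:
                child(event)
    return stream

def index_stream(event):
    index[(event["host"], event["service"])] = event

def alert_stream(event):
    alerts.append(event)

# Route high-CPU events to both the index and the alert sink.
pipeline = where(lambda e: e["service"] == "cpu" and e["metric"] > 0.9,
                 index_stream, alert_stream)

event = MappingProxyType({"host": "web-1", "service": "cpu", "metric": 0.95})
pipeline(event)  # the event matches, so both sinks receive it
```

The point of the sketch is the composition: because streams are just functions of events, filtering, transformation, and routing can be nested arbitrarily, which is what makes Riemann's Clojure configuration so flexible.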
ITMAGINATION Data Science Summit 2019: Shiny Dashboards (ITMAGINATION)
This document discusses using Shiny to create interactive dashboards with streaming data. It covers the Shiny architecture, including reactive programming to handle fluid changing data from scheduled updates, ad hoc user inputs, and live streaming. Specific functionality is demonstrated using reactiveFileReader to automatically update data from changing files, eventReactive to handle user button presses, and invalidateLater for approaching streaming data in micro-batches to avoid overloading databases or APIs. The document concludes that Shiny enables fast development of small-scale applications and prototypes to visualize changing data, but care needs to be taken with app performance for large-scale or continuous streaming workloads.
Flink Forward Berlin 2018: Brian Wolfe - "Upshot: distributed tracing using F..." (Flink Forward)
Distributed tracing is used to analyze performance and error cases in service oriented architectures. The Observability team at Airbnb recently created Upshot, a data pipeline that uses Flink to analyze over 40 million trace events per minute. Summaries of the resulting data are sent to Druid, Datadog, and other downstream datastores. This talk will focus on how we use Flink and how we analyzed and addressed scaling issues we encountered while building Upshot.
WSO2Con USA 2015: Patterns for Deploying Analytics in the Real World (WSO2)
This document discusses different patterns for deploying analytics in real-world applications. It outlines batch analytics for processing large stored data, real-time analytics for making sense of fast moving data, interactive analytics for near real-time search of indexed data, and predictive analytics to analyze existing data and predict future events. It also discusses combining batch and real-time analytics by using batch results in real-time flows, and combining real-time and predictive analytics by applying predictive models to real-time data. Finally, it provides examples of WSO2 solutions that apply these patterns, such as solutions for fraud detection and log analytics.
The document discusses challenges with error analysis in BPMN and CMMN execution using Flowable. It notes that not all necessary data is captured in historic tables due to rollbacks not being stored and transactional behavior. Examples are provided where failures in asynchronous jobs, straight-through processes, and service tasks result in no failure data being recorded. The document then covers logging capabilities in Flowable, including log events captured during transactions, and how Flowable Insight can integrate with logging for improved error analysis. Next steps discussed are enhancing logging event types and controls and further developing Flowable Insight features.
With the recent adoption of Confluent and Kafka Streams, organizations have experienced significantly improved system stability with a real-time processing framework, as well as improved scalability and lower maintenance costs.
The focus of this webinar is:
~Different join operators in Kafka Streams.
~Exploring join semantics in Kafka Streams, both with and without shared keys.
~How to put the application owner in control by leveraging a simplified app-centric architecture.
If you have any queries, contact Himani by email at himani.arora@knoldus.in
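The difference between join variants that the webinar covers can be illustrated with a few lines of plain Python (this is a sketch of the semantics only, not the Kafka Streams API): a keyed record stream joined against a table, inner vs. left.

```python
# Illustrative join semantics on shared keys (plain Python, not the
# Kafka Streams API): orders play the role of a KStream, users a KTable.
orders = [("u1", "book"), ("u2", "pen"), ("u3", "lamp")]
users = {"u1": "Alice", "u2": "Bob"}

def inner_join(stream, table):
    # Emit only records whose key exists on both sides.
    return [(k, v, table[k]) for k, v in stream if k in table]

def left_join(stream, table):
    # Emit every stream record; a missing table match becomes None.
    return [(k, v, table.get(k)) for k, v in stream]

inner = inner_join(orders, users)  # u3 is dropped: no matching user
left = left_join(orders, users)    # u3 is kept, joined with None
```

Joining without a shared key, as the webinar discusses, requires first re-keying (repartitioning) one side so the records to be joined land on the same key, which the sketch above deliberately leaves out.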
The document summarizes updates to the Flowable project, including strong growth in the community, a focus on releases 6.4 and 6.5, and improvements to the BPMN, CMMN, and DMN engines. New features include better support for CMMN models, entity linking, improved event handling, batch processing, and history cleanup. Upcoming work includes the 6.5 release, documentation, and blog posts on event architectures and combining CMMN and BPMN.
High Volume Streaming Data: How Amazon Web Services is Changing Our Approach (Michael Krouze)
Technologies for the capture and analysis of streaming data have changed over the years, and cloud technologies have taken us to a new level. Many people are not aware of the new technologies and architectural paradigms that are available today for near-real-time capture and analysis of high-volume data.
This presentation will examine Amazon Web Services’ offerings for streaming data analysis, compare how it’s changed over the years, and take a look at what might be coming in the future. Real-life case-studies and architectures will be shared to demonstrate how these technologies can, and have been, used to successfully meet customer needs.
Apache Spark has been rapidly gaining steam, both in the headlines and in real-world adoption. Spark was developed in 2009 and open sourced in 2010. Since then, it has grown to become one of the largest open source communities in big data, with over 200 contributors from more than 50 organizations. This open source analytics engine stands out for its ability to process large volumes of data significantly faster than contemporaries such as MapReduce, primarily owing to in-memory storage of data in its own processing framework. One of the top real-world industry use cases for Apache Spark is its ability to process streaming data.
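Spark's classic streaming model processes the stream as a series of micro-batches folded into running state. The toy below shows that idea in plain Python (an assumed illustration, not the PySpark API): each batch is reduced independently, then merged into global state, in the spirit of stateful word counting.

```python
# Micro-batch stream processing sketch (plain Python, not PySpark):
# the "stream" arrives as discrete batches of lines; each batch's word
# counts are folded into a running global count.
from collections import Counter

def process_batch(state, batch):
    # Count the words in this micro-batch and merge into global state.
    state.update(word for line in batch for word in line.split())
    return state

stream = [
    ["to be or", "not to be"],
    ["to stream or to batch"],
]

counts = Counter()
for batch in stream:
    counts = process_batch(counts, batch)
# "to" appears twice in each batch, four times overall
```

In Spark Streaming the batches would be RDDs arriving on a fixed interval and the fold would be a stateful operation distributed across the cluster, but the per-batch-then-merge shape is the same.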
OSMC 2015: Grafana and the Future of Metrics Visualization by Torkel Ödegaard (NETWAYS)
An introduction to the open source software Grafana, a graph and dashboard composer with rich metric query builders and visualizations. Learn why Grafana has quickly become the leading frontend for time series databases like Graphite, InfluxDB and OpenTSDB. We then take a look at how we can improve the state of metric visualization and how we can better integrate metrics with alerting.
Streaming at Lyft, Gregory Fee, Seattle Flink Meetup, Jun 2018 (Bowen Li)
Gregory Fee presented on Lyft's use of streaming technologies like Kafka and Flink. Lyft uses streaming for real-time tasks like traffic updates and fraud detection. Previously they used Kinesis and Spark/Hive but are moving to Kafka and Flink for better scalability and developer experience. Lyft's Dryft platform provides consistent feature generation for machine learning using Flink SQL to process streaming and batch data. Dryft programs can backfill historical data and process real-time streams.
Serverless Days Milano - Developing Serverless Applications with GraphQL (Marcia Villalba)
This is the presentation that I gave at Serverless Days Milano 2019. It's a 10-minute presentation with lots of videos.
If you want to learn more about AppSync check my playlist on how to get started with this. https://www.youtube.com/playlist?list=PLGyRwGktEFqdX2cjO5xQVKb96q2DpwASR
MineExcellence Digital Mine: mine safety and e-compliance v1.0 (Mason Taylor)
The document discusses MineExcellence's Digital Mine platform, which aims to increase safety in the mining industry through greater compliance with safety standards and procedures. The platform allows mining organizations to digitize their standard operating procedures and conduct inspections and audits on mobile devices. It includes features like geo-location tracking and digital signatures to ensure authenticity. The platform's real-time dashboards are meant to provide transparency and reliable safety metrics to improve safety performance and culture.
Migrating business process instances is non-trivial, but Flowable provides advanced capabilities to migrate complex processes, including in batch and test modes.
Managing Large Scale Financial Time-Series Data with Graphs (Objectivity)
Slides from a recent webinar by Objectivity showing how the ThingSpan platform is ideal for graph analytics to uncover patterns and insights within large, complex data sets in order to make efficient decisions.
Scalable Dynamic Data Consumption on the Web (Ruben Taelman)
The document discusses reducing server load for dynamic web data by moving continuous query evaluation from servers to clients. It proposes doing this in three steps: scalable data storage and publication, efficient data transmission using compression and caching, and continuous evaluation on clients. Several research questions are posed around how to publish real-time and historical data together so that it can be queried efficiently, stored in a way that allows efficient data transfer, and evaluated client-side over both static and dynamic data. The hypotheses are that new data can be stored and retrieved in time linear in the amount of data, and that server costs will be lower than the alternatives, with data transfer being the main factor influencing query times.
Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytic..." (Flink Forward)
The application of quantitative analytics to trades for the generation of risk and P&L metrics has traditionally followed a batch-based approach. Regulatory changes impose increasing compute demands on financial institutions, along with a growing demand for real-time analytics due to increased volumes in eTrading across all asset classes.
The talk is based on a use case for pricing interest rate swaps using Apache Beam, with a call to an external C++ analytics process. It describes the performance characteristics when operating in a non-cloud environment using Apache Flink as opposed to Google Cloud Dataflow.
The talk will touch upon the subtle differences when operating across multiple runners and make suggestions on approaches to portability when architecting for a multi-runner operational environment.
Moving RDF Stream Processing to the Client (Ruben Taelman)
Stream-processing SPARQL endpoints hosted on web servers are expensive due to an unknown number of clients, unbounded query complexity, and the server doing all the work while clients wait for results. Publishing dynamic data with Triple Pattern Fragments and making clients contribute more to the processing addresses this by annotating triples with timestamps, having clients re-evaluate queries as needed based on the timestamps, and designing the server interface to handle simple requests while putting most of the work on clients.
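The core trick — timestamp-annotated data letting clients decide when to re-evaluate — can be sketched in a few lines of Python. All names here (`fetch_triples`, the `expires` field, the query shape) are illustrative stand-ins, not the actual Triple Pattern Fragments or SPARQL interfaces.

```python
# Sketch of client-side re-evaluation over expiry-annotated triples:
# the client caches server responses and re-fetches only when the
# annotations say the data has gone stale.
server_fetches = 0

def fetch_triples(now):
    """Stand-in for a simple triple-pattern request to the server."""
    global server_fetches
    server_fetches += 1
    # Each triple carries an expiry-timestamp annotation.
    return [{"s": "train:42", "p": "delay", "o": "3min", "expires": now + 10}]

cache = []

def query(now):
    global cache
    # Re-evaluate on the client only when cached data has expired.
    if not cache or any(t["expires"] <= now for t in cache):
        cache = fetch_triples(now)
    return [(t["s"], t["o"]) for t in cache]

query(now=0)    # first call hits the server
query(now=5)    # still fresh: answered from the client-side cache
query(now=12)   # expired: the client re-fetches
```

Three queries cost the server only two simple requests; the re-evaluation logic lives entirely on the client, which is exactly the cost shift the abstract describes.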
Apache Airflow is an open-source workflow management platform that was created at Airbnb in 2014 to author, schedule, and monitor complex workflows. It allows users to define workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler then executes the tasks on workers based on dependencies. Airflow is commonly used for ETL pipelines, data processing, machine learning workflows, and automating devops tasks like monitoring cron jobs. Companies like Robinhood and Google use Airflow for complex data workflows and as a managed service on Google Cloud.
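The scheduling idea at the heart of Airflow — run each task only after all of its upstream dependencies have completed — can be shown with the standard library alone (this is a sketch of the concept, not the Airflow API; the task names are made up):

```python
# Minimal DAG-scheduling sketch (plain Python, not the Airflow API):
# tasks form a directed acyclic graph, and a valid run order places
# every task after all of its upstream dependencies.
from graphlib import TopologicalSorter

# A tiny ETL-style DAG; each key maps a task to its upstream tasks.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

run_order = list(TopologicalSorter(dag).static_order())
# run_order lists every task after all of its dependencies
```

In Airflow the same structure is declared with operators and `>>` dependencies inside a `DAG` object, and the scheduler dispatches ready tasks to workers in parallel rather than in a single linear order.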
This document discusses using R for data science to analyze a case study on business processes in the Silver Economy sector. It covers preparing event log data from CSV files, performing exploratory analysis on the event log, visualizing processes and dashboards, and applying process mining techniques like process discovery and conformance checking. The case study examines a process for qualifying and assessing risk levels from alerts in a system for automatic falls detection in elderly users.
Understanding Business APIs through Statistics (WSO2)
This document discusses using statistics and data analysis to understand API usage. It describes WSO2's tools for offline and real-time analysis of API data. For offline analysis, the API Manager integrates with WSO2 Business Activity Monitor (BAM) which aggregates event streams, stores the data in Cassandra, analyzes it using Hive, and stores summaries in a relational database. For real-time analysis, the API Manager integrates with WSO2 Complex Event Processing (CEP) which executes queries over event streams to identify patterns like excessive requests from a client. It also discusses integrating Google Analytics for additional monitoring and visualization of API usage statistics.
DEM04 Fearless: From Monolith to Serverless with Dynatrace (Amazon Web Services)
When you break your monolith into components, services, or functions, you must understand where and how to break your existing code base and architecture into smaller units so that it scales, performs, and is easy to operate. In this session, Andreas Grabner, technical AWS advocate, shows you how Dynatrace redefined its architecture. He discusses the migration capabilities Dynatrace engineers built into their product and explains how the lessons learned can help you fearlessly transition from monolith to serverless. This session is brought to you by AWS Partner, Dynatrace.
DEM09 [Repeat] Fearless: From Monolith to Serverless with Dynatrace (Amazon Web Services)
Dynatrace is a monitoring platform that can help companies migrate from monolithic architectures to microservices and serverless architectures. It uses AI to automatically map dependencies, detect where to split up monoliths, validate performance and scalability at each step, and provide automated root cause analysis. Dynatrace monitoring and APIs help optimize architectures, automate deployments, and enable self-healing throughout the migration process.
Extracting Insights from Data at Twitter (Prasad Wagle)
Prasad Wagle's talk discussed how Twitter extracts insights from its large volumes of data. Twitter collects hundreds of millions of tweets and interactions per day from over 300 million monthly active users, creating big data challenges around velocity, volume, and variety. Twitter stores this data in hundreds of petabytes across large Hadoop clusters and processes it using batch tools like Hadoop and Spark as well as real-time tools like Heron. Insights are generated through basic analytics like user counts, A/B testing of new features, and custom data science work including machine learning models for recommendations, content filtering, and ad targeting. Systems, programming, and statistical skills are needed to effectively extract value from Twitter's big data.
Transform Fearlessly to Serverless with Dynatrace - DEM04 - Toronto AWS Summit (Amazon Web Services)
When breaking your monolith into components, services, or even functions, you must understand where and how to break your existing code base and architecture into smaller units so that it scales, performs, and is easy enough to operate. This session shows how Dynatrace redefined its architecture, which migration capabilities Dynatrace engineers built into their product, and how the lessons learned can help all of us transform fearlessly from monolith to serverless.
If you want to break your monolith into components, services, or even functions, it is important to understand where and how to break your existing code base and architecture into smaller units to allow it to scale and perform, and to make it easy to operate. In this session, a representative from Dynatrace shows how the company redefined its architecture, explains which migration capabilities its engineers built into its product, and describes how the lessons learned can benefit everyone as they fearlessly transform from monolith to serverless.
This document discusses several hidden features in CloudWatch for debugging serverless applications, including Logs Insights for powerful log querying, Metrics Insights for SQL queries on metrics, and X-Ray for distributed tracing. It also warns that CloudWatch can be expensive for logging and recommends only logging errors or limiting retention. Third-party services are suggested for specialized serverless observability needs.
Transform Fearlessly to Serverless with Dynatrace 2 - DEM07 - Toronto AWS SummitAmazon Web Services
When breaking your monolith into components, services or even functions you must understand WHERE and HOW you break your existing code base and architecture into smaller units to allow it to SCALE, PERFORM and make it EASY enough to operate! This session shows how Dynatrace redefined their architecture; which migration capabilities Dynatrace engineers built into their product; and how the lessons learned can benefit all of us to transform Fearless from Monolith to Serverless!
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Data Con LA
The document discusses building a system for processing machine and event-oriented data in real-time. It describes the high-level architecture which involves data acquisition, processing, storage and querying. Events are modeled and transformed through stream processing jobs. Metrics and time series data are aggregated. Challenges include dealing with distributed systems issues, data quality, and immaturity of stream processing technologies.
Scaling up uber's real time data analyticsXiang Fu
Realtime infrastructure powers critical pieces of Uber. This talk will discuss the architecture, technical challenges, learnings and how a blend of open source infrastructure (Apache Kafka/Flink/Pinot) and in-house technologies have helped Uber scale and enabled SQL to power realtime decision making for city ops, data scientists, data analysts and engineers.
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Khai Tran
This document discusses LinkedIn's transition from an offline metrics platform to a near real-time "nearline" architecture using Apache Calcite and Apache Samza. It overviews LinkedIn's metrics platform and needs, and then details how the new nearline architecture works by translating Pig jobs into optimized Samza jobs using Calcite's relational algebra and query planning. An example production use case for analyzing storylines on the LinkedIn platform is also presented. The nearline architecture allows metrics to be computed with latencies of 5-30 minutes rather than 3-6 hours previously.
Understanding time in structured streamingdatamantra
This document discusses time abstractions in structured streaming. It introduces process time, event time, and ingestion time. It explains how to use the window API to apply windows over these different time abstractions. It also discusses handling late events using watermarks and implementing non-time based windows using custom state management and sessionization.
Reducing Latency and Increasing Performance while Cutting Infrastructure CostsAmazon Web Services
Discussion on Datadog’s experiences, both successes and challenges, as they built our monitoring solutions on top AWS Lambda and Amazon API gateway with the goal of reducing latency and increasing performance while cutting infrastructure costs.
Serverless Event Streaming with Pulsar FunctionsStreamNative
The last few years have seen the emergence of Serverless as a paradigm for event streaming. Its very simple programming model has attracted developers in droves. At the same time, its ability to elastically scale has simplified operations significantly. Combined together with the ubiquity of their presence across all cloud providers, serverless today has become the leading choice to do event processing at scale for a lot of companies.
In this talk, Sijie Guo from StreamNative will explore how the serverless paradigm is applied to event streaming in Apache Pulsar, a next-generation event streaming system. Pulsar provides native support for serverless functions where the events are processed as soon as they arrive in a streaming manner and that provides flexible deployment options (thread, process, container). He will describe how these serverless functions make data engineering easier and share the real world usage of Pulsar Functions.
Who: Karthik Ramasamy (@karthikz)
Date: September 20, 2016
Event: #TwitterRealTime
This slide deck consists of presentations from various teams about Twitter's real time infrastructure, the components it uses, and how they function. It includes presentations from David Rusek (@davidrusek), Maosong Fu (@Louis_Fumaosong), Sandy Strong (@st5are), and Yimin Tan (@YiminTan_Kevin).
Datadog is a cloud-based monitoring solution that collects metrics from applications, servers, tools and services to provide visibility. It aggregates data across an organization's full technology stack in one place. Datadog allows users to build dashboards to monitor key metrics, receive alerts for critical issues, and gain insights through log collection and analysis. It supports monitoring of containers, Kubernetes, databases, microservices and other modern applications and infrastructure components through its agents. Datadog is used by many companies to gain operational visibility through its features for infrastructure monitoring, APM, logs, and more.
After publishing my first terraform module I shared some details of how I used AWS services and leveraged serverless concepts and provided a demo to the Melbourne AWS User Group meetup https://www.meetup.com/aws-aus/events/291459709/
Grafana/Graphite provides lightweight application performance monitoring of internal services and metrics. Rigor offers synthetic performance monitoring of selected pages from different locations. SOASTA mPulse enables real user monitoring to collect performance data from real browsers. SOASTA Data Science Workbench allows complex analysis of performance data through notebooks to answer questions about page groups, user paths, and conversion impact. Tools should be used together to gain internal and external views of performance from synthetic and real user perspectives.
8. Types of Monitoring
● Customers -- the good ones who send an email or chat to support and tell them:
“Dear XYZ, your application's foobar module is not working. Please check.”
9. Challenges in Distributed Systems Monitoring?
● Hundreds of machines.
● Hundreds of thousands of metrics every second.
● Which metrics should we monitor?
● Which metrics should trigger alerts?
● How often should alerts fire?
10. Challenges in Distributed Systems Monitoring?
● Storing useful metrics
● Real-time metrics
● Monitoring cost
● Informative dashboards
11. What is Riemann?
In one phrase: “Riemann is an event aggregator.”
● A monitoring tool that aggregates events from servers and applications.
● Riemann uses a powerful stream-processing language, written in Clojure, to
aggregate events.
12. Why Riemann?
● Written in Clojure.
● Low-latency event-processing and monitoring engine.
● Streams are Clojure functions, which makes Riemann highly adaptable.
● Riemann configuration file is a Clojure Program.
13. Why Riemann?
● Monitoring as code.
● Can monitor anything.
● Comes with its own instrumentation -- measures its own performance.
● Can send alerts via email, chat, SMS, and many more.
● Can connect to back-end time-series databases such as InfluxDB and Graphite
to store metrics for historical data.
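As a sketch of that last point, a Riemann config can forward every event's metric to Graphite with the built-in graphite stream. The hostname and port below are placeholders for illustration:

```clojure
; Forward metrics to Graphite for historical storage.
; graphite.example.com:2003 is a placeholder -- point it at your
; own Graphite carbon listener.
(def graph (graphite {:host "graphite.example.com" :port 2003}))

(streams
  ; Every event that reaches this stream is written to Graphite.
  graph)
```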
15. What Does an Event Look Like? A Clojure map
(immutable, of course)
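For illustration, a typical event might look like this. All field values here are made up, and every field is optional:

```clojure
; A Riemann event is just an immutable Clojure map.
{:host    "web-01"           ; where the event came from
 :service "api latency"      ; what is being measured
 :state   "ok"               ; any string, e.g. "ok", "warning", "critical"
 :metric  42.4               ; the measured value
 :ttl     60                 ; seconds the event is considered valid
 :tags    ["prod" "api"]
 :time    1551346800         ; Unix timestamp
 :build   "7a3f9c2"}         ; custom fields are allowed too
```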
16. Riemann Events
● Events in Riemann are the base construct.
● Riemann receives events and processes them.
● Event fields are referred to by keywords in the config, like :host, :service, :tags.
● Apart from the standard fields, custom fields can also be sent in the event.
18. Riemann Streams
● Streams are Clojure functions that we can define.
● Streams are defined in stream section of the Riemann config file.
● Streams can have child streams.
● Events get passed to the streams for aggregation, modification and alerting.
● A Riemann config can have as many streams as you like.
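A minimal sketch of a config with a parent stream and a child stream; the host pattern and service name here are illustrative:

```clojure
(let [index (index)]        ; the index doubles as a stream
  (streams
    ; Parent stream: index every event.
    index
    ; Child stream: only events from web hosts flow in here,
    ; get relabelled, and are indexed under a new service name.
    (where (host #"^web-")
      (with :service "web aggregate"
        index))))
```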
19. Riemann Indexes
● A table of the current state of all services tracked by Riemann.
● Each event is uniquely indexed by its host and service. The index just keeps
track of the most recent event for a given (host, service) pair.
● Index entries can have a TTL (time to live).
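A sketch of wiring up the index with a default TTL and reacting to expirations; the 5-second scan interval and 60-second TTL are arbitrary choices:

```clojure
; Scan the index for expired events every 5 seconds.
(periodically-expire 5)

(let [index (index)]
  (streams
    ; Events without a TTL of their own get 60 seconds.
    (default :ttl 60
      index)
    ; When an indexed event's TTL lapses, Riemann reinjects it
    ; with :state "expired"; the expired stream catches those.
    (expired
      #(prn "expired:" (:host %) (:service %)))))
```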
The event is the base construct of Riemann. Events flow into Riemann and can be processed, counted, collected, manipulated, or exported to other systems. A Riemann event is a struct that Riemann treats as an immutable map. Inside our Riemann configuration, we'll generally refer to an event field using keywords. Remember that keywords are often used to identify the key in a key/value pair in a map, and that our event is an immutable map. We identify keywords by their leading colon: the host field, for example, is referenced as :host. A Riemann event can also be supplemented with optional custom fields. You can configure additional fields when you create the event, or you can add fields to the event as it is being processed — for example, a field containing a summary or a derived metric.
Each arriving event is added to one or more streams. You define streams in the (streams ...) section of your Riemann configuration. Streams are functions you can pass events to for aggregation, modification, or escalation. Streams can also have child streams that they can pass events to. This allows for filtering or partitioning of the event stream, such as by only selecting events from specific hosts or services. You can think of streams like plumbing in the real world. Events enter the plumbing system, flow through pipes and tunnels, collect in tanks and dams, and are filtered by grates and drains.
You can have as many streams as you like and Riemann provides a powerful stream processing language that allows you to select the events relevant to a specific stream. For example, you could select events from a specific host or service that meets some other criteria.
Like your plumbing, though, streams are designed for events to flow through them and for limited or no state to be retained. For many purposes, however, we do need to retain some state. To manage this state Riemann has the index.
Riemann indexes are a sort of copy of the most recent event for each host and service pair; the index is effectively a cache of system state. When an indexed event's TTL expires, Riemann deletes it from the index and sends a new event with state "expired" back into the streams, so you can alert on services that have stopped reporting.
Where takes a predicate, which is a special expression for matching events. After the predicate, where takes any number of child streams, each of which will receive events which the predicate matched. For example, we could email only events which have state "error".
The where stream provides some syntactic sugar to allow you to access your event fields. In a where stream you can refer to "standard" fields like host, service, description, metric, and ttl by name. If you need to refer to another field you need to reference the full field name, (:field_name event).
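For example, mailing only error events might look like this; the email addresses and the :build value are placeholders:

```clojure
; mailer and email are built-in; the addresses are placeholders.
(def email (mailer {:from "riemann@example.com"}))

(streams
  ; Standard fields can be referenced by bare name...
  (where (state "error")
    (email "ops@example.com"))
  ; ...custom fields need the full keyword form.
  (where (= (:build event) "7a3f9c2")
    prn))
```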
Rollup will allow a few events to pass through readily. Then it starts to accumulate events, rolling them up into a list which is submitted at the end of a given time interval.
Let's define a new stream for alerting the operations team, which sends only five emails per hour (3600 seconds). We'll receive the first four events immediately--and at the end of the hour, a single email with a summary of all the rest.
Rollup is a memory hog, since it keeps all rolled-up events in memory until the interval elapses. Where a summary is not required, prefer throttle, which passes five events and simply drops the rest.
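Side by side, the two look like this, assuming an email stream (a mailer) is already defined and the addresses are placeholders:

```clojure
(def email (mailer {:from "riemann@example.com"}))

(streams
  (where (state "error")
    ; rollup: pass 5 events per hour immediately; queue the rest
    ; in memory and deliver them as one summary at hour's end.
    (rollup 5 3600
      (email "ops@example.com"))
    ; throttle: pass 5 events per hour and drop the rest --
    ; constant memory, but no summary of what was dropped.
    (throttle 5 3600
      (email "ops@example.com"))))
```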
The coalesce stream remembers the latest events from each host and service, and sends them all as a vector to its children. We can map that vector of events to a single event--the one with the largest metric--using folds/maximum. Then we just set the service and host, since this event pertains to the system as a whole.
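A sketch of that pattern, reducing the cluster to its single largest metric; the summary service name is illustrative:

```clojure
(let [index (index)]
  (streams
    ; coalesce emits a vector of the latest event per host/service.
    (coalesce
      ; Fold the vector down to the event with the largest metric,
      ; then relabel it as a cluster-wide summary.
      (smap folds/maximum
        (with {:host nil :service "cluster max metric"}
          index)))))
```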
● moving-time-window forwards the last n seconds of events.
● moving-event-window forwards the last n events.
● fixed-time-window forwards events from disjoint n-second windows.
● fixed-event-window forwards disjoint sequences of n events.
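For instance, a sliding mean over the last ten events and a per-minute count over disjoint windows; the service names are illustrative:

```clojure
(streams
  ; Sliding window: on each new event, emit the mean of the
  ; last 10 events.
  (moving-event-window 10
    (smap folds/mean
      (with :service "api latency mean" prn)))
  ; Tumbling window: once per 60-second window, emit how many
  ; events arrived in it.
  (fixed-time-window 60
    (smap folds/count
      (with :service "events per minute" prn))))
```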