How Orange Financial combat financial frauds over 50M transactions a day usin... (StreamNative)
Learn how Orange Financial combats financial fraud across more than 50M transactions a day using Apache Pulsar. This presentation was shared at the Strata Data Conference in New York, US, in September 2019.
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa... (StreamNative)
This talk describes Klaviyo’s internal messaging system, an asynchronous application framework built around Pulsar that provides a set of high-quality tools for building business-critical asynchronous data flows in unreliable environments. The framework includes: a Pulsar ORM and schema migrator for topic configuration; a retry/replay system; a versioned schema registry; a consumer framework oriented around preventing message loss in hostile environments while maximizing observability; an experimental “online schema change” for topics; and more. Development of this system was informed by lessons learned during heavy use of datastores like RabbitMQ and Kafka, and frameworks like Celery, Spark, and Flink. In addition to the capabilities of this system, the talk also covers (sometimes painful) lessons learned while migrating a heterogeneous async-computing environment onto Pulsar and a unified model.
This document discusses enterprise integration patterns. It covers common integration styles and building blocks such as endpoints, channels, and messages, and describes the main message exchange patterns and styles. It explains popular messaging protocols such as AMQP and STOMP. Finally, it discusses enterprise message brokers and frameworks that implement integration patterns.
MoP (MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi... (StreamNative)
MQTT (Message Queuing Telemetry Transport) is a messaging protocol based on the pub/sub model. Its compact message structure, low resource consumption, and high efficiency make it suitable for IoT applications running on low-bandwidth, unstable networks.
This session introduces MQTT on Pulsar (MoP), which allows developers who use the MQTT transport protocol to work with Apache Pulsar. I will share the architecture, principles, and future plans of MoP to help you understand Apache Pulsar's capabilities and practices in the IoT industry.
Hadoop Workshop using Cloudera on Amazon EC2 (IMC Institute)
This document provides instructions for a hands-on workshop on installing and using Hadoop and Cloudera on Amazon EC2. It outlines the steps to launch an EC2 virtual server instance, install Cloudera Manager and Cloudera Express Edition, import and export data from HDFS, write MapReduce programs in Eclipse, and use various Hadoop tools like HDFS and Hue. The workshop is led by Dr. Thanachart Numnonda and aims to teach participants how to set up their own Hadoop cluster on EC2 and start using Hadoop for big data tasks.
Apache Pulsar and Apache Flink share a similar view of how the data and computation layers of an application can be “streaming-first”, with batch as a special case of streaming. With Apache Pulsar’s Segmented-Stream storage and Apache Flink’s steps to unify batch and stream processing workloads under one framework, there are numerous ways of integrating the two technologies to provide elastic data processing at massive scale and build a real streaming warehouse.
In this talk, Sijie Guo from the Apache Pulsar community will give an overview of Apache Pulsar and how it provides a unified data view that fully leverages Apache Flink's unified computation runtime for elastic data processing. He will share the latest integrations between Apache Pulsar and Apache Flink, especially around effectively-once processing and schema integration.
Managing transactions on Ethereum with Apache Airflow (Michael Ghen)
Apache Airflow is a Python-based workflow management system that can be used to actively monitor and execute transactions on blockchain networks like Ethereum. This presentation is an introduction to Apache Airflow followed by a demonstration of a production deployment. Apache Airflow is an excellent tool for anyone already familiar with Python. Its ability to process jobs and handle errors makes it a good choice for managing activity on blockchain networks. The goal of this talk is to demonstrate how Apache Airflow can be used for environmental scanning and for batch processing of transactions. The demonstration covers using Airflow and Python to monitor and execute ERC20 token transactions on the Ethereum blockchain.
Presto is a distributed SQL query engine that allows users to run SQL queries against various data sources. It consists of three main components: a coordinator, workers, and clients. The coordinator manages query execution by generating execution plans, coordinating workers, and returning final results to the client. Workers contain execution engines that process individual tasks and fragments of a query plan. The system uses a dynamic query scheduler to distribute tasks across workers based on data and node locality.