Updated version for SD Berlin 2016. Distributed streaming performance, consistency, reliable delivery, durability, optimisations, event time processing and other concepts discussed and explained on Akka Persistence and other examples.
Data in Motion: Streaming Static Data EfficientlyMartin Zapletal
Distributed streaming performance, consistency, reliable delivery, durability, optimisations, event time processing and other concepts discussed and explained on Akka Persistence and other examples.
Cassandra v3.0 at Rakuten meet-up on 12/2/2015datastaxjp
Cassandra v3.0 sessions at Cassandra Meet-up at Rakuten Tokyo, Fall 2015. New Functionality (support of JSON, new storage engine, Mview, UDF, UDA etc..)
Investigation of Transactions in Cassandradatastaxjp
Cassandra can be used for more than just big data applications, including global applications and transactions. The presentation discusses how to achieve consistency, atomicity, and isolation in Cassandra transactions. Consistency can be tuned on a per-transaction basis. Atomicity can be achieved using Cassandra BATCH statements. Isolation can be handled using lightweight transactions with IF conditions. An example demonstrates tracking bottlecap balances with transactions in a single table.
Apache Spark - Key Value RDD - Transformations | Big Data Hadoop Spark Tutori...CloudxLab
The document provides information about key-value RDD transformations and actions in Spark. It defines transformations like keys(), values(), groupByKey(), combineByKey(), sortByKey(), subtractByKey(), join(), leftOuterJoin(), rightOuterJoin(), and cogroup(). It also defines actions like countByKey() and lookup() that can be performed on pair RDDs. Examples are given showing how to use these transformations and actions to manipulate key-value RDDs.
This document provides an overview of advanced Hibernate concepts including:
- The persistence lifecycle and context
- Object equality, flush modes, and the persistence manager
- Transactions, contextual sessions, concurrency, caching
- Bulk operations, filters, interceptors, the event system
- Fetching plans and strategies for loading related objects
This document provides an overview of advanced Hibernate concepts including:
- The persistence lifecycle of objects in Hibernate and the four states objects can be in: transient, persistent, detached, and removed.
- The persistence context and how it acts as a cache and guarantees object identity within a session scope.
- Transaction management in Hibernate including demarcating transaction boundaries programmatically or declaratively.
- Additional topics covered include the persistence manager, concurrency control, caching, and batch processing.
Apache Spark - Basics of RDD & RDD Operations | Big Data Hadoop Spark Tutoria...CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2JgbT3E
This CloudxLab Basics of RDD & RDD Operations tutorial helps you to understand basics of RDD and RDD Operations in detail. Below are the topics covered in this tutorial:
1) Pick Random Samples From a Dataset using Spark
2) Spark Transformations - mapPartitions() & sortBy()
3) Spark Pseudo set operations - distinct(), union(), subtract(), intersection() & cartesian()
4) Spark Actions - fold(), aggregate(), countByValue(), top(), takeOrdered(), foreach() & foreachPartition()
Data in Motion: Streaming Static Data EfficientlyMartin Zapletal
Distributed streaming performance, consistency, reliable delivery, durability, optimisations, event time processing and other concepts discussed and explained on Akka Persistence and other examples.
Cassandra v3.0 at Rakuten meet-up on 12/2/2015datastaxjp
Cassandra v3.0 sessions at Cassandra Meet-up at Rakuten Tokyo, Fall 2015. New Functionality (support of JSON, new storage engine, Mview, UDF, UDA etc..)
Investigation of Transactions in Cassandradatastaxjp
Cassandra can be used for more than just big data applications, including global applications and transactions. The presentation discusses how to achieve consistency, atomicity, and isolation in Cassandra transactions. Consistency can be tuned on a per-transaction basis. Atomicity can be achieved using Cassandra BATCH statements. Isolation can be handled using lightweight transactions with IF conditions. An example demonstrates tracking bottlecap balances with transactions in a single table.
Apache Spark - Key Value RDD - Transformations | Big Data Hadoop Spark Tutori...CloudxLab
The document provides information about key-value RDD transformations and actions in Spark. It defines transformations like keys(), values(), groupByKey(), combineByKey(), sortByKey(), subtractByKey(), join(), leftOuterJoin(), rightOuterJoin(), and cogroup(). It also defines actions like countByKey() and lookup() that can be performed on pair RDDs. Examples are given showing how to use these transformations and actions to manipulate key-value RDDs.
This document provides an overview of advanced Hibernate concepts including:
- The persistence lifecycle and context
- Object equality, flush modes, and the persistence manager
- Transactions, contextual sessions, concurrency, caching
- Bulk operations, filters, interceptors, the event system
- Fetching plans and strategies for loading related objects
This document provides an overview of advanced Hibernate concepts including:
- The persistence lifecycle of objects in Hibernate and the four states objects can be in: transient, persistent, detached, and removed.
- The persistence context and how it acts as a cache and guarantees object identity within a session scope.
- Transaction management in Hibernate including demarcating transaction boundaries programmatically or declaratively.
- Additional topics covered include the persistence manager, concurrency control, caching, and batch processing.
Apache Spark - Basics of RDD & RDD Operations | Big Data Hadoop Spark Tutoria...CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2JgbT3E
This CloudxLab Basics of RDD & RDD Operations tutorial helps you to understand basics of RDD and RDD Operations in detail. Below are the topics covered in this tutorial:
1) Pick Random Samples From a Dataset using Spark
2) Spark Transformations - mapPartitions() & sortBy()
3) Spark Pseudo set operations - distinct(), union(), subtract(), intersection() & cartesian()
4) Spark Actions - fold(), aggregate(), countByValue(), top(), takeOrdered(), foreach() & foreachPartition()
This document provides an overview of Rxjs (Reactive Extensions for JavaScript). It begins by explaining why Rxjs is useful for dealing with asynchronous code in a synchronous way and provides one paradigm for asynchronous operations. It then discusses the history of callbacks and promises for asynchronous code. The bulk of the document explains key concepts in Rxjs including Observables, Operators, error handling, testing with Schedulers, and compares Promises to Rxjs. It provides examples of many common Rxjs operators and patterns.
This document provides an introduction to RxJS, a library for reactive programming using Observables. It discusses some of the key concepts in RxJS including Observables, operators, and recipes. Observables can be created from events, promises, arrays and other sources. Operators allow chaining transformations on Observables like map, filter, switchMap. RxJS is useful for complex asynchronous flows and is easier to test than promises using marble testing. Overall, RxJS enables rich composition of asynchronous code and cancellation of Observables.
This document summarizes a test for an ExchangeRateUploader class that uploads exchange rates to a database. The test mocks several dependencies, sets up test data including currencies and exchange rates, and verifies the uploader saves the rates and finalizes the thread.
The Ring programming language version 1.4 book - Part 17 of 30Mahmoud Samir Fayed
The document provides examples of using Qt widgets and classes in Ring. It demonstrates how to:
1. Create a main window with a menu bar containing File and Exit options.
2. Add a status bar to the window and set properties like window title, stylesheets, and mouse tracking.
3. Use QLineEdit to get user input, and QMessageBox to display messages.
4. Respond to QLineEdit events like text changed and return pressed.
5. Use other widgets like QProgressBar and respond to their signals and events.
6. Draw on widgets using QPainter, and print to PDF using QPrinter.
7. Play audio files with QMediaPlayer and get colors from the user with
Rxjs provides a paradigm for dealing with asynchronous operations in a way that resembles synchronous code. It uses Observables to represent asynchronous data streams over time that can be composed using operators. This allows handling of events, asynchronous code, and other reactive sources in a declarative way. Key points are:
- Observables represent asynchronous data streams that can be subscribed to.
- Operators allow manipulating and transforming streams through methods like map, filter, switchMap.
- Schedulers allow controlling virtual time for testing asynchronous behavior.
- Promises represent single values while Observables represent continuous streams, making Observables more powerful for reactive programming.
- Cascading asynchronous calls can be modeled elegantly using switch
Unit Test Candidate Solutions discusses various techniques for unit testing including:
- Using mock environments to isolate tests and focus on single responsibilities
- Preparing test data, files, and mock objects to provide context for developers
- Verifying XML output using techniques like XPath, XMLUnit, and REXML
- Mocking domain logic using libraries like EasyMock, jMock, and Mockito
- Employing BDD specifications styles for more readable test code
When it comes to writing tests we often live in the here-and-now and consequently end up producing "write-only" tests. This session looks at what we need to consider if we want to create tests that our future selves and teammates will find valuable instead of becoming another burden on top of delivering the feature itself.
If there is one place that we find it easy to take shortcuts it's when writing tests. Whether we're under the cosh or have an overly-optimistic view of our ability to write self-documenting code, instead of creating tests that support the production code and development process we can find ourselves producing WTFs (Weak Test Functions). The net effect is often a viscous cycle that disparages, instead of encourages us.
In the past I've tried many different ways to try and short-circuit the test writing process, but have only come-up short every time. This session takes a look at why skimping on elements of the test structure, such as organisation, naming and scope only leads to pain and hardship in the long run. Along the way we'll uncover the truth behind common folklore, such as only having one assertion per test.
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Iterative Spark Developmen...Data Con LA
This presentation will explore how Bloomberg uses Spark, with its formidable computational model for distributed, high-performance analytics, to take this process to the next level, and look into one of the innovative practices the team is currently developing to increase efficiency: the introduction of a logical signature for datasets.
The Ring programming language version 1.5.2 book - Part 67 of 181Mahmoud Samir Fayed
This document describes how to create reports using the Ring WebLib and GUILib. It provides an example that loads the necessary libraries, creates a Qt application, and defines a report controller class. The class initializes a report view and calls a CreateReport method. This method generates an HTML report using the HtmlPage class, adding headers, tables, and sample customer data. It styles the tables and loops to add multiple rows of sample data. This allows quickly generating reports by combining Ring's Web and GUI libraries.
Using Change Streams to Keep Up with Your DataMongoDB
Speaker: Aly Cabral
Real-time feedback is an essential part of modern application development where developers want to sync across platforms, systems, and users to provide better end-user experiences. In MongoDB 3.6, change streams will empower developers to easily leverage the power of MongoDB's internal real-time functionality to react to relevant data changes immediately. This session introduces change streams and walks you through developing against them. We dive into use cases and explore how to make good architectural decisions around this new functionality.
React table tutorial project setup, use table, and usefilterKaty Slemon
Learn about React Table v7 and its new features in this React Table tutorial. Moreover, explore how can you implement useTable and useFilter hooks in your app.
The document discusses several topics related to programming including delegates, using statements, LINQ queries, parsing text files, and preventing SQL injection attacks. Key points covered include how to use delegates to handle variable processing logic, the using statement to ensure objects are disposed properly, comparing text in files using LINQ queries, and protecting against SQL injection by parameterizing queries.
ITB2019 10 in 50: Ten Coldbox Modules You Should be Using in Every App - Jon ...Ortus Solutions, Corp
In this 50-minute session, we'll take a fast-paced look at 10 Coldbox modules you owe it to yourself to be using in every application you develop. These modules run the gamut from security and authenticatiom to data serialization, but they all have one thing in common: they will save you hours of repetitive coding and make your life easier!
By now, you’ve probably heard of Akka, the JVM toolkit for building scalable, resilient and resource efficient applications in Java or Scala. With over 12 open-source and commercial modules in the toolkit, Akka takes developers from actors on a single JVM, all the way out to network partition healing and clusters of servers distributed across fleets of JVMs.
In this technical webinar, O’Reilly author and Lightbend Developer Advocate, Hugh McKee, takes you on a deep dive into advanced Akka features for enhanced application resilience, elasticity, and responsiveness.
Building on real-world examples from his past experience as an Akka developer, Hugh will show how you can take advantage of:
• Akka Multi-Data Center Clustering improves reliability across multiple data centers, availability zones or regions.
• Logically grouping related nodes in large clusters to improve stability and scalability.
• Akka Multi-Data Center Persistence for increased fault tolerance, locate data near the user, and balance the load across data centers.
• Akka Cluster Sharding in a multi data center environment, to optimize shard locality within a single data center and go Active-Active for improved resilience and availability.
The document discusses SOLID principles of object-oriented design. It provides examples of code that demonstrate poor adherence to SOLID and ways the code can be refactored to better follow SOLID. Specifically, it shows how to apply the single responsibility principle, open/closed principle, Liskov substitution principle, interface segregation principle and dependency inversion principle to structure code for flexibility, reusability and maintainability.
1. Rxjs provides a better way of handling asynchronous code through observables which are streams of values over time. Observables allow for cancellable, retryable operations and easy composition of different asynchronous sources.
2. Common Rxjs operators like map, filter, and flatMap allow transforming and combining observable streams. Operators make observables quite powerful for tasks like async logic, event handling, and API requests.
3. In Angular, observables are used extensively for tasks like HTTP requests, routing, and component communication. Key aspects are using async pipes for subscriptions and unsubscribing during lifecycle hooks. Rxjs greatly simplifies many common asynchronous patterns in Angular applications.
RxJs - demystified provides an overview of reactive programming and RxJs. The key points covered are:
- Reactive programming focuses on propagating changes without explicitly specifying how propagation happens.
- Observables are at the heart of RxJs and emit values in a push-based manner. Operators allow transforming, filtering, and combining observables.
- Common operators include map, filter, reduce, buffer, and switchMap. Over 120 operators exist for tasks like error handling, multicasting, and conditional logic.
- Marble diagrams visually demonstrate how operators transform observable streams.
- Creating observables from events, promises, arrays and iterables allows wrapping different data sources in a uniform API
This document discusses scaling machine learning using Apache Spark. It covers several key topics:
1) Parallelizing machine learning algorithms and neural networks to distribute computation across clusters. This includes data, model, and parameter server parallelism.
2) Apache Spark's Resilient Distributed Datasets (RDDs) programming model which allows distributing data and computation across a cluster in a fault-tolerant manner.
3) Examples of very large neural networks trained on clusters, such as a Google face detection model using 1,000 servers and a IBM brain-inspired chip model using 262,144 CPUs.
Large volume data analysis on the Typesafe Reactive PlatformMartin Zapletal
The document discusses several topics related to distributed machine learning and distributed systems including:
- Reasons for using distributed machine learning being either due to large data volumes or hopes of increased speed
- Failure rates of hardware and network links in large data centers
- Examples of database inconsistencies and data loss caused by network partitions in different distributed databases
- Key aspects of distributed data processing including data storage, integration, computing primitives, and analytics
This document provides an overview of Rxjs (Reactive Extensions for JavaScript). It begins by explaining why Rxjs is useful for dealing with asynchronous code in a synchronous way and provides one paradigm for asynchronous operations. It then discusses the history of callbacks and promises for asynchronous code. The bulk of the document explains key concepts in Rxjs including Observables, Operators, error handling, testing with Schedulers, and compares Promises to Rxjs. It provides examples of many common Rxjs operators and patterns.
This document provides an introduction to RxJS, a library for reactive programming using Observables. It discusses some of the key concepts in RxJS including Observables, operators, and recipes. Observables can be created from events, promises, arrays and other sources. Operators allow chaining transformations on Observables like map, filter, switchMap. RxJS is useful for complex asynchronous flows and is easier to test than promises using marble testing. Overall, RxJS enables rich composition of asynchronous code and cancellation of Observables.
This document summarizes a test for an ExchangeRateUploader class that uploads exchange rates to a database. The test mocks several dependencies, sets up test data including currencies and exchange rates, and verifies the uploader saves the rates and finalizes the thread.
The Ring programming language version 1.4 book - Part 17 of 30Mahmoud Samir Fayed
The document provides examples of using Qt widgets and classes in Ring. It demonstrates how to:
1. Create a main window with a menu bar containing File and Exit options.
2. Add a status bar to the window and set properties like window title, stylesheets, and mouse tracking.
3. Use QLineEdit to get user input, and QMessageBox to display messages.
4. Respond to QLineEdit events like text changed and return pressed.
5. Use other widgets like QProgressBar and respond to their signals and events.
6. Draw on widgets using QPainter, and print to PDF using QPrinter.
7. Play audio files with QMediaPlayer and get colors from the user with
Rxjs provides a paradigm for dealing with asynchronous operations in a way that resembles synchronous code. It uses Observables to represent asynchronous data streams over time that can be composed using operators. This allows handling of events, asynchronous code, and other reactive sources in a declarative way. Key points are:
- Observables represent asynchronous data streams that can be subscribed to.
- Operators allow manipulating and transforming streams through methods like map, filter, switchMap.
- Schedulers allow controlling virtual time for testing asynchronous behavior.
- Promises represent single values while Observables represent continuous streams, making Observables more powerful for reactive programming.
- Cascading asynchronous calls can be modeled elegantly using switch
Unit Test Candidate Solutions discusses various techniques for unit testing including:
- Using mock environments to isolate tests and focus on single responsibilities
- Preparing test data, files, and mock objects to provide context for developers
- Verifying XML output using techniques like XPath, XMLUnit, and REXML
- Mocking domain logic using libraries like EasyMock, jMock, and Mockito
- Employing BDD specifications styles for more readable test code
When it comes to writing tests we often live in the here-and-now and consequently end up producing "write-only" tests. This session looks at what we need to consider if we want to create tests that our future selves and teammates will find valuable instead of becoming another burden on top of delivering the feature itself.
If there is one place that we find it easy to take shortcuts it's when writing tests. Whether we're under the cosh or have an overly-optimistic view of our ability to write self-documenting code, instead of creating tests that support the production code and development process we can find ourselves producing WTFs (Weak Test Functions). The net effect is often a viscous cycle that disparages, instead of encourages us.
In the past I've tried many different ways to try and short-circuit the test writing process, but have only come-up short every time. This session takes a look at why skimping on elements of the test structure, such as organisation, naming and scope only leads to pain and hardship in the long run. Along the way we'll uncover the truth behind common folklore, such as only having one assertion per test.
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Iterative Spark Developmen...Data Con LA
This presentation will explore how Bloomberg uses Spark, with its formidable computational model for distributed, high-performance analytics, to take this process to the next level, and look into one of the innovative practices the team is currently developing to increase efficiency: the introduction of a logical signature for datasets.
The Ring programming language version 1.5.2 book - Part 67 of 181Mahmoud Samir Fayed
This document describes how to create reports using the Ring WebLib and GUILib. It provides an example that loads the necessary libraries, creates a Qt application, and defines a report controller class. The class initializes a report view and calls a CreateReport method. This method generates an HTML report using the HtmlPage class, adding headers, tables, and sample customer data. It styles the tables and loops to add multiple rows of sample data. This allows quickly generating reports by combining Ring's Web and GUI libraries.
Using Change Streams to Keep Up with Your DataMongoDB
Speaker: Aly Cabral
Real-time feedback is an essential part of modern application development where developers want to sync across platforms, systems, and users to provide better end-user experiences. In MongoDB 3.6, change streams will empower developers to easily leverage the power of MongoDB's internal real-time functionality to react to relevant data changes immediately. This session introduces change streams and walks you through developing against them. We dive into use cases and explore how to make good architectural decisions around this new functionality.
React table tutorial project setup, use table, and usefilterKaty Slemon
Learn about React Table v7 and its new features in this React Table tutorial. Moreover, explore how can you implement useTable and useFilter hooks in your app.
The document discusses several topics related to programming including delegates, using statements, LINQ queries, parsing text files, and preventing SQL injection attacks. Key points covered include how to use delegates to handle variable processing logic, the using statement to ensure objects are disposed properly, comparing text in files using LINQ queries, and protecting against SQL injection by parameterizing queries.
ITB2019 10 in 50: Ten Coldbox Modules You Should be Using in Every App - Jon ...Ortus Solutions, Corp
In this 50-minute session, we'll take a fast-paced look at 10 Coldbox modules you owe it to yourself to be using in every application you develop. These modules run the gamut from security and authenticatiom to data serialization, but they all have one thing in common: they will save you hours of repetitive coding and make your life easier!
By now, you’ve probably heard of Akka, the JVM toolkit for building scalable, resilient and resource efficient applications in Java or Scala. With over 12 open-source and commercial modules in the toolkit, Akka takes developers from actors on a single JVM, all the way out to network partition healing and clusters of servers distributed across fleets of JVMs.
In this technical webinar, O’Reilly author and Lightbend Developer Advocate, Hugh McKee, takes you on a deep dive into advanced Akka features for enhanced application resilience, elasticity, and responsiveness.
Building on real-world examples from his past experience as an Akka developer, Hugh will show how you can take advantage of:
• Akka Multi-Data Center Clustering improves reliability across multiple data centers, availability zones or regions.
• Logically grouping related nodes in large clusters to improve stability and scalability.
• Akka Multi-Data Center Persistence for increased fault tolerance, locate data near the user, and balance the load across data centers.
• Akka Cluster Sharding in a multi data center environment, to optimize shard locality within a single data center and go Active-Active for improved resilience and availability.
The document discusses SOLID principles of object-oriented design. It provides examples of code that demonstrate poor adherence to SOLID and ways the code can be refactored to better follow SOLID. Specifically, it shows how to apply the single responsibility principle, open/closed principle, Liskov substitution principle, interface segregation principle and dependency inversion principle to structure code for flexibility, reusability and maintainability.
1. Rxjs provides a better way of handling asynchronous code through observables which are streams of values over time. Observables allow for cancellable, retryable operations and easy composition of different asynchronous sources.
2. Common Rxjs operators like map, filter, and flatMap allow transforming and combining observable streams. Operators make observables quite powerful for tasks like async logic, event handling, and API requests.
3. In Angular, observables are used extensively for tasks like HTTP requests, routing, and component communication. Key aspects are using async pipes for subscriptions and unsubscribing during lifecycle hooks. Rxjs greatly simplifies many common asynchronous patterns in Angular applications.
RxJs - demystified provides an overview of reactive programming and RxJs. The key points covered are:
- Reactive programming focuses on propagating changes without explicitly specifying how propagation happens.
- Observables are at the heart of RxJs and emit values in a push-based manner. Operators allow transforming, filtering, and combining observables.
- Common operators include map, filter, reduce, buffer, and switchMap. Over 120 operators exist for tasks like error handling, multicasting, and conditional logic.
- Marble diagrams visually demonstrate how operators transform observable streams.
- Creating observables from events, promises, arrays and iterables allows wrapping different data sources in a uniform API
This document discusses scaling machine learning using Apache Spark. It covers several key topics:
1) Parallelizing machine learning algorithms and neural networks to distribute computation across clusters. This includes data, model, and parameter server parallelism.
2) Apache Spark's Resilient Distributed Datasets (RDDs) programming model which allows distributing data and computation across a cluster in a fault-tolerant manner.
3) Examples of very large neural networks trained on clusters, such as a Google face detection model using 1,000 servers and a IBM brain-inspired chip model using 262,144 CPUs.
Large volume data analysis on the Typesafe Reactive PlatformMartin Zapletal
The document discusses several topics related to distributed machine learning and distributed systems including:
- Reasons for using distributed machine learning being either due to large data volumes or hopes of increased speed
- Failure rates of hardware and network links in large data centers
- Examples of database inconsistencies and data loss caused by network partitions in different distributed databases
- Key aspects of distributed data processing including data storage, integration, computing primitives, and analytics
This document provides an overview of installing and deploying Apache Spark, including:
1. Spark can be installed via prebuilt packages or by building from source.
2. Spark runs in local, standalone, YARN, or Mesos cluster modes and the SparkContext is used to connect to the cluster.
3. Jobs are deployed to the cluster using the spark-submit script which handles building jars and dependencies.
Large volume data analysis on the Typesafe Reactive Platform - Big Data Scala...Martin Zapletal
The document discusses distributed machine learning and data processing. It covers several topics including reasons for using distributed machine learning, different distributed computing architectures and primitives, distributed data stores and analytics tools like Spark, streaming architectures like Lambda and Kappa, and challenges around distributed state management and fault tolerance. It provides examples of failures in distributed databases and suggestions to choose the appropriate tools based on the use case and understand their internals.
Spark Based Distributed Deep Learning Framework For Big Data Applications Humoyun Ahmedov
Deep Learning architectures, such as deep neural networks, are currently the hottest emerging areas of data science, especially in Big Data. Deep Learning could be effectively exploited to address some major issues of Big Data, such as fast information retrieval, data classification, semantic indexing and so on. In this work, we designed and implemented a framework to train deep neural networks using Spark, fast and general data flow engine for large scale data processing, which can utilize cluster computing to train large scale deep networks. Training Deep Learning models requires extensive data and computation. Our proposed framework can accelerate the training time by distributing the model replicas, via stochastic gradient descent, among cluster nodes for data resided on HDFS.
This document provides a history and market overview of Apache Spark. It discusses the motivation for distributed data processing due to increasing data volumes, velocities and varieties. It then covers brief histories of Google File System, MapReduce, BigTable, and other technologies. Hadoop and MapReduce are explained. Apache Spark is introduced as a faster alternative to MapReduce that keeps data in memory. Competitors like Flink, Tez and Storm are also mentioned.
The document provides an overview and summary of Curator, a client library for Apache ZooKeeper. Curator aims to simplify ZooKeeper development by providing a friendlier API, handling retries, and implementing common patterns ("recipes") like leader election and locks. It consists of a client, framework, and recipes components. The framework handles connection management and retries, while recipes implement distributed primitives. Details about common recipes like locks and leader election are provided.
Cassandra as an event sourced journal for big data analytics Cassandra Summit...Martin Zapletal
The document discusses event sourcing and CQRS architectures using technologies like Akka, Cassandra, and Spark. It provides an overview of how event sourcing avoids the limitations of traditional mutable databases by using an immutable write log. It describes how CQRS separates read and write concerns for better scalability. Example architectures show how Akka persistence can store events in Cassandra and provide views of data, while Spark can perform analytics on the full event stream.
Spark's distributed programming model uses resilient distributed datasets (RDDs) and a directed acyclic graph (DAG) approach. RDDs support transformations like map, filter, and actions like collect. Transformations are lazy and form the DAG, while actions execute the DAG. RDDs support caching, partitioning, and sharing state through broadcasts and accumulators. The programming model aims to optimize the DAG through operations like predicate pushdown and partition coalescing.
codecentric AG: CQRS and Event Sourcing Applications with CassandraDataStax Academy
CQRS (Command Query Responsibility Segregation) is a pattern, which separates the process of querying and updating data. As a query only returns data without any side effects, a command is designed to change data. CQRS is often combined with Event Sourcing. This is an architecture in which all changes to an application state are stored as a sequence of events.
Because of its great capability to store time series data Cassandra is the perfect fit for implementing the event store. But there a still a lot of open questions: What about the data modeling? What techniques will be used to process and store data in the Cassandra database? How to access the current state of the application, without replaying every event? And what about failure handling?
In this talk, I will give a brief introduction to CQRS and the Event Sourcing pattern and will then answer the questions above using a real life example of a data store for customer data.
Talk on Nov. 14, 2013 SF Scala Meet up lighting talk.
Presented SPA --- Scala Persistence API (spa), a JDBC wrapper. A simple query API with out ORM or functional to relational mapping. It supports batch update, transaction and query return list or single objects
The document discusses Redux, which is a state management library. It describes the core concepts of Redux including a stream of state, a stream of actions, and a reducer function that takes the current state and an action as input and returns the next state. It provides an example implementation of a Redux store in Kotlin using RxJava streams to manage state updates from actions. The store exposes the current state as an observable and allows subscribing to streams of actions to trigger state updates via the reducer function.
This document summarizes an Akka Persistence webinar. It discusses key features of Akka Persistence like storing application state after crashes, modeling domain events, and benefits like auditing and scalability. It also provides examples of using processors, eventsourced processors, views, channels, and cluster sharding with Akka Persistence. The webinar recommends further resources for learning more about Akka Persistence.
Using Akka Persistence to build a configuration datastoreAnargyros Kiourkos
Akka is a library implementing the actor model on the JVM. It enables the development of distributed, concurrent, fault-tolerant and scalable applications. Akka persistence enables actors to persist their internal state so that it can be recovered where an actor is restarted. We demonstrate the practical application of Akka persistence to build a configuration datastore using CQRS and Event Sourcing
Lightning Talk on Software Transactional Memory in Scala. The Actors Pattern has gotten a lot of attention in the Scala ecosphere, but STM of often a good first solution for solving concurrency problems. It's odd that it hasn't had as much attention in the Scala world, so this talk aims to show how easy it is to use, and compare it to typical lock-based synchronization by showing how easy it is to make errors with lock-based synchronization
[4developers] The saga pattern v3- Robert PankowieckiPROIDEA
As you build more advanced solutions, you may find that certain interactions in your system depend on more than one bounded context. Order, Inventory, Payments, Delivery. To deliver one feature often many sub-system are involved. But you want the modules to be isolated and independent. Yet something must coordinate their work and business processes. Welcome the choreographer - the Saga Pattern a.k.a. Process Manager. In my talk I would like to: describe the Saga Pattern. show how you can simply introduce it to legacy codebase using existing gems and... ActiveRecord :) describe a few examples of Saga that we have in our systems so the audience can see many places where it fits. convince everyone that it is not so hard
Functional streams with Kafka - A comparison between Akka-streams and FS2Luis Miguel Reis
Kafka is a distributed streaming platform whose main strength is the ability to serve as the single message hub for applications of massive scale.It relies on a topic-based publish-subscribe model, where each topic may have multiple partitions to be consumed/published from/into. Kafka diverges from regular message queues in a lot of its functionalities. A significant change from other standard message queues is that each consumer has an associated offset, representing the identifier of the last consumed message from the subscribed topic. This allows for replaying of messages in cases of failures, deployment issues, and other occurrences. How can fully functional applications interact with Kafka and its features while maintaining the characteristics of the functional domain? The Scala ecosystem has many stream-based frameworks that can be leveraged to use Kafka. Two of the most used are Akka-Streams and FS2. These frameworks imply different approaches for processing data streams and dealing with Kafka’s features. The goal of this talk is to provide insight into these differences, in terms of functionality, performance, and their impact in maintaining the code, keeping it functional, robust and readable.
The document discusses event sourcing and command query responsibility segregation (CQRS) patterns at different levels of implementation. It covers the benefits of event sourcing including having a complete log of state changes, improved debugging, performance, scalability, and microservices integration. It also discusses challenges with increasing complexity, eventual consistency, and different event store and serialization options.
This document discusses the benefits of using Scala with Vaadin, including a simple programming model without JavaScript, security from all server-side code, beautiful widgets and themes, and performance advantages over other frameworks like PrimeFaces. Scala allows for a more declarative programming style, inline component constructors, delayed initialization, callbacks and higher-order functions to reduce coupling, localization with symbols, navigating nested objects, and eliminating dependencies with structural types. Hot reloading of classes is also possible with JRebel for Scala applications.
A Practical Approach to Building a Streaming Processing Pipeline for an Onlin...Databricks
Yelp’s ad platform handles millions of ad requests everyday. To generate ad metrics and analytics in real-time, they built they ad event tracking and analyzing pipeline on top of Spark Streaming. It allows Yelp to manage large number of active ad campaigns and greatly reduce over-delivery. It also enables them to share ad metrics with advertisers in a more timely fashion.
This session will start with an overview of the entire pipeline and then focus on two specific challenges in the event consolidation part of the pipeline that Yelp had to solve. The first challenge will be about joining multiple data sources together to generate a single stream of ad events that feeds into various downstream systems. That involves solving several problems that are unique to real-time applications, such as windowed processing and handling of event delays. The second challenge covered is with regards to state management across code deployments and application restarts. Throughout the session, the speakers will share best practices for the design and development of large-scale Spark Streaming pipelines for production environments.
Matteo Antony Mistretta - React, the Inglorious way - Codemotion Amsterdam 2019Codemotion
If you haven't read all of the docs, or you have learnt React just from experience, there's a chance you missed some gotchas, patterns, and features, thus preventing you to build more robust and performant applications. For example, when is it better to use Higher Order Components than Render Props? Why is a functional component slower than a class component? And why are hooks better than Recompose? In this talk we will uncover all of React's deepest secrets and newest features, how to use them, and especially why.
This document discusses the evolution of component composition in React, from mixins to higher-order components. It describes how mixins were commonly used in early React but caused problems with name clashes and complexity. React then introduced classes but removed mixin support. Developers experimented with inheritance but it did not allow for pure composability. Later, higher-order components were introduced as a way to compose logic in a non-hierarchical manner by wrapping one component inside another. The document demonstrates how to build a controlled input component using higher-order components to separate concerns into model, view and controller logic in a modular, reusable way. It also discusses debugging techniques using the Recompact library which treats components as streams of props.
Vadym Khondar is a senior software engineer with 8 years of experience, including 2.5 years at EPAM. He leads a development team that works on web and JavaScript projects. The document discusses reactive programming, including its benefits of responsiveness, resilience, and other qualities. Examples demonstrate using streams, behaviors, and other reactive concepts to write more declarative and asynchronous code.
An intro to functional programming with Kotlin.
How can we leverage Kotlin to write code in a more functional approach rather than just write java-style imperative code with addition of 'var' and 'val'?
Kamil Chmielewski, Jacek Juraszek - "Hadoop. W poszukiwaniu złotego młotka."sjabs
The document discusses Hadoop and its applications. It provides examples of companies like Facebook and their use of Hadoop. It also discusses Hadoop components like HDFS, MapReduce, Pig and HBase. It provides examples of using Hadoop with databases like MongoDB and search engines like Solr. It notes that not every problem requires large-scale solutions and discusses potential use cases for Hadoop including log analysis, indexing documents and building recommendation systems.
pragmaticrealworldscalajfokus2009-1233251076441384-2.pdfHiroshi Ono
The document discusses Scala and functional programming concepts. It provides examples of building a chat application in 30 lines of code using Lift, defining messages as case classes, and implementing a chat server and comet component. It then summarizes that Scala is a pragmatically-oriented, statically typed language that runs on the JVM and provides a unique blend of object-oriented and functional programming. Traits allow for code reuse and multiple class inheritances. Functional programming concepts like immutable data structures, higher-order functions, and for-comprehensions are discussed.
pragmaticrealworldscalajfokus2009-1233251076441384-2.pdfHiroshi Ono
The document discusses Scala and functional programming concepts. It provides examples of building a chat application in 30 lines of code using Lift, defining case classes and actors for messages. It summarizes that Scala is a pragmatically oriented, statically typed language that runs on the JVM and has a unique blend of object-oriented and functional programming. Functional programming concepts like immutable data structures, functions as first-class values, and for-comprehensions are demonstrated with examples in Scala.
pragmaticrealworldscalajfokus2009-1233251076441384-2.pdfHiroshi Ono
This document discusses Scala and its features. It provides an example of building a chat application in 30 lines of code using Lift framework. It also demonstrates pattern matching, functional data structures like lists and tuples, for comprehensions, and common Scala tools and frameworks. The document promotes Scala as a pragmatic and scalable language that blends object-oriented and functional programming. It encourages learning more about Scala.
pragmaticrealworldscalajfokus2009-1233251076441384-2.pdfHiroshi Ono
The document discusses Scala and functional programming concepts. It provides examples of building a chat application in 30 lines of code using Lift, defining messages as case classes, and implementing a chat server and comet component. It then summarizes that Scala is a pragmatically-oriented, statically typed language that runs on the JVM and provides a unique blend of object-oriented and functional programming. Traits allow for static and dynamic mixin-based composition. Functional programming concepts like immutable data structures, higher-order functions, and for-comprehensions are discussed.
Similar to Data in Motion: Streaming Static Data Efficiently 2 (20)
Nashik's top web development company, Upturn India Technologies, crafts innovative digital solutions for your success. Partner with us and achieve your goals
Building API data products on top of your real-time data infrastructureconfluent
This talk and live demonstration will examine how Confluent and Gravitee.io integrate to unlock value from streaming data through API products.
You will learn how data owners and API providers can document, secure data products on top of Confluent brokers, including schema validation, topic routing and message filtering.
You will also see how data and API consumers can discover and subscribe to products in a developer portal, as well as how they can integrate with Confluent topics through protocols like REST, Websockets, Server-sent Events and Webhooks.
Whether you want to monetize your real-time data, enable new integrations with partners, or provide self-service access to topics through various protocols, this webinar is for you!
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...The Third Creative Media
"Navigating Invideo: A Comprehensive Guide" is an essential resource for anyone looking to master Invideo, an AI-powered video creation tool. This guide provides step-by-step instructions, helpful tips, and comparisons with other AI video creators. Whether you're a beginner or an experienced video editor, you'll find valuable insights to enhance your video projects and bring your creative ideas to life.
A neural network is a machine learning program, or model, that makes decisions in a manner similar to the human brain, by using processes that mimic the way biological neurons work together to identify phenomena, weigh options and arrive at conclusions.
Strengthening Web Development with CommandBox 6: Seamless Transition and Scal...Ortus Solutions, Corp
Join us for a session exploring CommandBox 6’s smooth website transition and efficient deployment. CommandBox revolutionizes web development, simplifying tasks across Linux, Windows, and Mac platforms. Gain insights and practical tips to enhance your development workflow.
Come join us for an enlightening session where we delve into the smooth transition of current websites and the efficient deployment of new ones using CommandBox 6. CommandBox has revolutionized web development, consistently introducing user-friendly enhancements that catalyze progress in the field. During this presentation, we’ll explore CommandBox’s rich history and showcase its unmatched capabilities within the realm of ColdFusion, covering both major variations.
The journey of CommandBox has been one of continuous innovation, constantly pushing boundaries to simplify and optimize development processes. Regardless of whether you’re working on Linux, Windows, or Mac platforms, CommandBox empowers developers to streamline tasks with unparalleled ease.
In our session, we’ll illustrate the simple process of transitioning existing websites to CommandBox 6, highlighting its intuitive features and seamless integration. Moreover, we’ll unveil the potential for effortlessly deploying multiple websites, demonstrating CommandBox’s versatility and adaptability.
Join us on this journey through the evolution of web development, guided by the transformative power of CommandBox 6. Gain invaluable insights, practical tips, and firsthand experiences that will enhance your development workflow and embolden your projects.
Building the Ideal CI-CD Pipeline_ Achieving Visual PerfectionApplitools
Explore the advantages of integrating AI-powered testing into the CI/CD pipeline in this session from Applitools engineer Brandon Murray. More information and session materials at applitools.com
Discover how shift-left strategies and advanced testing in CI/CD pipelines can enhance customer satisfaction and streamline development processes, including:
• Significantly reduced time and effort needed for test creation and maintenance compared to traditional testing methods.
• Enhanced UI coverage that eliminates the necessity for manual testing, leading to quicker and more effective testing processes.
• Effortless integration with the development workflow, offering instant feedback on pull requests and facilitating swifter product releases.
Alluxio Webinar | 10x Faster Trino Queries on Your Data PlatformAlluxio, Inc.
Alluxio Webinar
June. 18, 2024
For more Alluxio Events: https://www.alluxio.io/events/
Speaker:
- Jianjian Xie (Staff Software Engineer, Alluxio)
As Trino users increasingly rely on cloud object storage for retrieving data, speed and cloud cost have become major challenges. The separation of compute and storage creates latency challenges when querying datasets; scanning data between storage and compute tiers becomes I/O bound. On the other hand, cloud API costs related to GET/LIST operations and cross-region data transfer add up quickly.
The newly introduced Trino file system cache by Alluxio aims to overcome the above challenges. In this session, Jianjian will dive into Trino data caching strategies, the latest test results, and discuss the multi-level caching architecture. This architecture makes Trino 10x faster for data lakes of any scale, from GB to EB.
What you will learn:
- Challenges relating to the speed and costs of running Trino in the cloud
- The new Trino file system cache feature overview, including the latest development status and test results
- A multi-level cache framework for maximized speed, including Trino file system cache and Alluxio distributed cache
- Real-world cases, including a large online payment firm and a top ridesharing company
- The future roadmap of Trino file system cache and Trino-Alluxio integration
Superpower Your Apache Kafka Applications Development with Complementary Open...Paul Brebner
Kafka Summit talk (Bangalore, India, May 2, 2024, https://events.bizzabo.com/573863/agenda/session/1300469 )
Many Apache Kafka use cases take advantage of Kafka’s ability to integrate multiple heterogeneous systems for stream processing and real-time machine learning scenarios. But Kafka also exists in a rich ecosystem of related but complementary stream processing technologies and tools, particularly from the open-source community. In this talk, we’ll take you on a tour of a selection of complementary tools that can make Kafka even more powerful. We’ll focus on tools for stream processing and querying, streaming machine learning, stream visibility and observation, stream meta-data, stream visualisation, stream development including testing and the use of Generative AI and LLMs, and stream performance and scalability. By the end you will have a good idea of the types of Kafka “superhero” tools that exist, which are my favourites (and what superpowers they have), and how they combine to save your Kafka applications development universe from swamploads of data stagnation monsters!
🏎️Tech Transformation: DevOps Insights from the Experts 👩💻campbellclarkson
Connect with fellow Trailblazers, learn from industry experts Glenda Thomson (Salesforce, Principal Technical Architect) and Will Dinn (Judo Bank, Salesforce Development Lead), and discover how to harness DevOps tools with Salesforce.
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid
IBM watsonx Code Assistant for Z, our latest Generative AI-assisted mainframe application modernization solution. Mainframe (IBM Z) application modernization is a topic that every mainframe client is addressing to various degrees today, driven largely from digital transformation. With generative AI comes the opportunity to reimagine the mainframe application modernization experience. Infusing generative AI will enable speed and trust, help de-risk, and lower total costs associated with heavy-lifting application modernization initiatives. This document provides an overview of the IBM watsonx Code Assistant for Z which uses the power of generative AI to make it easier for developers to selectively modernize COBOL business services while maintaining mainframe qualities of service.
What’s new in VictoriaMetrics - Q2 2024 UpdateVictoriaMetrics
These slides were presented during the virtual VictoriaMetrics User Meetup for Q2 2024.
Topics covered:
1. VictoriaMetrics development strategy
* Prioritize bug fixing over new features
* Prioritize security, usability and reliability over new features
* Provide good practices for using existing features, as many of them are overlooked or misused by users
2. New releases in Q2
3. Updates in LTS releases
Security fixes:
● SECURITY: upgrade Go builder from Go1.22.2 to Go1.22.4
● SECURITY: upgrade base docker image (Alpine)
Bugfixes:
● vmui
● vmalert
● vmagent
● vmauth
● vmbackupmanager
4. New Features
* Support SRV URLs in vmagent, vmalert, vmauth
* vmagent: aggregation and relabeling
* vmagent: Global aggregation and relabeling
* vmagent: global aggregation and relabeling
* Stream aggregation
- Add rate_sum aggregation output
- Add rate_avg aggregation output
- Reduce the number of allocated objects in heap during deduplication and aggregation up to 5 times! The change reduces the CPU usage.
* Vultr service discovery
* vmauth: backend TLS setup
5. Let's Encrypt support
All the VictoriaMetrics Enterprise components support automatic issuing of TLS certificates for public HTTPS server via Let’s Encrypt service: https://docs.victoriametrics.com/#automatic-issuing-of-tls-certificates
6. Performance optimizations
● vmagent: reduce CPU usage when sharding among remote storage systems is enabled
● vmalert: reduce CPU usage when evaluating high number of alerting and recording rules.
● vmalert: speed up retrieving rules files from object storages by skipping unchanged objects during reloading.
7. VictoriaMetrics k8s operator
● Add new status.updateStatus field to the all objects with pods. It helps to track rollout updates properly.
● Add more context to the log messages. It must greatly improve debugging process and log quality.
● Changee error handling for reconcile. Operator sends Events into kubernetes API, if any error happened during object reconcile.
See changes at https://github.com/VictoriaMetrics/operator/releases
8. Helm charts: charts/victoria-metrics-distributed
This chart sets up multiple VictoriaMetrics cluster instances on multiple Availability Zones:
● Improved reliability
● Faster read queries
● Easy maintenance
9. Other Updates
● Dashboards and alerting rules updates
● vmui interface improvements and bugfixes
● Security updates
● Add release images built from scratch image. Such images could be more
preferable for using in environments with higher security standards
● Many minor bugfixes and improvements
● See more at https://docs.victoriametrics.com/changelog/
Also check the new VictoriaLogs PlayGround https://play-vmlogs.victoriametrics.com/
Folding Cheat Sheet #6 - sixth in a seriesPhilip Schwarz
Left and right folds and tail recursion.
Errata: there are some errors on slide 4. See here for a corrected versionsof the deck:
https://speakerdeck.com/philipschwarz/folding-cheat-sheet-number-6
https://fpilluminated.com/deck/227
In this infographic, we have explored cost-effective strategies for iOS app development, focusing on building high-quality apps within a budget. Key points covered include prioritizing essential features, leveraging existing tools and libraries, adopting cross-platform development approaches, optimizing for a Minimum Viable Product (MVP), and integrating with cloud services and third-party APIs. By implementing these strategies, businesses and developers can create functional and engaging iOS apps while minimizing development costs and time-to-market.
Hands-on with Apache Druid: Installation & Data Ingestion StepsservicesNitor
Supercharge your analytics workflow with https://bityl.co/Qcuk Apache Druid's real-time capabilities and seamless Kafka integration. Learn about it in just 14 steps.
The Ultimate Guide to Top 36 DevOps Testing Tools for 2024.pdfkalichargn70th171
Testing is pivotal in the DevOps framework, serving as a linchpin for early bug detection and the seamless transition from code creation to deployment.
DevOps teams frequently adopt a Continuous Integration/Continuous Deployment (CI/CD) methodology to automate processes. A robust testing strategy empowers them to confidently deploy new code, backed by assurance that it has passed rigorous unit and performance tests.
62. Events by tag
0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0 event 2,
tag 1
1 0
event 0 event 1 event 2,
tag 1
63. 0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 2,
tag 1
1 0
event 0 event 1
event 0
event 2,
tag 1
64. 0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0 event 2,
tag 1
1 0
event 1event 0 event 2,
tag 1
65. 0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0 event 2,
tag 1
1 0
event 0 event 1 event 2,
tag 1
66. event 0
event 0
0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 2,
tag 1
1 0
event 1 event 2,
tag 1
67. event 0
event 0 event 1
0 0
0 1
event 100,
tag 1
event 101 event 102
event 2,
tag 1
1 0
event 2,
tag 1
event 1,
tag 1
68. 0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 2,
tag 1
1 0
event 2,
tag 1
event 0
event 0 event 1
event 1,
tag 1
69. event 1,
tag 1
event 2,
tag 1
event 0
event 0 event 1
event 1,
tag 10 0
0 1
event 100,
tag 1
event 101 event 102
1 0
event 2,
tag 1
70. event 2,
tag 1
event 0
event 0 event 1
0 0
0 1
event 100,
tag 1
event 101 event 102
1 0
event 2,
tag 1
event 1,
tag 1
71. 0 0
0 1
1 0
event 2,
tag 1
event 0
event 0 event 1
event 100,
tag 1
event 101 event 102
event 2,
tag 1
event 1,
tag 1
72. Events by tag
Id 0,
event 1
Id 1,
event 2
Id 0,
event 100
0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0
1 0
event 0 event 1 event 2,
tag 1
Id 0,
event 2
tag 1 1/1/2016
tag 1 1/2/2016
event 2,
tag 1
SELECT * FROM $eventsByTagViewName$tagId WHERE
tag$tagId = ? AND
timebucket = ? AND
timestamp > ? AND
timestamp <= ?
ORDER BY timestamp ASC
LIMIT ?
73. Id 1,
event 2
Id 0,
event 100
Id 0,
event 1
0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0
Id 0,
event 2
1 0
event 0 event 1 event 2,
tag 1
tag 1 1/1/2016
tag 1 1/2/2016
event 2,
tag 1
74. Id 1,
event 2
Id 0,
event 100
Id 0,
event 1
0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0
Id 0,
event 2
1 0
event 0 event 1 event 2,
tag 1
tag 1 1/1/2016
tag 1 1/2/2016
event 2,
tag 1
75. Id 0,
event 100
Id 1,
event 2
Id 0,
event 1
0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0
Id 0,
event 2
1 0
event 0 event 1 event 2,
tag 1
tag 1 1/1/2016
tag 1 1/2/2016
event 2,
tag 1
76. Id 0,
event 100
Id 1,
event 2
Id 0,
event 1
0 0
0 1
event 1,
tag 1
event 100,
tag 1
event 101 event 102
event 0
1 0
event 0 event 1 event 2,
tag 1
tag 1 1/1/2016
tag 1 1/2/2016
event 2,
tag 1
Id 0,
event 2
106. Ordering
10 2
1 12:34:57 1
KEY TIME VALUE
2 12:34:58 2
KEY TIME VALUE
0 12:34:56 0
KEY TIME VALUE
107. 0
1
2
1 12:34:57 1
KEY TIME VALUE
2 12:34:58 2
KEY TIME VALUE
0 12:34:56 0
KEY TIME VALUE
108. Distributed causal stream merging
Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
node_id
109. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
node_id
110. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
node_id
111. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
node_id
persistence
_id
seq
0 0
1 . . .
2 . . .
112. persistence
_id
seq
0 1
1 . . .
2 . . .
Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 0
node_id
0
1
Id 2,
event 0
Id 0,
event 0
Id 0,
event 1
Id 0,
event 3
113. persistence
_id
seq
0 2
1 0
2 0
Id 0,
event 1
Id 0,
event 0
Id 1,
event 0
node_id
0
1
Id 2,
event 0
Id 0,
event 0
Id 0,
event 1
Id 0,
event 2
Id 0,
event 3
Id 2,
event 0
Id 0,
event 2
Id 1,
event 0
114. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 0,
event 3
node_id
Id 1,
event 0
persistence
_id
seq
0 3
1 0
2 0
115. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
node_id
Id 1,
event 0
0 0 Id 0,
event 0
Id 0,
event 1
Replay
116. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
node_id
Id 1,
event 0
0 0 Id 0,
event 0
Id 0,
event 1
117. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 1,
event 0
0 0 Id 0,
event 0
Id 0,
event 1
node_id
118. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
1
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 1,
event 0
0 0 Id 0,
event 0
Id 0,
event 1
node_id
persistence
_id
seq
0 2
119. Id 0,
event 2
Id 0,
event 1
Id 0,
event 0
Id 1,
event 00
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 1,
event 0
0 0 Id 0,
event 0
Id 0,
event 1
persistence
_id
seq
0 2
stream_id seq
0 1
1 2
1
node_id
121. Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 0,
event 3
Id 1,
event 0
122. Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 0,
event 3
Id 1,
event 0
123. Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 0,
event 3
Id 1,
event 0
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 3
Id 1,
event 0
ACK ACK ACK ACK ACK
124. Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 0,
event 3
Id 1,
event 0
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 3
Id 1,
event 0
ACK ACK ACK ACK ACK
125. Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 2
Id 0,
event 3
Id 1,
event 0
Id 0,
event 0
Id 0,
event 1
Id 2,
event 0
Id 0,
event 3
Id 1,
event 0
ACK ACK ACK ACK ACK
130. node_id
0
1
Id 0,
event 0
Id 0,
event 1
Id 1,
event 0
Id 0,
event 2
Id 2,
event 0
Id 0,
event 3
Id 0,
event 0
Id 0,
event 1
Id 1,
event 0
Id 2,
event 0
Id 0,
event 2
Id 0,
event 3
tag 1 0
allIds
Id 0,
event 1
Id 2,
event 1
0 1
0 0 event 0 event 1
131. val conf = new SparkConf().setAppName("...").setMaster("...").set("spark.cassandra.connection.host", "...")
val sc = new SparkContext(conf)
implicit val ordering = new Ordering[(String, Double)] {
override def compare(x: (String, Double), y: (String, Double)): Int =
implicitly[Ordering[Double]].compare(x._2, y._2)
}
sc.eventTable()
.cache()
.flatMap {
case (JournalKey(persistenceId, _, _), BalanceUpdatedEvent(change)) =>
(persistenceId -> change) :: Nil
case _ => Nil
}
.reduceByKey(_ + _)
.top(100)
.foreach(println)
sc.stop()
Akka Analytics
132. val conf = new SparkConf().setAppName("...").setMaster("...").set("spark.cassandra.connection.host", "...")
val sc = new StreamingContext(conf, Seconds(5))
implicit val ordering = new Ordering[(String, Double)] {
override def compare(x: (String, Double), y: (String, Double)): Int =
implicitly[Ordering[Double]].compare(x._2, y._2)
}
sc.eventTable()
.cache()
.flatMap {
case (JournalKey(persistenceId, _, _), BalanceUpdatedEvent(change)) =>
(persistenceId -> change) :: Nil
case _ => Nil
}
.reduceByKey(_ + _)
.top(100)
.foreach(println)
sc.stop()
135. Client 1
Client 2
Client 3
Update
Update
Update
Model devices Model devices Model devices
Input data Input data Input data
Parameter devices
P
ΔP
ΔP
ΔP
136. Challenges
● All the solved problems
○ Exactly once delivery
○ Consistency
○ Availability
○ Fault tolerance
○ Cross service invariants and consistency
○ Transactions
○ Automated deployment and configuration management
○ Serialization, versioning, compatibility
○ Automated elasticity
○ No downtime version upgrades
○ Graceful shutdown of nodes
○ Distributed system verification, logging, tracing, monitoring, debugging
○ Split brains
○ ...
137. Conclusion
● From request, response, synchronous, mutable state
● To streams, asynchronous messaging
● Production ready distributed systems
139. MANCHESTER LONDON NEW YORK
@zapletal_martin @cakesolutions
347 708 1518
enquiries@cakesolutions.net
We are hiring
http://www.cakesolutions.net/careers