Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strata Singapore 2016

Learn which technologies enable a modern stream-based architecture that connects everything, whether within application modules or across data centers and public clouds. Combine Kafka-style streaming and stream processing frameworks such as Spark and Flink with microservices, and rethink your big data architecture in terms of data flows rather than state.

Published in: Technology

  1. Architecting a hybrid cloud application using a global publish-subscribe streaming message system
     Mathieu Dumoulin (MapR Technologies), Strata Singapore 2016
     © 2016 MapR Technologies, MapR Confidential
  2. Streaming Architecture to Connect Everything (including Hybrid Cloud)
     Mathieu Dumoulin (MapR Technologies), Strata Singapore 2016
  3. Mathieu Dumoulin, Data Engineer
       • Master’s degree in text classification on Hadoop at Fujitsu Canada’s Innovation Lab and Laval University
       • In Tokyo, I’ve worked as a Data Scientist, Search Engineer and Data Engineer
       • Working on streaming, complex event processing and machine learning
  4. The new rule for the future is going to be, “Anything that can be connected, will be connected.”
     Jacob Morgan, Forbes - May 2014
  5. Talk Summary
       • Clouds: private vs. public vs. hybrid
       • It’s all about that streaming
           – Streaming for IoT
           – Publish-subscribe messaging systems (Kafka)
           – Stream Processing (Apache Spark Streaming, Apache Flink)
           – Microservices
       • Streams-based Architecture in the hybrid cloud
           – Design goals
           – Examples
       • Recap, Q&A
  6. Weather today for IT:
  7. Public Cloud - Low Upfront Cost and Flexibility
     The Good
       • Right size instances for application
       • Grow with the business
       • “Forever” extensible
       • Global in a few clicks
     The Bad
       • New complexity, no magic
       • Costs can run away
     The Ugly
       • Local data is far from processing
       • Severe lock-in without huge in-house expertise
  8. Private Clouds - The Benefits of Ownership
     The Bad
       • Harder to scale vertically & horizontally
       • Cost of multiple datacenters
     The Ugly
       • Pay for spike, wasted resources
       • Never right size in a growing organization
     The Good
       • Direct access to data
       • Security, privacy and legal compliance
       • Hardware certainty
       • Low running cost
  9. Hybrid = Public + Private
     Spans at least one public and one private cloud.
       • Test new ideas with low up-front capital cost
       • Cloudbursting
       • High Availability and Disaster Recovery
       • Regulatory Requirements
     IT infrastructure agility
     [Diagram labels: Private Cloud - Europe; Private Cloud - Tokyo]
  10. It’s all about that streaming
  11. Streaming Architecture the Norm for Data Driven Organizations
      “Stream-based computing is becoming the norm for data-driven organizations” - Friedman & Dunning, Streaming Architecture
        • Build flexible systems
            – More efficient and easier to build
            – Decouples dependencies between data source and processing
        • Better model the way business processes take place
        • More value now… and later
            – Aggregates data from many sources once
            – Serves data to one or many projects immediately
            – More efficient and high performance
            – Run batch analytics, reprocess data
  12. IoT is a Natural Use Case for Streaming
      Connected devices produce data as real-time events that are modelled naturally as event streams.
      Some actions have value only if taken immediately:
        – Navigation updates from traffic conditions, accident reports, disasters, …
        – Slowing down or stopping a factory line in response to quality issues
        – Re-routing items mid-way during shipping to increase efficiency
        – Continuous engine tuning
  13. IoT is Happening Right Now!
  14. Streams Make the Hybrid Cloud Practical
      Streams can serve for inter-cloud communication in exactly the same way they support any other scenario.
        • Abstract the differences between on-premises and cloud
        • Standardize the expected flow of data between modules
        • Reuse data many times, break down data silos
  15. What Streaming Requires from a Messaging System
        • The producer and consumer are fully independent
        • Very high throughput: 1,000+/s → 1,000,000+/s
        • Persistence
            – Fault-tolerance
            – Data is kept as a replayable sequence
            – Strong ordering of events
        • Naming of topics (consumers pick the data they need)
        • Geo-distributed replication (for Hybrid Cloud use cases)
      It’s very hard to get full isolation of producer and consumers while also keeping very high speed, but we must have both.
  16. What Streaming Requires from Stream Processing Frameworks
      Desirable features for real-time analytics frameworks:
        • Open Source, active development and developer community
        • Supports “exactly once” guarantee, stream reprocessing
        • How much real-time? Microbatch vs. record-at-a-time
        • Performance (latency, throughput)
        • Other: Easy to use, compatibility, talent availability
      To know more: https://www.mapr.com/blog/stream-processing-everywhere-what-use
      Jim Scott - Stream Processing Everywhere - What to Use? Strata San Jose 2015
      Also see data Artisans’ blog on Stream Processing Framework Myths
  17. Which Stream Processing Frameworks?
  18. Summing up: Technology to support Streaming
        1. Lightweight messaging system
        2. Stream Processing Framework
      You can get an Introduction to Flink in this Free Book published by O’Reilly
  19. Key Ideas For Effectively Using Streams
        • Real-time Analysis
        • Persist to Disk
        • Geo-distributed Replication
        • Core part of Architecture
  20. (no text extracted for this slide)
  21. Streaming Architecture: Ideal Platform for Microservices
      Microservices are a modern distributed architecture that realizes the promises of Service-Oriented Architecture (SOA)
        • Scale up from a test use case to a global deployment
        • Decouples components, more modular
        • Modern, agile development, testing and deployment
        • More robust and responsive
      See Krystal Valentine’s “The keys to an event-based microservices application” presentation, Strata New York 2016
  22. Monolithic to Microservices Architecture
      See Fowler’s blog about microservices: http://www.martinfowler.com/articles/microservices.html
  23. Microservices are Truly Decoupled
  24. When to Use Streaming Architecture
  25. Connect Clouds with Streams: Streams-based Architecture
  26. “Switch from thinking of computer programs as state-oriented to thinking of them in terms of flows”
      Ted Dunning & Ellen Friedman, Streaming Architecture - O’Reilly - 2016
  27. An End-to-End Streaming Architecture
      [Diagram labels: Japan North Data Center Stream GW Global Data Center Stream]
  28. Example Architecture: Log Analysis
  29. Example Architecture: Log Analysis
  30. Example Architecture: The MapR Blueprint
      Download the Finserve app from GitHub! https://github.com/mapr-demos/finserv-application-blueprint
  31. Conclusion
        • The hybrid cloud matters for IT agility
        • Use streams for communication between elements
        • Streaming-based systems can be arbitrarily complex
            – Still fast, responsive, reliable and easier to develop!
        • In a streaming architecture world, a converged platform (built-in streaming, storage and DB) makes a difference.
  32. Suggested Reading And Video Links
      Get Ted & Ellen’s book: Read it Online for Free!
      New content presented by Ted Dunning:
        1. Big Data in the Cloud (blog): www.mapr.com/big-data-cloud
             a. Direct video link: https://youtu.be/90KrQAb1_Cw
        2. Converged Advantages in the Cloud (blog): www.mapr.com/converged-cloud
             a. Direct video link: https://youtu.be/yjfBXNcmAHA
  33. Q & A
      Engage with us! @mapr, mdumoulin@mapr.com, @lordxar, mapr-technologies
  34. Key Ideas for Microservices
        • Services are opaque - API only
        • They communicate with only a few other services using lightweight, flexible protocols.
            – HTTP+REST - Synchronous (frontend)
            – Messaging Systems (Kafka, MapR Streams) - Asynchronous (backend)
        • Data formats should be future-proofed
            – JSON - Human readable, easy to use, low efficiency
            – Binary (Avro, Protobuf, Thrift) - Efficient but (somewhat) harder to use
  35. Spark Streaming or Flink: Case by Case
        • Spark Streaming: Micro-batches. Time-based window. Latency: seconds
        • Flink: Continuous flow model. Record-based window. Latency: ms
      Both provide an exactly-once guarantee, high throughput and low fault-tolerance overhead. Both support streaming and batch.
  36. The Hybrid Cloud for IoT Infrastructure
        • IoT is a new use case - Need to Test
        • Built-in need for baseload capacity and bursting data spikes
        • Global marketplace requires geographically dispersed datacenters
        • Increasingly strict compliance requirements
        • IoT Security issues need to be taken seriously
      Why do IoT applications call out for the flexibility of Hybrid Cloud?
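
Slide 15 lists what streaming requires from a messaging system: independent producers and consumers, named topics, and data kept as an ordered, replayable sequence. A minimal in-memory sketch of those ideas could look like the following. It is a toy, not Kafka or MapR Streams: the names `TopicLog`, `Consumer`, and the topic `sensor.temperature` are hypothetical, and persistence, fault tolerance, and geo-distributed replication are left out.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory stand-in for a replayable message log (hypothetical)."""

    def __init__(self):
        # One append-only list per topic name; list order is event order.
        self._topics = defaultdict(list)

    def publish(self, topic, event):
        """Append an event to a topic and return its offset."""
        log = self._topics[topic]
        log.append(event)
        return len(log) - 1

    def read(self, topic, offset=0):
        """Return events from `offset` onward without removing them."""
        return self._topics.get(topic, [])[offset:]


class Consumer:
    """Tracks its own offset, so producers never need to know it exists."""

    def __init__(self, log, topic):
        self.log = log
        self.topic = topic
        self.offset = 0

    def poll(self):
        """Fetch everything published since the last poll."""
        events = self.log.read(self.topic, self.offset)
        self.offset += len(events)
        return events

    def rewind(self, offset=0):
        """Replay: move back to an earlier position in the sequence."""
        self.offset = offset


log = TopicLog()
for reading in (21.5, 21.7, 22.1):
    log.publish("sensor.temperature", reading)

dashboard = Consumer(log, "sensor.temperature")
archiver = Consumer(log, "sensor.temperature")

first = dashboard.poll()     # everything published so far
log.publish("sensor.temperature", 22.4)
second = dashboard.poll()    # only the new event
archived = archiver.poll()   # a late consumer still sees the full history

dashboard.rewind()
replayed = dashboard.poll()  # reprocessing from the start of the log
```

Because reading never deletes events and each consumer keeps its own offset, a new consumer can be added or an old one replayed without touching the producer, which is the decoupling the slide asks for.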
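
Slide 34 contrasts JSON (human readable, low efficiency) with binary formats (efficient, somewhat harder to use). The sketch below illustrates that trade-off using only the Python standard library. The `struct` layout is an assumption chosen for illustration and stands in for a real binary format; Avro, Protobuf, and Thrift add schemas and schema evolution, which a fixed layout like this does not provide. The record fields are hypothetical.

```python
import json
import struct

# Hypothetical sensor reading used only for illustration.
reading = {"sensor_id": 42, "timestamp": 1480000000, "temperature": 21.5}

# JSON: self-describing and readable, but field names travel with every message.
json_bytes = json.dumps(reading).encode("utf-8")

# Binary (assumed layout): unsigned 32-bit id, unsigned 64-bit timestamp,
# 64-bit float, network byte order, no padding.
RECORD = struct.Struct("!IQd")
binary_bytes = RECORD.pack(
    reading["sensor_id"], reading["timestamp"], reading["temperature"]
)

# Both encodings round-trip to the same values.
decoded_json = json.loads(json_bytes)
sensor_id, timestamp, temperature = RECORD.unpack(binary_bytes)

print(len(json_bytes), "bytes as JSON,", len(binary_bytes), "bytes as binary")
```

The binary record is smaller, but a reader must already know the layout to decode it, which is why the slide pairs efficiency with "harder to use" and asks for formats that are future-proofed.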
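
Slide 35 distinguishes micro-batches with time-based windows from a continuous, record-at-a-time model. The following sketch shows the difference in when results are produced. It uses plain Python rather than the Spark Streaming or Flink APIs, and the function names and sample events are hypothetical.

```python
from collections import deque


def micro_batches(events, batch_seconds):
    """Group (timestamp, value) events into fixed time-based batches
    and return one average per batch."""
    batches = {}
    for ts, value in events:
        window_start = int(ts // batch_seconds) * batch_seconds
        batches.setdefault(window_start, []).append(value)
    # One result per batch interval, available only once the batch closes.
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(batches.items())]


def record_at_a_time(events, window_size):
    """Yield an updated average after every record, computed over
    the last `window_size` records."""
    window = deque(maxlen=window_size)
    for ts, value in events:
        window.append(value)
        yield ts, sum(window) / len(window)


events = [(0.0, 10.0), (0.4, 20.0), (1.1, 30.0), (1.9, 50.0), (2.5, 40.0)]

batch_results = micro_batches(events, batch_seconds=1)
record_results = list(record_at_a_time(events, window_size=2))
```

The micro-batch version emits three results for five events, one per second of input, while the record-based version emits a result for every event. That is the latency gap the slide summarizes as seconds versus milliseconds.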
