Building a Data Exchange with Spring Cloud Data Flow

VMware Tanzu

  • 1. Building a Data Exchange with Spring Cloud Data Flow. October 7–10, 2019, Austin Convention Center
  • 2. Agenda
      • Introduction
      • What is Spring Cloud Data Flow, and what is a Data Exchange?
      • Transformation Case Study
          • Decision factors
          • Assessing current state
          • Applying patterns
          • Lessons learned
          • Ways forward
      Unless otherwise indicated, these slides are © 2013-2019 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
  • 3. Introduction – Hi! Welcome!
      Technical Director, Charles Schwab & Co., Inc.
      • Software development, design and architecture for over 25 years
      • Using Java since 1999, Spring since 2008
      • Using SCDF on Pivotal Platform since 2016
      • Member of the Pivotal Vanguards
      • Believe software development is more of an art than a science
      • Avid jogger and climber (until recently)
      • Gamer ordinaire
  • 4. What is Spring Cloud Data Flow?
      • Orchestration service on Cloud Foundry or Kubernetes
      • Evolution of Spring XD
          • Spring Integration
          • Spring Batch
          • Spring Cloud Stream
          • Spring Cloud Task
      • Streaming and batch
      dataflow.spring.io
  • 5. Two Ways to Deploy SCDF
      • Provision the Data Flow Server tile
          https://docs.pivotal.io/scdf/1-6
      • Install the Spring Boot Application
          • Provision a Rabbit service instance
          • Provision a PostgreSQL service instance
          • Download SCDF server application jar
          • Download SCDF shell jar
          • Download Skipper jar
          • Configure and run the SCDF application
          https://dataflow.spring.io/docs/installation/cloudfoundry/cf-cli
  • 6. What is a Data Exchange?
      • System or group of applications that gets data from “point a” to “point b”
      • Validates, transforms data
      • Consumes data in multiple formats
      • Dissemination to 1 or more consumers
      • Similar in concept to ETL
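
The "validates, transforms" step above can be sketched in plain Java. Recent versions of Spring Cloud Stream can bind a `java.util.function.Function` bean as a processor; here the function is called directly so the example runs without Spring. The `id,amount` input format, the pipe-delimited output, and all class and method names are assumptions for illustration, not taken from the talk.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical names throughout; the "id,amount" record format is an assumption.
class ExchangeTransform {

    // Validates one inbound record and converts it to a pipe-delimited form.
    // Returns Optional.empty() when the record fails validation.
    static final Function<String, Optional<String>> VALIDATE_AND_TRANSFORM = line -> {
        String[] fields = line.trim().split(",");
        if (fields.length != 2 || fields[0].isBlank()) {
            return Optional.empty();
        }
        try {
            double amount = Double.parseDouble(fields[1].trim());
            String id = fields[0].trim().toUpperCase(Locale.ROOT);
            return Optional.of(id + "|" + String.format(Locale.ROOT, "%.2f", amount));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    };

    // Applies the function to every record and drops the invalid ones.
    static List<String> exchange(List<String> inbound) {
        return inbound.stream()
                .map(VALIDATE_AND_TRANSFORM)
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [A1|10.50, B2|3.00]
        System.out.println(exchange(List.of("a1,10.5", "bad-record", "b2,3")));
    }
}
```

In a real exchange, rejected records would more likely be routed to an error destination than silently dropped; dropping them keeps the sketch short.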
  • 7. Data Exchange vs. ETL
      • Data Exchange
          • May not know much about source system(s)
          • Consumes data in many different formats
          • Disseminates to disparate systems
          • Disseminates in different formats
          • Sources and destinations are not necessarily databases
          • May or may not own the data being exchanged
      • ETL
          • Consumes from disparate systems, format is generally known
          • Destination system is typically singular
          • Data is owned by the “loader”
          • Typically refers to databases
  • 8. Case Study: Deciding on a Data Exchange. Why did we use Spring Cloud Data Flow?
  • 9. Decision Factors
      • What might we be dealing with?
          • Monolithic legacy application more than 10 years old
          • Ugly technology stack (Perl, .sh, PL/SQL, Java, etc.)
          • Band-aided, face-lifted over the years
          • Strategic misalignment
          • Exclusively batch, inherently limited
          • Exposure to risks – safe the way it is, but don’t want to touch it
      • What do we want to build?
          • Microservice architecture cuz … “microservices”
          • Strategically aligned
          • “Futureproof” – not really a thing
          • Promotes patterns
          • Industry standard integrations
          • Addresses data protection concerns
          • Facilitates migration toward real-time
          • Facilitates fast time-to-market
  • 10. Decision Factors: How does Pivotal Platform Address Concerns?
      • Isolated segments can be used to separate systems with different levels of sensitivity or exposure
      • Hyper-converged infrastructure secures each application in its own subnet
      • Communication internally on the platform is secured via mutual TLS
      • Application security groups open ports to outside the platform only as needed
      • PCI DSS compliance can be achieved
  • 11. Assessing Current State: Distilling Requirements
  • 12. Applying Patterns: Laying Out a Plan
  • 13. Why Patterns?
      • Duh… (“Gang of Four” book link)
      • Represent repeatable implementations
      • Configured rather than built
      • Facilitates speed-to-market
      • Facilitates governance of data lineage
      • Focuses application specific efforts
  • 14. What Does it Look Like?
      STREAM_1 = file > :funnel-in
      STREAM_2 = jms > :funnel-in
      STREAM_3 = jdbc > :funnel-in
      STREAM_4 = s3 > :funnel-in
      STREAM_5 = rabbit > :funnel-in
      STREAM_6 = mongodb > :funnel-in
      STREAM_7 = sftp > :funnel-in
      STREAM_8 = :funnel-in > transform > :fan-out
      STREAM_9 = :fan-out > file
      STREAM_10 = :fan-out > sftp
      STREAM_11 = :fan-out > rabbit
      STREAM_12 = :fan-out > jdbc
      STREAM_13 = :fan-out > s3
      STREAM_14 = :fan-out > mongodb
      STREAM_15 = :fan-out > hdfs
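
The funnel-in / fan-out layout on this slide is regular enough to generate. The sketch below, in plain Java, builds the same fifteen definitions from a list of sources, one processor, and a list of sinks, then wraps each in an SCDF shell `stream create` command. The class and method names are hypothetical, and the lowercase `stream-N` names are an assumed naming choice standing in for the slide's `STREAM_N` labels.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; only the DSL strings themselves come from the slide.
class FunnelFanOut {

    // Every source writes to :funnel-in, one processor bridges to :fan-out,
    // and every sink reads from :fan-out.
    static List<String> definitions(List<String> sources, String processor, List<String> sinks) {
        List<String> defs = new ArrayList<>();
        for (String source : sources) {
            defs.add(source + " > :funnel-in");
        }
        defs.add(":funnel-in > " + processor + " > :fan-out");
        for (String sink : sinks) {
            defs.add(":fan-out > " + sink);
        }
        return defs;
    }

    // Wraps each definition in a shell command; stream names are placeholders.
    static List<String> shellCommands(List<String> defs) {
        List<String> commands = new ArrayList<>();
        for (int i = 0; i < defs.size(); i++) {
            commands.add("stream create --name stream-" + (i + 1)
                    + " --definition \"" + defs.get(i) + "\"");
        }
        return commands;
    }

    public static void main(String[] args) {
        List<String> sources = List.of("file", "jms", "jdbc", "s3", "rabbit", "mongodb", "sftp");
        List<String> sinks = List.of("file", "sftp", "rabbit", "jdbc", "s3", "mongodb", "hdfs");
        shellCommands(definitions(sources, "transform", sinks)).forEach(System.out::println);
    }
}
```

Because every stream meets at a named destination, adding a new source or sink is one more definition and leaves the existing streams untouched.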
  • 15. One Pattern in Detail – The File Pattern
  • 16. Lessons Learned: What have we learned that has informed our strategy?
  • 17. Lessons Learned: Underlying Transport
      • RabbitMQ and Kafka binders are supported out-of-box in Spring Cloud Stream
      • No experience with Kafka
      • Managing the transport and the reliability of message delivery becomes something to look into, as does implementing a backing service for message persistence (probably not needed with Kafka)
  • 18. Lessons Learned: Total Cost of Ownership
      • Billing model based on number of app instances is punitive
      • Streams with lots of transformations can incur high cost
      • Multiple foundations in a topology can mean multiplying AIs
      • So learn to be careful about which ‘batch’ processes are appropriate for a Task
      • Build some streams that could provide shared capabilities
      • Design streams so that every app is essential
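
The multiplication behind these points can be made concrete with a small estimate. The sketch assumes each app in a stream definition is deployed as its own application, billed per running instance, and that the whole topology is repeated on every foundation. The app counts, instance count, and foundation count are made-up inputs, and the class and method names are hypothetical.

```java
import java.util.Map;

// Hypothetical estimator; all numbers in main() are illustrative assumptions.
class AppInstanceEstimate {

    // Total AIs = (sum of apps across streams) x instances per app x foundations.
    static int totalAppInstances(Map<String, Integer> appsPerStreamGroup,
                                 int instancesPerApp,
                                 int foundations) {
        int apps = appsPerStreamGroup.values().stream()
                .mapToInt(Integer::intValue)
                .sum();
        return apps * instancesPerApp * foundations;
    }

    public static void main(String[] args) {
        // Seven sources, one transform, seven sinks: 15 apps in total.
        Map<String, Integer> topology = Map.of("inbound", 7, "transform", 1, "outbound", 7);

        // 15 apps x 2 instances x 2 foundations; prints 60
        System.out.println(totalAppInstances(topology, 2, 2));

        // Two extra transform steps add 2 apps: 17 x 2 x 2; prints 68
        Map<String, Integer> heavier = Map.of("inbound", 7, "transform", 3, "outbound", 7);
        System.out.println(totalAppInstances(heavier, 2, 2));
    }
}
```

Under these assumptions each additional processor costs instances-times-foundations AIs, which is why non-essential apps in a stream are worth removing.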
  • 19. Lessons Learned: Currency
      • SCDF and Pivotal Platform have evolved a lot since 2016
      • Staying current with tooling and platform is very desirable
      • Spring Boot evolved (1.4.x → 2.1.x)
      • SCDF evolved (1.0 → 2.2)
      • Pivotal Platform evolved (PCF → Pivotal Platform) (1.4.x → 2.6)
  • 20. Lessons Learned: Retirement of Legacy Platform
      • Cannot happen all at once
      • Deprecate all work on the old platform from a point in time
      • Effective to build new alongside old, and then retire old
      • Planning for actual retirement is essential
  • 21. Check out these related sessions:
      • High-Performance Data Processing with Spring Cloud Data Flow and Geode
      • Real-Time Performance Analysis of Data-Processing Pipelines with Spring Cloud Data Flow, Micrometer
      • Streaming with Spring Cloud Stream and Apache Kafka
      #springone @s1p