SlideShare a Scribd company logo
Continuous Integration
for Spark Apps
Hi, I’m Sean!
© 2015 Uncharted Software Inc.
It’s hard to test Spark Apps :(
© 2015 Uncharted Software Inc.
Case Study: Uncharted Spark Pipeline
© 2015 Uncharted Software Inc.
Case Study: Uncharted Spark Pipeline
Some key issues:
● Ensure reliability
● Prevent regressions
● Maintain compatibility with multiple versions of Spark
● Open-source - need a quick and easy way to evaluate PRs
© 2015 Uncharted Software Inc.
What is Continuous Integration?
© 2015 Uncharted Software Inc.
“Continuous Integration (CI) is a development practice that requires developers to integrate
code into a shared repository several times a day. Each check-in is then verified by an
automated build, allowing teams to detect problems early.”
-- ThoughtWorks
© 2015 Uncharted Software Inc.
“Continuous Integration (CI) is a development practice that is pretty damned
important for writing quality software.”
-- Me
© 2015 Uncharted Software Inc.
So, What is Continuous Integration?
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
4. Commit/push feature branches often
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
4. Commit/push feature branches often
5. Build (and test) All The Branches
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
4. Commit/push feature branches often
5. Build (and test) All The Branches
6. Test in a clone of the production environment
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
4. Commit/push feature branches often
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
4. Commit/push feature branches often
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
4. Commit/push feature branches often
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
}duh.
© 2015 Uncharted Software Inc.
Best Practices, courtesy of Wikipedia
1. Maintain a code repository (Git)
2. Automate the build (Gradle)
3. Tests should be part of the build (ScalaTest)
4. Commit/push feature branches often
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
}...less duh.
© 2015 Uncharted Software Inc.
Why are these difficult with Apache Spark?
5. Build (and test) All The Branches
6. Test in a clone of the production
environment
7. Keep the build fast
8. Everyone can see the results of builds
© 2015 Uncharted Software Inc.
What is a Spark App?
© 2015 Uncharted Software Inc.
What is a Spark app?
Source
JAR
Spark ?
This thing.
JAR
© 2015 Uncharted Software Inc.
And...
Source
JAR
Spark ?
We need to test this
JAR
© 2015 Uncharted Software Inc.
But...
Source
JAR
ScalaTest
Scala RE
By default, we have this
JAR
(boom)
© 2015 Uncharted Software Inc.
v1: Squish Spark inside ScalaTest
Source
JAR
ScalaTest
with
SparkContext
So, we try this
JAR
it works!
(sort of)
© 2015 Uncharted Software Inc.
it works!
(sort of)
© 2015 Uncharted Software Inc.
6. Test in a clone of the production environment
© 2015 Uncharted Software Inc.
v2: Squish ScalaTest into Spark
Source
Test
JAR
Tests Main.scala
Spark
JAR
Test
JAR
Test
Output
JAR
© 2015 Uncharted Software Inc.
Main.scala
© 2015 Uncharted Software Inc.
6. Test in a clone of the production environment
© 2015 Uncharted Software Inc.
Progress?
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
© 2015 Uncharted Software Inc.
What now?
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
© 2015 Uncharted Software Inc.
Docker Container
(uncharted/sparklet)
v3: Squish Spark and Test JAR into Docker
Test
Output
Source
Test
JAR
Tests Main.scala
Spark
JAR
JAR
Test
JAR
© 2015 Uncharted Software Inc.
test.sh
© 2015 Uncharted Software Inc.
build.gradle (excerpt)
© 2015 Uncharted Software Inc.
Progress?
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
© 2015 Uncharted Software Inc.
Travis CI VM
Docker Container
v4: Squish Docker into Travis CI
Test
Output
Source
Test
JAR
Tests Main.scala
Spark
JAR
JAR
Test
JAR
© 2015 Uncharted Software Inc.
.travis.yml
© 2015 Uncharted Software Inc.
Voilà!
© 2015 Uncharted Software Inc.
Progress?
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
© 2015 Uncharted Software Inc.
© 2015 Uncharted Software Inc.
© 2015 Uncharted Software Inc.
All done!
5. Build (and test) All The Branches
6. Test in a clone of the production environment
7. Keep the build fast
8. Everyone can see the results of builds
© 2015 Uncharted Software Inc.
Next Steps?
Alpine Linux
docker-compose
Windows (dev environment) support
python
© 2015 Uncharted Software Inc.
Questions?
https://github.com/unchartedsoftware/sparkpipe-core
https://github.com/Ghnuberath
@Ghnuberath
https://hub.docker.com/r/uncharted/sparklet/
smcintyre@unchartedsoftware.com

More Related Content

What's hot

TDC 2016 Floripa - Testando APIs REST com Supertest e Promises
TDC 2016 Floripa - Testando APIs REST com Supertest e PromisesTDC 2016 Floripa - Testando APIs REST com Supertest e Promises
TDC 2016 Floripa - Testando APIs REST com Supertest e Promises
Stefan Teixeira
 
Fundamentals of DevOps and CI/CD
Fundamentals of DevOps and CI/CDFundamentals of DevOps and CI/CD
Fundamentals of DevOps and CI/CD
Batyr Nuryyev
 
Ágiles 2016 - Using open source tools to support Continuous Delivery
Ágiles 2016 - Using open source tools to support Continuous DeliveryÁgiles 2016 - Using open source tools to support Continuous Delivery
Ágiles 2016 - Using open source tools to support Continuous Delivery
Stefan Teixeira
 
Latinoware 2016 - Continuous Delivery com ferramentas open source
Latinoware 2016 - Continuous Delivery com ferramentas open sourceLatinoware 2016 - Continuous Delivery com ferramentas open source
Latinoware 2016 - Continuous Delivery com ferramentas open source
Stefan Teixeira
 
TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...
TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...
TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...
Stefan Teixeira
 
Continuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL ServerContinuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL Server
Chris Adkin
 
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison DowdneySetting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Weaveworks
 
Scrum Gathering Portugal 2016 - Containerizing Tests with Docker
Scrum Gathering Portugal 2016 - Containerizing Tests with DockerScrum Gathering Portugal 2016 - Containerizing Tests with Docker
Scrum Gathering Portugal 2016 - Containerizing Tests with Docker
Stefan Teixeira
 
DevOps Transformation in Technical
DevOps Transformation in TechnicalDevOps Transformation in Technical
DevOps Transformation in Technical
Opsta
 
The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014
The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014
The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014
Puppet
 
6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...
6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...
6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...
Stefan Teixeira
 
Hands-on GitOps Patterns for Helm Users
Hands-on GitOps Patterns for Helm UsersHands-on GitOps Patterns for Helm Users
Hands-on GitOps Patterns for Helm Users
Weaveworks
 
Unit Testing in R with Testthat - HRUG
Unit Testing in R with Testthat - HRUGUnit Testing in R with Testthat - HRUG
Unit Testing in R with Testthat - HRUG
egoodwintx
 
Git Power Routines
Git Power RoutinesGit Power Routines
Git Power Routines
Nicola Paolucci
 
Git with t for teams
Git with t for teamsGit with t for teams
Git with t for teams
Sven Peters
 
20170807 - How to Fail Your TDD Rollout - A Train Wreck Story
20170807 - How to Fail Your TDD Rollout - A Train Wreck Story20170807 - How to Fail Your TDD Rollout - A Train Wreck Story
20170807 - How to Fail Your TDD Rollout - A Train Wreck Story
Chris Edwards, P.Eng.
 
Achieving Continuous Delivery with Puppet
Achieving Continuous Delivery with PuppetAchieving Continuous Delivery with Puppet
Achieving Continuous Delivery with Puppet
Devoteam Revolve
 
DevOps 及 TDD 開發流程哲學
DevOps 及 TDD 開發流程哲學DevOps 及 TDD 開發流程哲學
DevOps 及 TDD 開發流程哲學
謝 宗穎
 
Scaling your CI Pipeline with Docker and Concourse
Scaling your CI Pipeline with Docker and ConcourseScaling your CI Pipeline with Docker and Concourse
Scaling your CI Pipeline with Docker and Concourse
Chris Edwards, P.Eng.
 

What's hot (20)

TDC 2016 Floripa - Testando APIs REST com Supertest e Promises
TDC 2016 Floripa - Testando APIs REST com Supertest e PromisesTDC 2016 Floripa - Testando APIs REST com Supertest e Promises
TDC 2016 Floripa - Testando APIs REST com Supertest e Promises
 
Fundamentals of DevOps and CI/CD
Fundamentals of DevOps and CI/CDFundamentals of DevOps and CI/CD
Fundamentals of DevOps and CI/CD
 
Ágiles 2016 - Using open source tools to support Continuous Delivery
Ágiles 2016 - Using open source tools to support Continuous DeliveryÁgiles 2016 - Using open source tools to support Continuous Delivery
Ágiles 2016 - Using open source tools to support Continuous Delivery
 
Latinoware 2016 - Continuous Delivery com ferramentas open source
Latinoware 2016 - Continuous Delivery com ferramentas open sourceLatinoware 2016 - Continuous Delivery com ferramentas open source
Latinoware 2016 - Continuous Delivery com ferramentas open source
 
TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...
TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...
TDC 2016 SP - Continuous Delivery para aplicações Java com ferramentas open-s...
 
Continuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL ServerContinuous Integration With Jenkins Docker SQL Server
Continuous Integration With Jenkins Docker SQL Server
 
True Git
True Git True Git
True Git
 
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison DowdneySetting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
Setting up Notifications, Alerts & Webhooks with Flux v2 by Alison Dowdney
 
Scrum Gathering Portugal 2016 - Containerizing Tests with Docker
Scrum Gathering Portugal 2016 - Containerizing Tests with DockerScrum Gathering Portugal 2016 - Containerizing Tests with Docker
Scrum Gathering Portugal 2016 - Containerizing Tests with Docker
 
DevOps Transformation in Technical
DevOps Transformation in TechnicalDevOps Transformation in Technical
DevOps Transformation in Technical
 
The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014
The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014
The Seven Habits of Highly Effective Puppet Users - PuppetConf 2014
 
6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...
6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...
6º Encontro do Grupo de Testes Carioca - Testes em um contexto de Continuous ...
 
Hands-on GitOps Patterns for Helm Users
Hands-on GitOps Patterns for Helm UsersHands-on GitOps Patterns for Helm Users
Hands-on GitOps Patterns for Helm Users
 
Unit Testing in R with Testthat - HRUG
Unit Testing in R with Testthat - HRUGUnit Testing in R with Testthat - HRUG
Unit Testing in R with Testthat - HRUG
 
Git Power Routines
Git Power RoutinesGit Power Routines
Git Power Routines
 
Git with t for teams
Git with t for teamsGit with t for teams
Git with t for teams
 
20170807 - How to Fail Your TDD Rollout - A Train Wreck Story
20170807 - How to Fail Your TDD Rollout - A Train Wreck Story20170807 - How to Fail Your TDD Rollout - A Train Wreck Story
20170807 - How to Fail Your TDD Rollout - A Train Wreck Story
 
Achieving Continuous Delivery with Puppet
Achieving Continuous Delivery with PuppetAchieving Continuous Delivery with Puppet
Achieving Continuous Delivery with Puppet
 
DevOps 及 TDD 開發流程哲學
DevOps 及 TDD 開發流程哲學DevOps 及 TDD 開發流程哲學
DevOps 及 TDD 開發流程哲學
 
Scaling your CI Pipeline with Docker and Concourse
Scaling your CI Pipeline with Docker and ConcourseScaling your CI Pipeline with Docker and Concourse
Scaling your CI Pipeline with Docker and Concourse
 

Similar to Continuous Integration for Spark Apps by Sean McIntyre

Apigee deploy grunt plugin.1.0
Apigee deploy grunt plugin.1.0Apigee deploy grunt plugin.1.0
Apigee deploy grunt plugin.1.0
Diego Zuluaga
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
Roberto Pérez Alcolea
 
Our DevOps Journey: 6 Month Waterfalls to 1 Hour Code Deploys
Our DevOps Journey: 6 Month Waterfalls to 1 Hour Code DeploysOur DevOps Journey: 6 Month Waterfalls to 1 Hour Code Deploys
Our DevOps Journey: 6 Month Waterfalls to 1 Hour Code Deploys
Dynatrace
 
Continuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and JenkinsContinuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and Jenkins
SOASTA
 
Continuous integration and delivery for java based web applications
Continuous integration and delivery for java based web applicationsContinuous integration and delivery for java based web applications
Continuous integration and delivery for java based web applications
Sunil Dalal
 
Software Factory - Overview
Software Factory - OverviewSoftware Factory - Overview
Software Factory - Overviewslides_teltools
 
Continous integration and delivery for single page applications
Continous integration and delivery for single page applicationsContinous integration and delivery for single page applications
Continous integration and delivery for single page applications
Sunil Dalal
 
Continuous Delivery for IT Operations Teams
Continuous Delivery for IT Operations TeamsContinuous Delivery for IT Operations Teams
Continuous Delivery for IT Operations Teams
Mark Rendell
 
From 0 to DevOps in 80 Days [Webinar Replay]
From 0 to DevOps in 80 Days [Webinar Replay]From 0 to DevOps in 80 Days [Webinar Replay]
From 0 to DevOps in 80 Days [Webinar Replay]
Dynatrace
 
Emulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API ProvidersEmulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API Providers
Cisco DevNet
 
Continuous delivery is more than dev ops
Continuous delivery is more than dev opsContinuous delivery is more than dev ops
Continuous delivery is more than dev ops
Agile Montréal
 
Mobile App Quality Roadmap for DevTest Teams
Mobile App Quality Roadmap for DevTest TeamsMobile App Quality Roadmap for DevTest Teams
Mobile App Quality Roadmap for DevTest Teams
Perfecto by Perforce
 
Continuous Delivery with a PaaS Application
Continuous Delivery with a PaaS ApplicationContinuous Delivery with a PaaS Application
Continuous Delivery with a PaaS Application
Mark Rendell
 
Code Coverage for Total Security in Application Migrations
Code Coverage for Total Security in Application MigrationsCode Coverage for Total Security in Application Migrations
Code Coverage for Total Security in Application Migrations
Dana Luther
 
Adrian marinica continuous integration in the visual studio world
Adrian marinica   continuous integration in the visual studio worldAdrian marinica   continuous integration in the visual studio world
Adrian marinica continuous integration in the visual studio world
Codecamp Romania
 
Continuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and JenkinsContinuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and Jenkins
SOASTA
 
.Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013
.Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013 .Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013
.Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013
Tikal Knowledge
 
Spring Boot
Spring BootSpring Boot
Spring Boot
Jaran Flaath
 
SAPUI5/OpenUI5 - Continuous Integration
SAPUI5/OpenUI5 - Continuous IntegrationSAPUI5/OpenUI5 - Continuous Integration
SAPUI5/OpenUI5 - Continuous Integration
Peter Muessig
 
Unit Tests and Test Seams for abap Hamburg June 2017 presented
Unit Tests and Test Seams for abap Hamburg June 2017   presentedUnit Tests and Test Seams for abap Hamburg June 2017   presented
Unit Tests and Test Seams for abap Hamburg June 2017 presented
Rainer Winkler
 

Similar to Continuous Integration for Spark Apps by Sean McIntyre (20)

Apigee deploy grunt plugin.1.0
Apigee deploy grunt plugin.1.0Apigee deploy grunt plugin.1.0
Apigee deploy grunt plugin.1.0
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Our DevOps Journey: 6 Month Waterfalls to 1 Hour Code Deploys
Our DevOps Journey: 6 Month Waterfalls to 1 Hour Code DeploysOur DevOps Journey: 6 Month Waterfalls to 1 Hour Code Deploys
Our DevOps Journey: 6 Month Waterfalls to 1 Hour Code Deploys
 
Continuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and JenkinsContinuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and Jenkins
 
Continuous integration and delivery for java based web applications
Continuous integration and delivery for java based web applicationsContinuous integration and delivery for java based web applications
Continuous integration and delivery for java based web applications
 
Software Factory - Overview
Software Factory - OverviewSoftware Factory - Overview
Software Factory - Overview
 
Continous integration and delivery for single page applications
Continous integration and delivery for single page applicationsContinous integration and delivery for single page applications
Continous integration and delivery for single page applications
 
Continuous Delivery for IT Operations Teams
Continuous Delivery for IT Operations TeamsContinuous Delivery for IT Operations Teams
Continuous Delivery for IT Operations Teams
 
From 0 to DevOps in 80 Days [Webinar Replay]
From 0 to DevOps in 80 Days [Webinar Replay]From 0 to DevOps in 80 Days [Webinar Replay]
From 0 to DevOps in 80 Days [Webinar Replay]
 
Emulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API ProvidersEmulators as an Emerging Best Practice for API Providers
Emulators as an Emerging Best Practice for API Providers
 
Continuous delivery is more than dev ops
Continuous delivery is more than dev opsContinuous delivery is more than dev ops
Continuous delivery is more than dev ops
 
Mobile App Quality Roadmap for DevTest Teams
Mobile App Quality Roadmap for DevTest TeamsMobile App Quality Roadmap for DevTest Teams
Mobile App Quality Roadmap for DevTest Teams
 
Continuous Delivery with a PaaS Application
Continuous Delivery with a PaaS ApplicationContinuous Delivery with a PaaS Application
Continuous Delivery with a PaaS Application
 
Code Coverage for Total Security in Application Migrations
Code Coverage for Total Security in Application MigrationsCode Coverage for Total Security in Application Migrations
Code Coverage for Total Security in Application Migrations
 
Adrian marinica continuous integration in the visual studio world
Adrian marinica   continuous integration in the visual studio worldAdrian marinica   continuous integration in the visual studio world
Adrian marinica continuous integration in the visual studio world
 
Continuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and JenkinsContinuous Load Testing with CloudTest and Jenkins
Continuous Load Testing with CloudTest and Jenkins
 
.Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013
.Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013 .Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013
.Net OSS Ci & CD with Jenkins - JUC ISRAEL 2013
 
Spring Boot
Spring BootSpring Boot
Spring Boot
 
SAPUI5/OpenUI5 - Continuous Integration
SAPUI5/OpenUI5 - Continuous IntegrationSAPUI5/OpenUI5 - Continuous Integration
SAPUI5/OpenUI5 - Continuous Integration
 
Unit Tests and Test Seams for abap Hamburg June 2017 presented
Unit Tests and Test Seams for abap Hamburg June 2017   presentedUnit Tests and Test Seams for abap Hamburg June 2017   presented
Unit Tests and Test Seams for abap Hamburg June 2017 presented
 

More from Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Spark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
Spark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
Spark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Spark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
Spark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Spark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
Spark Summit
 

More from Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Recently uploaded

一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 

Recently uploaded (20)

一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 

Continuous Integration for Spark Apps by Sean McIntyre

  • 2. Hi, I’m Sean! © 2015 Uncharted Software Inc.
  • 3. It’s hard to test Spark Apps :( © 2015 Uncharted Software Inc.
  • 4. Case Study: Uncharted Spark Pipeline © 2015 Uncharted Software Inc.
  • 5. Case Study: Uncharted Spark Pipeline Some key issues: ● Ensure reliability ● Prevent regressions ● Maintain compatibility with multiple versions of Spark ● Open-source - need a quick and easy way to evaluate PRs © 2015 Uncharted Software Inc.
  • 6. What is Continuous Integration? © 2015 Uncharted Software Inc.
  • 7. “Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early.” -- ThoughtWorks © 2015 Uncharted Software Inc.
  • 8. “Continuous Integration (CI) is a development practice that is pretty damned important for writing quality software.” -- Me © 2015 Uncharted Software Inc.
  • 9. So, What is Continuous Integration? © 2015 Uncharted Software Inc.
  • 10. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) © 2015 Uncharted Software Inc.
  • 11. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) © 2015 Uncharted Software Inc.
  • 12. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) © 2015 Uncharted Software Inc.
  • 13. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) 4. Commit/push feature branches often © 2015 Uncharted Software Inc.
  • 14. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) 4. Commit/push feature branches often 5. Build (and test) All The Branches © 2015 Uncharted Software Inc.
  • 15. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) 4. Commit/push feature branches often 5. Build (and test) All The Branches 6. Test in a clone of the production environment © 2015 Uncharted Software Inc.
  • 16. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) 4. Commit/push feature branches often 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast © 2015 Uncharted Software Inc.
  • 17. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) 4. Commit/push feature branches often 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds © 2015 Uncharted Software Inc.
  • 18. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) 4. Commit/push feature branches often 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds }duh. © 2015 Uncharted Software Inc.
  • 19. Best Practices, courtesy of Wikipedia 1. Maintain a code repository (Git) 2. Automate the build (Gradle) 3. Tests should be part of the build (ScalaTest) 4. Commit/push feature branches often 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds }...less duh. © 2015 Uncharted Software Inc.
  • 20. Why are these difficult with Apache Spark? 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds © 2015 Uncharted Software Inc.
  • 21. What is a Spark App? © 2015 Uncharted Software Inc.
  • 22. What is a Spark app? Source JAR Spark ? This thing. JAR © 2015 Uncharted Software Inc.
  • 23. And... Source JAR Spark ? We need to test this JAR © 2015 Uncharted Software Inc.
  • 24. But... Source JAR ScalaTest Scala RE By default, we have this JAR (boom) © 2015 Uncharted Software Inc.
  • 25. v1: Squish Spark inside ScalaTest Source JAR ScalaTest with SparkContext So, we try this JAR it works! (sort of) © 2015 Uncharted Software Inc.
  • 26. it works! (sort of) © 2015 Uncharted Software Inc.
  • 27. 6. Test in a clone of the production environment © 2015 Uncharted Software Inc.
  • 28. v2: Squish ScalaTest into Spark Source Test JAR Tests Main.scala Spark JAR Test JAR Test Output JAR © 2015 Uncharted Software Inc.
  • 30. 6. Test in a clone of the production environment © 2015 Uncharted Software Inc.
  • 31. Progress? 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds © 2015 Uncharted Software Inc.
  • 32. What now? 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds © 2015 Uncharted Software Inc.
  • 33. Docker Container (uncharted/sparklet) v3: Squish Spark and Test JAR into Docker Test Output Source Test JAR Tests Main.scala Spark JAR JAR Test JAR © 2015 Uncharted Software Inc.
  • 34. test.sh © 2015 Uncharted Software Inc.
  • 35. build.gradle (excerpt) © 2015 Uncharted Software Inc.
  • 36. Progress? 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds © 2015 Uncharted Software Inc.
  • 37. Travis CI VM Docker Container v4: Squish Docker into Travis CI Test Output Source Test JAR Tests Main.scala Spark JAR JAR Test JAR © 2015 Uncharted Software Inc.
  • 39. Voilà! © 2015 Uncharted Software Inc.
  • 40. Progress? 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds © 2015 Uncharted Software Inc.
  • 41. © 2015 Uncharted Software Inc.
  • 42. © 2015 Uncharted Software Inc.
  • 43. All done! 5. Build (and test) All The Branches 6. Test in a clone of the production environment 7. Keep the build fast 8. Everyone can see the results of builds © 2015 Uncharted Software Inc.
  • 44. Next Steps? Alpine Linux docker-compose Windows (dev environment) support python © 2015 Uncharted Software Inc.