SlideShare a Scribd company logo
1 of 15
CICD Pipeline and delivery of Apache Spark Applications on
the cloud using AWS
Maria Fung
Data Warehouse Development Lead
California State University, Office of the Chancellor
mfung@calstate.edu
Babu Repaka
Business Intelligence Solution Architect
California State University, Office of the Chancellor
brepaka@calstate.edu
October 25, 2020
• The Largest and most
diverse system of 4-
year higher education in
the U.S.
• 23 Campuses
• ~50K Employees
• Nearly 500k enrolled
students this year
CICD Overview
 CSU Past technologies
 CSU Present & Future Cloud with Agile methodology
- Devops
- Dataops
- CICD
Develop
Build
Package
Test
Deploy
Operate
AWS CICD Components
 S3
 CODE COMMIT
 CODE BUILD
 CODE PIPELINE
 CLOUDFORMATION
 CLOUDWATCH
 EMR
 LAMBDA
 SNS
 SQS
Current CICD Process Overview
S3 bucket
Start
Developer A
CodeCommit
Check in code
to Git
Notify Peers to
review pull
request
Automatically
merge Pull
Request
Developers
B,C,D
CloudWatch
Event
Rule
Alarm
SNS
Topic
email
CodeBuild
Code Testing
Pytest.py
Code Coverage
Coverage.py Submit EMR steps
to Dev EMR cluster
Build and
package
dependencies
from Git
CodeBuild CodePipeline
EMR
Source Test Build Deploy
CloudFormation
Production
Deployment
Create Pull
Request and
resolve conflicts
Test
Result
?
review
comments and
changes
Approve
?
Yes
No
Failed
Passed
Lambda
Security for the CI/CD Pipeline
Develop
Build
Package
Test
Deploy
Operate
Security
 Multi-factor authentication (MFA)
 Role assignments and segregation of duties
 Test Cases and acceptable outcomes
 Secured code repositories
 Secured Build and package environment
 Sensitive information
 Weekly Security Assessment Review using AWS Trust Advisor, AWS inspector and AWS Guard Duty
Security Checklist
AWS Trust Advisor is being used to review the following:
Any exceptions raised by the tool will be investigated and address right away
Checklist cost
Checklist fault
tolerant
Checklist
performance
Checklist
security
Checklist
Management and Governance
AWS guard duty
Security, Identity, & Compliance
AWS Inspector
Development Pipeline
Start
Developer create Pull
Request for changes from
developer’s branch to dev
branch
Notify Approvers to approval
pull request
Submit EMR steps on dev
EMR Cluster
Test
Result?
Auto Update Pull Request
comment and merge from
developer’s changes to dev
branch
Failed
Success
Run unit testing and code
coverage
Pull
Request
Approve?
Developer review comments
and commit changes
No
Yes
CloudWatch
SNS
Topic email
Event
Rule
S3
Lambda
CodeCommit
CodeBuild
No
 Code Coverage
 Pytest
 EMR
Pytest
Code coverage
Development Pipeline
EMR
Development Pipeline
Production Deployment
Devops admin
Submit Pull
Request to merge
from Dev to
master branch
Notify Manager
for approval
Update Pull Request
comment and merge to
master branch
Pull
Request
Approve?
No
Yes
Build all dependencies
and packaging all files to
production s3 bucket
Deploy cloudformation
stack to provision
production EMR cluster
and run steps
Developer review
comments and
suggested changes
CloudWatch SNS
Topic emailEvent Rule
Lambda
KMS key
CodeCommit
CodeBuild
CodePipeline
EMR
CloudFormation
S3
S3
Dev Account Prod Account
Lambda
IAM Role Permissions
Cross-account
Role Permissions
STS Assume Role
PROD stack
Create
change set
Execute
change set
 Lesson Learned
Continuous learning and exploration
 Challenges
Lack of jobs orchestration in AWS EMR.
AWS Code Pipeline Cross Account Roles are not transparent on the AWS console
 Next Steps
Explore better jobs orchestration
Analyzing & planning to implement AWS EKS
Thank you for
watching
Maria Fung
mfung@calstate.edu
Babu Repaka
brepaka@calstate.edu
maria-fung-7931001b4
babu-repaka-3167534

More Related Content

What's hot

What's hot (20)

How to teach your data scientist to leverage an analytics cluster with Presto...
How to teach your data scientist to leverage an analytics cluster with Presto...How to teach your data scientist to leverage an analytics cluster with Presto...
How to teach your data scientist to leverage an analytics cluster with Presto...
 
Make your PySpark Data Fly with Arrow!
Make your PySpark Data Fly with Arrow!Make your PySpark Data Fly with Arrow!
Make your PySpark Data Fly with Arrow!
 
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
Scaling Through Simplicity—How a 300 million User Chat App Reduced Data Engin...
 
Google Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teamsGoogle Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teams
 
Developing high frequency indicators using real time tick data on apache supe...
Developing high frequency indicators using real time tick data on apache supe...Developing high frequency indicators using real time tick data on apache supe...
Developing high frequency indicators using real time tick data on apache supe...
 
Accelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud EraAccelerate Analytics and ML in the Hybrid Cloud Era
Accelerate Analytics and ML in the Hybrid Cloud Era
 
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
 
Data Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache ArrowData Science Across Data Sources with Apache Arrow
Data Science Across Data Sources with Apache Arrow
 
Spark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos ErotocritouSpark Summit EU talk by Christos Erotocritou
Spark Summit EU talk by Christos Erotocritou
 
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
 
Running Fast, Interactive Queries on Petabyte Datasets using Presto - AWS Jul...
Running Fast, Interactive Queries on Petabyte Datasets using Presto - AWS Jul...Running Fast, Interactive Queries on Petabyte Datasets using Presto - AWS Jul...
Running Fast, Interactive Queries on Petabyte Datasets using Presto - AWS Jul...
 
Serverless Data Architecture at scale on Google Cloud Platform
Serverless Data Architecture at scale on Google Cloud PlatformServerless Data Architecture at scale on Google Cloud Platform
Serverless Data Architecture at scale on Google Cloud Platform
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
 
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
Conquering the Lambda architecture in LinkedIn metrics platform with Apache C...
 
Fast Data with Apache Ignite and Apache Spark with Christos Erotocritou
Fast Data with Apache Ignite and Apache Spark with Christos ErotocritouFast Data with Apache Ignite and Apache Spark with Christos Erotocritou
Fast Data with Apache Ignite and Apache Spark with Christos Erotocritou
 
BigDL: A Distributed Deep Learning Library on Spark: Spark Summit East talk b...
BigDL: A Distributed Deep Learning Library on Spark: Spark Summit East talk b...BigDL: A Distributed Deep Learning Library on Spark: Spark Summit East talk b...
BigDL: A Distributed Deep Learning Library on Spark: Spark Summit East talk b...
 
Databricks @ Strata SJ
Databricks @ Strata SJDatabricks @ Strata SJ
Databricks @ Strata SJ
 
Trends for Big Data and Apache Spark in 2017 by Matei Zaharia
Trends for Big Data and Apache Spark in 2017 by Matei ZahariaTrends for Big Data and Apache Spark in 2017 by Matei Zaharia
Trends for Big Data and Apache Spark in 2017 by Matei Zaharia
 
Petabytes, Exabytes, and Beyond: Managing Delta Lakes for Interactive Queries...
Petabytes, Exabytes, and Beyond: Managing Delta Lakes for Interactive Queries...Petabytes, Exabytes, and Beyond: Managing Delta Lakes for Interactive Queries...
Petabytes, Exabytes, and Beyond: Managing Delta Lakes for Interactive Queries...
 
Automated Production Ready ML at Scale
Automated Production Ready ML at ScaleAutomated Production Ready ML at Scale
Automated Production Ready ML at Scale
 

Similar to CICD Pipeline and delivery of Apache Spark Applications on the cloud using AWS

Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用
Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用
Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用
Amazon Web Services
 

Similar to CICD Pipeline and delivery of Apache Spark Applications on the cloud using AWS (20)

Automated DevOps Workflows with Chef on AWS
Automated DevOps Workflows with Chef on AWSAutomated DevOps Workflows with Chef on AWS
Automated DevOps Workflows with Chef on AWS
 
Advanced Continuous Delivery on AWS
Advanced Continuous Delivery on AWSAdvanced Continuous Delivery on AWS
Advanced Continuous Delivery on AWS
 
DevOps on AWS - Building Systems to Deliver Faster
DevOps on AWS - Building Systems to Deliver FasterDevOps on AWS - Building Systems to Deliver Faster
DevOps on AWS - Building Systems to Deliver Faster
 
Continuous Delivery Live
Continuous Delivery LiveContinuous Delivery Live
Continuous Delivery Live
 
Intro to AWS Developer Tools feat. AWS Codestar, and AWS SDKs & Developer Res...
Intro to AWS Developer Tools feat. AWS Codestar, and AWS SDKs & Developer Res...Intro to AWS Developer Tools feat. AWS Codestar, and AWS SDKs & Developer Res...
Intro to AWS Developer Tools feat. AWS Codestar, and AWS SDKs & Developer Res...
 
Global Azure 2024 - On-Premises to Azure Cloud: .NET Web App Journey
Global Azure 2024 - On-Premises to Azure Cloud: .NET Web App JourneyGlobal Azure 2024 - On-Premises to Azure Cloud: .NET Web App Journey
Global Azure 2024 - On-Premises to Azure Cloud: .NET Web App Journey
 
Build an app on aws for your first 10 million users (2)
Build an app on aws for your first 10 million users (2)Build an app on aws for your first 10 million users (2)
Build an app on aws for your first 10 million users (2)
 
DevOps at Amazon: A Look at Our Tools and Processes
DevOps at Amazon: A Look at Our Tools and ProcessesDevOps at Amazon: A Look at Our Tools and Processes
DevOps at Amazon: A Look at Our Tools and Processes
 
CSC AWS re:Invent Enterprise DevOps session
CSC AWS re:Invent Enterprise DevOps sessionCSC AWS re:Invent Enterprise DevOps session
CSC AWS re:Invent Enterprise DevOps session
 
Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用
Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用
Track 4 Session 4_ MAD02 MAD 04 如何藉由 CICD 流程管理容器化和無伺服器應用
 
Accelerating Software Delivery with AWS Developer Tools & AWS Mobile services...
Accelerating Software Delivery with AWS Developer Tools & AWS Mobile services...Accelerating Software Delivery with AWS Developer Tools & AWS Mobile services...
Accelerating Software Delivery with AWS Developer Tools & AWS Mobile services...
 
Microsoft Azure For Solutions Architects
Microsoft Azure For Solutions ArchitectsMicrosoft Azure For Solutions Architects
Microsoft Azure For Solutions Architects
 
Azure DevOps Best Practices Webinar
Azure DevOps Best Practices WebinarAzure DevOps Best Practices Webinar
Azure DevOps Best Practices Webinar
 
Intro to AWS Developer Tools, featuring AWS CodeStar
Intro to AWS Developer Tools, featuring AWS CodeStarIntro to AWS Developer Tools, featuring AWS CodeStar
Intro to AWS Developer Tools, featuring AWS CodeStar
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
Kubernetes vs App Service
Kubernetes vs App ServiceKubernetes vs App Service
Kubernetes vs App Service
 
WIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWSWIN401_Migrating Microsoft Applications to AWS
WIN401_Migrating Microsoft Applications to AWS
 
Introducing AWS CodeStar and the AWS CI:CD workflow - AWS Summit Tel Aviv 2017
Introducing AWS CodeStar and the  AWS CI:CD workflow - AWS Summit Tel Aviv 2017Introducing AWS CodeStar and the  AWS CI:CD workflow - AWS Summit Tel Aviv 2017
Introducing AWS CodeStar and the AWS CI:CD workflow - AWS Summit Tel Aviv 2017
 
AWS CodeStar - AWS TelAviv Summit 2017
AWS CodeStar - AWS TelAviv Summit 2017  AWS CodeStar - AWS TelAviv Summit 2017
AWS CodeStar - AWS TelAviv Summit 2017
 
Microservices: Architecting for Innovation - Level 300
Microservices: Architecting for Innovation - Level 300Microservices: Architecting for Innovation - Level 300
Microservices: Architecting for Innovation - Level 300
 

More from Data Con LA

Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA
 

More from Data Con LA (20)

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Recently uploaded

Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
HyderabadDolls
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
gajnagarg
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
ranjankumarbehera14
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
gajnagarg
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
HyderabadDolls
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 

Recently uploaded (20)

SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 

CICD Pipeline and delivery of Apache Spark Applications on the cloud using AWS

  • 1. CICD Pipeline and delivery of Apache Spark Applications on the cloud using AWS Maria Fung Data Warehouse Development Lead California State University, Office of the Chancellor mfung@calstate.edu Babu Repaka Business Intelligence Solution Architect California State University, Office of the Chancellor brepaka@calstate.edu October 25, 2020
  • 2. • The Largest and most diverse system of 4- year higher education in the U.S. • 23 Campuses • ~50K Employees • Nearly 500k enrolled students this year
  • 3. CICD Overview  CSU Past technologies  CSU Present & Future Cloud with Agile methodology - Devops - Dataops - CICD Develop Build Package Test Deploy Operate
  • 4. AWS CICD Components  S3  CODE COMMIT  CODE BUILD  CODE PIPELINE  CLOUDFORMATION  CLOUDWATCH  EMR  LAMBDA  SNS  SQS
  • 5. Current CICD Process Overview S3 bucket Start Developer A CodeCommit Check in code to Git Notify Peers to review pull request Automatically merge Pull Request Developers B,C,D CloudWatch Event Rule Alarm SNS Topic email CodeBuild Code Testing Pytest.py Code Coverage Coverage.py Submit EMR steps to Dev EMR cluster Build and package dependencies from Git CodeBuild CodePipeline EMR Source Test Build Deploy CloudFormation Production Deployment Create Pull Request and resolve conflicts Test Result ? review comments and changes Approve ? Yes No Failed Passed Lambda
  • 6. Security for the CI/CD Pipeline Develop Build Package Test Deploy Operate Security
  • 7.  Multi-factor authentication (MFA)  Role assignments and segregation of duties  Test Cases and acceptable outcomes  Secured code repositories  Secured Build and package environment  Sensitive information  Weekly Security Assessment Review using AWS Trust Advisor, AWS inspector and AWS Guard Duty Security Checklist
  • 8. AWS Trust Advisor is being used to review the following: Any exceptions raised by the tool will be investigated and address right away Checklist cost Checklist fault tolerant Checklist performance Checklist security Checklist Management and Governance
  • 9. AWS guard duty Security, Identity, & Compliance AWS Inspector
  • 10. Development Pipeline Start Developer create Pull Request for changes from developer’s branch to dev branch Notify Approvers to approval pull request Submit EMR steps on dev EMR Cluster Test Result? Auto Update Pull Request comment and merge from developer’s changes to dev branch Failed Success Run unit testing and code coverage Pull Request Approve? Developer review comments and commit changes No Yes CloudWatch SNS Topic email Event Rule S3 Lambda CodeCommit CodeBuild No  Code Coverage  Pytest  EMR
  • 13. Production Deployment Devops admin Submit Pull Request to merge from Dev to master branch Notify Manager for approval Update Pull Request comment and merge to master branch Pull Request Approve? No Yes Build all dependencies and packaging all files to production s3 bucket Deploy cloudformation stack to provision production EMR cluster and run steps Developer review comments and suggested changes CloudWatch SNS Topic emailEvent Rule Lambda KMS key CodeCommit CodeBuild CodePipeline EMR CloudFormation S3 S3 Dev Account Prod Account Lambda IAM Role Permissions Cross-account Role Permissions STS Assume Role PROD stack Create change set Execute change set
  • 14.  Lesson Learned Continuous learning and exploration  Challenges Lack of jobs orchestration in AWS EMR. AWS Code Pipeline Cross Account Roles are not transparent on the AWS console  Next Steps Explore better jobs orchestration Analyzing & planning to implement AWS EKS
  • 15. Thank you for watching Maria Fung mfung@calstate.edu Babu Repaka brepaka@calstate.edu maria-fung-7931001b4 babu-repaka-3167534

Editor's Notes

  1. The CSU awards nearly half of the state’s baccalaureate degrees. 1 in 10 employed graduates came from CSU… representing 1 in 20 college degree holders nationwide! We produce in the neighborhood of 120,000 graduates per year and our more than 3.4 MILLION living alumni are employed in every field across the world
  2. Image source: https://www.shutterstock.com/image-photo/abstract-source-code-background-writing-program-1475214392