Featured Project:
Marina Bay Sands Casino Resort, Singapore
Connecting teams project-wide
Big Data in a production environment:
Lessons Learnt
LAST Conference 2017
Mark Grebler - Aconex
What does Big Data mean to you?
Summary
• What is the Insights project
• Big Data for Data Science
• Big Data in a production, user-facing environment
• Lessons Learnt
• Problems still to solve
What’s an Aconex?
Pronounced: Ay-conn-ex
Highly flexible and customisable data model with low-level concepts = useful for many types of projects
Aconex has flexible data
The Insights Project
Highly flexible and customisable data model with low-level concepts = difficult to produce meaningful customer reports
Flexible data needs transformation
The Insights Project
What does it look like?
The Insights Project
Typical Big Data Architectures
Insights architecture
Looks similar to the other architectures
But differences exist
Quotes from Data Engineers interviewed
● “We use the AWS console to deploy new infrastructure”
● “I add new hardware by buying a new box and connecting it to the network”
● “We deploy by copying the jar file to the cluster”
● “We don’t have any CI, I just build it on my box”
● “We test by running it over some data and ensuring it doesn’t crash”
● “We have some rudimentary tests”
What are the differences?
Other Big Data Projects
● Internal client
● Simple authentication
● For Data Scientists
● Single environment
○ Sometimes 2 or 3
● Manual infrastructure management
● Sanity testing
● Manual integration
● Manual deployment
● Unrestricted data access
Insights Project
● External client
● Integrated authentication
● For end users
● Multiple environments
○ Due to data sovereignty (10)
● Infrastructure as code
● Unit → end-to-end testing
● Continuous integration
● Single-step deployment
● Data access restrictions
It’s not always so black and white, but the left side represents quite a lot of other projects I’ve seen.
Lessons Learnt
● VPN to control data access
● Autoscaling application server
● Network independence
● Zero-downtime deployments with automatic rollback
○ Elastic Beanstalk provides this
Lessons Learnt: Infrastructure-as-code
● Must be easily reproducible because we need to do it 10+ times
● Automation of infrastructure management
○ Infrastructure is a core part of the Big Data project, so it must be treated as being as important as our application code
○ Terraform is used to manage the infrastructure, including:
■ Networking and VPN management
■ Security
■ Provisioning VMs and other infrastructure
■ Replication and ingestion of data from Data Centres
■ Database Administration and Automation
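The reproducibility point above can be illustrated with a minimal Python sketch. This is not the project's Terraform configuration: the `Environment` fields, the region names and the CIDR scheme are all hypothetical, and the sketch only shows the idea of deriving every environment from one definition rather than setting each one up by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Settings for one deployed environment (all fields are hypothetical)."""
    name: str
    region: str
    vpc_cidr: str


def build_environments(regions):
    """Derive one Environment per region from a single definition.

    Regions are sorted first, so the same input set always yields the same
    output regardless of the order in which it was written down.
    """
    environments = []
    for index, region in enumerate(sorted(regions)):
        environments.append(
            Environment(
                name=f"insights-{region}",
                region=region,
                # Give each environment its own non-overlapping /16 network.
                vpc_cidr=f"10.{index}.0.0/16",
            )
        )
    return environments


# Placeholder region names; the real list is not given in the slides.
HYPOTHETICAL_REGIONS = ["region-c", "region-a", "region-b"]
environments = build_environments(HYPOTHETICAL_REGIONS)
```

Adding an eleventh environment in this sketch means adding one entry to the list, which is the property the slide asks of the real infrastructure code.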
Lessons Learnt: Access segregation
● Different accounts for testing and production
● Separate VPCs for each environment
● Multiple user roles allow fine-grained control of access
● VPN used as a further level to restrict data access
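As a rough illustration of how the layers above combine, the following Python sketch models the checks in application code. It is an assumption-laden toy: the role names, environments and actions are hypothetical, and in the real project these controls are enforced by separate accounts, VPCs, user roles and the VPN rather than by a function like this.

```python
# Hypothetical roles mapped to the (environment, action) pairs they may use.
ROLE_PERMISSIONS = {
    "developer": {("test", "read"), ("test", "write")},
    "operator": {("test", "read"), ("production", "deploy")},
    "analyst": {("production", "read")},
}


def is_allowed(role, environment, action, on_vpn):
    """Return True only if both the VPN layer and the role layer permit the request."""
    # The VPN acts as a further level: off-VPN requests are refused outright.
    if not on_vpn:
        return False
    # Unknown roles get an empty permission set, so they are denied by default.
    return (environment, action) in ROLE_PERMISSIONS.get(role, set())
```

The design choice shown is deny-by-default: access is granted only when every layer agrees, so a mistake in one layer does not by itself expose production data.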
Lessons Learnt: Integration and deployment
Continuous Integration
Once built, versioned artifacts are pushed to S3 buckets
Deployments
Ansible is used to roll out new versions of the application and transformations
Infrastructure
Terraform controls the base infrastructure
● Deployments run in parallel across environments
● Docker image used for deployments to control dependencies
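The flow above (build a versioned artifact, then roll it out to every environment in parallel) can be sketched in Python. Everything named here is an assumption for illustration: the key layout, the `.tar.gz` suffix and the `deploy` placeholder are hypothetical, and the real rollout is performed by Ansible, not by this code.

```python
from concurrent.futures import ThreadPoolExecutor


def artifact_key(name, version):
    """Build a storage key that embeds the version (hypothetical layout)."""
    return f"{name}/{version}/{name}-{version}.tar.gz"


def deploy(environment, key):
    """Placeholder for one rollout; a real version would invoke the deployment tool."""
    return f"{environment}: deployed {key}"


def deploy_everywhere(environments, key):
    """Run the same deployment against every environment in parallel.

    Results come back in the same order as the input list.
    """
    # ThreadPoolExecutor requires at least one worker, even for an empty list.
    workers = max(1, len(environments))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda env: deploy(env, key), environments))
```

Because the version is part of the key, every environment receives exactly the same build, and rolling back amounts to deploying an earlier key.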
Lessons Learnt: Automate Testing
● Big Data testing is hard
● Automated unit tests to ensure transformations are correct
○ We pair with our QA to generate the data, and validate the expected output for the unit tests
○ TDD-ish, but testing is often done after development
● Automated integration tests using a large data set
○ To ensure regressions haven’t occurred
● Manual end-to-end sanity tests
○ This should be automated in the future
● Manual exploratory testing
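The unit-testing approach above (small hand-built input, expected output agreed in advance) might look like the following Python sketch. The transformation `count_documents_by_status` and its field names are hypothetical stand-ins, not the project's actual transformations.

```python
import unittest


def count_documents_by_status(rows):
    """Toy transformation: count rows per (project, status) pair."""
    counts = {}
    for row in rows:
        key = (row["project"], row["status"])
        counts[key] = counts.get(key, 0) + 1
    return counts


class CountDocumentsByStatusTest(unittest.TestCase):
    def test_counts_each_project_and_status(self):
        # Input data small enough to be written and checked by hand.
        rows = [
            {"project": "tower", "status": "approved"},
            {"project": "tower", "status": "approved"},
            {"project": "tower", "status": "rejected"},
            {"project": "bridge", "status": "approved"},
        ]
        # Expected output agreed before running the transformation.
        expected = {
            ("tower", "approved"): 2,
            ("tower", "rejected"): 1,
            ("bridge", "approved"): 1,
        }
        self.assertEqual(count_documents_by_status(rows), expected)

    def test_empty_input_gives_empty_report(self):
        self.assertEqual(count_documents_by_status([]), {})
```

Saved as a module, the tests can be run with `python -m unittest`. The same pattern does not remove the data-generation cost the slides mention, since someone still has to write the input rows and the expected result.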
Problems to resolve
● Testing
○ Big Data testing is time-consuming
■ Particularly around data generation
○ How to effectively automate testing of the infrastructure
○ How to automate end-to-end sanity testing
● Infrastructure
○ CI/CD with Terraform
○ So many moving parts make management difficult
● Ingestion and transformations
○ How to move from batch processing to incremental or streaming
○ Removing the database clones
● Effectively communicating to the business what we’re doing and why
○ Why are things so slow?



Editor's Notes

  1. Who's had a house built for them, built their own house, or organised a significant renovation? How many documents were needed? How many conversations were had? Think of the number of documents to build a skyscraper, or a refinery, etc.
  2. Looks similar. From the outside, no real differences.