Data Science - Delivered Continuously - XConf 2017

Christian Deger
Christian DegerChief Architect at RIO - The Logistic Flow
Arif Wider & Christian Deger
DATA SCIENCE,
DELIVERED CONTINUOUSLY
A CONFERENCE ALL ABOUT TECHNOLOGY
Christian Deger
Chief Architect
cdeger@autoscout24.com
@cdeger
Arif Wider
Senior Consultant/Developer
awider@thoughtworks.com
@arifwider
PL
S
RUS
UA
RO
CZ
D
NL
B
F
A
HR
I
E
BG
TR
18countries
2.4m+cars & motos
10m+users per
month
The task: A consumer-facing data product
6XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
The task: A consumer-facing data product
7XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
The task: A consumer-facing data product
8XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
The prediction model: Random forest
9
Volkswagen GolfCar listings of
last two years
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
How to turn an R-based prediction model
into a high-performance web application?
10
?
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
How to turn an R-based prediction model
into a high-performance web application?
11XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
How to turn an R-based prediction model
into a high-performance web application?
12XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
How to turn an R-based prediction model
into a high-performance web application?
13
 Continuous Delivery!
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Application code in
one repository per
service.
CI
Deployment package
as artifact.
CD
Deliver package to
servers
Typical delivery pipeline
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Continuous delivery pipelines
15
Prediction Model Pipeline
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Continuous delivery pipelines
16
Prediction Model Pipeline
Web Application Pipeline
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
The price for CD: Extensive model validation
17XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
The price for CD: Extensive model validation
18XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Lessons learned
19
Form a cross-functional team of
data scientists & software engineers!
Software engineers
… learn how data scientists work
… and understand the quirks of a prediction model
Data Scientist
… learn about unit testing, stable interfaces, git, etc.
... get quick feedback about the impact of their work
 Model and product iterations become much faster!
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Lessons learned
20
Generating gigabytes of Java code
is a challenge for the JVM
Use the G1 garbage collector
Turn off Tiered Compilation
 Do extensive warm-ups
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Lessons learned – Warm up
21XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Lessons learned
22
The approach of applying Continuous Delivery to
Data Science is useful independently of the tech
 Successfully applied similarly to a Python- and
Spark-based project
 Even more useful when quick model evolution
is required because of rapidly changing inputs
(e.g. user interaction)
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Conclusions
23
 Continuous Delivery allows us to bring prediction
model changes live very quickly.
 Only extensive automated end-to-end tests
provide confidence to deploy to production
automatically.
 Java code generation allows for very low response
times and excellent scalability for high loads but
requires plenty of memory.
XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Conclusions: Price evaluation everywhere
24XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
Conclusions: Price evaluation everywhere
25XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
QUESTIONS?
THANK YOU
Arif Wider & Christian Deger
@arifwider @cdeger
1 of 26

Recommended

Building A Cloud-Native Advanced Logistics Ecosystem by
Building A Cloud-Native Advanced Logistics EcosystemBuilding A Cloud-Native Advanced Logistics Ecosystem
Building A Cloud-Native Advanced Logistics EcosystemChristian Deger
1.5K views99 slides
Cloud native Continuous Delivery by
Cloud native Continuous DeliveryCloud native Continuous Delivery
Cloud native Continuous DeliveryChristian Deger
589 views89 slides
Cloud native Continuous Delivery by
Cloud native Continuous DeliveryCloud native Continuous Delivery
Cloud native Continuous DeliveryChristian Deger
926 views89 slides
GOTO Amsterdam 2017 - Enterprise Fast Lane by
GOTO Amsterdam 2017 - Enterprise Fast LaneGOTO Amsterdam 2017 - Enterprise Fast Lane
GOTO Amsterdam 2017 - Enterprise Fast LaneChristian Deger
640 views49 slides
AWS Cloud For Breakfast - Building Microservices in the Cloud by
AWS Cloud For Breakfast - Building Microservices in the CloudAWS Cloud For Breakfast - Building Microservices in the Cloud
AWS Cloud For Breakfast - Building Microservices in the CloudChristian Deger
876 views43 slides
GOTO Berlin 2016 by
GOTO Berlin 2016GOTO Berlin 2016
GOTO Berlin 2016Christian Deger
621 views50 slides

More Related Content

More from Christian Deger

Microservices in der Cloud - Software Architecture Summit Berlin 2016 by
Microservices in der Cloud - Software Architecture Summit Berlin 2016Microservices in der Cloud - Software Architecture Summit Berlin 2016
Microservices in der Cloud - Software Architecture Summit Berlin 2016Christian Deger
982 views96 slides
Building Microservices in the cloud - GOTO Nights Berlin 2016 by
Building Microservices in the cloud - GOTO Nights Berlin 2016Building Microservices in the cloud - GOTO Nights Berlin 2016
Building Microservices in the cloud - GOTO Nights Berlin 2016Christian Deger
881 views47 slides
Building Microservices in the cloud - Software Architecture Summit 2016 by
Building Microservices in the cloud - Software Architecture Summit 2016Building Microservices in the cloud - Software Architecture Summit 2016
Building Microservices in the cloud - Software Architecture Summit 2016Christian Deger
1.6K views47 slides
Microservices in the cloud at AutoScout24 by
Microservices in the cloud at AutoScout24Microservices in the cloud at AutoScout24
Microservices in the cloud at AutoScout24Christian Deger
1.4K views42 slides
Highway to heaven - Microservices Meetup Dublin by
Highway to heaven - Microservices Meetup DublinHighway to heaven - Microservices Meetup Dublin
Highway to heaven - Microservices Meetup DublinChristian Deger
1.1K views30 slides
Building Microservices in the cloud at AutoScout24 by
Building Microservices in the cloud at AutoScout24Building Microservices in the cloud at AutoScout24
Building Microservices in the cloud at AutoScout24Christian Deger
1.2K views29 slides

More from Christian Deger(10)

Microservices in der Cloud - Software Architecture Summit Berlin 2016 by Christian Deger
Microservices in der Cloud - Software Architecture Summit Berlin 2016Microservices in der Cloud - Software Architecture Summit Berlin 2016
Microservices in der Cloud - Software Architecture Summit Berlin 2016
Christian Deger982 views
Building Microservices in the cloud - GOTO Nights Berlin 2016 by Christian Deger
Building Microservices in the cloud - GOTO Nights Berlin 2016Building Microservices in the cloud - GOTO Nights Berlin 2016
Building Microservices in the cloud - GOTO Nights Berlin 2016
Christian Deger881 views
Building Microservices in the cloud - Software Architecture Summit 2016 by Christian Deger
Building Microservices in the cloud - Software Architecture Summit 2016Building Microservices in the cloud - Software Architecture Summit 2016
Building Microservices in the cloud - Software Architecture Summit 2016
Christian Deger1.6K views
Microservices in the cloud at AutoScout24 by Christian Deger
Microservices in the cloud at AutoScout24Microservices in the cloud at AutoScout24
Microservices in the cloud at AutoScout24
Christian Deger1.4K views
Highway to heaven - Microservices Meetup Dublin by Christian Deger
Highway to heaven - Microservices Meetup DublinHighway to heaven - Microservices Meetup Dublin
Highway to heaven - Microservices Meetup Dublin
Christian Deger1.1K views
Building Microservices in the cloud at AutoScout24 by Christian Deger
Building Microservices in the cloud at AutoScout24Building Microservices in the cloud at AutoScout24
Building Microservices in the cloud at AutoScout24
Christian Deger1.2K views
Highway to heaven - Voxxed Days Belgrade by Christian Deger
Highway to heaven - Voxxed Days BelgradeHighway to heaven - Voxxed Days Belgrade
Highway to heaven - Voxxed Days Belgrade
Christian Deger390 views
Highway to heaven - Microservices Meetup Berlin by Christian Deger
Highway to heaven - Microservices Meetup BerlinHighway to heaven - Microservices Meetup Berlin
Highway to heaven - Microservices Meetup Berlin
Christian Deger544 views
Highway to heaven - XConf Manchester 2015 by Christian Deger
Highway to heaven - XConf Manchester 2015Highway to heaven - XConf Manchester 2015
Highway to heaven - XConf Manchester 2015
Christian Deger1.3K views
Highway to heaven - Microservices Meetup Munich by Christian Deger
Highway to heaven - Microservices Meetup MunichHighway to heaven - Microservices Meetup Munich
Highway to heaven - Microservices Meetup Munich
Christian Deger1.5K views

Recently uploaded

AvizoImageSegmentation.pptx by
AvizoImageSegmentation.pptxAvizoImageSegmentation.pptx
AvizoImageSegmentation.pptxnathanielbutterworth1
6 views14 slides
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ... by
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...DataScienceConferenc1
10 views18 slides
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init... by
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...DataScienceConferenc1
5 views18 slides
Ukraine Infographic_22NOV2023_v2.pdf by
Ukraine Infographic_22NOV2023_v2.pdfUkraine Infographic_22NOV2023_v2.pdf
Ukraine Infographic_22NOV2023_v2.pdfAnastosiyaGurin
1.4K views3 slides
Chapter 3b- Process Communication (1) (1)(1) (1).pptx by
Chapter 3b- Process Communication (1) (1)(1) (1).pptxChapter 3b- Process Communication (1) (1)(1) (1).pptx
Chapter 3b- Process Communication (1) (1)(1) (1).pptxayeshabaig2004
8 views30 slides
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks by
[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks[DSC Europe 23] Aleksandar Tomcic - Adversarial Attacks
[DSC Europe 23] Aleksandar Tomcic - Adversarial AttacksDataScienceConferenc1
5 views20 slides

Recently uploaded(20)

[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ... by DataScienceConferenc1
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh ...
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init... by DataScienceConferenc1
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...
[DSC Europe 23][Cryptica] Martin_Summer_Digital_central_bank_money_Ideas_init...
Ukraine Infographic_22NOV2023_v2.pdf by AnastosiyaGurin
Ukraine Infographic_22NOV2023_v2.pdfUkraine Infographic_22NOV2023_v2.pdf
Ukraine Infographic_22NOV2023_v2.pdf
AnastosiyaGurin1.4K views
Chapter 3b- Process Communication (1) (1)(1) (1).pptx by ayeshabaig2004
Chapter 3b- Process Communication (1) (1)(1) (1).pptxChapter 3b- Process Communication (1) (1)(1) (1).pptx
Chapter 3b- Process Communication (1) (1)(1) (1).pptx
ayeshabaig20048 views
[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf by DataScienceConferenc1
[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf
[DSC Europe 23] Ales Gros - Quantum and Today s security with Quantum.pdf
[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx by DataScienceConferenc1
[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx
[DSC Europe 23] Stefan Mrsic_Goran Savic - Evolving Technology Excellence.pptx
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M... by DataScienceConferenc1
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...
[DSC Europe 23] Milos Grubjesic Empowering Business with Pepsico s Advanced M...
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f... by DataScienceConferenc1
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...
[DSC Europe 23] Matteo Molteni - Implementing a Robust CI Workflow with dbt f...
Listed Instruments Survey 2022.pptx by secretariat4
Listed Instruments Survey  2022.pptxListed Instruments Survey  2022.pptx
Listed Instruments Survey 2022.pptx
secretariat452 views
LIVE OAK MEMORIAL PARK.pptx by ms2332always
LIVE OAK MEMORIAL PARK.pptxLIVE OAK MEMORIAL PARK.pptx
LIVE OAK MEMORIAL PARK.pptx
ms2332always7 views
Data Journeys Hard Talk workshop final.pptx by info828217
Data Journeys Hard Talk workshop final.pptxData Journeys Hard Talk workshop final.pptx
Data Journeys Hard Talk workshop final.pptx
info82821711 views
CRM stick or twist.pptx by info828217
CRM stick or twist.pptxCRM stick or twist.pptx
CRM stick or twist.pptx
info82821711 views
Short Story Assignment by Kelly Nguyen by kellynguyen01
Short Story Assignment by Kelly NguyenShort Story Assignment by Kelly Nguyen
Short Story Assignment by Kelly Nguyen
kellynguyen0120 views

Data Science - Delivered Continuously - XConf 2017

  • 1. Arif Wider & Christian Deger DATA SCIENCE, DELIVERED CONTINUOUSLY
  • 2. A CONFERENCE ALL ABOUT TECHNOLOGY
  • 6. The task: A consumer-facing data product 6XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 7. The task: A consumer-facing data product 7XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 8. The task: A consumer-facing data product 8XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 9. The prediction model: Random forest 9 Volkswagen GolfCar listings of last two years XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 10. How to turn an R-based prediction model into a high-performance web application? 10 ? XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 11. How to turn an R-based prediction model into a high-performance web application? 11XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 12. How to turn an R-based prediction model into a high-performance web application? 12XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 13. How to turn an R-based prediction model into a high-performance web application? 13  Continuous Delivery! XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 14. Application code in one repository per service. CI Deployment package as artifact. CD Deliver package to servers Typical delivery pipeline XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 15. Continuous delivery pipelines 15 Prediction Model Pipeline XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 16. Continuous delivery pipelines 16 Prediction Model Pipeline Web Application Pipeline XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 17. The price for CD: Extensive model validation 17XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 18. The price for CD: Extensive model validation 18XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 19. Lessons learned 19 Form a cross-functional team of data scientists & software engineers! Software engineers … learn how data scientists work … and understand the quirks of a prediction model Data Scientist … learn about unit testing, stable interfaces, git, etc. ... get quick feedback about the impact of their work  Model and product iterations become much faster! XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 20. Lessons learned 20 Generating gigabytes of Java code is a challenge for the JVM Use the G1 garbage collector Turn off Tiered Compilation  Do extensive warm-ups XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 21. Lessons learned – Warm up 21XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 22. Lessons learned 22 The approach of applying Continuous Delivery to Data Science is useful independently of the tech  Successfully applied similarly to a Python- and Spark-based project  Even more useful when quick model evolution is required because of rapidly changing inputs (e.g. user interaction) XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 23. Conclusions 23  Continuous Delivery allows us to bring prediction model changes live very quickly.  Only extensive automated end-to-end tests provide confidence to deploy to production automatically.  Java code generation allows for very low response times and excellent scalability for high loads but requires plenty of memory. XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 24. Conclusions: Price evaluation everywhere 24XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 25. Conclusions: Price evaluation everywhere 25XConf ’17 Hamburg/Manchester Data Science, Delivered Continuously – A. Wider & C. Deger
  • 26. QUESTIONS? THANK YOU Arif Wider & Christian Deger @arifwider @cdeger

Editor's Notes

  1. Hi everybody, thanks a lot for being with us at XConf, thanks for attending this talk
  2. A This is Christian, he is now AutoScout24‘s chief architect but he actually joined AutoScout as a mere Developer, and then first became a team lead, before he moved into a role of a Coding Architect. At AutoScout we as TW have worked a lot with Christian and I think I can say that we‘ve enjoyed each others company quite a bit
  3. C is a developer at TW Germany where Scala is his language of choice, particularly in the context of Big Data applications Before joining TW he has been in academia doing research on applying FP techniques to data synchronization
  4. C AutoScout24 is the largest online car marketplace Europe-wide, with roughly 2.4 million listings on the platform, which means that they have a lot of data about how cars are sold.
  5. A AutoScout has a lot of data about how cars are sold and at what prices. - Now our task was to turn all this data into something actually useful for the end user of the page. - So our task was to create a consumer-facing data product where users can quickly estimate the current value of their car. This works as follows… basic information about the car
  6. A Optionally indicate equipment and condition
  7. A You get a price range
  8. A - What we had when we started working on this was a prediction model because that‘s what the data scientists at AutoScout had already build, the language they used for it was R, and the approach being used for that is called random forest. - Who of you has heard of Random Forest before? - Let‘s have look how this works: The data of the last two years is used to train a prediction model, and what you get out of training are many of such decision trees. - RF is the algorithm that decides … and it is a technique to work agains overfitting, i.e., producing a prediction model that only works on the training data.
  9. C - But our task was to turn this model into a high-performance web application. - And in fact, that is not so common yet in the context of data science, because often, such data is only used for internal decision purposes. - But if you want to create a user facing application, you have a very different situation, where you have to deal with load peaks etc. - And that was also the reason why we ruled out to run an R server in production pretty early. The problem is that R, at least in its open source version does not support multi-threading, thus, scaling for many concurrent requests is extremely difficult.
  10. C Traditional approach that we still see quite often: model is developed by data scientists in some language that suits their way of working best, e.g., R, and then in order to get a good performance, software engineers translate… For linear regression, reimplementing the model is not a big problem. Data scientists already changed the model once.
  11. C - However, with this manual approach, what do you do, if the internal structure of the prediction model changes? If a software engineer has to reimplement theses changes, it first of all takes a long time, and also mistakes can be introduced in that translation. For example changed from a random forest to gradient boosted machines.
  12. A - We therefore looked on how we can automate this and the technology that helped us with that was H2O. - Has anybody heard of H2O? - It‘s a Java based analytics engine that can be programmed using R (which the data scientists liked) and, and that was the important piece for us, provides the possibility to export your fully trained prediction model as Java source code. - This then allowed us to integrate this model generation into a continuous delivery pipeline.
  13. C Commit stage: Unit tests etc. Additional database migration scripts. Blue/ Green delivery on the instance.
  14. C This looks as follows: … - Then Java code is generated, actually in our case gigabytes of Java source code, which is then compiled into a JAR which is uploaded to AWS S3. This the prediction model pipeline. - Now, whenever something is changed in the R-based configuration or, at least as important, when the model should be updated using the latest data from the platform, a new model JAR gets generated automatically, and is deployed to S3.
  15. C - Now for the web application, which we implemented in Scala using the Play Framework, there is another CD pipeline. - This pipeline also generates a JAR, the application JAR which is then deployed to AWS EC2. - Now, everytime when deploying this application, the pipeline also pulls the latest prediction model from S3 and then loads both into the same JVM. - But also when the model is updated, this triggers a redeployment of the web application with the newest model. - This way all prediction model changes made by the data scientists go straight to production and users can immediately benefit.
  16. A - However, this only works, if you have enough confidence to do so. - Therefore we build an extensive model validation workflow. - Let‘s start with how a model is usually trained and how the success of the training is evaluated. - You, that is the data scientist, divides the existing historical data into training data and test data, and those two sets need to be disjunct. Then the model is trained using the training data. - The test data is then used to create test estimations and these results are compared with the actual price in the test data. - Will never be exactly the same but indicate how good the model is. - We want to validate how the model reacts to new data.
  17. A - Further down the pipeline we use these test estimation results for a comprehensive end-to-end model validation. - That means we check whether the JAR that was created by compiling the generated Java code gives us exactly the same results as directly asking the model that was created by the data scientists. - Furthermore, we also check whether this model fufills all the expectations that our web application poses on it. - This is called a consumer-driven contract test (CDC), the web application in this case is the consumer of the model. Only if all those things are green, we release to production.
  18. C
  19. A
  20. C The warmup time becomes especially problematic during an incident. Time to recover is drastically increased. You also need to configure your autoscaling to take the period of high load into account.
  21. A
  22. A
  23. C Price suggestions during listing creation.
  24. C Price labels on list and detail pages for fair price, good price and top price. Price prediction done on listing creation or change.