made with from innovation lab
AI Hackathon.
Team: Vera Ekimenko
Technology Stack.
Spark for data wrangling
Spark ML Random Forests for the model
HTMLUnit for web scraping
Approach - Phases
Phases:
1. 0 days before Departure (all given data) <= used for evaluation
2. 2 days before Departure
3. 5 days before Departure
4. 14 days before Departure
5. 100 days before Departure
6. 0 days after Booking
7. 7 days after Booking
[Timeline diagram: Booking, 7 days after Booking, Evaluation Date, Departure]
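The phases above can be read as cut-off dates: the model for a phase may only see events recorded on or before that phase's evaluation date. The sketch below is one possible reading in plain Python, not the Spark code used by the team; the `Phase` class, `evaluation_date`, `events_known_at`, and the `(date, type)` event tuples are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Phase:
    anchor: str  # "departure" (days before) or "booking" (days after)
    days: int


# The seven phases listed on the slide, in the same order.
PHASES = [
    Phase("departure", 0), Phase("departure", 2), Phase("departure", 5),
    Phase("departure", 14), Phase("departure", 100),
    Phase("booking", 0), Phase("booking", 7),
]


def evaluation_date(phase, booking, departure):
    """Return the date up to which data is assumed to be visible in a phase."""
    if phase.anchor == "departure":
        return departure - timedelta(days=phase.days)
    return booking + timedelta(days=phase.days)


def events_known_at(events, phase, booking, departure):
    """Keep only (event_date, event_type) tuples dated on or before the cut-off."""
    cutoff = evaluation_date(phase, booking, departure)
    return [event for event in events if event[0] <= cutoff]
```

For a booking made on 2018-01-01, the "7 days after Booking" phase would then use only events dated up to 2018-01-08.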
Approach - Feature Engineering
[Timeline diagram: first and last events between Booking and Departure]
Event types:
1. TLT action
2. Passenger details
3. Payments
4. Service requests
5. Tickets issue
Dates
• Every event has a date
Duration since/prior
• Every range has 2 dates
Number of days
• Every action has 4 time features
Counts
• Every action is an event
Occurrence since/prior
• Calculate how many times it occurred
Sum of occurrence
• Every action has 2 count features
• The number of days between the booking and the first/last addition of individual passenger details
• The number of days between the first/last payments and the departure
• The number of additions of TLT record
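As a rough illustration of the time and count features described above, the sketch below derives the four day-difference features and an occurrence count for one action type. It is plain Python rather than Spark, and the function names, the `prefix` argument, and the choice of `None` for actions that never occurred are assumptions made for this example; only the feature-name suffixes follow the annex.

```python
from datetime import date


def time_features(prefix, event_dates, booking, departure):
    """Four time features for one action type: first/last event, since booking and prior to departure."""
    suffixes = ("FromDays", "FromPriorDays", "ToDays", "ToPriorDays")
    if not event_dates:
        # Assumption: an action that never occurred yields missing values.
        return {prefix + suffix: None for suffix in suffixes}
    first, last = min(event_dates), max(event_dates)
    return {
        prefix + "FromDays": (first - booking).days,         # booking -> first event
        prefix + "FromPriorDays": (departure - first).days,  # first event -> departure
        prefix + "ToDays": (last - booking).days,            # booking -> last event
        prefix + "ToPriorDays": (departure - last).days,     # last event -> departure
    }


def occurrence_sum(event_dates):
    """Count feature: how many times the action occurred."""
    return len(event_dates)
```

For example, payments on 2018-01-11 and 2018-02-19 for a booking made on 2018-01-01 with departure on 2018-03-01 give `PaymentsFromDays = 10` and `PaymentsToPriorDays = 10`.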
Travel types based on segments analysis
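Example itineraries:
MELDXB-DXBMAA-CCUDXB-DXBMEL
SINDXB-DXBATH-ATHDXB-DXBSIN
WAWDXB-DXBIKA-DXBWAW
DACDXB-DXBBAH
Travel types: Two destinations, One way, Disperse, One destination
[Chart: share of bookings per travel type; labels in extraction order: 55%, 37%, 4%, 5%]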
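The itinerary strings above appear to join six-letter segments (origin and destination airport codes) with hyphens. The sketch below only parses that format and checks whether the journey ends where it started; it does not reproduce the deck's travel-type rules, which are not spelled out. `parse_segments` and `returns_to_boarding_point` are hypothetical helper names.

```python
def parse_segments(itinerary):
    """Split 'SINDXB-DXBATH' into [('SIN', 'DXB'), ('DXB', 'ATH')]."""
    legs = []
    for segment in itinerary.split("-"):
        if len(segment) != 6:
            raise ValueError(f"expected two 3-letter airport codes, got {segment!r}")
        legs.append((segment[:3], segment[3:]))
    return legs


def returns_to_boarding_point(itinerary):
    """True when the last leg arrives at the airport of the first departure."""
    legs = parse_segments(itinerary)
    return legs[-1][1] == legs[0][0]
```

Under this reading, `DACDXB-DXBBAH` does not return to its boarding point, which is consistent with a one-way journey, while `SINDXB-DXBATH-ATHDXB-DXBSIN` does.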
External data
Sources:
• Passport Index
• TCdata360
• Holidays
• Airports
• USA travel advisories
Joined via segments on:
• Boarding Point
• Destination 1
• Destination 2
• Departure Date
External data – airports
External data – TC data 360
Travel and Tourism Competitiveness Report 2017
External data – Passport Index
Visa requirements by country
External data – Holidays
External data – USA travel advisories
Performance
Accuracy
[Chart: accuracy (%), with external data vs. without external data]
External data boost
[Chart: boost (%) from external data in Accuracy, PR AUC, and ROC AUC]
Feature Importance for different phases - 1
Feature Importance for different phases - 2
Scalability
Technical scalability
Features scalability
airport
• Country data
• Politics
organizer
• Travel agency
• Regular group travellers
season
• Christmas
• Events
Self-learning
Initial model
• Selected features
• Train data
Test accuracy nightly
• Monitor model deterioration
• Re-generate the adjusted model
Adjustments
• New features
• New data
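The monitoring step of this loop could be sketched as a simple rule: re-generate the model when recent nightly test accuracy falls noticeably below the accuracy of the initial model. The function name, the `tolerance` and `window` parameters, and their default values are assumptions for illustration; the deck does not state the actual criterion.

```python
def needs_retraining(baseline_accuracy, nightly_accuracies, tolerance=0.02, window=3):
    """Flag deterioration when the mean of the last `window` nightly accuracies
    is more than `tolerance` below the baseline accuracy."""
    if len(nightly_accuracies) < window:
        # Not enough nightly runs yet to judge a trend.
        return False
    recent = nightly_accuracies[-window:]
    return sum(recent) / window < baseline_accuracy - tolerance
```

Averaging over a short window rather than reacting to a single night is a design choice meant to avoid retraining on one noisy evaluation.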
Reasons why it’s the best solution
• Native for HELIX -> easy to deploy and maintain
• Low maintenance -> easy to update the model
• Fast and scalable -> evaluate group bookings nightly with no fuss
• Good foundation for other models -> Recommender for new stations
• Pluggable -> can be used to enrich the existing models
• Transparency -> easy to explain to non-technical audiences how the model works, using Feature Importance
• Robustness -> the model works even if data quality is not perfect
Annex (1) – Full list of features used in the model
PAX - Total number of passengers in the group
Pcc2City - 1 = PCC equals City
IsGWS - 1 = GWS_ID is not empty
DepDateDays - the number of days between the booking and the departure
NumberSegments - the number of segments booked
BookingToRequestDays - the number of days between the request and the booking
RequestCreatedPriorDays - the number of days between the request and the departure
IndPaxAdditionFromDays - the number of days between the booking and the first addition of individual passenger details
IndPaxAdditionFromPriorDays - the number of days between the first addition of individual passenger details and the departure
IndPaxAdditionToDays - the number of days between the booking and the last addition of individual passenger details
IndPaxAdditionToPriorDays - the number of days between the last addition of individual passenger details and the departure
IndividualPaxAdditionSum - the number of additions of individual passenger details
IndividualPaxRemovalSum - the number of removals of individual passenger details
PaymentsFromDays - the number of days between the booking and the first payment
PaymentsFromPriorDays - the number of days between the first payment and the departure
PaymentsToDays - the number of days between the booking and the last payment
PaymentsToPriorDays - the number of days between the last payment and the departure
IncludedCSR - The payment is included in the sales report
Annex (2) – Full list of features used in the model
TLTActionDateAdditionsFromDays - the number of days between the booking and the first addition of TLT record
TLTActionDateAdditionsFromPriorDays - the number of days between the first addition of TLT record and the departure
TLTActionDateAdditionsToDays - the number of days between the booking and the last addition of TLT record
TLTActionDateAdditionsToPriorDays - the number of days between the last addition of TLT record and the departure
TLTActionDateRemovalFromDays - the number of days between the booking and the first removal of TLT record
TLTActionDateRemovalFromPriorDays - the number of days between the first removal of TLT record and the departure
TLTActionDateRemovalToDays - the number of days between the booking and the last removal of TLT record
TLTActionDateRemovalToPriorDays - the number of days between the last removal of TLT record and the departure
TLTAdditionsSum - the number of additions of TLT record
TLTRemovalSum - the number of removals of TLT record
ServicesCount - The number of the service requests added
isonedest - The journey has one destination
ismultidest - The journey has two destinations
ismultirtn - The journey has multiple returning points
isoneleg - The journey is one way
isgathering - The journey has multiple boarding points
Annex (3) – Full list of features used in the model
ServicesDepartureDateMinDays - the number of days between the booking and the earliest departure date for the added services
ServicesDepartureDateMinPriorDays - the number of days between the earliest departure date for the added services and the departure
FirstDepDateMonth - the month of the departure
FirstDepDateDay - the day of the departure
IsIATA - 1 = The agent is a member of IATA
ACCEPTED_GROUP_SIZE - The accepted group size
ActuallyGivenAndAccepted - The difference between the accepted group size and the total number of passengers
KidPerAdult - The number of infant and child passengers per adult passenger
NotAcceptedAdult - The difference between the requested number of adult passengers and the accepted number of adult passengers
NotAcceptedChild - The difference between the requested number of child passengers and the accepted number of child passengers
NotAcceptedInfant - The difference between the requested number of infant passengers and the accepted number of infant passengers
ACC_ADULT - The accepted number of adult passengers
ACC_CHILD - The accepted number of child passengers
ACC_INF - The accepted number of infant passengers
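The group-size features in this annex are simple arithmetic on requested and accepted passenger counts. The sketch below shows one way they might be computed; the function and parameter names, the sign convention (requested minus accepted, accepted minus total), the use of accepted counts for `KidPerAdult`, and returning `None` when there are no adults are all assumptions, since the deck gives only the verbal definitions.

```python
def group_size_features(pax, accepted_group_size, requested, accepted):
    """`requested` and `accepted` are (adults, children, infants) tuples."""
    req_adult, req_child, req_inf = requested
    acc_adult, acc_child, acc_inf = accepted
    return {
        "ActuallyGivenAndAccepted": accepted_group_size - pax,
        # Assumption: the ratio is undefined (None) for groups without adults.
        "KidPerAdult": (acc_child + acc_inf) / acc_adult if acc_adult else None,
        "NotAcceptedAdult": req_adult - acc_adult,
        "NotAcceptedChild": req_child - acc_child,
        "NotAcceptedInfant": req_inf - acc_inf,
    }
```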
Annex (4) – Full list of features used in the model
airport_infrastructure - The difference in levels of airport infrastructure in the first destination country
business_environment - The difference in levels of business environment in the first destination country
culresources_bustravel - The difference in levels of cultural resources and business travel in the first destination country
enabling_environment - The difference in levels of enabling environment in the first destination country
environmental_sustainability - The difference in levels of environmental sustainability in the first destination country
tourism_priority - The difference in levels of tourism priority in the first destination country
ground_port_infrastructure - The difference in levels of ground port infrastructure in the first destination country
health_hygiene - The difference in levels of health hygiene in the first destination country
labor_market - The difference in levels of labour market in the first destination country
infrastructure_subindex - The difference in levels of infrastructure sub-index in the first destination country
international_openness - The difference in levels of international openness in the first destination country
natural_cultural_resources - The difference in levels of natural and cultural resources in the first destination country
natural_resources - The difference in levels of natural resources in the first destination country
price_competitiveness - The difference in levels of price competitiveness in the first destination country
safety_security - The difference in levels of safety and security in the first destination country
tourist_infrastructure - The difference in levels of tourist infrastructure in the first destination country
travel_ict_readiness - The difference in levels of travel and tourism ICT readiness in the first destination country
travel_policy - The difference in levels of travel policy in the first destination country
travel_competitiveness - The difference in levels of Travel and Tourism policy and enabling conditions in the first destination country
Annex (5) – Full list of features used in the model
dest1_passport_requirements - The level of difficulty to get a visa to the first destination country
dest2_passport_requirements - The level of difficulty to get a visa to the second destination country if any
dest1_distance - The geographical distance between the boarding point and the first destination
dest1_timediff - The time lag between the boarding point and the first destination
adv_dest1_levelNN - USA travel advisory level for the first destination country
adv_dest2_levelNN - USA travel advisory level for the second destination country if any
AroundHoliday - The departure date is a public holiday (+/- 3 days) in the origin country
AroundWeekend2 - The departure date is a weekend (+/- 1 day) in the origin country
countriesOHE - The origin country (one-hot encoded)
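The two calendar flags above can be illustrated with a short sketch. It assumes the public holidays of the origin country are already available as a list of dates and that the weekend is Saturday and Sunday, which does not hold in every country; the function names and the `window_days` parameter are hypothetical.

```python
from datetime import date, timedelta


def around_holiday(departure, holidays, window_days=3):
    """True when the departure is within +/- window_days of any public holiday."""
    return any(abs((departure - holiday).days) <= window_days for holiday in holidays)


def around_weekend(departure, window_days=1):
    """True when the departure is within +/- window_days of a Saturday or Sunday."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    return any(
        (departure + timedelta(days=offset)).weekday() >= 5
        for offset in range(-window_days, window_days + 1)
    )
```

With 25 December as the only holiday, a departure on 22 December counts as around a holiday, while one on 21 December does not.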
Thank You

꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 

Artificial Intelligence Hackathon

  • 1. made with from innovation lab AI Hackathon. Team: Vera Ekimenko
  • 2. Technology Stack. Spark for data wrangling; Spark ML Random Forests for the model; HTMLUnit for web scraping.
  • 3. Approach - Phases
      1. 0 days before Departure (all given data), used for evaluation
      2. 2 days before Departure
      3. 5 days before Departure
      4. 14 days before Departure
      5. 100 days before Departure
      6. 0 days after Booking
      7. 7 days after Booking
      (Timeline figure labels: Booking, 7 days, Evaluation Date, Departure.)
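One way to read the phases on slide 3 is as cut-off dates: for each phase, only events dated on or before the evaluation date are visible to the model. The Python sketch below illustrates that reading. The `PHASES` table, the function names, the `(name, date)` event tuples, and the choice to include events on the cut-off day itself are assumptions made for illustration, not details taken from the slides.

```python
from datetime import date, timedelta

# Phase number -> (anchor, offset in days), following the list on slide 3.
# "N days before Departure" is encoded as a negative offset from the departure.
PHASES = {
    1: ("departure", 0),
    2: ("departure", -2),
    3: ("departure", -5),
    4: ("departure", -14),
    5: ("departure", -100),
    6: ("booking", 0),
    7: ("booking", 7),
}


def evaluation_date(phase, booking, departure):
    """Return the evaluation (cut-off) date for a phase."""
    anchor, offset = PHASES[phase]
    base = departure if anchor == "departure" else booking
    return base + timedelta(days=offset)


def visible_events(events, phase, booking, departure):
    """Keep only (name, date) events dated on or before the cut-off.

    Assumption: events that happen on the evaluation date are included.
    """
    cutoff = evaluation_date(phase, booking, departure)
    return [event for event in events if event[1] <= cutoff]
```

Under this reading, phase 1 sees every recorded event, while phase 6 sees only what was known on the booking day.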
  • 4. Approach - Feature Engineering
      (Timeline figure labels: Booking, First, Last, Departure.)
      Event types:
      - 1. TLT action
      - 2. Passenger details
      - 3. Payments
      - 4. Service requests
      - 5. Ticket issue
      Time features:
      - Dates: every event has a date
      - Duration since/prior: every range has 2 dates
      - Number of days: every action has 4 time features
      Count features:
      - Counts: every action is an event
      - Occurrence since/prior: calculate how many times it occurred
      - Sum of occurrence: every action has 2 count features
      Examples:
      - The number of days between the booking and the first/last addition of individual passenger details
      - The number of days between the first/last payments and the departure
      - The number of additions of TLT record
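The "4 time features" and "2 count features" per action on slide 4 can be sketched in plain Python as follows. The suffixes `FromDays`, `FromPriorDays`, `ToDays`, and `ToPriorDays` follow the naming pattern in the annex; the function names, the `"add"`/`"remove"` action labels, the generic `AdditionSum`/`RemovalSum` keys, and returning `None` for an action that never occurred are assumptions for illustration only.

```python
from datetime import date

SUFFIXES = ("FromDays", "FromPriorDays", "ToDays", "ToPriorDays")


def time_features(prefix, event_dates, booking, departure):
    """Four time features for one action type (for example, payments)."""
    if not event_dates:
        # Assumption: an action that never occurred yields missing values.
        return {prefix + suffix: None for suffix in SUFFIXES}
    first, last = min(event_dates), max(event_dates)
    return {
        prefix + "FromDays": (first - booking).days,         # booking -> first event
        prefix + "FromPriorDays": (departure - first).days,  # first event -> departure
        prefix + "ToDays": (last - booking).days,            # booking -> last event
        prefix + "ToPriorDays": (departure - last).days,     # last event -> departure
    }


def count_features(prefix, actions):
    """Two count features: how many additions and how many removals."""
    return {
        prefix + "AdditionSum": sum(1 for a in actions if a == "add"),
        prefix + "RemovalSum": sum(1 for a in actions if a == "remove"),
    }
```

For a booking on 2018-01-01 with departure on 2018-03-02 and payments on 2018-01-11 and 2018-02-10, `time_features("Payments", ...)` gives 10, 50, 40, and 20 days respectively.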
  • 5. Travel types based on segments analysis
      Example segment strings: MELDXB-DXBMAA-CCUDXB-DXBMEL; SINDXB-DXBATH-ATHDXB-DXBSIN; WAWDXB-DXBIKA-DXBWAW; DACDXB-DXBBAH
      Travel types (chart labels): Two destinations, One way, Disperse, One destination
      Shares (chart values): 55%, 37%, 4%, 5%
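The segment strings on slide 5, such as MELDXB-DXBMAA, are pairs of three-letter airport codes joined by hyphens. The slides do not state the rules used to assign a travel type, so the sketch below only shows how such a string could be parsed and which simple signals could be derived from it. Treating DXB as the hub (it appears in every example), the function names, and the one-way test are all assumptions, not the author's classification logic.

```python
def parse_segments(route):
    """Split 'MELDXB-DXBMAA' into [('MEL', 'DXB'), ('DXB', 'MAA')]."""
    legs = []
    for segment in route.split("-"):
        if len(segment) != 6:
            raise ValueError(f"expected two 3-letter codes, got {segment!r}")
        legs.append((segment[:3], segment[3:]))
    return legs


def visited_airports(route, hub="DXB"):
    """Airports touched other than the boarding point and the assumed hub."""
    legs = parse_segments(route)
    boarding_point = legs[0][0]
    seen = {code for leg in legs for code in leg}
    return seen - {boarding_point, hub}


def is_one_way(route):
    """Assumption: a journey is one way if it does not end where it started."""
    legs = parse_segments(route)
    return legs[-1][1] != legs[0][0]
```

With these helpers, `visited_airports("MELDXB-DXBMAA-CCUDXB-DXBMEL")` yields MAA and CCU, and `is_one_way("DACDXB-DXBBAH")` is true because the journey ends at BAH rather than DAC.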
  • 6. External data
      Sources: Passport Index, TCdata 360, Holidays, Airports, USA travel advisories
      Booking fields shown: Segments (Boarding Point, Destination 1, Destination 2), Departure Date
  • 7. External data – airports
  • 8. External data – TC data 360. Travel and Tourism Competitiveness Report 2017
  • 9. External data – Passport Index. Visa requirements by country
  • 10. External data – Holidays
  • 11. External data – USA travel advisories
  • 12. Performance
  • 13. Accuracy (chart, axis 0% to 120%): With external data, Without external data
  • 14. External data boost (chart, axis 0.00% to 7.00%): Accuracy, PR AUC, ROC AUC
  • 15. Feature Importance for different phases - 1
  • 16. Feature Importance for different phases - 2
  • 17. Scalability
      Technical scalability
      Features scalability:
      - airport: Country data, Politics
      - organizer: Travel agency, Regular group travellers
      - season: Christmas, Events
  • 18. Self-learning
      - Initial model: Selected features, Train data
      - Test accuracy nightly: Monitor model deterioration, Re-generate the adjusted model
      - Adjustments: New features, New data
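The "test accuracy nightly, monitor model deterioration" step on slide 18 could be implemented as a simple threshold check, sketched below in Python. The slides give no thresholds or window sizes, so `tolerance`, `window`, and the function name are hypothetical values chosen only to make the idea concrete.

```python
def needs_retraining(baseline_accuracy, nightly_accuracies, tolerance=0.02, window=3):
    """Flag deterioration from a series of nightly accuracy scores.

    Returns True when the mean of the last `window` scores is more than
    `tolerance` below the accuracy of the initial model. Both parameters
    are hypothetical defaults, not values from the slides.
    """
    if len(nightly_accuracies) < window:
        # Not enough nightly runs yet to judge a trend.
        return False
    recent = nightly_accuracies[-window:]
    return baseline_accuracy - sum(recent) / window > tolerance
```

Averaging over a short window rather than reacting to a single night is a design choice that avoids regenerating the model on one noisy evaluation.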
  • 19. Reasons why it's the best solution
      - Native for HELIX -> easy to deploy and maintain
      - Low maintenance -> easy to update the model
      - Fast and scalable -> evaluate group bookings nightly with no fuss
      - Good foundation for other models -> Recommender for new stations
      - Pluggable -> can be used to enrich the existing models
      - Transparency -> easy to explain to a non-technical audience how the model works, using Feature Importance
      - Robustness -> the model works even if data quality is not perfect
  • 20. Annex (1) – Full list of features used in the model
      PAX - the total number of passengers in the group
      Pcc2City - 1 = PCC equals City
      IsGWS - 1 = GWS_ID is not empty
      DepDateDays - the number of days between the booking and the departure
      NumberSegments - the number of segments booked
      BookingToRequestDays - the number of days between the request and the booking
      RequestCreatedPriorDays - the number of days between the request and the departure
      IndPaxAdditionFromDays - the number of days between the booking and the first addition of individual passenger details
      IndPaxAdditionFromPriorDays - the number of days between the first addition of individual passenger details and the departure
      IndPaxAdditionToDays - the number of days between the booking and the last addition of individual passenger details
      IndPaxAdditionToPriorDays - the number of days between the last addition of individual passenger details and the departure
      IndividualPaxAdditionSum - the number of additions of individual passenger details
      IndividualPaxRemovalSum - the number of removals of individual passenger details
      PaymentsFromDays - the number of days between the booking and the first payment
      PaymentsFromPriorDays - the number of days between the first payment and the departure
      PaymentsToDays - the number of days between the booking and the last payment
      PaymentsToPriorDays - the number of days between the last payment and the departure
      IncludedCSR - the payment is included in the sales report
  • 21. Annex (2) – Full list of features used in the model
      TLTActionDateAdditionsFromDays - the number of days between the booking and the first addition of TLT record
      TLTActionDateAdditionsFromPriorDays - the number of days between the first addition of TLT record and the departure
      TLTActionDateAdditionsToDays - the number of days between the booking and the last addition of TLT record
      TLTActionDateAdditionsToPriorDays - the number of days between the last addition of TLT record and the departure
      TLTActionDateRemovalFromDays - the number of days between the booking and the first removal of TLT record
      TLTActionDateRemovalFromPriorDays - the number of days between the first removal of TLT record and the departure
      TLTActionDateRemovalToDays - the number of days between the booking and the last removal of TLT record
      TLTActionDateRemovalToPriorDays - the number of days between the last removal of TLT record and the departure
      TLTAdditionsSum - the number of additions of TLT record
      TLTRemovalSum - the number of removals of TLT record
      ServicesCount - the number of service requests added
      isonedest - the journey has one destination
      ismultidest - the journey has two destinations
      ismultirtn - the journey has multiple returning points
      isoneleg - the journey is one way
      isgathering - the journey has multiple boarding points
  • 22. Annex (3) – Full list of features used in the model
      ServicesDepartureDateMinDays - the number of days between the booking and the earliest departure date for the added services
      ServicesDepartureDateMinPriorDays - the number of days between the earliest departure date for the added services and the departure
      FirstDepDateMonth - the month of the departure
      FirstDepDateDay - the day of the departure
      IsIATA - 1 = the agent is a member of IATA
      ACCEPTED_GROUP_SIZE - the accepted group size
      ActuallyGivenAndAccepted - the difference between the accepted group size and the total number of passengers
      KidPerAdult - the number of infant and child passengers per adult passenger
      NotAcceptedAdult - the difference between the requested and accepted numbers of adult passengers
      NotAcceptedChild - the difference between the requested and accepted numbers of child passengers
      NotAcceptedInfant - the difference between the requested and accepted numbers of infant passengers
      ACC_ADULT - the accepted number of adult passengers
      ACC_CHILD - the accepted number of child passengers
      ACC_INF - the accepted number of infant passengers
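The passenger-count features in Annex (3) are simple arithmetic on the requested and accepted numbers. The Python sketch below shows one possible computation. The `GroupRequest` record and its field names are hypothetical, and three details are assumptions rather than facts from the slides: differences are taken as requested minus accepted (and accepted group size minus PAX), KidPerAdult uses the accepted counts, and a group without adults yields `None`.

```python
from dataclasses import dataclass


@dataclass
class GroupRequest:
    # Hypothetical record layout; the field names are not from the slides.
    req_adult: int
    req_child: int
    req_inf: int
    acc_adult: int
    acc_child: int
    acc_inf: int
    accepted_group_size: int
    pax: int


def passenger_features(g):
    """Derive the ratio and difference features described in Annex (3)."""
    kids = g.acc_child + g.acc_inf
    return {
        # Assumption: sign convention is requested minus accepted.
        "NotAcceptedAdult": g.req_adult - g.acc_adult,
        "NotAcceptedChild": g.req_child - g.acc_child,
        "NotAcceptedInfant": g.req_inf - g.acc_inf,
        # Assumption: accepted group size minus total passengers.
        "ActuallyGivenAndAccepted": g.accepted_group_size - g.pax,
        # Assumption: undefined (None) when there are no adults.
        "KidPerAdult": kids / g.acc_adult if g.acc_adult else None,
    }
```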
  • 23. Annex (4) – Full list of features used in the model
      airport_infrastructure - the difference in levels of airport infrastructure in the first destination country
      business_environment - the difference in levels of business environment in the first destination country
      culresources_bustravel - the difference in levels of cultural resources and business travel in the first destination country
      enabling_environment - the difference in levels of enabling environment in the first destination country
      environmental_sustainability - the difference in levels of environmental sustainability in the first destination country
      tourism_priority - the difference in levels of tourism priority in the first destination country
      ground_port_infrastructure - the difference in levels of ground port infrastructure in the first destination country
      health_hygiene - the difference in levels of health and hygiene in the first destination country
      labor_market - the difference in levels of labour market in the first destination country
      infrastructure_subindex - the difference in levels of infrastructure sub-index in the first destination country
      international_openness - the difference in levels of international openness in the first destination country
      natural_cultural_resources - the difference in levels of natural and cultural resources in the first destination country
      natural_resources - the difference in levels of natural resources in the first destination country
      price_competitiveness - the difference in levels of price competitiveness in the first destination country
      safety_security - the difference in levels of safety and security in the first destination country
      tourist_infrastructure - the difference in levels of tourist infrastructure in the first destination country
      travel_ict_readiness - the difference in levels of travel and tourism ICT readiness in the first destination country
      travel_policy - the difference in levels of travel policy in the first destination country
      travel_competitiveness - the difference in levels of Travel and Tourism policy and enabling conditions in the first destination country
  • 24. Annex (5) – Full list of features used in the model
      dest1_passport_requirements - the level of difficulty of getting a visa to the first destination country
      dest2_passport_requirements - the level of difficulty of getting a visa to the second destination country, if any
      dest1_distance - the geographical distance between the boarding point and the first destination
      dest1_timediff - the time difference between the boarding point and the first destination
      adv_dest1_levelNN - USA travel advisory level for the first destination country
      adv_dest2_levelNN - USA travel advisory level for the second destination country, if any
      AroundHoliday - the departure date is a public holiday (+/- 3 days) in the origin country
      AroundWeekend2 - the departure date is a weekend (+/- 1 day) in the origin country
      countriesOHE - the origin country, one-hot encoded
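The AroundHoliday (+/- 3 days) and AroundWeekend2 (+/- 1 day) flags in Annex (5) can be sketched with Python's standard `datetime` module. The function names are hypothetical, and treating Saturday and Sunday as the weekend is an assumption: weekend days differ by country, so they are left as a parameter here.

```python
from datetime import date, timedelta


def around_holiday(departure, holidays, window=3):
    """True if the departure is within `window` days of any public holiday."""
    return any(abs((departure - holiday).days) <= window for holiday in holidays)


def around_weekend(departure, weekend_days=(5, 6), window=1):
    """True if the departure, or a day within `window` days of it, is a weekend day.

    Assumption: weekend_days uses date.weekday() numbering (Monday = 0),
    with Saturday and Sunday as the default weekend.
    """
    return any(
        (departure + timedelta(days=offset)).weekday() in weekend_days
        for offset in range(-window, window + 1)
    )
```

For example, a departure on Saturday 2018-12-22 is flagged by `around_holiday` for a holiday on 2018-12-25, while a departure on 2018-12-21 falls outside the three-day window.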
  • 25. Thank You