Strata Hadoop World | New York City | September 29th, 2016
Choice Hotels’ journey to better understand its customers through self-service analytics
Narasimhan Sampath & Avinash Ramineni
Agenda
• Who is Choice Hotels
• Platform Architecture
• Implementation
• Value Add
Who is Choice Hotels?
Who is Choice Hotels?
Region                     Hotels open   Hotels under development   Rooms open & under dev.
United States & Caribbean  5,276         606                        446,813
Canada                     323           45                         30,135
South America              64            7                          9,737
Asia Pacific               315           25                         23,289
Europe                     402           31                         50,388
Mexico                     28            4                          3,219
Central America            14            0                          1,468
Middle East                1             2                          564
How About a Technology Company?
Evolution of Guest Experience
Project Goals
• Business Drivers
− Self Service Reporting and Analytics
− Requirements for near real-time analytics
− Simplify Governance, Compliance and Auditing
− Better support for new applications
• Technical Drivers
− Unable to handle volume, velocity, and veracity
− Retire Legacy Systems
− Difficult to find skillset (Informix 4GL)
− Simplify Technology Stack
Key Design Tenets
• Separation of Compute and Storage
• Independently scale compute and storage
• Data Democratization and Governance
• Bring your own Compute (BYOC)
• Lift and Shift between cloud provider(s) and On-premise
• HA / DR
• Open Source Stack
Separation of Compute and Storage
• Scale storage and compute independently (up or down)
• Shifts bottleneck from Disk IO to Network
• Centralized Data Storage
• Write once & read everywhere
• Data Democratization
• Easier Hardware upgrade paths
• Flexible Architecture
[Figure labels: Storage, Servers]
BYOC (Bring Your Own Cluster)
• Eliminates the need for very large clusters
• Easier to administer and maintain
• Reduces multi-tenancy issues
• Clusters can be upgraded independently
• Enables on-demand computing
• Lower costs
[Figure labels: Marketing Cluster, Personalization Cluster, Main Cluster, Centralized Storage]
Platform Architecture
Platform Architecture – Data Ingestion Layer
• DB Ingestor
• Stream Ingestor
− Kafka and Spark Streaming
• File Ingestor
• FTP / SFTP / Logs
• Ingestion using Service API
Platform Architecture – Data Processing Layer
• Storage layer carved into logical buckets
• Landing, Raw, Derived and Delivery
• Schema stored with data (no guesswork)
• Platform Jobs for
• Converting text to Parquet
• Saving streaming data as Parquet
• Derivatives
• Compaction
• Standardization
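One way to picture the logical buckets above is as fixed prefixes under a single storage root. The sketch below is an illustration only: the root URI `s3a://example-data-platform`, the `zone_path` helper, and the dataset/date layout are assumptions made for the example, not details given in the talk.

```python
from datetime import date

# The four logical zones named on the slide, in the order data moves through them.
ZONES = ("landing", "raw", "derived", "delivery")

# Hypothetical storage root; the talk does not name the actual bucket.
STORAGE_ROOT = "s3a://example-data-platform"


def zone_path(zone: str, dataset: str, day: date) -> str:
    """Build the storage prefix for one dataset and day inside a zone."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return f"{STORAGE_ROOT}/{zone}/{dataset}/dt={day.isoformat()}"


if __name__ == "__main__":
    # Print where one day's (hypothetical) reservations data would sit in each zone.
    for zone in ZONES:
        print(zone_path(zone, "reservations", date(2016, 9, 29)))
```

Keeping the zone in the path means a platform job can tell from the location alone whether it is reading unprocessed input or a standardized Parquet copy.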
Platform Architecture – Data Delivery Layer
• Data Delivery
• SQL - Spark Thrift Server / Impala
• Tableau, SQL IDE, Applications
• SparkR
• Self Service
• Derivatives
• Represented Via SQL on Delivery Layer
• Stored in Derived Storage Layer
• Metadata driven
• Derived Layer Generators
• Long running Spark Job
• Derivative Refresh
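The "metadata driven" derivatives above can be pictured as small records that pair a SQL definition with a refresh policy, which a long-running job checks periodically. This is a minimal sketch under stated assumptions: the `Derivative` record, its fields, and the `due_for_refresh` helper are hypothetical names invented here, and the talk does not describe how refresh is actually scheduled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Derivative:
    """Metadata for one derived table (all fields are illustrative)."""
    name: str
    sql: str                   # query expressed against the delivery layer
    refresh_every: timedelta   # how stale the stored copy is allowed to get


def due_for_refresh(derivatives: List[Derivative],
                    last_refreshed: Dict[str, datetime],
                    now: datetime) -> List[str]:
    """Return the names of derivatives whose stored copy is older than allowed."""
    due = []
    for d in derivatives:
        last: Optional[datetime] = last_refreshed.get(d.name)
        # A derivative that has never been built is always due.
        if last is None or now - last >= d.refresh_every:
            due.append(d.name)
    return due
```

A generator job could loop over the returned names, run each derivative's `sql`, and write the result into the derived storage layer.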
Implementation
• CDH Cloud readiness
• Cloudera Director Limitations
• Multi-Availability zone, regions
• Spark Thrift Server
• Support
• Performance Tuning
• Concurrency, partition strategy
• Cache Tables
• Security
• Sentry Integration
• Kerberos Ticket Renewal
• Navigator Integration
Implementation
• Rapidly Changing Technology
• Feature addition
• Documentation
• Bugs
• Jar hell
• Compression Codec for Parquet
• S3 Eventual Consistency
• Small files
• Performance Issues
• Compaction
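The small-files problem and its compaction remedy can be illustrated with a planning step that groups small files into batches close to a target size. The sketch below is an assumption-laden illustration, not the platform's actual job: the 128 MiB target, the `plan_compaction` name, and the first-fit grouping are all choices made for the example.

```python
from typing import List, Tuple

# Assumed target size for a compacted file; the talk gives no figure.
TARGET_BYTES = 128 * 1024 * 1024


def plan_compaction(files: List[Tuple[str, int]],
                    target_bytes: int = TARGET_BYTES) -> List[List[str]]:
    """Group small files into batches of at most target_bytes each.

    files is a list of (path, size_in_bytes). Files already at or above the
    target are left alone. Grouping is greedy first-fit, largest file first.
    """
    small = sorted((f for f in files if f[1] < target_bytes),
                   key=lambda f: f[1], reverse=True)
    batches: List[List[str]] = []
    sizes: List[int] = []
    for path, size in small:
        # Put the file in the first batch that still has room for it.
        for i, used in enumerate(sizes):
            if used + size <= target_bytes:
                batches[i].append(path)
                sizes[i] += size
                break
        else:
            # No existing batch had room, so start a new one.
            batches.append([path])
            sizes.append(size)
    # Rewriting a batch that holds a single file gains nothing.
    return [b for b in batches if len(b) > 1]
```

Each returned batch would then be read and rewritten as one larger file, after which the original small files can be removed.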
Implementation
• Partition Strategy
• Parquet Files
• Balancing parallelism and throughput
• Table Partitions
• Cluster sizing, optimization and tuning
• Integrating with Corporate infrastructure
• Deployment practices
• Monitoring and Alerting
• Information Security Policies
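For the Parquet side of the partition strategy above, the balance between parallelism and throughput often comes down to how many files are written per table partition: more files allow more parallel tasks, while fewer, larger files tend to scan faster. The following is a rough sketch only; the 256 MiB target, the `max_files` cap, and the `output_file_count` helper are assumptions made for illustration and are not figures from the talk.

```python
import math

# Assumed target size per Parquet file; the talk does not give a number.
TARGET_FILE_BYTES = 256 * 1024 * 1024


def output_file_count(partition_bytes: int,
                      target_file_bytes: int = TARGET_FILE_BYTES,
                      max_files: int = 200) -> int:
    """Pick how many files to write for one table partition.

    Aims for files of about target_file_bytes, writes at least one file,
    and never exceeds max_files.
    """
    if partition_bytes < 0:
        raise ValueError("partition_bytes must be non-negative")
    if target_file_bytes <= 0 or max_files <= 0:
        raise ValueError("target_file_bytes and max_files must be positive")
    by_size = math.ceil(partition_bytes / target_file_bytes)
    return min(max_files, max(1, by_size))
```

In a Spark job, the result could for example be passed to `DataFrame.repartition(n)` before the write, so that each table partition ends up with roughly that many files.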
Value Add
Enabling predictive analytics and real-time decisions
Integrated Scorecards – Daily / Weekly / Monthly Insights Near Real Time / Hourly / Daily Insights
Multivariate Testing, APT (Test vs. Control Analysis), and Text Analytics
Testing for Both Hotel and Customer / Research For Guest Insights
Personalized Display Ad Serving
Real-time Actions (Machine Learning) Across Guest Touch Points
Hotel Lifecycle Data Real-time Alerts for Hotel Related Actions
• One of the fastest growing big data companies
• Extensive experience in providing strategic and architectural consulting on Big Data platforms and implementations
• Global delivery experience across multiple locations in US, Asia and Latin America
• 100+ big data experts worldwide - US, Latin America and Asia
BACKGROUND
CLAIRVOYANTSOFT.COM
CLAIRVOYANT
AWARDS & RECOGNITION
Questions
Principal @ Clairvoyant
Email: avinash@clairvoyantsoft.com
LinkedIn: https://www.linkedin.com/in/avinashramineni
