MySQL at Sabre

Alan Walker
Sabre Labs

February 2004
Confidential
Agenda
• Sabre Holdings Overview
• Business drivers for MySQL & Open Source
• Shopping for fares
• Air Travel Shopping Engine (ATSE)
• Data replication strategy
• ESQL precompiler for MySQL
• Other MySQL users at Sabre

Who is Sabre Holdings?

A world leader in travel commerce,
retailing travel products, and
providing distribution and
technology solutions for the
travel industry

Sabre Holdings Businesses

Sabre Holdings Fast Facts

• Industry leader in multiple travel channels

• Revenues of $2.06 billion in 2002
• S&P 500 company

• NYSE:TSG
• Headquarters in Dallas/Fort Worth, Texas
• 6,500 employees in 45 countries

Business drivers

Over 3 billion
fare combinations
for a single customer request

Multiple airlines, flights, fare types, dates,
prices, taxes, surcharges
Business drivers
• No direct revenue for shopping queries
• Revenue for booking, but not looking (searching)
• Look-to-book ratio increasing
• Competition requires staying on the “leading edge”
• Highly reliable and scalable database
• Fast processors
• Large real memory
• Smart algorithms

• Shopping is a good fit for horizontal scale
• Pricing requires higher precision
Business drivers
[Diagram: computing stack, top to bottom: Application, DB / Middleware, Operating System, Hardware, with a "Commodity Point" marker on the stack.]

Hardware, operating system, database and middleware are
becoming commodities. This drives the cost down rapidly.
Open source software is a major driver of this effect.
Business Solution
• Linux servers alongside HP NonStop servers to create
“hybrid” Air Travel Shopping Engine (ATSE) platform
• HP NonStop delivers high availability and reliability
– Better than or equal to legacy, but at significantly lower cost
– Best fit for critical workloads and master database
management
• Linux / MySQL delivers 64-bit memory and faster CPUs

– Lower availability and reliability than HP NonStop but at
significantly lower cost
– Best fit for CPU-intensive shopping workloads

Most cost-effective platform for the shopping workload
Business drivers
• Sabre’s legacy
• World’s first commercial OLTP system in 1960
• Mainframe clusters running TPF
• Operating system customized to our needs
• True 7*24 application, with zero scheduled downtime
• Most application code in assembler
• Sabre’s future
• Higher-level languages
• Relational databases
• Internet
• Open systems
• Reduce specialized training
• Use off the shelf software
• HP NonStop with OSS is a key component (LINUX?)

Shopping
• Finding cheap air fares is hard!
• With 50+ connect points to consider, and >100 fares per
leg, we need to evaluate >3 billion combinations
• Up to a million fares can change every day
• Availability changes continuously
• Solve it >100 times per second
• Other functions
• Price 250 tickets per second
• Process 1000 flight routing requests per second

Pricing
• Shopping vs. Pricing
• Shopping is the problem of finding low fares
• Pricing is used to print the ticket
• Pricing has to be accurate, or we pay the difference to the
airline
• Many internet search engines still rely on mainframes to
actually print the ticket
• Pricing also requires additional functions, such as refunds,
exchanges and auditing

Algorithms
• Fare-led search
• Graph-based algorithm that searches all fare
combinations across 50+ connect points
• Can generate up to a 4-segment connection
• Search space of >3 billion fare combinations
• Match or exceed any competitor in finding lowest fare
• Only loses to competitors that have access to exclusive
private fares and/or other discounts
• Search actually checks Direct Connect Availability, so that
low fare options are actually bookable
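
As an illustration only (this is not Sabre's fare-led search), the idea of searching fare combinations over a graph with a cap on the number of segments can be sketched as a bounded shortest-path pass. The Fare struct, the airport codes and the cheapest_price function are hypothetical names introduced for this sketch, and availability checks are left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One fare for a single flight segment (hypothetical data model).
struct Fare {
    std::string from;
    std::string to;
    double price;
};

// Cheapest total price from origin to dest using at most max_segments fares.
// Returns a negative value when no itinerary exists within the limit.
double cheapest_price(const std::vector<Fare>& fares,
                      const std::string& origin,
                      const std::string& dest,
                      int max_segments)
{
    assert(max_segments >= 0);
    std::map<std::string, double> best;   // best price found so far per airport
    best[origin] = 0.0;

    for (int seg = 0; seg < max_segments; ++seg) {
        // Relax only from the previous round, so each round adds one segment.
        std::map<std::string, double> next = best;
        for (const Fare& f : fares) {
            auto it = best.find(f.from);
            if (it == best.end()) continue;          // airport not reached yet
            double candidate = it->second + f.price;
            auto cur = next.find(f.to);
            if (cur == next.end() || candidate < cur->second)
                next[f.to] = candidate;
        }
        best.swap(next);
    }
    auto hit = best.find(dest);
    return hit == best.end() ? -1.0 : hit->second;
}
```

Calling cheapest_price with max_segments set to 4 mirrors the 4-segment limit mentioned above. A production search would also have to carry fare rules, taxes and availability along each path, which this sketch ignores.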

Algorithms
• Dynamic schedules
• Connections are not generated overnight and stored
• Not limited to routes explicitly set up by airlines or other
marketing staff
• Availability Manager
• Flexible rules to access airline availability
• Current methods
– Direct Connect
– Host Availability
– Teletype (AVS)
• Can also use

– Cached DCA
– Inventory proxy

ATSE Hybrid
• Air shopping for desirable itineraries
• Must search through multiple airlines, flights, fare types,
dates, adjacent airports, etc.
• Must calculate prices, taxes, surcharges
• Complexity
• Single round-trip request can have over 3 billion fare
combinations
• Search is CPU and memory intensive

• Business driver
• No direct revenue for shopping transactions
• Increasing look-to-book ratio
ATSE Hybrid
• Combine Linux servers and HP NonStop servers
• HP NonStop delivers high availability and reliability
• Better than or equal to TPF at significantly lower cost
• Master database management
• Data replicated in real-time to Linux servers
• PNR pricing, schedules and availability
• Linux delivers 64-bit memory model and faster CPUs
• Lower availability and reliability than HP NonStop but at
significantly lower cost
• Horizontally scaled server farm with spare capacity
• Best fit for CPU-intensive shopping workloads
ATSE Hybrid
[Diagram: ATSE hybrid architecture. Labels shown: IBM hosts (PSS, MVS); Fare and Rule Updates; Schedule and Availability Updates; HP Non-Stop; Air Shopping Transactions; Shopping Transactions; Availability Requests; Naming Service and Load Balancing; DB Image Load and Updates; E/R; Logging and Billing; Load Information; Linux Server Farm (24 Linux servers drawn).]
ATSE Linux servers
• In production since July 2003
• Started with HP rp5405 servers (Unix PA-RISC)
– Migrated to Itanium in December 2003
• Using 45 HP rx5670 servers

– 4-way, 1.5 GHz, 6MB L2 cache, 32GB RAM, 4x72GB SCSI

• Software
• MySQL 4.0.15
• GNU compilers – g++ 3.2.3 and glibc 2.3.2
• TAO object request broker
• Redhat RHAS 2.1
• GoldenGate Extractor/Replicator
• Monitoring – Prognosis, CA Unicenter, scripts
ATSE Software
• Extensive use of open source software
• MySQL 4.0.15
• GNU compilers – g++ 3.2.3 and glibc 2.3.2
• TAO object request broker
• Redhat Linux AS 3.0

• Third party software
• GoldenGate Extractor/Replicator
• Monitoring – Prognosis, CA Unicenter, scripts
• Internally developed applications and scripts

Data replication
• HP NonStop (Tandem) is master database
• Golden Gate Software used to replicate to MySQL
– Extracts data from undo/redo logs on the NonStop server
– Performs INSERT / UPDATE / DELETE on MySQL
– Software performs catch-up / resync in case of crashes or
other failures
• Each Linux server has an identical copy of the database

– 50GB database on each server, all InnoDB

• Replication volume
• 150 tables replicated (over 300 on NonStop server)
• Can replicate 1M fare changes / hour
• Data updates on 7x24 basis
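
The log-based flow above (extract changes, apply INSERT / UPDATE / DELETE, catch up after a failure) can be sketched roughly as follows. This is not how the Golden Gate software is implemented. The Change and Replica types, the sequence-number checkpoint and the in-memory map standing in for a MySQL table are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class Op { Insert, Update, Delete };

// One change captured from the master's log (hypothetical record layout).
struct Change {
    long seq;            // position in the log, strictly increasing
    Op op;
    std::string key;     // primary key of the affected row
    std::string value;   // new row image; unused for Delete
};

// Stand-in for one replica table plus its replication checkpoint.
struct Replica {
    std::map<std::string, std::string> rows;
    long applied_seq = 0;   // last log position applied
};

// Apply a batch of changes in log order. Records at or below the checkpoint
// are skipped, so replaying a batch after a crash does not double-apply.
// Returns the number of changes actually applied.
int apply_changes(Replica& replica, const std::vector<Change>& batch)
{
    int applied = 0;
    for (const Change& c : batch) {
        assert(c.seq > 0);                             // log positions start at 1
        if (c.seq <= replica.applied_seq) continue;    // already applied
        switch (c.op) {
        case Op::Insert:
        case Op::Update:
            replica.rows[c.key] = c.value;
            break;
        case Op::Delete:
            replica.rows.erase(c.key);
            break;
        }
        replica.applied_seq = c.seq;
        ++applied;
    }
    return applied;
}
```

Because every Linux server holds an identical copy, the same batch would be applied to each replica independently. The checkpoint is what lets a server that was down resume from where it stopped.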
Data replication
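[Diagram: replication path from the HP NonStop SQL/MP database and TMF log to the MySQL database on Linux IA-64. Components shown: Extract, Queue, Data Pump, Receive, Queue, Updater. The legend marks the Golden Gate Software components.]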

Data Replication
[Diagram: replication fan-out. Server-Net side: six Extract / Queue pairs and twelve Data Pumps. Linux side: twelve servers, each drawn with an Extract Collector, a Queue, a Replicator and MySQL.]
Results
• Reduced development costs
• Decreased fare loading cycle times
• Competitive advantage
• Increased functionality
• Reduced runtime costs (over 80% compared to legacy)
Hybrid
• Horizontal scalability
• Ability to throw inexpensive CPUs at the problem
• Tolerate failure of a single server
• How do we get there from here?
• Database and network functions remain on Himalaya
• C++ code readily ports to Linux
• Publish/subscribe metaphor for data in memory
• 64-bit addressing to avoid memory constraints

Connectivity
• CORBA
• Major functions use CORBA internally
• CORBA requests to TPF for availability
• CORBA to CTS for DCA this Summer (bypass TPF)
• Asynchronous messaging via MQ Series

• XML
• Currently uses XML requests from TPF (over RPPC) for
pricing functions
• Working on direct access from Travelocity to ATSE
– Will be used for BIP
– Already working over HTTP (development systems)
– Working on security & billing for production
Timeline
• 2000
• Proof Of Concept, April – August
• 5 core developers, partnership with Compaq
• 2001
• Development & training began in February
• Initial hardware delivered
• 2002
• Phase 1 in production since July
• Zero downtime since implementation
• Rapidly developing additional functionality
• Wow – this is from an ancient slide, huh?
Precompiler
• Challenge
• 500K lines of C/C++, 150+ files with embedded SQL
• We did not want to rewrite ESQL / C code by hand
• Solution
• Wrote a precompiler that converts ESQL to inline MySQL calls
• About 1000 lines of awk
• We are willing to share this code with others

EXEC SQL BEGIN DECLARE SECTION;
int    host_a;
double host_b;
char   host_c;
EXEC SQL END DECLARE SECTION;

EXEC SQL DECLARE csr1 CURSOR FOR
    SELECT a, b, c
    FROM table1
    WHERE x = :hostvar1;
EXEC SQL OPEN csr1;
while (rc >= 0 && rc != 100){
    EXEC SQL FETCH csr1 INTO
        :host_a, :host_b, :host_c;
    printf("Fetch %d, %lf, %s\n",
        host_a, host_b, host_c);
}
EXEC SQL CLOSE csr1;

Precompiler
• How it works
• Convert C / ESQL to C++ code
• Polymorphism matches data types in the declare section
• Can ignore the declare section
EXEC SQL BEGIN DECLARE SECTION;
int    host_a;
double host_b;
char   host_c;
EXEC SQL END DECLARE SECTION;

// EXEC SQL BEGIN DECLARE SECTION;
int    host_a;
double host_b;
char   host_c;
// EXEC SQL END DECLARE SECTION;
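
The declare-section step shown above amounts to commenting out the two marker lines and passing the host variable declarations through untouched. The actual precompiler is written in awk and is not reproduced here. The following C++ function is only a sketch of that one rule, and the name rewrite_declare_line is hypothetical.

```cpp
#include <cassert>
#include <string>

// Comment out the DECLARE SECTION marker lines and leave every other line
// alone, so the host variable declarations stay ordinary C++ declarations.
std::string rewrite_declare_line(const std::string& line)
{
    const std::string begin_marker = "EXEC SQL BEGIN DECLARE SECTION";
    const std::string end_marker = "EXEC SQL END DECLARE SECTION";

    std::string::size_type start = line.find_first_not_of(" \t");
    if (start == std::string::npos) return line;   // blank line

    if (line.compare(start, begin_marker.size(), begin_marker) == 0 ||
        line.compare(start, end_marker.size(), end_marker) == 0) {
        // Keep the original indentation and prefix the statement with "// ".
        std::string out = line.substr(0, start) + "// " + line.substr(start);
        assert(out.size() == line.size() + 3);     // nothing dropped, 3 chars added
        return out;
    }
    return line;
}
```

Feeding each source line through this function reproduces the before/after pair on the slide. The other EXEC SQL statements need real translation rather than commenting out.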

Precompiler

Cursor declarations (SELECT statements) are converted to a static
struct. The struct has the text of the SQL, as well as statement
handles for doing prepare / execute (where applicable)

EXEC SQL DECLARE csr1 CURSOR FOR
SELECT a, b, c
FROM table1
WHERE x = :hostvar1;

// EXEC SQL DECLARE csr1
static e2mysql csr1 = {
" SELECT a,b,c FROM table1 WHERE x = :hostvar1"
, NULL , 0};
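
The slides do not show the definition of e2mysql. A layout consistent with the three-element initializer above and with the csr1.rslt and csr1.row accesses in the generated fetch code might look like the sketch below. The member name sql, the row type, the ResultSet stand-in (MYSQL_RES in the real client library) and the cursor_is_open helper are assumptions, not the original code.

```cpp
#include <cassert>
#include <cstddef>

// Opaque stand-in for the client library's result-set type; declared here
// only so the sketch compiles without the MySQL headers.
struct ResultSet;

// Hypothetical layout inferred from the initializer { text, NULL, 0 }.
struct e2mysql {
    const char*   sql;    // statement text; the member name is an assumption
    ResultSet*    rslt;   // result set, NULL until the cursor is opened
    unsigned long row;    // index of the next row to fetch
};

// A cursor counts as open once a result set has been attached to it.
inline bool cursor_is_open(const e2mysql& csr)
{
    assert(csr.sql != NULL);   // every generated cursor carries its SQL text
    return csr.rslt != NULL;
}

// Same shape as the generated declaration for csr1.
static e2mysql csr1 = {
    " SELECT a,b,c FROM table1 WHERE x = :hostvar1"
    , NULL , 0};
```

Keeping the struct a plain aggregate is what allows the precompiler to emit a static initializer at the point of the original DECLARE CURSOR statement.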

Precompiler
The OPEN, FETCH and CLOSE statements are converted into
function calls. The precompiler generates the code for these calls
and puts it at the end of the source module.
EXEC SQL FETCH csr1 INTO :host_a, :host_b, :host_c;

// EXEC SQL FETCH csr1
static int16 fetch_csr1()
{
    if ( ! csr1.rslt )
        return SQL_ERROR;
    if ( csr1.row >= mysql_num_rows(csr1.rslt) )
        return SQL_NO_DATA;
    MYSQL_ROW row = mysql_fetch_row(csr1.rslt);
    SQLBindColPoly(row[0], host_a, sizeof(host_a));
    SQLBindColPoly(row[1], host_b, sizeof(host_b));
    SQLBindColPoly(row[2], host_c, sizeof(host_c));
    ++csr1.row;
    return SQL_SUCCESS;
}

Precompiler

A lightweight wrapper around the database API lets us
use polymorphism to convert to the types specified in the
declare section. There is a wrapper function for each
simple C++ type that we handle.

inline int32
SQLBindColPoly(const char* value, int32& parm, uint16 size)
{
    parm = atoi(value);
    return SQL_SUCCESS;
}
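
Only the int32 wrapper appears on the slide. A sketch of how the overload set might continue for the other two types in the declare section follows. The double and char-array versions, the typedefs and the SQL_SUCCESS value are assumptions added so the example compiles on its own.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Stand-ins for the typedefs and status code used on the slides.
typedef int            int32;
typedef unsigned short uint16;
const int32 SQL_SUCCESS = 0;

// int host variable: parse the column text as an integer.
inline int32 SQLBindColPoly(const char* value, int32& parm, uint16 /*size*/)
{
    parm = std::atoi(value);
    return SQL_SUCCESS;
}

// double host variable: parse the column text as a floating-point number.
inline int32 SQLBindColPoly(const char* value, double& parm, uint16 /*size*/)
{
    parm = std::atof(value);
    return SQL_SUCCESS;
}

// char-array host variable: copy at most size - 1 characters and always
// NUL-terminate, so an oversized column value is truncated, not overrun.
inline int32 SQLBindColPoly(const char* value, char* parm, uint16 size)
{
    assert(size > 0);
    std::strncpy(parm, value, size - 1);
    parm[size - 1] = '\0';
    return SQL_SUCCESS;
}
```

The compiler picks the overload from the declared type of the host variable, which is why the generated fetch code can emit the same call for every column. Like the slide version, this sketch does not handle SQL NULL columns, which the MySQL C API reports as NULL pointers in the row.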

Precompiler
• Notes
• Light-weight C++ wrapper to MySQL API
• The precompiler understands some SQL syntax and does
some modifications of NonStop SQL/MP statements
• We have also used our precompiler to target other DBMS
– ODBC API
– Oracle
– PostgreSQL
• Since we convert C to C++, this may be problematic for
ESQL programs that used deprecated K&R syntax
– C++ compilers are stricter than C compilers
– However, we did not have this problem with our application
Other MySQL applications at Sabre
• ATSE is our largest and most mission-critical MySQL application
• We have other production systems that rely on MySQL
• Site59.com is the most visible
• MySQL also used for some internal databases
• More under development
• MySQL / Linux / SATA drives make cheap data marts
• Sometimes cheaper to replicate to a data mart than to
upgrade a central data warehouse
• Currently testing with a 1.5B row database

Site59
• Last minute travel packages
• Acquired by Travelocity in
March 2002
• Sales volume?
• Transaction rates?
• All dynamic content generated
using PHP & MySQL

Site59
Site59 implements a fairly “classic” dynamic website using MySQL.
Dynamic content is generated at about 30 Mbit/s. The site makes
extensive use of single- and dual-processor Linux machines (IA-32).

[Diagram: Site59 architecture. Labels shown: Internet (HTTP); Presentation (Apache/PHP); Application Server; Reservations System Gateway (XML/HTTP); Frontend DB (MySQL, Linux); Replication; Backend DB (Oracle, Sun).]

Travel Commerce Processing Chain

Session → Shop → Price → Sell → Fulfill


Sabre presentation for MySQL user conference 2004

  • 1. MySQL at Sabre Alan Walker Sabre Labs February 2004 Confidential
  • 2. Agenda • Sabre Holdings Overview • Business drivers for MySQL & Open Source • Shopping for fares • Air Travel Shopping Engine (ATSE) • Data replication strategy • ESQL precompiler for MySQL • Other MySQL users at Sabre 2 22
  • 3. Who is Sabre Holdings? A world leader in travel commerce, retailing travel products, and providing distribution and technology solutions for the travel industry
  • 5. Sabre Holdings Fast Facts
      • Industry leader in multiple travel channels
      • Revenues of $2.06 billion in 2002
      • S&P 500 company
      • NYSE:TSG
      • Headquarters in Dallas/Fort Worth, Texas
      • 6,500 employees in 45 countries
  • 6. Business drivers. Over 3 billion fare combinations for a single customer request. Multiple airlines, flights, fare types, dates, prices, taxes, surcharges
  • 7. Business drivers
      • No direct revenue for shopping queries
      • Revenue for booking, but not looking (searching)
      • Look-to-book ratio increasing
      • Competition requires staying on the “leading edge”
      • Highly reliable and scalable database
      • Fast processors
      • Large real memory
      • Smart algorithms
      • Shopping is a good fit for horizontal scale
      • Pricing requires higher precision
  • 8. Business drivers [Diagram: computing stack with the layers Application, DB / Middleware, Operating System and Hardware, marked with a “Commodity Point”] Hardware, operating system, database and middleware are becoming commodities. This drives the cost down rapidly. Open source software is a major driver of this effect.
  • 9. Business Solution
      • Linux servers alongside HP NonStop servers to create “hybrid” Air Travel Shopping Engine (ATSE) platform
      • HP NonStop delivers high availability and reliability
          – Better than or equal to legacy, but at significantly lower cost
          – Best fit for critical workloads and master database management
      • Linux / MySQL delivers 64-bit memory and faster CPUs
          – Lower availability and reliability than HP NonStop but at significantly lower cost
          – Best fit for CPU-intensive shopping workloads
      Most cost-effective platform for the shopping workload
  • 10. Business drivers
      • Sabre’s legacy
          • World’s first commercial OLTP system in 1960
          • Mainframe clusters running TPF
          • Operating system customized to our needs
          • True 7*24 application, with zero scheduled downtime
          • Most application code in assembler
      • Sabre’s future
          • Higher-level languages
          • Relational databases
          • Internet
          • Open systems
          • Reduce specialized training
          • Use off the shelf software
          • HP NonStop with OSS is a key component (LINUX?)
  • 11. Shopping
      • Finding cheap air fares is hard!
          • With 50+ connect points to consider, and >100 fares per leg, we need to evaluate >3 billion combinations
          • Up to a million fares can change every day
          • Availability changes continuously
          • Solve it >100 times per second
      • Other functions
          • Price 250 tickets per second
          • Process 1000 flight routing requests per second
  • 12. Pricing
      • Shopping vs. Pricing
          • Shopping is the problem of finding low fares
          • Pricing is used to print the ticket
          • Pricing has to be accurate, or we pay the difference to the airline
          • Many internet search engines still rely on mainframes to actually print the ticket
          • Pricing also requires additional functions, such as refunds, exchanges and auditing
  • 13. Algorithms
      • Fare-led search
          • Graph-based algorithm that searches all fare combinations across 50+ connect points
          • Can generate up to a 4-segment connection
          • Search space of >3 billion fare combinations
          • Match or exceed any competitor in finding lowest fare
          • Only loses to competitors that have access to exclusive private fares and/or other discounts
          • Search actually checks Direct Connect Availability, so that low fare options are actually bookable
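The fare-led search on slide 13 is described only at a high level. As a minimal sketch of the general idea (a depth-limited graph search for the cheapest connection of at most four segments), the following assumes a tiny hypothetical fare table. The names `Leg`, `FareGraph`, `cheapest_fare` and `sample_graph`, the airport pairs and the prices are all illustrative assumptions, not ATSE code, and the sketch ignores availability checks and fare rules.

```cpp
#include <cassert>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical fare graph: each leg is a directed edge with a single price.
struct Leg {
    std::string to;
    int fare;  // whole currency units, illustrative only
};

using FareGraph = std::map<std::string, std::vector<Leg>>;

constexpr int kNoRoute = std::numeric_limits<int>::max();

// Depth-first search over every connection of at most `segments_left` legs.
// Returns the lowest total fare from `from` to `dest`, or kNoRoute if none.
int cheapest_fare(const FareGraph& graph, const std::string& from,
                  const std::string& dest, int segments_left) {
    if (from == dest) return 0;
    if (segments_left == 0) return kNoRoute;
    int best = kNoRoute;
    auto it = graph.find(from);
    if (it == graph.end()) return best;
    for (const Leg& leg : it->second) {
        int rest = cheapest_fare(graph, leg.to, dest, segments_left - 1);
        if (rest != kNoRoute && leg.fare + rest < best) {
            best = leg.fare + rest;
        }
    }
    return best;
}

// Small made-up network for experimentation.
FareGraph sample_graph() {
    FareGraph g;
    g["DFW"] = {Leg{"JFK", 400}, Leg{"ORD", 120}, Leg{"ATL", 100}};
    g["ORD"] = {Leg{"JFK", 150}};
    g["ATL"] = {Leg{"ORD", 60}, Leg{"JFK", 250}};
    return g;
}
```

With the sample network, `cheapest_fare(sample_graph(), "DFW", "JFK", 4)` prefers the DFW-ORD-JFK connection (270) over the direct leg (400). Exhaustive recursion like this grows quickly with the number of connect points, which is consistent with the slides calling the search CPU and memory intensive.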
  • 14. Algorithms
      • Dynamic schedules
          • Connections are not generated overnight and stored
          • Not limited to routes explicitly set up by airlines or other marketing staff
      • Availability Manager
          • Flexible rules to access airline availability
          • Current methods
              – Direct Connect
              – Host Availability
              – Teletype (AVS)
          • Can also use
              – Cached DCA
              – Inventory proxy
  • 15. ATSE Hybrid
      • Air shopping for desirable itineraries
          • Must search through multiple airlines, flights, fare types, dates, adjacent airports, etc.
          • Must calculate prices, taxes, surcharges
      • Complexity
          • Single round-trip request can have over 3 billion fare combinations
          • Search is CPU and memory intensive
      • Business driver
          • No direct revenue for shopping transactions
          • Increasing look-to-book ratio
  • 16. ATSE Hybrid
      • Combine Linux servers and HP NonStop servers
      • HP NonStop delivers high availability and reliability
          • Better than or equal to TPF at significantly lower cost
          • Master database management
          • Data replicated in real-time to Linux servers
          • PNR pricing, schedules and availability
      • Linux delivers 64-bit memory model and faster CPUs
          • Lower availability and reliability than HP NonStop but at significantly lower cost
          • Horizontally scaled server farm with spare capacity
          • Best fit for CPU-intensive shopping workloads
  • 17. ATSE Hybrid [Architecture diagram. Labels: IBM; PSS; MVS; Fare and Rule Updates; Schedule and Availability Updates; HP Non-Stop; Air Shopping Transactions; Shopping Transactions; Availability Requests; Naming Service and Load Balancing; DB Image Load and Updates; E/R Logging and Billing; Load Information; Linux Server Farm (many Linux servers)]
  • 18. ATSE Linux servers
      • In production since July 2003
      • Started with HP rp5405 servers (Unix PA-RISC)
          – Migrated to Itanium in December 2003
      • Using 45 HP rx5670 servers
          – 4-way, 1.5 GHz, 6MB L2 cache, 32GB RAM, 4x72GB SCSI
      • Software
          • MySQL 4.0.15
          • GNU compilers
              – g++ 3.2.3 and glibc 2.3.2
          • TAO object request broker
          • Redhat RHAS 2.1
          • GoldenGate Extractor/Replicator
          • Monitoring
              – Prognosis, CA Unicenter, scripts
  • 19. ATSE Software
      • Extensive use of open source software
          • MySQL 4.0.15
          • GNU compilers
              – g++ 3.2.3 and glibc 2.3.2
          • TAO object request broker
          • Redhat Linux AS 3.0
      • Third party software
          • GoldenGate Extractor/Replicator
          • Monitoring
              – Prognosis, CA Unicenter, scripts
      • Internally developed applications and scripts
  • 20. Data replication
      • HP NonStop (Tandem) is master database
      • GoldenGate Software used to replicate to MySQL
          – Extracts data from undo/redo logs on the NonStop server
          – Performs INSERT / UPDATE / DELETE on MySQL
          – Software performs catch-up / resync in case of crashes or other failures
      • Each Linux server has an identical copy of the database
          – 50GB database on each server, all InnoDB
      • Replication volume
          • 150 tables replicated (over 300 on NonStop server)
          • Can replicate 1M fare changes / hour
          • Data updates on 7x24 basis
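Slide 20 describes log-based replication with catch-up after a failure. The sketch below illustrates that general pattern only: changes carry a log sequence number, the replica remembers the last one it applied, and replaying the log skips anything already applied. The types `Change` and `Replica`, the in-memory `std::map` standing in for a MySQL table, and the sample data are assumptions for illustration; this is not the replication product's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class Op { Insert, Update, Delete };

// One change captured from the master's log (hypothetical layout).
struct Change {
    std::uint64_t seq;  // position in the master's log, strictly increasing
    Op op;
    std::string key;    // primary key of the affected row
    std::string value;  // new row image; unused for Delete
};

// Stand-in for one server's copy of a replicated table.
struct Replica {
    std::map<std::string, std::string> rows;
    std::uint64_t last_applied = 0;  // checkpoint used for catch-up

    // Apply every change newer than the checkpoint, in log order.
    // Returns how many changes were actually applied.
    std::size_t apply(const std::vector<Change>& log) {
        std::size_t applied = 0;
        for (const Change& c : log) {
            if (c.seq <= last_applied) continue;  // applied before the restart
            switch (c.op) {
                case Op::Insert:
                case Op::Update:
                    rows[c.key] = c.value;  // simplification: treated as upsert
                    break;
                case Op::Delete:
                    rows.erase(c.key);
                    break;
            }
            last_applied = c.seq;
            ++applied;
        }
        return applied;
    }
};

// Made-up fare changes for experimentation.
std::vector<Change> sample_log() {
    return {
        Change{1, Op::Insert, "DFW-JFK", "400"},
        Change{2, Op::Insert, "DFW-ATL", "100"},
        Change{3, Op::Update, "DFW-JFK", "380"},
        Change{4, Op::Delete, "DFW-ATL", ""},
    };
}
```

Calling `apply` a second time with the same log applies nothing, which is the property that makes a resync after a crash safe in this simplified model.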
  • 21. Data replication [Diagram. HP NonStop: SQL/MP DB, TMF Log, Extract, Queue, Data Pump. Linux IA-64: Receive, Queue, Updater, MySQL DB. Legend: marked components = GoldenGate Software]
  • 22. Data Replication [Diagram. Labels: Server-Net; Extract and Queue pairs; then, repeated for each parallel replication path: Data Pump, Extract Collector, Queue, Replicator, MySQL, Linux]
  • 23. Results: reduced development costs; decreased fare loading cycle times; competitive advantage; increased functionality; reduced runtime costs (over 80% compared to legacy)
  • 24. Hybrid
      • Horizontal scalability
          • Ability to throw inexpensive CPUs at the problem
          • Tolerate failure of a single server
      • How do we get there from here?
          • Database and network functions remain on Himalaya
          • C++ code readily ports to Linux
          • Publish/subscribe metaphor for data in memory
          • 64-bit addressing to avoid memory constraints
  • 25. Connectivity
      • CORBA
      • Major functions use CORBA internally
      • CORBA requests to TPF for availability
      • CORBA to CTS for DCA this Summer (bypass TPF)
      • Asynchronous messaging via MQ Series
      • XML
      • Currently uses XML requests from TPF (over RPPC) for pricing functions
      • Working on direct access from Travelocity to ATSE
          – Will be used for BIP
          – Already working over HTTP (development systems)
          – Working on security & billing for production
  • 26. Timeline
      • 2000
          • Proof Of Concept, April – August
          • 5 core developers, partnership with Compaq
      • 2001
          • Development & training began in February
          • Initial hardware delivered
      • 2002
          • Phase 1 in production since July
          • Zero downtime since implementation
          • Rapidly developing additional functionality
      • Wow – this is from an ancient slide, huh?
  • 27. Precompiler
      • Challenge
          • 500K lines of C/C++, 150+ files with embedded SQL
          • We did not want to rewrite ESQL / C code by hand
      • Solution
          • Wrote a precompiler that converts ESQL to inline MySQL calls
          • About 1000 lines of awk
          • We are willing to share this code with others

          EXEC SQL BEGIN DECLARE SECTION;
          int    host_a;
          double host_b;
          char   host_c;
          EXEC SQL END DECLARE SECTION;

          EXEC SQL DECLARE csr1 CURSOR FOR
              SELECT a, b, c FROM table1 WHERE x = :hostvar1;

          EXEC SQL OPEN csr1;
          while (rc >= 0 && rc != 100) {
              EXEC SQL FETCH csr1 INTO :host_a, :host_b, :host_c;
              printf("Fetch %d, %lf, %s\n", host_a, host_b, host_c);
          }
          EXEC SQL CLOSE csr1;
  • 28. Precompiler
      • How it works
          • Convert C / ESQL to C++ code
          • Polymorphism matches data types in the declare section
          • Can ignore the declare section

      Input:

          EXEC SQL BEGIN DECLARE SECTION;
          int    host_a;
          double host_b;
          char   host_c;
          EXEC SQL END DECLARE SECTION;

      Output:

          // EXEC SQL BEGIN DECLARE SECTION;
          int    host_a;
          double host_b;
          char   host_c;
          // EXEC SQL END DECLARE SECTION;
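The declare-section handling on slide 28 amounts to commenting out the two marker lines and leaving the C declarations alone. The original precompiler was written in awk and is not shown; the following is a C++ sketch of that one transformation, with the function names `starts_with_after_indent` and `comment_out_declare_markers` invented for illustration. It assumes each marker sits on its own line.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// True if `line`, ignoring leading blanks and tabs, starts with `prefix`.
bool starts_with_after_indent(const std::string& line,
                              const std::string& prefix) {
    std::size_t pos = line.find_first_not_of(" \t");
    if (pos == std::string::npos) return false;
    return line.compare(pos, prefix.size(), prefix) == 0;
}

// Prefix the DECLARE SECTION marker lines with "// " and pass every other
// line through unchanged. Each output line ends with '\n'.
std::string comment_out_declare_markers(const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    std::string line;
    while (std::getline(in, line)) {
        bool marker =
            starts_with_after_indent(line, "EXEC SQL BEGIN DECLARE SECTION") ||
            starts_with_after_indent(line, "EXEC SQL END DECLARE SECTION");
        out << (marker ? "// " : "") << line << '\n';
    }
    return out.str();
}
```

Because the host variables stay ordinary C declarations, later generated code can rely on overload resolution to pick the right conversion for each one, which is what the slide means by being able to ignore the declare section.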
  • 29. Precompiler
      Cursor declarations (SELECT statements) are converted to a static struct. The struct has the text of the SQL, as well as statement handles for doing prepare / execute (where applicable).

          EXEC SQL DECLARE csr1 CURSOR FOR
              SELECT a, b, c FROM table1 WHERE x = :hostvar1;

          // EXEC SQL DECLARE csr1
          static e2mysql csr1 = {
              " SELECT a,b,c FROM table1 WHERE x = :hostvar1",
              NULL,
              0
          };
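The stored SQL text on slide 29 still contains the `:hostvar1` placeholder, and the slides do not show how it is resolved when the cursor is opened. One possible approach, sketched here purely as an assumption, is to substitute each `:name` token with a quoted literal before sending the statement. The helpers `quote_literal` and `bind_host_vars` are hypothetical, and the escaping is deliberately simplified; real code would use the client library's escaping routine (for MySQL, `mysql_real_escape_string`).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Quote a value as a SQL string literal, escaping quotes and backslashes.
// Simplified stand-in for the client library's own escaping.
std::string quote_literal(const std::string& value) {
    std::string out = "'";
    for (char ch : value) {
        if (ch == '\'' || ch == '\\') out += '\\';
        out += ch;
    }
    out += "'";
    return out;
}

// Replace each :name token in `sql` with the quoted value bound to `name`.
// Tokens without a binding are left untouched. Limitation: the scan does
// not skip string literals that already appear in the SQL text.
std::string bind_host_vars(const std::string& sql,
                           const std::map<std::string, std::string>& vars) {
    std::string out;
    std::size_t i = 0;
    while (i < sql.size()) {
        if (sql[i] == ':') {
            std::size_t j = i + 1;
            while (j < sql.size() &&
                   (std::isalnum(static_cast<unsigned char>(sql[j])) ||
                    sql[j] == '_')) {
                ++j;
            }
            auto it = vars.find(sql.substr(i + 1, j - i - 1));
            if (j > i + 1 && it != vars.end()) {
                out += quote_literal(it->second);
                i = j;
                continue;
            }
        }
        out += sql[i];
        ++i;
    }
    return out;
}
```

For the slide's statement, binding `hostvar1` to `42` yields `SELECT a,b,c FROM table1 WHERE x = '42'`.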
  • 30. Precompiler
      The OPEN, FETCH and CLOSE statements are converted into function calls. The precompiler generates the code for these calls and puts it at the end of the source module.

          EXEC SQL FETCH csr1 INTO :host_a, :host_b, :host_c;

          // EXEC SQL FETCH csr1
          static int16 fetch_csr1()
          {
              if ( ! csr1.rslt ) return SQL_ERROR;
              if ( csr1.row >= mysql_num_rows(csr1.rslt) ) return SQL_NO_DATA;
              MYSQL_ROW row = mysql_fetch_row(csr1.rslt);
              SQLBindColPoly(row[0], host_a, sizeof(host_a));
              SQLBindColPoly(row[1], host_b, sizeof(host_b));
              SQLBindColPoly(row[2], host_c, sizeof(host_c));
              ++csr1.row;
              return SQL_SUCCESS;
          }
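The generated fetch function on slide 30 and the `while (rc >= 0 && rc != 100)` loop on slide 27 fit together through the return codes. The sketch below reproduces that control flow without a database: an in-memory vector stands in for the stored MySQL result set, and the code values 0, 100 and -1 follow the ODBC convention that the slide's loop appears to assume. The `fetch_sketch` namespace, the `Cursor` type and the helper functions are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace fetch_sketch {

// Return codes following the ODBC convention (assumed from the slide's loop).
constexpr int kSqlSuccess = 0;
constexpr int kSqlNoData = 100;
constexpr int kSqlError = -1;

// In-memory stand-in for a cursor over a stored result set.
struct Cursor {
    std::vector<std::vector<std::string>> rows;
    std::size_t row = 0;  // index of the next row to fetch
    bool open = false;
};

int open_cursor(Cursor& c) {
    c.row = 0;
    c.open = true;
    return kSqlSuccess;
}

// Same checks, in the same order, as the generated function on the slide:
// error if not open, no-data when exhausted, otherwise copy and advance.
int fetch(Cursor& c, std::vector<std::string>& out) {
    if (!c.open) return kSqlError;
    if (c.row >= c.rows.size()) return kSqlNoData;
    out = c.rows[c.row];
    ++c.row;
    return kSqlSuccess;
}

// Mirrors the shape of the application loop; returns the rows fetched.
std::size_t count_rows(Cursor& c) {
    std::size_t n = 0;
    std::vector<std::string> buffer;
    int rc = open_cursor(c);
    while (rc >= 0 && rc != kSqlNoData) {
        rc = fetch(c, buffer);
        if (rc == kSqlSuccess) ++n;
    }
    c.open = false;  // corresponds to EXEC SQL CLOSE
    return n;
}

}  // namespace fetch_sketch
```

Keeping a row counter on the cursor, as the slide's `csr1.row` does, lets the function report end-of-data without calling into the client library a second time.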
  • 31. Precompiler
      A lightweight wrapper around the database API lets us use polymorphism to convert to the types specified in the declare section. There is a wrapper function for each simple C++ type that we handle.

          inline int32 SQLBindColPoly(const char* value, int32& parm, uint16 size)
          {
              parm = atoi(value);
              return SQL_SUCCESS;
          }
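Slide 31 shows only the integer wrapper. A sketch of how the set of overloads might look for the three host-variable types used in the earlier example (integer, double, character buffer) follows. Only the integer case comes from the slide; the `double` and `char*` overloads, the `bind_sketch` namespace, the use of `strtol`/`strtod` and the truncation behaviour are assumptions. The sketch does not handle SQL NULL columns, which the MySQL C API reports as null pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

namespace bind_sketch {

constexpr int kSqlSuccess = 0;

// Integer host variable: parse the column text as a base-10 integer.
inline int SQLBindColPoly(const char* value, std::int32_t& parm,
                          std::size_t /*size*/) {
    parm = static_cast<std::int32_t>(std::strtol(value, nullptr, 10));
    return kSqlSuccess;
}

// Floating-point host variable.
inline int SQLBindColPoly(const char* value, double& parm,
                          std::size_t /*size*/) {
    parm = std::strtod(value, nullptr);
    return kSqlSuccess;
}

// Fixed-size character buffer: copy, truncate if needed, always terminate.
inline int SQLBindColPoly(const char* value, char* parm, std::size_t size) {
    if (size == 0) return kSqlSuccess;
    std::strncpy(parm, value, size - 1);
    parm[size - 1] = '\0';
    return kSqlSuccess;
}

}  // namespace bind_sketch
```

The generated fetch code can then emit the same call for every column, for example `SQLBindColPoly(row[0], host_a, sizeof(host_a))`, and the compiler selects the conversion from the declared type of the host variable.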
  • 32. Precompiler
      • Notes
          • Light-weight C++ wrapper to MySQL API
          • The precompiler understands some SQL syntax and does some modifications of NonStop SQL/MP statements
          • We have also used our precompiler to target other DBMS
              – ODBC API
              – Oracle
              – PostgreSQL
          • Since we convert C to C++, this may be problematic for ESQL programs that used deprecated K&R syntax
              – C++ compilers are stricter than C compilers
              – However, we did not have this problem with our application
  • 33. Other MySQL applications at Sabre
      • ATSE is our largest and most mission critical
      • We have other production systems that rely on MySQL
      • Site59.com is the most visible
      • MySQL also used for some internal databases
      • More under development
      • MySQL / Linux / SATA drives make cheap data marts
      • Sometimes cheaper to replicate to a data mart than to upgrade a central data warehouse
      • Currently testing with a 1.5B row database
  • 34. Site59
      • Last minute travel packages
      • Acquired by Travelocity in March 2002
      • Sales volume?
      • Transaction rates?
      • All dynamic content generated using PHP & MySQL
  • 35. Site59
      Site59 implements a fairly “classic” dynamic website using MySQL. Dynamic content is generated at about 30Mbits / second. Extensive use is made of single and dual processor Linux machines (IA-32).
      [Diagram. Labels: Internet; HTTP; Presentation (Apache/PHP); Application Server; Reservations System Gateway; XML/HTTP; Frontend DB (MySQL, Linux); Replication; Backend DB (Oracle, Sun)]
  • 36. Travel Commerce Processing Chain: Session, Shop, Price, Sell, Fulfill