SlideShare a Scribd company logo
3/15/2011 Product Catalogue - Case Study 1
Product Catalogue – Case Study
3/15/2011 Product Catalogue - Case Study 2
Piotr Modzelewski Bartłomiej Rymarski
3/15/2011 Product Catalogue - Case Study 3
an overview of technologies helping to create a
scalable and efficient service quickly
Cheating the project
management triangle
Project management triangle with
constant budget
3/15/2011 Product Catalogue - Case Study 4
3/15/2011 Product Catalogue - Case Study 5
Solution for the lazy
3/15/2011 Product Catalogue - Case Study 6
WTH?
– Binary protocols are fast
• Try to make fast server with XML! :)
– Managing binary protocols is hard
• Is it?
• Can it be easier?
– Web API should be language independent
• Binary in PHP?
• Little Endian vs. Big Endian
So why Thrift?
– Language support:
• C/C++/D, PHP, Java, Erlang, Perl, Python …
– Why to write it, if you can generate it?
– Stand alone server!
• No tomcat BS.
– To good to be true?
3/15/2011 Product Catalogue - Case Study 7
Implementation Step 1 / 3 - Design
namespace java pl.allegro.example.server
typedef i64 Timestamp
service ExampleService {
Timestamp ping(),
}
3/15/2011 Product Catalogue - Case Study 8
Implementation Step 2 / 3 – Gen gen
thrift --gen java ping.thrift
3/15/2011 Product Catalogue - Case Study 9
<dependency>
<groupId>slf4j-api</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.5.10</version>
</dependency>
<dependency>
<groupId>libthrift</groupId>
<artifactId>libthrift</artifactId>
<version>0.2.0</version>
</dependency>
Implementation Step 3 / 3 – Coding
3/15/2011 Product Catalogue - Case Study 10
Deployment – Step 1/2
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals> <goal>shade</goal> </goals>
</execution>
</executions>
<configuration> <finalName>
${artifactId}-${version}-with-dependencies
</finalName></configuration>
</plugin>
3/15/2011 Product Catalogue - Case Study 11
Deployment – Step 2/2
> mvn clean install
> java –jar PingService-1.0-with-dependencies
3/15/2011 Product Catalogue - Case Study 12
So how to solve our problem?
namespace java pl.allegro.example.server
typedef i64 Timestamp
struct ApiObject {
1: i64 id,
2: string name,
3: string description,
4: list<string> images,
}
struct ApiSearchResult {
1: list<ApiObject> products,
2: i64 totalHits
}
service PrototypeService {
Timestamp ping(),
ApiObject getById (1: i64 id),
ApiSearchResult search (1: string query, 2: i32 page, 3: i32 limit),
}
3/15/2011 Product Catalogue - Case Study 13
When DB is not enought.
Cassandra for example?
3/15/2011 Product Catalogue - Case Study 14
Why Db is not good for our cause?
• Db is heavy
• Downsides of transactions
• Select via index vs Memcached
• Db is unpredictable
• Heavy queries can lock tables
• Db maintenance queries can become heavy
3/15/2011 Product Catalogue - Case Study 15
NoSQL for the rescue!
3/15/2011 Product Catalogue - Case Study 16
Cassandra
• Thrift based Service
• Cluster of agents – Ring
• The data is stored on hard disk and memory (via cache)
• The data distribution is configured
• It can be requested to have data copied on each server or just on
a part of them
• Many options of data storage
3/15/2011 Product Catalogue - Case Study 17
Cassandra in practice
• Keep same data on every node
• Deploy Cassandra near the API server – same machine is
the best
• Keep the data in the scheme simple
• The bigger cache, the better
• Help Cassandra to keep things in cache
3/15/2011 Product Catalogue - Case Study 18
Example deployment
3/15/2011 Product Catalogue - Case Study 19
C-1-1 C-2-1
C-1-2 C-1-3 C-1-4 C-2-2 C-2-3 C-2-4
API-1-1 API-1-2 API-1-3 API-1-1 API-1-2 API-1-3
getById() algorithm
1. Get the byte[] data from the Cassandra by ID
2. If there is no data, throw an exception, that object is not
found
3. Unserialize the data as Java object
4. If it isn’t Thrift class object, translate the data to a
Thrift object
5. Return the object
3/15/2011 Product Catalogue - Case Study 20
Solr
Let sphinx be a relic
3/15/2011 Product Catalogue - Case Study 21
There was a Sphinx
- Popular and well known search engine
- Speed gained by very simple use only
- Really bad debug support
- Command line tools only
- No good build-in replication method
- Bad support for connection pooling
3/15/2011 Product Catalogue - Case Study 22
Solr
- Web based administration panel with statistics, options
and query tool
- Extendable by hook-like plugins
- Many build in features
- Highlights
- Spellchecking
- Rich schema possibilities
3/15/2011 Product Catalogue - Case Study 23
Solr in practice
• It works!
• Master – slave – should be used only if the data
modification isn’t to often
• Use binary protocol if possible
• Reindex clears cache unless configured otherwise
• Cache can’t be bigger then 2GB
• Integrating cache mechanics into Solr costs a lot
3/15/2011 Product Catalogue - Case Study 24
Search() algorithm
1) Process the query
2) Send query to solr
3) Get list of object ids
4) Get objects from cassandra with multiget
5) Unserialize the data as Java object
6) Translate the data to a Thrift object
7) Return the object
3/15/2011 Product Catalogue - Case Study 25
3/15/2011 Product Catalogue - Case Study 26
MogileFS – OMG files!
3/15/2011 Product Catalogue - Case Study 27
• Distributed filesystem
• OpenSource
• Developed by Danga Interactive
• Designed for scalable web app storage
• Users:
What is MogileFS
Why use MogileFS
• Horizontal scalability
• Reliability (no Single Point Of Failure)
• No RAID required (better than RAID)
• Application level (no kernel modules)
• Library available for
• Perl, PHP, Python, Java, Ruby …
3/15/2011 Product Catalogue - Case Study 28
MogileFS Components
• Client
• Tracker
• Handles communication (Application/Frontend<>Storage/DB)
• Multiple workers (processes)
• Storage
• Physical storage for files
• Any HTTP server (with WebDAV support)
• Your OS & FS
• Database
• Stores metadata, server settings, hosts & devices
• Your RDBMS
3/15/2011 Product Catalogue - Case Study 29
All your files are belong to …
• Files are organized in a flat namespace
• Classes (replication policy)
• Domains (many applications)
• Replication according to file class policy
• Each class has its own minimal replica count
• You decide how safe your files should be
3/15/2011 Product Catalogue - Case Study 30
Overview
3/15/2011 Product Catalogue - Case Study 31
Frontend
Tracker 1
Tracker 2
Storage 4Storage 3Storage 1 Storage 2
Database
Scaling & Performance
• Add more frontends / trackers / workers
• Add more storage nodes / devices
• Increase replicate count
• Add MySQL Slave replica for reads
• Replace Mogstored with Nginx
• Use Nginx as a frontend
• Cache tracker replies (memcached)
3/15/2011 Product Catalogue - Case Study 32
SLA
3/15/2011 Product Catalogue - Case Study 33
Reliability & HA
• MySQL Master<>Master replication (active/standby)
• Redundant storage nodes / trackers / frontends
• Haproxy for tracker and DB slave connections
• OpenSolaris & ZFS for storage nodes
3/15/2011 Product Catalogue - Case Study 34
Multiple datacenters awarness
• MultipleNetworks plugin
• Directs requests from one IP class to storage nodes on the same
address class
• As easy as
• MySQL Master<>Master replication between datacenters
3/15/2011 Product Catalogue - Case Study 35
Usage examples
3/15/2011 Product Catalogue - Case Study 36
• Adding a new storage node
• Adding a new device on a storage
• Changing replication policy for class1
• Adding a MySQL slave replica for doing reads
Now
3/15/2011 Product Catalogue - Case Study 37
Client
Storage
Storage
Tracker
MySQL MasterMySQL Slave
Tracker
Frontend
Frontend
Storage
Storage
Tracker
MySQL Master MySQL Slave
Tracker
Frontend
Frontend
DC1 DC2
Future
3/15/2011 Product Catalogue - Case Study 38
Client
Tracker
MySQL
Master
MySQL Slave
Tracker
FrontendFrontend Frontend
FrontendFrontend Frontend
Tracker
MemcachedMemcached Memcached
FrontendFrontend Frontend
StorageStorage Storage
StorageStorage Storage
StorageStorage Storage
MySQL Slave
Tracker
MySQL SlaveMySQL Slave
Tracker
FrontendFrontend Frontend
FrontendFrontend Frontend
Tracker
MemcachedMemcached Memcached
FrontendFrontend Frontend
StorageStorage Storage
StorageStorage Storage
StorageStorage Storage
MySQL
Master
Thank you!
Q&A?
3/15/2011 Product Catalogue - Case Study 39
Piotr Modzelewski
piotr.modzelewski@allegro.pl
Bartłomiej Rymarski
bartlomiej.rymarski@allegro.pl

More Related Content

What's hot

Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDB
Ahmed Farag
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
botsplash.com
 
Adding Support for Networking and Web Technologies to an Embedded System
Adding Support for Networking and Web Technologies to an Embedded SystemAdding Support for Networking and Web Technologies to an Embedded System
Adding Support for Networking and Web Technologies to an Embedded System
John Efstathiades
 
1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...
1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...
1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...
BIWUG
 
Data warehouse 11 introduction to data transformation
Data warehouse 11 introduction to data transformationData warehouse 11 introduction to data transformation
Data warehouse 11 introduction to data transformation
Vaibhav Khanna
 
Selecting the right cache framework
Selecting the right cache frameworkSelecting the right cache framework
Selecting the right cache framework
Mohammed Fazuluddin
 

What's hot (6)

Introduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDBIntroduction to NoSQL and MongoDB
Introduction to NoSQL and MongoDB
 
Getting started with postgresql
Getting started with postgresqlGetting started with postgresql
Getting started with postgresql
 
Adding Support for Networking and Web Technologies to an Embedded System
Adding Support for Networking and Web Technologies to an Embedded SystemAdding Support for Networking and Web Technologies to an Embedded System
Adding Support for Networking and Web Technologies to an Embedded System
 
1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...
1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...
1. SQL Server forSharePoint geeksA gentle introductionThomas Vochten • Septem...
 
Data warehouse 11 introduction to data transformation
Data warehouse 11 introduction to data transformationData warehouse 11 introduction to data transformation
Data warehouse 11 introduction to data transformation
 
Selecting the right cache framework
Selecting the right cache frameworkSelecting the right cache framework
Selecting the right cache framework
 

Similar to PLNOG 6: Piotr Modzelewski, Bartłomiej Rymarski - Product Catalogue - Case Study

Hardware Provisioning
Hardware Provisioning Hardware Provisioning
Hardware Provisioning
MongoDB
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
MongoDB
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation in
RahulBhole12
 
Scaling Security Workflows in Government Agencies
Scaling Security Workflows in Government AgenciesScaling Security Workflows in Government Agencies
Scaling Security Workflows in Government Agencies
Avere Systems
 
VoltDB.ppt
VoltDB.pptVoltDB.ppt
VoltDB.ppt
ssuser2c91ab
 
2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The Trenches2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The Trenches
George Ang
 
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB
 
Stream Data Processing at Big Data Landscape by Oleksandr Fedirko
Stream Data Processing at Big Data Landscape by Oleksandr Fedirko Stream Data Processing at Big Data Landscape by Oleksandr Fedirko
Stream Data Processing at Big Data Landscape by Oleksandr Fedirko
GlobalLogic Ukraine
 
Fastest Servlets in the West
Fastest Servlets in the WestFastest Servlets in the West
Fastest Servlets in the West
Stuart (Pid) Williams
 
Data Virtualization Reference Architectures: Correctly Architecting your Solu...
Data Virtualization Reference Architectures: Correctly Architecting your Solu...Data Virtualization Reference Architectures: Correctly Architecting your Solu...
Data Virtualization Reference Architectures: Correctly Architecting your Solu...
Denodo
 
Realtime traffic analyser
Realtime traffic analyserRealtime traffic analyser
Realtime traffic analyser
Alex Moskvin
 
Data management for Quantitative Biology -Basics and challenges in biomedical...
Data management for Quantitative Biology -Basics and challenges in biomedical...Data management for Quantitative Biology -Basics and challenges in biomedical...
Data management for Quantitative Biology -Basics and challenges in biomedical...
QBiC_Tue
 
An AMIS Overview of Oracle database 12c (12.1)
An AMIS Overview of Oracle database 12c (12.1)An AMIS Overview of Oracle database 12c (12.1)
An AMIS Overview of Oracle database 12c (12.1)
Marco Gralike
 
An AMIS overview of database 12c
An AMIS overview of database 12cAn AMIS overview of database 12c
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles Judith
NETWAYS
 
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnikbiz
 
Stored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s GuideStored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s Guide
VoltDB
 
Internals of Presto Service
Internals of Presto ServiceInternals of Presto Service
Internals of Presto Service
Treasure Data, Inc.
 
9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf
sreedb2
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Splunk
 

Similar to PLNOG 6: Piotr Modzelewski, Bartłomiej Rymarski - Product Catalogue - Case Study (20)

Hardware Provisioning
Hardware Provisioning Hardware Provisioning
Hardware Provisioning
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation in
 
Scaling Security Workflows in Government Agencies
Scaling Security Workflows in Government AgenciesScaling Security Workflows in Government Agencies
Scaling Security Workflows in Government Agencies
 
VoltDB.ppt
VoltDB.pptVoltDB.ppt
VoltDB.ppt
 
2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The Trenches2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The Trenches
 
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
 
Stream Data Processing at Big Data Landscape by Oleksandr Fedirko
Stream Data Processing at Big Data Landscape by Oleksandr Fedirko Stream Data Processing at Big Data Landscape by Oleksandr Fedirko
Stream Data Processing at Big Data Landscape by Oleksandr Fedirko
 
Fastest Servlets in the West
Fastest Servlets in the WestFastest Servlets in the West
Fastest Servlets in the West
 
Data Virtualization Reference Architectures: Correctly Architecting your Solu...
Data Virtualization Reference Architectures: Correctly Architecting your Solu...Data Virtualization Reference Architectures: Correctly Architecting your Solu...
Data Virtualization Reference Architectures: Correctly Architecting your Solu...
 
Realtime traffic analyser
Realtime traffic analyserRealtime traffic analyser
Realtime traffic analyser
 
Data management for Quantitative Biology -Basics and challenges in biomedical...
Data management for Quantitative Biology -Basics and challenges in biomedical...Data management for Quantitative Biology -Basics and challenges in biomedical...
Data management for Quantitative Biology -Basics and challenges in biomedical...
 
An AMIS Overview of Oracle database 12c (12.1)
An AMIS Overview of Oracle database 12c (12.1)An AMIS Overview of Oracle database 12c (12.1)
An AMIS Overview of Oracle database 12c (12.1)
 
An AMIS overview of database 12c
An AMIS overview of database 12cAn AMIS overview of database 12c
An AMIS overview of database 12c
 
OSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles JudithOSMC 2019 | How to improve database Observability by Charles Judith
OSMC 2019 | How to improve database Observability by Charles Judith
 
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
 
Stored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s GuideStored Procedure Superpowers: A Developer’s Guide
Stored Procedure Superpowers: A Developer’s Guide
 
Internals of Presto Service
Internals of Presto ServiceInternals of Presto Service
Internals of Presto Service
 
9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 

Recently uploaded

Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie WellsCollapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Rosie Wells
 
BRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdf
BRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdfBRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdf
BRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdf
Robin Haunschild
 
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
OECD Directorate for Financial and Enterprise Affairs
 
Why Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdf
Why Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdfWhy Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdf
Why Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdf
Ben Linders
 
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
OECD Directorate for Financial and Enterprise Affairs
 
The remarkable life of Sir Mokshagundam Visvesvaraya.pptx
The remarkable life of Sir Mokshagundam Visvesvaraya.pptxThe remarkable life of Sir Mokshagundam Visvesvaraya.pptx
The remarkable life of Sir Mokshagundam Visvesvaraya.pptx
JiteshKumarChoudhary2
 
2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf
Frederic Leger
 
Competition and Regulation in Professions and Occupations – OECD – June 2024 ...
Competition and Regulation in Professions and Occupations – OECD – June 2024 ...Competition and Regulation in Professions and Occupations – OECD – June 2024 ...
Competition and Regulation in Professions and Occupations – OECD – June 2024 ...
OECD Directorate for Financial and Enterprise Affairs
 
Pro-competitive Industrial Policy – LANE – June 2024 OECD discussion
Pro-competitive Industrial Policy – LANE – June 2024 OECD discussionPro-competitive Industrial Policy – LANE – June 2024 OECD discussion
Pro-competitive Industrial Policy – LANE – June 2024 OECD discussion
OECD Directorate for Financial and Enterprise Affairs
 
原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样
原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样
原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样
gpww3sf4
 
Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...
Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...
Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...
OECD Directorate for Financial and Enterprise Affairs
 
Pro-competitive Industrial Policy – OECD – June 2024 OECD discussion
Pro-competitive Industrial Policy – OECD – June 2024 OECD discussionPro-competitive Industrial Policy – OECD – June 2024 OECD discussion
Pro-competitive Industrial Policy – OECD – June 2024 OECD discussion
OECD Directorate for Financial and Enterprise Affairs
 
Gregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics PresentationGregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics Presentation
gharris9
 
Artificial Intelligence, Data and Competition – LIM – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – LIM – June 2024 OECD discussionArtificial Intelligence, Data and Competition – LIM – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – LIM – June 2024 OECD discussion
OECD Directorate for Financial and Enterprise Affairs
 
Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...
Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...
Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...
Suzanne Lagerweij
 
Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...
Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...
Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...
OECD Directorate for Financial and Enterprise Affairs
 
Artificial Intelligence, Data and Competition – OECD – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – OECD – June 2024 OECD discussionArtificial Intelligence, Data and Competition – OECD – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – OECD – June 2024 OECD discussion
OECD Directorate for Financial and Enterprise Affairs
 
ASONAM2023_presection_slide_track-recommendation.pdf
ASONAM2023_presection_slide_track-recommendation.pdfASONAM2023_presection_slide_track-recommendation.pdf
ASONAM2023_presection_slide_track-recommendation.pdf
ToshihiroIto4
 
IEEE CIS Webinar Sustainable futures.pdf
IEEE CIS Webinar Sustainable futures.pdfIEEE CIS Webinar Sustainable futures.pdf
IEEE CIS Webinar Sustainable futures.pdf
Claudio Gallicchio
 
Carrer goals.pptx and their importance in real life
Carrer goals.pptx  and their importance in real lifeCarrer goals.pptx  and their importance in real life
Carrer goals.pptx and their importance in real life
artemacademy2
 

Recently uploaded (20)

Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie WellsCollapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
Collapsing Narratives: Exploring Non-Linearity • a micro report by Rosie Wells
 
BRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdf
BRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdfBRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdf
BRIC_2024_2024-06-06-11:30-haunschild_archival_version.pdf
 
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
The Intersection between Competition and Data Privacy – OECD – June 2024 OECD...
 
Why Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdf
Why Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdfWhy Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdf
Why Psychological Safety Matters for Software Teams - ACE 2024 - Ben Linders.pdf
 
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
The Intersection between Competition and Data Privacy – KEMP – June 2024 OECD...
 
The remarkable life of Sir Mokshagundam Visvesvaraya.pptx
The remarkable life of Sir Mokshagundam Visvesvaraya.pptxThe remarkable life of Sir Mokshagundam Visvesvaraya.pptx
The remarkable life of Sir Mokshagundam Visvesvaraya.pptx
 
2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf2024-05-30_meetup_devops_aix-marseille.pdf
2024-05-30_meetup_devops_aix-marseille.pdf
 
Competition and Regulation in Professions and Occupations – OECD – June 2024 ...
Competition and Regulation in Professions and Occupations – OECD – June 2024 ...Competition and Regulation in Professions and Occupations – OECD – June 2024 ...
Competition and Regulation in Professions and Occupations – OECD – June 2024 ...
 
Pro-competitive Industrial Policy – LANE – June 2024 OECD discussion
Pro-competitive Industrial Policy – LANE – June 2024 OECD discussionPro-competitive Industrial Policy – LANE – June 2024 OECD discussion
Pro-competitive Industrial Policy – LANE – June 2024 OECD discussion
 
原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样
原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样
原版制作贝德福特大学毕业证(bedfordhire毕业证)硕士文凭原版一模一样
 
Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...
Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...
Artificial Intelligence, Data and Competition – SCHREPEL – June 2024 OECD dis...
 
Pro-competitive Industrial Policy – OECD – June 2024 OECD discussion
Pro-competitive Industrial Policy – OECD – June 2024 OECD discussionPro-competitive Industrial Policy – OECD – June 2024 OECD discussion
Pro-competitive Industrial Policy – OECD – June 2024 OECD discussion
 
Gregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics PresentationGregory Harris - Cycle 2 - Civics Presentation
Gregory Harris - Cycle 2 - Civics Presentation
 
Artificial Intelligence, Data and Competition – LIM – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – LIM – June 2024 OECD discussionArtificial Intelligence, Data and Competition – LIM – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – LIM – June 2024 OECD discussion
 
Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...
Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...
Suzanne Lagerweij - Influence Without Power - Why Empathy is Your Best Friend...
 
Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...
Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...
Artificial Intelligence, Data and Competition – ČORBA – June 2024 OECD discus...
 
Artificial Intelligence, Data and Competition – OECD – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – OECD – June 2024 OECD discussionArtificial Intelligence, Data and Competition – OECD – June 2024 OECD discussion
Artificial Intelligence, Data and Competition – OECD – June 2024 OECD discussion
 
ASONAM2023_presection_slide_track-recommendation.pdf
ASONAM2023_presection_slide_track-recommendation.pdfASONAM2023_presection_slide_track-recommendation.pdf
ASONAM2023_presection_slide_track-recommendation.pdf
 
IEEE CIS Webinar Sustainable futures.pdf
IEEE CIS Webinar Sustainable futures.pdfIEEE CIS Webinar Sustainable futures.pdf
IEEE CIS Webinar Sustainable futures.pdf
 
Carrer goals.pptx and their importance in real life
Carrer goals.pptx  and their importance in real lifeCarrer goals.pptx  and their importance in real life
Carrer goals.pptx and their importance in real life
 

PLNOG 6: Piotr Modzelewski, Bartłomiej Rymarski - Product Catalogue - Case Study

  • 2. Product Catalogue – Case Study 3/15/2011 Product Catalogue - Case Study 2 Piotr Modzelewski Bartłomiej Rymarski
  • 3. 3/15/2011 Product Catalogue - Case Study 3 an overview of technologies helping to create a scalable and efficient service quickly Cheating the project management triangle
  • 4. Project management triangle with constant budget 3/15/2011 Product Catalogue - Case Study 4
  • 5. 3/15/2011 Product Catalogue - Case Study 5 Solution for the lazy
  • 6. 3/15/2011 Product Catalogue - Case Study 6 WTH? – Binary protocols are fast • Try to make fast server with XML! :) – Managing binary protocols is hard • Is it? • Can it be easier? – Web API should be language independent • Binary in PHP? • Little Endian vs. Big Endian
  • 7. So why Thrift? – Language support: • C/C++/D, PHP, Java, Erlang, Perl, Python … – Why to write it, if you can generate it? – Stand alone server! • No tomcat BS. – To good to be true? 3/15/2011 Product Catalogue - Case Study 7
  • 8. Implementation Step 1 / 3 - Design namespace java pl.allegro.example.server typedef i64 Timestamp service ExampleService { Timestamp ping(), } 3/15/2011 Product Catalogue - Case Study 8
  • 9. Implementation Step 2 / 3 – Gen gen thrift --gen java ping.thrift 3/15/2011 Product Catalogue - Case Study 9 <dependency> <groupId>slf4j-api</groupId> <artifactId>slf4j-api</artifactId> <version>1.5.10</version> </dependency> <dependency> <groupId>libthrift</groupId> <artifactId>libthrift</artifactId> <version>0.2.0</version> </dependency>
  • 10. Implementation Step 3 / 3 – Coding 3/15/2011 Product Catalogue - Case Study 10
  • 11. Deployment – Step 1/2 <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> </execution> </executions> <configuration> <finalName> ${artifactId}-${version}-with-dependencies </finalName></configuration> </plugin> 3/15/2011 Product Catalogue - Case Study 11
  • 12. Deployment – Step 2/2 > mvn clean install > java –jar PingService-1.0-with-dependencies 3/15/2011 Product Catalogue - Case Study 12
  • 13. So how to solve our problem? namespace java pl.allegro.example.server typedef i64 Timestamp struct ApiObject { 1: i64 id, 2: string name, 3: string description, 4: list<string> images, } struct ApiSearchResult { 1: list<ApiObject> products, 2: i64 totalHits } service PrototypeService { Timestamp ping(), ApiObject getById (1: i64 id), ApiSearchResult search (1: string query, 2: i32 page, 3: i32 limit), } 3/15/2011 Product Catalogue - Case Study 13
  • 14. When DB is not enought. Cassandra for example? 3/15/2011 Product Catalogue - Case Study 14
  • 15. Why Db is not good for our cause? • Db is heavy • Downsides of transactions • Select via index vs Memcached • Db is unpredictable • Heavy queries can lock tables • Db maintenance queries can become heavy 3/15/2011 Product Catalogue - Case Study 15
  • 16. NoSQL for the rescue! 3/15/2011 Product Catalogue - Case Study 16
  • 17. Cassandra • Thrift based Service • Cluster of agents – Ring • The data is stored on hard disk and memory (via cache) • The data distribution is configured • It can be requested to have data copied on each server or just on a part of them • Many options of data storage 3/15/2011 Product Catalogue - Case Study 17
  • 18. Cassandra in practice • Keep same data on every node • Deploy Cassandra near the API server – same machine is the best • Keep the data in the scheme simple • The bigger cache, the better • Help Cassandra to keep things in cache 3/15/2011 Product Catalogue - Case Study 18
  • 19. Example deployment 3/15/2011 Product Catalogue - Case Study 19 C-1-1 C-2-1 C-1-2 C-1-3 C-1-4 C-2-2 C-2-3 C-2-4 API-1-1 API-1-2 API-1-3 API-1-1 API-1-2 API-1-3
  • 20. getById() algorithm 1. Get the byte[] data from the Cassandra by ID 2. If there is no data, throw an exception, that object is not found 3. Unserialize the data as Java object 4. If it isn’t Thrift class object, translate the data to a Thrift object 5. Return the object 3/15/2011 Product Catalogue - Case Study 20
  • 21. Solr Let sphinx be a relic 3/15/2011 Product Catalogue - Case Study 21
  • 22. There was a Sphinx - Popular and well known search engine - Speed gained by very simple use only - Really bad debug support - Command line tools only - No good build-in replication method - Bad support for connection pooling 3/15/2011 Product Catalogue - Case Study 22
  • 23. Solr - Web based administration panel with statistics, options and query tool - Extendable by hook-like plugins - Many build in features - Highlights - Spellchecking - Rich schema possibilities 3/15/2011 Product Catalogue - Case Study 23
  • 24. Solr in practice • It works! • Master – slave – should be used only if the data modification isn’t to often • Use binary protocol if possible • Reindex clears cache unless configured otherwise • Cache can’t be bigger then 2GB • Integrating cache mechanics into Solr costs a lot 3/15/2011 Product Catalogue - Case Study 24
  • 25. Search() algorithm 1) Process the query 2) Send query to solr 3) Get list of object ids 4) Get objects from cassandra with multiget 5) Unserialize the data as Java object 6) Translate the data to a Thrift object 7) Return the object 3/15/2011 Product Catalogue - Case Study 25
  • 26. 3/15/2011 Product Catalogue - Case Study 26 MogileFS – OMG files!
  • 27. 3/15/2011 Product Catalogue - Case Study 27 • Distributed filesystem • OpenSource • Developed by Danga Interactive • Designed for scalable web app storage • Users: What is MogileFS
  • 28. Why use MogileFS • Horizontal scalability • Reliability (no Single Point Of Failure) • No RAID required (better than RAID) • Application level (no kernel modules) • Library available for • Perl, PHP, Python, Java, Ruby … 3/15/2011 Product Catalogue - Case Study 28
  • 29. MogileFS Components • Client • Tracker • Handles communication (Application/Frontend<>Storage/DB) • Multiple workers (processes) • Storage • Physical storage for files • Any HTTP server (with WebDAV support) • Your OS & FS • Database • Stores metadata, server settings, hosts & devices • Your RDBMS 3/15/2011 Product Catalogue - Case Study 29
  • 30. All your files are belong to … • Files are organized in a flat namespace • Classes (replication policy) • Domains (many applications) • Replication according to file class policy • Each class has its own minimal replica count • You decide how safe your files should be 3/15/2011 Product Catalogue - Case Study 30
  • 31. Overview 3/15/2011 Product Catalogue - Case Study 31 Frontend Tracker 1 Tracker 2 Storage 4Storage 3Storage 1 Storage 2 Database
  • 32. Scaling & Performance • Add more frontends / trackers / workers • Add more storage nodes / devices • Increase replicate count • Add MySQL Slave replica for reads • Replace Mogstored with Nginx • Use Nginx as a frontend • Cache tracker replies (memcached) 3/15/2011 Product Catalogue - Case Study 32
  • 34. Reliability & HA • MySQL Master<>Master replication (active/standby) • Redundant storage nodes / trackers / frontends • Haproxy for tracker and DB slave connections • OpenSolaris & ZFS for storage nodes 3/15/2011 Product Catalogue - Case Study 34
  • 35. Multiple datacenters awarness • MultipleNetworks plugin • Directs requests from one IP class to storage nodes on the same address class • As easy as • MySQL Master<>Master replication between datacenters 3/15/2011 Product Catalogue - Case Study 35
  • 36. Usage examples 3/15/2011 Product Catalogue - Case Study 36 • Adding a new storage node • Adding a new device on a storage • Changing replication policy for class1 • Adding a MySQL slave replica for doing reads
  • 37. Now 3/15/2011 Product Catalogue - Case Study 37 Client Storage Storage Tracker MySQL MasterMySQL Slave Tracker Frontend Frontend Storage Storage Tracker MySQL Master MySQL Slave Tracker Frontend Frontend DC1 DC2
  • 38. Future 3/15/2011 Product Catalogue - Case Study 38 Client Tracker MySQL Master MySQL Slave Tracker FrontendFrontend Frontend FrontendFrontend Frontend Tracker MemcachedMemcached Memcached FrontendFrontend Frontend StorageStorage Storage StorageStorage Storage StorageStorage Storage MySQL Slave Tracker MySQL SlaveMySQL Slave Tracker FrontendFrontend Frontend FrontendFrontend Frontend Tracker MemcachedMemcached Memcached FrontendFrontend Frontend StorageStorage Storage StorageStorage Storage StorageStorage Storage MySQL Master
  • 39. Thank you! Q&A? 3/15/2011 Product Catalogue - Case Study 39 Piotr Modzelewski piotr.modzelewski@allegro.pl Bartłomiej Rymarski bartlomiej.rymarski@allegro.pl